Advice on Database Design for Home Automation

Hi,

I have just discovered node red and it seems fantastic. I am trying to put together a smart home system and it seems really quick to do development. At the moment I am just in the process of setting up various sensors (dht22, bme280, ds18b20, etc) around the house (the smart bit will come later). I want to store the values to a db (I have been just using python up until now to insert into sqlite tables for each sensor). I am now ready to make the database a bit more coherent and I was wondering if anyone could give me some feedback on the design before I put it into action (I know it is not exactly a node red question but I think that lots of people here are probably doing similar stuff). Anyway here is the structure that I am proposing. It is basically a relational database to store all values in one table with references to regions and sensors in the others. I am pretty new to db design so would appreciated any feedback before going too much further.

Thanks in advance for reading,

Martyn

DB design:

-- Table: tbl_measurement
CREATE TABLE tbl_measurement (id INTEGER PRIMARY KEY AUTOINCREMENT, measurement VARCHAR (15) NOT NULL);

-- Table: tbl_region
CREATE TABLE tbl_region (id INTEGER PRIMARY KEY AUTOINCREMENT, region VARCHAR (20) NOT NULL);

-- Table: tbl_room
CREATE TABLE tbl_room (id INTEGER PRIMARY KEY AUTOINCREMENT, region_id INTEGER CONSTRAINT fk_region REFERENCES tbl_region (id) NOT NULL, room VARCHAR (20) NOT NULL);

-- Table: tbl_room_sensor_measurement
CREATE TABLE tbl_room_sensor_measurement (id INTEGER PRIMARY KEY AUTOINCREMENT, room_id INTEGER CONSTRAINT fk_room REFERENCES tbl_room (id) NOT NULL, sensor_measurement_id INTEGER NOT NULL CONSTRAINT fk_sensor_measurement REFERENCES tbl_sensor_measurement (id), instance INTEGER NOT NULL, CONSTRAINT un_room_sensor_instance UNIQUE (room_id, sensor_measurement_id, instance) ON CONFLICT ROLLBACK);

-- Table: tbl_sensor
CREATE TABLE tbl_sensor (id INTEGER PRIMARY KEY AUTOINCREMENT, sensor VARCHAR (15) NOT NULL);

-- Table: tbl_sensor_measurement
CREATE TABLE tbl_sensor_measurement (id PRIMARY KEY, sensor_id INTEGER CONSTRAINT fk_sensor REFERENCES tbl_sensor (id) NOT NULL, measurement_id INTEGER CONSTRAINT fk_measurement REFERENCES tbl_measurement (id) NOT NULL, CONSTRAINT un_sensor_measurement UNIQUE (sensor_id, measurement_id) ON CONFLICT ROLLBACK);

-- Table: tbl_value
CREATE TABLE tbl_value (id INTEGER PRIMARY KEY AUTOINCREMENT, room_sensor_measurement_id INTEGER CONSTRAINT fk_room_sensor_measurement REFERENCES tbl_room_sensor_measurement (id) NOT NULL, value NUMERIC NOT NULL, timestamp INTEGER (4) NOT NULL);

Before you go too far I suggest having a look at influxdb which is specifically designed for time series databases and (in conjunction with Grafana for sophisticated charts etc) is very commonly used for just the sort of system you are describing.
It has built in features for things like automatically downscaling old data, so maybe keep all data for a year, but for older ones keep just, for example, the max and min values for each day, or perhaps just once an hour, or whatever takes takes your fancy.
There is a bit of a learning curve though, it isn't quite the same as the traditional database.

1 Like

@Colin, thanks for the heads up. I will look into it.

@Colin - having done some reading I agree and influxdb is the way to go. It is a very different way of viewing things but seems to make a lot of sense. Before I go too far can I ask you a follow up question please?
is it better to split things into different measurements (temperature, humidity, pressure, etc) or group them by sensor like in this example (https://www.definit.co.uk/2018/07/monitoring-temperature-and-humidity-with-a-raspberry-pi-3-dht22-sensor-influxdb-and-grafana/)

If I went with breaking it down to measurement type it would give be one database for all rooms/areas and I could use the MQTT message to build tags.

e.g. home/outside/garage/bme280/1 would give me the tags
(region=outside,room=garage,sensor=bme280,instance=1)

which would give me three points

temperature,region=outside,room=garage,sensor=bme280,instance=1 value=27.0 1434067467000000000
humidity,region=outside,room=garage,sensor=bme280,instance=1 value=55.0 1434067467000000000
pressure,region=outside,room=garage,sensor=bme280,instance=1 value=1010.0 1434067467000000000

I can then query those by the tags later on.

Does this seem as sensible approach?

Thanks for pointing me towards influxdb,
Martyn

I tend to group by place and sensor since that is most likely how you will want to view/query the data later. Then each sensor type (temp, humidity, light, etc) are fields holding values. Here is a quick example of a query in Grafana:

Thanks for that. I haven't got into grafana yet, but will take that on board.

I use one measurement for all the Home stuff and another for Weather. Within the Home I use a tag to identify the room then fields for temperature, humidity and so on. I don't identify a sensor by id or type in the database because when looking at historical data, for example, you want to know bedroom temperature and don't care what sensor it came from. If you changed the sensor you wouldn't want to have to select a different field in the database to see it. One advantage of doing it that way is that (again in grafana) you can use auto-repeating dashboards so that the display for each room is identical, the only difference being the room tag.
I also have a simple mapping from MQTT topic to influx room and sensor so that the influx flow does not need knowledge of particular rooms or sensors, it just looks at the topic and knows what to do with it. So for example the topic for bedroom temperature is home/bedroom/temperature and that goes to tag bedroom, field temperature.
The sensors for different rooms are in some cases connected to different Pis, but that doesn't matter, the influx flow just has to subscribe to the appropriate topics and the rest is automatic.