Which database to choose for storing data (noob-friendly)

Hi guys,

I'm looking for a database to store measurements and time series data, and I'd like to hear whether there is one obvious DB to choose for this purpose.

I'm not a programmer, but I have some experience with Node-RED, Python and C++. The idea is to digitize process and production data from the machines we are building or modernizing. It will mostly be single measurements (states etc.) and sometimes also time series data.

I worked with MongoDB in a former project, where I generated datasets in the form of JSON files via MQTT and stored them in the DB. Because I couldn't implement a direct connection to Node-RED, I asked a friend to write a Python script for me to access MongoDB via MQTT.

That workaround wasn't the cleanest, so I have now installed InfluxDB 2.2, which seems to be quite commonly used with Node-RED.

Before I spend a bunch of time getting into InfluxDB, I would like to hear whether Influx is the way to go. Node-RED still seems like a good way to collect and clean data before saving it in a DB. The possibility of visualization is also nice to have.

I hope someone has a recommendation for a noob-friendly database that is worth putting some effort into.

Regards
Bastian

For a time series database Influx is the way to go. It will take some learning though.

I use Node-RED to receive data from Mosquitto and write this data to an SQLite database. The data is time series based, and if I were starting this again, I'd probably look at Influx rather than SQLite, since Influx is better suited to that kind of data.
But SQLite works for me and currently I have no reason to change.
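
To give an idea of what storing time series in SQLite can look like, here is a minimal sketch in Python using only the standard library. It is not my actual Node-RED flow; the table name `readings`, the column names and the helper functions are made up for illustration, and timestamps are assumed to be plain Unix seconds.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    # Hypothetical schema: one row per measurement.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings ("
        " ts REAL NOT NULL,"      # Unix timestamp in seconds (assumption)
        " topic TEXT NOT NULL,"   # e.g. the MQTT topic the value came from
        " value REAL NOT NULL)"
    )
    # An index on (topic, ts) keeps time-range queries per topic fast.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_readings_topic_ts"
        " ON readings (topic, ts)"
    )
    return conn


def store_reading(conn, ts, topic, value):
    """Insert one measurement; the 'with' block commits the transaction."""
    with conn:
        conn.execute(
            "INSERT INTO readings (ts, topic, value) VALUES (?, ?, ?)",
            (ts, topic, value),
        )


def readings_between(conn, topic, start, end):
    """Return (ts, value) rows for one topic with start <= ts < end."""
    cur = conn.execute(
        "SELECT ts, value FROM readings"
        " WHERE topic = ? AND ts >= ? AND ts < ? ORDER BY ts",
        (topic, start, end),
    )
    return cur.fetchall()
```

Passing a file name instead of `":memory:"` to `open_store` keeps the data on disk. This works fine for modest data volumes; things like retention policies or downsampling would have to be built by hand, which is where a dedicated time series database saves work.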

The discussion here may be useful ...

I think the best choice really depends on the use case. By this I mean that the data you're gathering or producing matters, but it's more important to know what you are planning to do with it.

In my view, there's no one size fits all answer to your question.

I'll give you my view on several choices I consider for a "typical" app:
1.- Node-RED msgs always have a time/date component: if you need to treat them from that angle, I tend to store them in an Influx instance.
2.- If some of the messages produced by Node-RED represent a state that has to be accessed in a cache-like way, I tend to store them in a Redis instance.
3.- If I want a "transactional" view of the data (messages), I tend to save them in a Postgres or a Mongo instance. Which of the two really depends on how many operations (joins) I am planning to do with the stored data.
4.- If I want #3 but don't want to take care of another server, I tend to use SQLite.
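
To make the difference between #1 and #2 a bit more concrete, here is a small Python sketch. It uses plain in-memory structures as stand-ins for a time series database and a cache, so nothing here is the Influx or Redis API; the class `DualStore` and its method names are invented for this example.

```python
from collections import defaultdict


class DualStore:
    """Keeps every message as history and also the newest value per topic."""

    def __init__(self):
        self.history = defaultdict(list)  # stand-in for a time series DB
        self.latest = {}                  # stand-in for a cache

    def save(self, topic, ts, value):
        # Use case 1: keep every message together with its timestamp.
        self.history[topic].append((ts, value))
        # Use case 2: remember only the newest state per topic.
        current = self.latest.get(topic)
        if current is None or ts >= current[0]:
            self.latest[topic] = (ts, value)

    def state(self, topic):
        """Return the most recent value for a topic, or None if unknown."""
        entry = self.latest.get(topic)
        return None if entry is None else entry[1]

    def series(self, topic):
        """Return all (ts, value) pairs for a topic, ordered by time."""
        return sorted(self.history.get(topic, []), key=lambda row: row[0])
```

The same message goes into both structures: `series()` answers "what happened over time" and `state()` answers "what is the machine doing right now". Which real database sits behind each of them is then a separate decision.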

And the list can go on....

The point is: think of the use cases, then choose whatever best fits your needs. It might very well be that you end up using a mix of the options mentioned above. You can always save the data twice, if needed :)

Hi Basti,

I think you're absolutely right when it comes to the quality of Node-RED for data collection and cleaning.

For visualization we work with Grafana, and therefore we chose TimescaleDB over InfluxDB. If you're interested, you can find our reasons for that decision here: Why we chose TimescaleDB over InfluxDB

Regards
Anton
