Huge number of devices with Node-RED via MQTT

I need some expert guidance. If I want to connect 10k or more devices to Node-RED via MQTT, what is the best approach? I also want to show statistics on the dashboard for each and every device. You can assume the devices are ESP8266 or Raspberry Pi.

See "Will there be any performance hit because of using many MQTT nodes"

I wouldn't have thought it was practicable to have one dashboard showing statistics for 10000 devices

I'd have thought that feeding 100 devices into one computer might be possible, and then sending aggregate information from each group of 100 on to one central computer

But it would all depend on the frequency of messages being generated
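To put rough numbers on that, here is a back-of-envelope calculation in Python. The publish intervals are made-up examples, not figures from this thread, and it assumes every device publishes one message per interval.

```python
# Back-of-envelope load estimate. The intervals below are hypothetical
# examples; substitute the real publish rate of your devices.

def messages_per_second(devices: int, interval_seconds: float) -> float:
    """Average inbound rate if every device publishes once per interval."""
    return devices / interval_seconds


def messages_per_day(devices: int, interval_seconds: float) -> int:
    """Total messages per day at the same publish interval."""
    return round(devices * 86_400 / interval_seconds)


for interval in (1, 10, 60):
    rate = messages_per_second(10_000, interval)
    per_day = messages_per_day(10_000, interval)
    print(f"every {interval:>2}s: {rate:,.1f} msg/s, {per_day:,} msg/day")
```

Even at one message per minute per device, 10k devices produce millions of messages a day, so the publish interval largely decides how much hardware and what kind of storage is needed.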

Simon

Do you have 10k topics, one for each MQTT node?

  1. If so, you need to configure 10k MQTT nodes. That is not possible in a single flow, so you would have to use multiple flows.

  2. You will have a very hard time configuring the flows: dragging and dropping nodes, assigning topics, etc.

  3. You will face the issue discussed in "Will there be any performance hit because of using many MQTT nodes".

  4. You would have to create multiple tabs, because you can't show 10k devices at once and the browser could hang. I think it will not work even with multiple tabs, because I read in some post that all tabs are loaded at once.

So, I think it is not possible to do this.

Or do you have only a few topics, say 200 to 500, from the 10k devices?

This will be possible but still not easy to configure.

One solution could be:

Implement some kind of lazy loading, where you load a limited number of topics in the UI and, on scroll or search, load the next topics and remove the ones previously displayed. That way only a few are visible at once. (I am not sure whether this kind of dynamic handling is possible with the Node-RED dashboard. If not, you should probably implement a UI of your own.)

If there are 10k topics, it is better to group them into a smaller number of topics, e.g. devices 1 - 1000 publish to Topic1, and so on. You will then have less to configure, and each message you receive should carry the original topic name, which you can use in a function node to determine where the data needs to be pushed in the dashboard.

Then you need to filter the messages you receive from MQTT down to only those for the topics currently displayed in the UI, and send those to the UI.
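A rough sketch of the paging and filtering idea, written in Python purely for illustration (in Node-RED the same logic would live in a function node). The class name, the page size and the topic layout `<type>/<device id>/<reading>` are all assumptions, not something from the original question.

```python
class VisibleDeviceFilter:
    """Forward only messages for devices currently shown in the UI."""

    def __init__(self, page_size: int = 50) -> None:
        self.page_size = page_size
        self.visible: set[str] = set()

    def show_page(self, device_ids: list[str], page: int) -> list[str]:
        """Select one page of devices; the previous page stops being visible."""
        start = page * self.page_size
        window = device_ids[start:start + self.page_size]
        self.visible = set(window)
        return window

    def accept(self, topic: str) -> bool:
        """True if the message belongs to a device on the current page."""
        # Assumed topic layout: <type>/<device id>/<reading>
        parts = topic.split("/")
        return len(parts) == 3 and parts[1] in self.visible


ui_filter = VisibleDeviceFilter(page_size=2)
all_devices = ["0001", "0002", "0003"]
ui_filter.show_page(all_devices, page=0)
print(ui_filter.accept("light/0001/state"))  # device on the current page
print(ui_filter.accept("light/0003/state"))  # device on a later page
```

Everything that `accept` rejects is simply never sent to the browser, so the UI load depends on the page size rather than on the total number of devices.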

I would never even consider trying to create flows of this magnitude by hand... do you at least have a spreadsheet or database of the 10k devices? If so, you could use it to feed a master flow which generates the worker flows and deploys them to an array of Pis.
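As a small illustration of the first step, here is a Python sketch that splits a device list into per-worker groups. The CSV column name `device_id`, the function name and the group size are assumptions; generating and deploying the actual worker flows is not shown.

```python
import csv
import io


def assign_devices(csv_text: str, devices_per_worker: int = 100) -> list[list[str]]:
    """Split the device IDs from a CSV export into one group per worker."""
    reader = csv.DictReader(io.StringIO(csv_text))
    device_ids = [row["device_id"] for row in reader]  # assumed column name
    return [
        device_ids[i:i + devices_per_worker]
        for i in range(0, len(device_ids), devices_per_worker)
    ]


# Hypothetical export of five devices, two devices per worker.
sample = "device_id\n0001\n0002\n0003\n0004\n0005\n"
for worker, group in enumerate(assign_devices(sample, devices_per_worker=2)):
    print(f"worker {worker}: {group}")
```

Each group could then be turned into the configuration for one worker, so nobody has to place 10k nodes by hand.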

As for the dashboard, you will need custom code to effectively visualize that many devices. You may want to look at the node-red-contrib-uibuilder project, which allows you to connect your own UI to the Node-RED server backend. At least it will take away some of the pain of handling the websocket communication between the browser and the server.

Personally, I would skip the dashboard initially and simply push all the readings into a time-series database like InfluxDB. Couple that with the Grafana graphing tool, and you can visualize your data very easily.
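For a feel of what pushing a reading into InfluxDB involves, here is a minimal Python sketch that formats one reading as InfluxDB line protocol (`measurement,tags fields timestamp`). The measurement, tag and field names are hypothetical, only float fields are handled, and actually sending the line to the database is left out.

```python
def _escape(text: str, specials: str) -> str:
    """Backslash-escape the characters that are special in this position."""
    for ch in specials:
        text = text.replace(ch, "\\" + ch)
    return text


def to_line_protocol(measurement: str, tags: dict, fields: dict, timestamp_ns: int) -> str:
    """Format one reading as a single line-protocol line (float fields only)."""
    tag_part = "".join(
        f",{_escape(key, ',= ')}={_escape(str(value), ',= ')}"
        for key, value in sorted(tags.items())
    )
    field_part = ",".join(
        f"{_escape(key, ',= ')}={float(value)}" for key, value in fields.items()
    )
    return f"{_escape(measurement, ', ')}{tag_part} {field_part} {timestamp_ns}"


# Hypothetical reading from one device.
line = to_line_protocol(
    "sign_reading", {"device": "0001"}, {"wind": 3.2}, 1700000000000000000
)
print(line)
```

Using the device ID as a tag means Grafana can filter or group by device without a separate measurement per device.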

Hopefully the topic structure is designed a bit more sensibly than that and you can use wildcards in the topics to subscribe to multiple topics with a single node.

For example, (given we have absolutely no context for the original question), if you had devices publishing to:

light/0001/state
light/0002/state
light/0003/state
...

Then a single MQTT node subscribed to light/+/state would get all of those messages.

A well-designed topic structure is vital when designing an MQTT architecture that needs to scale.
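To show how the wildcards behave, here is a small Python sketch of MQTT topic filter matching, covering `+` (exactly one level) and `#` (all remaining levels, last position only). It is a simplified illustration, not a broker implementation: for instance it ignores the special handling of topics that start with `$`.

```python
def topic_matches(subscription: str, topic: str) -> bool:
    """Return True if `topic` is matched by the MQTT filter `subscription`."""
    sub_levels = subscription.split("/")
    topic_levels = topic.split("/")
    for i, sub in enumerate(sub_levels):
        if sub == "#":
            # Multi-level wildcard: only valid as the last level of the filter.
            return i == len(sub_levels) - 1
        if i >= len(topic_levels):
            return False
        if sub != "+" and sub != topic_levels[i]:
            return False
    return len(sub_levels) == len(topic_levels)


def device_id(topic: str) -> str:
    """Pull the device ID out of a light/<id>/state style topic."""
    return topic.split("/")[1]


for t in ("light/0001/state", "light/0002/state", "light/0002/brightness"):
    print(t, topic_matches("light/+/state", t))
```

With a layout like this, the receiving side recovers the device ID from the topic of each message, so one subscription can serve every device of that type.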


Thanks for the advice.
If I skip the dashboard with real-time data and store the data in a database first, I still need some expert knowledge to build a scalable solution, because the data is huge and very frequent. Please suggest the best approach for pushing data into a database via Node-RED.

@piyushbarua In my opinion, Kafka or Cassandra could be good options for storing a huge amount of data.

You will need to share a lot more information if you want specific help. Personally, I wouldn't use Node-RED at all for this. It has overheads that you will want to avoid. It would be better to use a tool that can consume/transform the MQTT topic data directly to a database.

But before you go anywhere, you need to work out what output data structures you want. So far, you've not given any clue as to what you want to see on your dashboard nor whether you want moment-in-time reporting or timeseries - this will also impact what database technology you want to use.

Also impacting choice of database and architecture will be the volume of data you actually need to store.

Finally, you need to think carefully about the presentation layer. Is Node-RED Dashboard really the right tool? Probably not if you really need to present a lot of data. Something like Grafana would likely be better suited. It is better at handling really large datasets for dashboards. Alternatively, something custom without the overheads of the Dashboard.

With some more info, we can possibly help you drill down to a smaller choice of technology and architecture. It is an interesting problem after all.

We have deployed mobile signs at different locations. These signs have different sets of sensors, and we get data for position, location, sunlight, wind and radar. All the data arrives in JSON format. We use different topics for the data because it is too large to send in a single message. We created a service that subscribes to the data and pushes it into a database. So far this is working fine, but the number of devices is increasing over time. We need a scalable solution with near-real-time monitoring. Node-RED seems fast enough, so I want to move my current solution to it, but I have also run into some difficulties with its limited features. Please also suggest some tools for saving data from MQTT to a SQL or NoSQL database.

Guys, as I said, I need the best approach to handle this problem. Node-RED is the priority.

We all like someone to solve our problems for us :)

But you are proposing something that is way beyond most (if not all) people on this forum's use of Node-RED

What sort of project is this BTW - commercial? research?

Simon

As Nick suggested above - one of the main keys is a well chosen topic tree to allow you to make best use of wildcard subscriptions, so that you can aggregate the data from one type of thing (whether that be by device or by reading type). Then feed as directly as possible to an appropriate database.
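As a sketch of the "feed as directly as possible to a database" part, the Python below buffers incoming messages and writes them in batches, which usually copes with high message rates better than one insert per message. SQLite is used only so the example is self-contained. The table layout, the class name and the topic layout `sign/<device id>/<reading>` are assumptions, and the MQTT subscription itself is not shown.

```python
import json
import sqlite3


class BatchWriter:
    """Buffer readings and insert them into the database in batches."""

    def __init__(self, conn: sqlite3.Connection, batch_size: int = 500) -> None:
        self.conn = conn
        self.batch_size = batch_size
        self.pending: list[tuple[str, str, str]] = []
        conn.execute(
            "CREATE TABLE IF NOT EXISTS readings "
            "(device_id TEXT, reading TEXT, payload TEXT)"
        )

    def add(self, topic: str, payload: dict) -> None:
        """Queue one message; flush once the batch is full."""
        # Assumed topic layout: sign/<device id>/<reading>
        _, device_id, reading = topic.split("/", 2)
        self.pending.append((device_id, reading, json.dumps(payload)))
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        """Write all queued rows in a single transaction."""
        if not self.pending:
            return
        with self.conn:  # commits on success, rolls back on error
            self.conn.executemany(
                "INSERT INTO readings VALUES (?, ?, ?)", self.pending
            )
        self.pending.clear()


conn = sqlite3.connect(":memory:")
writer = BatchWriter(conn, batch_size=2)
writer.add("sign/0001/wind", {"speed": 3.2})
writer.add("sign/0001/radar", {"count": 4})  # second message triggers a flush
print(conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0])
```

In a real service `flush` would also need to run on a timer, so that a quiet period does not leave readings sitting in memory.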