Cluster_Node_Red

Hello everyone, this is my first post.
I am currently working on a project that requires me to publish a stream of data through MQTT nodes and log them in a CSV file. This is currently the primary objective of the project. However, I am wondering if there are any ways to overcome the limitations of Node-RED, which currently uses a single thread on my resource.

I have researched the possibility of adding a cluster as a solution to separate the processes across all the cores of the CPU. I would greatly appreciate any help or guidance on this matter and eagerly await your response.

Hi @ziedbad, welcome to the forum.

Which 'limitations' are impacting your project - do you have hundreds/thousands of messages a second to process or hundreds of file writes, etc?

What is your 'resource'? Node-RED seems to run comfortably (for me) on a Raspberry Pi Zero, and of course much more powerful platforms are available.

Hi there @jbudd,
To be more precise, I am currently reading 46,800 messages from a text file and sending them using a rate limit of 39 messages per second. This process is being handled within a JavaScript function block and the resulting output will be saved in a CSV file.

My resource for this task is an industrial PC running Windows, and I have also attempted to run the same flow on a PC running Linux. My goal is to utilize all available CPU threads to achieve the best possible time result with my hardware.

I don't understand. In node-red are you trying to send the stream out, or receive it, or both?
Also if you are transferring the whole file then can you transfer the whole file in one go? Preferably using scp or rsync or something similar.


It is certainly possible to horizontally scale using multiple Node-RED instances. But personally, from what you've just described, I wouldn't be using Node-RED at all since you don't really appear to be using any of its actual features. It would be a trivial exercise to write a node.js app to do what you've described. You could then run that without the overheads of Node-RED.
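As a rough idea of how small such a standalone app could be, here is a minimal sketch of the file-to-CSV part only. It is not the poster's actual code: the input format is not shown in the thread, so whitespace-separated fields are an assumption, and `csvField`, `lineToCsvRow` and `convertFile` are hypothetical names. The MQTT publishing step is left out because it would need a client library.

```javascript
// Minimal sketch: convert a text file of messages to CSV without Node-RED.
// Assumption: each input line holds whitespace-separated fields.
const fs = require("fs");

// Quote a field only when it contains a comma, a double quote, or a newline.
function csvField(value) {
  const text = String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Turn one input line into one CSV row.
function lineToCsvRow(line) {
  return line.trim().split(/\s+/).map(csvField).join(",");
}

// Read the whole input, convert every non-empty line, write the result once.
// Returns the number of rows written.
function convertFile(inputPath, outputPath) {
  const lines = fs.readFileSync(inputPath, "utf8").split(/\r?\n/);
  const rows = lines.filter((line) => line.trim() !== "").map(lineToCsvRow);
  fs.writeFileSync(outputPath, rows.join("\n") + "\n");
  return rows.length;
}
```

Calling `convertFile("messages.txt", "messages.csv")` (both file names are placeholders) would do the whole conversion in one pass, with no rate limiting involved.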

Additionally, you imply that you have a single source file. If so, horizontal scaling is unlikely to help at all since it may be the source file that is the limiting factor.

As far as MQTT is concerned, depending on what broker you are using, 39 messages per second is literally nothing to a broker unless you've made the messages very large. Even my home automation broker is handling a constant load of around 2000 messages per minute both inbound and outbound, and that amounts to around 200-300k per minute in each direction. That barely registers on CPU/memory load (an old i5 laptop running loads of other stuff too).


Thanks for the explanation.

It is possible to run multiple Node-RED instances on a single machine. I have never done so, but I think you need to use different port numbers. Perhaps you can do it with Docker containers too.

Linux without a GUI, rather than Windows, seems like an easy way to get the most processing power on the task.

I wonder if you can get around the single threadedness of NR by hiving tasks off with exec or daemon nodes:
[Screenshot: Untitled 2]
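Outside Node-RED, the same idea of hiving work off to another process can be sketched in plain Node.js with the built-in `child_process` module. This is only an illustration of the principle, not how the exec or daemon nodes are implemented; `runInChild` and the summing workload are hypothetical.

```javascript
// Sketch: hand a CPU-heavy task to a separate Node.js process so the
// main event loop stays free while the child does the work.
const { execFile } = require("child_process");

// Run a snippet of JavaScript in a child process and resolve with its stdout.
function runInChild(source) {
  return new Promise((resolve, reject) => {
    execFile(process.execPath, ["-e", source], (error, stdout) => {
      if (error) reject(error);
      else resolve(stdout.trim());
    });
  });
}

// Hypothetical workload: sum the integers below ten million in the child.
const task = "let s = 0; for (let i = 0; i < 1e7; i++) s += i; console.log(s);";
const result = runInChild(task);
result.then((sum) => console.log(`child computed ${sum}`));
```

Several such children can run at once, and the operating system is then free to schedule them on different cores.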

Do you mean you are actually using a delay node in rate limit mode to do this? Can you not just let it run at full speed? What is the function block doing that requires it to run that slowly?

Hi, yes, I'm using a rate limit to do so, and I don't want to run it at full speed because that would defeat my objective.
The main goal for me now is to find a way to receive my data and save it in CSV files in a shorter time (PS: it takes 80 min to save 18 CSV files for 18 publishers).

Sorry for saying, but reading this thread I am wondering if we all misunderstood what you are trying to achieve.

You want it to run faster but impose a fairly strict rate limit?
Sounds like squaring the circle to me.

Care to elaborate why you want the rate limit?
And how quick is it without the rate limit?

it takes 80 min to save 18 csv files for 18 publisher

Where are the CSVs in this picture?
We see a text file and we see a bunch of MQTT nodes.
No CSVs. All of these things can be done in memory. I'm pretty sure you can process all this in a matter of seconds, including writing to multiple CSVs.
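To get a feel for the in-memory approach, here is a small sketch that builds 46,800 rows once and writes them to 18 files, timing the whole thing. The row contents are made up, the `publisher-N.csv` file names are hypothetical, and it assumes every file receives the same rows, which the thread does not confirm.

```javascript
// Sketch: build the CSV text once in memory, then write it to N files.
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical stand-in for the parsed input: rows of "index,value".
function buildCsvText(rowCount) {
  const rows = [];
  for (let i = 0; i < rowCount; i++) {
    rows.push(`${i},${(i % 100) / 10}`);
  }
  return rows.join("\n") + "\n";
}

// Write the same CSV text to `fileCount` files, one write call per file.
function writeCsvFiles(csvText, dir, fileCount) {
  const paths = [];
  for (let n = 1; n <= fileCount; n++) {
    const filePath = path.join(dir, `publisher-${n}.csv`);
    fs.writeFileSync(filePath, csvText);
    paths.push(filePath);
  }
  return paths;
}

// Time the build plus the 18 writes into a fresh temporary directory.
const outDir = fs.mkdtempSync(path.join(os.tmpdir(), "csv-out-"));
const started = process.hrtime.bigint();
const written = writeCsvFiles(buildCsvText(46800), outDir, 18);
const elapsedMs = Number(process.hrtime.bigint() - started) / 1e6;
console.log(`${written.length} files written in ${elapsedMs.toFixed(1)} ms`);
```

The elapsed time will depend on the disk and machine, so run it on the target hardware rather than trusting a number from someone else's.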

Well, 47,000 lines at 39/sec would be about 20 minutes, so it does seem likely that there is some other bottleneck than the rate limiter.
Is there scope for tuning your function code?
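For reference, the estimate above works out as follows, using the exact figures quoted earlier in the thread (46,800 messages at 39 per second); `expectedSeconds` is just a helper name for the illustration.

```javascript
// Expected wall-clock time for a rate-limited stream of messages.
function expectedSeconds(messageCount, messagesPerSecond) {
  return messageCount / messagesPerSecond;
}

const seconds = expectedSeconds(46800, 39);
console.log(`${seconds} s = ${seconds / 60} min`); // prints "1200 s = 20 min"
```

That is a quarter of the 80 minutes reported, which is why the rate limiter alone does not seem to explain the total.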

And it looks like you are sending each message to 18 mqtt-out nodes. Why?

Also don't forget that doing that causes Node-RED to make 18 COPIES of every outbound message - this will always be slow.
