Performance issues with large project – need optimization advice

Hi everyone,
I'm working on a fairly large Node-RED project (portable version), and I'm running into hardware limitations, especially in terms of CPU and memory usage.

My main concern is about flow optimization, particularly when it comes to subflows. I have a subflow that I need to reuse multiple times throughout the project. Currently, I copy and paste it every time I need it. What I'm not clear about is:

  • Does each instance of a subflow consume separate memory? Or does Node-RED reuse the definition and only keep lightweight instances?
  • Am I duplicating resources every time I copy and paste a subflow, or is it more efficient than repeating the same nodes manually?
  • For optimization purposes, is it better to reduce the total number of nodes, reduce the number of subflows, or avoid repeating subflows altogether?
  • What's more efficient: repeating a subflow multiple times, or having one more complex and centralized node (or set of nodes) that performs the same function?
  • Are there any best practices for these situations when a flow becomes very large and starts to impact performance?

Any advice on how to optimize or better understand how Node-RED handles memory and CPU would be greatly appreciated.

Thanks in advance!

PS: For example, is this:

[flow screenshot]

better than this?

[flow screenshot]

Yes. To avoid that, you would need to use a flow with a link-call node.

I think it is probably just about the same. But I've not tested that.

Yes. :wink:

It is hard to predict without going through the processing details. But I would generally expect that a single central node, executed via link-call, would be best.

Here are a few things that can cause issues with large flows:

  • Too many cases where you have multiple wires connected to a node's output port (each extra wire forces the message to be cloned).
  • Overuse of Debug nodes.
  • Overuse of JSONata (see the sketch below).
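
To illustrate the JSONata point (a hedged example, not from the thread): JSONata expressions are evaluated for every message and are generally slower than equivalent plain JavaScript in a Function node. A Change node computing $sum(payload.readings) could instead be:

```javascript
// Function node equivalent of the JSONata expression $sum(payload.readings).
// Plain JavaScript, no per-message expression evaluation overhead.
msg.payload = msg.payload.readings.reduce((sum, v) => sum + v, 0);
return msg;
```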

To understand that, you need to understand how Node.js works.

One final piece of advice: if the above doesn't help, you should probably look for alternative ways of processing things. For example, does everything HAVE to go through Node-RED? Are there more efficient tools? (e.g. using Node-RED as a prototyping tool and then moving to more direct coding, or using Node-RED as an orchestrator with processing offloaded to other services). File handling would be a great example: Node.js is never going to be as efficient at bulk handling of files as a tool specifically designed to handle them.


If you count all the nodes in each subflow and multiply that by how many instances you have in your project, what number do you get?

E.g. subflow 1 has 20 nodes and the project has 5 instances = 100

Subflow 2 has 10 nodes and the project has 10 instances = 100

Subflow 3 has 25 nodes and its only instance sits inside subflow 2, so it is replicated once per subflow 2 instance = (10 × 25) = 250
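
If it helps, here is a throwaway sketch of that arithmetic (same numbers as above):

```javascript
// Effective node count = nodes-per-subflow × number of instances.
// A subflow nested inside another is replicated once per parent instance.
const subflows = [
    { nodes: 20, instances: 5 },       // subflow 1 -> 100
    { nodes: 10, instances: 10 },      // subflow 2 -> 100
    { nodes: 25, instances: 1 * 10 },  // subflow 3: one copy in each of subflow 2's 10 instances -> 250
];
const total = subflows.reduce((sum, s) => sum + s.nodes * s.instances, 0);
console.log(total); // 450
```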

Here is another thread...


If your subflows are not storing state (like context) you could likely replace them with link-call subroutines. Those only instantiate the nodes once, no matter how many places call them.


Thanks a lot for your detailed response, I really appreciate the explanations and recommendations!

So just to confirm: if I have a subflow that I currently repeat many times, it's better to move that logic to a separate flow tab, and use a link-call node every time I need to trigger that logic, right? This sounds like a cleaner and more efficient approach than repeating the whole block multiple times.

A few follow-up questions:

  • If I use this approach, and the output from each use of the logic is different, will this create any issues? For example, if I call the same logic from two different places and each one needs a different result/output, how does Node-RED handle that?
  • If two different flows call the same logic at the same time using link-call, does it run in parallel safely? Or does one call wait for the other to finish? Just trying to understand how Node-RED processes this internally.
  • And yes, you are absolutely right — my real challenge is understanding how Node.js handles things under the hood, so I can make smarter optimization choices. I’ll also look into offloading some work to external scripts where it makes sense.

Thanks again for your help!

An example that I have:

The LDC block is repeated 39 times in my flow, and each instance has 25 internal nodes (39 × 25 = 975 effective nodes).

I stopped using subflows when link-calls were introduced into Node-RED. A subflow is like a macro - each subflow instance creates a replica of its internal nodes & context memory, thus increasing flow size and memory consumption, and also introduces some other operational limitations.

Instead of a subflow, you can define a single instance of its internal nodes, serving as a "subroutine", which you can call from anywhere in the flow, either through a static link or using a dynamic "target" property. You will need, however, to pack the input arguments into the msg, and not rely on private context memory (which is better to avoid anyway).
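
As a rough sketch of what "packing the arguments into the msg" looks like (the field names here are invented for illustration):

```javascript
// Caller side (Function node feeding the Link Call node):
// put the parameters on the msg instead of relying on flow/global context.
msg.args = { sensorId: msg.payload.id, scale: 0.1 };  // hypothetical fields
return msg;
```

```javascript
// Subroutine side (Function node between the link-in and the returning link-out):
// everything it needs arrives on the msg, so one instance serves all callers.
const { sensorId, scale } = msg.args;
msg.payload = { sensorId, value: msg.payload.raw * scale };
return msg;
```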


Thanks for your reply, that was super helpful!

I hadn’t realized that nested subflows also count towards the total node load — that’s a really interesting and important detail. Even though in my case I’m not using subflows inside other subflows, it’s still good to know how much that could impact performance.

My setup is quite large: I have 13 flow tabs, and just to give you an idea, one of those tabs has around 3000 nodes.

Based on what you said, do you think that replacing the repeated subflows with link-call nodes that reference logic stored in other tabs could significantly reduce this problem?

Thanks again for the insight!

Definitely so. When applying link-calls to other tabs, I recommend using dynamic (named) targets (as opposed to specified node ids), to protect you from node id changes (due to import, copy/paste etc.).
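
For example (from memory of the Link Call node's dynamic mode, so double-check against the docs): name the link-in node, set the Link Call node's target to dynamic, and select the target per message via msg.target:

```javascript
// Function node just before a dynamic Link Call node.
// "process LDC" is a hypothetical link-in node name, for illustration only.
msg.target = "process LDC";
return msg;
```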


Thanks again for your helpful response!

I have a follow-up question related to this approach:
What happens if some of the nodes inside the subflow I want to replace are using flow.get or global.get?

In that case, am I forced to keep it as a subflow? Or is there a specific reason why link-call wouldn’t work properly when reading from flow/global context?

Just trying to understand the limitation better. Thanks again for all the insights!

When you say "not storing state", are you referring to flow.get / flow.set and global.get / global.set?
Or are you only referring to the set operations, meaning that just reading from flow or global context wouldn't be a problem?

Subflows have their own private 'flow' context, which in my opinion creates confusion. I do not recommend using context variables at all (other than in specific cases), and you run the risk of race conditions. Better to pack the data into the msg than to .set it in one node and .get it in another node down the road.
Even the private flow context of subflows does not solve the issue. It gives concurrent subflow instances their own space, but it does not protect against races between multiple messages injected into the same subflow.
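
A minimal illustration of that kind of race (an invented example, not from the thread):

```javascript
// Node A (Function node): the racy pattern - store now, read later.
// If a second message passes through before the first reaches Node B,
// the stored value is silently overwritten.
flow.set("lastReading", msg.payload);
return msg;
```

```javascript
// Node B (Function node further down the flow): may get the wrong value.
msg.reading = flow.get("lastReading");
// Safer: skip context entirely and set msg.reading = msg.payload in Node A,
// so the value travels with the message it belongs to.
return msg;
```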


Thanks,

In our case, the main reason we’ve been using context (specifically with the filesystem storage) is to persist certain values across reboots, like when the PC restarts or there's a sudden power loss. We need some key data to survive these events, so we rely on context for that.

That said, we’ve also noticed that when the system shuts down unexpectedly (e.g. due to a power cut), the context file sometimes gets corrupted, which causes other issues when Node-RED restarts. It’s definitely not a perfect solution.

Thanks again for the insights!

It is supposed to be a file-safe operation. However, note that you can fine-tune the save interval; saves are not real-time. You may want to look at one of the alternative stores though, such as the Redis store. I've not used it myself so I'm not sure if it would be more robust, but I suspect it might be.
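
For reference (property names from memory, so worth checking against the docs), the flush interval of the localfilesystem store is set in settings.js:

```javascript
// In settings.js, inside module.exports:
contextStorage: {
    default: {
        module: "localfilesystem",
        config: { flushInterval: 10 }  // seconds between flushes of pending changes
    }
},
```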

Another alternative would be to use MQTT with retained topics. A lot of us use these since any listener node will automatically get the last posted value upon restart.
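
The flow side of that is tiny; e.g. (topic invented for illustration) a Function node ahead of an mqtt out node:

```javascript
// Publish as a retained message: the broker keeps the last value and hands
// it to any subscriber immediately on (re)connect - e.g. after a restart.
msg.topic = "site/meter1/energy";  // hypothetical topic
msg.retain = true;
return msg;
```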

To be clear, a link-call node connects to a link-in. You then have some flow and a link-out set to return. The link-call node adds a property so that the link-out knows where to return to. So it is safe to have many different link-call nodes connected to the same flow.
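
If it helps to see the idea, here is a conceptual sketch of that return mechanism in plain Node.js (a simplification for intuition, not the actual Node-RED internals):

```javascript
// Each link-call pushes a "return address" onto the message; a link-out in
// return mode pops it, so replies always route back to the right caller.
function linkCall(msg, subroutine, returnTo) {
    msg._linkSource = msg._linkSource || [];
    msg._linkSource.push(returnTo);    // remember who called
    subroutine(msg);
}

function linkOutReturn(msg) {
    msg._linkSource.pop()(msg);        // route back to the most recent caller
}

// One shared subroutine, any number of callers.
function sharedLogic(msg) {
    msg.payload = msg.payload * 2;
    linkOutReturn(msg);
}

linkCall({ payload: 21 }, sharedLogic, m => console.log("A got", m.payload)); // A got 42
linkCall({ payload: 5 },  sharedLogic, m => console.log("B got", m.payload)); // B got 10
```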

Of course, race conditions have already been mentioned and you always need to be mindful of this when working with flow-based tooling. Especially if any of the steps in the flow might involve asynchronous tasks. This usually isn't an issue unless you have a task that cannot handle inputs from multiple sources without getting confused. If that is the case, you will likely need some additional controls such as a gate or finite-state-machine.
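
A very simple gate, for instance, can be a Function node holding a busy flag in its own node context (an illustrative sketch only; a production version would queue waiting messages rather than drop them):

```javascript
// Let one message through, then block until a control message with
// msg.release set arrives. Uses the Function node's private context.
if (msg.release) {
    context.set("busy", false);   // control message: reopen the gate
    return null;
}
if (context.get("busy")) {
    return null;                  // gate closed: drop this message
}
context.set("busy", true);        // claim the gate and pass through
return msg;
```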

Everything is Node.js, and so nothing ever really happens in parallel, though it may appear to because of the clever way that JavaScript manages its event loop. With the exceptions mentioned above, link-call processing should indeed be safe - at least as safe as any other flow segment that takes multiple input streams.

Scripts or services - in some cases you might want to have a daemon background service running permanently. A good example might be an analytical process that you have already coded in Python. Wrapping that in a simple Python service that provides an HTTP or WebSocket interface - or even an MQTT-based messaging approach - would let you pass data to it easily from Node-RED. The reverse is also true, of course: it is easy enough to pass data back with an appropriate listening node.
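
On the Node-RED side, handing work to such a service is just a Function node feeding an http request node (the URL and payload shape are invented for illustration):

```javascript
// Build the request; msg.method and msg.url override the http request
// node's configured values when those fields are left blank.
msg.method = "POST";
msg.url = "http://127.0.0.1:8000/analyse";  // hypothetical Python service
msg.payload = { readings: msg.payload };
return msg;
```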


Is your file system on something like an SD card? We have had a few users reporting such problems and my suspicion is that it may be to do with SD card storage.


My solution was to switch from a Raspberry Pi to a mini PC running Node-RED in Docker. If you hit the limits of running things in parallel (JS / Node.js limits), just clone your container and split the project, if possible.


@Prompt-Sopa this could help you if it gets implemented. I don't like the option of turning subflows into flows and using link-call nodes; it disturbs the reading of a flow.


Julian,

If I run multiple instances of Node-RED on the same machine (on different ports), does each use a separate Node.js instance and a separate processor, or is it still the same problem?

In my case I usually fire up a separate virtual machine so I know I have separate memory, processor, file system etc. - but I am interested in this. Similarly, I would imagine that if I went lighter with multiple containers on the same VM I would get additional processor use?

I currently use the separate VM approach on a number of intensive processes that I run and it works well - with MQTT between them

Craig


Yes, each will be a new process
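
For example (property name from memory, worth double-checking): start each instance with its own userDir and give each a distinct port in its settings.js:

```javascript
// settings.js for the second instance (pointed at its own userDir):
module.exports = {
    uiPort: 1881,  // the first instance keeps the default 1880
    // ...rest of the settings unchanged...
};
```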


Absolutely, they are separate instances. They might not use a separate processor, but I believe they will generally use a separate core. Certainly a separate thread.

Using multiple instances is certainly an option to spread the load. This is called "horizontal scaling". Of course, you might run out of other resources such as memory though. So not a silver bullet.

Multiple VMs is real overkill and uses up even more resources, even if using a lightweight VM. Node.js is designed around a microservice architecture, so horizontal scaling works pretty well for the most part (within the limitations of available memory and processor cores). No need for VMs.

MQTT is a good call though for inter-process comms.


Thanks Julian, that confirmed some of the areas where I was a little vague.

Previously, as I have had a cluster of ESXi machines at home, it was easy to break out the more intensive processes (that I wanted to be rock stable and unchanging) and put them into their own VMs. For example, Modbus energy meter querying, which happens every 200 ms for 4 different meters and every second for my 5 inverters: I broke all of that out into its own VM and it has not missed a beat in more than 2 years now.

I might come back and revisit this and bring them down to containers when I am due to refresh the hardware at the end of this year.

Craig
