Has anybody done research comparing resource usage when creating nodes using subflows vs writing the same functionality as a single node in JS?
I'm wondering if it's possible to estimate the performance improvement—expressed as a percentage—for each Node-RED subflow that is replaced by a custom node written in pure JavaScript. The idea is to focus specifically on the overhead introduced by using subflows compared to native JS nodes, assuming all other logic remains the same.
If a reasonably consistent approximation exists—perhaps varying slightly depending on the target architecture or Node.js binary—we could use it in a Node-RED plugin to suggest flow optimizations. For instance, the plugin might say:
"This subflow could be rewritten as a JavaScript node, potentially reducing memory usage by 15%. Would you like to use an AI model to help generate it?"
It would be a plugin like Google Lighthouse, but for Node-RED flows.
My hypothesis is:
Subflows introduce measurable overhead (memory, CPU, increased message-passing time) compared to equivalent native JavaScript nodes, even when implementing identical functionality.
I would also suspect that subflows have an overhead, just as flows have an overhead compared to pure JS code. It's the trade-off between visual and textual. How much overhead that is - I don't know.
I have never looked but I wonder how much is optimised when a flow is executed, for example, are junction nodes removed by linking nodes directly before the flow is started?
How much of the structure of the flow is maintained, i.e., how are messages sent between nodes - does a node consult a lookup table before sending, or does it send directly to other nodes because it knows what to call? This is something that I did away with in Erlang-Red - the flows send messages directly to the nodes and the overall flow structure is dissolved on the server, i.e., I can't reconstruct the original flow because all nodes become processes and they send their messages to a process id and not a node id.
Are nodes "inlined" before execution? Is there a type of compile step where node code might be inlined into other nodes? For example, is the code for subflows "inserted" into the flow at the places where it is referenced?
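To make the dispatch question concrete, here is a toy JavaScript sketch of the two strategies being asked about - a per-send lookup table versus wiring resolved once into direct references. This is purely illustrative and not based on Node-RED's actual internals; all names here are made up:

```javascript
// Strategy A: lookup table -- senders only know target IDs and
// consult a routing table on every send.
const nodes = {};

function sendViaTable(wireIds, msg) {
  for (const id of wireIds) nodes[id].receive(msg);
}

// Strategy B: direct references -- wiring is resolved once (e.g. at
// deploy time), so sending becomes a plain function call chain.
function link(targets) {
  return (msg) => { for (const target of targets) target.receive(msg); };
}

// Tiny demo node that just records what it sees.
function makeNode(id) {
  const node = { id, seen: [], receive(msg) { this.seen.push(msg); } };
  nodes[id] = node;
  return node;
}

const a = makeNode("a"), b = makeNode("b");
sendViaTable(["a", "b"], { payload: 1 }); // routed via the table
const sendDirect = link([a, b]);          // wiring resolved once
sendDirect({ payload: 2 });               // routed via direct refs
```

Strategy B is essentially what dissolving the flow structure on the server means: once the references are captured, the routing table is no longer needed to deliver a message.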
I guess it would be a great research topic: building a compiler for visual code (not just NR but a more generalised form of visual code).
I stopped using subflows once Node-RED introduced groups & link-calls, which give the same reusability.
Subflows have various limitations & peculiarities, but to your point, they impact performance since they are actually just macros: each subflow instance is a replica of the subflow template, i.e. if a subflow is composed of 30 nodes and is instantiated 10 times in the flow, in effect it creates 300 nodes (the node count multiplies with every instance). This consumes memory and also increases the flow size and its load, initialization & save time.
In contrast, when you define a reusable group and call it from multiple places with link calls, it is instantiated only once.
I wasn't considering the duplication of the nodes of a subflow when you duplicate the subflow node. Thank you for reminding me about that.
I was really just thinking about comparing 1 node with 1 subflow node, both with the same functionality, and measuring the overhead the subflow implementation adds.
I will try to do this research later, once I finish my framework for writing nodes and rewrite the Salesforce nodes. Then I will use the results to convince people to try implementing nodes using my framework.
I had never used subflows for that reason, however I had a lot of dashboard1 groups with the same ui nodes repeated many times. Creating a subflow for this reduced the clutter from around 16 nodes to 1 subflow node. The fact that duplicating the subflow increases resource use is not an issue as the ui nodes existed before anyway.
Could be tested by spamming a huge amount of messages and measuring the delta time between the first message at the start and the last at the finish. Do that for the node and the subflow individually. I think the overhead for such a simple case would be negligible. The overhead of duplication is perhaps more considerable. I really wish we could have a singleton subflow. I much prefer subflows over link calls because they are cleaner and easier to apply. For example, I have most I/O in subflows with one success output and one error output. That's 3 link calls to place, instead of a single subflow node. Furthermore, I have automated logging to separate files based on flow name. That's much more cumbersome and manual to achieve with link calls. So I am happy to pay the performance cost for readability. This is JavaScript after all.
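The timing experiment described above could be sketched roughly like this in plain Node.js - a hypothetical micro-benchmark where extra wrapper functions stand in for whatever indirection a subflow adds. None of this is real Node-RED code, so it would only hint at relative overhead, not give real numbers:

```javascript
// Wrap the same handler in extra layers of indirection, a crude
// stand-in for subflow overhead (assumption, not Node-RED internals).
function makeWrapped(handler, layers) {
  let fn = handler;
  for (let i = 0; i < layers; i++) {
    const inner = fn;
    fn = (msg) => inner(msg); // each layer adds one extra call
  }
  return fn;
}

// Spam n messages through fn and return elapsed milliseconds:
// "delta time between first at start and last at finish".
function bench(fn, n) {
  const start = process.hrtime.bigint();
  for (let i = 0; i < n; i++) fn({ payload: i });
  return Number(process.hrtime.bigint() - start) / 1e6;
}

const handler = (msg) => msg.payload * 2;
const n = 1_000_000;
console.log("direct :", bench(handler, n).toFixed(1), "ms");
console.log("wrapped:", bench(makeWrapped(handler, 3), n).toFixed(1), "ms");
```

In a real test you would of course run both variants inside the Node-RED runtime (e.g. with inject/debug nodes or the test helper) rather than in a bare loop like this.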
Funny you should say that: in Erlang-Red - since everything is an Erlang process and everything has a message queue - everything is a singleton. How does this work?
Each node becomes a separate process that lives somewhere in the Erlang machine. Each process is equivalent to a Linux process in that they are completely independent and execute concurrently. Processes communicate by sending messages - Erlang messages, not Node-RED messages.
Each process has a message queue which the process works off message by message, i.e. handling a message from the queue is atomic for that process. A single process never handles its own messages concurrently - those are handled serially. Across processes, however, messages are handled concurrently. With many 1000s of processes, you have concurrency - Erlang processes are super lightweight and the intention is to spin up as many as you can.
But how does this not lead to chaos? Well, that's where flow based programming is such a good fit - because it visually organises the paths and the processes - crazy how well the two fit.
Also another thing to note. When a process sends another process a message, it can either send it:
sync - i.e. wait for a response: the message is sent to the callee process, the callee sends the caller a message back, the reply works its way through the caller's message queue and the caller has its answer, or
async - i.e. don't wait for an answer: the message just lands in the message queue of the callee to be handled sometime in the future.
Erlang-Red sends node messages asynchronously between Erlang processes. Each Erlang-Red message becomes an Erlang message (i.e. an object with msgid as its single attribute is the smallest Erlang-Red message) and is passed between Erlang processes - not nodes.
So if I want a singleton, then I create a node. Done. That node can have state and will update that state atomically per message. There is only one process per node. Nodes can create more processes, for example the exec node creates three other processes to handle command execution. But each node has one process and the process does not go down.
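The "one process, serial mailbox, atomic state update per message" idea can be loosely mimicked in JavaScript - with the big caveat that this is a cooperative, single-threaded analogy I made up for illustration, whereas real Erlang processes are preemptively scheduled and truly concurrent:

```javascript
// A toy actor: a singleton with private state and a mailbox that is
// drained one message at a time, so state updates are always serial.
class Actor {
  constructor(handler, initialState) {
    this.handler = handler;
    this.state = initialState;
    this.queue = [];
    this.running = false;
  }
  send(msg) {
    this.queue.push(msg);         // message lands in the mailbox
    if (!this.running) this.drain();
  }
  drain() {
    this.running = true;
    while (this.queue.length) {
      const msg = this.queue.shift();
      this.state = this.handler(this.state, msg); // atomic per message
    }
    this.running = false;
  }
}

const counter = new Actor((state, msg) => state + msg.inc, 0);
counter.send({ inc: 1 });
counter.send({ inc: 2 });
console.log(counter.state); // 3
```

Because all sends funnel through one queue, there is never a race on `state` - which is exactly why a single node-process makes a natural singleton.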
Erlang also encourages processes to fail. The idea in Erlang is that it is better to fail than to try to cover every single possible edge-case. So you end up designing strategies for recovery, not for error handling.
There are a number of other features that make Erlang and the concepts of Node-RED such a good fit - immutability of data structures, for example. Erlang makes many things simpler. I will hopefully have an Erlang version of @cymplecy's MQTT broker done today - that might illustrate what I mean.
Erlang has no classes, only modules. They are stateless collections of functions.
Erlang is also functional, so yes, functions can be passed around on messages.
Erlang also has transparent process distribution: I can literally build an Erlang system locally, test it and run it. I then set up multiple instances of the BEAM (Erlang's virtual machine, analogous to the JVM in Java) and the processes can be distributed across machines transparently, with no change to the codebase. The BEAM does the heavy lifting of sending messages to wherever the process might be.
Now imagine how cool it would be to have a single node red flow but to be able to say: this flow tab should run on that instance but only if that instance is available. So that not only does one get concurrency for free (node flows) but also distributed computing (flow tabs).
And all that compatible with Node-RED
EDIT: I wrote somewhere about how Erlang-Red is a cross-pollination of ideas: there is nothing stopping someone from taking these ideas and implementing them in NodeJS. An example, the supervisor behaviour is inherent in Erlang, so I create a Supervisor node that can be installed in Node-RED and Erlang-Red but that is only usable in Erlang-Red. But there is no one stopping someone else from implementing that in NodeJS.
Yes, and a collection of flows (a bit of scrolling on that page) that test - or, better said, verify - the basic functionality of the core nodes. So that when I implement Erlang-Red I can verify that its behaviour matches that of Node-RED. For regression testing these tests aren't too bad either.
Of course, it would be great if these test flows became an "official" Node-RED thing so that development on both Erlang-Red and Node-RED is testable and regression safe. Plus it would be a good way to "standardise" the set of core nodes that make up Node-RED - a kind of visual RFC/IEEE collection.
Btw the unit test nodes are also available for Node-RED and should pretty much work as they do in Erlang-Red.
EDIT: for those that know my software license and dislike it: I did change the license of Erlang-Red to GPL-3, so it's now "safe" for large corporations to use.
Cool! But I would love to have the test environment also, for example shadow copies of flows, or otherwise setting up test flows in a separate area of Node-RED.
Sure, a test environment is nice, but it also requires server changes that I'm not willing to make because I'm busy over at Erlang-Red.
If the community wants it then they can make it or buy my time - there is a sponsor button at GitHub that apparently also works - no one has tested it yet!
EDIT: I don't want to be rude but I see it like this: if you want it that's great, but I don't need it (for Node-RED) so there is a parting of ways there. The question becomes: why should I invest time and effort in something that someone might or might not use but definitely won't pay for? Even gaining stars at flows.nodered.org is too difficult for most folks. So I do my things and share those, and if folks want other things, well, there is this stuff called money and that can also be shared.
Which does make me wonder for whom am I developing software?
I remember the days before software was defined by its license, the days when the GPL was an attempt to avoid having licenses - meant as protection against the corporate takeover of software ... and now look where we are. Users rent software and seem to be happy to - looking at you, Adobe. Microsoft shows advertising in its operating system.
So let the lawyers twitch, I'm busy watching the ads on my windows box ....
It’s your software so you can use whatever license you like. But if we want to adopt it into the core of Node-RED then we know that there are other corporate users out there that may be less happy to use it. It’s hard enough to get them to accept open source so any extra reservations (perceived or otherwise) don’t help.