Cloning performance - using rfdc instead of clone

Hi all,

So I have a use case where a single instance has 18K+ nodes across multiple tabs. To modify a single flow I use the modify-single-flow API (/docs/api/admin/methods/put/flow/). With that many nodes, the API was taking ~6 seconds to return a success. Profiling led to the conclusion that the clone library was the biggest culprit.

So I did an experiment and replaced the clone library with rfdc in the runtime module. This made quite an impact: the API (modify single flow) time dropped to ~2 seconds. My changes can be seen here on GitHub.
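
To illustrate why the choice of clone function matters at this scale, here is a rough sketch. It is not rfdc's implementation and not the code from the linked changes: `cloneDeep` and `makeNodes` are hypothetical helpers, and the node shape is made up. It assumes JSON-like data with no circular references.

```javascript
// Illustrative only: a plain recursive deep copy for JSON-like flow configs.
// NOT rfdc itself and NOT the actual patch; assumes no circular references.
function cloneDeep(value) {
  if (value === null || typeof value !== "object") return value; // primitives
  if (Array.isArray(value)) return value.map((item) => cloneDeep(item));
  if (value instanceof Date) return new Date(value.getTime());
  const out = {};
  for (const key of Object.keys(value)) {
    out[key] = cloneDeep(value[key]);
  }
  return out;
}

// Hypothetical generator: builds `count` made-up node configs.
function makeNodes(count) {
  const nodes = [];
  for (let i = 0; i < count; i++) {
    nodes.push({
      id: `node-${i}`,
      type: "function",
      z: `tab-${i % 20}`,
      name: `step ${i}`,
      wires: [[`node-${i + 1}`]],
    });
  }
  return nodes;
}

const nodes = makeNodes(18000);

// Time one full deep copy of the node list.
const start = process.hrtime.bigint();
const copy = cloneDeep(nodes);
const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;

console.log(`cloned ${copy.length} nodes in ${elapsedMs.toFixed(1)} ms`);
```

With rfdc installed, the clone function would instead come from `const clone = require("rfdc")()`. Timing both against the same data is one way to sanity-check a swap before profiling the full API call.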

My questions:

  • Is this approach correct? I am nowhere close to an expert on Node.js or Node-RED, hence the question.
  • and is this something that the community would be interested in, if I were to invest more time to make a proper PR out of it?

Thanks.

PS: as a newbie here, I can't post more than 2 links, hence the improper API link.

Hi @asr9 , welcome to the forum.

Wow. I mean WOW.

Have you not considered breaking them up across multiple instances? I assume all of the nodes and flows are a single related process? Or are you multi-purposing one instance?

It is certainly interesting.

Do you have any flame charts or profiles that can be summarised/posted for a look-see?

Looking at your changes, it looks to be a drop-in replacement, so a PR would be fairly straightforward. And if you were to raise a PR, it should be based on and targeted at the dev branch.

That might well be a record!

I can't help but feel that it is time for you to start optimising. :grin:

And certainly even higher in the runtime when using subflows!

We've only got about 6-8k nodes in bigger projects. Peanuts compared to that. :nerd_face:

Thank you @Steve-Mcl

> Or are you multi purposing one instance?

Yes, I am multi-purposing a single instance.
I will come back with profiling information. Glad to hear this can be of interest.

> I can't help but feel that it is time for you to start optimising. :grin:

I hope to get there. After profiling, this (change to rfdc) looked like really low-hanging fruit.

Ah ha, we did a video just recently that might be of use to you: Single instance vs Multiple instances

I think that the thing to remember with a tool such as Node-RED is that it has a LOT of overhead just to run the tool. This is great for prototyping and quick development but it will never be the most efficient. In many cases, this isn't an issue, but with many thousands of nodes, a visual programming tool loses a lot of its benefits. I would honestly first look for repeated patterns and rationalise those, then convert obvious processing groups into straight JavaScript code in function nodes. I'd also be looking to see whether some processing could be moved out of Node-RED altogether, perhaps into its own microservices.

So I have some flame graphs here, before and after the changes, this time using the node-red dev branch as the base. The changes are on GitHub.

Setup:

  • IDE: VS Code
  • JS debugger: nightly (vscode-js-debug)
  • CPU profiling for 30 seconds
  • 2 requests, with distinct payloads, are made to modify a given flow (PUT /flow/:id)
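
For reference, a request like the ones in the setup above could be sketched as follows. This is only a sketch: `buildFlowUpdateRequest` and `timeFlowUpdate` are hypothetical helpers, the base URL, flow id, label and node are made up, and the bearer token is only needed if admin auth is enabled. It assumes Node.js 18+ for the global `fetch`.

```javascript
// Hypothetical helper: builds the URL and fetch options for PUT /flow/:id.
function buildFlowUpdateRequest(baseUrl, flow, accessToken) {
  const headers = { "Content-Type": "application/json" };
  if (accessToken) {
    headers.Authorization = `Bearer ${accessToken}`; // only if adminAuth is on
  }
  const root = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return {
    url: `${root}/flow/${encodeURIComponent(flow.id)}`,
    options: { method: "PUT", headers, body: JSON.stringify(flow) },
  };
}

// Hypothetical helper: sends the update and measures the round-trip time.
async function timeFlowUpdate(baseUrl, flow, accessToken) {
  const { url, options } = buildFlowUpdateRequest(baseUrl, flow, accessToken);
  const start = performance.now();
  const response = await fetch(url, options);
  const elapsedMs = performance.now() - start;
  return { status: response.status, elapsedMs };
}

// Made-up example payload for a single flow (tab).
const exampleFlow = {
  id: "a1b2c3d4e5f6",
  label: "Tab 1",
  nodes: [{ id: "0123456789abcdef", type: "inject", name: "tick", wires: [[]] }],
};

const request = buildFlowUpdateRequest("http://localhost:1880/", exampleFlow);
console.log(request.options.method, request.url);
```

Against a running instance, `timeFlowUpdate("http://localhost:1880", exampleFlow)` would give a wall-clock figure to compare before and after a change, alongside the CPU profile.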

First, the flame graph for dev branch ("version": "4.0.0-dev")

And after the change to rfdc:

I have raw CPU profile files in case they are needed.

The changes so far are only in the runtime folder, and the profiling was only for the PUT /flow/:id API.

I have opened a PR for this: https://github.com/node-red/node-red/pull/4352
