We are facing a very strange issue. In our case any deploy (modified / node-level deploy) takes a long time (20 to 50 mins). The deploy itself finishes relatively quickly, within a few mins, but it takes a long time to get control of the editor back. We enabled the trace logging level, but we are unable to work out where the time is being spent. It looks like some core process rather than something we implemented in our flows (not sure if my assessment is correct).
As can be seen in the picture below, the process took 30 mins before control returned to the editor.
Until 20:00:04 the various flows were still loading. Then it reached some runtime event at 20:29.
What device are you running NR on?
How much memory does it have?
What else is running on it?
How big is your flow?
What version of NR and Node.js?
What do your flows do?
What device are you running NR on? --> Windows Server 2016
How much memory does it have? --> 64 GB
What else is running on it? --> Other in-house applications. I can stop some of them if we want to run a test.
How big is your flow? --> Flow file is around 3 MB
What version of NR and Node.js? --> NR is 2.1.4, Node.js is v14.17.3
What do your flows do? --> The flows are quite complicated and do multiple things, but none of the complicated flows are "inject at system start". They are all triggered manually and separately. Even in the picture, I clicked a simple flow (inject + debug) 10 mins after clicking deploy. Still, the result was only visible a good 30 mins later.
You could also connect a debugger (e.g. Chrome developer tools) to your Node.js instance and do a CPU profile. Then you can see exactly where it is spending all its CPU time.
Bart
Yes. We have local DB on the same machine which we access. No access to internet.
Not clear on query.
The flows do multiple things: connect to the DB, move files, update the DB, etc.
We have around 30 tabs. And 30 separate subflows. But the point here is that none of them starts on deploy.
Mostly modified flows or modified nodes. In this example it was modified nodes.
Let me check this.
Will do.
Checked this. The system sits at around 50% CPU and 50% memory, so I doubt it is a resource problem.
Not that I am aware of. Will recheck.
The test for this post was - move 1 node. So yes.
But does anyone know what happens between flushing the trace of the flows loading and the "runtime event"?
I am trying to understand whether it is something at NR core level or something in the auto-scheduled flows.
Because the log is not showing anything in these 30 mins, and all my flows have some debug output or other.
Node.js runs JavaScript on a single thread, so it effectively uses a single core. If that single core is maxed out, this is your issue.
If I were to guess, something in your flows is maxing out the CPU core it is running on. Probably a loop of some kind, or perhaps you have something running upon deploy (e.g. do any of your function nodes have code in "on start" or "on close"?)
Without access to your flows it is mostly guesswork; however, you can make strides yourself. I would start by disabling various flow tabs until you get something sensible. And to save you waiting half an hour between deploys, I would do this in reverse. What I mean by that is: start Node-RED in safe mode, disable all but one tab, then deploy. If the deploy is snappy, enable another tab, deploy, and so on until it becomes unresponsive.
This is what I have started to do. Please note the 50% CPU figure is taken from Windows Task Manager.
(Other non Node Red processes continue to run on this server).
Btw, does anyone know how subflows impact flow loading / deployment? Because we do have quite a few subflow calls.
Another thing you could try is disabling all but one of your tabs and seeing if that speeds things up. If it deploys quickly then enable ten more tabs. If the problem shows up you know it is somewhere in those ten, so you can eliminate them bit by bit until you find the issue.
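If clicking through the editor is too slow for this kind of bisecting, the same thing can be sketched against an exported copy of the flows. This is only a rough sketch: it assumes the export is a JSON array in which each tab is an object with `type` set to `tab`, a `label`, and an optional `disabled` flag. The helper name `keep_only_tabs` and the sample tab labels are made up for illustration. Work on a backup copy, not the live flow file.

```python
import json

def keep_only_tabs(flows, enabled_labels):
    """Return a copy of the flow list with every tab disabled except the named ones."""
    result = []
    for node in flows:
        node = dict(node)  # shallow copy so the input list is left untouched
        if node.get("type") == "tab":
            # A tab stays enabled only if its label is in the allowed set.
            node["disabled"] = node.get("label") not in enabled_labels
        result.append(node)
    return result

# Tiny made-up flow list standing in for a real export.
sample = [
    {"id": "t1", "type": "tab", "label": "Simple test"},
    {"id": "t2", "type": "tab", "label": "DB jobs"},
    {"id": "n1", "type": "inject", "z": "t1"},
]

bisected = keep_only_tabs(sample, {"Simple test"})
print(json.dumps(bisected, indent=2))
```

Re-running this with a growing set of labels gives one flow file per bisect step to import and deploy.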
Doing that.
Started with all tabs disabled.
Life looked good. Deploy was almost instantaneous.
Started enabling them 1 by 1.
I finally started seeing a pattern. If the tab being enabled had a subflow call, that started adding to the deploy time. The initial tabs had just 1 subflow call or a few (fewer than 5).
On the last tab enabled, the deploy time jumped from a few seconds to over 5 mins. That tab has around 60 injects, each calling a subflow. (Many of these injects call the same subflow, but with different params.)
(Each inject is connected to its own subflow instance to avoid instances of the subflow interfering with each other.)
It seems that the more subflow calls we have, the more it impacts the deploy time. I am not yet clear why it should. Is this a known behaviour? Does anyone know?
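To see which tabs carry the most subflow calls without counting by hand, something along these lines could be run against an exported flow list. It is a sketch under assumptions: subflow instances are taken to be the nodes whose `type` starts with `subflow:` and whose `z` points at a tab id. The function name and the sample data are invented for the example.

```python
from collections import Counter

def subflow_instances_per_tab(flows):
    """Count subflow instance nodes placed directly on each tab, keyed by tab label."""
    tabs = {n["id"]: n.get("label", n["id"]) for n in flows if n.get("type") == "tab"}
    counts = Counter()
    for n in flows:
        # "subflow:<id>" is an instance; plain "subflow" is the definition itself.
        if str(n.get("type", "")).startswith("subflow:") and n.get("z") in tabs:
            counts[tabs[n["z"]]] += 1
    return counts

# Made-up flow list: two tabs, one subflow definition, two instances on one tab.
sample = [
    {"id": "t1", "type": "tab", "label": "Schedulers"},
    {"id": "t2", "type": "tab", "label": "Simple test"},
    {"id": "sf1", "type": "subflow", "name": "Move file"},
    {"id": "a", "type": "subflow:sf1", "z": "t1"},
    {"id": "b", "type": "subflow:sf1", "z": "t1"},
    {"id": "c", "type": "inject", "z": "t2"},
    {"id": "d", "type": "function", "z": "sf1"},
]

counts = subflow_instances_per_tab(sample)
for label, n in counts.most_common():
    print(label, n)
```

This only counts instances sitting directly on a tab; instances nested inside other subflows are not included.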
Some background: If you have a subflow that contains 100 nodes, then you add 100 instances, this expands to 100 * 100 (10000) additional nodes to set up, destroy, and re-create when a deploy happens. If each of those subflows has initialisation code, that has to run for every one of them.
This is what I think is happening. I have simple flows and complicated flows. Many of the flows were created before the link call feature was available. I also need to check that there is indeed no state. Mostly I don't keep state, as I prefer not to work with NR flow/node-level contexts. I use the DB or global context, where I try to avoid overlaps. But I need to check.
The challenge with link call / link in etc. is that some of the subflows have parameters, and unfortunately today we don't have such a mechanism for link calls. So significant portions of "re-write" may come into play. Let's see.
(Btw, not related to the above issue but connected to link call, and only because you answered this, Steve: do you know if the Link Call enhancements (optional timeout, pass-through mechanism, etc.) will ever move ahead?)
Yes - but I would not expect it to impact "deploy". Execution, for sure. Deploy should not care what goes in and what comes out, as at that stage there is no message in the system.
And this gets worse if your subflows nest subflows.
Each instance creates all its internal nodes, and if any of those are subflow instances, they are created too. It really doesn't take much to get to huge numbers.
Example:
Subflow1: 10 regular nodes, 5 Subflow2 nodes
Subflow2: 20 regular nodes
Main flows: use 100 Subflow1s and 100 regular nodes
This means 100 + (100 * (10 + (5 * 20))) == 11100 nodes to be created, set up, initialised, etc.
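The same arithmetic can be written as a small recursive count, which makes it easy to plug in other numbers. This is just an illustration of the sum above, not how Node-RED itself does the expansion. The dictionary layout (`regular`, `uses`) and the function name are made up for the sketch, and, like the sum above, it does not count the subflow instance nodes themselves.

```python
def expanded_nodes(name, definitions):
    """Nodes created for one copy of `name`, including nested subflow contents."""
    spec = definitions[name]
    total = spec["regular"]
    # Every nested instance brings a full copy of that subflow's own contents.
    for child, count in spec.get("uses", {}).items():
        total += count * expanded_nodes(child, definitions)
    return total

# The figures from the example above.
definitions = {
    "Subflow1": {"regular": 10, "uses": {"Subflow2": 5}},
    "Subflow2": {"regular": 20},
    "Main": {"regular": 100, "uses": {"Subflow1": 100}},
}

total = expanded_nodes("Main", definitions)
print(total)  # 11100
```

One Subflow1 instance comes to 10 + 5 * 20 = 110 nodes, so 100 of them plus the 100 regular nodes gives the 11100 figure.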
Contrast that with link-calls (aka pure functions):
Subroutine1: 1 link-in, 10 regular nodes, 5 link calls to Subroutine2, 1 link-return
This was clear once you wrote the earlier note. As I said, I need to check whether I can indeed move entire subflows to link calls. I have some subflows which take parameters.