Flows taking long to load and save - current flow file 7.5 MB

My flow file (with more than 50 tabs) is now 7.5 MB, and it takes a long time to load and save flows. Node-RED is running on a T3 AWS instance with 8 GB RAM. Memory usage is 22%, the SSD is about one third full, and CPU usage is 30%. What can be done to improve the loading and saving time?

Are you saving everything when you deploy? Try deploying only modified nodes (the "Modified Nodes" option in the Deploy menu).

Yes, I save only modified nodes.

Can I resolve this by changing the solution architecture in some way (currently Node-RED and the database are on the same server instance), by changing certain settings in settings.js, or maybe by switching to a compute-optimised EC2 instance? That said, it made no difference to the behaviour when I changed from a t3.large to a t3.xlarge.

Have you looked into code optimization to reduce the size of your flow file?

  1. Replace multiple switch and change nodes with a single function node.
  2. Uninstall unused nodes.
  3. Combine tabs and groups to simplify the flows.

As an example of the first item, the optimized flow uses 1 function node to replace 8 switch and change nodes.


The new flow reduces size significantly. It also looks better visually.
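
A minimal sketch of the idea, not the actual flow from this post: one function node with three outputs doing what a chain of switch and change nodes would otherwise do. The thresholds, topic names and the helper name `classify` are made up for illustration.

```javascript
// Hypothetical routing logic; the threshold (80) and topic names are illustrative only.
// Returns an array with one entry per output; null means "send nothing on that output".
function classify(msg) {
  const value = Number(msg.payload);
  if (Number.isNaN(value)) {
    return [null, null, msg];        // output 3: payload is not a number
  }
  if (value >= 80) {
    msg.topic = "alarm/high";        // what a change node would otherwise set
    return [msg, null, null];        // output 1: high values
  }
  msg.topic = "status/normal";
  return [null, msg, null];          // output 2: everything else
}

// Inside a function node configured with 3 outputs, the body would end with:
// return classify(msg);
```

Returning an array with `null` entries is how a function node chooses which output receives the message.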


First look for loops and other inefficiencies in your flows.

Then check for anywhere you are using JSONata and replace with function nodes. Then check for anywhere you are using multiple function nodes in series and reduce them down to a single function node.

Next, visually check your flows and look for anywhere where you have >1 wire coming out of a node's port. This forces a msg clone which is comparatively slow.

Also make sure that you don't have a DB loaded that is chewing up a load of memory and causing OS Paging to happen.

Also check for sub-flows - noting that each use of a sub-flow is a new instance. Replace with link call nodes.

And check for unnecessary use of context variables. These can occupy vast amounts of memory if you let them get out of hand. Using file-backed context storage with a short write interval can also cause slowdowns. Reduce the write frequency in settings.js if you can, and reduce your dependence on context variables as much as possible.
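
For the write frequency, the relevant knob is, as far as I know, the `flushInterval` option of the `localfilesystem` context store. A sketch of the relevant part of settings.js, assuming a file-backed store named "file" and an interval of 60 seconds (both are illustrative choices, not values taken from this thread):

```javascript
// settings.js (fragment): only the contextStorage section is shown.
module.exports = {
    contextStorage: {
        // Keep the default store in memory only.
        default: { module: "memory" },
        // Hypothetical store name "file": this one is persisted to disk.
        file: {
            module: "localfilesystem",
            config: {
                // Seconds between writes of cached context to disk.
                // 60 is an assumed value; tune it to your own needs.
                flushInterval: 60
            }
        }
    }
};
```

With a layout like this, only values written explicitly to the "file" store hit the disk, and at most once per interval.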


7.5 MB seems surprisingly large for 50 tabs, though of course there may be a lot on each tab.

I wonder if there's unnecessary stuff in there, such as seen fairly recently with ui spacer nodes.

It might be worth checking how many of each element type you have, this should do it:
grep '"type":' flows.json | awk '{ arr[$2] += 1 } END { for (x in arr) print arr[x], x} ' | sort -n


Thanks guys for pointing me in the right direction. There are in fact around 270 tabs. The code is repeated every 5 tabs: overall the flows read from and write to ~50 Siemens PLCs, with 5 tabs for each PLC. The code can definitely be improved. I will look for optimizations and come back with my findings.

Hi Julian, I am using InfluxDB to store the data coming in from the PLCs. I am not sure how that can be optimized. Otherwise, I am not using JSONata. I will reduce msg cloning for sure, reduce the number of nodes where they can be clubbed together, and also reduce the use of context variables. I am not using any subflows though.

Sounds like you can reduce these 270 tabs down to 5.


@bakman2 It's an insightful thought. But there is too much data to be collected from each PLC. It may be possible to club all PLCs on a single page using sub-flows, but I am not sure it will save much except the influx nodes. Every PLC has a different IP address and each needs a separate s7 comms input node. Each PLC's data has to have a separate object with a separate id (which is created in the flow itself). On the output side, each tag needs a separate node for each PLC. That's where the bulk of the nodes are used: almost 75 for each PLC, across 4 tabs.

Running that grep command gives the output '1 Daily'.

This has been a useful input as well. It has led me to learn how to use node.send() and how to use a delay within a function node. I have clubbed quite a few nodes into a single function node now. Thanks.
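
For anyone else trying the same thing, a minimal sketch of the node.send() plus delay pattern. The helper name `sendStaggered`, the values and the 500 ms gap are assumptions for illustration; `node.send()` and `node.done()` are the calls a function node provides, and `setTimeout` is available inside function nodes.

```javascript
// Send one message per value, spaced gapMs apart, instead of wiring
// several change and delay nodes in series.
function sendStaggered(node, msg, values, gapMs) {
  values.forEach((value, i) => {
    setTimeout(() => {
      // Copy the incoming msg so each outgoing message is its own object.
      node.send({ ...msg, payload: value });
      // Tell the runtime that handling is finished after the last send.
      if (i === values.length - 1) {
        node.done();
      }
    }, i * gapMs);
  });
}

// Inside a function node the body could then be (values are hypothetical):
// sendStaggered(node, msg, [10, 20, 30], 500);
// return null;   // nothing is sent synchronously
```

Because the sends happen asynchronously, the function node returns `null` and signals completion itself through `node.done()`.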

Well that's not what I expected.
Don't think it's any use either.

I suppose if you have not enabled pretty-printed flows (flowFilePretty) in settings.js, then everything is on one line, so the grep command is going to fail.
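
If the flow file is all on one line, one way round it is to parse the JSON instead of grepping it. A minimal sketch in Node.js, assuming flows.json is a plain array of node objects; the function name `countNodeTypes` and the command-line argument are just for illustration.

```javascript
// Count how many nodes of each type a flow file contains.
// Works whether or not the file is pretty-printed.
function countNodeTypes(flowsJson) {
  const nodes = JSON.parse(flowsJson);   // assumed to be an array of nodes
  const counts = new Map();
  for (const node of nodes) {
    counts.set(node.type, (counts.get(node.type) || 0) + 1);
  }
  // Sort ascending by count, like `sort -n` in the shell pipeline.
  return [...counts.entries()].sort((a, b) => a[1] - b[1]);
}

// Usage: node count-types.js /path/to/flows.json
const file = process.argv[2];
if (file) {
  const fs = require("fs");
  for (const [type, count] of countNodeTypes(fs.readFileSync(file, "utf8"))) {
    console.log(count, type);
  }
}
```

Since it reads the `type` property directly, types that contain a space are reported whole rather than being cut at the first word.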

Changing the deploy type does not change what is sent from the editor to the backend.

No matter what the deployment type, the FULL flow will always be POSTed to the Admin API endpoint. The work done to determine what has changed and what should be restarted (tab or just modified nodes) is all done in the backend, not the editor.
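
To make that concrete, here is a rough sketch of what a deploy request looks like when built by hand against the Admin API. It assumes a Node-RED instance at http://localhost:1880 without adminAuth; the helper name `buildDeployRequest` is hypothetical. The deployment type only changes the `Node-RED-Deployment-Type` header, while the body is the complete flow in every case.

```javascript
// Build (but do not send) a POST /flows request for the Admin API.
function buildDeployRequest(baseUrl, flows, deploymentType) {
  return {
    url: `${baseUrl}/flows`,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // "full", "nodes", "flows" or "reload"
        "Node-RED-Deployment-Type": deploymentType
      },
      // The whole flow configuration is serialised, whatever the type.
      body: JSON.stringify(flows)
    }
  };
}

// Tiny stand-in for a real flow file.
const exampleFlows = [
  { id: "t1", type: "tab", label: "PLC 1" },
  { id: "n1", type: "debug", z: "t1" }
];
const fullDeploy = buildDeployRequest("http://localhost:1880", exampleFlows, "full");
const nodesDeploy = buildDeployRequest("http://localhost:1880", exampleFlows, "nodes");
console.log(fullDeploy.options.body.length === nodesDeploy.options.body.length);
```

Passing `url` and `options` to `fetch` would perform the deploy; an instance with adminAuth enabled would also need an Authorization header.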


This is what I get after making changes to settings.js and then editing the flow a bit.

1 "rbe",
1 "tls-config",
1 "ui_base",
1 "ui_template",
1 "users_config",
1 "websocket-listener",
2 "exec",
2 "influxdb",
2 "sqlitedb",
8 "csv",
9 "file",
9 "fs-file-lister",
9 "group",
12 "ui_tab",
13 "users_isloggedin",
19 "ui_group",
33 "http
53 "random",
67 "inject",
73 "sqlite",
79 "comment",
108 "template",
214 "catch",
273 "tab",
332 "influxdb
380 "link
431 "change",
783 "join",
1533 "switch",
1871 "delay",
2135 "debug",
5346 "s7
6439 "function",

OK.

Nothing leaps out from that, and while 7.5MB seems a lot for 50 tabs, it's not so unreasonable for 273.

You say "node-red and database are on the same server instance" but I see that you have both sqlite and influxdb nodes. So NR and two different databases on the same server?

Now I have combined a lot of diverse nodes on a single tab into a single function node. I have done this for 4 different tabs. But to my surprise, the file size for each tab's flow has in fact increased by 2-3 KB. How is combining various nodes supposed to improve the overall flow load time? It might reduce cloning and execution time, but that's not my concern at this moment, as most of the nodes are in the 'PLC write' logic, which does not run at a very high frequency anyway.

Yes. SQLite is for static/master data and InfluxDB is for the timeseries data. And CPU usage is around 30%.

There are quite a lot of file-action nodes in your flows, I'm now wondering whether you have some flows that read and/or write files that are I/O bound? I think that's where I'd look next.

That surprises me as well. Though 2-3k is not really significant, and I suspect it comes from the function nodes, where the JavaScript has to be stored as serialised text, which is fairly cumbersome.

Bear in mind that we are operating in the dark here and so can only make educated guesses as to what your flows look like.

Both of those can cause issues if not managed. Both require significant amounts of data to be held in memory to work correctly, so if you let their DBs get too big, you will certainly see periodic pauses while the system tries to recover memory or is forced to page to disk. I've definitely seen that with InfluxDB, where I accidentally allowed one of my tables to get too large on a Pi and then noticed periodic flow stoppages because InfluxDB was having to page data (probably indexes) in and out of memory and regularly purge data from memory. Thankfully, InfluxDB has mechanisms you can use to keep that under control.