Node-RED on RPi restarts after ~2 weeks

I would start by removing the following nodes (in order, not all at once):

  • calculate
  • smooth
  • counter

The first two store data in an internal array for later computation.

In fact, looking at the source of the calculate node, it (for some reason) stores the entire msg object, even though it only appears to use the payload when computing the averages!
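A minimal sketch (not the node's actual source) of why that pattern matters: keeping the whole msg object in the internal buffer retains everything attached to each message, while keeping only the payload retains just the numbers needed for the average.

```javascript
// Illustrative sketch only - buffer names and the onInput wrapper are
// invented here, not taken from the calculate node's real code.
const wholeMsgBuffer = [];   // the pattern the calculate node appears to use
const payloadBuffer = [];    // what computing an average actually needs

function onInput(msg) {
    wholeMsgBuffer.push(msg);        // retains topic, _msgid, any bulky extras
    payloadBuffer.push(msg.payload); // retains only the number
}

onInput({ payload: 21.5, topic: "temp", bulky: "x".repeat(100000) });
onInput({ payload: 22.5, topic: "temp", bulky: "x".repeat(100000) });

const avg = payloadBuffer.reduce((a, b) => a + b, 0) / payloadBuffer.length;
// payloadBuffer holds two numbers; wholeMsgBuffer still pins ~200 kB of
// "bulky" strings that the average never looks at.
```

If messages arrive every couple of seconds and carry large properties, a buffer of whole msg objects grows far faster than a buffer of numbers.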


Also, do you actually use play audio on all of them?

@Steve-Mcl I just deactivated these three, one after the other. RAM was at 34%; after deactivating them it rose to 47% (within 3 minutes).

@Colin I don't use play audio.

Could it theoretically be a hardware defect of the RAM?

I use only one calculate node in this flow.
In my other Node-RED instance I use more than 10 without problems.


The calculate node receives this msg object every 2 s:

In my other Node-RED instance the calculate node gets much more crap in its msg object.

I use only one smooth node in this flow; in my other Node-RED I use more than 20 without problems.
It's 100% the same input as in my other Node-RED:


At most 35 messages per hour.

And I also use only one counter node, which has actually not been used since the reboot.

The only major difference from the other systems, and my last idea, is the following:

Added to /boot/config.txt:

[all]
gpio=4,5,7,11,17,22,23,24,25,26,27=op,dl
gpio=6,8,9,10,12,13,16,21=ip,pu

and some gpio nodes:

with initialization

Can this be ruled out as a reason?

Next step: a second RPi as a pure simulator (without Modbus).

Is it completely impossible to see a detailed RAM breakdown in the Node-RED core?

It isn't going to be a fundamental config issue; it is what is being done with the data.

You showed an installed node list earlier, but said that some of them are unused. I would start by removing all those that are not used and making sure all the rest are up to date.

You showed earlier you were getting regular modbus timeouts. Do the other systems suffer from this too? It is certainly not impossible that something like this could find a memory leak that has not been noticed previously.

Next time you restart, could you show us the full Node-RED startup log please.

Of course.

I have some other systems with many more Modbus read errors (without the RAM problem)!

The most similar system, which has no problems with RAM, has no GPIO and no Modbus.

So yesterday I took an old RPi2 with Bookworm + NR and loaded my flow onto it without Modbus;
the RAM rose and rose overnight!
Today I deleted all GPIO nodes and restarted the RPi2.
I'll now delete nodes one after another to find the problem.

One more thing occurs to me: my other RPi with GPIO nodes (and 5 Modbus devices), which has no problem with RAM, has Node.js 18.x installed.

I could also test a downgrade from 20.x to 18.x.

Did you uninstall the unused node types and check the others are up to date?

I did not uninstall the pre-installed nodes (ping, serialport, audio, random); every RPi got these, and sorry, I see no sense in removing them.
The installation is one day old, so of course all nodes are up to date.

today I compared my flow with the customer flow again.
I came across a function that only I have:


After a reboot/restart msg.payload.S is undefined:

The incoming object:

let Z = msg.payload.Z;
let S = msg.payload.S;
if (Z < S || S == null) {
    return;
}
else if (Z >= S) {
    msg.payload = 0;
} 
msg.topic = "aus";
return msg;

Could this be the reason for the rising RAM?

I admit, I programmed it really badly :confounded:

I changed it on my Test-RPi2 and it stays at 17%.
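For reference, a tidier version of that function with the same behaviour (drop the message while S is unset or Z is still below S, otherwise send topic "aus" with a zeroed payload) could look like this. The `onInput` wrapper is only there so the sketch runs outside Node-RED; in a function node the body alone is the node's code.

```javascript
// Behaviour-preserving rewrite of the function shown above (a sketch).
function onInput(msg) {
    const { Z, S } = msg.payload;
    if (S == null || Z < S) {   // S is undefined/null after a restart, or not reached yet
        return null;            // returning null (or nothing) drops the message in Node-RED
    }
    msg.payload = 0;            // Z >= S: threshold reached
    msg.topic = "aus";
    return msg;
}
```

Dropping a message this way cannot leak memory by itself; the real question is what downstream nodes do with the messages that pass through.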

No, they don't all have those nodes installed. The recommended install script does not install them. How did you install Node-RED? The point is that they consume valuable memory even if unused, and slow down browser startup every time you open the editor.

That depends on how you installed Node-RED. Some installations come with old versions of nodes pre-installed.

The code in that function will not cause a memory leak, but if you are sending erroneous data onwards to following nodes then I have no way of knowing what that might be doing.

... it does install them if you choose to add the Pi nodes. They will only fractionally slow down Node-RED's startup, with little or no effect on opening the editor.

I would sooner suspect the Modbus node of not cleaning up properly after the errors it seems to get occasionally. But even those I would expect to be garbage collected eventually.

I did not realise that. Thanks for the correction.


I installed NR with:

bash <(curl -sL https://raw.githubusercontent.com/node-red/linux-installers/master/deb/update-nodejs-and-nodered)

Always.

No pre-installed nodes!

@dceejay on the RPi2 (simulator) Modbus is completely deactivated.

I don't understand NR's behaviour.

The other ("customer") Node-RED RPi3 ran for over 20 days at a constant 29% RAM; after 20 days the RAM rose above 40%. I didn't change anything, just watched top.

My RPi3 has now also risen over 40%, and what I also don't understand are these CPU peaks:


Every 3 minutes. I have no cron-plus schedule on a 3-minute interval.
In the first days after a restart the peaks are very low, barely visible, and they grow with uptime. :face_with_monocle:
After a reboot:

After reboot in system journal:


Cron-plus says:

System Time Change Detected - refreshing schedules! If the system time was not changed then this typically occurs due to blocking code elsewhere in your application

What does that mean? Could this be the reason for the CPU peaks or the rising memory?

Last idea: delete all cron-plus nodes and replace them with inject nodes (a lot of work for me).

Reason - no. Clue - yes. As it says, something is blocking/hogging the Node.js event loop (i.e. high CPU usage) to the extent that Node.js didn't have a chance to see the system clock change for 5 seconds. That is highly indicative of a heavy synchronous process in your flows/nodes.

Sure, you can remove the thing informing you that there is a problem - but it won't fix your problem. The graph backs up what cron-plus is telling you: the Node.js event loop is being blocked by a synchronous process.

As for what is hogging the CPU, only you can determine by looking at the flows you have written. Hints:

  • Are you looping over large data arrays?
  • Are you processing large things like images, audio or files?
  • Are you doing synchronous file reads or writes?
  • It could even be garbage collection kicking in.
  • It could be a badly written node - for example, the average or counter node I mentioned before, i.e. one holding on to crap loads of memory for no reason.
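To make the event-loop point concrete, here is a minimal sketch of the kind of synchronous work that causes the warning: while `blockFor()` spins, no timers, cron-plus schedules or I/O callbacks can run, so when Node.js finally gets control back it looks as if the system clock jumped. (`blockFor` is an invented name for illustration, not anything from your flows.)

```javascript
// Busy-wait that hogs the event loop at 100% CPU for the given duration.
// Anything scheduled during this window - setTimeout, cron-plus ticks,
// incoming Modbus responses - is delayed until it finishes.
function blockFor(ms) {
    const end = Date.now() + ms;
    while (Date.now() < end) { /* synchronous busy wait */ }
}

const t0 = Date.now();
blockFor(200);                  // nothing else in the process runs for ~200 ms
const elapsed = Date.now() - t0;
```

Cron-plus does essentially the inverse check: it notices that far more wall-clock time passed between ticks than it scheduled, and reports it as a possible time change or blocking code.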

I repeat :woozy_face: no audio, no images, no sync. file reads or writes, no large arrays.

Now I have deleted the average node (from the simulator RPi2), and I have to wait several days.

The average node receives this every 5 s:

The counter node gets some crap from the cron-plus node:

I came across a hot lead: I replaced all cron-plus nodes with inject nodes:


I've never seen such a low mem%, and the CPU load drops to 0 to 0.01.

I think I have to talk to maintainer Steve-Mcl :melting_face:

Could you restore the original flow and export all the cron nodes, then go back to the new flow and export all the equivalent Inject nodes please, so we can see what the difference is.

You are simply masking the real issue.

Yes, cron-plus has additional data in its msg (for advanced usage) - it's that holding on to extra data that is your issue. What you have done could easily have been done with an inline function node to clean up the msg!
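Such an inline cleanup could be as small as this sketch: a function node that strips the cron-plus bookkeeping properties before the message travels on. The property names (`cronplus`, `scheduledEvent`) are the ones the change node later in this thread deletes; `cleanup` is just a wrapper so the sketch runs outside Node-RED.

```javascript
// Function-node style cleanup: drop cron-plus' extra properties so
// downstream nodes (and anything buffering messages) don't retain them.
function cleanup(msg) {
    delete msg.cronplus;        // schedule metadata attached by cron-plus
    delete msg.scheduledEvent;  // flag attached by cron-plus
    return msg;
}
```

In a function node the body would simply be the two `delete` lines followed by `return msg;`.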

Also, did you re-enable all the stuff you disabled/deleted before coming to your final conclusion?

Two of several cron-plus nodes, for example:

[{"id":"f575aa0996438761","type":"cronplus","z":"5f6e5e5e2b5d2c03","name":"Daten","outputField":"payload","timeZone":"","storeName":"","commandResponseMsgOutput":"output1","defaultLocation":"","defaultLocationType":"default","outputs":1,"options":[{"name":"Modus","topic":"Modus","payloadType":"global","payload":"openwbModus","expressionType":"cron","expression":"*/5 * * * * *","location":"","offset":"0","solarType":"all","solarEvents":"sunrise,sunset"},{"name":"Überschuss","topic":"U","payloadType":"global","payload":"Übers-ohne-LP","expressionType":"cron","expression":"*/5 * * * * *","location":"","offset":"0","solarType":"all","solarEvents":"sunrise,sunset"},{"name":"Offset","topic":"offset","payloadType":"global","payload":"offset","expressionType":"cron","expression":"*/5 * * * * *","location":"","offset":"0","solarType":"all","solarEvents":"sunrise,sunset"},{"name":"Sofort","topic":"sofort","payloadType":"flow","payload":"sofort","expressionType":"cron","expression":"*/5 * * * * *","location":"","offset":"0","solarType":"all","solarEvents":"sunrise,sunset"},{"name":"min I","topic":"minI","payloadType":"flow","payload":"minI","expressionType":"cron","expression":"*/5 * * * * *","location":"","offset":"0","solarType":"all","solarEvents":"sunrise,sunset"},{"name":"frc","topic":"frc","payloadType":"flow","payload":"frc","expressionType":"cron","expression":"*/5 * * * * *","location":"","offset":"0","solarType":"all","solarEvents":"sunrise,sunset"},{"name":"go-e","topic":"goe","payloadType":"global","payload":"LP2-Leistung","expressionType":"cron","expression":"*/5 * * * * *","location":"","offset":"0","solarType":"all","solarEvents":"sunrise,sunset"},{"name":"Tibber","topic":"Tibber","payloadType":"flow","payload":"Tibber","expressionType":"cron","expression":"*/5 * * * * *","location":"","offset":"0","solarType":"all","solarEvents":"sunrise,sunset"},{"name":"I max","topic":"Imax","payloadType":"global","payload":"Imax","expressionType":"cron","expression":"*/5 * * * * *","location":"","offset":"0","solarType":"all","solarEvents":"sunrise,sunset"},{"name":"Phasen","topic":"phasig","payloadType":"flow","payload":"Phasen","expressionType":"cron","expression":"*/5 * * * * *","location":"","offset":"0","solarType":"all","solarEvents":"sunrise,sunset"}],"x":90,"y":1360,"wires":[["2bcf40094ea7e9ed"]]},{"id":"45417146841658f2","type":"cronplus","z":"5f6e5e5e2b5d2c03","name":"PV-laden","outputField":"payload","timeZone":"","storeName":"","commandResponseMsgOutput":"output1","defaultLocation":"","defaultLocationType":"default","outputs":1,"options":[{"name":"Überschuss","topic":"U","payloadType":"global","payload":"Übers-ohne-LP⌀","expressionType":"cron","expression":"*/10 * * * * *","location":"","offset":"0","solarType":"all","solarEvents":"sunrise,sunset"},{"name":"PV-laden","topic":"PV","payloadType":"global","payload":"openwbModus","expressionType":"cron","expression":"*/10 * * * * *","location":"","offset":"0","solarType":"all","solarEvents":"sunrise,sunset"},{"name":"SoC Freigabe","topic":"socfreigabe","payloadType":"global","payload":"socfreigabe","expressionType":"cron","expression":"*/10 * * * * *","location":"","offset":"0","solarType":"all","solarEvents":"sunrise,sunset"},{"name":"Start PV","topic":"start","payloadType":"global","payload":"startpv","expressionType":"cron","expression":"*/10 * * * * *","location":"","offset":"0","solarType":"all","solarEvents":"sunrise,sunset"}],"x":100,"y":1140,"wires":[["d63e39946820db0a"]]}]

(you can see them all if you import my flow from post #19)

Replaced on my RPi2 simulator by, for example, just this one:

[{"id":"23f8c7079d1db506","type":"inject","z":"c09741270c88473d","name":"","props":[{"p":"payload"},{"p":"topic","vt":"str"}],"repeat":"5","crontab":"","once":false,"onceDelay":0.1,"topic":"Modus","payload":"goeModus","payloadType":"global","x":130,"y":860,"wires":[["1a3302883aa58f9f"]]},{"id":"73f0b8ed0105ee74","type":"inject","z":"c09741270c88473d","name":"","props":[{"p":"payload"},{"p":"topic","vt":"str"}],"repeat":"5","crontab":"","once":false,"onceDelay":0.1,"topic":"U","payload":"Übers-ohne-LP","payloadType":"global","x":150,"y":900,"wires":[["1a3302883aa58f9f"]]},{"id":"4b6e611ea9c0e3ff","type":"inject","z":"c09741270c88473d","name":"","props":[{"p":"payload"},{"p":"topic","vt":"str"}],"repeat":"5","crontab":"","once":false,"onceDelay":0.1,"topic":"offset","payload":"offset","payloadType":"global","x":120,"y":940,"wires":[["1a3302883aa58f9f"]]},{"id":"efdb2f8d99abbf97","type":"inject","z":"c09741270c88473d","name":"","props":[{"p":"payload"},{"p":"topic","vt":"str"}],"repeat":"5","crontab":"","once":false,"onceDelay":0.1,"topic":"sofort","payload":"sofort","payloadType":"flow","x":110,"y":980,"wires":[["1a3302883aa58f9f"]]},{"id":"cd5d0229634d636b","type":"inject","z":"c09741270c88473d","name":"","props":[{"p":"payload"},{"p":"topic","vt":"str"}],"repeat":"5","crontab":"","once":false,"onceDelay":0.1,"topic":"minI","payload":"minI","payloadType":"flow","x":100,"y":1020,"wires":[["1a3302883aa58f9f"]]},{"id":"579ca51ceec4eb8e","type":"inject","z":"c09741270c88473d","name":"","props":[{"p":"payload"},{"p":"topic","vt":"str"}],"repeat":"5","crontab":"","once":false,"onceDelay":0.1,"topic":"openwb","payload":"LP1-Leistung","payloadType":"global","x":140,"y":1060,"wires":[["1a3302883aa58f9f"]]},{"id":"7f7c6bb8ad586aeb","type":"inject","z":"c09741270c88473d","name":"","props":[{"p":"payload"},{"p":"topic","vt":"str"}],"repeat":"5","crontab":"","once":false,"onceDelay":0.1,"topic":"Tibber","payload":"Tibber","payloadType":"flow","x":110,"y":1100,"wires":[["1a3302883aa58f9f"]]},{"id":"8c9a7ebdb43eae65","type":"inject","z":"c09741270c88473d","name":"","props":[{"p":"payload"},{"p":"topic","vt":"str"}],"repeat":"5","crontab":"","once":false,"onceDelay":0.1,"topic":"Imax","payload":"Imax","payloadType":"global","x":110,"y":1140,"wires":[["1a3302883aa58f9f"]]},{"id":"80f73b02af4e7186","type":"inject","z":"c09741270c88473d","name":"","props":[{"p":"payload"},{"p":"topic","vt":"str"}],"repeat":"5","crontab":"","once":false,"onceDelay":0.1,"topic":"phasig","payload":"Phasen","payloadType":"flow","x":120,"y":1180,"wires":[["1a3302883aa58f9f"]]},{"id":"1a3302883aa58f9f","type":"join","z":"c09741270c88473d","name":"Objekt","mode":"custom","build":"object","property":"payload","propertyType":"msg","key":"topic","joiner":"\\n","joinerType":"str","accumulate":false,"timeout":"","count":"9","reduceRight":false,"reduceExp":"","reduceInit":"","reduceInitType":"","reduceFixup":"","x":520,"y":1000,"wires":[["a8e803f8e13c7275","0cabbb1ba85a5b9b"]]}]

You mean delete, like this:

[{"id":"d63e39946820db0a","type":"change","z":"5f6e5e5e2b5d2c03","name":"lösche","rules":[{"t":"delete","p":"cronplus","pt":"msg"},{"t":"delete","p":"scheduledEvent","pt":"msg"}],"action":"","property":"","from":"","to":"","reg":false,"x":90,"y":1200,"wires":[["86ef66a806be684d"]]}]

Yes, I added the delete node after every cron-plus node to be on the safe side!

The RPi3 now runs with cron-plus + delete; I'll see in a few days/weeks whether the memory rises again.
Since yesterday it has been at 26% - much more than the RPi2.
The CPU peaks every 3 minutes are still there on the RPi3.

I will also let the RPi2 (simulator) run for a few weeks (with the inject nodes instead of cron-plus) - the graphical interface looks really bad without cron-plus :frowning:

Then I will decide whether to delete all cron-plus nodes on the RPi3 (production system).