Memory leak in Node-RED on RasPI

#1

Likely, no news here.
My RasPI runs mainly Node-RED, nothing else. Raspbian installed from scratch.

Details:
RAM use starts at about 40% on boot and climbs to 90% or more over 7-10 days.
Restarting Node-RED brings it back to ~40%.

History:
Before Node-RED I used several different IoT hub managers on the RasPI, each time installing from scratch. None of them leaked memory.

Questions:

  1. Could the memory leak be caused by my flows?
  2. Has anybody seen a memory leak with their own flows?
  3. Please help me debug and understand the root cause of the leak.

Thank you.

#2

Yes, there are quite a few possibilities. It could be your flow. It could be a node you have added. You don’t mention which Pi you are using or how you are starting it. By default (if using node-red-start) we set --max_old_space_size=256, which means that after it has used 256MB of memory it should start to garbage collect. But if you only have a 512MB Pi and are starting at 40%, then an extra 256MB would take you over 90%. And indeed the 256 is a soft target: if you are handling large objects or chunks of data it could well use more than that.
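
The arithmetic behind that estimate can be sketched as follows. This is only a rough model: it assumes the whole 256MB of old space gets used on top of the boot-time baseline, and the helper name is made up for illustration.

```python
def projected_usage_percent(total_mb, baseline_percent, old_space_mb=256):
    """Estimate RAM use (percent of total) once the V8 old space has
    grown to its soft limit on top of the baseline measured at boot."""
    baseline_mb = total_mb * baseline_percent / 100
    return (baseline_mb + old_space_mb) / total_mb * 100


# 512MB Pi starting at 40%: 204.8MB + 256MB = 460.8MB
print(round(projected_usage_percent(512, 40)))   # 90
# Same baseline percentage on a 1GB board (hypothetical comparison)
print(round(projected_usage_percent(1024, 40)))  # 65
```

Since the limit is a soft target, the real figure can end up higher than this.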

#3

Exactly what do you mean by using 90% RAM? Linux is designed to keep disc buffers in RAM unless the space is needed, so over a period it can look as if all the memory is in use. However the buffers can be released if necessary to free up space. For example one of my Pis has been running for 90 days and is showing 90% utilisation, but I suspect it has been at that level for 85 days or so. If you run top and add together the free, buffers and cached memory that will give you an idea of how much is actually available.

The crucial question is, does it actually fail if you continue to leave it running after it has reached 90%? Also if it starts using swap space then that is not good.
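
If you would rather script the "free + buffers + cached" sum than read it off top, something like the sketch below might do. It parses text in the /proc/meminfo format; the sample numbers are invented for illustration and the function names are my own.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of kB values."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def reclaimable_percent(info):
    """Free + buffers + cached, as a percentage of total RAM."""
    available = info["MemFree"] + info["Buffers"] + info["Cached"]
    return 100.0 * available / info["MemTotal"]


# Invented sample values, not taken from a real Pi.
SAMPLE = """MemTotal:  1000000 kB
MemFree:    100000 kB
Buffers:     50000 kB
Cached:     650000 kB
"""

percent = reclaimable_percent(parse_meminfo(SAMPLE))
print(f"{percent:.0f}% of RAM is free or reclaimable")  # 80%
```

On a real system you would pass in `open("/proc/meminfo").read()` instead of `SAMPLE`. Recent kernels also report a `MemAvailable` line, which is the kernel's own estimate of the same thing.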

#4

Thank you for the advice. I’ll try to add the ‘max_old_space_size’.
Is it possible to set “max_old_space_size” in a config file rather than on the command line?

I use a Pi3 Model B.

Just to clarify my worry: I don’t observe anything going wrong, getting stuck, or rebooting… nothing bad.
I’m just curious why RAM use increases over time with Node-RED, while I saw no such phenomenon with other frameworks of this class.

Thank you again.

#5

Without knowing the content of your flow, it is a guessing game.

One thing that could look like a memory leak is constantly adding data to an object at run time, so that the object keeps growing in size.
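
To illustrate the pattern (in Python just for brevity; the same thing happens with an array kept in a function node's context), compare a list that is appended to on every message with a buffer that has a fixed maximum length. The handler name and the limit of 1000 are arbitrary choices for this sketch.

```python
from collections import deque

history = []                 # grows forever: one entry per message
recent = deque(maxlen=1000)  # bounded: oldest entries are discarded


def on_message(payload):
    """Hypothetical handler called once per incoming message."""
    history.append(payload)
    recent.append(payload)


for i in range(5000):
    on_message({"reading": i})

print(len(history), len(recent))  # 5000 1000
```

The first container looks exactly like a leak from the outside, even though nothing is technically leaking.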

#6

Edit: 14Jun18
I apologize, this response and the one below really belong in this thread: Memory leak, what am I doing wrong?
I wasn’t paying close enough attention when I saw the topic and krambris asking about my flow; I thought it was in my thread.

I can post the flow easily enough, but without my hardware it can’t really do anything. The only objects I write or modify are with change nodes, and most of this is in the second flow, which sends the notification if the AI has made a detection and the system is in notify mode. That’s why I originally asked whether, in a function node, I need to do anything more than return null to drop a msg.payload buffer message when it comes in and clean up.

The upgrade to nodejs v4.9.1 helped, but it died and restarted again about 15 minutes ago. This time it had run for a bit over two days, and the time from the error to the flow running again was something like 12 seconds. It’s not a very significant issue to have 12 seconds of missed monitoring in ~50 hours, especially considering that my Lorex DVR has about 5 seconds of latency from a trigger to the first snapshot received.

From syslog it seems pretty clear what is happening: out of memory in node-red-contrib-ftp-server when a snapshot comes in, node-red dies, I get my MQTT “will” message, it restarts, and all is well until the cycle repeats. Not perfect but usable. I was going to use the will message to restart node-red via another monitoring process, but systemd seems to handle it automatically and very fast! I’m beginning to see the merits in the switch to systemd, up to now a “controversy” that seemed mostly philosophical to me (shell scripts vs binary blobs controlling the system), about which I couldn’t care less as long as it worked!

Here is the most recent relevant syslog:

Jun 8 10:34:59 alarmPi Node-RED[25524]: 8 Jun 10:34:58 - [red] Uncaught Exception:
Jun 8 10:34:59 alarmPi Node-RED[25524]: 8 Jun 10:34:59 - RangeError: Invalid array buffer length
Jun 8 10:34:59 alarmPi Node-RED[25524]: at new ArrayBuffer (native)
Jun 8 10:34:59 alarmPi Node-RED[25524]: at new Uint8Array (native)
Jun 8 10:34:59 alarmPi Node-RED[25524]: at createBuffer (buffer.js:25:17)
Jun 8 10:34:59 alarmPi Node-RED[25524]: at allocate (buffer.js:166:12)
Jun 8 10:34:59 alarmPi Node-RED[25524]: at new Buffer (buffer.js:56:12)
Jun 8 10:34:59 alarmPi Node-RED[25524]: at Socket.dataHandler (/home/pi/.node-red/node_modules/ftpd/lib/ftpd.js:1413:27)
Jun 8 10:34:59 alarmPi Node-RED[25524]: at emitOne (events.js:77:13)
Jun 8 10:34:59 alarmPi Node-RED[25524]: at Socket.emit (events.js:169:7)
Jun 8 10:34:59 alarmPi Node-RED[25524]: at readableAddChunk (_stream_readable.js:153:18)
Jun 8 10:34:59 alarmPi Node-RED[25524]: at Socket.Readable.push (_stream_readable.js:111:10)
Jun 8 10:34:59 alarmPi Node-RED[25524]: at TCP.onread (net.js:540:20)
Jun 8 10:35:00 alarmPi systemd[1]: nodered.service: Main process exited, code=exited, status=1/FAILURE
Jun 8 10:35:00 alarmPi systemd[1]: nodered.service: Unit entered failed state.
Jun 8 10:35:00 alarmPi systemd[1]: nodered.service: Failed with result 'exit-code'.
Jun 8 10:35:00 alarmPi systemd[1]: nodered.service: Service hold-off time over, scheduling restart.
Jun 8 10:35:00 alarmPi systemd[1]: Stopped Node-RED graphical event wiring tool.
Jun 8 10:35:00 alarmPi systemd[1]: Started Node-RED graphical event wiring tool.
Jun 8 10:35:04 alarmPi Node-RED[27073]: 8 Jun 10:35:04 - [info]
Jun 8 10:35:04 alarmPi Node-RED[27073]: Welcome to Node-RED
Jun 8 10:35:04 alarmPi Node-RED[27073]: ===================
Jun 8 10:35:04 alarmPi Node-RED[27073]: 8 Jun 10:35:04 - [info] Node-RED version: v0.18.7
Jun 8 10:35:04 alarmPi Node-RED[27073]: 8 Jun 10:35:04 - [info] Node.js version: v4.9.1
Jun 8 10:35:04 alarmPi Node-RED[27073]: 8 Jun 10:35:04 - [info] Linux 4.14.34-v7+ arm LE
Jun 8 10:35:05 alarmPi Node-RED[27073]: 8 Jun 10:35:05 - [info] Loading palette nodes
Jun 8 10:35:11 alarmPi Node-RED[27073]: 8 Jun 10:35:11 - [warn] ------------------------------------------------------
Jun 8 10:35:11 alarmPi Node-RED[27073]: 8 Jun 10:35:11 - [warn] [node-red-node-serialport/serialport] Error: Module version mismatch. Expected 46, got 57.
Jun 8 10:35:11 alarmPi Node-RED[27073]: 8 Jun 10:35:11 - [warn] ------------------------------------------------------
Jun 8 10:35:11 alarmPi Node-RED[27073]: 8 Jun 10:35:11 - [info] Settings file : /home/pi/.node-red/settings.js
Jun 8 10:35:11 alarmPi Node-RED[27073]: 8 Jun 10:35:11 - [info] User directory : /home/pi/.node-red
Jun 8 10:35:11 alarmPi Node-RED[27073]: 8 Jun 10:35:11 - [warn] Projects disabled : set editorTheme.projects.enabled=true to enable
Jun 8 10:35:11 alarmPi Node-RED[27073]: 8 Jun 10:35:11 - [info] Flows file : /home/pi/.node-red/flows_alarmPi.json
Jun 8 10:35:11 alarmPi Node-RED[27073]: 8 Jun 10:35:11 - [info] Server now running at http://127.0.0.1:1880/

I may try reducing the max_old_space_size parameter in the hope of making garbage collection more aggressive, and I may try upgrading nodejs to 6.x (the last that should work with the ftp-server node).

But actually I’ve bigger fish to fry at the moment with regard to the system, and they are Android issues. I’m not reliably getting timely Email notifications on my Android 7.0 phone: sometimes they are near instant, other times 10 to 20 or more minutes late, which is potentially a much bigger issue than having the AI off for 12 seconds every ~50 hours. Another issue is that my “backup” notification node-red on Termux-API, which sends SMS messages to prompt me to check Email and runs on a cheap Blu Android 6.x phone, dies every 2-3 weeks and needs to be rebooted, or it automatically reboots and hangs until I dismiss a dialog.

So I think we can put this to bed until node-red-contrib-ftp-server gets fixed, if ever, to work with 8.x and newer versions of nodejs.

#7

I was just thinking, why do you have to push all images through NR? Can’t you just handle them directly in your Python script by adding an FTP client? With Python, unlike NR, you have support for multi-threading, and you could easily just inform NR about the results via MQTT. In the same way you could tell the script (via MQTT) what to do depending on your alarm system state.
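
As a rough sketch of the "inform NR via MQTT" part: the script only needs to publish a small JSON message that an MQTT input node in NR subscribes to. The topic layout, field names and broker host below are made up for illustration, and the publish helper assumes the third-party paho-mqtt package is installed.

```python
import json


def build_detection_message(camera, label, confidence):
    """Return (topic, payload) for one detection result.

    The topic layout and field names are invented for this sketch.
    """
    topic = f"alarm/detections/{camera}"
    payload = json.dumps({"label": label, "confidence": round(confidence, 2)})
    return topic, payload


def send_to_node_red(topic, payload, host="localhost"):
    """Publish one message; needs the third-party paho-mqtt package."""
    import paho.mqtt.publish as mqtt_publish
    mqtt_publish.single(topic, payload, hostname=host)


topic, payload = build_detection_message("cam1", "person", 0.9137)
print(topic, payload)
# With a broker running: send_to_node_red(topic, payload)
```

Only the result travels over MQTT, so the image data itself never has to pass through NR.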

As for sending info, including images (even further compressed if using PIL), to your phone, you should skip email and use Telegram. It is fast and I have not yet seen any message being delayed or missed.

I have built such a system myself, but based on a somewhat different HW and SW approach. As HW I have a number of Pi3s, each equipped with 2 HD night-vision color cameras and the latest version (4.1.1) of the Motion SW for video analysis. Motion just detects movement and produces images when it happens.

Currently I use AWS Rekognition for object (person) detection. It is working very well but is of course a solution depending on the cloud. For a couple of cameras I am now also running/evaluating a local solution based on CV2 and MobileNetSSD, all written in a Python script without the NCS stick, handling MQTT and Telegram as well. So far looks very reliable and accurate.

#8

I am looking into Telegram, but SMS & Email is working well enough at the moment. For now getting more real-world usage experience is more important than fine-tuning the notification. So far I couldn’t be happier.

The out-of-memory crashing of node-red is of little practical significance so far, since systemd restarts it in a few seconds, which is negligible considering the FLIR Lorex has about 4 seconds of latency between trigger and a snapshot.

Why not do everything in python?

  1. I hate its significant whitespace “feature”.
  2. Using node-red in a separate process lets me always use at least two cores on the Pi3B+
  3. The ugly logic I needed to implement to work around the lameness of my FLIR Lorex DVR was trivial to do in node-red with the help of a couple of simple function nodes. It would take me a lot longer to get it running in Python. While I’ve more experience with Python than nodejs, I’m liking nodejs better than Python.

I am looking at software-only MobileNetSSD for a single-camera stand-alone system using a Raspberry Pi and Pi camera module. It is a great solution for interior spaces; for outdoors, getting it “weatherproof” could be an issue. The FLIR Lorex cameras solved that issue :slight_smile:

I don’t want cloud anything. I see potential virtues, but don’t want the dependence on a third party. As part of my FLIR Lorex purchase I have “access” to their DDNS and web relay so I can stream my camera images, but I don’t want this, although I may enable it as a backup to Gmail/Telegram that I could open in response to an SMS. Unfortunately their app running locally over WiFi is pretty lame, which is why it’s never been part of my plan; I pretty much only use it when adjusting camera views. If they documented the protocols, then I’d get interested in it.

SMS doesn’t depend on Google or anything but the cell phone network. As to Gmail or Telegram it all boils down to who do I trust more, Google or a Russian ExPat on the outs with Putin :slight_smile:

So far in my tests Telegram wins the speed race to get a notification, but not if my phone is asleep. I haven’t yet figured out how to get the priority notifications from Telegram the way I do from Gmail and Messenger. I’m sure that having three versions of Android to deal with is not helping.

I want notifications even if the power and internet are down, at least until my UPS batteries die (~70 minutes for the FLIR Lorex DVR and AI, about 3 hours for the rest of the system). I’m using node-red on a cheap Android cell phone to let me push out SMS and Email if the internet connection or AC power is down.

The biggest strength of my system is I can rip out the FLIR Lorex and replace it with something else with only minimal changes to the node-red and AI. Basically just the paths the DVR wants to create on the ftp server and rip out the crap supporting the Lorex lameness.

To return to the original topic: the author of node-red-contrib-ftp-server has upgraded it to work with nodejs 8.x and I’ve upgraded. If anything, the leak is a bit worse with 8.x than it was with 4.9.1. I suspect the issue is in the underlying nodejs ftpd package used in node-red-contrib-ftp-server; apparently it won’t work at all with 10.x at present.

#9

Hi there,

After the following batch of messages in /var/log/messages the RAM utilization of RasPI jumps from 40 to 80%.

Mar 22 06:25:56 shm su[12994]: pam_unix(su:session): session closed for user nobody
Mar 22 06:25:56 shm systemd-logind[519]: Removed session c6.
Mar 22 06:25:56 shm systemd[1]: Stopping User Manager for UID 65534...
Mar 22 06:25:56 shm systemd[13000]: Stopping Default.
Mar 22 06:25:56 shm systemd[13000]: Stopped target Default.
Mar 22 06:25:56 shm systemd[13000]: Stopping Basic System.
Mar 22 06:25:56 shm systemd[13000]: Stopped target Basic System.
Mar 22 06:25:56 shm systemd[13000]: Stopping Paths.
Mar 22 06:25:56 shm systemd[13000]: Stopped target Paths.
Mar 22 06:25:56 shm systemd[13000]: Stopping Timers.
Mar 22 06:25:56 shm systemd[13000]: Stopped target Timers.
Mar 22 06:25:56 shm systemd[13000]: Stopping Sockets.
Mar 22 06:25:56 shm systemd[13000]: Stopped target Sockets.
Mar 22 06:25:56 shm systemd[13000]: Starting Shutdown.
Mar 22 06:25:56 shm systemd[13000]: Reached target Shutdown.
Mar 22 06:25:56 shm systemd[13000]: Starting Exit the Session...
Mar 22 06:25:56 shm systemd[13000]: Received SIGRTMIN+24 from PID 13013 (kill).
Mar 22 06:25:56 shm systemd: pam_unix(systemd-user:session): session closed for user nobody
Mar 22 06:25:56 shm systemd[1]: Stopped User Manager for UID 65534.
Mar 22 06:25:56 shm systemd[1]: Stopping user-65534.slice.
Mar 22 06:25:56 shm systemd[1]: Removed slice user-65534.slice.
Mar 22 06:26:01 shm rsyslogd: [origin software="rsyslogd" swVersion="8.4.2" x-pid="449" x-info="http://www.rsyslog.com"] rsyslogd was HUPed
Mar 22 06:26:01 shm rsyslogd: [origin software="rsyslogd" swVersion="8.4.2" x-pid="449" x-info="http://www.rsyslog.com"] rsyslogd was HUPed
Mar 22 06:26:01 shm rsyslogd: [origin software="rsyslogd" swVersion="8.4.2" x-pid="449" x-info="http://www.rsyslog.com"] rsyslogd was HUPed
Mar 22 06:26:01 shm rsyslogd: [origin software="rsyslogd" swVersion="8.4.2" x-pid="449" x-info="http://www.rsyslog.com"] rsyslogd was HUPed
Mar 22 06:26:01 shm CRON[12801]: pam_unix(cron:session): session closed for user root

It takes about 1.5 days of running the RasPI with Node-RED until this happens. A day or two later the RasPI gets stuck.
This may not be related to Node-RED. Maybe someone knows how to translate this into a human-readable hint about what is eating the memory?

Another question:
I'd like to add a flow which monitors memory usage and reboots the RasPI when it gets too high. I know how to do each part separately. I'm curious whether there is a ready-made implementation of such a reboot. Has anyone come across one?
Please share.
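
For reference, what I have in mind outside Node-RED is roughly the sketch below. The 90% threshold, the three-consecutive-samples rule, the polling interval and the function names are all my own assumptions, and the reboot call assumes the user may run `sudo reboot` without a password.

```python
import subprocess
import time


def used_percent(meminfo_text):
    """Percent of RAM in use, preferring the kernel's MemAvailable estimate."""
    info = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[name.strip()] = int(fields[0])
    available = info.get("MemAvailable")
    if available is None:  # older kernels: approximate it
        available = (info["MemFree"] + info.get("Buffers", 0)
                     + info.get("Cached", 0))
    return 100.0 * (1 - available / info["MemTotal"])


def should_reboot(samples, threshold=90.0, consecutive=3):
    """True only if the last `consecutive` samples all exceed the threshold."""
    tail = samples[-consecutive:]
    return len(tail) == consecutive and all(s > threshold for s in tail)


def run_watchdog(interval_s=60):
    """Poll /proc/meminfo and reboot when usage stays high (not called here)."""
    samples = []
    while True:
        with open("/proc/meminfo") as f:
            samples.append(used_percent(f.read()))
        samples = samples[-10:]  # keep the history itself bounded
        if should_reboot(samples):
            subprocess.run(["sudo", "reboot"], check=False)
            return
        time.sleep(interval_s)
```

Requiring several high readings in a row is meant to avoid rebooting on a short spike.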

Thank you.