Memory leak, what am I doing wrong?

I’m using a Raspberry Pi 3B+ to add an AI “person detector” to a commercial video security DVR. Its streaming protocols are proprietary, so I use its “ftp snapshots” feature as the source of images to analyse.

I use a simple node-red flow with node-red-contrib-ftp-server to get the images into a msg.payload buffer and pass them to my AI Python script via MQTT. When the AI, running on a Movidius NCS, detects a person in the image, the Python script writes the image, with a box drawn around the person, to a USB memory stick and posts a message via MQTT with the path to the file. A second node-red flow does one of three things with this MQTT input message depending on the alarm system state: ignore it, play an audio alert using espeak, or send a text message and Email with the photo attached.

It all works well, but because of the lameness of my security DVR I have to pass three or four times as many images through node-red-contrib-ftp-server as I want passed into the AI. So I’m wondering if I need to do something specific to free a buffer at the end of a flow, or to trigger garbage collection? The msg.payload buffer either terminates in the MQTT output node that passes it to the Python script, or it is discarded in my filter function, which returns null to the MQTT node instead of the msg carrying the buffer.
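The pass-or-drop filter described above can be sketched roughly as follows. This is a minimal illustration, not the poster's actual function: the name `filterSnapshot`, the `msg.camera` property and the `activeCameras` set are all hypothetical. In a real function node the body would work on the node-supplied `msg` and end with `return msg;` or `return null;`.

```javascript
// Hypothetical filter logic, written as a plain function so it runs standalone.
// ASSUMPTION: msg.camera identifies the source camera (not from the original flow).
function filterSnapshot(msg, activeCameras) {
    if (!Buffer.isBuffer(msg.payload)) {
        return null;                 // not an image buffer: drop it
    }
    if (!activeCameras.has(msg.camera)) {
        return null;                 // camera is not covering an active PIR: drop it
    }
    return msg;                      // pass the image on to the MQTT output node
}
```

JavaScript has no explicit call to free a buffer. Returning null simply stops the message, and the buffer becomes eligible for garbage collection once nothing else still holds a reference to it.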

After typically 20-36 hours node-red dies. It restarts automatically and starts working again, with no harm to the Python script, and AI detection resumes once node-red is back. top shows node-red starting at about 15% memory usage and climbing, rarely dropping back more than a few percent, until it reaches about 85%, where it hovers for many hours until there is no more RAM available and presumably the OOM killer stops it, after which systemd seems to restart it and the cycle repeats.

If you know that the RAM fills up after a certain number of hours, why don’t you connect to a console before it gets clogged and use the “top” command to check which processes are running and how much RAM and CPU each one is using?
Just a note… have you disabled the graphical desktop if it is not used? This will free up a lot of memory.

You will have to start with some more info I guess…

Regards

I do, that is how I know that it's node-red that eventually consumes all the RAM. I also see that the Python process is very well behaved. MQTT (mosquitto) consumes so few resources that it hardly ever shows up in the top output.

Node-red was killed and restarted (automatically) a few hours ago and is already up to ~35% memory usage. Available memory is still mid six figures since it was all freed when node-red was killed.

Just a note… have you disabled the graphical desktop if not used? This will free you a lot of memory.

htop will also give you useful info. Kill any processes or services that you don't really use, and make sure they are not auto-started on boot either.

Also, when you talk about memory, check whether it is RAM, swap and/or cache. Try to keep enough RAM available and avoid swap if possible.

How did you install node-red?

Have a look in /var/log/syslog at the time it fails and see what it says. If you are running out of memory then you should see the effects there.

Also post the node red log for node red starting up.

Graphical desktop was useful getting the AI working -- I used ssh -X to view input and output images. But once the AI was running well I configured startup to a command prompt with no autologin.

I didn't install it. It was part of the default Raspbian Stretch image 2018-04-18. I ran dist-upgrade to get the "latest" node-red v0.18.7, and started this thread when nothing changed.

Here is /var/log/syslog from the most recent death and restart:

Jun 5 09:17:02 alarmPi CRON[14914]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Jun 5 10:17:01 alarmPi CRON[14979]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Jun 5 10:23:33 alarmPi dhcpcd[367]: eth0: hardware address ac:22:0b:1f:b2:f8 claims 192.168.2.213
Jun 5 10:33:43 alarmPi dhcpcd[367]: eth0: hardware address ac:22:0b:1f:b2:f8 claims 192.168.2.213
Jun 5 10:47:44 alarmPi Node-RED[13454]: 5 Jun 10:47:44 - [error] [function:Lorex Filter] RangeError: Invalid array buffer length
Jun 5 11:00:46 alarmPi Node-RED[13454]: <--- Last few GCs --->
Jun 5 11:00:46 alarmPi Node-RED[13454]: 69491125 ms: Mark-sweep 38.6 (63.6) -> 38.6 (63.6) MB, 335.9 / 0 ms [allocation failure] [GC in old space requested].
Jun 5 11:00:46 alarmPi Node-RED[13454]: 69491455 ms: Mark-sweep 38.6 (63.6) -> 38.6 (63.6) MB, 329.5 / 0 ms [allocation failure] [GC in old space requested].
Jun 5 11:00:46 alarmPi Node-RED[13454]: 69491942 ms: Mark-sweep 38.6 (63.6) -> 38.3 (63.6) MB, 487.2 / 0 ms [last resort gc].
Jun 5 11:00:46 alarmPi Node-RED[13454]: 69492424 ms: Mark-sweep 38.3 (63.6) -> 38.4 (63.6) MB, 482.3 / 0 ms [last resort gc].
Jun 5 11:00:46 alarmPi Node-RED[13454]: <--- JS stacktrace --->
Jun 5 11:00:46 alarmPi Node-RED[13454]: ==== JS stack trace =========================================
Jun 5 11:00:46 alarmPi Node-RED[13454]: Security context: 0x5096d6d9
Jun 5 11:00:46 alarmPi Node-RED[13454]: 2: /* anonymous */ [/usr/lib/node_modules/node-red/node_modules/node-red-node-rbe/rbe.js:35] [pc=0x40e78370] (this=0x4cf0da2d <a RbeNode with map 0x2bb28581>,msg=0x550ff4d9 <an Object with map 0x2bb49689>)
Jun 5 11:00:46 alarmPi Node-RED[13454]: 3: emit [events.js:~117] [pc=0x40e4e8e0] (this=0x4cf0da2d <a RbeNode with map 0x2bb28581>,type=0x50925425 <String[5]: input>)
Jun 5 11:00:46 alarmPi Node-RED[13454]: 4: arguments adaptor frame: 2->1
Jun 5 11:00:46 alarmPi Node-RED[13454]: 5: receive [/usr/lib...
Jun 5 11:00:46 alarmPi Node-RED[13454]: FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - process out of memory
Jun 5 11:00:46 alarmPi systemd[1]: nodered.service: Main process exited, code=killed, status=6/ABRT
Jun 5 11:00:46 alarmPi systemd[1]: nodered.service: Unit entered failed state.
Jun 5 11:00:46 alarmPi systemd[1]: nodered.service: Failed with result 'signal'.
Jun 5 11:00:46 alarmPi systemd[1]: nodered.service: Service hold-off time over, scheduling restart.
Jun 5 11:00:46 alarmPi systemd[1]: Stopped Node-RED graphical event wiring tool.
Jun 5 11:00:46 alarmPi systemd[1]: Started Node-RED graphical event wiring tool.
Jun 5 11:00:50 alarmPi Node-RED[15031]: 5 Jun 11:00:50 - [info]
Jun 5 11:00:50 alarmPi Node-RED[15031]: Welcome to Node-RED
Jun 5 11:00:50 alarmPi Node-RED[15031]: ===================
Jun 5 11:00:50 alarmPi Node-RED[15031]: 5 Jun 11:00:50 - [info] Node-RED version: v0.18.7
Jun 5 11:00:50 alarmPi Node-RED[15031]: 5 Jun 11:00:50 - [info] Node.js version: v4.8.2
Jun 5 11:00:50 alarmPi Node-RED[15031]: 5 Jun 11:00:50 - [info] Linux 4.14.34-v7+ arm LE
Jun 5 11:00:51 alarmPi Node-RED[15031]: 5 Jun 11:00:51 - [info] Loading palette nodes
Jun 5 11:00:57 alarmPi Node-RED[15031]: 5 Jun 11:00:57 - [info] Settings file : /home/pi/.node-red/settings.js
Jun 5 11:00:57 alarmPi Node-RED[15031]: 5 Jun 11:00:57 - [info] User directory : /home/pi/.node-red
Jun 5 11:00:57 alarmPi Node-RED[15031]: 5 Jun 11:00:57 - [warn] Projects disabled : set editorTheme.projects.enabled=true to enable
Jun 5 11:00:57 alarmPi Node-RED[15031]: 5 Jun 11:00:57 - [info] Flows file : /home/pi/.node-red/flows_alarmPi.json
Jun 5 11:00:57 alarmPi Node-RED[15031]: 5 Jun 11:00:57 - [info] Server now running at http://127.0.0.1:1880/
Jun 5 11:00:57 alarmPi Node-RED[15031]: 5 Jun 11:00:57 - [warn]
Jun 5 11:00:57 alarmPi Node-RED[15031]: ---------------------------------------------------------------------
Jun 5 11:00:57 alarmPi Node-RED[15031]: Your flow credentials file is encrypted using a system-generated key.
Jun 5 11:00:57 alarmPi Node-RED[15031]: If the system-generated key is lost for any reason, your credentials
Jun 5 11:00:57 alarmPi Node-RED[15031]: file will not be recoverable, you will have to delete it and re-enter
Jun 5 11:00:57 alarmPi Node-RED[15031]: your credentials.
Jun 5 11:00:57 alarmPi Node-RED[15031]: You should set your own key using the 'credentialSecret' option in
Jun 5 11:00:57 alarmPi Node-RED[15031]: your settings file. Node-RED will then re-encrypt your credentials
Jun 5 11:00:57 alarmPi Node-RED[15031]: file using your chosen key the next time you deploy a change.
Jun 5 11:00:57 alarmPi Node-RED[15031]: ---------------------------------------------------------------------
Jun 5 11:00:57 alarmPi Node-RED[15031]: 5 Jun 11:00:57 - [info] Starting flows
Jun 5 11:00:57 alarmPi Node-RED[15031]: 5 Jun 11:00:57 - [info] Started flows
Jun 5 11:00:57 alarmPi Node-RED[15031]: 5 Jun 11:00:57 - [info] [mqtt-broker:localhost:1883] Connected to broker: mqtt://localhost:1883
Jun 5 11:00:57 alarmPi Node-RED[15031]: 5 Jun 11:00:57 - [info] [mqtt-broker:9abf2118.d76ad] Connected to broker: mqtt://alarmbone:1883

Where do I find the node-red logs?

You are on an old release of node; you should update it to 8.9.
Also, why didn’t you follow the directions for upgrading from the website?
https://nodered.org/docs/hardware/raspberrypi

These instructions are "old" -- Raspbian Stretch has node-red pre-installed for quite some time, from the beginning if I remember correctly. These are the first problems I've had.

If my node-js is "out dated", my question is why didn't it get upgraded with the apt-get update ; apt-get dist-upgrade I did a couple of days ago?

I'm running the script now. I'll follow up if it breaks everything, fixes the issue, or has no effect. I like the idea of moving the globally installed nodes so palette manager works; it's been frustrating that it works for some nodes and not others, although I haven't worried about upgrading nodes that I don't use.

Edit: the script completed the upgrade. Looks like it broke node-red-contrib-ftp-server, which is about the worst possibility, as now I'm totally out of business :frowning:

I get this error:
5 Jun 20:19:34 - [info] Installing module: node-red-contrib-ftp-server, version: 0.2.2
5 Jun 20:19:46 - [info] Installed module: node-red-contrib-ftp-server
5 Jun 20:19:47 - [info] Added node types:
5 Jun 20:19:47 - [info] - node-red-contrib-ftp-server:ftp-server : TypeError: The super constructor to "inherits" must not be null or undefined (line:29)

I uninstalled nodejs v8.11.2 and managed to find and reinstall nodejs v4.9.1 which seems to be a bit newer than the v4.8.2 that came with the Raspbian Stretch image.

At least I’m running again, time will tell if it made things better, worse, or no different, but v8.x broke a contrib node that is critical for my usage!!!

They are not old

Due to the slowness (deliberate) of the Debian software release cycles, and that node-RED is developing fast, they are the supported way of keeping it up-to-date on a Raspberry Pi

If a contrib node doesn't work, then that needs fixing

Nodejs v4 is no longer supported and is not getting any security updates.
See https://github.com/nodejs/Release#release-schedule

It is recommended that you run the LTS version of nodejs

Are you passing your images (as a buffer) through the rbe node? If so, how have you configured your rbe node?

The rbe node:

{
    "id": "2e37060d.1bdf4a",
    "type": "rbe",
    "z": "b63cd4a9.77fbd8",
    "name": "",
    "func": "rbe",
    "gap": "",
    "start": "",
    "inout": "out",
    "property": "payload",
    "x": 230,
    "y": 100,
    "wires": [
        [
            "5ead68d6.0e10a8",
            "aed4b0b8.8456d"
        ]
    ]
}

Its input is the ftp-server node’s output, and its output goes to the input of my filter function, which either drops the buffer by returning null or, by doing return msg, passes it to the MQTT output that the Python AI script subscribes to. This is ultimately the reason I posted the topic here: to find out if I’m failing to do something that is needed to trigger garbage collection.
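To illustrate why the question about the rbe node matters for memory, here is a simplified model of "report by exception". It is not the rbe node's actual source: `createRbe` is a hypothetical name, and keying the remembered value by `msg.topic` is an assumption made for the sketch.

```javascript
// Simplified model of "report by exception"; NOT the rbe node's actual source.
// ASSUMPTION: the last payload is remembered per msg.topic.
function createRbe() {
    const previous = new Map();              // topic -> last payload seen
    return {
        input(msg) {
            const topic = msg.topic || '_no_topic';
            const last = previous.get(topic);
            // Compare buffers by content, everything else by identity/value.
            const same = Buffer.isBuffer(last) && Buffer.isBuffer(msg.payload)
                ? last.equals(msg.payload)
                : last === msg.payload;
            if (same) {
                return null;                 // unchanged: block the message
            }
            previous.set(topic, msg.payload); // this payload stays referenced
            return msg;                      // changed: let it through
        },
        size() {
            return previous.size;            // number of payloads being held
        }
    };
}
```

Any node that compares against the previous value has to keep that value alive, so with image payloads at least one buffer per remembered key stays in memory. If, as assumed here, the key is the topic and every snapshot arrived with a unique topic such as a file name, the cache would grow with each image. Whether that applies to this flow is not established in the thread.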

Still too early to tell, but since I installed v4.9.1 of nodejs about eight hours ago node-red is only using about 15% memory; usually it was 40% or more by this point with v4.8.2.

Function is more important than “security updates” as without the function I could have 100% ironclad cyber security by not running the system!

Security and IOT stuff is a nightmare at the moment. My solution (I don’t trust FLIR corporation’s website to “gateway” my DVR images either) is strong passwords and having the DVR and the Pi3B+ on a private network pushing the AI images to my cell phone via “double NAT” behind my ISP provided router. That is also why I keep Raspbian updated with apt-get update ; apt-get dist-upgrade

There is an issue open on this at the node-red-contrib-ftp-server github.

I can’t run what doesn’t work. The node-red-contrib-ftp-server is supposed to work with 6.x, so I will try that if v4.9.1 doesn’t fix the leak. I’ve no idea why Raspbian Stretch is distributing node-red with v4.8.2.

OTOH with the stability of the Python AI and the auto-restart of node-red, even with the leak it's only a couple of minutes of “down time” every 30+ hours, so it's tolerable as I’m not guarding Ft Knox!

The nodered.service file in /lib/systemd/system/ contains the --max_old_space_size=256 parameter that is used to set when the garbage collection kicks in… BUT it is only a guide, as the GC in node.js is “lazy” in that it doesn’t kick in until the limits are exceeded. So a) it will always appear like it is leaking memory, as the GC won’t kick in all the time, and b) if you are near the limit and then handle a few “large” objects then you may well blow way past it. We picked 256 to try not to be too greedy with Pi memory, but feel free to tune as you see fit.
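One way to see whether growth is happening in the V8 heap, which is what the --max_old_space_size limit applies to, or elsewhere in the process is to log both side by side. A minimal sketch follows; `memorySnapshot` is a hypothetical helper name, and the numbers are only meaningful when the code runs inside the process being investigated.

```javascript
const v8 = require('v8');

// Summarize this process's memory in MB.
// rss is the whole-process footprint (roughly what top reports);
// heapUsed/heapTotal describe the V8 JavaScript heap;
// heapLimit is the ceiling V8 is currently configured with.
function memorySnapshot() {
    const toMB = (bytes) => Math.round(bytes / 1024 / 1024 * 10) / 10;
    const usage = process.memoryUsage();
    return {
        rssMB: toMB(usage.rss),
        heapUsedMB: toMB(usage.heapUsed),
        heapTotalMB: toMB(usage.heapTotal),
        heapLimitMB: toMB(v8.getHeapStatistics().heap_size_limit)
    };
}

console.log(memorySnapshot());
```

The contents of large Buffers are allocated outside the V8 heap, so a process handling many images can show a small heapUsed next to a large rss. The GC lines in the syslog above report a heap of under 40 MB, which may be worth comparing against what top shows for the whole process. To run this inside Node-RED, the `process` and `v8` objects would probably have to be exposed to function nodes through functionGlobalContext in settings.js; that wiring is an assumption and is not shown here.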

Will have a look at that ftp node - but yes, not ideal if they aren’t fixing it yet.

Debian just update their base packages very slowly - so Stretch just happens to be stuck on 4.8.2 - their next release (2019) will be (currently) based on node 8.x - but by then that will be mid-life (with 10.x being the current LTS version…) - Sadly - in order for us to have Node-RED pre-loaded on Pi we need to sit on what is there already so 4.8 it is - for now. But as others have pointed out - it is now beyond end of life and not getting security patches - so not recommended for anything other than initial exploration.

Thanks for the information about where the setting is for when garbage collection kicks in; I will try reducing it if 4.9.1 hasn’t fixed things. It seems better, but it's still too early to tell. I won’t know for sure until probably Friday, unless it runs out of memory again sooner.

My DVR is great at 24/7 HD recording and pretty useless for everything else. My “snapshots” come in bursts of about 15/second for typically 10-30 seconds, and then typically several minutes of no images. Largest size is typically 160-180K per image in bright light, typically 70-90K per image at night under IR. At night there are few false PIR activations, and usually not really false, since when I looked at the 24/7 recording it always seemed to be a neighborhood dog, cat, raccoon, etc. that triggered it.
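To put the burst figures above in perspective, the arithmetic can be sketched as below. `burstSize` is a hypothetical helper, and the inputs are just the approximate rates and image sizes quoted in the post.

```javascript
// Rough size of one snapshot burst, using figures quoted in the post above.
function burstSize(imagesPerSecond, seconds, kilobytesPerImage) {
    const images = imagesPerSecond * seconds;
    return { images: images, kilobytes: images * kilobytesPerImage };
}

// Worst case quoted: 15 images/s for 30 s at 180 KB per image (bright light).
console.log(burstSize(15, 30, 180)); // { images: 450, kilobytes: 81000 }
// Short night-time burst: 15 images/s for 10 s at 70 KB per image.
console.log(burstSize(15, 10, 70));  // { images: 150, kilobytes: 10500 }
```

So a single long daytime burst is on the order of 80 MB of image data arriving at the ftp-server node, before counting any copies made along the flow. That is a sizeable chunk of the Pi 3B+'s 1 GB of RAM if the buffers are not released promptly.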

I won’t delve into the boring details, but I have PIR motion detectors covering the cameras' fields of view and “trigger” the snapshots when a PIR goes active. Unfortunately it's all or nothing from the DVR, so my filter is discarding all the images that are not from cameras covering that PIR, which is most of them, since it's quite rare for multiple PIRs to be active unless someone is actually walking from one field of view to the other (they overlap by design). But since all the cameras are looking at outdoor scenes I get bursts of false PIR activations: on the west and east sides as the sun rises and sets, on the north and south sides as the sun passes directly overhead, and the whole thing is made worse by days like today with bright sun and fast moving clouds.

So I’m hitting a peak of activity now where I get false PIR activation every 4-20 minutes and node-red is still below 20% memory usage. This time yesterday it was over 80% and ran out of memory a few hours later.

I do want to compliment the node-red developers as I find the default behaviors very well thought out and despite the fact node-red has been dying every 30-36 hours it took me a few days to realize it because of the auto restart. I was actually adding another flow to monitor “will” messages from the MQTT broker that was sending the PIR states, otherwise I might have remained blissfully unaware that node-red was dying periodically.

So far since midnight, 642 images have been sent to the AI by my filter, with two detections (when my wife left for work this morning). This means probably over 2500 images have passed through my flow with over 1800 discarded.

As I said I will try 6.x if this is still “leaking” and will upgrade to 8.x or 10.x as soon as node-red-contrib-ftp-server is fixed or I can find a replacement. From looking at the node-red-contrib-ftp-server github source it looks like the issue is in the require() statement of the nodejs libraries he used:
var _ = require('lodash'),
    FtpServer = require('ftpd').FtpServer,
    ip = require('ip'),
    path = require('path'),
    memfs = require('memfs');

If anyone cares, I’ve put a simplified version of my Python Movidius AI detection script and node-red flow with all the ugliness to deal with my FLIR/Lorex DVR ripped out, up on github:
https://github.com/wb666greene/SecurityDVR_AI_addon

But be aware my instructions and wiki are a mess at the moment as I’ve not yet got the hang of github markup.

Looks like this is the underlying issue - https://github.com/nodeftpd/nodeftpd/issues/124
so yes it would seem to say node6 should still work with it… - but uurgh what a car crash.

Thanks for the info, I’ve passed it on to the author of node-red-contrib-ftp-server via his github comments section on the open issue.

Very encouraging sign. Memory for node-red got up to 88% during the worst of the evening false PIR activations around 4-5PM, then a rain shower moved in and stopped the false PIR storm, and memory decreased to 32%.

It never decreased more than a few percent during the v4.8.2 near-monotonic march towards OOM. So v4.9.1 may contain the solution to my memory leak issue.

Time will tell.


Hello wb666greene

I know what I proposed in a previous post is not a solution, but maybe it helps you provisionally… Most probably you are not using the desktop, right? If that is the case, disable X, as this will free up a lot of RAM.

On the other hand, and also as something provisional… why don’t you limit the number of trigger messages per minute, or use some debounce or similar?

Regards

As I said, X has been disabled since my initial development, when it was very useful to have it so I could see input and output images side by side. Once this was working I disabled X and it now runs “headless”.

There is no “debouncing” possible as the PIR sensors read temperature changes across a field of view and these are inevitable when the PIR field of view is outdoors. Believe me, it still is far fewer images than I would have to run through the system if I was using the FLIR/Lorex incredibly poor video motion detector to trigger the snapshots sent to the AI.

The fundamental problem is the lameness of the FLIR/Lorex DVR. I want every snapshot image it sends from a camera covering the PIR field of view, so any “rate limit” is a non-starter. The problem is it's all or nothing, meaning if it sends anything it has to send the snapshots from all the cameras. I’ve seen “better” (no-name) DVRs that have a separate trigger input for each camera. I regret not buying one of these models, but that is water under the bridge.

Just a little follow up. The author of node-red-contrib-ftp-server has updated it to work with 8.x, but apparently the nodejs ftpd module is not working with 10.x.

I’ve upgraded node-red-contrib-ftp-server and nodejs to 8.x, and if anything the leakage is a bit worse than with v4.9.1. In any event it's not a serious problem: after node-red crashes for lack of memory, systemd restarts it in a few seconds, which is inconsequential in the overall scheme of things since my Lorex DVR has a 4+ second latency between triggering and sending a snapshot.

It's been running great despite the periodic crashes and restarts. I hope to try reducing the --max_old_space_size=256 parameter this weekend to make the garbage collection more aggressive. I hope my understanding of this parameter is not reversed :slight_smile: