Batch + Join = Memory Leak?

Hi There!

I was debugging a memory leak in Node-RED and found that the combination of batch and join nodes creates one. The combination might not be strictly necessary in this case, but it demonstrates the leak.

I added the --trace-gc option to node and watched as the memory footprint grew:

Node-RED[30658]: [30658:0x6264b78]   277141 ms: Scavenge 64.7 (67.6) -> 64.3 (67.6) MB, 2.3 / 0.0 ms  (average mu = 0.997, current mu = 0.998) allocation failure;
Node-RED[30658]: [30658:0x6264b78]   285997 ms: Scavenge 65.6 (68.4) -> 65.1 (68.4) MB, 1.8 / 0.0 ms  (average mu = 0.997, current mu = 0.998) allocation failure;
Node-RED[30658]: [30658:0x6264b78]   290869 ms: Scavenge 65.6 (68.4) -> 65.2 (68.4) MB, 2.4 / 0.0 ms  (average mu = 0.997, current mu = 0.998) allocation failure;
Node-RED[30658]: [30658:0x6264b78]   290918 ms: Scavenge 65.6 (68.4) -> 65.2 (68.6) MB, 2.4 / 0.0 ms  (average mu = 0.997, current mu = 0.998) allocation failure;
Node-RED[30658]: [30658:0x6264b78]   291027 ms: Scavenge 65.7 (68.6) -> 65.2 (68.6) MB, 1.3 / 0.0 ms  (average mu = 0.997, current mu = 0.998) allocation failure;
Node-RED[30658]: [30658:0x6264b78]   292245 ms: Scavenge 66.4 (69.4) -> 66.0 (69.4) MB, 2.1 / 0.0 ms  (average mu = 0.997, current mu = 0.998) allocation failure;
Node-RED[30658]: [30658:0x6264b78]   295963 ms: Scavenge 67.2 (70.2) -> 66.8 (70.2) MB, 1.3 / 0.0 ms  (average mu = 0.997, current mu = 0.998) allocation failure;
Node-RED[30658]: [30658:0x6264b78]   296006 ms: Scavenge 67.2 (70.2) -> 66.8 (70.2) MB, 3.3 / 0.0 ms  (average mu = 0.997, current mu = 0.998) allocation failure;
Node-RED[30658]: [30658:0x6264b78]   296059 ms: Mark-sweep 66.8 (70.2) -> 66.2 (70.2) MB, 11.3 / 0.7 ms  (+ 43.3 ms in 116 steps since start of marking, biggest step 1.0 ms, walltime since start of marking 82 ms) (average mu = 0.998, current mu = 0.999) finalize incremental marking via task; GC in old space requested
Node-RED[30658]: [30658:0x6264b78]   304256 ms: Mark-sweep (reduce) 66.5 (70.2) -> 66.2 (70.2) MB, 10.1 / 0.1 ms  (+ 76.5 ms in 65 steps since start of marking, biggest step 21.1 ms, walltime since start of marking 91 ms) (average mu = 0.996, current mu = 0.989) finalize incremental marking via task; GC in old space requested
Node-RED[30658]: [30658:0x6264b78]   304918 ms: Mark-sweep (reduce) 66.2 (70.2) -> 66.2 (69.9) MB, 14.3 / 0.0 ms  (+ 44.4 ms in 70 steps since start of marking, biggest step 20.7 ms, walltime since start of marking 62 ms) (average mu = 0.994, current mu = 0.912) finalize incremental marking via task; GC in old space requested
Node-RED[30658]: [30658:0x6264b78]   305918 ms: Scavenge 66.7 (69.9) -> 66.2 (69.9) MB, 2.5 / 0.0 ms  (average mu = 0.994, current mu = 0.912) allocation failure;
Node-RED[30658]: [30658:0x6264b78]   305968 ms: Scavenge 66.7 (69.9) -> 66.3 (69.9) MB, 1.9 / 0.0 ms  (average mu = 0.994, current mu = 0.912) allocation failure;
Node-RED[30658]: [30658:0x6264b78]   306028 ms: Scavenge 66.7 (69.9) -> 66.3 (69.9) MB, 1.9 / 0.0 ms  (average mu = 0.994, current mu = 0.912) allocation failure;
Node-RED[30658]: [30658:0x6264b78]   306385 ms: Scavenge 67.5 (70.7) -> 67.0 (70.7) MB, 1.2 / 0.0 ms  (average mu = 0.994, current mu = 0.912) allocation failure;
Node-RED[30658]: [30658:0x6264b78]   307242 ms: Scavenge 67.5 (70.7) -> 67.2 (70.7) MB, 2.8 / 0.0 ms  (average mu = 0.994, current mu = 0.912) allocation failure;
Node-RED[30658]: [30658:0x6264b78]   315969 ms: Scavenge 68.4 (71.5) -> 68.0 (71.5) MB, 1.2 / 0.0 ms  (average mu = 0.994, current mu = 0.912) allocation failure;
Node-RED[30658]: [30658:0x6264b78]   316417 ms: Scavenge 68.4 (71.5) -> 68.0 (71.5) MB, 2.2 / 0.0 ms  (average mu = 0.994, current mu = 0.912) allocation failure;
Node-RED[30658]: [30658:0x6264b78]   320915 ms: Scavenge 68.4 (71.5) -> 68.1 (71.5) MB, 2.7 / 0.0 ms  (average mu = 0.994, current mu = 0.912) allocation failure;
Node-RED[30658]: [30658:0x6264b78]   321017 ms: Scavenge 68.5 (71.5) -> 68.1 (71.7) MB, 1.7 / 0.0 ms  (average mu = 0.994, current mu = 0.912) allocation failure;
Node-RED[30658]: [30658:0x6264b78]   322145 ms: Scavenge 69.3 (72.5) -> 68.8 (72.5) MB, 1.5 / 0.0 ms  (average mu = 0.994, current mu = 0.912) allocation failure;
Node-RED[30658]: [30658:0x6264b78]   322242 ms: Scavenge 69.3 (72.5) -> 68.8 (72.5) MB, 1.8 / 0.0 ms  (average mu = 0.994, current mu = 0.912) allocation failure;                                                                                

(the above was over a timeframe of about a minute)
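For anyone who wants to reproduce this, enabling GC tracing is just a matter of passing the flag through to node. A minimal sketch, assuming a global npm install on Linux (the red.js path is an assumption; adjust it for your setup):

```shell
# Start Node-RED with V8 GC tracing; each collection is logged to stdout.
# The red.js path below is an assumption for a global npm install --
# locate yours with: readlink -f "$(command -v node-red)"
node --trace-gc /usr/lib/node_modules/node-red/red.js
```

Each logged line shows heap size before -> after the collection, which is what makes the steady upward drift easy to spot.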

The flow:

[{"id":"94e7f200ff36755c","type":"inject","z":"4ccd24f6fa1b409f","name":"request 1","props":[{"p":"url","v":"https://en.wikipedia.org/w/api.php?hidebots=1&hidecategorization=1&hideWikibase=1&urlversion=1&days=7&limit=50&action=feedrecentchanges&feedformat=atom","vt":"str"},{"p":"requestTimeout","v":"4500","vt":"num"}],"repeat":"15","crontab":"","once":true,"onceDelay":0.1,"topic":"","x":659,"y":629,"wires":[["0b63a3a452d89c91"]]},{"id":"6c2f481c2a26cf36","type":"inject","z":"4ccd24f6fa1b409f","name":"request 3","props":[{"p":"url","v":"https://hnrss.org/newcomments","vt":"str"},{"p":"requestTimeout","v":"4500","vt":"num"}],"repeat":"10","crontab":"","once":true,"onceDelay":0.1,"topic":"","x":659,"y":743.3333536783854,"wires":[["0b63a3a452d89c91"]]},{"id":"d3459fc98422afe3","type":"inject","z":"4ccd24f6fa1b409f","name":"request 4","props":[{"p":"url","v":"https://hnrss.org/newest","vt":"str"},{"p":"requestTimeout","v":"4500","vt":"num"}],"repeat":"10","crontab":"","once":true,"onceDelay":0.1,"topic":"","x":659,"y":799.0000203450521,"wires":[["0b63a3a452d89c91"]]},{"id":"3395e2692741e172","type":"inject","z":"4ccd24f6fa1b409f","name":"request 2","props":[{"p":"url","v":"https://en.wikipedia.org/w/index.php?title=Special:NewPages&feed=atom","vt":"str"},{"p":"requestTimeout","v":"4500","vt":"num"}],"repeat":"15","crontab":"","once":true,"onceDelay":0.1,"topic":"","x":659,"y":687.6666870117188,"wires":[["0b63a3a452d89c91"]]},{"id":"0b63a3a452d89c91","type":"http request","z":"4ccd24f6fa1b409f","name":"","method":"GET","ret":"txt","paytoqs":"ignore","url":"","tls":"","persist":false,"proxy":"","insecureHTTPParser":false,"authType":"","senderr":false,"headers":[],"x":1023,"y":718,"wires":[["bbab5c7d5095a7d3"]]},{"id":"bbab5c7d5095a7d3","type":"batch","z":"4ccd24f6fa1b409f","name":"","mode":"count","count":"2","overlap":0,"interval":10,"allowEmptySequence":false,"topics":[],"x":1281,"y":570,"wires":[["74198797038fe043"]]},{"id":"c6b40f5b5b408e9d","type":"debug","z":"4ccd24f6fa1b409f","name":"debug 37","active":true,"tosidebar":false,"console":false,"tostatus":true,"complete":"payload","targetType":"msg","statusVal":"","statusType":"counter","x":1646,"y":789,"wires":[]},{"id":"74198797038fe043","type":"join","z":"4ccd24f6fa1b409f","name":"","mode":"auto","build":"object","property":"payload","propertyType":"msg","key":"topic","joiner":"\\n","joinerType":"str","accumulate":true,"timeout":"","count":"","reduceRight":false,"reduceExp":"","reduceInit":"","reduceInitType":"","reduceFixup":"","x":1438,"y":639,"wires":[["c6b40f5b5b408e9d"]]},{"id":"68b0a8c3a40cc8cd","type":"catch","z":"4ccd24f6fa1b409f","name":"","scope":["0b63a3a452d89c91"],"uncaught":false,"x":1032,"y":692,"wires":[["bbab5c7d5095a7d3"]]}]

Using Node-RED 3.0.2 with Node.js 18 on a Raspberry Pi, but it also happens on an AWS box.

Perhaps someone can confirm or deny the leak ...

Cheers!

Why are you sending the output of the catch node to the batch node?

What URL are you using in the http request node? (It does not get exported.)

The URL is defined in the respective inject node: RSS feeds from Wikipedia and Hacker News.

The catch ensures that batch receives a message whether or not the http request node fails. I started doing that after running into issues with split and join, and it's a habit now.

But even without the exception handling, I get a memory leak.

I just tried the same flow on 3.1.0-beta.4 with Node 18 and it doesn't show the same leak, so it appears to be a 3.0.2 thing.

I let it run for ten minutes; in that time the 3.0.2 instance had gained 60 MB while the 3.1.0 instance hadn't moved at all. Nothing else was running on either.

Oh good because I was just about to reply saying I don't see any memory issue on 3.1.0 beta4 :slightly_smiling_face:

Yes, the joys of having Raspberry Pis lying around: I could do a simple check with the other one ...

I also took out the catch node and saw the same effect.

It seems to be the combination of join + batch; I tried it with just a batch node and that appeared to work fine.

P.S. The flow I originally created using this combination of nodes has since been refactored and no longer uses it, so I suspect this is an edge-case type of flow ...

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.