Node-RED application gradually increases memory usage until it crashes

Hello, we have Node-RED running on our device. We prebuilt Node.js and Node-RED from source on a Linux build machine with target-specific configuration (the target being the device), moved the built package to the device, and started Node-RED from an init.d script using start-stop-daemon.
Versions used in the package are Node.js v14.12 and Node-RED v1.2.3.
.node-red/package.json is as follows.

{
    "name": "node-red-project",
    "description": "A Node-RED Project",
    "version": "0.0.1",
    "private": true,
    "dependencies": {
        "node-red-contrib-azure-iot-hub": "^0.4.0",
        "node-red-contrib-google-cloud": "0.0.19",
        "node-red-contrib-influxdb": "^0.5.3",
        "node-red-contrib-opcua": "^0.2.91",
        "node-red-contrib-serial-modbus": "0.0.11"
    }
}

The problem is that memory utilization gradually goes up; we can see the %MEM value for the process climbing in the output of the "top" command. Initially the growth was very fast and the application would crash with a JavaScript heap out of memory error. Someone on the community forum mentioned that they saw an improvement after deleting Debug nodes from their flow, and deleting our Debug nodes drastically reduced the rate of memory growth.

But the memory still increases. Per our recent observation the percentage climbs from 6% to around 27% over 3 days; with the Debug nodes it would reach that level within an hour.

Please suggest what the issue with the Debug node in Node-RED might be, and how we can detect the Node.js memory leak that is causing the increase in memory utilization and eventually crashing the application.

What versions of Node-RED and Node.js are you running?

@Colin Thanks for the quick response. I have updated the post with the Node.js and Node-RED version details.

Can you post the Node-RED log output from just before it crashes, please?

The latest version is 1.2.7.

@zenofmud Node-RED v1.2.3 was used because we had taken the Node-RED source into our build machine; we don't create a package every time, we just build it locally. Still, to rule out a version issue, I changed to Node.js v14.15.0 and Node-RED v1.2.7 and get the same issue.
@Colin Below is the log output I got from the new build, i.e. Node-RED v1.2.7; the output was the same for the older version.

[23989:0x5602b038d250] 231916414 ms: Scavenge 1006.7 (1025.4) -> 1005.8 (1026.4) MB, 22.4 / 0.0 ms (average mu = 0.872, current mu = 0.848) allocation failure
[23989:0x5602b038d250] 231916526 ms: Scavenge 1006.7 (1025.4) -> 1005.8 (1026.4) MB, 22.4 / 0.1 ms (average mu = 0.872, current mu = 0.848) allocation failure
[23989:0x5602b038d250] 231916613 ms: Scavenge 1006.8 (1026.4) -> 1005.9 (1026.4) MB, 17.8 / 0.0 ms (average mu = 0.872, current mu = 0.848) allocation failure
[23989:0x5602b038d250] 231916777 ms: Scavenge 1006.8 (1026.4) -> 1005.8 (1026.4) MB, 24.8 / 0.0 ms (average mu = 0.872, current mu = 0.848) allocation failure
[23989:0x5602b038d250] 231916943 ms: Scavenge 1006.8 (1026.4) -> 1005.8 (1026.4) MB, 21.8 / 0.1 ms (average mu = 0.872, current mu = 0.848) allocation failure
15 Jan 09:20:49 - [info] [azureiothub:Azure IoT Hub] Message sent.
[23989:0x5602b038d250] 231917083 ms: Scavenge 1006.8 (1028.4) -> 1005.9 (1028.4) MB, 24.9 / 0.0 ms (average mu = 0.872, current mu = 0.848) allocation failure
[23989:0x5602b038d250] 231922048 ms: Mark-sweep 1007.8 (1028.4) -> 1005.6 (1028.6) MB, 4646.4 / 0.7 ms (average mu = 0.781, current mu = 0.233) task scavenge might not succeed
15 Jan 09:20:54 - [info] [azureiothub:Azure IoT Hub] JSON
15 Jan 09:20:54 - [info] [azureiothub:Azure IoT Hub] Sending Message to Azure IoT Hub :
Payload: 1610682649477
15 Jan 09:20:54 - [info] [azureiothub:Azure IoT Hub] JSON
15 Jan 09:20:54 - [info] [azureiothub:Azure IoT Hub] Sending Message to Azure IoT Hub :
Payload: 1610682654270
15 Jan 09:20:55 - [info] [azureiothub:Azure IoT Hub] Message sent.
15 Jan 09:20:55 - [info] [azureiothub:Azure IoT Hub] Message sent.
[23989:0x5602b038d250] 231928290 ms: Mark-sweep 1007.7 (1028.6) -> 1005.6 (1028.1) MB, 5197.8 / 0.2 ms (average mu = 0.642, current mu = 0.167) allocation failure scavenge might not succeed
15 Jan 09:21:00 - [info] [azureiothub:Azure IoT Hub] JSON
15 Jan 09:21:00 - [info] [azureiothub:Azure IoT Hub] Sending Message to Azure IoT Hub :
Payload: 1610682660592
[23989:0x5602b038d250] 231933971 ms: Mark-sweep 1007.7 (1028.1) -> 1005.5 (1028.4) MB, 5297.6 / 0.3 ms (average mu = 0.474, current mu = 0.068) allocation failure scavenge might not succeed
15 Jan 09:21:06 - [error] [modbusSerialConfig:Modbus63] Error: {"name":"TransactionTimedOutError","message":"Timed out","errno":"ETIMEDOUT"}
15 Jan 09:21:06 - [info] [azureiothub:Azure IoT Hub] Message sent.
15 Jan 09:21:07 - [info] [azureiothub:Azure IoT Hub] JSON
15 Jan 09:21:07 - [info] [azureiothub:Azure IoT Hub] Sending Message to Azure IoT Hub :
Payload: 1610682666310
[23989:0x5602b038d250] 231939961 ms: Mark-sweep 1007.6 (1028.4) -> 1005.5 (1028.6) MB, 5048.3 / 0.2 ms (average mu = 0.353, current mu = 0.157) allocation failure scavenge might not succeed
15 Jan 09:21:13 - [info] [azureiothub:Azure IoT Hub] Message sent.
[23989:0x5602b038d250] 231946153 ms: Mark-sweep 1007.5 (1028.6) -> 1005.4 (1028.4) MB, 5169.1 / 0.2 ms (average mu = 0.270, current mu = 0.165) allocation failure scavenge might not succeed
15 Jan 09:21:18 - [error] [modbusSerialConfig:Modbus63] Error: {"name":"TransactionTimedOutError","message":"Timed out","errno":"ETIMEDOUT"}
15 Jan 09:21:18 - [info] [azureiothub:Azure IoT Hub] JSON
15 Jan 09:21:18 - [info] [azureiothub:Azure IoT Hub] Sending Message to Azure IoT Hub :
Payload: 1610682672218
15 Jan 09:21:18 - [info] [azureiothub:Azure IoT Hub] JSON
15 Jan 09:21:18 - [info] [azureiothub:Azure IoT Hub] Sending Message to Azure IoT Hub :
Payload: 1610682678361
[23989:0x5602b038d250] 231951681 ms: Mark-sweep 1007.6 (1028.4) -> 1005.6 (1028.4) MB, 5119.1 / 0.3 ms (average mu = 0.184, current mu = 0.074) allocation failure scavenge might not succeed
[23989:0x5602b038d250] 231957089 ms: Mark-sweep 1007.6 (1028.4) -> 1005.5 (1028.4) MB, 5205.5 / 0.2 ms (average mu = 0.116, current mu = 0.037) task scavenge might not succeed
15 Jan 09:21:29 - [info] [azureiothub:Azure IoT Hub] Message sent.
15 Jan 09:21:29 - [info] [azureiothub:Azure IoT Hub] Message sent.
15 Jan 09:21:29 - [error] [modbusSerialConfig:Modbus63] Error: {"name":"TransactionTimedOutError","message":"Timed out","errno":"ETIMEDOUT"}

<--- Last few GCs --->

[23989:0x5602b038d250] 231951681 ms: Mark-sweep 1007.6 (1028.4) -> 1005.6 (1028.4) MB, 5119.1 / 0.3 ms (average mu = 0.184, current mu = 0.074) allocation failure scavenge might not succeed
[23989:0x5602b038d250] 231957089 ms: Mark-sweep 1007.6 (1028.4) -> 1005.5 (1028.4) MB, 5205.5 / 0.2 ms (average mu = 0.116, current mu = 0.037) task scavenge might not succeed

<--- JS stacktrace --->

FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
1: 0x5602acf6bfc0 node::Abort() [node-red]
2: 0x5602acebc3b6 node::FatalError(char const*, char const*) [node-red]
3: 0x5602ad0c9002 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [node-red]
4: 0x5602ad0c925a v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [node-red]
5: 0x5602ad256595 [node-red]
6: 0x5602ad2566d4 [node-red]
7: 0x5602ad267eb7 v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags) [node-red]
8: 0x5602ad2686f5 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [node-red]
9: 0x5602ad26ba1c v8::internal::Heap::AllocateRawWithLightRetrySlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [node-red]
10: 0x5602ad26bf65 v8::internal::Heap::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [node-red]
11: 0x5602ad236c0a v8::internal::factory::NewFillerObject(int, bool, v8::internal::AllocationType, v8::internal::AllocationOrigin) [node-red]
12: 0x5602ad5355ef v8::internal::Runtime_AllocateInYoungGeneration(int, unsigned long*, v8::internal::Isolate*) [node-red]
13: 0x5602ad88e5b9 [node-red]

I have searched through many options; one that I added to the init script is --max-old-space-size=1024. For testing I changed this value to 2048 and 2560 on two different devices, and I can see memory increasing at the same rate on both; the only difference is that the higher the value, the longer it takes to crash. The log above may give you an idea about the issue.
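
To confirm the flag is actually being picked up by our prebuilt node binary, I use a minimal check along these lines (the file name is just an example, and it assumes it is run with the same binary and flags as the init script):

// check-heap-limit.js -- minimal sketch, run with the same node binary and
// flags as the init script, e.g.:
//   node --max-old-space-size=2048 check-heap-limit.js
const v8 = require('v8');

// heap_size_limit is reported in bytes; convert to MB to compare with the flag value.
const limitMb = v8.getHeapStatistics().heap_size_limit / (1024 * 1024);
console.log('V8 heap size limit: ' + Math.round(limitMb) + ' MB');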

It seems something is leaking memory. Such problems can be very difficult to find, as I expect you know. You might try disabling sections of the flow in order to work out what is causing it. I have had Node-RED running fair-sized flows on Pis for months without issue, so there is no big fundamental leak in Node-RED itself. It may well be one of the contrib nodes, or if you are handling large chunks of data, perhaps something in your own code is not releasing it.
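
If you want a finer-grained view than top while you disable sections, one option (just a rough sketch, and it assumes you can edit settings.js on the device and restart Node-RED) is to expose the built-in v8 module to Function nodes via functionGlobalContext and log heap statistics from an Inject node firing every minute or so:

// In settings.js (assumption: you have access to it on the device):
//   functionGlobalContext: { v8: require('v8') },

// Function node body, wired to an Inject node set to repeat every minute.
// Logging the trend while different flow sections are disabled should show
// which part makes the heap keep climbing.
const v8 = global.get('v8');
if (!v8) {
    node.warn('v8 is not exposed in functionGlobalContext');
    return null;
}
const stats = v8.getHeapStatistics();
node.log('heapUsed=' + (stats.used_heap_size / 1048576).toFixed(1) + 'MB' +
         ' heapTotal=' + (stats.total_heap_size / 1048576).toFixed(1) + 'MB' +
         ' heapLimit=' + (stats.heap_size_limit / 1048576).toFixed(1) + 'MB');
return null;

Once you have narrowed it down, v8.writeHeapSnapshot() (available in Node.js 12 and later) can write a heap snapshot that you can load into Chrome DevTools to see which objects are being retained.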