Flowfuse Device Agent crashes. FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory

Hi...

I have been using FlowFuse for more than a year with no real issues. There are two VMs, Test and Prod, in the environment.

Test is the FlowFuse environment. I can edit, deploy (and access) the flows to the local Node-RED with no issues.

Prod is a FlowFuse device agent. Up until yesterday I was able to deploy new flows to the device agent without any issues.

Version: @flowfuse/device-agent@3.2.0
debug> v8.getHeapStatistics()
{
  total_heap_size: 7049216,
  total_heap_size_executable: 262144,
  total_physical_size: 6262784,
  total_available_size: 1961951936,
  used_heap_size: 5821592,
  heap_size_limit: 1966866432,
  malloced_memory: 278624,
  peak_malloced_memory: 1851624,
  does_zap_garbage: 0,
  number_of_native_contexts: 3,
  number_of_detached_contexts: 0,
  total_global_handles_size: 8192,
  used_global_handles_size: 3072,
  external_memory: 2631998
}
Apr 30 09:17:53 prod1 FlowFuseDevice[86800]: [NR] 4/30/2025 9:17:53 AM [info] Node-RED version: v3.1.8
Apr 30 09:17:53 prod1 FlowFuseDevice[86800]: [NR] 4/30/2025 9:17:53 AM [info] Node.js  version: v20.19.1
Apr 30 09:17:53 prod1 FlowFuseDevice[86800]: [NR] 4/30/2025 9:17:53 AM [info] Linux 5.14.0-503.19.1.el9_5.cloud.1.0.x86_64 x64 LE
Apr 30 09:17:53 prod1 FlowFuseDevice[86800]: [NR] 4/30/2025 9:17:53 AM [info] Loading palette nodes
Apr 30 09:17:53 prod1 FlowFuseDevice[86800]: [NR] 4/30/2025 9:17:53 AM [info] FlowFuse Light Theme Plugin loaded
Apr 30 09:17:53 prod1 FlowFuseDevice[86800]: [NR] 4/30/2025 9:17:53 AM [info] FlowFuse Dark Theme Plugin loaded
Apr 30 09:17:54 prod1 FlowFuseDevice[86800]: [NR] 4/30/2025 9:17:54 AM [warn] ------------------------------------------------------
Apr 30 09:17:54 prod1 FlowFuseDevice[86800]: [NR] 4/30/2025 9:17:54 AM [warn] [@flowfuse/nr-file-nodes/file] 'file in' already registered by module node-red
Apr 30 09:17:54 prod1 FlowFuseDevice[86800]: [NR] 4/30/2025 9:17:54 AM [warn] [@flowfuse/nr-project-nodes/project-link] Error: Project Link nodes cannot be loaded outside of an FlowFuse EE environment (line:6)
Apr 30 09:17:54 prod1 FlowFuseDevice[86800]: [NR] 4/30/2025 9:17:54 AM [warn] ------------------------------------------------------
Apr 30 09:17:54 prod1 FlowFuseDevice[86800]: [NR] 4/30/2025 9:17:54 AM [info] Settings file  : /opt/flowfuse-device/project/settings.js
Apr 30 09:17:54 prod1 FlowFuseDevice[86800]: [NR] 4/30/2025 9:17:54 AM [info] Context store  : 'memory' [module=memory]
Apr 30 09:17:54 prod1 FlowFuseDevice[86800]: [NR] 4/30/2025 9:17:54 AM [info] Context store  : 'persistent' [module=localfilesystem]
Apr 30 09:17:54 prod1 FlowFuseDevice[86800]: [NR] 4/30/2025 9:17:54 AM [info] User directory : /opt/flowfuse-device/project

The "heap" errors on Prod appeared after I added new flows to Test and deployed the snapshot to Prod.

Here's some of the log...

Apr 30 09:30:20 prod1 FlowFuseDevice[86800]: [NR] 4/30/2025 9:30:20 AM [info] Stopped flows
Apr 30 09:30:21 prod1 FlowFuseDevice[86800]: [AGENT] 4/30/2025 9:30:21 AM [info] Node-RED Stopped
Apr 30 09:30:21 prod1 FlowFuseDevice[86800]: [AGENT] 4/30/2025 9:30:21 AM [info] Launching with new settings...
Apr 30 09:30:21 prod1 FlowFuseDevice[86800]: [AGENT] 4/30/2025 9:30:21 AM [info] Configuration :-
Apr 30 09:30:21 prod1 FlowFuseDevice[86800]: [AGENT] 4/30/2025 9:30:21 AM [info]   * Instance           : unknown
Apr 30 09:30:21 prod1 FlowFuseDevice[86800]: [AGENT] 4/30/2025 9:30:21 AM [info]   * Snapshot           : y4ejlmAzBq
Apr 30 09:30:21 prod1 FlowFuseDevice[86800]: [AGENT] 4/30/2025 9:30:21 AM [info]   * Settings           : 1c498a9fd8e5540fd688a076bf485fa823d9e626785b638dca0f9ceb69b26ce3
Apr 30 09:30:21 prod1 FlowFuseDevice[86800]: [AGENT] 4/30/2025 9:30:21 AM [info]   * Operation Mode     : autonomous
Apr 30 09:30:21 prod1 FlowFuseDevice[86800]: [AGENT] 4/30/2025 9:30:21 AM [info]   * Target State       : running
Apr 30 09:30:21 prod1 FlowFuseDevice[86800]: [AGENT] 4/30/2025 9:30:21 AM [info]   * Local Login        : disabled
Apr 30 09:30:21 prod1 FlowFuseDevice[86800]: [AGENT] 4/30/2025 9:30:21 AM [info]   * Licensed           : unknown
Apr 30 09:30:21 prod1 FlowFuseDevice[86800]: [AGENT] 4/30/2025 9:30:21 AM [info] Environment :-
Apr 30 09:30:21 prod1 FlowFuseDevice[86800]: [AGENT] 4/30/2025 9:30:21 AM [info]   * FF_DEVICE_ID       : XgVAQxjLqB
Apr 30 09:30:21 prod1 FlowFuseDevice[86800]: [AGENT] 4/30/2025 9:30:21 AM [info]   * FF_DEVICE_NAME     : Prod1
Apr 30 09:30:21 prod1 FlowFuseDevice[86800]: [AGENT] 4/30/2025 9:30:21 AM [info]   * FF_DEVICE_TYPE     : XXXXXXXXX
Apr 30 09:30:21 prod1 FlowFuseDevice[86800]: [AGENT] 4/30/2025 9:30:21 AM [info]   * FF_SNAPSHOT_ID     : y4ejlmAzBq
Apr 30 09:30:21 prod1 FlowFuseDevice[86800]: [AGENT] 4/30/2025 9:30:21 AM [info]   * FF_SNAPSHOT_NAME   : YYYYYYYYY
Apr 30 09:30:21 prod1 FlowFuseDevice[86800]: [AGENT] 4/30/2025 9:30:21 AM [info] Updating configuration files
.
.
.
Apr 30 09:30:43 prod1 FlowFuseDevice[86800]: [NR] 4/30/2025 9:30:43 AM [info]   x: 1000,
Apr 30 09:30:43 prod1 FlowFuseDevice[86800]: [NR] 4/30/2025 9:30:43 AM [info]   y: 80,
Apr 30 09:30:43 prod1 FlowFuseDevice[86800]: [NR] 4/30/2025 9:30:43 AM [info]   wires: [ [ 'bba5055619f4107c-e55f9e23.101b2' ] ],
Apr 30 09:30:43 prod1 FlowFuseDevice[86800]: [NR] 4/30/2025 9:30:43 AM [info]   _alias: 'ffed1e7.f04326'
Apr 30 09:30:43 prod1 FlowFuseDevice[86800]: [NR] 4/30/2025 9:30:43 AM [info] }
Apr 30 09:30:44 prod1 FlowFuseDevice[86800]: [NR] 4/30/2025 9:30:44 AM [info] <--- Last few GCs --->
Apr 30 09:30:44 prod1 FlowFuseDevice[86800]: [NR] 4/30/2025 9:30:44 AM [info] [87265:0x6a792c0]    16122 ms: Mark-Compact (reduce) 493.0 (521.7) -> 491.1 (521.4) MB, 902.47 / 0.00 ms  (average mu = 0.223, current mu = 0.093) allocation failure; scavenge might not succeed
Apr 30 09:30:44 prod1 FlowFuseDevice[86800]: [NR] 4/30/2025 9:30:44 AM [info] [87265:0x6a792c0]    17089 ms: Mark-Compact (reduce) 492.0 (521.7) -> 491.2 (521.9) MB, 772.83 / 0.00 ms  (average mu = 0.213, current mu = 0.200) allocation failure; GC in old space requested
Apr 30 09:30:44 prod1 FlowFuseDevice[86800]: [NR] 4/30/2025 9:30:44 AM [info] <--- JS stacktrace --->
Apr 30 09:30:44 prod1 FlowFuseDevice[86800]: [NR] 4/30/2025 9:30:44 AM [info] FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory

Tried playing with --max-old-space-size but it had no real effect.
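
For reference, the figures quoted above can be put on the same scale. This is only a sketch using numbers copied from this post; the variable names are made up for illustration. Note that the GC lines carry a different process id (87265) from the FlowFuseDevice[86800] log prefix, and it is not clear from the post which process the v8.getHeapStatistics() output was taken from.

```javascript
// Figures copied from the post above; names are illustrative only.
const MiB = 1024 * 1024;
const heapSizeLimitBytes = 1966866432; // heap_size_limit from v8.getHeapStatistics()
const lastGcHeapMB = 491.2;            // heap size after the final Mark-Compact in the log

// Convert the reported limit from bytes to MiB.
const heapLimitMiB = heapSizeLimitBytes / MiB;
console.log(`reported heap_size_limit: ${heapLimitMiB.toFixed(2)} MiB`); // 1875.75 MiB

// How full was the crashing heap relative to that reported limit?
const fractionUsed = lastGcHeapMB / heapLimitMiB;
console.log(`heap at crash vs reported limit: ${(fractionUsed * 100).toFixed(1)}%`);
```

The process that crashed was at roughly 491 MB, far below the 1875.75 MiB reported by the debug session, which suggests the crashing process was running under a lower limit than the one inspected.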

I exported all the flows and loaded them into a standalone Node-RED, i.e. not the Node-RED run by the FlowFuse device agent, and the flows loaded without any issues.

Any assistance would be greatly appreciated!

Hi @HermanWiid, welcome to the forum.

First let me state, this is not a FlowFuse issue. By labelling it as FlowFuse and posting it in this category, you will get fewer responses, because users who are not familiar with FlowFuse may pass it by.

So, your error is related to heap memory usage in a Node.js application - in this case, Node-RED.

Unfortunately these are particularly tricky to track down, but there are numerous threads on this forum that you could draw on for help.

The error gets thrown in a variety of situations. Here are a few:

  • When attempting to process large amounts of data or execute complex algorithms that require a lot of memory.
  • When running a Node.js server with a large number of connections or concurrent requests.
  • When running a Node.js application in a container or virtual machine with limited memory resources.

Could you list details like the VM size and available memory? If you have access to the VM, perhaps you could use top or htop to check its resources? Also, which contrib nodes are you using?


Closing thoughts...

It is most likely, since this occurred after a modification, that there is an issue in your flows. This can happen if you cause backpressure (i.e. lots of messages building up in a loop or delay, aka "in flight"). I would consider rolling back to a snapshot where you did not have this problem. Additionally, Node-RED v3.1.8 is not current. I would likely update to v4.0.9.

Hi @Steve-Mcl

Thanks for your input.

Since I was able to load the flows in Node-RED outside of the FlowFuse Device Agent environment, it seemed more likely to be an FF issue. My thinking was that the FF DA loader was somehow starting Node-RED with constrained memory parameters.

Top for the Test VM (where everything is working)

top - 14:00:40 up 162 days, 19:43,  1 user,  load average: 0.71, 0.25, 0.09
Tasks: 126 total,   1 running, 125 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.8 us,  0.3 sy,  0.0 ni, 98.3 id,  0.0 wa,  0.3 hi,  0.2 si,  0.0 st
MiB Mem :   3656.5 total,   1674.3 free,   1611.1 used,    737.1 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.   2045.4 avail Mem

Top for the Prod VM

top - 13:52:49 up 17:30,  1 user,  load average: 0.00, 0.02, 0.00
Tasks: 121 total,   1 running, 120 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.8 us,  0.5 sy,  0.0 ni, 98.0 id,  0.0 wa,  0.5 hi,  0.2 si,  0.0 st
MiB Mem :   3655.5 total,   1138.1 free,   1355.9 used,   1474.1 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.   2299.6 avail Mem

Contrib Nodes

@digitaloak/node-red-contrib-digitaloak-postgresql
node-red-contrib-auth
node-red-contrib-crypto-js
node-red-contrib-jwt

Node-RED v3.1.8 is bundled with the FF DA installation.

Thanks!

No, the device agent installs the Node-RED version specified in the device settings or whatever value (if any) is in your target snapshot. So, for example, you can override this in device settings -> node-red -> version.

Actually, that is a valid point. In the launcher file, we do add --max_old_space_size=512. That caps the old-generation heap of the Node-RED process at roughly 512 MB: as the heap fills up, Node.js has to run garbage collection (GC) cycles to free space, and if GC cannot free enough, the process aborts with the "heap out of memory" error. Coupled with your comment about "now not able to modify flows" (my mind reads "you have recently modified flows"), I don't think it is a stretch to conclude that something has changed and the GC is now unable to keep up with memory allocations.

Are you able to try something out? If possible, edit the file lib/launcher.js in your device agent installation and change the line to something like '--max_old_space_size=1024' (or remove it altogether)
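
To see what the flag does independently of the device agent, here is a minimal sketch that starts a child Node.js process with a given V8 flag and reports the heap limit that child ends up with. It does not reproduce the agent's launcher code; the helper name childHeapLimitMiB is made up for illustration.

```javascript
const { execFileSync } = require('child_process');

// Hypothetical helper: start a child Node.js process with the given V8 flag
// and return the heap_size_limit it reports, rounded to whole MiB.
function childHeapLimitMiB(flag) {
  const script = 'console.log(require("v8").getHeapStatistics().heap_size_limit)';
  const out = execFileSync(process.execPath, [flag, '-e', script], { encoding: 'utf8' });
  return Math.round(Number(out.trim()) / (1024 * 1024));
}

const limit512 = childHeapLimitMiB('--max_old_space_size=512');
const limit1024 = childHeapLimitMiB('--max_old_space_size=1024');

// The reported limit is usually somewhat above the flag value, because
// heap_size_limit also accounts for space outside the old generation.
console.log({ limit512, limit1024 });
```

Running v8.getHeapStatistics() inside the Node-RED process itself (rather than in the agent or a separate debug session) is the way to confirm which limit actually applies to the flows.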

Thanks, will keep that in mind when I do an update.

Bingo! :slight_smile:
Changed --max_old_space_size from 512 to 1024. All flows loaded without any issues.

Thanks a million for all your assistance. Really appreciated.

Thanks for confirming.

I have raised an issue to assess this going forward.