Local context storage not intended for large data arrays?

I am storing very large data arrays (roughly array[8000] with 20+ objects per entry) in global context. I have set the context store to the local filesystem in settings.js.
This data is updated every 5 minutes, 24/7. Every 5 minutes a new entry is added at the top of the array and one is removed from the bottom. Every array entry has a timestamp.
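In pseudocode, each cycle does roughly this (just a sketch; "history" and the entry fields are placeholders, not my real key names):

// Function node, triggered every 5 minutes
let history = global.get("history") || [];

// a new entry goes in at the top of the array...
history.unshift({
    timestamp: Date.now()
    // ...plus the 20+ data objects that belong to this entry
});

// ...and the oldest entry drops off the bottom to keep ~8000 entries
if (history.length > 8000) {
    history.pop();
}

global.set("history", history);
return msg;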

But I keep running into some weird behavioural issues. Every now and then some 4000 entries go missing.
Like this morning: the last 35 entries are in sequence, but the 36th entry is from two weeks ago. Yesterday the array was intact.
If I had to guess, I would say the data wasn't dumped to disk for two weeks, and this morning something reset and the last data it found on the disk was two weeks old?

So is this my fault as context store was never intended to be used for such large arrays? What other options do I have to store that amount of data?

Hi @metter

how have you got context configured? It sounds like you have it set up with a local-filesystem store - have you changed any of the default settings (flush interval, etc)?

There shouldn't necessarily be anything inherently wrong with using context in that way - although it is not as optimised as a proper database would be. It certainly shouldn't lose data in the course of normal operation.

How do you access context? Is it using flow.set(...) in a Function node or Change node etc?

Have you checked the contents of the context files on disk to see what they contain? How big are the files?

I am storing and accessing global context in Function nodes (global.set()/global.get()).
In settings.js I just enabled these lines:

contextStorage: {
    default: {
        module: "localfilesystem"
    },
},

Otherwise all is left to default.

As for the file sizes: global.json is 536,869,701 bytes, so about 536 MB.

I am also seeing that global.json has a timestamp from over 4 hours ago. There are also dozens of global.jsonTIMESTAMP.tmp files with 0 size, dated September 11th.

Are you sure you are only updating it every 5 minutes?

There is a calculation cycle that auto-starts every 5 minutes. I can watch it trigger and see things moving along. During that cycle I write multiple times to the current (first, [0]) array item, updating various objects.
But I forgot to mention that I actually have TWO such big data arrays, with identical length but different objects in them, in global context storage.
And since I just lost 4000-something entries, they both have a length of array[3484] at the moment.

What version of node-red are you using?

2.0.6
Shouldn't global.json have a much more recent timestamp? It feels like Node-RED is not dumping context data to file, no?

What device is NR running on, and how much memory and storage does it have?

Ubuntu Server 20.04.3 LTS running on an i7 with 16GB RAM.
SDA2 is on a 4TB disk and has 100GB allocated. It says 51% full

Yes, it should write it every 30 seconds (if it is changed via global.set()). I wonder if that is too fast for 1/2 GB of data, and it is showing up a bug.

You think the entire global context is dumped every 30 seconds? I always assumed it would just update what has changed.

It writes the full context each time (I believe). Only if it has changed though. It should work with your data but I think maybe using context is not a good solution to your problem anyway.

Thinking about it: it's a JSON file, not a database. And my array "shifts" down every 5 minutes, changing all the contents. Therefore it must rewrite it entirely every time.

It has to write the full file however it is updated; it cannot modify just part of the file.
You can adjust how often it writes; see the Node-RED docs, Working with Context.
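For example, something along these lines in settings.js would keep the default local-filesystem store but write less often (the 300 here is just an illustrative value; flushInterval is in seconds and defaults to 30):

contextStorage: {
    default: {
        module: "localfilesystem",
        config: {
            flushInterval: 300
        }
    }
},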

I'll try that, thanks. Otherwise I will have to look at SQL.

Even if it appears to work with 5 minutes, it may just be because it dramatically reduces the likelihood of hitting the bug. It would be good if you could mock up a flow generating simulated data which demonstrates the problem. Then it could be fixed.

I just found this post:
https://discourse.nodered.org/t/persistent-context-storage/37858/2

by setting

contextStorage: {
    default: {
        module: 'memory'
    },
    disk: {
        module: 'localfilesystem',
        config: { flushInterval: '600' }
    },
},

I can keep it all in memory, but flush to disk at the end of every cycle with:

flow.set("my_object", content, "disk")

That will do the trick
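In other words, during the cycle everything stays in the default in-memory store, and the last node of the cycle pushes the result to the "disk" store once, something like this (just a sketch, the key name is a placeholder):

// Function node at the end of the 5-minute calculation cycle
// the cycle itself reads/writes the fast in-memory copy...
let history = global.get("history");

// ...and only the finished result goes to the "disk" store, once per cycle
global.set("history", history, "disk");
return msg;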

I would not bet on it. I am just testing a mockup with the flow below. Clicking the second Inject creates a small object in context and clicking the first creates a very large one with the same structure. Clicking the second one works fine and the data is written to file 30 seconds later. Clicking the first one builds the object very quickly and it can be seen in the context tab, but when the time comes to flush it, Node-RED hogs the processor for several minutes on my i5 running Ubuntu and then crashes with this error:

8 Nov 16:45:46 - [red] Uncaught Exception:
8 Nov 16:45:46 - [error] RangeError: Invalid string length
    at JSON.stringify (<anonymous>)
    at stringify (/usr/lib/node_modules/node-red/node_modules/json-stringify-safe/stringify.js:5:15)
    at stringify (/usr/lib/node_modules/node-red/node_modules/@node-red/runtime/lib/nodes/context/localfilesystem.js:136:18)
    at /usr/lib/node_modules/node-red/node_modules/@node-red/runtime/lib/nodes/context/localfilesystem.js:214:46
    at Array.forEach (<anonymous>)
    at LocalFileSystem.self._flushPendingWrites (/usr/lib/node_modules/node-red/node_modules/@node-red/runtime/lib/nodes/context/localfilesystem.js:211:24)
    at /usr/lib/node_modules/node-red/node_modules/@node-red/runtime/lib/nodes/context/localfilesystem.js:307:53

The problem is that Node-RED has to create the 0.6 GB JSON string from the array before it can write it to disk, which takes minutes, and before it completes something goes wrong, crashing Node-RED. I will submit an issue.

[{"id":"cecf7c5af47399f3","type":"function","z":"bdd7be38.d3b55","name":"Big Data","func":"let data = global.get(\"data\", \"file\")\nlet bigBuffer = {time: msg.payload, buffer: Buffer.alloc(100000)}\nif (typeof data === \"undefined\") {\n    node.warn(\"undefined\")\n    data = Array(5000).fill(bigBuffer, 0);\n}\nglobal.set(\"data\", data, \"file\")\nreturn msg;","outputs":1,"noerr":0,"initialize":"","finalize":"","libs":[],"x":350,"y":1160,"wires":[["9357c2ae0507e87f"]]},{"id":"48ce64f2ea702719","type":"inject","z":"bdd7be38.d3b55","name":"","props":[{"p":"payload"},{"p":"topic","vt":"str"}],"repeat":"","crontab":"","once":false,"onceDelay":0.1,"topic":"","payload":"","payloadType":"date","x":170,"y":1160,"wires":[["cecf7c5af47399f3"]]},{"id":"9357c2ae0507e87f","type":"debug","z":"bdd7be38.d3b55","name":"","active":true,"tosidebar":true,"console":false,"tostatus":false,"complete":"false","statusVal":"","statusType":"auto","x":560,"y":1160,"wires":[]},{"id":"16adf19296938514","type":"function","z":"bdd7be38.d3b55","name":"Little Data","func":"let data = global.get(\"ldata\", \"file\")\nlet bigBuffer = {time: msg.payload, buffer: Buffer.alloc(2)}\nif (typeof data === \"undefined\") {\n    data = Array(5).fill(bigBuffer, 0);\n}\nglobal.set(\"ldata\", data, \"file\")\nreturn msg;","outputs":1,"noerr":0,"initialize":"","finalize":"","libs":[],"x":320,"y":1300,"wires":[["47767708860e7247"]]},{"id":"17d1722c0d37f72c","type":"inject","z":"bdd7be38.d3b55","name":"","props":[{"p":"payload"},{"p":"topic","vt":"str"}],"repeat":"","crontab":"","once":false,"onceDelay":0.1,"topic":"","payload":"","payloadType":"date","x":160,"y":1300,"wires":[["16adf19296938514"]]},{"id":"47767708860e7247","type":"debug","z":"bdd7be38.d3b55","name":"","active":true,"tosidebar":true,"console":false,"tostatus":false,"complete":"false","statusVal":"","statusType":"auto","x":510,"y":1300,"wires":[]}]

Issue raised: Saving huge array in global context on localfilesystem crashes node red · Issue #3250 · node-red/node-red · GitHub