Reclaiming memory from Nodes

BShashwat · 10 August 2021 04:29

Hi All,

I am facing an issue where the custom nodes written by me (they constantly poll s3 to watch for latest files & also polls database to look for change in file ingestion status after running ETL Jobs on files) are consuming high CPU reaching almost 100%.

container-cpu

I have read on this forum in another thread post that Node-JS runs GC once memory reaches a certain threshold and consumes lot of CPU. So I wanted to reclaim memory from nodes. Though I have cleaned references to all the variables used within custom node but I can still see increase in memory over certain amount of time.

Can anyone here please confirm that below would be the right way to reclaim memory from complete node:

this.on("close", function (removed, done) {
                if (interval !== null) {
                    clearInterval(interval);
                }
                if (removed) {
                    this = null;
                }
                done();
            });

If there are other best ways of doing this, kindly let me know.
Thanks & Regards

Colin · 10 August 2021 14:17

At what point do you see the memory jump up?

Edit: If it is at fixed times (rather than when you deploy for example) what is it that happens at that time?

TotallyInformation · 10 August 2021 14:28

To understand how JavaScript recovers memory, you need to understand context first.

One of the many reasons that JAvaScript has moved from using var to using const and let is that these help you and the JIT understand when a variable is no longer needed by constraining its context more granularly.

In generaly, a variable that uses const or let or is a parameter on a function (though note the caveat below) will be released once the surrounding execution context is completed. In a simple function or an if block, this would be at the end of the function/block.

In your example, things are more complex since your function is actually an event listener. I think that will retain the function on memory unless you removed it from the list. But any variables in the function would be created and destroyed on each run. Though I'm not entirely sure that is complete because the optimiser will also have kicked in when you run node.js so maybe things are a bit more complex.

But the implication of your code is that interval is created in some higher level of context. All that contains is a reference to the running interval function. I think that once you've cleared the interval, that reference is likely set to undefined, that is easy to test of course. The point being that I don't think that trying to set interval to null will recover much if anything.

But trying to set this to null is a really bad idea. In the context of the function, this is the function itself so you are trying to destroy the function from within. Not sure that the result will be consistent but it is probably a "bad thing" TM.

Once JavaScript has finished with variables, GC can later recover the space on the Heap.

The exception about parameters I mention are, of course, because JS passes by REFERENCE and so you do not get a copy unless the content is a base value (string, boolean, number, etc). So recovery of that is minimal given that it will only be a reference. Though a long string might be more impactful.

One thing that I do in uibuilder in the close (and by the way, I use node.close rather than this.close because it is too easy to end up with the wrong this) is that I do remove the input listener.

So in my nodes, I use a function I always call nodeInputHandler, in node.close, I include: node.removeListener('input', nodeInputHandler) to make sure that if the node restarts, I don't end up with two instances of the listener.

I also make sure that I close out and remove any websocket/socket.io instances and any ExpressJS middleware and routes that my node may have added.

Sorry, another update. Do make sure that you have not accidentally used a variable without declaring it. If you do that it will be attached to global and so may not be released. Also, make sure that you always use 'use strict' and use a good code editor such as VScode with ESLINT. JavaScript is not the easiest or most logical of languages and using an editor with good linting is a great way to avoid all many of issues.

BShashwat · 10 August 2021 18:40

@Colin ... Thanks for the reply. The memory doesn't jump in immediately, it takes a while sometime 4-8 hrs, sometimes a day or more ... so my guess is based on the number of nodes polling both s3 & DB ...the time is varying. More nodes means quick jump and less nodes may take more time. So I deployed on my local docker instead of server and tested, the memory is adding up slowly slowly and once it reaches a threshold the container get hanged as NodeJS kicks in the GC, utilizing high CP, thus making node red hang.

knolleary · 10 August 2021 19:59

At the moment, the only graph you've shared is of CPU usage and you've assumed that is caused by memory usage. Have you got an equivalent graph of memory usage to show that really is the issue at hand?

BShashwat · 10 August 2021 20:06

@TotallyInformation ...I agree with your post and I am already using best practices laid down for JS coding though I have not worked with NodeJs ever before, but all programming languages have same basics so syntactical difference doesn't matter. Just listing them down so that anyone else reading here can take benefit.

using const mostly where assignment will not happen again
I am not setting this to null, it was a case which I wanted to test and it didn't work as you mentioned
Already using 'strict' as mentioned
Not using any global variable, at most at flow level or node level only that too single value only. No objects with references.
I am using ESLint and not using any undeclared variables. Linter shows error for such.
Using VS code
Converted $.ajax .success() to deferred .done() as specified by JQuery website
Cleared variables which are within code, re-run by setinterval() as they will keep adding on after each interval
I have just refactored all my nodes and earlier each node was making its own JQuery Ajax call to interact with API ... for which I have created a separate module (generic) and though I wanted to implement dependency injection similar to Angualr JS but most of the posts online advised that NodeJs already allows to import module so I didn't pursue this further.

Now since I have common module for making Ajax requests a measure of clearing the memory I have wrapped calls in try/finally block the free the variables (though they will go out of scope but not sure after how long will they get collected) as below:

exports.makeAjaxPostCallToAPI = function (endpoint, data, token) {
    try {
        console.debug('API complete URL is ' + process.env.API_BASE_URL + endpoint);
        return $.ajax({
            type: 'POST',
            url: process.env.API_BASE_URL + endpoint,
            data: JSON.stringify(data),
            contentType: "application/json",
            headers: {
                'Content-Type': 'application/json;charset=utf-8',
                "Authorization": "Bearer " + token
            },
            dataType: 'json',
            async: true,
            cache: false,
            timeout: 30000,
        });
    } finally {
        endpoint = null;
        data = null;
        token = null;
    }
}

Also I don't think Javascript passes by reference. It passes by value, just that if any objects are getting passed as parameters they retain reference to values i.e. new copy will refer to old values (Read here for more: Is JavaScript a pass-by-reference or pass-by-value language? - Stack Overflow)

So based on that I have wrapped all my function calls within try/finally and cleared the parameters as they are copy of original (though not a full deep copy). As for original I am clearing them when node closes.

After doing so much memory was still increasing ...so I checked the below links on memory management:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Memory_Management https://blog.sessionstack.com/how-javascript-works-memory-management-how-to-handle-4-common-memory-leaks-3f28b94cfbec

Based on the circular reference they have explained I believe

CustomCodeNode.prototype.addRequestToQueue = function (node, msg, nodeSend) {
        // const node = this;
        console.info("Custom-code Add Request To Queue function start");

Assigning node = this in every function might b a cause of memory leak as this can cause circular reference and not allow GC to collect and this was in almost all functions (copied it from Node-Red docs). So as an experiment I commented it and instead passed node as a parameter (this will get collected as soon as function goes out of scope) or we can manually make it null as it does not affect the original this object.

After making this change I have deployed the nodes and created 50 flows with 4-5 nodes in each flow. I bombarded the API's from Node-Red with 100's of requests/response going in/out per second and it seems this is working. It has been 8 hours and container is still keeping up with max. CPU going up to 70% and Memory utilization 4GB/16GB ...though I monitor it for couple of days.

Kindly let me know if u have more suggestions to enhance the performance. I will implement the removeListener part which I forgot about. Thanks for sharing that piece.

Here is the chart for past 6 hrs of usage:

You post was really helpful. Kindly share your thoughts so that I can learn and enhance more.

Thanks & Regards,
Shashwat

BShashwat · 10 August 2021 20:23

@knolleary ... I was not having access to the graphs. That graph was shared by dev-ops team. I have taken access today and after today's refactoring I don't see very high memory usage ....here is the graph for memory usage from today.

Can u plz help me understand what could cause CPU surge apart from memory. There were earlier 2-3 flows only with very few nodes and only s3 watcher was polling. Yet CPU was spiking every 4 hours. There were no errors in logs and container use to go down giving unhealthy status. Not sure of any other reason that would do so....and the most amazing thing was even when container was unhealthy/unreachable ... Node-Red was still running and I could see the logs being written and custom nodes calling Rest API's.

Not sure why was this happening, most of the sites mentioned memory usage only to cause surge and all other posts in this forum does not answer clearly about the issue, So it was kinda guess as I could see on my local docker memory increasing every few minutes and not reducing to levels where it started.

Kindly help me out on this,

Thanks & Regards
SB

Colin · 10 August 2021 20:55

I think it more likely that you are doing more work that you think in your flows. You mentioned that you query a database. How often do you do that, how many rows does your query return, and what processing are you running on the data?

TotallyInformation · 10 August 2021 21:07

An interesting view. from that Q:

It's always pass by value, but for objects the value of the variable is a reference

Most things in JS are objects and this explains it much better than I did but is saying much the same.

The author means that by passing a reference, you're passing a reference value (another way to think of it is passing the value of the memory address

OK, so some semantics there and I kind of get it but I'm not convinced that it matters at least most of the time, the effect is pretty much the same.

This is the reason Node-RED works at all of course - the so-called msg is just an object variable and so is (mostly) passed by reference (reference value) from node-to-node. Only where you have 2 or more wires off an output do you get a copy. If this weren't the case, node-red would be a nightmare and would rapidly bloat itself into oblivion.

OK, now I'm a little confused. Perhaps you could share a bit more of your code where you doing this so that we can see the diff in the code?

In my nodes, I implement a function called nodeInstance

// ... module-level setup ...
module.exports = function( RED ) {
    // ... module-level code ...
    function nodeInstance(config) {
        // ... instance-level code ...
        // - fn defined here so that it inherits `RED`. I don't know of another way to do this.

        // Create the node
        RED.nodes.createNode(this, config)
        ...
        // Take a reference to `this` in case the context changes
        const node = this
        ...
        // apply the properties from the config to the node (this)
        node.name            = config.name  || ''
        node.topic           = config.topic || ''
        node.url             = config.url   || 'uibuilder'
        ...
        // define the handler function for incoming messages
        // - defined here so it inherits both `RED` and `node`, I don't know a better way to do that
        function nodeInputHandler(msg, send, done) { ... }
        // Register a handler for incoming messages
        node.on('input', nodeInputHandler)
        // Register the handler for node-red closing or the instance being (re)deployed
        node.on('close', function(removed, done) { ... }
    }

    // register the node
    RED.nodes.registerType(moduleName, nodeInstance, { ... })
    ...
    // ... Express API code ...
}
//EOF

You seem to be suggesting that const node = this has a real memory impact? I can't say I've seen this myself and I use the same approach in other places as well such as in the oneditprepare function in the node's html file.

I'm certainly not an expert JavaScript programmer by any stretch though so I may have this all wrong. I would certainly be interested if there is a better/cleaner way to do things.

BShashwat · 11 August 2021 04:45

@TotallyInformation .... posting one of my node code here ... I will delete it later once u have seen this.

The code is pretty simple it authenticates user, reads metadata from DB based on custom code to run (setup from UI so that metadata has info about which connection to use for warehouse, which schema etc.) and then places a run request to a queue in DB, from where ETL will pick the request and run specific stored proc.
Similar way I have written for File ingestion from S3, DQ on files, Running custom python code, moving data from staging layer to another layers etc.

I have followed the structure as explained in Node-Red docs (as I m new to this) and also taken structure from node-red repo s3 node (written by @knolleary) since he is the founder and would have complete knowledge so just followed his lead.
https://nodered.org/docs/creating-nodes/first-node
https://github.com/node-red/node-red-web-nodes/blob/master/aws/aws.js

So if u see the git hub code it uses var node=this in list & filter functions same way I used const .... but then faced this issue. Have now commented that particular line and instead passed node as parameters and resources are getting released.
@Colin - The code is checking for status update by ETL every 7 seconds and usually it works within seconds and only single value is returned with job status (no rows returned from DB). Once status is success it triggers next node attached or if it fails then flow just terminates logging the node execution time and error.

So only thing I could conclude was that Node-Red is not releasing resources as they are being referred by. Imagine 10-15 nodes in one flow and lot of such flows each polling in parallel ...Finally had to manually set all local and node level variables to null, to clean up and after putting heavy load since refactoring yesterday ....and running 50 flows in parallel through inject node....the CPU has managed to be in control.

SO not very sure but I still think var node=this might b creating multiple memory references from each of the flows for each of the nodes and not getting cleaned up as system is constantly in use. Probably more knowledgeable members here can help me figure out root cause. My issue is solved for now ....but I m still not sure of the root cause.

Here is a screenshot of node-red-container which have I have set running since yesterday 5.50PM IST with high load and looks like it's releasing resources now.Not gone down once.

Though ETL container has spiked to 86% overnight due to heavy load from Node-Red....probably will need to look into that next as that's in Python Flask.

Everyone here, kindly let me know your thoughts about this Node-Red issue and help me gain some insight.

Best Regards,
Shashwat

Colin · 11 August 2021 07:07

Please identify which line is which on the chart.

BShashwat · 11 August 2021 09:15

@Colin ... Chart has legend which displays the following:

Creamish color line is Node-Red container
Purplish blue line is ETL container
Green color line are REST API container
Yellow colored line was previous Node-Red container which was fully spiked and got killed when I deployed new Node-Red container with refactored changes.

Colin · 11 August 2021 10:00

So what problem in that chart are you trying to address?

BShashwat · 11 August 2021 10:12

Resource utilization was high for Node-Red even with just 2-3 flows and 5-6 nodes only, not even running in parallel ...container was getting unhealthy with nearly 100% CPU. Though Node-Red was running in background writing logs and calling API's but container was unreachable. So just trying to figure out what would have caused the container to spike and go down without much load, what would be the right way for resource & memory management and what best practices can I apply to make custom nodes run flawlessly

Colin · 11 August 2021 10:18

The node-red line is not going up, it is the ETL line that is going up. Do you understand why that is happening? Also notice that the ETL container is not doing anything at all at the start, when node-red is maxed out. That quite possibly means that node-red is stuck in a loop somewhere. Have you checked the node-red log to see if it says anything useful? It might be helpful to show us the startup log from node-red.

BShashwat · 11 August 2021 10:32

@Colin ... yeah the latest chart you are seeing is refactored version of nodes which I have done yesterday and now its doing well. However before that Node-Red was maxing out as shown in the very first chart. Nodes do run loops to ascertain status of jobs and look for files in s3....however that is not much load. Hardly an API call every 7 second that loops completes if status is complete/failed/partial. ETL is not my concern, I have asked the ETL team to look into that. Logs doesn't show any failures/errors...all good in logs.

TotallyInformation · 12 August 2021 01:17

OK, so here is a different way of structuring a node's js file - it removes the need to use const node=this completely. Not massively tested but it seems to work OK.

function Test2(RED) {
    const nodeInstance = instanceNode(RED)

    RED.nodes.registerType('jktest2', nodeInstance)
}

function instanceNode(RED) {
    return function nodeInstance(config) {
        console.log('>>>====>>> [jktest2:nodeInstance] Startng')
        RED.nodes.createNode(this, config)

        this.name = config.name
        this.topic = config.topic

        this.on('input', inputHandler(this,RED))
        this.on('close', function(removed, done) { 
            //
        })

    }
}

function inputHandler(node,RED){
    return function nodeInputHandler(msg, send, done) {
        console.log('>>>====>>> [jktest2:nodeInputHandler] Startng')
        send(msg)
        done()
    }
}

module.exports = Test2

//EOF

It also has the advantage that all functions are named which probably makes debugging easier.

What do people think? Is this better than the usual structure (that was in my earlier post)?

BShashwat · 12 August 2021 07:23

@TotallyInformation ... this certainly looks better, cleaner and testable . Many thanks for taking this up and coming back with an enhanced version. Probably people in the community can add more feedback.
Will try to use this one in new node and give a proper feedback in some time as I also need to check with Mocha tests

TotallyInformation · 12 August 2021 14:27

Yes, I'm not the worlds greatest JavaScript dev (in fact certainly down towards the bottom of the pile) since I am not a professional developer and haven't been for many years. So it would be excellent if other people could look and critique.

I would be especially interested to get views from @knolleary and @dceejay. Does this approach really give any benefits? With the exception of the slight added complexity of having 2 functions that return functions (a reasonably common JS construct but a little hard to get your head around), this does seem easier to comprehend to me.

I did actually wonder whether it would be worth trying to go further and see if a node could be wrapped as a JS class. But my head hurt after working this out so I haven't tried that yet.

With my limited experience and knowledge, I can't help feeling that I might have missed a colossal "gotya" somewhere. But as I say, it seems to work and certainly seems to me at least to separate out the component parts of a node which strikes me as being easier to understand.

kevinGodell · 13 August 2021 12:58

I was actually going to suggest that, but didn't want to sound like a preachy javascript nut. I find classes to be a nice way to structure and organize my code. The benefit of adding methods to the class is that they get added to the object's prototype, which is supposed to save memory if you have many instances. Also, I heavily utilize class(static) properties to share info between instances if needed.

Also, when debugging memory issues in the past, I took advantage of the garbage collection feature of nodejs by passing --expose-gc which enables you to directly run garbage collection. I had an issue when passing buffers between nodejs js and c++ code and it seemed like my memory was climbing and not being collected in a very timely manner. Running the garbage collection at an interval proved to me that my code was clean because the memory was released. javascript - How to request the Garbage Collector in node.js to run? - Stack Overflow

Topic		Replies	Views
How Node-red's memory recycling mechanism works General	39	386	16 March 2024
Memory usage increases quickly General	8	377	8 December 2022
A question about freeing resources in on('close') Developing Nodes	9	492	1 September 2020
Custom Node: reoccurring processing, context storage, Developing Nodes	9	829	3 January 2021
Global variables, wich problems? General	13	164	1 January 2024

Reclaiming memory from Nodes

Related Topics