Mongodb4 hangs randomly

Hi

Recently I've detected some random problems when trying to connect to mongo using the node-red-contrib-mongodb4 plugin. Attached you have a simple flow that includes some basic stuff to get the collections from a database, try to find a specific one (defined in the init node) and do something depending on whether it exists or not. If it does, show its contents and if it doesn't, try to create it.

In the past I detected a problem that seemed to be related with the auto-cloning of messages at the output of a node in case this output goes to multiple nodes (out of memory or something like that, don't remember the exact error, but I'd same a deep clone of the response of a collections request was too much for some reason). As a result, as you can see in Example 3, if only one node consumes the output (in my case it's a function node), everything works fine, as long as I create a new message for the output of the function to prevent here again the deep clone.

However, if I test this in Example 1, you can see the output of Get collections 1 connected to a debug and a function node and works fine :thinking: Magic?

So, after going back to the initial "solution" (no debug node, just let the message go to the function node), I decided to add a Complete node to see whether the problem appeared when trying to get the collections, when trying to get the elements of the selected collection or when trying to create it.

Results:

  • With the Complete node enabled, I see a "RangeError: Maximum call stack size exceeded" message generated by Get collections 1, but it still works fine. Or at least it works fine in this flow, because in my main flow (exact same set of nodes for this part), the execution sometimes get frozen there and I need to restart nodered..
  • In my main flow, if the flow can continue after Get collections 1, since the target collection doesn't exist, it gets frozen in Create collection 1.

I mean.. what is this? The database is working fine (it works like a charm in this example flow which does the exact same things), there's no connectivity problem (otherwise it'd never work)..

Just in case it's relevant, I'm using node-red 4.0.2 on top of node.js 20.16.0 and mongo 4.2.

Regards

flows.json (21.1 KB)

Ok, problem found (it had nothing to do with the library getting frozen at all).

What I don't understand is the policy used by nodered to execute the code, because the problem was like 6 nodes ahead of the mongo ones and still I couldn't see any debug messages at all before the point, although they were generated in normal conditions. I had to modify the library to enable some low-level console.log messages to confirm what was going on..

So, is there any way to force these messages to be visible in the debug tab as soon as they're generated? Even if this is not the normal behavior, it would be great to have such a feature for debugging purposes.

Regards

This can happen if the browser is hung up doing something (so it is not able to display the debug messages), or if the browser is running on the same machine running node red, then the machine itself is so busy that the browser does not get a look in. Probably there are other possible causes. Maybe the mongodb node sends its normal message (which causes problems further down the flow) but does not signal Complete until a little later.

The browser is indeed running in the same machine as nodered, it's not frozen (at least I can navigate perfectly in other tabs) and neither is the machine itself ofc. Since the problem was in an infinite loop (I forgot a break) and node.js is single-threaded, I'd understand it in case the problematic node was close to those which generate debug messages, so my guess is that there must be some sort of best-effort queuing mechanism for them, otherwise I can't see why it takes so much time to output this data in the debug tab.

Regards

Show us a screenshot of the nodes from the mongo node up the the one that had the infinite loop, including the debug node(s) whose output did not appear.

Attached you have a very simplified flow showing this. In this case, 2 nodes generate debug messages (Find collection multiple messages and Process elements one) until the flow reaches Read variable, where I've forced an infinite loop. If you just comment the while in this last node, you can see all the debug, but if you don't, no message is displayed, and I can tell the code reaches this point after having debugged the js source files of the mongo plugin.

With the infinite loop active, you can still open/modify nodes but obviously you'll get stuck since you can't deploy and all your changes will be lost. With the top command, I can see the node-red process using 100% CPU (luckily we have multiple cores nowadays :grin:). I'm taking for granted the collection exists in mongo btw.

Regards

flows.json (10.5 KB)

Can you add debug nodes to the outputs of the mongo nodes, set to Output Complete Message, and capture the output and paste it here so I can simulate the flow. This canned text explains how to copy output from the debug window in a way that can be copied and used in a flow:

There’s a great page in the docs (Working with messages : Node-RED) that will explain how to use the debug panel to find the right path/value for any data item.

Pay particular attention to the part about the buttons that appear under your mouse pointer when you over hover a debug message property in the sidebar.

BX00Cy7yHi


Also, in future, if posting a flow, code or data, rather than attaching a file, please use the method explained in this canned text:

In order to make code readable and usable it is necessary to surround your code with three backticks (also known as a left quote or backquote ```)

``` 
   code goes here 
```

See this post for more details - How to share code or flow json

In order to show the output of the Get collections node, I've had to remove the link to the next function node, otherwise I get a RangeError: Maximum call stack size exceeded error. The output is this:
imagen

As you can see, the path to reach the names of the collections is:
msg.payload[x].s.namespace.collection

And, as can be seen in the Find collection node, what I do is a simple validation of the format of the response:

var msg2 = {};
var extraDebugEnabled = true;

/* Checks whether the collection of this flow exists in the database
 */
function collectionExists(targetCollection) {
    while(true) {
        if (typeof msg != 'object') {
            break;
        }
        
        if (extraDebugEnabled) {
            msg2.debug = "P2.1 / hasPayload " + msg.hasOwnProperty('payload') +
                " isArrayObject " + (Object.prototype.toString.call(msg.payload) == '[object Array]');
            
            node.send([null, null, msg2]);
        }

        if (!msg.hasOwnProperty('payload') || (Object.prototype.toString.call(msg.payload) != '[object Array]')) {
            break;
        }
        
        if (extraDebugEnabled) {
            msg2.debug = "P2.2 / length " + msg.payload.length;
            
            node.send([null, null, msg2]);
        }

        for(let i = 0; i < msg.payload.length; i++) {
            let collection = msg.payload[i];
            
            if (extraDebugEnabled) {
                msg2.debug = "P2.3.1 / i " + i;
                
                node.send([null, null, msg2]);
            }

            if (typeof collection != 'object') {
                continue;
            }
            
            if (extraDebugEnabled) {
                msg2.debug = "P2.3.2 / i " + i;
                
                node.send([null, null, msg2]);
            }

            if (!collection.hasOwnProperty('s') || (typeof collection.s != 'object')) {
                continue;
            }
            
            if (extraDebugEnabled) {
                msg2.debug = "P2.3.3 / i " + i;
                
                node.send([null, null, msg2]);
            }

            if (!collection.s.hasOwnProperty('namespace') || (typeof collection.s.namespace != 'object')) {
                continue;
            }
            
            if (extraDebugEnabled) {
                msg2.debug = "P2.3.4 / i " + i;
                
                node.send([null, null, msg2]);
            }

            if (!collection.s.namespace.hasOwnProperty('collection') || (typeof collection.s.namespace.collection != 'string')) {
                continue;
            }
            
            if (extraDebugEnabled) {
                msg2.debug = "P2.3.5 / i " + i + " nameInternal " + collection.s.namespace.collection;
                
                node.send([null, null, msg2]);
            }

            if (collection.s.namespace.collection == targetCollection) return true;
        }

        break;
    }

    return false;
}

if (extraDebugEnabled) {
    msg2.debug = "P0";
    
    node.send([null, null, msg2]);
}

var targetCollection = flow.get("mongoCollectionName");

if (extraDebugEnabled) {
    msg2.debug = "P1 / targetCollection " + targetCollection;
    
    node.send([null, null, msg2]);
}

if (collectionExists(targetCollection)) {
    msg = {};
    
    msg.payload = {};

    msg.collection = targetCollection;

    return [msg, null, null];
}
else {
    msg = {};

    msg.payload = targetCollection;
    
    return [null, msg, null];
}

Regarding the output of the Get elements node, it looks like this:

imagen

Includes the name of the collection (which I simply ignore) and the payload is an array of objects (which is empty in this case). In my example, the Process elements node is doing nothing but clean the message, so that's not relevant.

And just to have a clear picture of the flow:

Regards

I don't understand what you mean. Are you saying that having the function node linked changes what you see in the debug node showing what is coming out of the first mongo node?

Also, in order that full message is posted here, you must use the Copy Value button in the debug pane as described in the previous post. Then I can put that into an inject or function node to simulate the mongo node.

Or do you just mean you had to remove the infinite loop, in which case yes, obviously.

Ok, the full output of the Get collections node is here:
output.json (59.3 KB)

I have tried to insert it directly in the post but the maximum limit is exceeded.

This is a quite large payload (around 60 KB), and that's why if I run the flow connecting the output of this node to both the debug and the function node, I get this RangeError: Maximum call stack size exceeded message. If you pretty-print this output, you'll see around 3700 lines of nested stuff, and it looks like cloning such a json is just too much (it works fine as long as it doesn't need cloning, if you only have either the debug or the function node connected to the output).

And just to clarify, all the tests I've done today (to produce this debug) have been with the infinite loop disabled, otherwise I can't see anything.

Regards

60k is nothing to worry about. What hardware/os are you running on?
Which node is giving the call stack error? Show us a screenshot with the message. I suspect this is a big clue to what is going on.
Where is your mongo database?

If you disconnect the wire to the second mongo node do you still get the stack error? If so then does this give the same problem for you? It doesn't for me.

So replace your mongo node with a Function node containing

msg = /* replace this with the full text you posted here from the debug node */
return msg

Ok, so here we have something interesting based on the tests you've asked me to do:

  1. My hardware: 6-core amd64 CPU with 16 GB of RAM running Ubuntu 18.04 (I'll upgrade to 24.04 next week :sweat_smile:).
  2. Both mongo and nodered are running as docker containers in the same machine (where I have my browser as well).
  3. This is a screenshot showing where the stack error is being generated:
  4. If I disconnect the link to the second mongo node, the stack error is still there.
  5. If I replace the Get collections node and instead use a function node with the msg including the contents generated by Get collections as you said, I don't get the stack error.

Based on points 3 and 5, it looks like the cloning is done in a different way depending on whether it's a mongo or a function node (I don't know if this makes any sense TBH).

Regards

OK, so there is a strange issue with mongo node when two outputs are connected. Which is odd.

Leaving that aside for the moment, if you connect the second mongo node and your infinite loop function, but still using the function node to pass in the data, do your debug messages appear?

With this, again nodered freezes and I don't see debug messages.

Regards

Can you repeat the exercise to get the debug output from the second mongo node please and post it here. With the loop removed.

Output of Get elements (from DEBUG Get elements):

{"payload":[],"collection":"collection1","_msgid":"658db52f9e166aa7"}

And since it's only cleaning the message, the output of Process elements (from DEBUG Process elements) is:

{"_msgid":"658db52f9e166aa7"}

I mean, everything as expected.

Regards

You are correct. Having replaced the second mongo node with a simulation none of the debug outputs show if the loop node is connected.

Hopefully someone who knows more about the inner workings of node-red will see this and tell us what is going on.
Here is the failing flow using only core nodes. I can't post it inline as it is too big. The flow is not large, it is the data content.

failingflow.json (73.7 KB)

Do you think I should unmark the Solution of this thread so that someone actually pays attention to it? In the end there are still 2 open questions:

  • The stack problem when using the mongo node producing a too big output that needs to be cloned. People can go crazy not knowing what is going on.
  • The debug problem, with the messages taking too much time to be shown in the debug tab. Under certain circumstances, it can take you a while to discover the bug because of this (as has happened to me).

Regards