Does RED.util.cloneMessage handle buffers properly or is this a bug?

I was playing with node-red v4 recently and trying to break it. Noticed that If i alter the buffer in a cloned message it will affect the buffer from the original message. Is that supposed to happen?

If not, then the fault is that of lodash.clonedeep. There is some unreachable code in that lib that causes the buffer to be sliced instead of copied. As we know, buffer.slice will create a new buffer, but it is using the same memory location as the original buffer.

As an alternative, I was testing structuredClone as a possible replacement for clonedeep. It seems promising, but much slower. Also, it changes a Buffer to a Uint8Array. So, not very usable yet.

The buffer cloning issue is also present in v3. Probably goes back to 2019 when a merge was made to add lodash.clonedeep to fix some other msg cloning issue.

[{"id":"98e46ea07725d735","type":"function","z":"9221bfc6b556d2ee","name":"clone buffer test","func":"// output orginal msg payload\nnode.warn(msg.payload.toString());\n\n// make clone of msg\nconst clone = RED.util.cloneMessage(msg);\n\n// increment buffer at position 10\nclone.payload[10]++;\n\n// output original msg payload\nnode.warn(msg.payload.toString());\n\n// are buffers equal?\nnode.warn(`buffers equal: ${msg.payload === clone.payload ? 'yes' : 'no'}`);\n\n// are buffers sources equal?\nnode.warn(`buffers source equal: ${msg.payload.buffer === clone.payload.buffer ? 'yes' : 'no'}`);\n\nreturn msg;","outputs":1,"timeout":0,"noerr":0,"initialize":"","finalize":"","libs":[],"x":520,"y":440,"wires":[[]]},{"id":"c88f7adaffe6e287","type":"inject","z":"9221bfc6b556d2ee","name":"","props":[{"p":"payload"}],"repeat":"","crontab":"","once":false,"onceDelay":0.1,"topic":"","payload":"[105,115,32,116,104,105,115,32,97,32,97,117,103]","payloadType":"bin","x":330,"y":440,"wires":[["98e46ea07725d735"]]}]

Forgive me if this is expected behavior. Perhaps I missed the docs somewhere.

Not an intentional behaviour. From your analysis of the issue, do you know if this is a known issue with lodash.clonedeep that is logged somewhere, or have you done the detective work to get to that conclusion?

1 Like

it seems that somebody tried to fix this a few years back and it was ignored for a couple of years and then finally closed without further comment.

@kevinGodell,
Unfortunately this proposal has never been finalized, otherwise you could 'perhaps' have used it to 'workaround' the lodash issue in Node-RED...

1 Like

@BartButenaers That sounds interesting. Based on what I have seen with node-red source code, it seems like that would be easier to implement system-wide vs per node. And then it could be configured in settings.js for the advanced users that need that level of customization.

As the example of configuration choice, a user could decide if they want a buffer in a cloned msg to be sliced or copied based on their performance and/or data integrity requirements.

Yes it was a very interesting feature. If I remember correctly, @massi-ang had already done quite a lot of testing but I don't think he ever created a PR with his implementation

He needed it to be able to pass Tersorflow tensor objects between nodes. But you could e.g. pass also OpenCV image objects between nodes, and so on...

My brain is a bit foggy, and I am computerless this week... I don't know anymore if his implementation involved using the Node-RED plugin mechanism. Because it seems to me that it would be best to allow developers to create hooks, to allow them to introduce cloning of specific types of objects.

But of course performance is crucial. This feature will be called multiple times for every message, for every node that has multiple output wires...

1 Like

BTW I have been talking about plugins above, but in fact I mean a message hook.

1 Like

I did some further testing using lodash.clonedeepwith and it went pretty good. It seems that it is very similar to lodash.clonedeep. If you choose to use it and do not pass it a customizer function, then it will return the exact same results of lodash.clonedeep. But if you do pass a customizer, then you can have good control over how cloning is performed.

benefits of changing from lodash.clonedeep to lodash.clonedeepwith:

  • drop in replacement, seems to have backwards compatible output if no customizer is used
  • faster than structuredClone without customizer
  • and still faster than structuredClone with customizer correcting the buffer cloning bug
  • ability to fine tune cloning for other object types

As an example, if a user truly needed to ensure that a buffer was cloned and was using a new memory location, then a simple customizer function could be passed in:

const customizer = (value) => {
    if (Buffer.isBuffer(value)) {
        // use same memory
        // return value.slice();

        // use same memory
        // return value.subarray();

        // new memory allocation
        return Buffer.from(value);

        // new memory allocation
        // const newBuf = Buffer.allocUnsafe(value.length);
        // value.copy(newBuf);
        // return newBuf;
    }

    // return undefined for all other values to indicate to lodash to do its normal cloning on that value
    return undefined;
}

and then call the cloner with the customizer function

cloneDeepWith(msg, customizer);

And then a more complicated example if you must consider your resources when handling large buffers and dont want a true clone causing new memory allocation:

const customizer = (value) => {
    // dont allocate new memory if buffer is huge!!! at this point we care less about data safety
    if (Buffer.isBuffer(value) && value.length > Buffer.poolSize) {
        return value.subarray();
    }
    // always return undefined
    return undefined;
}

edit: further thoughts, I think core could also add its own types not to clone, such as it does with the req and res properties of a msg, but instead you could check for types or instances of request/response. Perhaps this old todo can be solved node-red/packages/node_modules/@node-red/util/lib/util.js at master Ā· node-red/node-red Ā· GitHub

Ah that is good news. I was not sure anymore whether you would be able to clone some basic data types (number, string, boolean, Buffer, ...).

From this I assume your preferred way of implementing this is via @massi-ang his second proposal "add cloner via specific configuration" via the settings.js file? I think you are correct in that case, but - if I am not mistaken - the settings.js file is not automatically updated when a new Node-RED version is installed, so existing installations won't have the fix. Which might be confusing when people register a req/resp related fix. Because yes, I know that the req/resp fields sometimes give us some headache like recently in dashboard D2 (but fixed now).

I realize that manually editing the settings.js file - to add customizer functions - is the most powerfull and safe way to do things. But it is not the most user friendly way of doing things. Proposoal 2 from Angelino solved that, because it was up to the node developer to register the function automatically.

1 Like

BTW thanks a lot for taking the time to investigate this feature!!!!

1 Like

No thanks needed. I just wish I had functional devices so that I could do more. Old mac is trying to die and has become barely usable. Just inherited my wife's old pc and got ubuntu running on it and its pretty nice. And the other good new is that i was notified today that the framework 16 laptop that i pre-ordered last year in july will finally ship soon. So, maybe I can get dev'ing again.

The todo was that little note in the code about coming up with a better way to handle the non-cloning of req and res. This customizer would handle that. In the customizer, simply returning the value is the direct way to not clone it.

const customizer = (value) => {
    if (value && ['IncomingMessage', 'ServerResponse'].includes(value?.constructor?.name)) {
        return value;
    }
    return undefined;
}

Yes. I noticed in that old conversation that nick had concerns of end users breaking their messaging by being allowed to affect the cloning. That's why I tended to think that it would be best in the settings.js so that it could not easily be changed. After further thought, that would not be practical in the situation you mention because the customizer would be global and called against all messages possibly causing a performance hit. But, I will add that if a node could have a customizer configured, then it would have to exist for the entire flow since the message will get passed to an unknown number of receiving nodes. Definitely would have to be set on the flow and not just the node?

The main takeaway is that it is not really difficult to maintain backwards compatibility when using customized cloning with lodash.clonedeepwith. If the feature was made available to nodes, then the http nodes could utilize the same customizable lodash.clonedeepwith to handle their own cloning to solve that todo issue. @BartButenaers good luck with your pull request. Sounds complicated. :rofl:

Other thoughts: fork-flow-node a core node that is meant directly for when you will fork a flow causing a cloning situation and it could have special access to the alternate customizable cloning. The end user would have to be in a situation and have the knowledge when to use it. That would make it easier for developers of nodes to not have to implement config nodes.

p.s.

I just verified in httpNodeMiddleware that the res can be handled properly with the correct customizer

  httpNodeMiddleware: [
        (req, res, next) => {
            const resClone = cloneDeepWith(res, customizer);
            return resClone.status(404).end('dfddffdfd');
            //next();
        },

Without the customizer, i get a 500 error because the cloned response cannot be used.

Well I am not really a fan of putting stuff into the settings.js file, because it is not really user friendly. But of course if you would go for proposal 1, then you can only specify for which prototype you want custom cloning. So you can't specify condtions like in your last example (i.e. value && ['IncomingMessage', 'ServerResponse'].includes(value?.constructor?.name)).

On the other hand... If you require users to put it in the settings.js file, users migth get confused:

  • Somebody installs node X, and sees on the readme page that he needs to add a custom cloner in his settings.js file (e.g. to clone OpenCV image instances).
  • Afterwards he shares his flow here on discourse.
  • Somebody imports his flow, but this user is not aware about the custom cloning required.
  • His flow will not work like expected, and a lot of digging might be required to find the root cause.

I don't know by head if you can show a red triangle on your node in the flow, by adding a validator that checks whether some custom cloner is avaialble. Don't think that is possible.

Yes that way you would have minimal impact on performance. But I don't think it would be easy to determine the tree of nodes a message can pass, starting from some node (via link nodes, subflows, ...). I think that would be way to complex.

And we also don't have to become paranoic. Currently the lodash library also does type checking for standard cloning. And it only happens for nodes that have more than 1 output wire. If the custom cloning functions are implemented with performance in mind, I assume you won't have a visible impact.