Nodes That Erase Messages

Hi, I've been using Node-RED for about 1 week, and I like it a lot. I do have what might be a newbie question though. I've searched the forum and haven't found a topic that addresses this issue, but maybe I'm missing something.

I'm using a few contributed nodes, specifically node-red-contrib-minio-all, node-red-contrib-bool-gate, node-red-contrib-csvtojson, and node-red-contrib-mongodb4. All of these nodes have output that erases the incoming message. This makes it difficult to resume post-processing work. I've also created a subflow, but these nodes erase _linkSource from the message, forcing me to save and restore the _linkSource so that the flow is able to return to the caller. They also overwrite the payload.

My use case is that, in response to a new file upload, I use node-red-contrib-minio-all to pull a CSV file from a Minio S3 bucket, then use node-red-contrib-csvtojson to transform the file to JSON, and then push the JSON file back to a different S3 bucket, which will trigger further processing. I also move the original object to an archive, but since Minio S3 doesn't have a "move" function, I have to copy and delete the object.

I could do these steps in sequence or use a status node to detect when a node completes and then trigger the next node. However, the output of the minio, mongodb, and csvtojson nodes doesn't contain my original payload, so I'm lost.

My approach may be wrong, so I am open to suggestions/correction. I did see that a Flow Context can hold data across nodes (which is what I'm doing with _linkSource). Will that work when many messages may be processed at one time? What about with subflows? Right now I save the _linkSource in the main flow, but restore it in a subflow. Might that cause problems when the subflow is shared? I can't tell from the docs whether it is safe to use a Flow Context in this way, but I'd rather not build it, then have data potentially get overwritten during processing.

Thanks for any useful direction or suggestions.

Hi and welcome to the forum.

It is good to see you are giving it a go.

One thing that may be catching you is that (as I see it) Node-RED is sequential.

You get a message and it may have a useful/important payload. (There are more parts to a message than just the payload, but leave that aside for now.)

That goes into a node. That node then does stuff to the message.
Ideally, everything you want to do with the message should be handled by sequential nodes - if not one node.

The idea is that you have a message and it causes an action (or actions) depending on its contents.

The next node should then get a message that is important to it.
And the process repeats.

If you are needing a message to survive going through a node, then that node may not be the right one to be using.
OR
You put the next node in parallel with the other node so it gets the same message.

Subflows are a world unto themselves.
You can use the same subflow at many points in the flow and they shouldn't be able to cross contaminate one another.
I say shouldn't only because NOTHING IS PERFECT. :wink:
Bugs do happen.
But for the most part: They are individual things in terms of the flow.

It is good to also see you've done your homework/study on Flow Context.
Yes, that can get messy if used incorrectly and for now I would advise not using it.

I know NOTHING of the nodes you mention and so shall shut up here.
Sorry I can't be of more help, but.... I'm only just off the bottom rung of learning.
Good luck.

P.S.

What MAY suit your needs is a stack (queue?)
You get the message.
Send the output with the message to TWO places:
1 does the filtering and determines what/where the message is going/what happens to it.
2 goes to a queue and waits for the decision to be made.
In the meantime, gate nodes are set to route the message to the desired place, and when all is set, a signal is sent to release the queue and the ORIGINAL message is then routed as needed.

Thank you very much for the guidance and insight. You've confirmed some things I suspected, and given me some ideas to try. I suppose the challenge is running messages through a sequence, which implies that the message should survive the trip through various nodes. I will look into using a queue. My many years of programming experience causes me to be suspicious of the Flow Context so I will approach that with caution.

Sorry I didn't say this in the first reply, but I kind of forgot to mention it.

I guess the idea is also that a node does what it needs to with the message, and its output is determined by that node and is meant for the next node.

(Sorry for poor example to follow)

A message arrives at (node1) (value `3`) -->  Is the value greater than `1`?  (y) (node2)  -->  The next node (node3) gets a `y` from node2

Does that help too?

Just to make sure I understand, do you mean that the nodes are not passing on message properties that you have added? Something like msg.myProperty for example.
If you see this on any core nodes then you should report it as an issue. For contrib nodes it is worth raising an issue against them, but it may not be actioned.

There is a fairly simple technique using context which allows for multiple messages coming in rapidly. I am on my phone but will try to look that up later, unless someone else provides it first.

Edit. Corrected above to 'not passing on message properties'

Correct, the non-core nodes do not pass on my custom properties and erase node-red properties like _linkSource. The core nodes work exactly as I expected and pass on all properties, but most of the non-core nodes I've tried do not.

You are on the right track. I'll illustrate below:

I get notification of a file's arrival in an S3 bucket. I verify the S3 event type, content type, and ensure the size is not zero. If all tests pass, then I'll move the file to the "accepted" bucket for further processing. If any tests fail then I move the file to the "rejected" bucket with an error reason.

This flow uses core nodes which pass the updated message through the flow as I expected.

My trouble begins when I use a node that has an output that does not include any of the input message properties. That's when I lose the flow. I'll experiment with some of the queue nodes later today and see what happens.

This should do what you want. It queues messages coming in until it is told to release the next one, so you can use that to protect the flow context around your problematic node. In the flow, replace the Process node with nodes to save your properties to flow context, then do the stuff that loses it, and then restore it from context again.

In the Queue node, which is a delay node in rate limit mode, set the rate to significantly longer than the max time that the process should take. Then if it locks up for some reason the next one in the queue will be released anyway, eventually.

[{"id":"b6630ded2db7d680","type":"inject","z":"bdd7be38.d3b55","name":"","props":[{"p":"payload"},{"p":"topic","vt":"str"}],"repeat":"","crontab":"","once":false,"onceDelay":0.1,"topic":"","payload":"","payloadType":"date","x":120,"y":1700,"wires":[["ed63ee4225312b40"]]},{"id":"ed63ee4225312b40","type":"delay","z":"bdd7be38.d3b55","name":"Queue","pauseType":"rate","timeout":"5","timeoutUnits":"seconds","rate":"1","nbRateUnits":"1","rateUnits":"minute","randomFirst":"1","randomLast":"5","randomUnits":"seconds","drop":false,"allowrate":false,"outputs":1,"x":290,"y":1700,"wires":[["a82c03c3d34f683c","d4d479e614e82a49"]]},{"id":"a82c03c3d34f683c","type":"delay","z":"bdd7be38.d3b55","name":"Process taking 5 seconds","pauseType":"delay","timeout":"5","timeoutUnits":"seconds","rate":"1","nbRateUnits":"1","rateUnits":"second","randomFirst":"1","randomLast":"5","randomUnits":"seconds","drop":false,"allowrate":false,"outputs":1,"x":510,"y":1700,"wires":[["7c6253e5d34769ac","b23cea1074943d4d"]]},{"id":"2128a855234c1016","type":"link in","z":"bdd7be38.d3b55","name":"link in 1","links":["7c6253e5d34769ac"],"x":75,"y":1780,"wires":[["3a9faf0a95b4a9bb"]]},{"id":"7c6253e5d34769ac","type":"link out","z":"bdd7be38.d3b55","name":"link out 1","mode":"link","links":["2128a855234c1016"],"x":645,"y":1780,"wires":[]},{"id":"b23cea1074943d4d","type":"debug","z":"bdd7be38.d3b55","name":"OUT","active":true,"tosidebar":true,"console":false,"tostatus":false,"complete":"payload","targetType":"msg","statusVal":"","statusType":"auto","x":650,"y":1620,"wires":[]},{"id":"d4d479e614e82a49","type":"debug","z":"bdd7be38.d3b55","name":"IN","active":true,"tosidebar":true,"console":false,"tostatus":false,"complete":"payload","targetType":"msg","statusVal":"","statusType":"auto","x":450,"y":1620,"wires":[]},{"id":"3a9faf0a95b4a9bb","type":"function","z":"bdd7be38.d3b55","name":"Flush","func":"return {flush: 1}","outputs":1,"noerr":0,"initialize":"","finalize":"","libs":[],"x":170,"y":1780,"wires":[["ed63ee4225312b40"]]}]
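
For the save and restore steps that replace the Process node, something along these lines might work as a starting point. It is only a sketch: the context key `savedProps` and the property `msg.original` are names I made up, and the small `flow` object at the top is a stand-in so the code runs outside Node-RED (inside a function node, `flow.get` and `flow.set` are already provided). It relies on the Queue node releasing only one message at a time, since a single context slot would be overwritten if two messages were in flight together.

```javascript
// Stand-in for the function node's `flow` context so this sketch runs on its
// own; inside a Node-RED function node, `flow` already exists.
const store = new Map();
const flow = {
    get: (key) => store.get(key),
    set: (key, value) => { store.set(key, value); }
};

// Body of a hypothetical "Save" function node, placed just before the node
// that drops message properties.
function saveProps(msg) {
    flow.set("savedProps", {
        _linkSource: msg._linkSource, // needed for the link out node to return
        original: msg.payload         // the payload the next node will overwrite
    });
    return msg;
}

// Body of a hypothetical "Restore" function node, placed just after it.
function restoreProps(msg) {
    const saved = flow.get("savedProps") || {};
    if (saved._linkSource !== undefined) {
        msg._linkSource = saved._linkSource;
    }
    // Keep the new payload from the problematic node and put the old one
    // alongside it under an assumed property name.
    msg.original = saved.original;
    return msg;
}
```

In a real flow each function body would go into its own function node, with the output of Restore wired to the link out node that feeds Flush as well as to whatever comes next.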

Thank you. I will try this out today and reply.

Thank you, this solution works perfectly!

This is a good workaround, but it doesn't really address the issue that @agile-anthony has identified. Without this extra logic and possible throughput penalty, contributed nodes that do not preserve the _linkSource message property cannot be used in subroutines defined by link call and link out (return) nodes. The documentation on creating nodes says

If the node is sending a message in response to having received one, it should reuse the received message rather than create a new message object. This ensures existing properties on the message are preserved for the rest of the flow.

This needs an added warning that creating a new message will cause subroutines to fail. I also think that issues should be raised against contributed nodes that do this.
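
To illustrate the difference the documentation is describing, here is a minimal sketch of two input handlers. The handler names and the upper-casing behaviour are invented for the example and are not taken from any of the nodes discussed here; in a real node the body would sit inside `node.on("input", function (msg, send, done) { ... })`.

```javascript
// Anti-pattern: a brand new object is sent, so _msgid, _linkSource and any
// custom properties on the incoming message are lost.
function handleInputLossy(msg, send) {
    send({ payload: String(msg.payload).toUpperCase() });
}

// What the documentation asks for: reuse the received message and change only
// the property this node is responsible for.
function handleInputPreserving(msg, send) {
    msg.payload = String(msg.payload).toUpperCase();
    send(msg);
}
```

With the second form, a link out node set to return to the calling link node still finds `_linkSource` on the message.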

BTW, a quick look at the code suggests that node-red-contrib-csvtojson may not actually have this problem and that node-red-contrib-bool-gate should be easy to fix.

Please go ahead and raise issues ( or pull requests ). This is a team sport after all. Thanks.

I'll have a go at the four nodes that @agile-anthony mentions, but he says that "most" of the contributed nodes he has tried have the issue. Any thoughts on how to hunt them down or prioritize fixing them?

Update:

My testing seems to show that node-red-contrib-csvtojson is OK. If anyone has an issue with using the node in a subroutine, please report it.

The node-red-contrib-bool-gate package is not available on GitHub. A fork of it, NodeRed-BoolGate-Extended, has been published to npm but for some reason does not appear in the Flow Library. I'm not sure how to proceed on this one.

An issue was raised over a year ago against node-red-contrib-minio-all and a PR submitted to fix it. I posted a comment on the PR in the hope that something might happen.

Inspecting the code of node-red-contrib-mongodb4 makes me think that it will not kill existing message properties, but I do not have a database to configure it with. I'd be grateful if someone using the node would test it in a subroutine.

So, by contrast, if a node receives a single input message and may output a theoretically infinite number of messages, should it still pass the original message on each of them, only on the first message out, or not at all? And then there is the case where a node may have more than one output: should it repeat the original input message on each of the outputs? Probably not. At the very least, I think we can all agree that a node that has a 1 to 1 input and output should pass the original message.

All good questions, but the obvious answer is that if a node can be made to operate correctly in a subroutine, it probably should. And if it can't, there should be a warning in its documentation.

I think the idea of correctly is incorrect.

OK. Use whatever term you like. Deleting the _linkSource property will cause the link out node (configured to "Return to calling link node") to fail to return and throw a warning. (Not an error, although that seems to be the original design.) The user who deployed the link call node will be sad.

Thanks to everyone for your input. I like Node-RED a lot and your support helps my confidence in the product.

I was aiming for brevity when I reported that "most" contributed nodes that I tried were problematic. As @drmibell noted, node-red-contrib-csvtojson does preserve the message after processing, and node-red-contrib-bool-gate has mild problems. I did a bit more work with node-red-contrib-mongodb4 and found that it does work as expected. It does overwrite msg.payload; I just needed to use a custom property (newbie mistake).

The component that gives me the most trouble is node-red-contrib-minio-all. It returns a new msg._msgid, removes msg._linkSource, and removes my custom properties :angry: It also has 2 outputs, an error output and an OK output, but it always sends on both, which interferes with managing the flow. There are open issues on GitHub for some of these, but I will report those that haven't already been reported.

I posted the flow as a workaround which can, I believe, address the issue by saving required properties in context. In my earlier post I had made the point that issues should be raised against any nodes that do not maintain properties.