Live stream about pluggable message routing: some questions

BartButenaers · 21 July 2020 21:23

Hi folks,

Unfortunately I don't have the opportunity to follow the live streams on monday, so I watched the replay on youtube this evening. The pluggable message routing is one of my favorite items on the backlog, so it was really interesting to watch this one. So much fun to see how the development of such an important feature is being started from scratch...

If you are interested in core development, you should really watch it! It is a great initiative that we get involved with development this way

But I have some questions, so I'm going to post them here. Although I understand that the design is far from complete yet, so perhaps it is a bit early for some of those...

Looking at the code, it seems like there is 1 router stack per flow. Is that correct?

As a result I assume that all wires in a flow share the same router stack? But then I don't really understand how we can send something to a remote flow, when "all" wires have a RemoteRouter (while the nodes in that flow run locally)? I don't see the picture yet...
In the live-stream some time ago for the JS Foundation, there was an interesting thought that groups (= new feature from the 1.1 release) could be run in the future on a remote machine. Is that somehow related to the RemoteRouter also, or not? I mean, does a group also have a router stack?
Do PreMessageLogger and PostMessageLogger have different functionality, or is it also possible to add the same logger on multiple locations in the router stack?
Currently - when an output is connected to N wires - the message will be cloned N-1 times. Is the cloning mechanism the same in the new setup with pluggable routing? The reason for my question is that I try to avoid message cloning (as much as possible) when messages contain lots of data (e.g. audio or video). When using a RemoteRouter, I assume that it will serialize the data anyway (?) so cloning in advance is not really required?? I assume cloning will not be inside one of the routers in the stack?
In the past there have been some discussions about introducing a "Wire.js" object in Node-RED, which can have its own properties. Is that also related to the pluggable routing somehow, or does it have nothing to do with it?
The hot pluggable stack makes sense to me. Could this perhaps become a new sidebar panel, where multiple stacks can be managed and activated?

I also like the idea of having in a 1.x release a minimal version (with only a LocalRouter in the router stack).

Bart

knolleary · 21 July 2020 22:12

Thanks Bart.

Right - it is still very much in the exploration phase. What code exists is just me starting to explore the problem space and is likely to change a great deal as it evolves.

So whilst I cannot give definitive answers to your questions, I can maybe share my current thoughts on where I see it going. As it evolves, we will have a design note written up to formalise the design.

Yes, all wires have the same stack.

In the case of an imaginary RemoteRouter, its send function would look something like:

send(src, destinationId, msg, next) {
   // isRemote() is a placeholder for "I know destinationId lives somewhere else"
   if (isRemote(destinationId)) {
      // send msg to the remote destination
   } else {
      // it is local, so pass the message on down the stack
      next();
   }
}

The idea with the stack is each layer decides if it wants to do something with the message, and decides whether that means the message handling stops there, or it gets passed down the stack for the next layer to handle.
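To make the stack idea concrete, here is a minimal sketch of how such layers could be chained. It is only an illustration of the concept described above, not the code from the live stream: runStack, remoteNodes, and the two layer objects are made-up names, and "sending" is reduced to recording where the message went.

```javascript
// Hypothetical stack runner: each layer exposes send(src, destinationId, msg, next).
function runStack(layers, src, destinationId, msg) {
    let index = 0;
    function next() {
        const layer = layers[index++];
        if (layer) {
            // A layer either handles the message itself or calls next()
            // to pass it down to the layer below.
            layer.send(src, destinationId, msg, next);
        }
    }
    next();
}

const delivered = [];
const remoteNodes = new Set(["remote-node"]); // assumed lookup of remote destinations

const remoteRouter = {
    send(src, destinationId, msg, next) {
        if (remoteNodes.has(destinationId)) {
            // Handling stops here: the message goes to the remote destination.
            delivered.push({ via: "remote", destinationId, msg });
        } else {
            next();
        }
    }
};

const localRouter = {
    send(src, destinationId, msg, next) {
        // Bottom of the stack: deliver locally, never calls next().
        delivered.push({ via: "local", destinationId, msg });
    }
};

const stack = [remoteRouter, localRouter];
runStack(stack, "node-a", "local-node", { payload: 1 });
runStack(stack, "node-a", "remote-node", { payload: 2 });
```

With this ordering the local message falls through the RemoteRouter to the LocalRouter, while the remote one never reaches the LocalRouter at all.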

The high-level concept is a group could one day have extra meta-data associated with it. That meta data could be used by the RemoteRouter to help it decide if a message needs to be sent somewhere based on the group of the destination node. But that's all off in a possible future piece of work

I used those as examples to get across the idea that someone could create a layer that gets called before the LocalRouter, and a layer that gets called after the LocalRouter. I wasn't making any statement about the functionality inside them.

This is something I've been thinking about today. In the current code the cloning happens in the Node before it gets passed to the Router. I can certainly see there will be scenarios where the router may want to have more control over that. Equally, we don't want to make it harder to write layers because they now have to worry about cloning. So yes, this is still being worked through in my head - and it may lead to a different conceptual model than the stack idea discussed so far...

We don't have any concrete plans to add properties to wires. But if we did, then that would be extra information available to the router layers to help them do whatever they need to do - same as Groups.

As it stands, I don't see any reason for an end-user (ie someone using the Editor) to have to know or care about the router stack. There will be places (eventually) where they do configure the stack, they just don't know they are doing it. For example, when they enable the Flow Debugger, it will, under the covers, add the debugger into the stack. Or when they enable the Flow Testing feature we're currently designing and click a button to run tests against their flows, it will insert the flow-tester layer in the stack.

So no, at this stage, there is no plan for any UI element for the pluggable message router.

BartButenaers · 22 July 2020 05:54

I appreciate it! You won't be prosecuted in court when it appears afterwards that the result is another approach

Ah ok, I had interpreted that incorrectly.
I had understood that the stack worked like this:

Some layer in the group calls the destination (e.g. a Group)
The destination determines what to do with the message. E.g. if a Group can be remote in the future, then the group determines whether it has to send it to a remote server or not.
The destination returns to the stack whether it should proceed or not.

But instead you determine inside the stack what to do with the message...

That is indeed true. It is also good to avoid Layer developers cloning just to be safe for a particular case, because then we end up with multiple layers, each one cloning the message.

Interesting thoughts!!

Ok I got it! It would indeed introduce room for a lot of mistakes, if e.g. somebody puts the LocalRouter at the top of the stack...

It is just not clear to me at which position the non-core layers will be inserted (under the covers) into the stack. Suppose I create a CustomLayer which somebody can install (may the lord be with him), where will it be inserted automatically? Is that perhaps the last part of your live-stream, where you use something like this:

Then my CustomLayer has to return in which phase it needs to be started (e.g. preLocal). And there might be multiple Layer objects all running in the same phase. Is that correct?

Thanks for your time!!!

P.S. when I look at all my questions, you and all participants in that live stream are damn lucky that I cannot be there on monday evenings

dceejay · 22 July 2020 07:43

...though it would save all this typing

knolleary · 22 July 2020 08:47

This is precisely the area I'm currently thinking about and I don't have the answer yet.

If the user specifies a custom stack in their settings file, it isn't currently clear how they would then be able to enable the Flow Testing or Flow Debugger (for example) which would want to insert themselves into the stack.

An alternative model I'm currently looking at is to define a set of named steps a message goes through when passing through the router. For example (and I'm writing this off the top of my head... I'm still working through this)

node.send -
           \-> "preSend"
                 \-> localRouter
                       \-> "postSend"

and allowing custom code to be attached to those steps - so something could explicitly declare itself a preSend layer or a postSend layer.

Something like:

{
   router: {
      "preSend": [routerA, routerB],
      "postSend": [routerC, routerD]
   }
}

But there is still the issue over controlling the order within the steps.
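As a rough sketch of how a configuration like that could be executed: the handlers attached to each named step run in array order around the local router. createRouter and the routerA..routerD functions below are hypothetical stand-ins, and array position is the only ordering control, which is exactly the open issue mentioned above.

```javascript
// Hypothetical runner for the named-step model; none of these names are Node-RED APIs.
function createRouter(config, localRouter) {
    return function send(msg) {
        for (const handler of config.router.preSend) {
            handler(msg);
        }
        localRouter(msg);
        for (const handler of config.router.postSend) {
            handler(msg);
        }
    };
}

// Each stand-in handler just records that it ran.
const trace = [];
const routerA = () => trace.push("routerA");
const routerB = () => trace.push("routerB");
const routerC = () => trace.push("routerC");
const routerD = () => trace.push("routerD");

const stepConfig = {
    router: {
        preSend: [routerA, routerB],
        postSend: [routerC, routerD]
    }
};

const sendMessage = createRouter(stepConfig, () => trace.push("localRouter"));
sendMessage({ payload: "hello" });
// trace is now: routerA, routerB, localRouter, routerC, routerD
```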

So I'm currently looking at whether there are more steps that could be defined in the lifecycle of sending a message.

This model is very much inspired by the Fastify library for creating HTTP applications, which uses a hook based approach for custom behaviours to be added to the HTTP handling path - https://www.fastify.io/docs/latest/Lifecycle/. This is an alternative to Express, which uses the middleware stack approach.

ristomatti · 22 July 2020 09:43

I've yet to watch the stream but (I believe) I understand the concept based on this discussion. I know pluggable message routing has been mentioned to be on the backlog but now that you mentioned the source of inspiration I'm thinking if router is the best name for these layers. In some of the use cases discussed here, it makes more or less sense as the message can be thought of being routed through the debugger/test system and definitely if routed to some remote host.

But in other possible use cases I could think of like logging, timing, tagging or modifying the messages, maybe a hook (as in Fastify) would sound more fitting but also covering the other use cases as it's more "fuzzy" in what it means?

Further if thinking about the imaginary RemoteRouter, to me it would make more sense to create such a thing as a new (async) node rather than a transparent layer?

With this said I also like how router sounds though. But to me it means something that decides what path a request/message takes next, not so much what it does with it.

Sorry if this does not make sense since like said, I've yet to watch the stream. Going to do that now.

Edit: After now watching the start of the stream I understand the name choice a bit better. In essence it's more like pluggable message router and not routing. So the layers will be router plugins/hooks/middleware rather than being a stack of routers?

Edit 2: On the stream you mentioned it having the model of Express middlewares rather than Fastify hooks?

Edit 3: Excellent stream! Most interesting so far.

knolleary · 24 July 2020 16:38

Given the interest in this thread, wanted to share where I've got to with the design work on this.

The Design Note PR is here: https://github.com/node-red/designs/pull/7

But it's easier to read here (as you get to see the picture I've drawn):

I have moved away from the concept of a stack of routing layers, to a model more like Fastify where you can register handlers for a number of hooks through the lifecycle of a message.

Having spent all week staring at the problem, it feels like a better fit to me.

The router stack model was completely free range in terms of what could be done. But in many ways that would make it harder to use and you had to consider the complete end-to-end behaviour even if all you wanted to do was insert a bit of extra logging.

The hook model leaves the fundamental lifecycle intact - the router is still principally there to pass messages between nodes. The core functionality (cloning messages, async handling etc) is all built in - but the set of hooks provided allow plenty of flexibility in how and where custom code is added to that path.

Without repeating the full content of the design note I've linked to, here's the picture of what the hooks would look like:

and a summary of them:

preSend - passed an array of SendEvent objects. The messages inside these objects are exactly what the node has passed to node.send - meaning there could be duplicate references to the same message object.
preRoute - called once for each SendEvent object in turn
onSend - the local router has identified the node it is going to send to. At this point, the message has been cloned if needed.
postSend - the message has been dispatched to be delivered asynchronously (unless the sync delivery flag is set, in which case it would continue as synchronous delivery)
onReceive - a node is about to receive a message
postReceive - the message has been passed to the node's input handler
onDone, onError - the node has completed with a message or logged an error
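The order of these hooks can be sketched with a tiny registry. This is an assumption-laden illustration, not the API from the design note: addHook, trigger, and simulateSend are invented names, delivery is done synchronously for simplicity, the deep clone is a JSON round-trip stand-in, and onDone/onError are left out.

```javascript
// Minimal hypothetical hook registry (not the API proposed in the design note).
const hookHandlers = {};
function addHook(name, handler) {
    (hookHandlers[name] = hookHandlers[name] || []).push(handler);
}
function trigger(name, payload) {
    for (const handler of hookHandlers[name] || []) {
        handler(payload);
    }
}

// Walk one node.send() call through the hooks in the order summarised above.
function simulateSend(sendEvents, deliver) {
    trigger("preSend", sendEvents);          // once, with the whole array
    for (const sendEvent of sendEvents) {
        trigger("preRoute", sendEvent);      // once per SendEvent, before any cloning
        if (sendEvent.cloneMessage) {
            // Stand-in for the runtime's deep clone.
            sendEvent.msg = JSON.parse(JSON.stringify(sendEvent.msg));
        }
        trigger("onSend", sendEvent);        // message has been cloned if needed
        trigger("postSend", sendEvent);      // message dispatched (synchronously here)
        trigger("onReceive", sendEvent);     // destination node is about to receive it
        deliver(sendEvent.msg);
        trigger("postReceive", sendEvent);   // message was passed to the input handler
    }
}

const order = [];
for (const name of ["preSend", "preRoute", "onSend", "postSend", "onReceive", "postReceive"]) {
    addHook(name, () => order.push(name));
}

const original = { payload: "hi" };
const received = [];
simulateSend(
    [{ msg: original, cloneMessage: false }, { msg: original, cloneMessage: true }],
    (msg) => received.push(msg)
);
```

The two SendEvent objects deliberately reference the same message object, which mirrors the "duplicate references" remark for preSend: the first destination gets the original, the second a copy.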

Please go read the full design before asking about what I've written here.

BartButenaers · 24 July 2020 19:42

Hey Nick,
I completely understand that you have reviewed your design over and over again. This mechanism will be a tremendous step forward, but if you make a mistake in this phase then you are stuck with it a year from now...

It seems to me that you have done your homework very well! This new model with hook points will give you much more control, to make sure that our contrib handlers don't mess up things too much...

I have a few questions about the design document:

Are the handlers in all hooks allowed to update the message or metadata, or are there any restrictions?
A single hook can have multiple handlers. Do you think there might be problems if the sequence of those handlers is changed by the user (in his settings.js file)? Just wondering, but I don't have any particular examples in mind ...
A handler can change the cloneMessage flag. I assume one of the handlers can set the flag to true, while another handler can set it back to false. So handlers can override the flag modifications from other handlers. Don't you think that this might somehow result in conflicts?
This would allow a preRoute handler to do its own cloning behaviour and then set cloneMessage to false so that no further cloning would happen for that message. Last week I was experimenting with OpenCv.js and I was stuck because their matrix objects need to be cloned like this:

Currently that is not possible. Do I understand it correctly that I could add - in the future - a preRoute handler to my OpenCv.js nodes, that allows the matrices to be cloned correctly?
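A minimal sketch of what such a handler might look like, assuming a SendEvent carries msg and cloneMessage properties as the design note describes. FakeMatrix is a stand-in class with its own clone() method (it is not the OpenCv.js API), and preRouteHandler is a hypothetical name.

```javascript
// Stand-in for a matrix type that has to be copied through its own clone() method.
class FakeMatrix {
    constructor(data) {
        this.data = data;
    }
    clone() {
        return new FakeMatrix(this.data.slice());
    }
}

// Hypothetical preRoute handler: do the cloning ourselves, then switch off
// the default cloning so the message is not cloned a second time.
function preRouteHandler(sendEvent) {
    if (sendEvent.cloneMessage && sendEvent.msg.payload instanceof FakeMatrix) {
        // Shallow copy of the message, with the matrix cloned via its own method.
        sendEvent.msg = { ...sendEvent.msg, payload: sendEvent.msg.payload.clone() };
        sendEvent.cloneMessage = false;
    }
}

const matrix = new FakeMatrix([1, 2, 3]);
const sendEvent = { msg: { topic: "frame", payload: matrix }, cloneMessage: true };
preRouteHandler(sendEvent);
```

Events whose cloneMessage flag is already false are left untouched, so the first wire would still receive the original message.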

Nice design!!
Bart

TotallyInformation · 25 July 2020 09:46

Really? Surely you can clone a msg object inside your custom node right now? What I don't think you can do is then stop Node-RED from cloning it again if it thinks it should have been?

BartButenaers · 25 July 2020 11:02

Julian,
Suppose such an Opencv.js node output has N wires to other nodes. This means that Node-RED will (deep) clone my output messages N-1 times ( because it sends the original message to the first wire, but it will create a copy for all other wires). But that output message contains an Opencv.js matrix object, which needs to be cloned as myMatrixObject.clone(). But currently there is no way for me to tell Node-RED how it should clone my messages. So hopefully that is possible in the future...

cinhcet · 25 July 2020 11:07

This goes a bit off-topic now, but is the pluggable message routing the right way to insert such a functionality?
Wouldn't it be better if the node itself could specify the cloning behavior if its output is connected to multiple other nodes, in this case for an opencv object?

BartButenaers · 25 July 2020 13:03

Nick has been so kind to mention the cloning in the design document, so IMHO we are still on topic

Don't think that would solve the problem, since the message (containing an object which needs custom cloning) will traverse the flow through all kinds of nodes. And those nodes are not aware of the cloning requirements of the messages passing through them. Moreover a message can contain objects from other nodes also, which might need other special cloning treatment. Nick's router however is nicely positioned between all nodes, so it could do the job. As I said before, he has created a very nice design!!

Will try not to go into too much detail (because this feature is only at an early design stage), but that really hurts to be honest. Currently msg cloning in Node-RED simply calls the Lodash cloneDeep function. Suppose we have multiple cloning hook handlers in the future:

I develop a custom cloning hook handler to clone Opencv matrix objects.
But another custom cloning hook handler implements custom cloning of another type of object.
And so on ...

And all those objects can be available in one single msg! I assume that a cloning hook could e.g. call the Lodash cloneDeepWith function (instead of cloneDeep) which accepts a customizer parameter:

function customizer(value) {
  // Ask each custom clone hook handler whether it wants to custom clone this (nested) msg property
  if (...) {
    return value.cloneNode(true);
  }
}
var clonedMsg = lodash.cloneDeepWith(msg, customizer);
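To show how the "ask each handler" step could work, here is a dependency-free sketch. The local cloneDeepWith below is a simplified stand-in for the Lodash function (it only handles arrays and plain objects, and uses the customizer's result whenever it is not undefined); registerCloneHandler and the Matrix class are hypothetical.

```javascript
// Simplified stand-in for lodash's cloneDeepWith, so the sketch runs on its own.
function cloneDeepWith(value, customizer) {
    const custom = customizer(value);
    if (custom !== undefined) {
        return custom; // a handler took care of this value
    }
    if (Array.isArray(value)) {
        return value.map((item) => cloneDeepWith(item, customizer));
    }
    if (value !== null && typeof value === "object") {
        const copy = {};
        for (const key of Object.keys(value)) {
            copy[key] = cloneDeepWith(value[key], customizer);
        }
        return copy;
    }
    return value; // primitives are returned as-is
}

// Hypothetical registry: one clone handler per special object type.
const cloneHandlers = [];
function registerCloneHandler(canClone, clone) {
    cloneHandlers.push({ canClone, clone });
}

// Ask each registered handler whether it wants to clone this (nested) value.
function customizer(value) {
    for (const handler of cloneHandlers) {
        if (handler.canClone(value)) {
            return handler.clone(value);
        }
    }
    return undefined; // nobody claimed it: fall back to the default deep clone
}

// Stand-in for an object that must be cloned through its own clone() method.
class Matrix {
    constructor(data) {
        this.data = data;
    }
    clone() {
        return new Matrix(this.data.slice());
    }
}

registerCloneHandler((v) => v instanceof Matrix, (v) => v.clone());
registerCloneHandler((v) => v instanceof Date, (v) => new Date(v.getTime()));

const msg = { payload: { image: new Matrix([1, 2]), takenAt: new Date(0) }, tags: ["a"] };
const clonedMsg = cloneDeepWith(msg, customizer);
```

Both special objects live in the same msg and each one is cloned by its own handler, while the rest of the message goes through the default path.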

But now I'm going to stop brainstorming, otherwise I most certainly will be banned from the Node-RED community

TotallyInformation · 25 July 2020 13:04

I can certainly see that the proposed solution will be a lot more efficient but I can think of several ways you can indicate to downstream nodes that something else must happen. To say that you can't currently do it seems a little strong. But perhaps I'm missing something. Anyway, well off-topic so I'll shut up.

knolleary · 25 July 2020 21:53

I don't think there will be restrictions - but there will be rules of the road that should be followed otherwise things would break.

I don't see these hooks as being things end users touch in general. It will be the tools that are developed on top of Node-RED that make use of these hooks that will be what end users touch.

So yes, there will be scope for things going wrong, but to some degree it will be up to those tools to manage... we just have to make sure the hooks are well-defined and understood.

Yes... but again, if something is touching the clone flag, it needs to know what it is doing and it needs to be aware that it doesn't exist in isolation. But I don't see there being lots of things wanting to touch the clone flag - especially wanting to set it to true. It's far more likely that something wants to set it to false to prevent any further cloning from happening.

That's an interesting idea. Technically, this proposal would allow that. We need to evaluate if that is a sensible thing to allow, or even if it's technically possible, whether it's something we'd endorse doing.

knolleary · 30 July 2020 17:03

Quick update on this thread as it has generated some good discussion.

I've made some good progress on the design and now also have a fully working implementation. Which is nice.

The hook names have been updated to better reflect what they are for and to be more consistent around the use of the on/pre/post prefixes.

~~preSend~~ -> onSend - passed an array of SendEvent objects. The messages inside these objects are exactly what the node has passed to node.send - meaning there could be duplicate references to the same message object.

preRoute - called once for each SendEvent object in turn

~~onSend~~ -> preDeliver - the local router has identified the node it is going to send to. At this point, the message has been cloned if needed.

~~postSend~~ -> postDeliver - the message has been dispatched to be delivered asynchronously (unless the sync delivery flag is set, in which case it would continue as synchronous delivery)

onReceive - a node is about to receive a message

postReceive - the message has been passed to the node's input handler

~~onDone , onError~~ -> onComplete - the node has completed with a message

I've also:

updated the design note with some more details on the new RED.hooks api this will introduce.
added some words to explain the difference between RED.events and RED.hooks as they look similar but serve different purposes.

Some links

Design Note PR: Pluggable message routing by knolleary · Pull Request #7 · node-red/designs · GitHub
Direct link to the design note: https://github.com/node-red/designs/tree/message-routing/designs/pluggable-message-routing
Draft PR that implements this current design - Pluggable Message Routing by knolleary · Pull Request #2665 · node-red/node-red · GitHub

I am still looking at how these hooks can be extended to other parts of the lifecycle - starting/stopping/deploying etc - as that will be needed by components such as custom routers or the flow debugger. The question remains whether they will be hooks (which can modify the data passing through) or events (which are outside observers to the data).

BartButenaers · 30 July 2020 20:40

Nice work!! Congratulations on this 1.x.0 implementation

No further questions at this moment. Looks very complete and understandable.

The preRoute step can be used to do remote message routing. As no message cloning has happened by this point, it avoids that overhead, since serialising the message to send over the network serves the same purpose.
I appreciate that you have taken into account the remarks from the above discussion!

I'm very curious to see some sample hook handler implementations, and I'm very curious how it will trigger the creativity of the developers in the community. Perhaps it might be interesting to demonstrate something like that in a Monday evening live-stream. Note that I can't participate in the next two weeks, in case you want to avoid a huge series of stupid questions
