Rather than having to add a new node to clean up resources, is it possible to add a function to a message that is called at the point the message is destroyed? For example, I wish to open a connection to a database, assign it to the message, and then have the function called when the message is completed. I may have several separate nodes operating on the database and am trying to avoid opening and closing the connection for each one. That is: open it once for a message, have the different nodes use the connection, and close it on completion. It may even allow for a unit of work (UOW) across nodes.
Most nodes that do things like connect to a database use an underlying config node that can be shared between them, such that it maintains the long-running connection and the individual nodes can all use it. When the connection gets torn down depends on what sort of thing you are connected to; typically they stay connected until a redeploy, but obviously other paradigms can be implemented.
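A minimal sketch of that shared-connection pattern (class and method names are hypothetical, not the actual code of any existing node): the config node creates the connection lazily once, every node that references it reuses the same object, and a redeploy tears it down.

```javascript
// Sketch of the shared-connection pattern a Node-RED config node uses.
// `createConnection` stands in for a real driver call and is hypothetical.
class DbConfigNode {
  constructor(createConnection) {
    this.createConnection = createConnection;
    this.connection = null;
  }
  // Every flow node referencing this config node calls getConnection()
  // and receives the same long-running connection object.
  getConnection() {
    if (!this.connection) {
      this.connection = this.createConnection();
    }
    return this.connection;
  }
  // Called on redeploy/close: tear the connection down once, for all users.
  close() {
    if (this.connection && typeof this.connection.end === "function") {
      this.connection.end();
    }
    this.connection = null;
  }
}
```

The point is that connection lifetime is tied to the config node (deploy-to-deploy), not to any individual message.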
I understand the suggestion; however, it implies the connection is bound to the node, and thus updates cannot be contained in a unit of work across nodes unless the workflow is serialized. I was expecting Node-RED to be able to scale, which implies several messages passing through the flow/nodes at the same time. To achieve that, certain work areas and resources need to be managed at the message object.
If the framework allows a single node to handle many messages simultaneously, then considering the pattern you have described, it implies the resource being used can operate multi-threaded. I am not sure that is possible for some connection types. It also forces the use of two-phase commit between node connections to achieve a unit of work; again, not all software has this capability, making it more complex (and more overhead) to manage for simple operations. Nor does it allow ACID semantics to cover the sending of the response.
My solution to my problem was to have the configuration node maintain a pool of connections. A connection is assigned to the message at the first touch-point node and then released back to the configuration node after the response is posted. The latter is not achievable, so the release has to be done prior to sending unless I make my own version of these node types.
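That pooling approach could be sketched roughly as follows (the class and method names are mine, purely illustrative, under the stated assumption that release must happen before the response is sent):

```javascript
// Illustrative connection pool a configuration node could maintain.
// `factory` stands in for a real driver call and is hypothetical.
class ConnectionPool {
  constructor(factory, size) {
    this.idle = Array.from({ length: size }, () => factory());
  }
  // The first touch-point node acquires a connection for the message.
  acquire() {
    if (this.idle.length === 0) throw new Error("pool exhausted");
    return this.idle.pop();
  }
  // A node near the end of the flow releases it back, prior to sending
  // the response (since there is no post-response hook to do it later).
  release(conn) {
    this.idle.push(conn);
  }
}
```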
Node.js (and hence Node-RED) is entirely single-threaded. Nodes handle messages one after another, not at the same time.
Are you suggesting that nodes such as the MySQL config node be redesigned in some way so you can tell them when you are finished with a connection?
I understand it is single-threaded, but my impression was that it places work on an event queue when there is a blocking operation, and that another thread could then operate through the same object, bounded by function scope. I must admit a lack of knowledge here. A call to MySQL would be a blocking operation; if this weren't handled somehow, performance would be bounded not only by a single core but also by the blocking operations themselves, suggesting very poor performance, as latency on blocking operations is proportionally higher than CPU utilisation.
Which then makes me wonder how Node-RED makes use of multiple cores. I had assumed it would use the cluster feature; is this an incorrect assumption? Does one have to start new instances of Node-RED and load-balance across the ports? If so, that seems like a very poor limitation.
On MySQL, I haven't looked at it in detail but noted shortcomings in enabling a UOW across nodes. For read activity it's not much of an issue, as I believe MySQL can take several concurrent requests on one connection. But if there is a need to be ACID across nodes, then the config nodes may need to be redesigned to release resources.
If my requirement is not already catered for, I can see it being easy to implement. All it requires is a callback stack on the message which is processed before the message work area is destroyed; any node could add a function to the stack.
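For illustration only, that callback stack might look like the sketch below. All names are hypothetical, and as noted later in the thread, attaching functions to messages is considered an anti-pattern in Node-RED:

```javascript
// Illustrative sketch of the proposed pattern: nodes push cleanup callbacks
// onto the message, and the runtime runs them before the message is destroyed.
function addCleanup(msg, fn) {
  if (!msg._cleanup) msg._cleanup = [];
  msg._cleanup.push(fn);
}

// Hypothetical hook the runtime would call when the message completes.
function destroyMessage(msg) {
  while (msg._cleanup && msg._cleanup.length) {
    msg._cleanup.pop()(msg); // LIFO, like a stack of resource releases
  }
}
```

Callbacks run in reverse order of registration, so resources acquired first are released last.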
It is possible to use multiple cores with Node.js - just not simple. Most people scale horizontally rather than vertically, which is to say that they run multiple instances of their microservices.
Of course, Node-RED is not what I would exactly call a "micro" service; its general-purpose and platform nature means that it is perhaps a little more portly than a more specialist microservice. That isn't a criticism, merely an observation.
So if you need a more efficient approach to handling a database, I would suggest that a more appropriate approach would be to create your own API microservice then you can do exactly what you want and tightly control the connections and throughput. You would also be able to more easily scale it out.
Node-RED is an excellent hammer (and much more!) but not everything is a nail.
UPDATE: Perhaps I should also have pointed out that a custom node can also easily provide an API (or several). I've been using this to good effect in uibuilder. In my case, it wasn't really necessary to use a config node, and APIs have proven a much better approach.
Such things do block the current execution path, but they do not start another thread; the same thread is used to do other things while waiting for the activity to complete.
Node-RED does scale - just perhaps not in a fashion you are expecting or in a way that supports the use case you have.
The event-loop nature of Node.js does a good job of allowing multiple tasks to be handled - especially when those tasks involve I/O to external systems. For example, sending a query to a database does not block the event loop waiting for a reply. Depending on how low-level you want to get: the query is sent to the database and then the event loop moves on to the next task. At some point later a TCP packet will come back and a task will be scheduled with the event loop to handle it - calling back into the code that submitted the query.
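That scheduling can be illustrated with a tiny self-contained sketch, using `setImmediate` to stand in for the database round trip (the `fakeQuery` function is obviously a mock, not a real driver):

```javascript
// Simulated non-blocking "query": the reply callback is scheduled on the
// event loop, while the current task carries straight on.
const order = [];

function fakeQuery(sql, callback) {
  order.push("query sent");
  // setImmediate stands in for the TCP reply arriving later.
  setImmediate(() => callback(null, { rows: [] }));
}

fakeQuery("SELECT 1", () => order.push("reply handled"));
order.push("event loop moved on");
// `order` ends up as: ["query sent", "event loop moved on", "reply handled"]
```

The synchronous code runs to completion before the reply callback ever fires, which is why a slow database does not stall other messages.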
Adding functions to the message object is a definite anti-pattern. It can be done, but it wouldn't be a feature we'd encourage. (The fact our HTTP In node does it is a long-regretted mistake we are slowly undoing.) Messages should be JSON-encodable so they can be serialised.
Depending on the database technology being used, it might be possible to attach a serialisable transaction ID to the message as it passes through a flow - but the actual work is done in the node, not the message.
Thanks for the clarification. I understand the concern about adding functions to the message, as it binds the message to the Node-RED instance (affinity). That's not to say I wouldn't add functions to a message, as I find this pattern useful in distributing workload, as demonstrated by the map/reduce architecture.
I'm not sure I agree with the comments on horizontal/vertical scaling. Chip technology and its limitations are driving vertical scaling: IBM Power big boxes have 1500 logical cores, and even mobile phones brag about having 8 cores. A framework should align with the future direction of CPUs. It also seems strange to enable Node-RED to play in the API microservices arena with Swagger, and then say one needs to develop one's own API microservice to support it. Personally I believe that is a great direction and use of Node-RED.
My concern is not about efficiency but about achieving ACID for a message's existence; my comments about efficiency were only raised to indicate that both are possible. Node-RED appears to be a great framework and presentation layer, but fundamentally quite restrictive in the types of data flows it can handle. It seems to me that if ACID within a message's existence were addressed, it would greatly enhance the product's scope of use. I don't see it being a big matter to overcome, but I understand that introducing affinity is a concern and could cause new limitations/problems.
And that's where we differ. You consider this to be very restrictive. We consider it to be just one type of operation and not one that is very common in the use cases Node-RED has traditionally been applied to.
In the 5 years of discussions with users I think this may be the second time this topic has been brought up.
I'm not dismissing the validity of the use case; there certainly is a place for more transactional semantics across a set of nodes and that is something worth exploring. But the lack of those semantics has not held Node-RED back.
Understood and accepted in the context of its target market being IoT. A pity, as there is potential for greater things. I notice that other products in this space, such as NiFi, face similar queries/demands as their usage creeps into traditional processing.
While I agree and have long hoped for more capabilities for general data processing (ETL type workflows), unless more people get involved with the development of Node-RED, it can only progress at a certain pace.
You are absolutely not comparing like-for-like now. NiFi is built on Java, it requires a much more significant environment than Node-RED does, and its design cues come from a very different place.
There is room for both - and more.
Trying to run NiFi on an Android phone or a Pi Zero for example would be pretty pointless but Node-RED can do that. So it isn't surprising that NiFi might do better at raw transactional throughput than Node-RED does. Though I don't actually know whether anyone has really tried pushing Node-RED to its limits on a more powerful platform?
Node-RED's benefits are (I'm sure not exhaustive):
- Runs on low-cost platforms.
- Simple to install and run, no complex architecture required.
- Very quick to produce prototypes.
- Excellent community.
- Some professional support - IBM and Hitachi plus a few other people who use it professionally and are building commercial products with it.
- Is very stable - few errors will crash a Node-RED system.
- Size - my full development environment for Node-RED, including NR itself and all of the associated packages, flows, projects, etc is just 320MB. The zip archive alone for NiFi is 1.3GB, over 4x larger.
Node-RED's not so strong points:
- Limited core developer support (not a criticism of those who are working hard on it or supporting it, just an observation).
- Node.js is less likely to reach the performance levels achievable by fully compiled languages with more complex architectures.
- It is designed for ease of use, not always for transactional throughput.
I can't really comment much more about NiFi. At present I can't run it. Having downloaded it for my Windows 10 i7, 16GB RAM machine, it is currently giving:
Hmm, not quite sure what is going on there. Utilisation is still climbing rapidly, so I'm guessing it will fail spectacularly at some point.
Seems that it really doesn't like being started from a PowerShell shell.
Running from a legacy cmd.exe shell seems to start it up
but it never gives me the web interface that it should on http://localhost:8080/nifi
I just didn't wait long enough. Though no clues were given, it actually took around 15 minutes on my powerful machine to start up something useful, the only clue being a Windows prompt for a firewall change.
It should also be noted that, unlike Node-RED, NiFi has no direct visualisations. It is good for ETL (Extract, Transform & Load) workflows, but you need a different tool if you want users to see the output.
You can, of course, use NiFi alongside Node-RED: using NiFi for complex ETL tasks such as consuming multiple high-transaction-rate log files, merging them and outputting to InfluxDB, or transforming other streams of data to MQTT outputs.
My comparison to NiFi was only in the sense of it evolving and being a similar wiring framework. One such evolution is MiNiFi, a cut-down form with a low footprint targeted at small devices - which demonstrates my point from the other perspective, with NiFi's creep into the IoT world.
I believe that MiNiFi is really just an agent for sending data onwards to NiFi?
I don't think that any of us are arguing against that. We, of course, all want Node-RED to go from strength-to-strength. However, we also need to be cognisant of priorities and available resources for development.
Is it? Well, it is supported by the Apache Foundation so it arguably has better exposure for that. But I don't see it making any inroads into the professional areas of analytics that I am involved with. Indeed, I'm seeing more people moving away from Java-based tooling because of the overheads and the overall management of Java as a platform.
As I said, there is room for both approaches and for both tools. Each filling its own specialisation.
I continue to hope for the expansion of Node-RED into other areas of ETL as I like to keep the number of tools that I have to deal with to a minimum but given the level of support and development available to Node-RED as a platform right now, personally, I am happy with its direction and development.
Just my view of course. The big sticking point for me is the ability of Node-RED to produce user interfaces backed by server processing. It is so easy to do. Tools like NiFi don't help me with those issues at all. For ETL tasks, I generally turn first to the PowerQuery engine in Excel. But if I needed to do things like log aggregation and didn't have the budget for Splunk, I might well turn to NiFi.
I'd love to hear about any other similar tools you may have come across.
NiFi has a GUI with some useful features not present in Node-RED.
With NiFi's heritage being the NSA, that has probably induced take-up in fraud detection. It has a footprint in major enterprises in log and event analysis (which is why it is not well known), and this is now being extended into other domains. It is in use in large telcos and, I heard, a large miner. My hands-on touch point came from replacing Splunk with NiFi on the recommendation of a major consultancy. I first sighted it in a Hortonworks presentation in the big-data space at an IBM event on analytics.
Software delivering wiring frameworks has been around since the turn of the century under the title of "message broker". Originally there was Neon, acquired by IBM, focused on data flows in enterprises, which are transaction focused. Products such as Ab Initio, who require an NDA to be signed, have a major presence in large enterprises but are not well known because of the high cost and secrecy requirements. Most of your card transactions are likely to have gone through it.
Excellent breakdown, thanks.
Now that happens to be very interesting to me as I'm currently responsible for the design of a new organisation and accompanying technology to service around 15k users across multiple organisations.
We have been thinking about Splunk but I'm getting noises about the cost so this could be interesting. We will be using ServiceNow for most reporting but are very likely to need other tools to draw data together.
Anyway, sorry we are well off-topic now. Interesting discussion though.
If it is of any help, Elastic was the second part of the equation in the replacement. It is gaining loads of traction in the fraud space with big data. Cost was a big decider.
Just remembered - it was the ELK stack from Elastic.