I have 3 Inject nodes that run every hour and 1 Inject node that runs every 30 minutes.
All of them flow into a Switch node where I check a flow variable to see whether another inject's run is already in progress. If flow.switch is empty, I let the flow continue from that Inject node; if not, the message loops and waits until flow.switch is empty.
Now my question: is it possible that 2 or 3 Inject nodes will simultaneously enter the "ON/OFF?" switch node? If that's possible, then I have a problem...
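For reference, the gate I described is essentially this, written as a two-output Function node (a simplified sketch; only flow.switch comes from my actual flow, the rest is made up for illustration):

```javascript
// Sketch of the "ON/OFF?" gate as a Function node with two outputs.
// Output 1 continues the flow; output 2 loops back (e.g. via a Delay node) to retry.
const busy = flow.get("switch");      // the flow variable used as a lock

if (!busy) {
    flow.set("switch", "running");    // claim the lock before continuing
    return [msg, null];               // output 1: let the message through
}
return [null, msg];                   // output 2: wait and try again
// (something downstream must clear flow.switch when the run finishes)
```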
As you used link nodes (which are good for some things), it is difficult to see what is connected to what.
If all those 3 (4?) run every hour, there is a chance they will all send their messages at the same time, because they are all started when the flow is deployed (or the machine is booted).
As there don't seem to be any exotic nodes, how about you post the flow so people can get a better understanding of what is happening?
Thanks for answering.
Just to clarify, are you saying that these 3 Inject nodes, which run every hour, will never reach the "ON/OFF?" switch node at the same time?
Thanks @dceejay for the clarification.
So is that why using a Split node on a large array, for example one with 20k elements, slows Node-RED to a crawl for 2-4 minutes until it's done?
I guess situations like that encouraged innovations like multi-threading and multi-processing. If you have workloads that need parallel processing, you may have to rethink your solution design. You could split the functionality: keep the major part in NR and put the "time critical" part into a Python script or another external module that supports parallel processing.
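As a rough sketch of that hand-off (assuming you expose child_process to Function nodes via functionGlobalContext in settings.js; process_data.py is a made-up script name):

```javascript
// Sketch: hand the heavy lifting to an external Python script from a Function node.
// Assumes settings.js contains: functionGlobalContext: { child_process: require('child_process') }
const { execFile } = global.get("child_process");

execFile("python3", ["process_data.py"], (err, stdout, stderr) => {
    if (err) {
        node.error(err, msg);   // report the failure against this message
        return;
    }
    msg.payload = stdout;       // pass the script's output down the flow
    node.send(msg);
});

return null; // the message is sent asynchronously in the callback above
```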
That may or may not be related. It will depend on how big "large" is, and also on what you are doing subsequently. Node.js passes objects by reference, so the large array is a single object in memory; splitting it has to create n new objects, whereas just passing the one object around is just moving a pointer.
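By way of illustration (my own toy example, with sizes matching your case):

```javascript
// Passing an object copies a reference, not the data:
const big = new Array(20000).fill({ value: 1 });

const samePointer = big;       // no copy: both names refer to the one array
samePointer[0] = { value: 2 };
console.log(big[0].value);     // 2 - "passing" it was just moving a pointer

// Splitting, by contrast, allocates new objects:
const pieces = [];
for (let i = 0; i < big.length; i += 1000) {
    pieces.push(big.slice(i, i + 1000)); // each slice is a new array object
}
console.log(pieces.length);    // 20 new arrays, plus their bookkeeping
```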
"whereas just passing one object is just moving a pointer." - My problem with this is that traversing 20 thousand records takes a lot of time. That's why I used split node to split the array to multiple messages and be processed simultaneously in the flow. This cuts down processing time but increased the load much.
For example: array with 20,000 elements and I used split with a fixed length of 1,000 will create 20 messages. These messages will be processed simultaneously.. But also increases the load.
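If it helps to picture it, this is roughly what that split amounts to, written as a Function node (my sketch; the real Split node also adds msg.parts so a Join node can reassemble the array later):

```javascript
// Emit the 20,000-element array as 20 messages of 1,000 elements each.
const items = msg.payload;   // assume payload holds the array
const len = 1000;            // the fixed length from the example above

for (let i = 0; i < items.length; i += len) {
    node.send({ payload: items.slice(i, i + len) });
}
return null; // all messages were sent in the loop
```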
They won't be processed simultaneously; they will be processed one after another (assuming that 'processing' doesn't include an asynchronous activity like database, file or internet access). So by splitting the array and passing messages about you are increasing the overheads.

The quickest way would likely be to do it all in a single Function node, traversing the array and doing whatever is necessary, but nothing else would get a look in while that was going on.

Alternatively, if how long it takes is not critical, but you don't want to slow down other things going on, then keep the split as you currently have it but put a rate limit in the flow (using a Delay node). The processing would take longer, but there would be gaps that allow node-red to continue. The slower you set the rate limit, the longer it will take, but the smaller the effect on the rest of the system.
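For illustration, the single Function node approach might look something like this (a sketch, assuming msg.payload holds the array and the per-record work is synchronous):

```javascript
// One synchronous pass over the whole array: no message-passing overhead,
// but node-red is blocked until it finishes.
const items = msg.payload;

msg.payload = items.map((record) => {
    // ... whatever per-record work is needed goes here ...
    return record;
});
return msg; // one message out: just moving a pointer to the result
```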
Thanks @Colin for the detailed explanation!
Yes, I'm doing an API call (internet access) in the process, depending on a certain condition. So this might be ideal then.
Using a delay between loops is a good idea. Would putting a random 100-500 ms delay per loop make a difference? I really don't want to make the process take longer.
I also tested the same number of elements, around 20 thousand records, using split (with a loop) vs. without split (just a loop), and the results are about the same in terms of speed. Kind of weird, because I was expecting the split version to be much slower.
Which mostly shows that it is the API calls that are taking the time. It could be network latency, or the fact that you are DDoSing the API server. A delay between loops would allow some of the queues to drain down, but whether that is a problem or not depends on how much memory you have to play with, and on how much the server cares about receiving many calls.
Add another debug node after the Get Data node; I imagine that is where most of the delay is. But time spent fetching (assuming that is a different machine or process) shouldn't lock up node-red. If it does, then I guess the Get Data process must be on the same machine and must be locking up the whole machine, so it's nothing to do with node-red itself.
@dceejay thanks for the explanation.
Question about setting a delay between loops:
Would a loop inside a Function node using setTimeout have the same effect as using the Delay node?
I'm just curious what you meant by "it isn't quite as easy to do".
What I was going to do is loop over the array inside a Function node and, between iterations, add a time delay using setTimeout. Would a timeout inside a function (instead of the Delay node) allow some of the queue to drain down?
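Something like this is what I had in mind (an untested sketch; the batch size and delays are just examples, and it folds in the random 100-500 ms idea from earlier):

```javascript
// Emit a batch, pause, repeat - so other events can run in the gaps,
// much like a Delay node's rate limit.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
const items = msg.payload;               // assume payload is the 20k array

(async () => {
    for (let i = 0; i < items.length; i += 1000) {
        node.send({ payload: items.slice(i, i + 1000) });
        await sleep(100 + Math.random() * 400); // random 100-500 ms pause
    }
    node.done();                         // tell node-red the work is finished
})();

return null; // messages are emitted asynchronously above
```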