Delay in Round Robin Node

Hi All,

I have been working on a fleet management solution.

I am using Node-RED to listen to the feed from the devices and send it to my load balancer server.
My flow is like below:

  1. Listener -> Round Robin load balancer node (25) -> load balancer server
  2. The load balancer server stores the data in the database.

The listener receives 1 crore (10 million) feeds, or more than that, in a short time.

The main issue is that there is a 4 to 5 hour delay between the time the data is received and the time it is stored in the database.
For example,
Device sent time = 02:50:00
Database store time = 07:50:00

I couldn't find where the delay is, or whether there is any issue with using the Round Robin node.

Regards,
Siva

When using a non-core node it is always a good idea to tell us what it is and how you have configured it. Is it node-red-contrib-something?

Hi Colin,

The load balancer node is from node-red-contrib.

I am attaching a screenshot of part of the flow.

party node = Load balancer node
portal = Loadbalancer server

Regards,
Siva

Can you give us the full name(s) of the extra nodes you are using - e.g. node-red-contrib-xxxxx

Even better would be a link to them :slight_smile:

node-red-contrib-loadbalance

Is it possible that it is actually a timestamp issue rather than an actual delay? Something to do with timezone possibly.
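As a standalone Node.js illustration of how that can happen (the timestamp string here is made up, and this is plain Node.js rather than anything Node-RED specific):

```javascript
// Illustration: the same timestamp string parsed two ways.
// A string with no zone suffix is treated as server-local time; with a
// trailing "Z" it is treated as UTC. If the device and the database server
// disagree on which convention is in use, every record appears shifted by
// the whole zone offset even though no real delay occurred.

const raw = "2024-05-01T02:50:00"; // hypothetical device timestamp

const asLocal = new Date(raw);       // interpreted in the server's zone
const asUtc   = new Date(raw + "Z"); // interpreted as UTC

// The two interpretations differ by exactly the server's UTC offset:
const diffMinutes = (asLocal.getTime() - asUtc.getTime()) / 60000;
console.log(`apparent shift: ${diffMinutes} minutes`);
```

On a server whose clock is set to UTC the shift is zero, which is why this kind of bug only shows up on machines configured with a local timezone.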

I assume it's not a timestamp issue; out of 1 crore packets, only 20% are delayed.
Is there any limitation on the number of connections in the load balancer node or the TCP node?

What timezone are you in? The minutes match exactly but the times are 5 hours apart. I'm guessing India, as that is roughly UTC+5 (strictly, India is UTC+5:30).

Could it be that messages handled by particular servers are getting wrong timestamps? Possibly due to timezone settings being different. Are you able to tell which server handled individual messages?

It's not only a 5 hour delay. You can look at the example entries below:
Device sent time = 03:50:00
Database store time = 04:20:00

Device sent time = 01:50:00
Database store time = 03:20:00

Device sent time = 02:50:00
Database store time = 10:30:00

Flow:
TCP In listener node => round robin load balancer node (25 splits) => server (all 25 splits go to the same server)

From all 25 load balancer outputs (the party node), the data goes to the same portal server.

Have you added logging to determine where the messages get held up?

We are trying to identify that. Can you tell us how to log it, and with which node?

We receive more than 3k messages per second from the devices.

You can add a node in the path to each server and log it from there.
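As a sketch of the idea (plain Node.js rather than an actual function node, and the field name is just illustrative): tag each message with a receive timestamp on the way in, then compute the elapsed time where it leaves the flow.

```javascript
// Sketch: tag each message as it enters the path to a server, then log
// the latency where it is handed to the database. In Node-RED each of
// these functions would be the body of a function node; "receivedAt" is
// an illustrative property name, not part of any standard msg shape.

function tagReceive(msg) {
  // Function-node equivalent: msg.receivedAt = Date.now(); return msg;
  return { ...msg, receivedAt: Date.now() };
}

function logLatency(msg) {
  // Function-node equivalent, placed just before the database write.
  const latencyMs = Date.now() - msg.receivedAt;
  console.log(`latency through flow: ${latencyMs} ms`);
  return latencyMs;
}

// Simulate one message passing through:
const msg = tagReceive({ payload: "feed-data" });
const latency = logLatency(msg);
```

At 3k messages per second you would not want to log every message; logging only messages whose latency exceeds some threshold keeps the volume manageable.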

When you say you are feeding 25 nodes in one server, do you mean 25 node-red nodes running in the same node-red server? What are you doing in the nodes that benefits from parallel working, given that node red itself is single threaded?

Possibly for some reason one of them is not able to get hold of some resource and locks up for long periods. In fact a good start might just be to count the number of messages going through each server, if one locks up for hours then you should see a big difference between them.

  1. What node can be used to log in the path of each server?
  2. From the device:
    a) It connects to the load balancer server.
    b) The load balancer server forwards each packet to one of 5 node servers running in parallel.
    c) Each node server has the same setup as mentioned in the response above.
    d) 25 Node-RED nodes are running in the same Node-RED server (25 × 5 servers here).
    e) All 25 Node-RED nodes are connected to the same TCP Out load balancer server.
  3. How can we find messages that are locked up for a long period?
  1. There are numerous different methods of logging: to a file, to the system journal, or to an external system. But see point three; I would try that first, given the large amount of logging that would otherwise be involved.

  2. Having a bit of trouble understanding. Do you mean you have inputs coming in, these are sent, via a loadbalancer, to one of five servers each running node-red, then in each server you run another loadbalancer sending to one of 25 nodes in that server. What do the 25 nodes actually do that allows 25 to run in parallel in one instance of node-red?

  3. I would add a node in front of each of the 25 nodes, that increments two global variables (named uniquely for that node). Then add another node after the node has completed its action on the message that decrements one of those variables. Then you will have a count of the total number sent to each node and the number waiting to be handled by this node. You could log those once a second or whatever is appropriate, or even show them in real time on a dashboard. If all is working well then presumably they will all go up at the same rate and the number waiting for each node should never be more than one or two.
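The counting approach described above can be sketched like this (plain Node.js standing in for Node-RED; in a real flow the two counters would live in global context via `global.get`/`global.set`, and the key names are just illustrative):

```javascript
// Sketch of per-node counters: one counts total messages sent to server
// node N, the other counts messages currently in flight through it.
// A plain object stands in for Node-RED's global context here.

const ctx = {}; // stand-in for global context

function get(key) { return ctx[key] || 0; }
function set(key, value) { ctx[key] = value; }

// Function node placed BEFORE server node N:
function beforeNode(n) {
  set(`total_${n}`, get(`total_${n}`) + 1);
  set(`inflight_${n}`, get(`inflight_${n}`) + 1);
}

// Function node placed AFTER server node N has handled the message:
function afterNode(n) {
  set(`inflight_${n}`, get(`inflight_${n}`) - 1);
}

// Simulate three messages through node 7; the third is still in flight:
beforeNode(7); afterNode(7);
beforeNode(7); afterNode(7);
beforeNode(7);

console.log(get("total_7"), get("inflight_7")); // prints: 3 1
```

If one node locks up, its `inflight` counter climbs while the others stay near zero, which points straight at the culprit without logging every message.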

Another thought, presumably you have checked that the node-red servers are not running out of processor time (remembering that each node red server only has access to one core).

How do we check Node-RED processor time? Node-RED is installed on m5.large instances (2 vCPUs and 8 GiB RAM) running Ubuntu Server.

We are running this in production, so we are at a very critical stage right now where we are not able to monitor Node-RED performance on the server.

Is there any professional support we can get for this particular issue? We would also like to confirm whether our approach is correct for receiving real-time data and passing it via TCP to another server.

Simply put, we need performance monitoring for Node-RED TCP In, Node-RED TCP Out, and node-red-contrib-loadbalance, and to understand how they behave in a real-time data pipeline. Any professional support would also be fine.

Attached the architecture diagram.

If you are setting up such a sophisticated system but don't know how to check the processor utilisation, then I think you definitely need professional support.

If you are already in production and cannot debug on the live system, then you also need a parallel system that you can debug on.