Inject node runs late sometimes

Follow up: second day. The production machine sent the Telegram message precisely at 17:00. In line with the observation from the OP. The delay is not consistent. It may not happen. I will keep watching.

Hi Andrei.

Um, yes, but you have a huge part of the equation over which you have no control.

The sending of the SMS. I have heard of people getting an SMS days later!
I too have had time variations between when an SMS is sent and when it is received.

This testing is using Telegram messages, not SMS. I am pretty sure Telegram is not contributing to delay messages.

I have been receiving Telegram msgs from RPi consistently for well over a year. All have been received in Telegram within a few seconds of the specified time. All messages are marked with the correct time from NR.

Or are we just talking about the OPs original flow?

I just used Telegram as a way to test a production machine for which I normally don´t have direct access. The subject of the topic is to understand what could cause the inject node to trigger a flow with a considerable delay (several minutes). This issue was reported in the OP. I am monitoring a machine that is facing the same issue.

Thanks for the summary Andrei. Would you be able to share your flow and I will run it up on my system as well.

Hi @Bobo, the flow is an inject node programmed to the time of your best convenience followed by an exec node that gives the date + time, followed by a debug node. I just change my exec node with the linux top command in batch mode. This way I will get date + time + uptime + average load. Very useful for troubleshooting.

top -b | head -n 1

Tip: You get almost the same output with just uptime (and a slightly cleaner one without "top" in the begnning).

2 Likes

Tip: You get almost the same output with just uptime (and a slightly cleaner one without "top" in the begnning).

Indeed. I wanted however to receive some additional information on what the average CPU load looks like. I am considering also to gather memory usage data. I am puzzled why the very same flow reports no delay in a development machine whereas delays are reported in a production machine.

I believe I have a reproducible, and explainable, way of showing this problem.
I configured the simple flow below into a pi. It simply uses an inject set to fire at 12:00 and logs that to the system log, including the current timestamp value in human readable form. Then, at about 10:30, I powered the pi down (I made sure the power was removed, not just halting the pi, I don't know whether it has a hardware clock that might keep going if the power is left on).
At about 11:50 I disconnected the routers connection to the internet, so the pi would not be able to get a timesync, and powered the pi up. After it was up and running I ran node-red-log and left it monitoring the log. Looking in the log it can be seen that the pi had restarted with the clock set to the time it was when it was powered down, so it was showing 10:30 instead of 12:50.
I then reconnected the router to the internet and a short time later saw, in syslog, the clock synchronising.
At 12:00 the inject node did not trigger, and in fact it triggered at 13:23. The timestamp in the log and in the logged message both showed 13:23.

I imagine what happened is that node red started up and the code invoked by the inject node looked at the time in the system clock (10:30) and decided it had to wait 90 minutes before firing, and that the fact that the clock later changed was not taken into account, so it did in fact fire 90 minutes later, at 13:23

image

[{"id":"d586b6c6.2299a","type":"inject","z":"21667d1e.a5b9d2","name":"12:00","topic":"","payload":"","payloadType":"date","repeat":"","crontab":"00 12 * * *","once":false,"onceDelay":0.1,"x":140,"y":340,"wires":[["62af8270.b4f564"]]},{"id":"8aaa8ddb.8ce168","type":"debug","z":"21667d1e.a5b9d2","name":"Daily test debug 12:00","active":true,"tosidebar":true,"console":true,"tostatus":false,"complete":"payload","targetType":"msg","x":510,"y":400,"wires":[]},{"id":"62af8270.b4f564","type":"function","z":"21667d1e.a5b9d2","name":"Human readable","func":"msg.payload = new Date(msg.payload)\nreturn msg;","outputs":1,"noerr":0,"x":370,"y":340,"wires":[["8aaa8ddb.8ce168"]]}]
31 Jul 10:30:28 - [info] Node-RED version: v1.0.3
31 Jul 10:30:28 - [info] Node.js  version: v12.18.3
31 Jul 10:30:28 - [info] Linux 5.4.51-v7l+ arm LE
31 Jul 10:30:29 - [info] Loading palette nodes
31 Jul 10:30:33 - [info] Settings file  : /home/me/.node-red/settings.js
31 Jul 10:30:33 - [info] Context store  : 'default' [module=memory]
31 Jul 10:30:33 - [info] User directory : /home/me/.node-red
31 Jul 10:30:33 - [warn] Projects disabled : editorTheme.projects.enabled=false
31 Jul 10:30:33 - [info] Flows file     : /home/me/.node-red/flows_eeyore.json
31 Jul 10:30:33 - [info] Server now running at http://127.0.0.1:1880/
31 Jul 10:30:33 - [warn]
---------------------------------------------------------------------
Your flow credentials file is encrypted using a system-generated key.
If the system-generated key is lost for any reason, your credentials
file will not be recoverable, you will have to delete it and re-enter
your credentials.
You should set your own key using the 'credentialSecret' option in
your settings file. Node-RED will then re-encrypt your credentials
file using your chosen key the next time you deploy a change.
---------------------------------------------------------------------
31 Jul 10:30:33 - [info] Starting flows
31 Jul 10:30:33 - [info] Started flows
31 Jul 13:23:16 - [info] [debug:Daily test debug 12:00]
2020-07-31T12:23:16.496Z

4 Likes

Hi @Colin, I think you nailed it !!

I am still collecting data but what I have in hands confirms your assumptions. The root cause of the issue in my case seems to be the power outages suffered by the pi. There is no delay when there is no power cut.

Also it makes sense considering the way that the fake clock works in the pi. This fake clock is the artifact that solves the timekeeping problem a stock pi does not have a hardware clock (RTC).

Some extracts about the clock sync daemon:

systemd-timesyncd is a daemon that has been added for synchronizing the system clock across the network. It implements an SNTP client.

The daemon saves the current clock to disk every time a new NTP sync has been acquired, and uses this to possibly correct the system clock early at bootup, in order to accommodate for systems that lack an RTC such as the Raspberry Pi and embedded devices, and make sure that time monotonically progresses on these systems, even if it is not always correct.

The daemon touches file below as a way to save the time. The modification time of this file indicates the timestamp of the last successful synchronization

pi@pi:~ $ ls -l /var/lib/systemd/timesync

-rw-r--r-- 1 systemd-timesync systemd-timesync 0 Jul 31 10:26 clock
pi@pi:~ $

References:

https://manpages.debian.org/buster/systemd/systemd-timesyncd.service.8.en.html

https://wiki.archlinux.org/index.php/systemd-timesyncd

The corollary to my post is that what I described will happen if node-red starts up before the time is synced, which might happen if the wifi is slow to startup, In the case of a power fail it may well be that node-red gets running quicker than the router does.

As a workaround one could use one of the scheduling nodes, but one would have to test to see if they suffer from the same problem.

Right, now comes the question how to avoid the issue.

I am considering to test a function node to inject time. It would use cron.js but would first force a time sync. Probably there are some smart way but anyways it is a good opportunity to do some practice with the cron library, which I don´t use since ages.

PS: not sure if my idea makes sense. Forcing a sync requires the network to be up. It may not be smart to make the flow wait for the network. Considering to add a RTC to each pi.

You could just have an inject every minute and check the time manually to perform the action. Or try the other scheduling nodes such as node-red-cron-plus.

Unfortunately (I believe) the library under the hood will have the same issue. I'm gonna raise an issue with the Dev.

@ristomatti, I see now. Good point. I was using uptime -p which provides only the uptime. Using uptime is better as it provided uptime + average load + current time. Exactly what I want and simpler. Thank you very much !

1 Like

I have updated cron-plus node (not released / still testing atm) to handle clock changes.

I realise posting this here might be considered a little off topic (as the topic is about the inject node) but if what I have done works for cron-plus, the idea could be carried to the inject node? That said, if desired / preferred, I don't mind this being split off to a new thread

So...

Having failed to find anything in node-js that alerts the runtime to a system time change, I have added some code to monitor system time. In basic terms, every second, the clock is checked. If it is out by +/-5 seconds since the last time it checked, it considers the system time has changed and instructs all schedules in the node(s) to refresh. NOTE: I would appreciate thoughts on system time monitoring method, the duration and frequncy chosen etc. This is the code...

        const MAX_CLOCK_DIFF = 5000;
        var clockMonitor = setInterval(function timeChecker() {
            var oldTime = timeChecker.oldTime || new Date();
            var newTime = new Date();
            var timeDiff = newTime - oldTime;
            timeChecker.oldTime = newTime;
            if (Math.abs(timeDiff) >= MAX_CLOCK_DIFF) {
                node.log("System Time Change Detected!");
                refreshRunningTasks(node); // << this is where I refresh the cron tasks
            }
        }, 1000);

I have tried capturing a demo (where I set the time back & you can see the events occur on time) as a video/gif using ShareX however, ironically, it cant handle the system time changing during recording :man_shrugging: - so here is 3 screen shots with annotations instead...

1 Like

Great work! Maybe you should consider creating a PR to the cron library instead?

I already have (for the cron lib I use)

I dont use the same CRON lib as the inject node as doesn't support some of the extended cron capabilities I wanted (in particular, year parameter and date sequences)

1 Like

Thanks everyone for your help on this.
It was nice to follow through your various logical deductions that found the likely root cause. :wink:

The reproduced example by Colin would definitely explain what happened to me. I remember my original problem started happening during a period of router instability (e.g. router rebooting 30 times a day because of a poorly behaving Google Home) and also after I powered off the pi for periods of time to fit a new 3D printed Node Red case etc.

1 Like