Watchdog to restart NR

krambriw · 15 October 2018 05:36

I have below a simplified flow for monitoring that certain messages, filtered out by the Switch Node, are continuously received from a specific piece of equipment (hardware connected via serial port in another flow, but in the same NR instance).

My question: if NR "freezes" for some unknown reason, will the Trigger Node, once triggered, still fire and execute the command in the Exec Node? Or is everything dead from the "freezing" point?

TotallyInformation · 15 October 2018 13:01

Essentially, Node-RED, like the underlying NodeJS (and JavaScript in general), is largely single-threaded. So if anything stops or kills the process, the flows cannot continue.

However, it depends what you mean by "freezes" - there are lots of things that might go wrong with a flow or a node that wouldn't lock the process, since not blocking is a fairly core design principle for JavaScript/NodeJS applications.

In my, albeit limited, experience, very little causes NR to stop. As far as I can remember, as long as the startup process works, NR keeps going unless something causes an almighty crash and you can handle alerting and reporting for that as you would any other application or service.

Bottom line: it's impossible to know for sure whether the trigger will or won't fire. Best to use defensive programming to make sure your flow handles exceptions.
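As a rough illustration of what "defensive" could look like in plain JavaScript, here is a minimal sketch that wraps a message handler so a bad message is dropped instead of throwing. The names `safely` and `parseReading` are made up for this example and are not part of Node-RED; the JSON payload shape is also just an assumption.

```javascript
// Hypothetical helper: wrap a handler so exceptions are reported, not thrown.
function safely(handler, onError) {
  return function (msg) {
    try {
      return handler(msg);
    } catch (err) {
      onError(err, msg);
      return null; // drop the message rather than propagate the error
    }
  };
}

// Example handler (assumed payload format: a JSON string with a "value" field).
const parseReading = safely(
  (msg) => ({ payload: JSON.parse(msg.payload).value }),
  (err) => console.error('bad message dropped:', err.message)
);
```

Calling `parseReading({ payload: 'not json' })` logs the error and returns `null` rather than throwing.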

krambriw · 15 October 2018 14:21

Thanks Julian, yes, you are right. I was just wondering, since a timer would normally run in a separate thread (to my understanding), and I hoped that it would already have loaded the function code to be executed when it was triggered. As you said, defensive programming is the goal.

Anyway, if I can't control or find what is causing the sudden death, I can always build a watchdog script "outside" of NR, basically monitoring in the same way and, if needed, forcing a restart or reboot.
(I hate it of course, lipstick on a corpse.)
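A minimal sketch of such an external watchdog, run as its own Node.js process. Everything specific here is an assumption for illustration: the flow is assumed to touch a heartbeat file whenever a good message arrives, and the file path, the timeout and the restart command (`sudo systemctl restart nodered`) would all need adapting to the actual install.

```javascript
const fs = require('fs');
const { exec } = require('child_process');

// Assumed defaults, not taken from any real setup.
const DEFAULTS = {
  file: '/tmp/nodered-heartbeat',            // hypothetical heartbeat file
  timeoutMs: 60000,                          // how long silence is tolerated
  restart: () => exec('sudo systemctl restart nodered'), // assumed command
};

// True when the last heartbeat is older than the allowed timeout.
function isStale(lastSeenMs, nowMs, timeoutMs) {
  return nowMs - lastSeenMs > timeoutMs;
}

// Modification time of the heartbeat file; a missing file counts as "never seen".
function lastHeartbeatMs(file) {
  try {
    return fs.statSync(file).mtimeMs;
  } catch (err) {
    return 0;
  }
}

// One watchdog pass: restart if the heartbeat is stale. Returns true if it restarted.
function checkOnce(options = {}) {
  const { file, timeoutMs, restart } = { ...DEFAULTS, ...options };
  if (isStale(lastHeartbeatMs(file), Date.now(), timeoutMs)) {
    restart();
    return true;
  }
  return false;
}

// To run it continuously, something like: setInterval(checkOnce, 30000);
```

Note that a missing heartbeat file is treated as stale, so the watchdog should only be started once NR has had time to write its first heartbeat.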

TotallyInformation · 15 October 2018 14:29

Sorry, wasn't sure if you knew that. I don't know the internals of NodeJS and V8 well enough to predict how a setTimeout might behave, though if the parent thread crashes, I'd expect it would simply disappear. But because Node-RED is a service and does its best to keep on truckin', it can be hard to predict how it would behave.
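For what it's worth, the timer side can be checked with a small experiment outside Node-RED. In NodeJS a `setTimeout` callback runs on the same main thread as all other JavaScript, so if that thread is busy (a simulated "freeze"), the callback cannot fire until it is free again. A minimal sketch, where `busyWait` and `measureTimerLateness` are names invented for this example:

```javascript
// Block the event loop synchronously for roughly `ms` milliseconds.
function busyWait(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // spin: no other JavaScript can run on this thread meanwhile
  }
}

// Schedule a 10 ms timer, then block the loop for `blockMs`.
// Resolves with the time the timer actually took to fire.
function measureTimerLateness(blockMs) {
  return new Promise((resolve) => {
    const scheduledAt = Date.now();
    setTimeout(() => resolve(Date.now() - scheduledAt), 10);
    busyWait(blockMs);
  });
}

measureTimerLateness(200).then((elapsed) => {
  console.log(`10 ms timer fired after ${elapsed} ms`);
});
```

The reported time should be at least the blocking time, not 10 ms. This only models a busy loop; it says nothing about what happens when the process itself dies.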

Well an external watchdog is often a very sensible approach to reliable computing. Nobody can ever cover all eventualities. Putting in a catch node may help in some circumstances so may also be worth considering.

Taken to an extreme, you might even set up a second system with bridged MQTT brokers, making use of a heartbeat MQTT output from each system along with an LWT so that the brokers know when each system is on/off-line. Then you could have the same flow on both systems, with a gate at the start of the flow to block progress on one of the systems until it detects that the other system is offline.

Instant poor-man's high-availability.
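The gate logic on the standby system could be sketched roughly like this. It is plain JavaScript rather than an actual flow; `createGate` and its method names are hypothetical, and the wiring to the MQTT heartbeat and LWT topics is left out.

```javascript
// Hypothetical gate for the standby system: messages pass only while the
// peer looks offline (LWT received, or no heartbeat within peerTimeoutMs).
function createGate(peerTimeoutMs) {
  let lastPeerHeartbeat = null; // null means "peer not known to be alive"

  return {
    // Call when a heartbeat message from the other system arrives.
    onPeerHeartbeat(nowMs) {
      lastPeerHeartbeat = nowMs;
    },
    // Call when the broker delivers the other system's LWT message.
    onPeerOffline() {
      lastPeerHeartbeat = null;
    },
    // Returns the message if this system should act, otherwise null.
    filter(msg, nowMs) {
      const peerAlive =
        lastPeerHeartbeat !== null && nowMs - lastPeerHeartbeat <= peerTimeoutMs;
      return peerAlive ? null : msg;
    },
  };
}
```

Returning `null` mirrors how a Function node stops a message. One design choice to be aware of: until the first heartbeat is seen, the gate is open, so both systems could briefly act at startup.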

krambriw · 15 October 2018 14:42

That's a cool one!!! Have to consider if this is overkill for my home automation, but tempting I must say, just for the thrill!!!

drmibell · 15 October 2018 16:31

I sometimes use this sort of arrangement to patch developmental bits of code into and out of my deployed system for testing. No risk, no thrill.
