Sorry folks, but I really am stuck with what is going on with error trapping.
To the best of my knowledge things are ok as they were.
Me, ever wanting to stir the hornet's nest, I am trying to make the flow better.
This is what I have found to be the part which is the problem:
Code below:
As is/was the red line was there.
All errors caught and passed to the next node.
"Name flow" names the flow.
The gate, though shown as open, should be closed.
That goes up to the other node (circled in red) and there is where the fun happens.
The idea is that usually it goes on through the gate, and the set topic node basically does a bit more than the name flow node and passes on to the higher system.
But sometimes when I am working on a flow, I don't want to propagate these errors and just keep them local.
So the gate is closed. However, that other function node just enables a local indication of a flow error rather than passing it up higher.
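The local-indication behaviour described above can be sketched in plain JavaScript, outside Node-RED. This is only a rough sketch: `makeContext` and `handleMessage` are hypothetical names, the context store is a stub, and the colour strings are placeholders for whatever the real global values hold.

```javascript
// Hypothetical stand-in for the function node's context store.
function makeContext() {
    const store = new Map();
    return {
        get: (key) => store.get(key),
        set: (key, value) => { store.set(key, value); },
    };
}

// CONTROL messages switch local notification on or off.
// Any other message is only passed on while notification is on.
function handleMessage(msg, context, flowName) {
    if (msg.topic === "CONTROL") {
        const stop = msg.payload === "STOP";
        context.set("notify", stop ? 1 : 0);
        return { topic: flowName, colour: stop ? "stop" : "ok" }; // placeholder colours
    }
    if ((context.get("notify") || 0) === 1) {
        return { topic: flowName, colour: "caught" }; // placeholder colour
    }
    return null; // nothing is sent on
}

const context = makeContext();
const before = handleMessage({ topic: "x", payload: "boom" }, context, "Test flow");
handleMessage({ topic: "CONTROL", payload: "STOP" }, context, "Test flow");
const after = handleMessage({ topic: "x", payload: "boom" }, context, "Test flow");
console.log(before, after);
```

In a real function node, returning null means no message is sent, which is what keeps the error local until a STOP control message has been seen.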
Anyway, so I was doing housekeeping.
Originally I messed up and forgot to move the msg.error to msg.payload.
So down the stream the message was in msg.error, though I thought it was msg.payload.
(But again: anyway.)
I optimised it and set msg.payload = msg.error; then deleted msg.error to keep things succinct.
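As a minimal sketch of that housekeeping step, runnable outside Node-RED: `moveErrorToPayload` is a hypothetical helper name, the sample message values are made up, and the guard against a missing msg.error is an addition that is not in the original code.

```javascript
// Moves the catch node's error details from msg.error to msg.payload.
function moveErrorToPayload(msg) {
    if (msg.error !== undefined) { // guard added for this sketch only
        msg.payload = msg.error;
        delete msg.error;
    }
    return msg;
}

// Hypothetical message shaped like one coming out of a catch node.
const sample = {
    payload: "original payload",
    error: {
        message: "ReferenceError: a is not defined",
        source: { id: "abc123", type: "function" },
    },
};

const result = moveErrorToPayload(sample);
console.log(result.payload.message);
console.log("error" in result);
```

One thing to keep in mind with this pattern is that anything downstream that still looks at msg.error will no longer find it.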
This is the code for the two function nodes:
[{"id":"169525cc.6f8a0a","type":"function","z":"c3ec93b2.f3154","name":"Name flow","func":"const flow_name_ = flow.get(\"name\");\nmsg.topic = flow_name;\nmsg.payload = msg.error;\ndelete msg.error;\nreturn msg;\n","outputs":1,"noerr":0,"x":4400,"y":300,"wires":[["3a30b545.b56caa","9b7d5e17.fff498"]]},{"id":"9b7d5e17.fff498","type":"function","z":"c3ec93b2.f3154","name":"","func":"var name = flow.get('name');\nvar ok = global.get('Error_Flow_Pass');\nvar stop = global.get('Error_Flow_Stop');\nvar caught = global.get('Error_Detected');\nvar counter = context.get('counter')|| 0;\n\n//node.warn(\"Name of the flow \" + name);\n// -- check if this is a CONTROL message.\nif (msg.topic == 'CONTROL')\n{\n node.warn(\"Control signal received \" + msg.payload);\n if (msg.payload == 'STOP')\n {\n context.set('notify',1);\n msg.topic = name;\n msg.colour = stop;\n } else\n {\n context.set('notify',0);\n msg.topic = name;\n msg.colour = ok;\n context.set('counter',0);\n node.status({});\n }\n return msg;\n}\n\n// -- Send YELLOW as the message if error detected.\n//\tThis is only if the error reporting is blocked.\nvar pass = context.get('notify') ||0;\nif (pass == 1)\n{\n if (counter == 1)\n {\n // set msg.payload to indicate\n node.status({fill:\"red\",shape:\"dot\",text:\"ERROR\"});\n node.warn(\"Error detected check indication\");\n msg.topic = name;\n msg.colour = caught;\n context.set('counter',2);\n }\nreturn msg;\n}\n","outputs":1,"noerr":0,"x":4590,"y":230,"wires":[["65452684.31ed48"]]}]
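For readability, here is the body of the "Name flow" function from the export above, unescaped and run against a stubbed flow context. The `flow` stub, the wrapper function names and the test messages are assumptions for this sketch. As exported, the code declares flow_name_ but then reads flow_name, and outside Node-RED that throws a ReferenceError. Whether that is what happens on the Pi is not confirmed here.

```javascript
// Hypothetical stand-in for Node-RED's flow context.
const flow = { get: (key) => (key === "name" ? "Test flow" : undefined) };

// Body of "Name flow" as it appears in the export.
function nameFlowAsExported(msg) {
    const flow_name_ = flow.get("name");
    msg.topic = flow_name; // reads flow_name, which is never declared
    msg.payload = msg.error;
    delete msg.error;
    return msg;
}

// Same body with one variable name used consistently.
function nameFlowConsistent(msg) {
    const flow_name = flow.get("name");
    msg.topic = flow_name;
    msg.payload = msg.error;
    delete msg.error;
    return msg;
}

let thrown = null;
try {
    nameFlowAsExported({ error: { message: "test" } });
} catch (err) {
    thrown = err;
}
console.log(thrown && thrown.name);

const fixed = nameFlowConsistent({ error: { message: "test" } });
console.log(fixed.topic, fixed.payload.message);
```

If the node wired to the catch node throws like this, the catch node may end up catching errors from its own handler, so this is worth ruling out first.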
Originally it was more like:
(from another machine and flow)
[{"id":"928681dd.e0c018","type":"function","z":"b4454b36.0344c","name":"Name flow","func":"var device_name = global.get('myDeviceName');\nmsg.topic =\"ERROR_REPORT/\" + device_name + \"/\" + flow.get(\"name\");\nnode.status({fill:\"red\",shape:\"dot\",text:\"ERROR\"});\nreturn msg;\n","outputs":1,"noerr":0,"x":4350,"y":220,"wires":[["3f278612.cb40ca","3f894e5e.00c832"]]},{"id":"3f894e5e.00c832","type":"function","z":"b4454b36.0344c","name":"","func":"var name = flow.get('name');\nvar ok = global.get('Error_Flow_Pass');\nvar stop = global.get('Error_Flow_Stop');\nvar caught = global.get('Error_Detected');\n\n//node.warn(\"Name of the flow \" + name);\n// -- check if this is a CONTROL message.\nif (msg.topic == 'CONTROL')\n{\n //node.warn(\"Control signal received \" + msg.payload);\n if (msg.payload == 'STOP')\n {\n context.set('notify',1);\n msg.topic = name;\n msg.colour = stop;\n } else\n {\n context.set('notify',0);\n msg.topic = name;\n msg.colour = ok;\n }\n return msg;\n}\n\n// -- Send YELLOW as the message if error detected.\n//\tThis is only if the error reporting is blocked.\nvar pass = context.get('notify') ||0;\nif (pass == 1)\n{\n // set msg.payload to indicate\n node.warn(\"Error detected check indication\");\n msg.topic = name;\n msg.colour = caught;\nreturn msg;\n}\n","outputs":1,"noerr":0,"x":4530,"y":170,"wires":[["aff43e4d.57c668"]]}]
So, I changed the code from the second one to the first one.
And when I tested it (by generating an error) it all went pear-shaped.
I am 99.9% sure that it is the second node, because I have cut it down to basically those two nodes, plus the catch node.
I inject an error and yuck.
Though not exactly the same node layout, this is my test bed.
I have a delay node to attempt to rate limit the messages.
Note: I have disconnected the link node and I have a debug node looking at what is sent.
The gate is closed so that part of the flow isn't in the problem - yet.
So I press the generate error button (bottom) and this is what happens.
(this is the code in that node:
var n = 23 + 56 / a;
//var n = 23 + 56 ;
msg.payload = n;
return msg;
)
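A rough sketch of what that node does, wrapped so it can run outside Node-RED: the undeclared `a` makes the function throw a ReferenceError, which is what the catch node then picks up. Also shown is the explicit alternative, node.error(text, msg), which the Node-RED docs describe as the way to trigger catch nodes from a function. The `node` object here is a hypothetical stub that only records the call, and the function names are made up.

```javascript
// Body of the "generate error" node, wrapped in a function for testing.
function generateError(msg) {
    var n = 23 + 56 / a; // `a` is deliberately undeclared, so this throws
    msg.payload = n;
    return msg;
}

// Hypothetical stub of the function node's `node` object.
const reported = [];
const node = { error: (text, msg) => { reported.push({ text, msg }); } };

// Alternative: report the error explicitly instead of relying on a thrown exception.
function reportError(msg) {
    node.error("deliberate test error", msg);
    return null; // send nothing on the normal output
}

let caught = null;
try {
    generateError({});
} catch (err) {
    caught = err;
}
reportError({ payload: 1 });
console.log(caught && caught.name, reported.length);
```

The explicit form has the advantage that the error text is under your control, which makes it easier to spot in the debug output.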
NOTE The note.
If this is not there, I can press the button and I get an error.
I then press the NEXT button and I see the error in the debug node.
I put in the link as indicated by the note, DEPLOY and press the inject error button.
The machine dies.
Looking at the CLI/terminal from which I started NR (in safe mode) I see this:
27 Sep 21:02:12 - [warn] [function:Variable repeat time] Value set
27 Sep 21:02:18 - [warn] [vcgencmd:Other stuff] Previous vcgencmd command hasn't finished yet
27 Sep 21:17:41 - [warn] [function:f445a2f0.bdbfb8] Control signal received STOP
27 Sep 21:20:09 - [warn] [function:Button Colour] Node Configuration not set
27 Sep 21:20:12 - [info] Stopping modified nodes
27 Sep 21:20:12 - [info] Stopped modified nodes
27 Sep 21:20:12 - [info] Starting modified nodes
27 Sep 21:20:14 - [info] Started modified nodes
Alas not much to tell me what is going on.
But a remote machine which is monitoring the machine in question basically tells me NR is dead.
SSH'ing to the machine on a second terminal I get this:
pi@TimePi:~ $ ./node-red_status.sh
● nodered.service - Node-RED graphical event wiring tool
Loaded: loaded (/lib/systemd/system/nodered.service; enabled; vendor preset: enabled)
Active: failed (Result: timeout) since Fri 2019-09-27 20:15:34 AEST; 1h 17min ago
Docs: http://nodered.org/docs/hardware/raspberrypi.html
Process: 250 ExecStart=/usr/bin/env node-red-pi $NODE_OPTIONS $NODE_RED_OPTIONS (code=killed, signal=KILL)
Main PID: 250 (code=killed, signal=KILL)
Sep 27 20:14:04 TimePi systemd[1]: Stopping Node-RED graphical event wiring tool...
Sep 27 20:14:56 TimePi Node-RED[250]: 27 Sep 20:14:56 - [warn] [catch:40abdd07.f5fb44] Message exceeded maximum number of catc
Sep 27 20:14:56 TimePi Node-RED[250]: 27 Sep 20:14:56 - [error] [catch:40abdd07.f5fb44] RangeError: Maximum call stack size ex
Sep 27 20:14:57 TimePi Node-RED[250]: 27 Sep 20:14:57 - [warn] [function:Button Colour] Node Configuration not set
Sep 27 20:15:34 TimePi systemd[1]: nodered.service: State 'stop-sigterm' timed out. Killing.
Sep 27 20:15:34 TimePi systemd[1]: nodered.service: Killing process 250 (node-red) with signal SIGKILL.
Sep 27 20:15:34 TimePi systemd[1]: nodered.service: Main process exited, code=killed, status=9/KILL
Sep 27 20:15:34 TimePi systemd[1]: Stopped Node-RED graphical event wiring tool.
Sep 27 20:15:34 TimePi systemd[1]: nodered.service: Unit entered failed state.
Sep 27 20:15:34 TimePi systemd[1]: nodered.service: Failed with result 'timeout'.
lines 1-17/17 (END)
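The [warn] line from the catch node in the output above suggests a message going round the catch node more than once. The following is a toy model of that situation only, not Node-RED's actual implementation: it assumes a catch node whose handler can itself throw, with the new error routed back to the same catch node. `runCatchLoop` and `MAX_CATCHES` are made-up names and the cap of 10 is an assumption for illustration.

```javascript
// Toy model only: not Node-RED's real code.
const MAX_CATCHES = 10; // assumed cap for this illustration

// Simulates a catch node wired to a handler. If the handler throws,
// the new error is delivered to the same catch node again.
function runCatchLoop(handler, firstError) {
    let catches = 0;
    let error = firstError;
    while (error !== null) {
        if (catches >= MAX_CATCHES) {
            return { catches, stopped: "cap reached" };
        }
        catches += 1;
        try {
            handler({ error: { message: String(error) } });
            error = null; // handled cleanly, so the loop ends
        } catch (next) {
            error = next; // the handler failed, so the catch node fires again
        }
    }
    return { catches, stopped: "handled" };
}

const healthy = runCatchLoop((msg) => msg, new Error("boom"));
const broken = runCatchLoop(() => { throw new Error("handler failed"); }, new Error("boom"));
console.log(healthy, broken);
```

In this model a handler that works sees the error once, while a handler that throws keeps re-triggering the catch until the cap stops it.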
So, given it is dead: pressing ^C in the CLI which started NR in safe mode does nothing, and stopping NR from the other CLI yields no result either.
I think this was brought about by me moving msg.error to msg.payload, but I can't see anything which really points to that.
Yeah: My problem. But I am not seeing what is causing this.
I've narrowed it down to those couple of nodes but can't see how it is doing it.
Thanks.