Error detection really not "working"

Sorry folks, but I really am stuck with what is going on with error trapping.

To the best of my knowledge things are ok as they were.
Me, ever wanting to stir the hornet's nest, I am trying to make the flow better.

This is what I have found to be the part which is the problem:

Code below:

[Screenshot from 2019-09-27 21-03-29]

As is/was the red line was there.
All errors caught and passed to the next node.
"Name flow" names the flow.
The gate, though shown as open, should be closed.
That goes up to the other node (circled in red) and there is where the fun happens.

The idea is that usually it goes on through the gate and the set topic node basically does a bit more than the name flow node and passes on to the higher system.
But sometimes when I am working on a flow, I don't want to propagate these errors and just keep them local.

So the gate is closed. However, that other function node just enables a local indication of a flow error rather than passing it up higher.

Anyway, so I was doing housekeeping.
Originally I messed up and forgot to move the msg.error to msg.payload.
So down the stream the message was in msg.error, though I thought it was msg.payload.
(But again: anyway.)

I optimised it and set msg.payload = msg.error;
Then deleted msg.error; to keep things succinct.

This is the code for the two function nodes:

[{"id":"169525cc.6f8a0a","type":"function","z":"c3ec93b2.f3154","name":"Name flow","func":"const flow_name_ = flow.get(\"name\");\nmsg.topic = flow_name;\nmsg.payload = msg.error;\ndelete msg.error;\nreturn msg;\n","outputs":1,"noerr":0,"x":4400,"y":300,"wires":[["3a30b545.b56caa","9b7d5e17.fff498"]]},{"id":"9b7d5e17.fff498","type":"function","z":"c3ec93b2.f3154","name":"","func":"var name = flow.get('name');\nvar ok = global.get('Error_Flow_Pass');\nvar stop = global.get('Error_Flow_Stop');\nvar caught = global.get('Error_Detected');\nvar counter = context.get('counter')|| 0;\n\n//node.warn(\"Name of the flow \" + name);\n//  --  check if this is a CONTROL message.\nif (msg.topic == 'CONTROL')\n{\n    node.warn(\"Control signal received \" + msg.payload);\n    if (msg.payload == 'STOP')\n    {\n        context.set('notify',1);\n        msg.topic = name;\n        msg.colour = stop;\n    } else\n    {\n        context.set('notify',0);\n        msg.topic = name;\n        msg.colour = ok;\n        context.set('counter',0);\n        node.status({});\n    }\n    return msg;\n}\n\n//  --  Send YELLOW as the message if error detected.\n//\tThis is only if the error reporting is blocked.\nvar pass = context.get('notify') ||0;\nif (pass == 1)\n{\n    if (counter == 1)\n    {\n        // set msg.payload to indicate\n        node.status({fill:\"red\",shape:\"dot\",text:\"ERROR\"});\n        node.warn(\"Error detected check indication\");\n        msg.topic = name;\n        msg.colour = caught;\n        context.set('counter',2);\n    }\nreturn msg;\n}\n","outputs":1,"noerr":0,"x":4590,"y":230,"wires":[["65452684.31ed48"]]}]

Originally it was more like:
(from another machine and flow)

[{"id":"928681dd.e0c018","type":"function","z":"b4454b36.0344c","name":"Name flow","func":"var device_name = global.get('myDeviceName');\nmsg.topic =\"ERROR_REPORT/\" + device_name + \"/\" + flow.get(\"name\");\nnode.status({fill:\"red\",shape:\"dot\",text:\"ERROR\"});\nreturn msg;\n","outputs":1,"noerr":0,"x":4350,"y":220,"wires":[["3f278612.cb40ca","3f894e5e.00c832"]]},{"id":"3f894e5e.00c832","type":"function","z":"b4454b36.0344c","name":"","func":"var name = flow.get('name');\nvar ok = global.get('Error_Flow_Pass');\nvar stop = global.get('Error_Flow_Stop');\nvar caught = global.get('Error_Detected');\n\n//node.warn(\"Name of the flow \" + name);\n//  --  check if this is a CONTROL message.\nif (msg.topic == 'CONTROL')\n{\n    //node.warn(\"Control signal received \" + msg.payload);\n    if (msg.payload == 'STOP')\n    {\n        context.set('notify',1);\n        msg.topic = name;\n        msg.colour = stop;\n    } else\n    {\n        context.set('notify',0);\n        msg.topic = name;\n        msg.colour = ok;\n    }\n    return msg;\n}\n\n//  --  Send YELLOW as the message if error detected.\n//\tThis is only if the error reporting is blocked.\nvar pass = context.get('notify') ||0;\nif (pass == 1)\n{\n    // set msg.payload to indicate\n    node.warn(\"Error detected check indication\");\n    msg.topic = name;\n    msg.colour = caught;\nreturn msg;\n}\n","outputs":1,"noerr":0,"x":4530,"y":170,"wires":[["aff43e4d.57c668"]]}]

So, I changed the code from the second one to the first one.

And when I tested it (by generating an error) it all went pear shaped.

I am 99.9% sure that it is the second node, because I have cut it down to basically those two nodes - with the catch node.

I inject an error and yuck.

Though not exactly the same node layout, this is my test bed.
I have a delay node to attempt to rate limit the messages.
Note I have disconnected the link node and I have a debug node looking at what is sent.

The gate is closed so that part of the flow isn't in the problem - yet.

So I press the generate error button (bottom) and this is what happens.
(this is the code in that node:

var n = 23 + 56 / a;
//var n = 23 + 56 ;
msg.payload = n;
return msg;

)

NOTE The note.
If this is not there, I can press the button and I get an error.
I then press the NEXT button and I see the error in the debug node.

I put in the link as indicated by the note, DEPLOY and press the inject error button.
The machine dies.
Looking at the CLI/terminal from which I started NR (in safe mode) I see this:

27 Sep 21:02:12 - [warn] [function:Variable repeat time] Value set
27 Sep 21:02:18 - [warn] [vcgencmd:Other stuff] Previous vcgencmd command hasn't finished yet
27 Sep 21:17:41 - [warn] [function:f445a2f0.bdbfb8] Control signal received STOP
27 Sep 21:20:09 - [warn] [function:Button Colour] Node Configuration not set
27 Sep 21:20:12 - [info] Stopping modified nodes
27 Sep 21:20:12 - [info] Stopped modified nodes
27 Sep 21:20:12 - [info] Starting modified nodes
27 Sep 21:20:14 - [info] Started modified nodes

Alas not much to tell me what is going on.

But a remote machine which is monitoring the machine in question basically tells me NR is dead.

SSH'ing to the machine on a second terminal I get this:

pi@TimePi:~ $ ./node-red_status.sh 
● nodered.service - Node-RED graphical event wiring tool
   Loaded: loaded (/lib/systemd/system/nodered.service; enabled; vendor preset: enabled)
   Active: failed (Result: timeout) since Fri 2019-09-27 20:15:34 AEST; 1h 17min ago
     Docs: http://nodered.org/docs/hardware/raspberrypi.html
  Process: 250 ExecStart=/usr/bin/env node-red-pi $NODE_OPTIONS $NODE_RED_OPTIONS (code=killed, signal=KILL)
 Main PID: 250 (code=killed, signal=KILL)

Sep 27 20:14:04 TimePi systemd[1]: Stopping Node-RED graphical event wiring tool...
Sep 27 20:14:56 TimePi Node-RED[250]: 27 Sep 20:14:56 - [warn] [catch:40abdd07.f5fb44] Message exceeded maximum number of catc
Sep 27 20:14:56 TimePi Node-RED[250]: 27 Sep 20:14:56 - [error] [catch:40abdd07.f5fb44] RangeError: Maximum call stack size ex
Sep 27 20:14:57 TimePi Node-RED[250]: 27 Sep 20:14:57 - [warn] [function:Button Colour] Node Configuration not set
Sep 27 20:15:34 TimePi systemd[1]: nodered.service: State 'stop-sigterm' timed out. Killing.
Sep 27 20:15:34 TimePi systemd[1]: nodered.service: Killing process 250 (node-red) with signal SIGKILL.
Sep 27 20:15:34 TimePi systemd[1]: nodered.service: Main process exited, code=killed, status=9/KILL
Sep 27 20:15:34 TimePi systemd[1]: Stopped Node-RED graphical event wiring tool.
Sep 27 20:15:34 TimePi systemd[1]: nodered.service: Unit entered failed state.
Sep 27 20:15:34 TimePi systemd[1]: nodered.service: Failed with result 'timeout'.
lines 1-17/17 (END)

So, given it is dead:

Pressing ^C in the CLI which started NR in safe mode does nothing.

Stopping NR from the other CLI yields no result either.

I think this was brought about by me moving msg.error to msg.payload but I can't see anything which really points to that.

Yeah: My problem. But I am not seeing what is causing this.
I've narrowed it down to those couple of nodes but can't see how it is doing it.

Thanks.

first function node:

const flow_name_ = flow.get("name");
msg.topic = flow_name;

Yeah, and I shall admit I am not seeing anything wrong.

Oh. Ok the missing (or extra) _

Thanks.

I'll look at it.

Like this ?

Yeah, ok. So I stuffed up with the extra _.
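
For reference, a minimal sketch of the first function node with the stray underscore removed. The `flow` object here is a hypothetical stub (with a made-up flow name, "Test flow") so the snippet runs outside Node-RED; in a real function node `flow` and `msg` are supplied by the runtime and only the body of `nameFlow` is needed.

```javascript
// Hypothetical stand-in for Node-RED's flow context, only so this runs
// outside Node-RED. Inside a function node, `flow` is provided for you.
const flow = {
    store: { name: "Test flow" },
    get(key) { return this.store[key]; }
};

// Body of the "Name flow" function node, wrapped in a function for testing.
function nameFlow(msg) {
    const flow_name = flow.get("name"); // declared name now matches its use
    msg.topic = flow_name;
    msg.payload = msg.error;            // move the caught error into the payload
    delete msg.error;                   // drop the original property
    return msg;
}

const out = nameFlow({ error: { message: "a is not defined" } });
console.log(out);
```

With the names matching, the node no longer throws on its own, so the caught error simply travels on in msg.payload.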

But why am I not seeing any errors from that?

Particularly if I have a rate limit set so if it runs away with errors compounding errors, it would at least slow them down and not grind the machine to 100% CPU load.
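
A possible explanation, offered as a guess rather than a diagnosis: `flow_name` is never declared, so the Name flow node itself throws a ReferenceError. If the catch node also covers that node, each failure is caught and sent straight back into the node that just failed, which would fit the "Message exceeded maximum number of catc" warning in the log. A delay node placed downstream of Name flow would not slow such a loop, because the messages never get that far. The toy model below only illustrates the idea; `runWithCatch` and `MAX_CATCHES` are made-up names for this sketch and are not Node-RED's real implementation or limit.

```javascript
// Toy model only: none of this is Node-RED's real implementation.
const MAX_CATCHES = 10; // hypothetical cap chosen for this sketch

// The "Name flow" body with the typo left in place.
function brokenNameFlow(msg) {
    const flow_name_ = "Test flow";
    msg.topic = flow_name; // throws ReferenceError: flow_name is not defined
    return msg;
}

// Pretend catch node that is wired back into the handler that just failed.
function runWithCatch(handler, msg) {
    let catches = 0;
    while (catches < MAX_CATCHES) {
        try {
            return { msg: handler(msg), catches };
        } catch (err) {
            catches += 1;
            // Build the next message the way a catch would: error attached.
            msg = { payload: msg.payload, error: { message: err.message } };
        }
    }
    return { msg: null, catches }; // gave up: the handler never succeeded
}

const result = runWithCatch(brokenNameFlow, { payload: 42 });
console.log(result.catches, result.msg);
```

In this model the handler fails on every pass, so nothing ever reaches a debug node; the only sign of trouble is the cap being hit.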


To be honest, I have no idea what your flow is doing or what its purpose is (it is not explained). I have some problems digesting this topic; I only noticed the typo. That is where problems usually start.


No worries.

I'm stupid.

Learning though. That's a good thing. I guess being awake for 18 hours isn't helping.

I have read many of your posts... and many times you are claiming that you're stupid... but you are not! :slightly_smiling_face:

You put a lot of effort into this, and as your name states are trying to learn, trying to understand things. You are doing exactly the right thing, you ask questions, engage in conversation and try to solve problems.

Failing and making mistakes is a part of the learning process.... so please, don't sell yourself short! :sunglasses:


But please do get some sleep :zzz: :zzz:
Your brain will be better for it.
