Stops listening and restart won't work

#1

I’m not sure why this is happening, and I can’t see anything in the logs, but every second or third day (sometimes more often) the GUI stops responding. Port 1880 is still bound, there are no errors in the logs, and CPU, temperature and memory all look normal.

It runs on an RPi 3 B. Access over VNC and SSH still works, and the GUI is also unresponsive on the RPi itself (over VNC) using the local IP, so I have ruled out network issues.

Once the “unresponsiveness” happens, the only thing that helps is a reboot of the RPi. Restarting node-red doesn’t work; the restart just “hangs” and does nothing. kill -9 doesn’t stop the service either.

I am running it as a “service”:
```
# Auto restart on crash
Restart=on-failure
```
I have also tried disabling the “restart on crash” setting, but that doesn’t seem to change the behavior, and there is nothing in the logs to indicate that a restart and/or crash has occurred…

Any suggestions for what I can look at?

#2

Loads of questions:
Firstly what versions of everything? Post the output from node-red-log following a node red restart.
If you run the command top in a terminal does it show the cpu is being hogged?
Are you using MQTT? An MQTT loop can make node red appear to hang. When it hangs, try subscribing to all topics from the mqtt command line client and see whether anything is going on.
When you say a node-red restart doesn’t work, what does node-red-log show after you try to restart?
How are you restarting?

#3

Hi Colin! Thanks for your response and the excellent line of questions… :slight_smile:

Versions:

Welcome to Node-RED
===================
13 Jun 09:30:21 - [info] Node-RED version: v0.18.2
13 Jun 09:30:21 - [info] Node.js  version: v6.12.3
13 Jun 09:30:21 - [info] Linux 4.9.59-v7+ arm LE
13 Jun 09:30:24 - [info] Loading palette nodes
13 Jun 09:30:29 - [info] Dashboard version 2.8.0 started at /ui

Top:

%Cpu(s):  3.3 us,  0.2 sy,  2.7 ni, 93.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :   949580 total,   518456 free,   184604 used,   246520 buff/cache
KiB Swap:   102396 total,   102396 free,        0 used.   692032 avail Mem

Not using any MQTT.
I restart by first disabling the service with sudo systemctl disable nodered.service and then running node-red-stop, but it never stops. Using sudo makes no difference. kill and kill -9 on the node-red process don’t stop it either; only a reboot does…

#4

Is the top from when it hangs? I am never sure what the %Cpu(s) line tells you, but I presume you have checked that no process is consuming lots of cpu (when it hangs).
If sudo kill -9 doesn’t work then something odd is going on. When it hangs and sudo kill -9 node-red does not work what does ps -l <pid> show (where <pid> is the pid of node-red, which you can get from ps au|grep "node-red")?
You don’t need to disable node-red to stop or restart it, just use

sudo systemctl stop nodered
sudo systemctl start nodered

or just

sudo systemctl restart nodered

You don’t need the .service either.
Can you stop/start and kill -9 node-red when it is apparently operating normally?

#5

The CPU usage in top varies a bit when it “hangs” but stays below 20% and idles at about 5%.

It stops and restarts normally when not in “hang mode”. Here are the latest logs from the last hang:

14 Jun 21:04:30 - [error] [http request:Update outside-temp] Error: socket hang up
14 Jun 21:04:30 - [error] [http request:Update outside-temp] Error: socket hang up
Stopping Node-RED graphical event wiring tool....
nodered.service: State 'stop-sigterm' timed out. Killing.
nodered.service: Killing process 336 (node-red) with signal SIGKILL.
nodered.service: Killing process 926 (nrgpio) with signal SIGKILL.
nodered.service: Killing process 928 (python) with signal SIGKILL.
nodered.service: Killing process 929 (nrgpio) with signal SIGKILL.
nodered.service: Killing process 931 (python) with signal SIGKILL.
nodered.service: Killing process 932 (nrgpio) with signal SIGKILL.
nodered.service: Killing process 934 (nrgpio) with signal SIGKILL.
nodered.service: Killing process 935 (python) with signal SIGKILL.
nodered.service: Killing process 937 (python) with signal SIGKILL.

I tried with a sudo systemctl restart nodered when the above happened…

The Raspberry Pi is otherwise acting normally, and there is no visible lag or any other “strangeness” when node-red hangs.

#6

I use “sudo service nodered restart” (or stop or start), and it works for me on a Raspberry Pi 3.

Regards

#7

Are you only getting that error when it hangs?

#8

I believe that does the same thing as the systemctl command

#9

Yes, otherwise it is working normally

#10

So the http request is failing with a timeout or something and that is locking up node red in such a way that you cannot even kill the process. What node are you using for the http request?

#11

Ah, no, sorry. I meant that it works fine for some time and [http request:Update outside-temp] updates as it should, but then it “hangs” and the socket hang up errors start.

#12

Sorry I am not following you. Are you saying that the socket hangup does not happen at the same time as node red hangs?

#13

Sorry to confuse you… :slight_smile:
I don’t know if the “hanging” happens before or after the socket hangups start (I think it hangs first, but am not sure).
I have removed the HTTP request nodes to test and it still hangs.

#14

I don’t think you answered that question did you?

Also are you doing any file access from the flow, possibly with a remote or removable drive?

#15
ps -l 314
F S   UID   PID  PPID  C PRI  NI ADDR SZ WCHAN  TTY        TIME CMD
4 D  1000   314     1  0  85   5 - 47349 -      ?          1:00 node-red

Here is the output during the “hang”…

#16

Note that the state (under S) is shown as D, which means the process is in uninterruptible sleep, usually waiting on I/O inside the kernel. Signals are not delivered until that wait ends, hence kill -9 will not work. You did not answer my other question: are you doing any file access from the flow, possibly with a remote or removable drive? If I were a betting man I would put a small wager on the possibility that you are accessing an NFS drive.
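
For anyone wanting to check the same thing without reading ps columns, here is a minimal sketch that reads the state straight from /proc. It assumes a Linux /proc filesystem; the function names proc_state and list_blocked are made up for illustration and are not part of Node-RED or any package.

```shell
#!/bin/sh
# Print the one-letter state (R, S, D, Z, T, ...) of a PID, taken from the
# "State:" line of /proc/<pid>/status. Prints nothing if the PID is gone.
proc_state() {
    awk '/^State:/ { print $2; exit }' "/proc/$1/status" 2>/dev/null
}

# List every process currently in state D (uninterruptible sleep), together
# with the kernel wait channel when the kernel exposes one.
list_blocked() {
    for dir in /proc/[0-9]*; do
        pid=${dir#/proc/}
        [ "$(proc_state "$pid")" = "D" ] || continue
        name=$(cat "$dir/comm" 2>/dev/null || true)
        wchan=$(cat "$dir/wchan" 2>/dev/null || true)
        printf '%s %s %s\n' "$pid" "${name:-?}" "${wchan:-?}"
    done
}
```

With the PID from the post above, proc_state 314 should print the same letter that ps -l shows under S. Running list_blocked during a hang shows whether node-red is the only stuck process or whether other processes are stuck in D as well, which would point more towards the storage or a driver than towards the flow. The wait channel may come back as 0 or empty on some kernels, so treat it as a hint only.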

#17

Sorry to disappoint, but no NFS or file access…
I have an Arduino Nano on USB with two analog sensors hooked up, plus six 1-Wire DS18B20 sensors and one DHT21 on GPIO, all shown on a dashboard.

#18

Personally I’d swap out the sdcard and see if the problem reappears.
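
Before swapping the card it may be worth looking at the kernel log during a hang. The sketch below filters log text for lines that often accompany blocked I/O. The pattern list is my own guess at useful keywords, not an exhaustive or official list, and filter_io_trouble is a made-up name.

```shell
#!/bin/sh
# Keep only kernel log lines that commonly show up with blocked I/O:
# hung-task warnings, block-layer I/O errors, mmc (SD card) errors and
# ext4 filesystem errors. Reads from stdin, prints matching lines.
filter_io_trouble() {
    grep -i -E 'blocked for more than|I/O error|mmc[0-9]+:.*(error|timeout)|EXT4-fs error' || true
}

# Typical use on the Pi while node-red is hung:
#   dmesg | filter_io_trouble
```

If this prints nothing during a hang, the card is not cleared of suspicion, but mmc or ext4 errors showing up would make the swap a much better bet.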

#19

Also it could be a power supply issue, is the PSU up to the job?
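
One way to check this on a Pi is vcgencmd get_throttled, which prints a bitmask such as throttled=0x0. Here is a small sketch that decodes the under-voltage and throttling bits. It assumes the Raspberry Pi firmware tools are installed; decode_throttled is a made-up helper name, and only four of the documented bits are decoded.

```shell
#!/bin/sh
# Decode the value printed by `vcgencmd get_throttled`.
# Accepts either "throttled=0x50005" or a bare "0x50005".
decode_throttled() {
    value=$(( ${1#throttled=} ))
    [ $(( value & 0x1 )) -ne 0 ]     && echo "under-voltage now"
    [ $(( value & 0x4 )) -ne 0 ]     && echo "throttled now"
    [ $(( value & 0x10000 )) -ne 0 ] && echo "under-voltage has occurred since boot"
    [ $(( value & 0x40000 )) -ne 0 ] && echo "throttling has occurred since boot"
    return 0
}

# Typical use on the Pi:
#   decode_throttled "$(vcgencmd get_throttled)"
```

No output means none of these four flags is set. Any “has occurred since boot” line suggests the supply or the cable sagged at some point, even if everything looks fine right now.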

#20

It’s a 4 A supply; I had problems with a 2 A one.