Yet when I ask NR, it says the CPU load is at 100% and basically not moving.
It used to pulse from about 30% to 100% roughly every 20 seconds as it did its periodic work.
I get that. When it has a few things to do the load goes to 100%.
But now just stuck at 100% all the time?
Yet top isn't really showing that.
And as I just said, the machine had just hung when I got that screenshot.
This suggests something in your flows is stuck in a loop.
You can check that theory by:
1. Stop the Node-RED service: node-red-stop
2. Use top to verify NR has stopped and CPU usage is back at a normal level.
3. Start NR in safe mode: node-red --safe
4. Check top again.
If the CPU usage is still at the expected level, then the issue is with your flows.
Look for anywhere that might be looping in some form and add a debug node that also logs to the console (so it will appear in the output of node-red-logs).
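As a rough sketch of that kind of debug, here is a small message-rate counter that reports when a point in a flow is being hit suspiciously often. The names makeRateMonitor, limit, windowMs and report are made up for illustration and are not part of Node-RED. The block runs on its own under Node.js. Inside a function node you would keep the monitor in context and pass node.warn as the reporter, which, as far as I know, writes to both the debug sidebar and the log.

```javascript
// Hypothetical helper: returns a function that counts calls and reports
// once per window when more than `limit` calls arrive within `windowMs`.
function makeRateMonitor(limit, windowMs, report) {
    let windowStart = null;
    let count = 0;
    return function record(label, now = Date.now()) {
        // Start a new counting window when the old one has expired.
        if (windowStart === null || now - windowStart >= windowMs) {
            windowStart = now;
            count = 0;
        }
        count += 1;
        // Report only at the moment the limit is crossed, to avoid log spam.
        if (count === limit + 1) {
            report(`${label}: more than ${limit} messages in ${windowMs} ms`);
        }
        return count > limit;
    };
}

// Standalone demonstration: five calls at the same instant with a limit of 3.
const monitor = makeRateMonitor(3, 1000, console.log);
for (let i = 0; i < 5; i++) {
    monitor('demo', 0);
}

// Assumed usage inside a Node-RED function node (not runnable standalone):
//   let mon = context.get('mon');
//   if (!mon) {
//       mon = makeRateMonitor(50, 1000, text => node.warn(text));
//       context.set('mon', mon);
//   }
//   mon('after ping');
//   return msg;
```

The limit of 50 messages per second in the commented usage is only a guess; pick a number well above what that part of the flow should normally see.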
Yes, that would be something you could try. Otherwise, as I said, think about where in your flows you might be looping and add some debug to try to narrow it down.
It may be that your flows have been affected by the async message delivery changes that were in 1.0.
It is obvious the machine is running 100% because even trying to ssh to it is painful.
I can nearly go and make a cup of coffee in the time it takes to establish a connection.
That error message in the screenshot... is that something I should consider as a possible cause?
The only reason I am asking is once before a node was being problematic.
If I had it installed, it would kill the machine.
Uninstalling it, it would work ok.
Alas simply not using it didn't help.
A quick Google for kswapd0 will tell you that it is the process that manages swapping things in and out of memory... and once that gets busy you know you are running out of RAM. There are quite a few guides on things you can do to tune that a bit.
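If you want a quick look at how close to the edge the memory is, a minimal sketch like the one below reads /proc/meminfo and prints how much RAM is still available and whether any swap is configured. The helper names parseMeminfo and memorySummary are made up for illustration. It assumes a Linux system that exposes MemTotal, MemAvailable (or MemFree) and SwapTotal in that file.

```javascript
const fs = require('fs');

// Parse the text of /proc/meminfo into an object of { fieldName: kilobytes }.
function parseMeminfo(text) {
    const info = {};
    for (const line of text.split('\n')) {
        const match = /^(\w+):\s+(\d+)/.exec(line);
        if (match) {
            info[match[1]] = Number(match[2]);
        }
    }
    return info;
}

// Hypothetical summary: percentage of RAM still available, and whether
// any swap space is configured at all.
function memorySummary(info) {
    // Older kernels lack MemAvailable, so fall back to MemFree.
    const availableKb = info.MemAvailable !== undefined
        ? info.MemAvailable
        : info.MemFree;
    return {
        availablePercent: Math.round((availableKb / info.MemTotal) * 100),
        swapEnabled: (info.SwapTotal || 0) > 0,
    };
}

// Only read the real file where it exists (Linux).
if (fs.existsSync('/proc/meminfo')) {
    const text = fs.readFileSync('/proc/meminfo', 'utf8');
    console.log(memorySummary(parseMeminfo(text)));
}
```

A low availablePercent while kswapd0 is busy would fit the running-out-of-RAM picture, but it does not say which process is using the memory.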
Not wanting to sound more stupid than I am (though I don't think that is going to be . . . . )
That kind of stuff still is in the realm of magic and I am not confident to go into it blindly.
As I read it, it is not a good idea to have swap files active as they kill SD cards.
Though this link may not say that, I remember it because one day I was exploring the net and read something about swap files killing SD cards, and therefore how to disable them.
This was done WEEKS ago. I can't really associate this as a cause.
I have disabled it on ALL my RPIs and none of them (including the Zeros) seem to have any problems like this.
Also this is an RPI 2 quad(?) core. All the Zeros are single core.
It is not complicated. Something in your flow is clobbering the processor. Start it in safe mode and disable the code that triggers what you think is causing the problem (the pings or whatever). Deploy and check it is OK. Gradually re-enable bits until it goes wrong; then you will know where the problem is. It will not be the pings themselves; it will be something that follows on from the ping.