CPU load giving funny output from machine

Usually I am happy with what I am getting from the node for CPU usage on a RPI.

I have a couple of RasPi Model B rev 2.

When I am "flogging" one of them it shows clearly on the dashboard.

I am updating a Raspberry Pi 3 Model B Rev 1.2 and with the top command I am seeing 100% CPU load.

But the CPU load node is spitting out stuff like 75% usage.

This is a machine which I usually don't have online and haven't done much work on recently.

I've checked the flows in both machines.

Both are set as so:

[{"id":"1951f1d4.2480f6","type":"cpu","z":"675e227d.d158b4","name":"","msgCore":true,"msgOverall":false,"x":1190,"y":300,"wires":[["164c9f51.747fa1","7b6bc7a3.8b71e"]]}]

Yet one shows a way better CPU load than the other for what is going on.

So, back to the 3:
top is showing 100%.

The node is saying: 65%

Pictures to show:

[Screenshots: 2019-10-01 15-11-19 and 2019-10-01 15-11-30]

I had truly hoped that was a quad-core device with a load of 4.87 4.97 4.68. That means 4.87 average over the last minute, 4.97 average over the last 5 minutes, and 4.68 average over the last 15 minutes, where a load of 1.0 is the equivalent of a single core at 100%.
What 4.87 means on a single core like the pi is that 3.87 processes were waiting 100% of the time for a minute, for that 4th process to finish so the rest could continue. So the system is 387% overloaded compared to what it can handle.
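
How those three numbers read depends on how many cores the machine has: the usual rule of thumb is to divide the load average by the core count. As far as I know the Raspberry Pi 3 Model B is a quad-core board, so it is worth checking both readings. A minimal sketch, assuming Python 3 on the Pi; `load_per_core` is a hypothetical helper name, not part of any library:

```python
import os


def load_per_core(load_avg, cores):
    """Return the load average divided by the number of cores."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return load_avg / cores


# Figures quoted in the thread (1, 5 and 15 minute averages).
quoted = (4.87, 4.97, 4.68)

# Compare the single-core reading with the quad-core reading.
for cores in (1, 4):
    per_core = [round(load_per_core(x, cores), 2) for x in quoted]
    print(f"{cores} core(s): {per_core}")

# On the machine itself the live values can be read like this (Unix only).
try:
    live = os.getloadavg()
    n = os.cpu_count() or 1
    print("live per-core load:", [round(load_per_core(x, n), 2) for x in live])
except (AttributeError, OSError):
    print("load average is not available on this platform")
```

A per-core value above 1.0 means work is queuing; on four cores a load of 4.87 is still an overload, but a much milder one than on a single core.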

So the 11% displayed in the gauge looks a bit funny but chances are it’s so overloaded that it can’t properly update. Same for the temperature, 55 seems a bit low for 4.68 over 15 minutes.

Other machines with the same flow structure happily indicate 100% CPU load when they are being flogged.

Does this help ?

And what are you injecting into the gauge ? (I need to read: CPU load node)

Well, then I wouldn’t have a clue beyond that a nearly 400% overload for a long enough time (upwards of 15 minutes) means that you’ll be seeing those delays for a while longer. It might be a good idea to shut down unneeded processes, and to restart later any that you don’t need immediately.

The worst load average I’ve seen in my life was when an acquaintance worked as a database administrator and showed the load average charts for the 6-core database server from the moment ticket sales for a popular concert were opened: 24 22 20
Result: it was so busy with people attempting to buy a ticket that the database couldn’t handle the lookups fast enough, and the actual sale process slowed down on the website.

While it’s an unfair comparison, the same might happen on your machine: because the pi-updates package is pulling in so much CPU processing, the rest of the machine slows down to near zero, and other processes have to wait to be run or resumed. That includes incoming node-red events, and timers that are delayed because of it.

Well, kind of.

But I can't associate that with what I am seeing.

I was installing a program/package. I don't know why but it was SLOW AS . . . .

The NR flow was saying a normal CPU load. Say 30 - 70%.

But trying to type commands from the CLI was painful at best.

I did top and it shows me 100% CPU load.

So?

Well, as I said, if I do something like that on other machines (OK, not an RP3, but a Raspi 2 model B, or whatever it was I posted), Node-RED shows me 100% CPU load.

So making that the base line:
When I do it on this RP3, it should show the same. It isn't/doesn't.

So, I am curious as to what is going on.

A long time later, after the package was installed, I could type easily with no problems, and top showed 40% CPU load.

So. . . . .

What's going on?

When you say you did this on similar machines, did they just show a CPU load of 100% in a graphic tool, or a load average of 4+ in top too? Because 4.6 doesn’t just mean 100%, but 100% plus 360% waiting to be processed.

There’s a good analogy of a traffic jam: 0.5 means half the road is filled, but everything is driving. 1.0 means the entire road is filled, but it’s still driving. Above 1.0 means cars have to stop because they no longer fit on that amount of road, and have to wait until some cars have moved away. 4.68 over 15 minutes means that over the last 15 minutes the road was fully covered in cars, but there were also 3.68 times that amount of cars waiting to join them that could not move on. So everything they had to do was blocked until those first cars left.
https://scoutapm.com/blog/understanding-load-averages

Yes.

On the other machines when doing heavy things I see the cpu load go to 100%, or say 80+%

It helps me track what is going on and when it is complete.

What I mean is that seeing a CPU load of 100% does not mean the same as a load average that high, and you can’t compare the situations as “the same” unless you know the load average of those times. Say they were 1.20, or 1.60 or even 2.0 (100% load plus 1 equivalent 100% waiting), that’s still different from the extreme load average you’re seeing now.
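
To make the distinction concrete: a utilisation percentage is the share of time the CPU was not idle, so it tops out at 100%, while the load average also counts work that is queued. A minimal sketch of deriving a utilisation figure on Linux, assuming the standard field order of the aggregate `cpu` line in `/proc/stat` (user, nice, system, idle, iowait, irq, softirq, steal). The two sample lines are made-up tick counts for illustration, the function names are hypothetical, and this is not necessarily how the Node-RED cpu node or top arrives at its number:

```python
def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into (idle, total) ticks."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    ticks = [int(v) for v in fields[1:]]
    # Treat both idle and iowait as "not busy" (a choice made for this sketch).
    idle = ticks[3] + (ticks[4] if len(ticks) > 4 else 0)
    # Sum the first eight fields; guest time is already counted in user/nice.
    total = sum(ticks[:8])
    return idle, total


def utilisation(before, after):
    """Percentage of non-idle time between two (idle, total) samples."""
    idle_delta = after[0] - before[0]
    total_delta = after[1] - before[1]
    if total_delta <= 0:
        return 0.0
    return 100.0 * (1.0 - idle_delta / total_delta)


# Made-up samples, as if taken a short interval apart.
first = parse_cpu_line("cpu 100 0 50 800 50 0 0 0")
second = parse_cpu_line("cpu 300 0 150 900 50 0 0 0")
print(f"utilisation: {utilisation(first, second):.1f}%")
```

However long the run queue gets, this figure cannot rise above 100%, which is why two machines can both show "100%" while having very different load averages.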

I'm still missing something then.

On other machines which have the same part of the flow, they realistically indicate the CPU load on the gauge.

They are doing some heavy load task, it shows high CPU load.

With the same flow structure and node setting, on this machine, it isn't showing a similar output.

So: why isn't it?

Possibly because it is so heavily overloaded that it is not even able to keep up with updating the display. Are you running the browser on the pi (directly or via VNC or similar)? If so then even more likely it is not able to update the display.

Exactly what Colin just asked, but probably better than my attempts so far. Top does show that you have a graphical environment running, after all Xorg shows up. But based on the colouring of the terminal you took the screenshot from one of your Ubuntu machines, so that’s giving some hope that the browser screenshot was remote too.

Well, I can't say.

One of my machines - TimePi - is an OLD one. 2 USB ports and composite output on an RCA socket.

It does a fine job of updating its CPU load.

I also have a RPPZ (W) and it too does a great job. And it is single core.

The machine in question is a RPI3 and it was ONLY installing a package and you say it was incapable of updating the CPU load?

Once the package install was complete, it was again usable.

Yeah, ok.

All we are telling you is that the processor was very heavily overloaded. Why that should be is another issue. It is nothing to do with node-red as far as I can see. I note pi-packages is hogging the processor. It seems to have used 35 minutes of processor time in the last 52 minutes, so it is doing a lot. I don't know what that command is.
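
The arithmetic behind that observation, as a rough sketch: 35 minutes of processor time over 52 minutes of wall-clock time is about two thirds of one core. `cpu_share` is a hypothetical helper, and the four-core figure assumes the process time is spread over a quad-core board:

```python
def cpu_share(cpu_minutes, elapsed_minutes, cores=1):
    """Percentage of available CPU time that a process has consumed."""
    if elapsed_minutes <= 0 or cores < 1:
        raise ValueError("elapsed_minutes must be > 0 and cores >= 1")
    return 100.0 * cpu_minutes / (elapsed_minutes * cores)


# Figures read from the top screenshot, as quoted in the post.
print(f"share of one core:   {cpu_share(35, 52):.1f}%")
print(f"share of four cores: {cpu_share(35, 52, cores=4):.1f}%")
```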

Check the SD card on the Pi for errors. Disk errors will not show up in top, but the retries will have the highest priority. If you were installing at the time, this may be something to look at.

Craig

This is a misty area for me, as it is a remote and headless machine.

The first way I found needed to boot from grub and do these weird commands.

Another path which I followed says it is destructive and so you need to be careful.

So it is the boot SD card of a remote RPI.

What command would I use, please?

If you have (serious) errors on your SD card, you should see it during boot, but as it is remote you can use sudo dmesg.
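
The dmesg output is long, so it can help to filter it for lines that mention the SD card or filesystem trouble. A minimal sketch, assuming Python 3 on the Pi; the keyword list is my own guess at a reasonable starting point (SD cards show up as `mmc` devices in the kernel log) and `suspicious_lines` is a hypothetical helper. An empty result does not prove the card is healthy:

```python
import subprocess

# Assumed starting list of substrings worth a closer look; adjust as needed.
KEYWORDS = ("mmc", "i/o error", "ext4-fs error")


def suspicious_lines(log_text, keywords=KEYWORDS):
    """Return the log lines that contain any keyword (case-insensitive)."""
    hits = []
    for line in log_text.splitlines():
        lowered = line.lower()
        if any(word in lowered for word in keywords):
            hits.append(line)
    return hits


def read_dmesg():
    """Run dmesg and return its output, or an empty string if it fails."""
    try:
        result = subprocess.run(
            ["dmesg"], capture_output=True, text=True, check=False
        )
    except OSError:
        return ""
    return result.stdout


if __name__ == "__main__":
    # May need to be run with sudo, as suggested above.
    for line in suspicious_lines(read_dmesg()):
        print(line)
```

Lines naming an `mmc` device are normal during boot; it is the ones that also report errors or timeouts that deserve attention.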

To scan for errors, the boot partition needs to be mounted as read-only. The easiest method is to remove the SD card, mount it on another computer and use the fsck command on it (don't do this on the pi itself)

Doing the sudo dmesg I get this:

dmesg.txt (16.3 KB)

But looking at it, there doesn't seem to be a boot error.

It is easy to try a new card and see if it helps. Burn your image backup of the card onto a new one and try that.

Then there are no "serious" errors, but I still recommend running fsck on another system. Or try this (I haven't tried it, so don't blame me if something goes wrong)