CPU spikes on Raspberry 4 every hour

WhiteLion · 28 January 2022 16:43

Hi folks,
I have a problem with lag spikes / high CPU usage on my Raspberry Pi 4, which occur about every hour or so (see screenshot).

I have some pretty big flows for my switches (18 x ~400 nodes).
But the CPU runs nice and smooth most of the time. I did not trigger any events that would cause these flows to run and create CPU load when the spikes appear. But when I disable half of them (as a test), it seems to reduce the spikes (see the smaller spike in the middle of the screenshot). So I guess the problem comes from them. The question is: why do these flows cause the spikes while they are not running or doing anything? Are there internal tasks (or similar) in Node-RED that cause this?
Thanks!

Colin · 28 January 2022 17:21

Are the spikes definitely coming from Node-RED?

What is the scale on the graph?

WhiteLion · 28 January 2022 17:54

I am pretty sure they do, since I tested by disabling flows and watching the impact on the spikes.

I use the "loadavg" node (node-red-contrib-os) to measure the load.

Description:
The load average is a measure of system activity, calculated by the operating system and expressed as a fractional number. As a rule of thumb, the load average should ideally be less than the number of logical CPUs in the system.
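
To make the rule of thumb above concrete, here is a minimal Python sketch that divides the 1-minute load average by the number of logical CPUs. The helper names `load_per_cpu` and `current_load_per_cpu` are made up for illustration; `os.getloadavg()` and `os.cpu_count()` are standard-library calls (the former is Unix only). Note that load average counts runnable (and, on Linux, uninterruptible) tasks, so it is not the same thing as CPU utilisation in percent.

```python
import os


def load_per_cpu(load_1min: float, logical_cpus: int) -> float:
    """Normalise a load average by the number of logical CPUs.

    A result below 1.0 matches the rule of thumb quoted above.
    """
    if logical_cpus < 1:
        raise ValueError("logical_cpus must be at least 1")
    return load_1min / logical_cpus


def current_load_per_cpu() -> float:
    """Read the live 1-minute load average and normalise it (Unix only)."""
    one_min, _five_min, _fifteen_min = os.getloadavg()
    return load_per_cpu(one_min, os.cpu_count() or 1)
```

On a Raspberry Pi 4 with four cores, a 1-minute load average of 2.0 would give `load_per_cpu(2.0, 4) == 0.5`, which is still under the rule-of-thumb limit.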

HaroldPetersInskipp · 28 January 2022 17:56

Even if you're pretty sure, I'd check the logs for more info.

WhiteLion · 28 January 2022 18:10

I'll set the log level to trace and post the log here.

Colin · 28 January 2022 19:13

Also run top or htop and see what it says.

How often are you updating the chart?

WhiteLion · 28 January 2022 19:54

I looked at the log (trace mode) while the CPU was spiking and nothing was happening, just the update of the CPU temperature ("vcgencmd measure_temp") every 30 seconds.

Neither the chart updates nor anything else I do in the flows matches these CPU spikes.
Different charts are active when different events are triggered. It's hard to say; maybe there is one chart update per second at most. But they are not all active or viewed at the same time. On the page where I measure CPU load, only three charts are active at a time.

I'll try htop again. Is there a way to have htop create a useful log? Hmm.
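
Since htop is an interactive tool, one option is to log CPU usage separately. Below is a minimal sketch, assuming a Linux system, that derives the overall CPU busy percentage from two samples of the aggregate `cpu` line in `/proc/stat` and prints a timestamped line per sample. The function names and the 5-second interval are assumptions chosen for illustration, not something from this thread.

```python
import time
from typing import Tuple


def parse_cpu_line(line: str) -> Tuple[int, int]:
    """Return (total_jiffies, idle_jiffies) from the aggregate 'cpu' line."""
    # Fields after the label: user nice system idle iowait irq softirq steal
    fields = [int(x) for x in line.split()[1:9]]
    idle = fields[3] + fields[4]  # idle + iowait
    return sum(fields), idle


def busy_percent(prev_line: str, cur_line: str) -> float:
    """CPU busy percentage between two samples of the 'cpu' line."""
    prev_total, prev_idle = parse_cpu_line(prev_line)
    cur_total, cur_idle = parse_cpu_line(cur_line)
    total_delta = cur_total - prev_total
    if total_delta <= 0:
        return 0.0
    idle_delta = cur_idle - prev_idle
    return 100.0 * (total_delta - idle_delta) / total_delta


def read_cpu_line(path: str = "/proc/stat") -> str:
    """Read the first line of /proc/stat (the aggregate over all cores)."""
    with open(path) as f:
        return f.readline()


def log_cpu(samples: int, interval_s: float = 5.0) -> None:
    """Print one timestamped busy-percentage line per interval."""
    prev = read_cpu_line()
    for _ in range(samples):
        time.sleep(interval_s)
        cur = read_cpu_line()
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        print(f"{stamp} cpu_busy={busy_percent(prev, cur):.1f}%", flush=True)
        prev = cur
```

Calling `log_cpu(samples=720)` and redirecting the output to a file would cover about an hour at 5-second resolution, enough to see whether a spike shows up outside Node-RED's own measurement.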

TotallyInformation · 28 January 2022 20:12

Well if you are already running InfluxDB and Grafana, I'd recommend also running Telegraf. It isn't a big overhead and it will easily capture lots of useful device performance information into its db. You can then monitor it using Grafana.

Otherwise, since the spike is so regular, just have a gander at htop around the time it is happening. Check whether it really is a service spike and, if so, which service is causing it. Also check for spikes in swap usage, as that can easily trigger a CPU spike when running off an SD card.
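
The swap check mentioned above can be sketched in Python by reading the `SwapTotal` and `SwapFree` fields from `/proc/meminfo` (both reported in kB on Linux). The function names are hypothetical, and this is only one possible way to do it; `free -h` or htop's swap meter show the same figures.

```python
def parse_meminfo(text: str) -> dict:
    """Map each /proc/meminfo key to its integer value (kB for most keys)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            values[key.strip()] = int(parts[0])
    return values


def swap_used_kb(text: str) -> int:
    """Swap currently in use, in kB, from /proc/meminfo contents."""
    info = parse_meminfo(text)
    return info["SwapTotal"] - info["SwapFree"]


def read_swap_used_kb(path: str = "/proc/meminfo") -> int:
    """Read the live value on a Linux system."""
    with open(path) as f:
        return swap_used_kb(f.read())
```

Logging `read_swap_used_kb()` alongside the CPU load would show whether swap usage jumps at the same moments as the spikes.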

edje11 · 29 January 2022 13:49

If you are using InfluxDB, it's possible that it generates the spikes due to its constant queries against the system database.
To significantly reduce the CPU (spike) load from InfluxDB, you can disable the database tables that InfluxDB uses internally.

Edit /etc/influxdb/influxdb.conf and find the section [monitor]
Uncomment the line store-enabled and give it the value false.

[monitor]
  # Whether to record statistics internally.
  store-enabled = false

  # The destination database for recorded statistics
  # store-database = "_internal"

  # The interval at which to record statistics
  # store-interval = "10s"

Restart InfluxDB: sudo service influxdb restart

InfluxDB will now stop using the tables for internal use.
Note from the InfluxDB website:

Set to false to disable recording statistics internally. If set to false it will make it substantially more difficult to diagnose issues with your installation.

Colin · 29 January 2022 14:19

It could possibly be InfluxDB, but first @WhiteLion should inspect the htop output when it is spiking in order to get a better idea of what is going on.

WhiteLion · 29 January 2022 17:13

I am still trying to find out. htop paints a different picture than the "loadavg" node (node-red-contrib-os). In htop I could see only a spike lasting about 1 second. That is caused by an ARP request (pinging every 10 seconds for presence detection). The curve that loadavg draws doesn't make sense to me. My theory is that this could be the case:

Every 10 seconds: CPU spike from the ARP request.
Every 60 seconds: CPU load measured by loadavg.
At some point these two events coincide (maybe because of a delay in the trigger; both use an inject node), and then the high-load curve is drawn.

But I could be totally wrong with that.

I don't use InfluxDB. I use MariaDB for only two values. All the rest of my config is a JSON object saved to the SD card every 10 minutes.

Colin · 29 January 2022 17:51

If htop is not showing significant CPU usage then you don't have a problem.

TotallyInformation · 29 January 2022 23:53

Not surprising to be honest. A node is a long way away from the actual OS - at the wrong end of a complex platform running over a high-level language interpreter.

Where I need to run things like ARP checks, I usually run a dedicated script via CRON and feed the data into Node-RED. I actually run an NMAP scan every 15 minutes to see what is on the network. At the end of the script it calls a web endpoint which is defined in Node-RED and a flow aggregates the data into a table.
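
The pattern described above (an external script run from cron that hands its results to Node-RED over HTTP) might look roughly like the following Python sketch. Everything specific here is an assumption for illustration: the endpoint path `/network-scan`, the port 1880 (Node-RED's default), the JSON payload shape, and the function names. The actual script and flow are not shown in this thread, and the scan itself (nmap, ARP, etc.) is left out.

```python
import json
import urllib.request

# Hypothetical endpoint: an "http in" node in Node-RED listening for POSTs.
NODE_RED_URL = "http://localhost:1880/network-scan"


def build_scan_request(hosts, url=NODE_RED_URL):
    """Build a JSON POST request carrying the list of hosts that were seen."""
    body = json.dumps({"hosts": hosts}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_scan(hosts, url=NODE_RED_URL, timeout=10):
    """Send the scan result to Node-RED and return the HTTP status code."""
    request = build_scan_request(hosts, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A crontab entry such as `*/15 * * * * /usr/bin/python3 /home/pi/scan_report.py` (path hypothetical) would then run the script every 15 minutes, keeping the network probing outside the Node-RED process.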

WhiteLion · 30 January 2022 12:52

Thank you very much for your help and for looking into my problem.

system · 13 February 2022 12:52

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
