Cpu profiling a single node: collecting ideas

BartButenaers · 10 December 2021 13:09

Hi folks,

Remark: this is a quick experiment of about 30 minutes. So no deep analysis. Just wanted to collect some ideas from the community, to avoid me wasting precious free time by investigating this further... I only needed this experiment anyway to solve an issue for one of my new nodes.

Based on feedback from another discussion, I started using Jsonata in one of my new nodes. However using my new node-red-contrib-msg-profiler node, I found that my jsonata based node needs about 5 seconds to transform an object. Which is way too long.

So I thought: would be nice if there was a simple way to allow us to do cpu profiling of a particular node. However I "think" that a standard cpu profiling (i.e. connect to the NodeJs debugger with a cpu profiler) will profile the entire Node-RED application. But instead I would like to profile only a single node...

So I tried this:

I added this code to the input message handler of my node:

const inspector = require('inspector');
const fs = require('fs');
const session = new inspector.Session();
session.connect();
...
node.on("input", function(msg) {
   // Before starting my message handling code, I start a cpu analysis
   session.post('Profiler.enable', () => {
      session.post('Profiler.start', () => {
          // Here is my original message handling code
          ...
          // My code has finished, so let's collect the cpu analysis results
          session.post('Profiler.stop', (err, { profile }) => {
             if (!err) {
                // The cpu profile result is e.g. written to a file 
                fs.writeFileSync('c:\\temp\\profile.cpuprofile', JSON.stringify(profile));
            }
        });
      });
   });
});

Then I go to Chrome and I open the link "chrome://inspect" which shows me a link to the NodeJs tools:
Then I can load my profile file from disc:
Select the loaded profile:
And then I see in the flamechart immediately that Jsonata's deep cloning just uses way too much cpu time for my purpose:

image1181×634 35.7 KB

So now at least I know for sure that it is not in my code, so probably cannot do much else except from not using jsonata for my node...

For me such a cpu analysis of a node would be VERY useful. But I have some first doubts already:

I don't think I solve this via a custom node, even not by using message hooks. So I assume it can only be implement as part of the core...
I start a new cpu profiling for every message. Might be causing a lot of overhead. I have not checked that.
Not sure if a profile of a single message is enough. Perhaps we want to have a profile analysis of multiple messages, to get better statistical results. But not sure if something like that is possible. Because when you keep the profiler open, you will be profiling other nodes also...
It is kind of inefficient that you need to open the profile file in a third party tool. Would be nice if this would be somehow be integrated into Node-RED. Perhaps by generating the svg of the flamechart in Node-RED (e.g. using flamegraph). But perhaps other have better ideas.
And so on...

So to repeat: I have not analysed this into depth! It is only a "very" quick experiment for my own node. For me the experiment was a great success, but not sure if it can be reused in a general way...

Would be nice to hear what others think about something like this.

Bart

TotallyInformation · 10 December 2021 20:50

I think that you are right that you wouldn't be able to create a custom node to do this. However, you might be able to create a standard module. If that were accepted into Node-RED as a util function, it would be available to all nodes.

By the way, there are timing functions as part of the node.js console module which work fine for simple needs.

BartButenaers · 10 December 2021 21:02

Hi Julian, thanks for joining!
Yes indeed a util function could be useful. But I also would like to profile nodes which are developed by other folks, or standard nodes. Just to get a quick idea where my performance bottlenecks are located.

Yes indeed for very simple things that works fine. But I don't would like to have timing information, without having to add extra statements in the code. And of course when the performance bottleneck is somewhere deeper in one of the dependency libraries that I am using, then there is no way you can start adding extra statements...

Now that I started again with developments for video surveillance, I really need some tools to help me understand quickly where the bottlenecks are. Which is the reason why I am experimenting with a few performance nodes at the moment. But not too much time that I can spend on it unfortunately, because my video surveillance nodes are my true goal at the moment...

TotallyInformation · 10 December 2021 21:14

Well of course that isn't true - this a scripted language, so with the exception of compiled c/c++ libraries, you can simply delve into the node_modules folder and amend any code you like. I do that occasionally in order to find out how something is (or isn't) working. And if you mess things up completely, you are only a rm -R ./node_modules/ && npm install away from resetting everything.

And of course, this is the disadvantage of a tool like node-red that hides so much from the developer. Great for people with little or no coding experience, bad for people pushing the envelope and developing complex nodes. If only Node-RED weren't so darn good at being a flexible, low-code platform!!

Of course, in your own nodes, it is relatively easy to permanently embed debug and performance checks, creating some functions that check for a debug flag and only do something if it is set. Or even right at the start of your node, checking for an environment variable and, unless found, setting the function to do nothing at all. You can even embed a function into the node-red settings to make it easy if you need to add some additional debugging to someone else's node

BartButenaers · 10 December 2021 21:28

Ok you are - from a technically point - absolutely right
I will rephrase: "there is no way I can start adding extra statements in all the modules that I use, in my limited spare time". Now the sentence should be lawyer proof ...
I want to see in a very limited amount of time automatically where the bottleneck is located, without me having to start digging for hours. This is work for a profiler tool, so I can spend my hobby time on developing nodes.

Yes but even for my own nodes it feels not good to me if I have to repeat similar boilerplate code over and over again. I really want this kind of stuff outside of my node, in a separate tool or node. At this moment I only try to get the concept working, via a couple of profiling related nodes.

dceejay · 10 December 2021 21:30

Don’t forget the built in metrics can produce detailed timings log, which no doubt could be surfaced.

TotallyInformation · 10 December 2021 21:31

I nearly added that to my answer. But the metrics show memory not timing according the docs site.

TotallyInformation · 10 December 2021 21:38

As always, I look forward to seeing what you come up with and hopefully you will prove me wrong in my assumptions.

I do think that we need to accept that nodes are complex forms. There is a lot going on and not all of it happens in the expected order thanks to the way that node.js works so any node-based timer will be a fairly blunt object (pun intended!) but may welll still provide useful information as long as we remember that what is apparent in Node-RED isn't always what is actually happening thanks to async processing and the node.js processing loop.

BartButenaers · 10 December 2021 21:39

Yes indeed I have to be honest that I haven't used that yet, due to my lack of knowledge about those metrics. But that migh indeed provide some useful information...

BartButenaers · 10 December 2021 21:52

Little side node: during the last year, I had to work quite a lot with Dynatrace at my daily job. Such tools are really mind blowing. Once you get familiar with such tools, doing a performance analysis is becoming pure science.

So it is becoming some kind of second nature for me to analyze software

If I could only have the bare minimum of profiling nodes implemented, I would be very happy ...
And if I can contribute afterwards to implement some of that stuff into e.g. the Node-RED debugger, then I will do that with pleasure. But now first experimenting and hopefully get some real life experience feedback from the community...

Now back on topic: all ideas about cpu profiling are welcome!

knolleary · 10 December 2021 22:21

I think the real challenge is getting a meaningful profile for an individual node without getting a lot of noise from the runtime and other flows.

The sorts of nodes you may want to profile are likely to be doing quite a lot of work and likely to be asynchronous. As soon as they are async, you'll get work happening in other parts of the flow and no longer have a meaningful profile of the node.

BartButenaers · 10 December 2021 22:35

Yes true. That is an absolute partystopper ...

BartButenaers · 11 December 2021 06:38

Damn Nick, I so hate it when you are right.
So we need to be able to determine:

Which (sync or async) calls are related to a node, to be able to determine the cpu usage per node.
Which (sync or async) calls are related to a message, to be able to trace a message in depth and determine where it has been slowing down.

That makes it indeed much more complex...

The only thing I found coming close to something like that is ClinicJs Bubbleprof, because they group async data together to get an overview over your application:

So I assume the result would be a bubble for each Node-RED node...

They do some instrumentation at startup, so you need to start it like this:

clinic bubbleprof -- node C:\Users\Gebruiker\AppData\Roaming\npm\node_modules\node-red\red.js

This is rather similar to how Dynatrace works: they inject shared libraries into your application, and this injected code is installing all kind of hooks to collect tremendous amounts of information about your application.

Anyway you can see here that ClinicJs Bubbleprof does the instrumentation via the async_hooks NodeJs module. I see that other applications (like e.g. raygun-apm) also use this NodeJs module, e.g. to trace and profile web requests through a webserver. So I assume that is the way to go if we ever want to do decent profiling and message tracing in Node-RED.

One of the big disadvantages of this approach is performance, when you look at these benchmarks. But you can enable/disable async hooks on the fly, so then users only enable it when they really need it...

This tutorial might be a starting point if somebody ever has a huge pile of free time to get started with something like this.

Again all feedback about these thoughts are welcome!

BartButenaers · 11 December 2021 07:42

Just to make clear what I mean: in this tutorial they keep track of everything that is being executed for a particular HTTP request ID in a webserver. If we could something similar for both node ID's and message ID's in a Node-RED runtime flow, then we might have a starting point for profiling at both levels...

Paul-Reed · 11 December 2021 09:42

Quick question!
As the profiling tool becomes more comprehensive to catch the different cases, is it likely to have an impact on the flow execution speed, thereby adding a delay to the flows, and skewing the results?

BartButenaers · 11 December 2021 10:13

Paul,
At my work Dynatrace is continuously profiling our applications, via the injected code. Which is real useful because we have a LOT of measurements passing by every minute. And now we can always do a performance analysis afterwards, to explain why certain hickups have occured. That is necessary because we cannot simply replay (after a hickup has occured) that scenario afterwards on a test system to reproduce the problem. Most probably you will not be able to reproduce it anyway, since it is causes often by very small stupid temporary issues.

On heavy servers it is not really noticable which system resouces Dynatrace uses, so you can let it run all the time. Moreover on our application servers there are only Dynatrace agents, which push the info to a cluster of Dynatrace servers where the analysis of that profiling data happens.

But if we would go for the async-hooks approach, it seems to me a quite different setup. If you look at my benchmark link above, the load is quite large (since you do everything locally on the application server) P.S. I have not looked into this in detail!!!! Since Node-RED runs mostly (see survey) on Raspberry, this will be noticable. So imho I would turn it off by default. Then you can activate it, when you want to do some troubleshooting. And if a company runs Node-RED on heavy hardware, the could choose to turn it on always if the want. Because they the can also monitor the profiling data, and give alerts automatically if a resource bottleneck is occuring. Because if you have lots of servers running, it becomes nearly impossible to keep an eye on everything manually.

Just my opinion about this...

system · 9 February 2022 10:14

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
[ANNOUNCE] node-red-contrib-msg-profiler: 1.0.0 beta Share Your Nodes	12	1520	5 December 2021
[ANNOUNCE] node-red-contrib-v8-cpu-profiler: beta Share Your Nodes	18	1479	16 August 2022
Profiling flows with ClinicJs to analyze CPU performance issues Share Your Projects	2	845	13 December 2021
Performance profiling and observability Developing Nodes	13	957	14 January 2023
CPU Usage spikes - how to identify problems General	9	5394	19 March 2022

Cpu profiling a single node: collecting ideas

Related topics