Profiling flows with ClinicJs to analyze CPU performance issues

BartButenaers · 13 December 2021 22:02

Hi folks,

Last week I started a discussion to collect ideas whether it is possible to build a - in the future - Node-RED profiler (to find which nodes consume lots of system resources like cpu/ram). However Nick had a very fair point that a lot of nodes will do their heavy processing asynchronously, so from that point there is no relation anymore between the node and the system resources it uses. Indeed a node can use system resources:

By code that is being executed synchronous (i.e. directly called inside the node)
By code that is being executed asynchronous (i.e. in the background by the NodeJs event loop)

While I was looking for a solution to this problem, I came across ClinicJs. In my opinion, the ClinicJs BubbleProf tool can be very useful to profile a Node-RED flow. Because BubbleProf groups together asynchronous operations in "bubbles", and measures cpu times for each of those groups. By doing it that way, the nodes of your flow will be mapped to bubbles.

I will "try" to explain step by step how to use ClinicJs for cpu profiling of your flow.

Analyze which node(s) consume high cpu - bubbles of asynchronous functions

Start Node-RED via ClinicJs, so it can inject profiling code into NodeJs:

clinic bubbleprof -- node C:\Users\Gebruiker\AppData\Roaming\npm\node_modules\node-red\red.js

ClinicJs will immediately add an extra line at the top of the Node-RED log:

To generate the report press: Ctrl + C

13 Dec 19:57:28 - [info]
Welcome to Node-RED
===================
13 Dec 19:57:28 - [info] Node-RED version: v2.1.1
13 Dec 19:57:28 - [info] Node.js version: v12.19.0
Now execute the actions in Node-RED which result into your performance issue. In my case things are becoming really slow when I inject a message into this flow, which contains a few nodes that I am developing at the moment:

So I inject a message and I repeat that a lot of times, to make sure the profiler can collect enough statistical information to create a decent profiling analysis.

Note that you should avoid executing other actions (not related to your issue) as much as possible, because the profiler will also collect information about that. Most optimal you create a flow only containing the nodes related to your problem, and avoid other actions like deploy and so on. Otherwise the analysis will become much more difficult, due to the 'noise' of other nodes.
As soon as you are done, press Crtl-C in your shell window. This will stop Node-RED and the profiling analysis will automatically start:

13 Dec 21:19:31 - [info] Started flows
Received Ctrl+C, closing process...
13 Dec 21:20:17 - [info] Stopping flows
13 Dec 21:20:17 - [info] Stopped flows
Analysing data
Generated HTML file is file:///C:/Users/Gebruiker/node-red-contrib-tank-volume/.clinic/12524.clinic-bubbleprof.html
Terminate batch job (Y/N)?

Remark: in my case (on Windows 10), I had to press CTRL-C twice.
The generated html file will automatically be opened in a browser window, and shows your profiling result as bubbles:

image1574×816 116 KB

The bigger the bubble, the more cpu has been spent in that bubble. So the more times that I would have repeated my test (by injecting new messages), the bigger my circle would become. But of course the startup code of Node-RED consumes a lot of cpu compared to my little test sequence, so those bubbles will be bigger in such short tests...

This way the major cpu consumers become immediately visible:
a. The Node-RED node registry, which probably has loaded all my installed nodes and my flow.
b. The editor-api, which probably has served all requests from my flow editor session.
c. The server. I assume that is all other NodeJs stuff? Not sure...
d. The PolygonTransformationNode which is really interesting for me as a contributor, because that is one of my new nodes inside the flow that I have been testing.
When hoovering above the bubble of my PolygonTransformationNode, you will see this popup:
When you click this node, the corresponding async stack trace will be displayed:

image1112×305 36.6 KB

You will see a the list of related async actions, which have been grouped into this bubble. Which means in fact, all the asynchronous functions that have been called from within my node. And for each of those async actions, you will see the time being spend on synchronous actions. Because inside each async action code will be executed, which means a series of Js statements and synchronous functions are being called in a sequence.

Ok so now I know that my PolygonTransformNode is responsible for the high cpu.

Analyze which part of a node consumes high cpu - flames of synchronous functions

But I also would like to have a look which of the synchronous actions inside my PolygonTransformNode is consuming all that cpu. My experience with ClinicJs is limited to about 20 minutes ( ), so I don't know if you can navigate from the bubbles directly to this information. So I will use another tool to profile the synchronous actions:

I press Ctrl-C again in my shell window, to close Node-RED completely.

Start Node-RED via ClinicJs again, but now with the Flame application:

clinic flame -- node C:\Users\Gebruiker\AppData\Roaming\npm\node_modules\node-red\red.js

Repeat the test again in the flow.
After pressing CTRL-C again (twice), the generated html will be opened in a browser window automatically.
The html will contain this time a flame chart:

image1600×784 142 KB

This chart is another way to represent a call stack, but you have to read it from bottom to top. And the wider a (black) box is, the more cpu that function has consumed.

a. Node-RED starts at the bottom, where the flow is being started. Lots of cpu of course, because everything else is being called from here (i.e. all the black boxes above it, are also being run by this box at the bottom).

b. From this point on all our nodes are running. P.S. not sure how this needs to be interpreted, because if I am not mistaken, the upward part is called asynchronous (via the node 'input' event)?

c. All the sequential functions - that are being executed inside a node - are displayed here. But they are not very wide, so not that much of cpu being consumed in this node.

d. For my PolygonTransformationNode, the boxes are much wider. Which means that the sequential functions in this node are consuming lots of cpu. When we drill down in the call stack (i.e. go upwards), you can see that the cloneDeep of jsonata is the root cause of the high cpu usage inside my node.

Summary

Although ClinicJs is a very nice tool, I am aware that it has a number of disadvantages:

There is no memory profiling. So you will still need other profiling tools, e.g. Chrome developer tools.
You need external tools, since there is no integration into Node-RED. Which makes it very cumbersome to use...
You need to change your startup command, which is not always easy (e.g. in Docker).
It is a pitty that the bubble and flame charts are not integrated into a single tool. Would have been nice if the related asynchronous and synchronous profiling had been combined.
You need to stop Node-RED to start the analysis. So no live profiling by attaching the profiler to a running Node-RED instance.

This is the first time that I have used ClinicJs, so I am absolutely NO expert. If you find other useful stuff in this tool, please share it here with us! And if you have found other similar/better tools, please create a separate discussion and share the link to it here (to keep this discussion focussed on ClinicJs).

Although the literature is quite heavy to read, I hope it can help you troubleshoot your performance issues a bit more scientifically based...

Bart

BartButenaers · 13 December 2021 22:43

When I do some more tests (by injecting) more messages into my problematic flow, then the profiler will be able to collect much more issue-related information. And then the results becomes more interesting, because my Node-RED flow becomes visible in the bubbles:

a. The 'server' bubble is becoming smaller, because my test has been consuming more percentage of the cpu (used during the analysis time frame).

b. First my PolygonTransformationNode is being called, which consumes quite some cpu.

c. From that node there is a connection to my next node in the flow (the ImageAnnotationVisualizerNode), which means the wire (between two Node-RED nodes) is becoming visible as a connection (between two bubbles).

d. A longer connection means that there is more latency between two bubbles.

Although I have to be honest that I don't succeed in showing my second node as a bubble in all my tests. Need a bit more experience with the tool, before I can explain such things...

BartButenaers · 13 December 2021 23:12

And a last test for today. I have added in my "input" message handler (of my PolygonTransformationNode), both an async function and a promise:

node.on("input", function(msg) {
   async function asyncCall() {
       new Promise(resolve => {
           // My bad performance code has now moved here
       });
   }

   asyncCall();            
});

Fortunately ClinicJs can still group all those async actions together in my PolygonTransformationNode bubble async call stack:

The async-hooks NodeJs module (used by ClinicJs under the cover) seem to be quite useful for me, in the context of profiling a Node-RED flow.

But if anybody has some doubts about this, or if I have forgotten something important, please share your thoughts here! Because chances are pretty high that I have overlooked something ...

Topic		Replies	Views
Cpu profiling a single node: collecting ideas Feature Requests	16	839	9 February 2022
[ANNOUNCE] node-red-contrib-v8-cpu-profiler: beta Share Your Nodes	18	1483	16 August 2022
[ANNOUNCE] node-red-contrib-msg-profiler: 1.0.0 beta Share Your Nodes	12	1523	5 December 2021
CPU Usage spikes - how to identify problems General	9	5435	19 March 2022
Performance profiling and observability Developing Nodes	13	969	14 January 2023

Profiling flows with ClinicJs to analyze CPU performance issues

Analyze which node(s) consume high cpu - bubbles of asynchronous functions

Analyze which part of a node consumes high cpu - flames of synchronous functions

Summary

Related topics