File writing blocks NR?

Hi,

While I am aware NR uses Node.js, I would expect that when we write files from NR, other processes would continue. What we see is that other NR processes go on hold when we do a file write.

We have 100k records to be written to a file, and currently we are trying to send all 100k in one payload.
I am aware this is not the best solution and that buffered/chunked writing is recommended, but I want to understand whether it makes sense that, while the file is being written, the entire NR instance becomes unresponsive (the editor is unresponsive; even on the console we see nothing).
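For reference, a minimal sketch of what the chunked/stream-based alternative could look like in plain Node.js (the `records` variable and target path are placeholders, not from our actual flow):

```javascript
const fs = require("fs");

// Stream the records in chunks instead of one monolithic ~31 MB write,
// yielding to the event loop whenever the stream's buffer fills up.
async function writeRecords(records, path) {
    const stream = fs.createWriteStream(path);
    for (const record of records) {
        // write() returns false when the internal buffer is full;
        // wait for 'drain' so memory use stays bounded
        if (!stream.write(record + "\n")) {
            await new Promise((resolve) => stream.once("drain", resolve));
        }
    }
    // close the stream and wait for the final flush
    await new Promise((resolve) => stream.end(resolve));
}
```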

It's surprising that such a volume of data would cause a perceptible delay.

I packed 1 million short, identical lines into msg.payload with a function node, then wrote it to a file.
The whole process took about 2 seconds on a Raspberry Pi with 1 GB of memory.
The editor is accessed with Firefox on Windows.
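Something along these lines (a hypothetical reconstruction of the test function, not the exact code used):

```javascript
// Function node: pack ~1 million short, identical lines into msg.payload
const line = "hello world";                          // any short test line
msg.payload = Array(1000000).fill(line).join("\n");  // ~12 MB string
return msg;
```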

I am not yet looking at performance as such, and I understand that at 2 s the blocking may not be perceptible. But I would assume this should not tie up the main thread of Node.js. Even if it takes time, I am not clear why everything should get stuck.

What "other processes"?

A process is an OS thing (an application; a different Node-RED instance is a different process). Flows inside one Node.js application (i.e. one instance of Node-RED) all run in one process sharing one event loop.
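To make that concrete (a contrived illustration, not anything from your flow): while any synchronous work runs, nothing else in that instance is serviced.

```javascript
// Drop this in a function node and trigger it: for 5 seconds the editor,
// every other flow, and all HTTP endpoints in this instance will freeze,
// because they all share this one thread.
const end = Date.now() + 5000;
while (Date.now() < end) { /* busy-wait on the single thread */ }
return msg;
```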
Please be 100% clear.

  • How big are these records?
  • How big is the final file?
  • How long does the operation take?
  • Written to 1 file? or multiple files?
  • Is there any processing of data before writing to file?
  • Are you writing the content of payload directly to a single file? Is this file being written to an SD card? An FTP server? An NVMe drive?

Lastly, can you provide a minimal flow that demonstrates the issue.

I mean other flows.

100k records, each around 300 bytes; 31 MB in total. The flow took 15 mins end to end.

1 file.

Yes. We have some internal application processing that produces the data and writes it to the DB, then we pull from the DB and write to the file.

Payload goes to write node and is written in 1 step to the file.
Linux mount point.

What is the mount point? A network device? USB device?

I meant, do you process this file data (the payload) in node-red before writing to file?


Last couple of questions:

  • How are you writing the file? File node? Function node?
  • Do you have branches in the flow that pass this large payload down multiple wires? And/or debug nodes?

Taken from the Internet:
A mount point is a directory on a file system that is logically linked to another file system. Mount points are used to make the data on a different physical storage drive easily available in a folder structure. Mount points are fundamental to Unix, Linux and macOS. Windows can use mount points, but it is not common.

I see what you are getting at.
So once received from the DB:

  1. Checks if payload is empty (switch node)
  2. Converts to CSV (csv node)
  3. Replaces some characters using a regexp (function node; see the sketch below)
  4. To the file node.

Using "write file" node.

2 debug nodes, yes.
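For context, step 3 above is presumably something like this (the pattern and replacement are placeholders; the point is that it is a synchronous whole-string operation on ~31 MB):

```javascript
// Function node (step 3): regexp replacement over the whole CSV string.
// On a ~31 MB string this runs synchronously on the main thread and
// blocks everything else while it executes.
msg.payload = msg.payload.replace(/[;|]/g, ",");   // placeholder pattern
return msg;
```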

We know what a mount point is, the question is what it is connected to. For example it might be a local hard disc, a local SD card, a network drive, or a plethora of other possibilities.

As Colin says, we know what a mount point is. The question was "What is the mount point", not what a mount point is.

At a guess, this is a network share. That is a bottleneck for sure.

But I suspect parsing 31 MB of data in the CSV node, running a regex over 31 MB, and the payload being split/duplicated twice (branches) all contribute; there are several places this could be slow.

I recommend you place inline flow measurements along each part of your flow to identify each bottleneck. This will help.
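One simple way to do this (a minimal sketch using two function nodes; the `_t0` property name is arbitrary):

```javascript
// "Start timer" function node -- wire immediately BEFORE the stage to measure
msg._t0 = Date.now();   // stash a start timestamp on the message
return msg;
```

```javascript
// "Stop timer" function node -- wire immediately AFTER the stage
node.warn(`stage took ${Date.now() - msg._t0} ms`);
return msg;
```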

Lastly, can you share your flow (replace your database node with an inject node & populate the payload with some sample data)

You might be able to do the data manipulation in your SQL query and create a CSV file directly?

My bad. Sorry.

If it were the mount point, this should be seen across applications. We don't see it in other applications working on the same mount point.

Will try.

Will do.

Not necessarily. It may actually be an issue with the large payload and the file node, or the filesystem (which you still have not answered), and/or the Node version. This is why I ask the questions.

How have you confirmed that it is the File node that is taking the time?

One other point that is especially important when handling particularly large objects: if you have any node with more than one wire on an output port, you are making an actual copy of the message data at that point. Something which is going to be slow.

While 15 min is certainly excessive, it is important to understand how much of that time is spent writing the file. There are a tremendous number of things that can impact write performance. Certainly I would always want to avoid a monolithic write of that size, especially to a network mount.

Have you checked to see if your DB server can write to a CSV file directly? Many can and this will almost certainly be better than relying on another process. Sometimes it is better to treat Node-RED as a prototyping tool but then convert the process to native handling. A decent SQL db should be able to do data manipulation including reshaping and regex replacements as well as dumping data to a CSV.

I am preparing a flow to be posted here.

My original question remains: should file writing block the editor? Wouldn't Node.js offload it off the main thread?

Based on the responses, I am now trying to work out whether it is indeed the file write node causing the editor to become unresponsive, or one of the preceding nodes (CSV conversion, function node, debug node). Will check and confirm.

The editor is served by the node-red back end so the answer is yes.

But the question by itself is irrelevant. It is far more likely that the other things mentioned, like writing large data to a network share (which you STILL have not confirmed), processing large string data in a single-threaded application, and causing duplicates of data (due to branching), are the things to identify first.

There may well be an issue with the file node HOWEVER if you do the flow timings as suggested and it turns out the file write took 3 seconds (and it was your processing that took 14 mins, 57 secs) then the question is mostly moot.

On the other hand, it may turn out a node is doing something sync when it could/should be async (and would therefore unblock)

Bottom line: we need you to answer the questions, do some flow timings, and post your demo flow so we can assess.

ISTR all calls in the file write node are async, but there is a performance hit if you set the filename via a msg property (rather than fixing it), as we have to check the name hasn't changed for every msg and, if it has, close and reopen the file. However, even that should be minimal on 100k chunks.

Thanks Steve. The missing piece for me was what you mentioned here on sync/async. Otherwise agreed, it will depend on whether that is indeed the bottleneck.

Because, based on the inputs here, it's clear there may be other reasons for the time it takes and the blocking. So I will go node by node to check which one it is. Luckily it's only 3-4 nodes to check.

I will report back once done.

Agreed (I need to take a peek at the code to refresh my memory - but only after we fry the bigger fish ;))


Again, there is no other thread. Node.js is single-threaded. What it does with async actions is avoid holding up the event loop while waiting for slow operating system calls. So assuming the file out node didn't get coded with a sync call (which you can easily check yourself in the node's code), it will be doing the best it can.
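To make the distinction concrete (a generic Node.js illustration, not the actual File node source):

```javascript
const fs = require("fs");
const bigString = "x".repeat(31 * 1024 * 1024);  // ~31 MB, matching the thread

// Synchronous: blocks the single event loop until the OS call returns.
// While this runs, nothing else in the process (editor, other flows) is served.
fs.writeFileSync("/tmp/out.csv", bigString);

// Asynchronous: the work is handed off (to libuv's thread pool); the event
// loop keeps running and the callback fires when the write completes.
fs.writeFile("/tmp/out.csv", bigString, (err) => {
    if (err) console.error(err);
});
```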