Exec node with async execution as well as stdin capability

Last year I had problems with the sqlite node making the whole node-red process unresponsive when accessing a large database file with complex SQL statements.

I tinkered a bit and created a custom subflow-based exec node, built on the function node and the Node.js libraries child_process and stream, in order to start my own processes that access the sqlite database and return data from it. The trick is that the process runs asynchronously and accepts input through stdin during startup (which the exec node currently lacks). This stopped node-red from being stalled by a blocking call. The subflow was built so that it can be reused for any other application where spawning processes asynchronously is needed. Here is the pasteable subflow for it. The Node.js libraries above need to be added to the node-red function node configuration.
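
A minimal sketch of the approach described above (this is not the subflow from the post): Node.js's built-in `child_process.spawn` starts the process without blocking the event loop, and the input is written to the child's stdin before the pipe is closed. The helper name `runWithStdin` and the use of the Node.js binary as a stand-in child are assumptions made for illustration.

```javascript
const { spawn } = require('child_process');

// Hypothetical helper (not part of Node-RED): start a process asynchronously,
// feed `input` to its stdin, and resolve once the process has closed.
function runWithStdin(command, args, input) {
    return new Promise((resolve, reject) => {
        const child = spawn(command, args, { stdio: ['pipe', 'pipe', 'pipe'] });
        const out = [];
        const err = [];

        child.stdout.on('data', (chunk) => out.push(chunk));
        child.stderr.on('data', (chunk) => err.push(chunk));
        child.on('error', reject); // e.g. the command could not be started
        // Ignore write errors (such as EPIPE) if the child exits early.
        child.stdin.on('error', () => {});
        child.on('close', (code) => resolve({
            code,
            stdout: Buffer.concat(out).toString(),
            stderr: Buffer.concat(err).toString()
        }));

        // Write everything, then close the pipe so the child sees end-of-input.
        child.stdin.end(input);
    });
}

// Stand-in child for the demo: the current Node.js binary echoing its stdin.
const echoArgs = ['-e', 'process.stdin.pipe(process.stdout)'];
runWithStdin(process.execPath, echoArgs, 'Test\n')
    .then((result) => console.log(result.code, JSON.stringify(result.stdout)));
```

Inside a function node the module would be made available through the node's configuration, as the post describes, rather than through `require`.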

My questions now are

  1. whether such functionality could benefit a broader group of users, or whether it is a corner case that concerns next to no one?
  2. whether anyone is interested in me trying to create a new, proper async exec node with the above functionality that could be shipped with node-red core
  3. or whether I should try extending the current exec node with some user-configurable settings so that it can call processes asynchronously (and provide stdin to them as a new feature)

About my background: I'm not really a good node-js dev but I know my way around how OSes handle processes and their streams (at least for Linux and Windows). I think I could manage creating the necessary tests and code for it. I also already released a node-red library and know my way around the node-red internals a bit (thanks to the good documentation).

Some things, like new dependencies on the Node.js libraries stream and child_process, would have to be considered as well, as I think they are currently not dependencies of node-red core.

Before I start doing anything I'd like to get feedback from the community on whether this could be feasible at all.

As always, thank you for your time!

Hi Marcus!
Thank you for this proposal.
I have the impression, though, that the exec node is already able to spawn a child process:

Would your node do anything different ... or add functionality on top?

I am not trying to be the sand in your engine, but did you consider switching to another type of database?

Reading your previous topic, you are collecting sensor data (I assume timestamped). This is where influxdb shines, and it would probably solve all your problems in an instant; 20 sensors over an 18 hour period should be milliseconds of work for a db, if indexed and normalized.

Sqlite is a serverless database and although it can handle a lot of data, you will always end up with a blocking factor of the exclusive lock when writing (which could also be a factor in the delay when querying at the same time).

The exec node always runs asynchronously.

Also there is spawn mode, where you can pass data in via stdin.

And the daemon node that can be started and stopped and also reads from stdin.

Thanks for the feedback!

First, I'm sorry. I seem to not have done my research properly. Yes indeed, the exec node spawns a process and does not block for it to finish. I got that totally wrong.

@Colin: How can it be done? I cannot find anything in the docs, or in the link to the source that @ralphwetzel posted, that states how to pass data via stdin. I could not make it work.

@dceejay: Thanks, I did not know about node-red-node-daemon until now. With it I have problems starting, for example, an inline Python call that echoes stdin once the stdin pipe has been closed, i.e. using the command python with the arguments -c "import sys;print(sys.stdin.read())". Running the same thing in the console with

echo "Test" | python -c "import sys;print(sys.stdin.read())" 

works fine though.

Any ideas or thoughts?

@bakman2 You are absolutely right and it brings up a painful subject I have been postponing for too long :wink:

You are right, not with exec node, I was thinking of the daemon node.

Would a PR be appreciated for adding "stdin functionality" to the exec node then?

If yes, would you like me to give it a try? How should it be done? Add a new msg.stdin property that is read on message input, or a new input form element on the configuration page, and then pass that as stdin to the process?

The easiest way would probably be to send the content into the process' stdin pipe and then close the pipe.

An advanced way could be to keep the stdin pipe open after process start and then be able to reference it again using the pid in order to send more input to stdin.

I extended the exec-node to handle stdin input. If someone wants to give it a try check out the branch on my fork:

It works quite well for me. I didn't want to change much of the original exec node's code but the cleanup that handles killing the timer and destroying the stdin pipe object could be refactored quite a bit such that the code doesn't need to be there six times at different places.

If you'd like me to contribute by creating a pull request (please advise!), the translations have yet to be created for fr, ja, ko, ru, pt-BR, zh-CN, zh-TW.

I added translations that I created using ChatGPT (GPT-4). They may not be perfect.
I also added some mocha tests to get some code coverage.

Should I create a pull request or are there any concerns?

Edit: Sorry for bumping my own thread. I hope I’m not too forward with this. Please don’t hesitate to slow me down!

Apologies for not keeping up with your rapid rate of progress. I must admit I'm still not quite seeing what the intention is for this mode. Is it to replace the daemon node? It can't be, as the stdin data is only passed in once with the msg. And if it is only passed in once, then why not pass it in as parameters as per existing usage? You may need to escape any " properly etc., as is often the case when calling things like bash from scripts.

In my mind the daemon node has a completely different use-case.

Some programs rely on passing data to them via the stdin pipe rather than using command line arguments.
This is a completely different modus operandi and I would argue that having stdin capability is a sensible feature for the exec node. It would, for example, enable us to chain exec nodes and "pipe" data sent to stdout into the stdin of another exec node (though this would only make sense when using "exec mode" instead of "spawn mode", as we have no clear way yet to detect when a process closes its stdout or stdin pipe).

You might be right that this feature is unnecessary as no one seems to have been missing it and if so they rely on the daemon node.

How does one use your node? Is all the stdin data passed in one message that runs the command and then passes in the stdin data? If so, which property is stdin passed via?

That's exactly right. All data is passed to the stdin pipe directly after creating the process and then the pipe is closed to signal to the process that there is no more data. After the process has run or has been terminated the resources of the stream feeding the stdin pipe are released.

The stream feeding stdin is only created if there is a non-empty property 'stdin' present in the input message that contains either a 'Buffer' or a 'string'. That is of course only a proposal by me and could be changed if needed.
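
The condition described above can be sketched as a small predicate. The name `hasStdinPayload` is hypothetical and the code is not taken from the fork; it only illustrates the rule that `msg.stdin` must be a non-empty string or Buffer before anything is written to the process.

```javascript
// Hypothetical predicate: true only if msg.stdin is a non-empty string or Buffer.
function hasStdinPayload(msg) {
    const value = msg ? msg.stdin : undefined;
    const isSupportedType = typeof value === 'string' || Buffer.isBuffer(value);
    return isSupportedType && value.length > 0;
}

console.log(hasStdinPayload({ stdin: 'SELECT 1;' }));       // true
console.log(hasStdinPayload({ stdin: Buffer.from([0]) }));  // true
console.log(hasStdinPayload({ stdin: '' }));                // false
console.log(hasStdinPayload({ payload: 'no stdin here' })); // false
```

A message that fails this check would simply run the command as before, without a stdin stream being created.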

OK, I can see that might be useful for running a command that cannot take everything in the parameters. Is there not a mechanism for passing stdin via the command line though? For example
echo "this string is for stdin" | some_command
which could be done in a standard exec node.

You are right. That works quite well (just tried it). I hadn't thought of that. The only downsides of that approach I can see are:

  • it is not clear which process caused a timeout if you intend to use the timeout feature in such a case.
  • correct escaping is paramount and depends on the shell that is used in the underlying system call. That would not be the case when using stdin directly.
  • you cannot pipe binary data originating from node-red directly using this approach. With the stdin feature it wouldn't be a problem as you could use a Buffer object. However, I can see this being just a corner case.
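
The escaping and binary-data points can be illustrated with a short sketch, assuming a child that simply echoes stdin (here the Node.js binary itself as a stand-in). A Buffer containing a NUL byte, quotes, `$` and a backtick is written to stdin and read back; `execFile` does not start a shell by default, so no quoting is involved. `roundTrip` is a hypothetical helper name.

```javascript
const { execFile } = require('child_process');

// Hypothetical helper: send `payload` to a child's stdin and hand its raw
// stdout (as a Buffer) to `callback`.
function roundTrip(payload, callback) {
    const child = execFile(
        process.execPath,                               // stand-in child process
        ['-e', 'process.stdin.pipe(process.stdout)'],   // echo stdin to stdout
        { encoding: 'buffer' },                         // keep stdout as raw bytes
        (error, stdout) => callback(error, stdout)
    );
    child.stdin.on('error', () => {}); // ignore EPIPE if the child exits early
    child.stdin.end(payload);          // write the bytes, then close the pipe
}

// Bytes that would need careful handling on a shell command line:
// NUL, 0xff, double quote, single quote, dollar sign, backtick, newline.
const tricky = Buffer.from([0x00, 0xff, 0x22, 0x27, 0x24, 0x60, 0x0a]);

roundTrip(tricky, (error, stdout) => {
    if (error) throw error;
    // The echoed bytes are expected to match the input exactly.
    console.log(stdout.equals(tricky));
});
```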

Valid points.

Please advise me whether you think the functionality is redundant, requires more discussion, or whether a pull request would be appreciated.

In the latter case I will put more effort into creating tests, testing in different environments and increasing code coverage during testing :slight_smile:

If that is specifically addressed to me, I am not a core developer. I have not been in a situation where the extension would be useful, but I can see that others might.

@dceejay @knolleary Looking at the repository's history I think you are the main contributors to the exec node. Do you think a PR for this functionality would be beneficial? Otherwise I'd shelve this :wink: