How to pass audio chunks through a Node-RED flow

Hi folks,

I started experimenting with audio for this discussion, and now I need to create some extra audio nodes (e.g. an audio analyser, a volume amplifier ...). However, there are multiple ways to pass audio samples between those nodes, and I have no clue which one is best. The gates are open for voting :face_with_raised_eyebrow:

A) What is audio

Raw audio (PCM) is nothing more than chunks of bytes, representing a series of audio samples (see here for some basic explanation). However, some EXTRA information needs to be added to describe the content of those bytes:

  • Number of channels: the bytes can come from a single microphone, but also e.g. from two microphones (left sample / right sample / left sample / ...).
  • Number of bits: each sample can consist of a number of bits (8, 16, 24, 32 ...).
  • Sample rate: since each sample contains only an (amplitude) value and no timestamp, we need to indicate how many samples are generated per second.

The raw audio bytes can only be interpreted correctly by a receiving node if that receiver is aware of this information.
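To make that concrete, here is a small illustration of my own (assuming 16-bit little-endian, interleaved samples): a receiver needs all three fields before it can even step through the bytes correctly.

    // Illustration: interpreting raw PCM requires all three pieces of information.
    // Assumed format: 16-bit little-endian samples, interleaved per channel.
    var channels   = 2;     // e.g. left + right microphone
    var bitDepth   = 16;    // bits per sample
    var sampleRate = 44100; // samples per second (per channel)

    var bytesPerSample = bitDepth / 8;                           // 2 bytes
    var bytesPerSecond = sampleRate * channels * bytesPerSample; // 176400 bytes here

    // Read one amplitude value from an interleaved chunk (L0 R0 L1 R1 ...)
    function readSample(buffer, sampleIndex, channel) {
        var offset = (sampleIndex * channels + channel) * bytesPerSample;
        return buffer.readInt16LE(offset); // value between -32768 and 32767
    }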

B) How to pass audio in a flow
It seems to me that audio can be passed between nodes (as messages) in the following ways:

  1. Only pass the raw audio bytes in msg.payload. For example the node-red-contrib-micropi nodes generate output messages that contain ONLY raw audio (PCM) samples:

    image

    However e.g. the dashboard is not able to play those audio samples, since the browser doesn't know how to interpret those bytes. So I needed to implement an extra node-red-contrib-wav-headers node to add this information:

    image

    The disadvantage is that the user needs to add the SAME information on every node's config screen, so that each node can interpret the bytes. So I wouldn't go this way ... I think it would be better if e.g. the MicroPi nodes also sent the extra information in their output messages.

  2. Pass raw audio bytes in msg.payload and add 3 extra fields to the message.

    image

    This is probably the most Node-RED-like way to go ... But e.g. the audio-out node currently doesn't get any information like this from the input message.

  3. Pass raw audio bytes and 3 extra fields together in msg.payload (as WAV). WAV is a well-known audio container, which is in fact a series of headers (44 bytes in total) followed by the raw audio bytes.

    image

    So in fact the msg.payload contains a single Buffer, which contains all the information (headers and raw audio samples). And all nodes can easily get the required information from the headers using e.g. the audio-buffer-from NPM module (see the header-parsing sketch after this list). The advantage is that you can do all kinds of things with WAV: store the chunks as .wav files, pass the wav files directly to the browser (dashboard) for playing, ... But the disadvantage is that you cannot see the header information e.g. in the debug panel. So I would propose to use this wav-headers node only when you really need a wav.

  4. Pass raw audio bytes and 3 extra fields together in msg.payload (as AudioBuffer instance). I want to create extra Node-RED nodes for audio manipulation, and the best option (in plain JavaScript) seems to be the audio.js project (which contains a large series of Node.js modules). Those modules are all based on the AudioBuffer class. However, if we passed the raw bytes as normal Buffers in Node-RED (see the three previous options), then EACH of the Node-RED audio nodes would have to do this:

    • convert the Buffer to an AudioBuffer
    • use the AudioBuffer to manipulate the audio chunk
    • convert the AudioBuffer back to a Buffer

    This will result in worse performance, due to a lot of useless buffer conversions. Therefore it would be better to create an AudioBuffer ONCE and PASS the AudioBuffer instance in the msg.payload:

    This offers a lot of functionality with as little overhead as possible. However, when you look at the message in the debug panel, you won't see any information:

    image

    I'm not sure whether it is good practice to transfer such instances through a Node-RED flow. Haven't tested it yet, but if I were to use multiple wires the instance would be cloned by Node-RED. Perhaps that could result in extra problems (e.g. when deep cloning is required)?
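To make options 2 and 3 a bit more concrete, here is a rough, untested sketch of my own (the field names channels / sampleRate / bitDepth are just a proposal) that reads the three fields from a canonical 44-byte WAV header and turns the chunk into an option-2 style message:

    // Sketch: parse a canonical 44-byte PCM WAV header (option 3) and rebuild an
    // option-2 style message. The offsets follow the standard RIFF/WAVE layout;
    // files with extra chunks before 'data' are not handled here.
    function wavToMessage(wavBuffer) {
        var channels   = wavBuffer.readUInt16LE(22); // number of channels
        var sampleRate = wavBuffer.readUInt32LE(24); // samples per second
        var bitDepth   = wavBuffer.readUInt16LE(34); // bits per sample
        var rawAudio   = wavBuffer.slice(44);        // everything after the headers

        // Option 2: raw bytes in msg.payload plus the 3 extra fields alongside
        return {
            payload:    rawAudio,
            channels:   channels,
            sampleRate: sampleRate,
            bitDepth:   bitDepth
        };
    }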

I think I would vote for option 2, and use option 3 only when e.g. the audio chunk needs to be stored as WAV. I would also love to have the AudioBuffer from option 4, but I don't know how to integrate that into Node-RED ...

Thanks !!!
Bart


Some good thinking there - so I will now jump in and demonstrate my ignorance :slight_smile:

Would (say) a function node be able to do anything to an AudioBuffer? How easy would it be to halve the volume, for example? Or how would/could other (non-audio) nodes be able to "join in" and help this flow?

If other nodes wouldn't be able to do things to a payload of type audiobuffer - then maybe put some data useful to the flow in payload and the audiobuffer in another msg property (that all your audio nodes would use), and then things like the switch node can act on that data to route the audio? or... err (I did warn you I was making this up). Also, would audiobuffer then need a special file writer node - or would you just have a "convert back to mp3/wav/..." node and then a file node?

WAV (3) is of course appealing - but how well does that stack up playback-wise - i.e. do the chunks "join" back into a stream OK?
There is no reason of course why you couldn't do 3 with the extra data of 2 alongside, so other nodes could act on the metadata without looking into the stream.

(and of course happy to consider changes to the audio out node(s) to help accommodate whichever route works out best)

Good point. I 'think' a lot of non-audio nodes won't be able to use the AudioBuffer instance decently:

  • A function node would have to require all the audio.js modules it uses (via functionGlobalContext), so that is perhaps not user-friendly...
  • Don't think a switch node is able to use the information in the audioBuffer instance.
  • Join node might result in strange effects?
  • ...

I had only looked at the audio-related nodes. So perhaps it is a bit confusing to users when I pass AudioBuffer instances ...

Haven't tested all my code yet, since I didn't know if AudioBuffer was a good option...
But I assume an AudioVolume node would be as simple as this:

    module.exports = function(RED) {
        var AudioAmplifier = require('audio-gain');

        function AudioVolumeNode(config) {
            RED.nodes.createNode(this, config);
            this.volume = config.volume || 1;

            var node = this;

            var options = { volume: node.volume };
            node.audioAmplifier = AudioAmplifier(options);

            node.on("input", function(msg) {
                // Let's assume the input message contains an AudioBuffer instance ...
                var audioBuffer = msg.payload;

                // Send the amplified audio chunks to the output
                msg.payload = node.audioAmplifier.process(audioBuffer.channelData);

                // Inside an input handler the message must be sent explicitly
                node.send(msg);
            });
        }

        // Register the node type (the name "audio-volume" is just an example)
        RED.nodes.registerType("audio-volume", AudioVolumeNode);
    }

It is rather simple this way, and in plain JavaScript (so no C++ builds required). That is why I wanted to use AudioBuffers: to avoid having to program all the functionality myself.

Remark: I haven't done any performance tests yet!
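And to come back to the earlier question about a function node: halving the volume shouldn't need any library at all, assuming the AudioBuffer instance travels in msg.payload and exposes the standard numberOfChannels / getChannelData() interface. An untested sketch:

    // Function node sketch: halve the volume of the AudioBuffer in msg.payload.
    var audioBuffer = msg.payload;

    for (var channel = 0; channel < audioBuffer.numberOfChannels; channel++) {
        var samples = audioBuffer.getChannelData(channel); // Float32Array, values -1..1
        for (var i = 0; i < samples.length; i++) {
            samples[i] *= 0.5; // half the amplitude (roughly -6 dB)
        }
    }

    return msg;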

Hmm, now we are talking :thinking: Although, when talking about sound, this 'sounds' like a workaround (since the 3 information fields would be available twice in the message). But it might indeed do the job, since the non-audio nodes can use the normal message fields and the audio-related nodes use the field with the AudioBuffer. Good proposal!!!

Haven't tested it yet, but I would indeed assume (like e.g. in Java) that the AudioBuffer instance would need to be serialized to text in the file? And when reading the file, the text would have to be deserialized again into an AudioBuffer instance... Will need to play with it, because I have never done that in JavaScript. Must admit that I don't like the idea of having to create extra file nodes. It should somehow work automatically with the normal file nodes ... Of course they can always store a .wav file.
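Converting the AudioBuffer into a plain .wav Buffer for the normal file-out node could look something like this (an untested sketch of my own, assuming 16-bit output and the standard AudioBuffer interface):

    // Sketch: turn an AudioBuffer into a Buffer containing a canonical 44-byte WAV
    // header + 16-bit PCM data, so the normal file-out node can simply write it.
    function audioBufferToWav(audioBuffer) {
        var channels   = audioBuffer.numberOfChannels;
        var sampleRate = audioBuffer.sampleRate;
        var frames     = audioBuffer.length;    // samples per channel
        var dataLength = frames * channels * 2; // 16-bit = 2 bytes per sample

        var wav = Buffer.alloc(44 + dataLength);

        wav.write('RIFF', 0);
        wav.writeUInt32LE(36 + dataLength, 4);            // RIFF chunk size
        wav.write('WAVE', 8);
        wav.write('fmt ', 12);
        wav.writeUInt32LE(16, 16);                        // fmt chunk size (PCM)
        wav.writeUInt16LE(1, 20);                         // audio format: PCM
        wav.writeUInt16LE(channels, 22);
        wav.writeUInt32LE(sampleRate, 24);
        wav.writeUInt32LE(sampleRate * channels * 2, 28); // byte rate
        wav.writeUInt16LE(channels * 2, 32);              // block align
        wav.writeUInt16LE(16, 34);                        // bits per sample
        wav.write('data', 36);
        wav.writeUInt32LE(dataLength, 40);

        // Interleave the float samples (-1..1) and convert them to 16-bit integers
        var offset = 44;
        for (var i = 0; i < frames; i++) {
            for (var ch = 0; ch < channels; ch++) {
                var sample = Math.max(-1, Math.min(1, audioBuffer.getChannelData(ch)[i]));
                wav.writeInt16LE(Math.round(sample * 32767), offset);
                offset += 2;
            }
        }
        return wav;
    }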

Could you please explain that a bit more? Google-translate exploded when I entered your sentence ...

That is what I was hoping :rofl:
Thanks !

Re WAV - I mean, if a large file is turned into chunks, and each chunk is its own bit of wav (with wav headers etc), then how well do they stick back together? Is it seamless to the ear? And likewise, can they just be written sequentially to make one big file? (Or do you need to strip all the intermediate headers?)

Re audiobuffer to file, I guess you would have to have some conversion anyway (to mp3 or wav etc), and that would then be a normal buffer, which the existing file node would handle.

And don't forget we already have the concept of parts metadata used by the split, join and other nodes.

Wav buffer = buffer with headers (44 bytes) + buffer with raw data. When you want to join multiple WAVs into a single larger WAV, you indeed need to remove the intermediate headers of all the WAVs that you append. With option 2 more standard Node-RED nodes can be used, since you only need my wav-headers node at the end of the audio manipulation.
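A rough sketch of that joining step (my own illustration, assuming every chunk carries the canonical 44-byte header): keep the first header, append only the data part of the following chunks, and patch the two size fields.

    // Sketch: join WAV chunks into one WAV buffer (canonical 44-byte headers assumed).
    function joinWavChunks(wavChunks) {
        var header = Buffer.from(wavChunks[0].slice(0, 44)); // copy of the first header
        var dataParts = wavChunks.map(function(chunk) {
            return chunk.slice(44); // strip the intermediate headers
        });
        var data = Buffer.concat(dataParts);

        // Patch the two size fields in the header
        header.writeUInt32LE(36 + data.length, 4); // RIFF chunk size = total size - 8
        header.writeUInt32LE(data.length, 40);     // data chunk size

        return Buffer.concat([header, data]);
    }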

Think you are right: send the message to some raw-audio-to-mp3 node, which converts the audiobuffer to mp3. Then msg.payload contains a normal buffer with mp3 data that can be written to a file. Since the file-out node ignores the extra message property (containing the audiobuffer instance), there is NO issue with the audiobuffer being serialized. And after the file-in node we would need to add a create-audiobuffer node to recreate the audiobuffer from the loaded raw audio.

Never used those nodes. Do you mean I could use the msg.parts somewhere for audio?

Certainly not. That is why I have so much trouble playing the wav chunks decently in the dashboard...

There is one drawback of this approach: the raw audio data is now passed twice through every wire, since raw audio is inside msg.payload AND inside the extra message property. So when you have 2 wires connected to an output, Node-RED will clone both buffers. Again a waste of resources, both memory and CPU.

Thanks for discussing...

So yeah that makes using the WAV approach a pain... probably not efficient, or certainly less appealing than it was.

The split node (and I think file in and csv nodes) add a msg.parts property that contains things like the index of that chunk to help when putting back together later... if you are slicing things up it may be useful to follow that pattern.
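Something like this (just a rough function node sketch, with a placeholder stream id) is what following that pattern might look like for audio chunks:

    // Function node sketch: tag each outgoing audio chunk with msg.parts, following
    // the split node convention (id / index / count), so downstream nodes can
    // reassemble the stream later.
    var chunkIndex = context.get('chunkIndex') || 0;

    msg.parts = {
        id: "audio-stream-1", // same (placeholder) id for every chunk of one stream
        index: chunkIndex     // position of this chunk in the stream
        // count is left out, since the total is unknown for a live stream
    };

    context.set('chunkIndex', chunkIndex + 1);
    return msg;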

Not sure why you say the data gets passed twice - I was thinking the audiobuffer would be in (say) msg.audio and ONLY the useful (to other nodes) metadata would be in msg.payload, and not the raw data, as it would by then have been converted to the audiobuffer.

Ah ok, I thought you meant both raw audio in msg.payload AND the audiobuffer in another msg property. With only one of those, there will indeed be no cloning overhead...

well - if you fork the flow then it will always clone one of them - (but yeah - better than duplicating everything as well)

I am very far from being even a novice at buffers. However, this has me thinking that Node-RED as it stands may not be very good at handling buffers because of its focus on msgs?

So that had me thinking whether there might not be a better way to integrate them as streams? NodeJS is good at handling streams - I think?

So I wonder - pure speculation on my part, I've no idea whether this is at all feasible - whether it wouldn't be better to have some stream processing nodes that didn't use the normal msg passing but used - for example - PIPES? They should, I think, be much better at handling streams.

I know that would introduce a secondary communication method but I think that it should then be expandable to other stream processing which would make Node-RED a lot more useful for handling high-volume data throughput.

Anyway, a random thought that had been swirling through my noggin for a while so I thought I'd put it out there for someone to shout down. Or not.


Hi mister Knight,

As long as you don't fork the flow, I don't think there is an issue. As soon as you add a fork, the messages will be cloned. But that is not necessarily a bad thing, if you want to do something else with the cloned data. E.g. one audio stream needs to be amplified, and the other one doesn't.

But for high data volumes we need a single chain of nodes, without forks. That means every node should pass its input messages on to its output. For example, I need to create an AudioAnalyzer node that calculates the frequency spectrum. It should APPEND this data in a msg.spectrum field of the original input message, and pass that UPDATED message on its output port. This way we can avoid a lot of forks.

Did I forget something?
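Something like this is what I have in mind for such a chained node (untested; calculateSpectrum is just a placeholder for the real analysis):

    // Sketch of the append-and-pass-on pattern: the node adds its result to the
    // message instead of replacing the payload, so no fork is needed in the flow.
    node.on("input", function(msg) {
        var audioBuffer = msg.payload; // the audio stays untouched in the payload

        // Placeholder for the real frequency analysis (e.g. an FFT over the samples)
        msg.spectrum = calculateSpectrum(audioBuffer);

        node.send(msg); // pass the UPDATED message downstream
    });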

Bart, don't worry. Julian is talking about a whole different paradigm for the flow within Node-RED. If all inputs and outputs were node.js streams then they could be wired together and data would just flow... hopefully :slight_smile: In practice it makes more sense when you are writing lines of code, as you can just "dot" them together JS-style - input.read.manipulate.write sort of thing - where you don't then have forks or cloning going on.

Anyway - the file in and out nodes do indeed try to handle streams - and what we found is that even on an incoming stream, node.js effectively chunks it up into pieces to hand to the next thing. The chunk size depends on the operating system. The net result is that once chunked we may as well just pass it on as a chunk anyway (which is what the file in node does when set to send chunks) - so meh... not going to worry about it all too much at this point.


Cool. And interesting. Thanks for explaining. One of many corners of NodeJS I don't understand very well.

This is on a slightly different topic, but I ran across this thread when looking for people who had tried what I was about to try: node streams in Node-RED. I didn't really find anything more on it, so I've done my own thing; I've just published my first nodes so I guess I'm now in very early beta, but I'd love some input!

The existing project is gulpetl, and we're adding Node-RED capabilities to a project already built around streams. It's for ETL rather than audio, so design choices are different, but we are running into the types of gotchas you touched upon like message cloning. So far we're sailing through!

Here's the Node-RED section of our docs, with Getting Started and what-not. I hope to get a brief tour finished up in the next couple of days.