I started experimenting with audio for this discussion, and now I need to create some extra audio-nodes (e.g. an audio analyser, a volume amplifier ...). However there are multiple ways to pass audio samples between those nodes, and I have no clue which is the best one. The gates have been opened for voting
A) What is audio
Raw audio (PCM) is nothing more than chunks of bytes, representing a series of audio samples (see here for some basic explanation). However, some EXTRA information needs to be added to describe the content of those bytes:
- Number of channels: the bytes can come from a single microphone, but also e.g. from two microphones (left sample / right sample / left sample / ...).
- Number of bits: each sample consists of a number of bits (8, 16, 24, 32 ...).
- Sample rate: since each sample contains only an (amplitude) value and no timestamp, we need to indicate how many samples are generated per second.
The raw audio bytes can only be interpreted correctly by a receiver (node) if that receiver is aware of this information.
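To illustrate (a hypothetical sketch, not taken from any of the nodes mentioned here): the same raw Buffer only becomes meaningful once you assume a bit depth, channel count and sample rate. Here the bytes are read as 16-bit little-endian interleaved stereo:

```javascript
// Hypothetical sketch: interpret a raw PCM chunk as 16-bit little-endian
// interleaved stereo. The receiver MUST know the bit depth, channel count
// and sample rate up front - none of that is stored in the bytes themselves.
const bitDepth   = 16;      // assumed, not discoverable from the buffer
const channels   = 2;       // assumed: interleaved left/right samples
const sampleRate = 44100;   // assumed: 44.1 kHz (only needed for timing)

// Two interleaved frames: L=1000, R=-1000, L=2000, R=-2000
const chunk = Buffer.alloc(8);
chunk.writeInt16LE(1000, 0);
chunk.writeInt16LE(-1000, 2);
chunk.writeInt16LE(2000, 4);
chunk.writeInt16LE(-2000, 6);

const bytesPerSample = bitDepth / 8;
const frames = [];
for (let i = 0; i < chunk.length; i += bytesPerSample * channels) {
  frames.push({
    left:  chunk.readInt16LE(i),
    right: chunk.readInt16LE(i + bytesPerSample)
  });
}
console.log(frames); // [{left:1000,right:-1000},{left:2000,right:-2000}]
```

Read the same 8 bytes as 8-bit mono instead, and you get 8 completely different samples, which is exactly why the extra information is needed.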
B) How to pass audio in a flow
It seems to me that audio can be passed between nodes (as messages) in the following ways:
1) Only pass the raw audio bytes in msg.payload. For example the node-red-contrib-micropi nodes generate output messages that contain ONLY raw audio (PCM) samples:
However e.g. the dashboard is not able to play those audio samples, since the browser doesn't know how to interpret those bytes. So I needed to implement an extra node-red-contrib-wav-headers node to add this information:
The disadvantage is that the user needs to enter the SAME information on every node's config screen, to allow each node to interpret the bytes. So I wouldn't go this way ... I think it would be better if e.g. the MicroPi nodes would also send the other information in their output messages.
2) Pass raw audio bytes in msg.payload and add 3 extra fields to the message.
This is probably the most Node-RED-like way to go ... But e.g. the audio-out node currently doesn't get any information like this from the input message.
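A message for option 2 could look like this (the property names sampleRate, channels and bitDepth are my own suggestion, not an existing Node-RED convention):

```javascript
// Sketch of option 2: raw PCM in msg.payload, plus three descriptive fields.
// The property names (sampleRate, channels, bitDepth) are just a suggestion -
// there is no agreed Node-RED audio convention for them yet.
function buildAudioMsg(pcmBuffer) {
  return {
    payload:    pcmBuffer,  // raw PCM samples, nothing else
    sampleRate: 44100,      // samples per second per channel
    channels:   1,          // mono
    bitDepth:   16          // bits per sample
  };
}

// 0.1 second of 16-bit mono silence at 44.1 kHz = 4410 samples * 2 bytes
const msg = buildAudioMsg(Buffer.alloc(4410 * 2));
console.log(msg.sampleRate, msg.channels, msg.bitDepth); // 44100 1 16
```

Any downstream node could then read those three fields from the message instead of asking the user to repeat them in its config screen.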
3) Pass raw audio bytes and the 3 extra fields together in msg.payload (as WAV). WAV is a well-known audio container, which is in fact a series of headers (44 bytes in total for the canonical format) followed by the raw audio bytes.
So in fact the msg.payload contains a single Buffer, which contains all the information (headers and raw audio samples). And all nodes can easily get the required information from the headers, using e.g. the audio-buffer-from npm module. The advantage is that you can do all kinds of things with WAV: store the chunks as .wav files, pass the wav files directly to the browser (dashboard) for playing, ... But the disadvantage is that you cannot see the header information e.g. in the debug panel. So I would propose to use this wav-headers node only when you really need a WAV.
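Prepending the canonical 44-byte header could be sketched like this (a minimal PCM-only sketch; the real node-red-contrib-wav-headers node may do more):

```javascript
// Minimal sketch: prepend the canonical 44-byte WAV header to raw PCM.
// Only handles uncompressed PCM (WAV format code 1).
function addWavHeader(pcm, sampleRate, channels, bitDepth) {
  const bytesPerSample = bitDepth / 8;
  const header = Buffer.alloc(44);
  header.write('RIFF', 0);
  header.writeUInt32LE(36 + pcm.length, 4);        // file size minus 8
  header.write('WAVE', 8);
  header.write('fmt ', 12);
  header.writeUInt32LE(16, 16);                    // fmt chunk size
  header.writeUInt16LE(1, 20);                     // audio format: 1 = PCM
  header.writeUInt16LE(channels, 22);
  header.writeUInt32LE(sampleRate, 24);
  header.writeUInt32LE(sampleRate * channels * bytesPerSample, 28); // byte rate
  header.writeUInt16LE(channels * bytesPerSample, 32);              // block align
  header.writeUInt16LE(bitDepth, 34);
  header.write('data', 36);
  header.writeUInt32LE(pcm.length, 40);            // raw data size
  return Buffer.concat([header, pcm]);
}

const wav = addWavHeader(Buffer.alloc(100), 44100, 1, 16);
console.log(wav.length, wav.toString('ascii', 0, 4)); // 144 'RIFF'
```

Note that the three extra fields from option 2 now travel inside the payload itself, which is why a browser can play the result directly.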
4) Pass an AudioBuffer instance in msg.payload. With the previous options, every node that wants to manipulate the audio would first have to:
- convert the Buffer to an AudioBuffer
- use the AudioBuffer to manipulate the audio chunk
- convert the AudioBuffer back to a Buffer
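That round-trip can be sketched without the real AudioBuffer class (which belongs to the browser's Web Audio API); here a plain Float32Array stands in for one mono channel, since Float32 samples in [-1, 1] are what an AudioBuffer stores internally:

```javascript
// Sketch of the Buffer <-> AudioBuffer round-trip that every node would
// repeat. A Float32Array stands in for one mono channel of an AudioBuffer.
function int16ToFloat32(pcm) {
  const out = new Float32Array(pcm.length / 2);
  for (let i = 0; i < out.length; i++) {
    out[i] = pcm.readInt16LE(i * 2) / 32768;        // scale to [-1, 1)
  }
  return out;
}

function float32ToInt16(samples) {
  const out = Buffer.alloc(samples.length * 2);
  for (let i = 0; i < samples.length; i++) {
    const clamped = Math.max(-1, Math.min(1, samples[i]));
    out.writeInt16LE(Math.round(clamped * 32767), i * 2);
  }
  return out;
}

// Manipulate: halve the volume, then convert back. Two full copies of the
// audio data, just to perform one simple operation on the samples.
const pcm = Buffer.alloc(4);
pcm.writeInt16LE(16384, 0);
pcm.writeInt16LE(-16384, 2);
const floats  = int16ToFloat32(pcm);
const quieter = floats.map(s => s / 2);
const back    = float32ToInt16(quieter);
console.log(back.readInt16LE(0), back.readInt16LE(2)); // 8192 -8192
```

Every node in the flow would redo both conversions, which is exactly the overhead described below.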
This will result in worse performance, due to a lot of useless buffer conversions. Therefore it would be better to create an AudioBuffer ONCE and PASS the AudioBuffer instance in the msg.payload:
This offers a lot of functionality with as little overhead as possible. However, when you look at the message in the debug panel, you won't see any information:
I'm not sure whether it is good practice to transfer such instances through a Node-RED flow? I haven't tested it yet, but if I were to use multiple wires, the instance would be cloned by Node-RED. Perhaps that could result in extra problems (e.g. when deep cloning is required)?
I think I would vote for option 2, and use option 3 only when e.g. the audio chunk needs to be stored as a WAV file. I would love to have the AudioBuffer from option 4, but I don't know how to integrate that into Node-RED ...