Streaming live audio data via MQTT

Hello,

I am trying to stream live audio via MQTT from Node-RED to a Raspberry Pi.

To test the microphone and speaker, I connected them through an "Add WAV headers" node (like the image below), and the output sound is quite clear.
[image]

I also tested streaming live audio via MQTT between two Raspberry Pi Zeros. The output sound is quite clear too.

But when I try to send and receive audio data via MQTT between Node-RED and the Raspberry Pi, as in the two pictures below, the audio output (on both the PC and the Raspberry Pi) is full of noise.
[image]
and
[image]

Could anyone tell me the reason why, and how I can solve this problem?

Best regards,

Nhien.

Welcome to the forum. This is the first time I have heard of someone trying to use MQTT for streaming real-time data. What is the reasoning behind using MQTT instead of a protocol designed specifically for real-time data (like RTSP)? Where is the MQTT broker located, and how is your MQTT set up (QoS)?

Hi Andrei,

Thank you so much for your reply.
I'm very new to streaming real-time data, and I found that MQTT lets me construct the network conveniently. I have never tried RTSP, but I will try it soon.

I set up another Raspberry Pi as the MQTT broker; it is also the machine where Node-RED is installed.
QoS for both the "MQTT in" and "MQTT out" nodes is 0.

Best regards.

Well, you could experiment with QoS 2 instead of 0 to see if it makes any difference. I am not optimistic, though. Your use case is somewhat similar to VoIP, in the sense that latency and jitter have a great impact on the quality of the results. There is no mechanism built into MQTT to help with jitter (delay variations). I wish I had some spare time to read some papers and essays from the experts on this subject. Your testing may provide insightful feedback to the community, so please keep going. One more question: which node is the purple one, named "Add WAV headers"? Perhaps something from Home Assistant?
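
For reference, here is a minimal sketch of how the QoS level is set when publishing from Python with the paho-mqtt client; the broker address, topic, and chunk contents below are placeholders, not taken from this thread:

```python
import paho.mqtt.client as mqtt

client = mqtt.Client()
client.connect("broker.local", 1883)    # hypothetical broker address
client.loop_start()

chunk = b"\x00" * 1024                  # placeholder raw PCM chunk

# qos=0 is fire-and-forget; qos=2 is exactly-once, at the cost of a
# four-way handshake per message (extra latency for every audio chunk)
client.publish("audio/stream", payload=chunk, qos=2)
```

Note that QoS only changes delivery guarantees, not timing, so it cannot remove jitter; the extra handshakes of QoS 2 may even add latency.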

Thanks for your reply!

I also tried QoS 2, but nothing changed.

"Add WAV headers" node adds WAV headers to raw audio data for playing with "audio out" node.
I tested "Add WAV headers" node , and it ran corectly.

Thanks for your support, and best regards!

In that case, the data, or the format of the data, coming from MQTT may be incorrect.

What does the payload of a working audio clip look like vs the payload you get from MQTT?

Check your MQTT input node settings. I've run into an issue where it defaults to "String" instead of "Auto". Changing it to "Buffer" fixed the problem I was having.

Thanks for your reply!
The data from MQTT looks like the image below. The data type is "byte".
[image]

It is raw audio data, so I think I have to add a WAV header for the "audio out" node to play it.

Best regards.

Thanks for your reply!

I tried changing the MQTT input node settings as you recommended, but the audio quality has not changed.
I can hear what the other side says, but the sound is clipping. After checking the MQTT data that I send from the Raspberry Pi Zero, I found that sometimes the data is full of zero bytes. Maybe that is the reason for the sound clipping.

I'm going to check the code on the Raspberry Pi Zero.
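
To verify that observation, a quick diagnostic on the subscriber side could flag the all-zero chunks as they arrive. This is just a sketch; the broker address and topic are placeholders:

```python
import paho.mqtt.client as mqtt

def on_message(client, userdata, msg):
    # Flag chunks that contain only zero bytes (i.e. pure silence)
    if not any(msg.payload):
        print(f"all-zero chunk received, {len(msg.payload)} bytes")

client = mqtt.Client()
client.on_message = on_message
client.connect("broker.local", 1883)    # hypothetical broker address
client.subscribe("audio/stream")
client.loop_forever()
```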

Best regards.

Hi @nhiennguyenhuy,

I don't think your problem has anything to do with MQTT. The dashboard's audio-out node can simply only play a single audio buffer, for example an mp3 file. You push an infinite stream of audio fragments to it, which is not what this node is designed for...

I have tried in the past to implement this feature, but it was just too hard in my limited spare time.

  • You can easily fool your eyes by sending separate images to it: your eyes will believe they see fluent video.
  • It is very hard to fool your ears: they will hear every abnormal transition from one segment to the next.

At one moment the audio chunks will arrive too soon, so some of them will be skipped by the audio node. At another moment the audio chunks will not arrive fast enough, which means there will be a gap of silence. In both cases you get steep edges as transitions between successive audio fragments. And that is what your ears hear.

The audio node plays a fragment immediately, as soon as you pass it one. I tried to add a buffer to it in the past (i.e. an array where N input fragments are stored), and then it was already much better: if the buffer is large enough, the node plays all the segments nicely one after the other, so there are no steep edges between the fragments.
However, the buffer also needs to be small, otherwise the delay becomes too noticeable, and then you get distortions again.
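
As an illustration of that buffering idea (not the actual dashboard code), a minimal jitter-buffer sketch in Python could look like this; the pre-buffer threshold is an arbitrary assumption:

```python
import queue
import time

PREBUFFER = 8          # chunks to accumulate before playback starts (assumed)
buf = queue.Queue()

def on_audio_chunk(chunk: bytes):
    """Called for every incoming audio chunk (e.g. from an MQTT callback)."""
    buf.put(chunk)

def playback_loop(play):
    """Pre-fill the buffer to absorb jitter, then drain it continuously."""
    while buf.qsize() < PREBUFFER:
        time.sleep(0.01)               # wait until the pre-buffer is filled
    while True:
        play(buf.get())                # blocks if the buffer runs dry
```

With 20 ms chunks, a threshold of 8 adds roughly 160 ms of latency: exactly the trade-off between smooth transitions and noticeable delay described above.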

Perhaps meanwhile new javascript libraries have been implemented that can solve this problem. Don't know...

When you search on this forum, you will quickly find that others have tried this before you (e.g. here)...

Bart

Hi @BartButenaers,

Thank you so much!
Your explanation is very clear and intelligible. I learned so much from it.

As I said above, I also tested streaming live audio via MQTT between two Raspberry Pi Zeros, and the output sound is quite clear.

In the picture below, where I send raw audio from the microphone (plugged into the machine where Node-RED is installed) to the Raspberry Pi Zero, I got the steep edges you described.
[image]

I mean that in the first case the Raspberry Pi Zero can reproduce the audio nicely, but in the second case it can't.
I think the way the audio is transferred is the same as what I am trying to do: raw data from the microphone is split into separate buffers (the buffer size can be set, as you said) before being sent over MQTT.
I can't understand what the difference is here.
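
For comparison, the Python capture side in the working case presumably amounts to something like this sketch using pyaudio and paho-mqtt (the device parameters, topic, and broker address are assumptions, not taken from the actual script):

```python
import pyaudio
import paho.mqtt.client as mqtt

CHUNK = 1024                            # frames per buffer (assumed)

pa = pyaudio.PyAudio()
stream = pa.open(format=pyaudio.paInt16, channels=1, rate=16000,
                 input=True, frames_per_buffer=CHUNK)

client = mqtt.Client()
client.connect("broker.local", 1883)    # hypothetical broker address
client.loop_start()

while True:
    data = stream.read(CHUNK, exception_on_overflow=False)
    client.publish("audio/stream", data, qos=0)
```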

Best regards,

Nhien.

This is something I cannot really understand. Also, what do you mean by "quite clear"? During my experiments I got quite a lot of irritating sounds. Based on my experiments there was no way to get a clear audio signal on my speakers this way.

The only solution (for the time being) was smoothing the transitions between (out-of-order) audio segments to get rid of the steep edges.

I don't know if it is possible to share a short audio sample here, so that we can hear the result of your test?

Although your question is very well documented, I don't understand the difference between the experiments:

  • A live audio stream via MQTT between two Raspberry Pi Zeros gives good quality.
  • A live audio stream between Node-RED and a Raspberry Pi is full of noise.

Isn't that the same? I assume you run Node-RED on both Raspberry Pi Zeros as well?
A quick sketch of both setups would probably help, so that I can understand which nodes are running where...

In both cases (PC and Raspberry Pi) you listen to the audio via your Node-RED dashboard? Do you mean that in the second case you run the browser installed on the Raspberry Pi? It would also help if you explain this in your sketch...

Can you explain a bit more why you think this could be the cause?
From your screenshots I assume that you use the microphone node in all your experiments to capture the voice? Or is that not correct? And is it this microphone node?

Note that you could perhaps use my node-red-contrib-msg-speed and node-red-contrib-msg-size nodes to calculate statistics about how many segments are travelling through your wires, and the sizes of those segments. If data is being chunked differently somewhere, you should see different statistics in your two experiments...
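
An equivalent measurement can also be done on the Python side of the link. This diagnostic sketch (broker address and topic are placeholders) prints the chunk rate and average chunk size once per second:

```python
import time
import paho.mqtt.client as mqtt

count, total, start = 0, 0, time.time()

def on_message(client, userdata, msg):
    global count, total, start
    count += 1
    total += len(msg.payload)
    if time.time() - start >= 1.0:      # report once per second
        print(f"{count} chunks/s, avg {total // count} bytes")
        count, total, start = 0, 0, time.time()

client = mqtt.Client()
client.on_message = on_message
client.connect("broker.local", 1883)    # hypothetical broker address
client.subscribe("audio/stream")
client.loop_forever()
```

Comparing these numbers between the two setups should reveal whether one side chunks the audio differently.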

To understand where the chunking could get messy, it would be very useful to see the entire trajectory (from microphone to speakers) in a single (manually drawn) sketch...

Thank you so much for your reply!

I'm so sorry for my unintelligible post.
The image below is the sketch of the first case, in which I get quite clear audio.


and

The image below is the sketch of the second case.


In both cases I use Python for processing the audio data on the Raspberry Pi Zero.

No, it isn't. I use this one for capturing audio:

microphone.json (518 Bytes)

Maybe the program reads the previous buffer too fast and no new buffer has arrived yet...

This is the first case's audio sample.
https://drive.google.com/file/d/1HxfDCtEb15dCDfe6XmM-VNAmW16I-qHo/view?usp=sharing

And this is the second case's audio sample.
https://drive.google.com/file/d/1HyIDSqBx8vLbaaLO3-1KJIlw9amWi6sh/view?usp=sharing

Once again, thank you for your advice.
Best regards,

Nhien.

OK, so in the second example, that's the difference. There you process the audio data with NR JavaScript.

Hi @krambriw ,

Thanks for your reply.

Exactly! And when I test like the example below (from the first post), the result is OK.

That means if I do everything in Python, or everything in NR JavaScript, it is OK.
But if, for example, I send the audio with NR and receive it with Python, it is not OK.
And that confuses me.

Thank you for explaining it so well!

Experiment 1

From this we can conclude that Python does it very well, both for capturing the audio from the USB microphone and for playing that audio on your speaker. So Python somehow reassembles the audio chunks fine. Is there perhaps a large delay? That would indicate that a sufficiently large buffer is used. If that is not the case, they must do some good processing of the chunks.
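
A plausible reason is visible in a typical Python playback loop: chunks are queued by the MQTT callback, and pyaudio's blocking write naturally paces and buffers playback. A minimal sketch under those assumptions (broker, topic, and audio format are placeholders):

```python
import queue
import pyaudio
import paho.mqtt.client as mqtt

buf = queue.Queue()

def on_message(client, userdata, msg):
    buf.put(msg.payload)                # queue chunks; never play in the callback

client = mqtt.Client()
client.on_message = on_message
client.connect("broker.local", 1883)    # hypothetical broker address
client.subscribe("audio/stream")
client.loop_start()

pa = pyaudio.PyAudio()
out = pa.open(format=pyaudio.paInt16, channels=1, rate=16000, output=True)

while True:
    out.write(buf.get())                # blocking write smooths the timing
```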

So we need to know which of those two parts (capturing or playing) fails in Node-RED, via two extra experiments.

Experiment 2

Perhaps you did this experiment already:

By doing this, we only replace the Python library (which captures the USB microphone audio) with the node-red-contrib-micropi node. If this works well, then we at least know that the micropi node is doing a good job.

Experiment 3

By doing this we only replace the Python library (which plays the audio on the speaker) with the ui-audio-out node. I don't expect this to give good audio quality, due to the web-audio issue that I described above...

Hi Bart,
Thank you for your time on my post!


I didn't use the "Add WAV headers" node for experiment 2; I send the raw audio data directly to the mqtt-out node.

I tested all the experiments above.
The quality order is Experiment 1 > Experiment 3 > Experiment 2.

I also tested this case, and the audio is very clear.


Does that mean anything? Does it mean the NR library is good enough for capturing and playing audio?

From your experiment (let's call it experiment 4) I conclude that the node-red-contrib-micropi node captures the audio very well, since it results in good audio after playback.
For some reason the dashboard audio-out ui node is able to play it rather well: I assume the chunks arrive in a pretty decent order, otherwise I cannot explain it. But that won't always be the case.

You could also do experiment 5:

[image]

Because it seems that MQTT introduces timing issues, which are just too much for the dashboard audio-out node to play correctly.

Experiment 1: only Python & MQTT involved, no distortion added, best quality nbr 1

Experiment 2: Input from mic captured by NR adding distortion, then transferred via MQTT, Python at receiving end not able to "repair" distorted sound, quality nbr 3

Experiment 3: Input from mic captured by Python, then transferred via MQTT, NR at the receiving end adding distortion, quality nbr 2

Maybe use Python on both ends before transferring to NR -> speaker

Hey Walter,
that could indeed be a quick workaround.
But if nobody digs into this problem, we will keep running into the same issue for eternity.
It would be nice if we could have a complete Node-RED solution...
Because there are enough use cases for this.
