[Announce] node-red-contrib-voice2json (beta)

After many late nights by @BartButenaers and me, I'm happy to announce the beta of our nodes to integrate voice2json into Node-RED.

The included nodes make it possible to bootstrap a (nearly) complete voice command application with Node-RED and voice2json on a Linux device. This works for many languages, because voice2json supports them through easily downloadable profiles.
node-red-contrib-voice2json includes:

  • wait-wake, a node that listens to a stream of raw audio buffers and detects a wake word using the Precise wake word listener that is integrated in voice2json
  • record-command, a node that records a voice command from a stream of raw audio buffers, uses WebRTC VAD to detect when the speaker has finished, and emits only the speech part of the recording as a WAV buffer
  • stt (speech to text), a node that transcribes audio commands to text based on the sentences and rules defined in a simplified JSGF grammar in the node's config node
  • tti (text to intent), a node that parses text and uses a basic form of NLU to find the intent and entities, which are also defined in the sentences in the node's config
  • training, a node that trains a voice2json language profile with the sentences from the node's config
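
To make the record-command idea a bit more concrete, here is a minimal Python sketch of trimming a raw recording down to its speech part. It is not the node's implementation: the node uses WebRTC VAD, while this stand-in uses a plain energy (RMS) threshold. The audio format (16 kHz, 16-bit, mono, little-endian), the 30 ms frame size, the threshold of 500, and the names frame_rms and trim_to_speech are all assumptions made for illustration.

```python
import math
import struct

SAMPLE_RATE = 16000   # assumption: 16 kHz, 16-bit, mono, little-endian PCM
FRAME_MS = 30         # assumption: analyse the audio in 30 ms frames
FRAME_BYTES = SAMPLE_RATE * FRAME_MS // 1000 * 2  # 2 bytes per sample


def frame_rms(frame: bytes) -> float:
    """Root-mean-square amplitude of one frame of 16-bit samples."""
    count = len(frame) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack(f"<{count}h", frame[: count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


def trim_to_speech(pcm: bytes, threshold: float = 500.0) -> bytes:
    """Keep everything from the first to the last frame above the threshold.

    Hypothetical helper: a crude energy gate, not WebRTC VAD.
    """
    frames = [
        pcm[i : i + FRAME_BYTES]
        for i in range(0, len(pcm) - FRAME_BYTES + 1, FRAME_BYTES)
    ]
    voiced = [i for i, frame in enumerate(frames) if frame_rms(frame) >= threshold]
    if not voiced:
        return b""  # nothing loud enough to count as speech
    return b"".join(frames[voiced[0] : voiced[-1] + 1])
```

A real voice activity detector is far more robust against background noise than this energy gate, which is one reason to let the node handle it.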

Here it is in action:

We wrote a little bit :wink: of documentation to help those brave enough to try it get started; you can find it in our readme. For further reading on the inner workings, I highly recommend this whitepaper about the project by the voice2json developer.

I hope some of you will find this suite of nodes useful and can help us find all the little bugs :spider: in it that still need to be fixed.

Best regards from @BartButenaers and me

Hi folks,
Shortly after I started using Node-RED I thought: how cool would it be to control Node-RED with voice commands, without needing cloud services (from Google, Amazon...). And now, thanks to all the digging from Jonathan, it is possible to run this locally on a Raspberry Pi. Glad this one is finally off my todo list :wink:

Thank you both for your hard work.

I will need to set up a separate Pi and test this out next week. Looking forward to trying it!

Thank you @JGKK for your great work! It's high on my todo list. Like Bart said: it's a little dream to get this to work without placing a spy from Google / Amazon... in our rooms!

What kind of microphones do you use? How far away from the microphone can you be?

Looks really interesting!!

This really depends on the microphone you use.

There are many options, but some of the best value for money are the ReSpeaker 2-Mic or 4-Mic Pi HATs: they work well up to 3-4 meters and have LEDs on top that can be controlled quite easily from a Python script. Some of the cheap USB conference mics work fairly well too, but often don't have the best signal-to-noise ratio.
I personally use the ReSpeaker USB Mic Array v2, which is a lot more expensive (around 65€) but has built-in audio preprocessing and things like that. That one works really well up to 5 meters.
I would say if you have a Pi and want to get started, get one of the ReSpeaker 2-Mic Pi HATs, as they can be had for under 10€.

Thanks for your answer. I was asking because I believe the success of your nodes, if used as a replacement for the typical devices from the big companies, crucially depends on the hardware's ability to reliably capture speech from a long enough distance. I will check out the mic array.
Do you use its built-in audio preprocessing?

I hope that the more people use the nodes, the more community knowledge about hardware options will grow.

Yes, you can use a 1-channel firmware on the Mic Array v2 that gives you one output channel optimized for voice recognition applications. It’s by far the cleanest audio I have ever had from any of the microphones.

Yes, I hope so too. For now you can head over to the Rhasspy forum, as Rhasspy is the big brother of voice2json by the same developer. There are many helpful people over there when it comes to hardware questions. ReSpeaker by Seeed Studio is, in my opinion, one of the best options right now because their microphones are widely available and their drivers are quite solid by now (the USB mic array only needs drivers for the LEDs).

Last week we also experimented with the node-red-contrib-ui-microphone node from Nick and Dave, because it would be nice to use the microphone of the device running the Node-RED dashboard. However, we haven't yet managed to get a WAV buffer with sufficient sound quality for good recognition results :woozy_face:

It’s not really even the sound quality; something strange is going on with the audio. It sounds too low, as if the pitch was shifted by some funky conversion :thinking:
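
One thing that can produce exactly this symptom is a sample-rate mismatch: audio captured at 48 kHz but interpreted as 16 kHz plays back three times slower and correspondingly lower in pitch. Whether that is what happens here is only a guess, not a diagnosis. A quick way to rule it out is to look at what the WAV header actually claims. The sketch below uses Python's standard wave module; describe_wav is a hypothetical helper name.

```python
import io
import wave


def describe_wav(wav_bytes: bytes) -> dict:
    """Report the format a WAV buffer declares in its header.

    Hypothetical helper for debugging. The wave module only reads
    uncompressed PCM and raises wave.Error for other encodings.
    """
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": wav.getframerate(),
            "duration_s": wav.getnframes() / wav.getframerate(),
        }
```

If the reported duration does not match how long the recording really took, the header and the samples disagree about the sample rate.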

It all looks quite complex to me, but interesting. Would one be able to feed an audio file as input instead of a microphone? It could be interesting to send audio from some device over MQTT and process it in these nodes.

Install voice2json on the same machine as nodered

Is this required?


This tiny flow chart I made pretty much summarizes all the ways you can pass audio to the nodes, in which formats, and at which points.
Yes. For example, I run wake word and record command on one Pi and then send the WAV buffer over MQTT to my base for STT. I also have a Siri shortcut on my phone that records audio and sends it to Node-RED via an HTTP request, where it gets converted to the right format and fed to the stt node. It's all in the documentation :wink: but yes, you can use any of the nodes by themselves.
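
For anyone preparing audio outside Node-RED before sending it in, here is a minimal Python sketch of wrapping raw PCM samples in a WAV container using only the standard library. The defaults (16 kHz, 16-bit, mono) are an assumption about what the stt node wants, so check the readme for the exact format. pcm_to_wav is a hypothetical helper name, and the MQTT or HTTP transport itself is left out.

```python
import io
import wave


def pcm_to_wav(
    pcm: bytes,
    sample_rate: int = 16000,  # assumption: 16 kHz
    channels: int = 1,         # assumption: mono
    sample_width: int = 2,     # assumption: 16-bit samples
) -> bytes:
    """Wrap raw PCM samples in a WAV header and return the whole buffer."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()
```

Note that this only adds a header. It does not resample, so the parameters passed in have to match how the samples were actually recorded.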

Yes, because all the processing happens in voice2json and not in Node-RED. Node-RED just provides the interface to easily tie the components together, so the nodes depend on a local voice2json install. Fortunately, voice2json is very straightforward to install. It's really not that complicated, and Bart is working on an easy step-by-step tutorial right now to give people a simple starting point.

I am using Rhasspy running on a separate RPi 3+ integrated with Node-RED through MQTT, which then drives an automation controller.

I have used both the ReSpeaker Mic 2 and Mic 4 arrays. The 2 works OK up to about 2 m away, but is not very immune to other sounds around it. The Mic 4 has better discrimination, and is currently just over 4 m from where we sit.

If the music/TV/Film is not too loud (as in a liveable volume rather than 'listening' volume :rofl:), it will pick up commands OK.

The ReSpeaker Mic 2 will go onto the Node-RED unit I will test this node on.

Then you know what to expect, as the underlying libraries and the sentence syntax are pretty much the same as in Rhasspy 2.5, just a lot more modular. I hope it all works well, as I kept Mike very busy while beta testing this release of voice2json and snuck in a feature or two and a few pull requests just to make the nodes work better :shushing_face:

I think Mike and the team have done a great job with 2.5.

Sure made me look more into MQTT instead of using Web Sockets. Glad I did.

I can't speak for Rhasspy, but I'm very happy with how voice2json 2.0 turned out, with features like the Precise integration and the slot programs. I'm in close contact with Mike, and these nodes will probably become part of the official voice2json documentation sometime in the future, once there is a non-beta release.

This is fantastic - kudos to both Johannes & Bart! In under 30 minutes I was able to get a local voice agent added to my home & work demo setup. Currently running on a Pi 3+, but have already done a quick test on my 4 - noticeably faster performance so will be redeploying there. Of course I spent another hour playing with custom voice wake words -- but who wouldn't? Thank you to you both!

Thank you :pray:
Yes, the Pi 4 is noticeably faster, but I'm still surprised how usable a Pi 3B is.
There is one caveat right now when using Docker. I just found a bug: when you try to restart the wait-wake node, it hangs and leaves orphaned processes behind that can only be killed manually from htop.
So, PSA:
Don't use wait-wake with Docker installs right now. If you do, be aware that you can't properly stop the process. The same is true when you redeploy or restart the flows or stop Node-RED: it will work, but you will have to kill the leftover process manually from htop.
If possible, just use the deb package, as it has none of these problems.

I got stuck on this step during the voice2json installation:
" * Download a profile and extract it to $HOME/.config/voice2json"

After I installed the deb package, none of these directories exist on my Raspberry Pi.
The voice2json CLI seems to work... or not?

pi@raspi4B:/usr/lib/voice2json/voice2json $ voice2json test-examples
WARNING:voice2json:/home/pi/.config/voice2json/profile.yml does not exist. Using default settings.
CRITICAL:voice2json.core:Missing /home/pi/.config/voice2json/intent.pickle.gz. Did you forget to run train-profile?
Traceback (most recent call last):
  File "__main__.py", line 6, in <module>
  File "asyncio/runners.py", line 43, in run
  File "asyncio/base_events.py", line 587, in run_until_complete
  File "voice2json/__main__.py", line 73, in main
  File "voice2json/test.py", line 27, in test_examples
AssertionError: Not trained
[13649] Failed to execute script __main__
pi@raspi4B:/usr/lib/voice2json/voice2json $
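
Going by that log, the CLI itself starts fine but finds no profile: profile.yml is missing, and so is the intent.pickle.gz that training would produce (the CRITICAL line points at train-profile). Here is a small Python sketch for checking a profile directory before running anything. The two file names are taken from the log above; check_profile is a hypothetical helper and not part of voice2json.

```python
from pathlib import Path
from typing import List


def check_profile(profile_dir: Path) -> List[str]:
    """Return a list of human-readable problems with a voice2json profile dir.

    Hypothetical helper: it only checks the two files named in the log,
    not everything a real profile contains.
    """
    if not profile_dir.is_dir():
        return [f"{profile_dir} does not exist: download a profile and extract it there"]
    problems = []
    if not (profile_dir / "profile.yml").is_file():
        problems.append("profile.yml is missing: the profile was not extracted here")
    if not (profile_dir / "intent.pickle.gz").is_file():
        problems.append("intent.pickle.gz is missing: run 'voice2json train-profile'")
    return problems
```

Calling check_profile(Path.home() / ".config" / "voice2json") on the Pi from the log should report that the directory does not exist. An empty list means both files are in place.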