[Announce] node-red-contrib-deepspeech-stt (beta)

There is a good German model for DeepSpeech available here:

Well the sky is the limit really :smirk:
This is just one component you would need in a Node-RED voice assistant, compared to voice2json, which offers everything in one package.
The thing is, I wanted more flexibility, and I wanted everything to be more native and more integrated with Node-RED than voice2json could ever be in the end. (Not to say that voice2json isn't a great piece of software, as it's totally awesome; I just like it the hard way.)
So I looked at the voice assistant pipeline flow chart that I made for voice2json


and decided to develop the individual components needed as Node-RED nodes and/or subflows, or to contribute to existing nodes (for example node-red-contrib-fuzzywuzzy) to make them fit into my grand scheme of building a nearly native voice assistant with Node-RED/Node.js.
You can see all I have done in that direction here:

https://flows.nodered.org/collection/Qn4a6AEtnjAw

So now I have a completely Node-RED-based voice assistant toolkit.
If I find the time I will actually write up a tutorial, based on a simple example, on how to use all those tools to build one :see_no_evil:

DeepSpeech fills the STT/ASR part of that toolkit. I like DeepSpeech because it's much simpler to train a domain-specific language model and add new vocabulary to it than it is to do the same for Kaldi/Vosk. (External scorer scripts — Mozilla DeepSpeech 0.9.3 documentation)
It also offers a native Node.js API with streaming support, so no more Python hacks, as I really don't like Python.
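For anyone curious, here is a minimal sketch of what that streaming API looks like with the `deepspeech` npm package (0.9.x); the model/scorer file names are placeholders for files you would provide, and the callback names are just illustrative:

```javascript
// Sketch of the DeepSpeech Node.js streaming API (deepspeech 0.9.x).
// Model and scorer paths below are placeholders.
const DeepSpeech = require('deepspeech');

const model = new DeepSpeech.Model('deepspeech-0.9.3-models.pbmm');
model.enableExternalScorer('deepspeech-0.9.3-models.scorer');

// A stream lets you feed audio chunks as they arrive,
// e.g. from a microphone node, instead of a whole recording at once.
const stream = model.createStream();

// Call this for each incoming chunk of 16 kHz, 16-bit mono PCM audio.
function onAudioChunk(chunk /* Buffer */) {
  stream.feedAudioContent(chunk);
}

// Call this when the utterance ends; finishStream() runs the final
// decode, frees the stream, and returns the transcript.
function onUtteranceEnd() {
  return stream.finishStream();
}
```

This chunk-by-chunk feeding is exactly what makes it easy to wire into a flow, since a Node-RED node can just push each incoming audio buffer into the stream as messages arrive.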
But keeping all this in mind, I will actually cease development on the DeepSpeech node in the future, as Mozilla has pretty much shelved the project and the outlook for future development is bleak.
Fortunately that is not the end of the story, as most of the original developers forked DeepSpeech and are continuing development on the fork :raised_hands:
This fork is called Coqui and can be found here:

or here

and I already have the node ready that will work as a drop-in replacement for the DeepSpeech node:

It's not published yet, as npm support for arm64 and armhf (so Raspberry Pis) is still missing, but as soon as that arrives, which should be soon, I will publish the Coqui nodes and they will take the place of the DeepSpeech ones.
The models used for DeepSpeech and any scorers you train are compatible between the two.

I hope this sheds some light on my motivations,
Johannes
