[Announce] node-red-contrib-deepspeech-stt (beta)

Hello,
I'd like to announce the first beta commit for node-red-contrib-deepspeech-stt:

This node uses the official deepspeech Node.js CPU client implementation, so you just need to install the node from your Node-RED folder (normally ~/.node-red) with

npm install johanneskropf/node-red-contrib-deepspeech-stt

and deepspeech will be automatically installed as a dependency.
The node uses deepspeech 0.9.3 or later. To do speech-to-text inference you need to download a model (tflite) and a scorer file. For example, the official English or Chinese model can be found on the release page.
You need to enter the path to both the model and the scorer in the node's config.
To do inference, send a wav buffer (16000 Hz, 16-bit, mono) to the node's input in the configured msg input property.
You will receive the transcription, input length and inference time as an object in msg.payload or in your configured output property.
If you want more accurate and quicker transcriptions of a limited set of vocabulary and sentences, you will need to train your own scorer file. Documentation on how to do this can be found in the deepspeech readme.
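
As a rough sketch of the input format mentioned above (16000 Hz, 16-bit, mono), a wav buffer can be checked before it is sent to the node. The helper name `inspectWav` is hypothetical and not part of node-red-contrib-deepspeech-stt, and it assumes a standard RIFF/WAVE buffer with a PCM "fmt " chunk; whether the node does a similar check itself is not stated here.

```javascript
// Hypothetical helper, not part of node-red-contrib-deepspeech-stt.
// Walks the RIFF chunks of a wav buffer, reads the "fmt " chunk and reports
// whether the audio is 16000 Hz, 16-bit, mono PCM.
function inspectWav(buffer) {
    if (!Buffer.isBuffer(buffer) || buffer.length < 12 ||
        buffer.toString("ascii", 0, 4) !== "RIFF" ||
        buffer.toString("ascii", 8, 12) !== "WAVE") {
        return { ok: false, reason: "not a RIFF/WAVE buffer" };
    }
    let offset = 12; // first chunk starts right after "RIFF<size>WAVE"
    while (offset + 8 <= buffer.length) {
        const id = buffer.toString("ascii", offset, offset + 4);
        const size = buffer.readUInt32LE(offset + 4);
        if (id === "fmt ") {
            if (size < 16 || offset + 8 + 16 > buffer.length) {
                return { ok: false, reason: "truncated fmt chunk" };
            }
            const format = buffer.readUInt16LE(offset + 8);   // 1 = PCM
            const channels = buffer.readUInt16LE(offset + 10);
            const sampleRate = buffer.readUInt32LE(offset + 12);
            const bitsPerSample = buffer.readUInt16LE(offset + 22);
            const ok = format === 1 && channels === 1 &&
                sampleRate === 16000 && bitsPerSample === 16;
            return {
                ok, format, channels, sampleRate, bitsPerSample,
                reason: ok ? "" : "expected 16000 Hz, 16-bit, mono PCM"
            };
        }
        offset += 8 + size + (size % 2); // chunks are padded to even length
    }
    return { ok: false, reason: "no fmt chunk found" };
}

// Possible use inside a function node placed in front of the stt node
// (assumes the wav buffer arrives in msg.payload):
//   const check = inspectWav(msg.payload);
//   if (!check.ok) { node.warn(check.reason); return null; }
//   return msg;
```

Audio that fails the check would need to be resampled or converted before inference; that step is outside the scope of this sketch.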

Johannes

Interesting... my old friend strikes back again :wink:
Would you recommend using DS to replace voice2json? As far as I understand it, you can use it to recognize free speech and not only predefined stuff?

Btw, why does this topic have so few hits? Everybody was looking for a solution to replace Alexa. Are there any limits I missed?

For me, unfortunately, the limit is the missing Dutch language support...

Dutch people are good at English and German. Have you tried German? (That is what I would need.) :slight_smile:

There is a good German model for deepspeech available here:

Well, the sky is the limit really :smirk:
This is just one component you would need in a Node-RED voice assistant, compared to voice2json, which offers everything in one package.
The thing is, I wanted more flexibility and wanted everything to be more native and Node-RED integrated than voice2json could ever be in the end. (Not to say that voice2json is not a great piece of software, as it's totally awesome, but I just like it the hard way.)
So I looked at my awesome voice assistant pipeline flow chart that I made for voice2json and decided to develop the individual components needed as Node-RED nodes and/or subflows, or to contribute to existing nodes like node-red-contrib-fuzzywuzzy, to make them fit into my grand scheme of building a nearly native voice assistant with Node-RED / Node.js.
You can see all I have done in that direction here:

https://flows.nodered.org/collection/Qn4a6AEtnjAw

So now I have a completely Node-RED based voice assistant toolkit.
If I find the time I will actually write up a tutorial, based on a simple example, on how to use all those tools to build one :see_no_evil:

Deepspeech fills the STT/ASR part in that toolkit. I like deepspeech because it's much simpler to train a domain-specific language model and add new vocabulary to it than it is to do the same for kaldi/vosk. (See "External scorer scripts" in the Mozilla DeepSpeech 0.9.3 documentation.)
It also offers a native Node.js API with streaming support. So no more Python hacks, as I really don't like Python.
But keeping all this in mind, I will actually cease development on the deepspeech node in the future, as Mozilla has pretty much shelved the project and the outlook for future development is bleak.
But fortunately that is not the end of the story as most of the original developers forked deepspeech and are continuing development on the fork :raised_hands:
This fork is called coqui and can be found here:

or here

and I already have the node which will work as a drop in replacement for the deepspeech node ready:

It's not published yet, as the npm support for arm64 and armhf (so Raspberry Pis) is missing right now, but as soon as that arrives, which should be soon, I will publish the coqui nodes and they will take the place of deepspeech.
The models used for deepspeech and any scorers you train are compatible between the two.

I hope this sheds some light on my motivations, Johannes

Hello Johannes!
Many thanks for your detailed report. And glad to see someone who dislikes Python like me :wink:
If all things with coqui and your great work come together as planned, I am getting really excited about the new upcoming possibilities, Jens.

You should experiment with using Dutch in your own home automation. I'm pretty sure your wife and children would ask you very kindly to activate German again as soon as possible :slight_smile:
But now we are too much off-topic ..

I think we will have to wait for this super duper DS fork with Pi support and then continue the language discussion. :wink:

Deepspeech already has Pi support; it's only coqui that's missing it, and the deepspeech and coqui nodes and models are interchangeable at this point in development.
So if you want to play, go ahead and install the deepspeech nodes and try them on a Pi, because as soon as I release the coqui nodes they will work as a nearly identical drop-in replacement.

Johannes

Hello Johannes,

Did you notice / test this:


source: Bug: Node version - npm install, empty index.js in node_modules/stt · Issue #1830 · coqui-ai/STT · GitHub

Does your coqui-stt (GitHub - johanneskropf/node-red-contrib-coqui-stt: a node-red node to perform speech to text inference using coqui stt) work with that?
I am still a bit lost on all the stuff :slight_smile:

Greetinx

Hello,
This is not a problem anymore, and both the deepspeech and coqui nodes now work on the Raspberry Pi without issues.
I actually published the coqui nodes last month:

which means that my Deepspeech nodes will probably not see any further development at this point as I will focus on the Coqui branch.

Johannes
