[Announce]: node-red-contrib-nano-tts (alpha)

clickworkorange · 3 October 2019 16:39

I've been working for a while on a node that utilises the excellent and very compact SVOX Pico Text-to-Speech library to turn a Node-RED string into a WAV file or PCM buffer, or for immediate playback on the host machine. To communicate with the Pico TTS engine the node uses the nanotts C++ API wrapper written by Gregory Naughton, which is also what gives the node its name. The node can handle strings in US & UK English, German, French, Spanish and Italian, and the algorithm can be tweaked by supplying values for speed and pitch.

Some examples of how this can be used:

The strings sent to the node can include Pico TTS markup tags, including:

<pitch level="..."> ... </pitch>
Sets the pitch level for the enclosed block.
<speed level="..."> ... </speed>
Sets the speed level for the enclosed block.
<volume level="..."> ... </volume>
Sets the volume level for the enclosed block.
<break time="..."/>
Inserts a pause with the duration specified by the time parameter (e.g. "1s" or "1000ms").
<ignore> ... </ignore>
Completely ignores the enclosed block (it will not be read out).
<phoneme ph="..."/>
Provides a phonemic or phonetic pronunciation for a word to be inserted into the text in the place of the markup. The value of ph should use the X-SAMPA phonetic alphabet to define the phoneme.
<play file="..."/> | <play file="..."> ... </play>
In the first form, this will play an audio file at the position where the tag appears. In the second form the audio file will play instead of the enclosed block of text.
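As a rough illustration of how the tags above can be combined, here is a minimal sketch that builds a marked-up string in JavaScript (for example in a Function node) before it is sent on for speech. The helper names (`wrap`, `pitch`, `speed`, `volume`, `pause`) are made up for this example and are not part of the node, and the level values are illustrative only.

```javascript
// Hypothetical helpers (not part of the node) that wrap text in Pico TTS markup.
function wrap(tag, level, text) {
  return `<${tag} level="${level}">${text}</${tag}>`;
}

const pitch = (level, text) => wrap("pitch", level, text);
const speed = (level, text) => wrap("speed", level, text);
const volume = (level, text) => wrap("volume", level, text);
const pause = (time) => `<break time="${time}"/>`;

// Example payload: a plain sentence, a half-second pause, then a sentence
// with adjusted speed and pitch. The level values are illustrative only.
const payload =
  "Good morning. " + pause("500ms") + speed(80, pitch(120, "The kettle has boiled."));
```

The resulting `payload` is an ordinary string, so it can be passed along as `msg.payload` like any other text.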

Due to what looks like a bug in the nanotts library, this node is not yet functional with the nanotts master branch; you need to modify and build your own copy as discussed in the linked issue. I am posting this here in the hope that others will have enough interest in this node to collaborate on resolving this issue and help push it over the line.

dudleyjosh · 4 October 2019 02:02

Very cool. I recently built a flow using PicoTTS via the Exec node that can queue and play back messages as they come in. It uses the new 'Complete' node to manage the queue by watching the Exec node.

I'll have to check out this new node you are working on.
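For readers unfamiliar with the pattern, the general idea of a one-at-a-time queue can be sketched in plain JavaScript as below. This is not the flow described above, only an assumed illustration: `SpeechQueue` and its methods are hypothetical names, and `complete()` stands in for the signal that the 'Complete' node would provide when the Exec node finishes.

```javascript
// Hypothetical sketch of a one-at-a-time message queue.
class SpeechQueue {
  constructor(speak) {
    this.speak = speak; // callback that starts playback of one message
    this.pending = [];  // messages waiting for their turn
    this.busy = false;  // true while a message is being played
  }

  // Called for every incoming message.
  push(msg) {
    this.pending.push(msg);
    if (!this.busy) this.next();
  }

  // Called when playback of the current message has finished.
  complete() {
    this.busy = false;
    this.next();
  }

  // Start the next waiting message, if there is one.
  next() {
    if (this.pending.length === 0) return;
    this.busy = true;
    this.speak(this.pending.shift());
  }
}

// Usage: record the order in which messages are started.
const started = [];
const queue = new SpeechQueue((msg) => started.push(msg));
queue.push("first");  // starts immediately
queue.push("second"); // waits until "first" has completed
queue.complete();     // "first" is done, so "second" starts
```

Messages are never dropped; each one simply waits until the one before it has been reported as complete.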

TheHypnoToad · 4 October 2019 02:20

Nice, I didn't know that local TTS could be done like this. I'd much prefer this over having to send all my requests to Google and back.

Is it possible to run NanoTTS on a RPi 4 or is an x86 CPU needed to build?

bakman2 · 4 October 2019 06:08

If you use Chrome, speech synthesis and recognition are also available within the browser (local, using speechSynthesis and webkitSpeechRecognition; see the documentation). There are some JS libraries that make it a bit easier to implement, including TTS.

clickworkorange · 4 October 2019 09:39

Oh absolutely. The Pico TTS engine (as the name indicates) has been designed with very modest hardware requirements and will build & run just fine on ARM systems. I haven't tried but my guess would be you could run multiple simultaneous instances on a Pi 1.

Edit: If you want to give it a quick try, just install the libttspico-utils package which includes the pico2wave binary. Pretty sure this will be in the Raspbian repositories.
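For anyone who would rather call pico2wave from Node.js than from a shell, a minimal sketch might look like the following. It assumes pico2wave is installed and on the PATH; the function names `pico2waveArgs` and `textToWav` and the default language are choices made for this example, not part of any package.

```javascript
const { execFile } = require("child_process");

// Build the argument list for pico2wave. Passing the arguments as an array
// (rather than as one shell string) means the text needs no shell quoting.
function pico2waveArgs(text, wavPath, lang = "en-GB") {
  return [`--wave=${wavPath}`, `--lang=${lang}`, text];
}

// Run pico2wave and resolve with the path of the WAV file once it exits cleanly.
function textToWav(text, wavPath, lang) {
  return new Promise((resolve, reject) => {
    execFile("pico2wave", pico2waveArgs(text, wavPath, lang), (err) =>
      err ? reject(err) : resolve(wavPath)
    );
  });
}
```

Something like `textToWav("Hello from Node-RED", "/tmp/hello.wav")` should then write the file, assuming the binary is available; the promise rejects if pico2wave is missing or exits with an error.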

Steve-Mcl · 4 October 2019 09:49

Are there any good (like Google good) TTS voices for Pico?

Links please.

clickworkorange · 4 October 2019 09:55

No. But Pico TTS comes with no strings attached, and runs on the whiff of an oily rag.

TheHypnoToad · 19 October 2019 00:44

Hi, I finally managed to get my RPi 4 up and running and wanted to try NanoTTS.

It looks like the libttspico-utils package used to be included in the older 'stretch' release of Raspbian but is no longer in the latest and greatest 'buster' release.

For anyone else wanting to try PicoTTS on Buster, there is a solution here:

https://www.raspberrypi.org/forums/viewtopic.php?p=1353160

clickworkorange · 19 October 2019 00:59

Thanks, I did not know it had been removed, or that it was considered "non-free". AFAICT the NanoTTS sources include all or at least a significant portion of the PicoTTS code, and this appears to have an Apache v2 license attached. I looked at this a little bit because I'm considering ditching NanoTTS and doing a direct implementation of the PicoTTS C API in Node.js. Not only would that get around the problem with command line argument parsing in PicoTTS, but it would also remove the need to use child_process.exec. It should also make it easier to bundle it so npm can install everything in one go (e.g. using node-gyp). I am slightly daunted by the task though...

TotallyInformation · 8 March 2020 10:49

Works fine on the new Microsoft Edge as well. Probably on Vivaldi, not sure about Brave.
