Node-red-contrib-voice2json

JGKK · 29 December 2020 07:00

Training a new, robust model for "hey pips", the wake word my girlfriend and I use, has been on my backlog for a while now.
If you would like to contribute some samples, I could include your data in my dataset, try to train a model and share it with you.
I would need:

10 samples from each person
a few five-minute recordings of the typical household sounds to be expected where you will place your microphone

To record the samples you can use this bash script if you have sox installed:

#!/bin/bash

declare -i n=$1
declare -i c=1

DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"

bold=$(tput bold)
normal=$(tput sgr0)

printf "\nThis script can record wake word samples for training of a wake word model.
       \nThe recording starts when you press return and ends automatically each time.
       \nOnce you have started a recording by pressing return say the wake word naturally
       like you would in daily use.\nLet's start:"

for i in {0..9}
do
    printf "${normal}\n\nPress Enter to record number ${c} of 10 wake word recordings.\n"
    read -r -s -n 1 key
    # An empty key means Enter was pressed: record up to 4 seconds and trim silence with vad
    if [[ $key = "" ]]; then
        sox -t alsa default -r 16000 -c 1 -b 16 -e signed-integer -L "${DIR}/hotword.${n}.wav" \
        trim 0 4 vad -p 0.2 reverse vad -p 0.6 reverse
        printf "\nRecorded successfully.\n\n"
    fi
    ((n=n+1))
    ((c=c+1))
done

printf "\n\nFinished, thank you.\n"

Just save this as, for example, record.sh in the folder where you want to record the wake words.
Then use it like this:

bash record.sh 1

The first argument is the number at which the enumeration of the 10 files recorded in this run starts. So use 1 on the first run, 11 on the next, then 21 and so on, or you will overwrite your previous files.
It’s important to record those samples in a quiet environment.
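
To avoid working out the start number by hand, a small helper could look at the files already in the folder. This is only a sketch: the function name next_start_index is made up for illustration, and it assumes the files are named hotword.N.wav without leading zeros, as the script above writes them.

```shell
#!/bin/bash

# Hypothetical helper: print the next free start number for hotword.N.wav
# files in the given folder (defaults to the current folder).
next_start_index() {
    local dir="${1:-.}" max=0 f n
    for f in "$dir"/hotword.*.wav; do
        [ -e "$f" ] || continue          # the glob matched nothing
        n="${f##*/hotword.}"             # strip the path and the "hotword." prefix
        n="${n%.wav}"                    # strip the ".wav" suffix
        case "$n" in
            ''|*[!0-9]*) continue ;;     # skip names whose middle part is not a plain number
        esac
        if [ "$n" -gt "$max" ]; then
            max="$n"
        fi
    done
    echo "$((max + 1))"
}
```

It could then be used as: bash record.sh "$(next_start_index .)"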

To record the 5-minute pieces of random audio you can use the sox record node, set to record directly to a wav file and to stop after 300 seconds.
Make sure the recorded random audio does not contain the wake word.
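
Outside Node-RED, the same kind of recording could probably be made with sox directly, reusing the input device and format options from the wake word script above. The function name record_noise and the DRY_RUN switch are only illustrative, and the ALSA device "default" is an assumption that may not match your setup.

```shell
#!/bin/bash

# Hypothetical helper: record a fixed-length piece of background audio.
# $1 = output file, $2 = length in seconds (defaults to 300, i.e. 5 minutes).
record_noise() {
    local out="$1" seconds="${2:-300}"
    # Same device and sample format as the wake word script; trim ends the recording.
    set -- sox -t alsa default -r 16000 -c 1 -b 16 -e signed-integer -L "$out" trim 0 "$seconds"
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"      # only show the command that would be run
    else
        "$@"           # actually record
    fi
}
```

Running DRY_RUN=1 record_noise noise.1.wav prints the sox command without recording anything, which is handy for checking the options first.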

Tell me if you would like to do that.

Johannes

Edit & PS: if you want to try training yourself, here is what I do:

I split all the random noise audio into one-minute chunks by running this command in the folder with the random audio:

for f in *.wav; do sox "$f" "split.$f" trim 0 60 : newfile : restart ; done

Then delete all the original long files and keep only the new split files.
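
If you would rather not delete the long recordings right away, a cautious variant is to move them out of the noise folder instead. The sketch below assumes the chunks are the files whose names start with "split." (as produced by the command above); the function name stash_originals and the separate backup folder are my own choices for illustration. The backup folder should sit outside the noise folder so the long files are not picked up as noise later.

```shell
#!/bin/bash

# Hypothetical helper: move every .wav that is not a split.* chunk
# from the noise folder ($1) into a backup folder ($2).
stash_originals() {
    local noise_dir="$1" backup_dir="$2" f
    mkdir -p "$backup_dir"
    for f in "$noise_dir"/*.wav; do
        [ -e "$f" ] || continue                  # the glob matched nothing
        case "${f##*/}" in
            split.*) ;;                          # one-minute chunk: leave it in place
            *) mv -- "$f" "$backup_dir/" ;;      # long original: move it aside
        esac
    done
}
```

For example: stash_originals path/to/noise-folder path/to/noise-originals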
I duplicate each wake word file with added random noise from those files with this script:

#!/bin/bash

NOISEDIR=$1

DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"

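for f in *.wav
do
    # Pick one random noise file for each wake word sample
    NOISEFILE=$(find "${NOISEDIR}" -type f | shuf -n 1)

    # Mix sample and noise, then cut the result to the length of the wake word sample
    sox -m "$f" "${NOISEFILE}" "noise.$f" trim 0 "$(soxi -D "$f")"
done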

the script needs to be run in the folder with the wake words and takes one argument, the path to the noise files (delete the script from the folder when you're done)
copy about 10-20% of the wake-word files to test/wake-word
copy the whole wake-word folder to the precise folder (always work with a copy so that you can start over at any point)
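
The copy into test/wake-word can be done by hand, or with something like the sketch below, which picks a random share of the samples with shuf. The function name copy_test_samples and the default of 15% are assumptions for illustration; the files are copied rather than moved, as described above.

```shell
#!/bin/bash

# Hypothetical helper: copy a random share of the .wav files in $1 to $2.
# $3 = percentage to copy (defaults to 15); at least one file is picked.
copy_test_samples() {
    local src="$1" dest="$2" percent="${3:-15}" total count
    mkdir -p "$dest"
    total=$(find "$src" -maxdepth 1 -type f -name '*.wav' | wc -l)
    count=$(( total * percent / 100 ))
    [ "$count" -ge 1 ] || count=1
    # shuf selects $count random file names; each one is copied to the test folder
    find "$src" -maxdepth 1 -type f -name '*.wav' | shuf -n "$count" |
    while IFS= read -r f; do
        cp -- "$f" "$dest/"
    done
}
```

For example: copy_test_samples wake-word-folder/wake-word wake-word-folder/test/wake-word 15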
do a baseline training with:

precise-train your-wake-word.net wake-word-folder/ -e 100 -s 0.5

this will give you a starting model that will trigger on pretty much anything
the real training now happens with (this part can take a while):

precise-train-incremental your-wake-word.net wake-word-folder/ -r path/to/noise-folder -e 50 -th 0.4 -s 0.5

once finished, you can optionally copy the generated test not-wake-words to the generated not-wake-words and retrain with the first command:

cp wake-word-folder/test/not-wake-word/generated/* wake-word-folder/not-wake-word/generated

and again:

precise-train your-wake-word.net wake-word-folder/ -e 100 -s 0.5

now convert to pb:

precise-convert your-wake-word.net

copy your-wake-word.pb, your-wake-word.pbparams and your-wake-word.pbtxt to a new folder:

mkdir your-wake-word
cp your-wake-word.pb* your-wake-word/

and you're done and can now try the result
