Does anyone do a lot of string word parsing?

cymplecy · 18 May 2020 21:12

If I switch over to using "Alexa Simon Says" for my house #IoT stuff then I'm going to have to handle all such Alexa requests myself and cope with things such as saying:

"alexa simon says skybox pause"

and getting
"alexa simon says sky bot paws"

So I'd be wanting to do stuff like

contains "sky b" or "skyb" followed by "pause" or "paws"

or even cope with

"alexa simon says pause skybox"

So - has anyone done any lexical parsing type stuff using Node-RED?

Or if not, got a good direction I should look to go in?

E1cid · 18 May 2020 23:15

one way, no need for parsing

create a command object, that is stored in the context data.

// Function to create the context data.
// To make it survive reloads, enable persistent context storage (contextStorage) in settings.js.
var utterances = {
  "skybox_pause": "command a",
  "sky_bot_paws": "command a",
  "pause_skybox": "command a",
  "another_1": "command b",
  "another_2": "command b",
  "more_1": "command c",
  "more_2": "command c",
  "more_3": "command c"
};
global.set("utterances", utterances);
return;

You can then add multiple phrases to one command. By removing the simon says from the incoming alexa phrase, you can retrieve the command for that phrase from the context data.

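// Function to retrieve the command
var incoming_phrase = "simon says skybox pause";
// Strip the 11-character "simon says " prefix, then replace spaces with underscores
msg.payload = global.get("utterances." + incoming_phrase.substr(11).replace(/ /g, "_"));
return msg;

The output msg.payload is "command a".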

Or do similar with a database

Edit/ to fix white space and syntax error

JGKK · 19 May 2020 05:26

You can use fuzzy string matching to cope with some variations and get the most likely/closest result from a list of possible sentences. When I did it, I included fuzzball in my settings.js file and used it in a function node. It's pretty simple to use and really fast, even if you compare against a list of thousands of possible strings.
The other way you could go would be a lot more sophisticated. You could use an NLU engine of some kind. There you could either use something external to Node-RED like rasa-nlu, which is Python based, or there is node-red-contrib-ecolect, which only works for English. I have no experience with that one, as I needed German for my use case.
Another way, which I have used and which is probably best for very few commands, is to use regexes with test() statements in a switch statement. The regex approach works best in a cascading style, where you first do a rough filter/sort in a first function and then do the actual intent parsing in a second function for each intent.
I hope this gives you some ideas.
Johannes
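
The cascading regex idea above could look roughly like the sketch below. It is only an illustration: it uses lookup tables instead of a switch statement, and the names devicePatterns, intentPatterns and parseIntent, as well as the patterns themselves, are made up for this example rather than taken from anyone's flow.

```javascript
// Stage 1: rough filter that decides which device the phrase is about.
// The patterns are illustrative guesses at common mishearings.
const devicePatterns = [
  { device: "skybox", pattern: /\bsky\s?b/ },    // "skybox", "sky box", "sky bot"
  { device: "tv",     pattern: /\bt\.?\s?v\b/ }  // "tv", "t v", "t. v."
];

// Stage 2: per-device intent patterns, only consulted once the device is known.
const intentPatterns = {
  skybox: [
    { intent: "pause", pattern: /\b(pause|paws)\b/ },
    { intent: "play",  pattern: /\bplay\b/ }
  ],
  tv: [
    { intent: "sound_off", pattern: /\bsound off\b/ },
    { intent: "sound_on",  pattern: /\bsound on\b/ }
  ]
};

// Returns { device, intent } or null when either stage finds no match.
function parseIntent(phrase) {
  const text = phrase.toLowerCase();
  const deviceEntry = devicePatterns.find(d => d.pattern.test(text));
  if (!deviceEntry) return null;
  const intentEntry = intentPatterns[deviceEntry.device].find(i => i.pattern.test(text));
  return intentEntry ? { device: deviceEntry.device, intent: intentEntry.intent } : null;
}
```

Because the device is matched separately from the intent, word order does not matter, so "pause skybox" and "sky bot paws" both resolve to the same result.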

cymplecy · 19 May 2020 22:03

Like the idea here but it's not working for me
PS - I'm stripping off the simon says before it gets to your function node

I think we can't have spaces in the global.get line

i.e global.get("utterances." + "skybox pause")

isn't valid because of the space in "skybox pause"

cymplecy · 19 May 2020 22:08

That looks very promising but for having to require fuzzball in settings.js

I always try to avoid things like that as although I could do it - its not easily repeatable for others and I like to share and enjoy anything I do as much as possible

Maybe it'll prompt me into writing my first contrib node - although after several years - I've still not done that
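
For anyone who wants to avoid the settings.js change, a small fuzzy matcher can be written directly in a function node. The sketch below is a plain Levenshtein edit-distance comparison, not fuzzball's scoring, and the function names and the maxDistance threshold are assumptions chosen for illustration.

```javascript
// Classic dynamic-programming Levenshtein edit distance between two strings.
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution (or match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Return the candidate closest to the heard phrase, or null if nothing is
// within maxDistance edits (the default of 4 is an arbitrary example value).
function closestPhrase(heard, candidates, maxDistance = 4) {
  let best = null;
  let bestDistance = Infinity;
  for (const candidate of candidates) {
    const d = levenshtein(heard, candidate);
    if (d < bestDistance) {
      best = candidate;
      bestDistance = d;
    }
  }
  return bestDistance <= maxDistance ? best : null;
}
```

Called as closestPhrase("sky bot paws", ["skybox pause", "tv sound off"]), this picks "skybox pause". The threshold would need tuning against real mishearings, since short phrases can sit within a few edits of each other.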

E1cid · 19 May 2020 22:18

Spaces are fine there.
Have you stripped the simon says correctly, i.e. have you left white space at the beginning?

edit/ strike that, it may not like spaces, as I never tested with your phrases.

Try using _ in the command object keys, then replace all white space with _ in the simon says phrase.
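
A rough sketch of that normalisation step, written as plain JavaScript so it can be tried outside Node-RED. The helper names phraseToKey and lookupCommand are hypothetical, and the local utterances object stands in for whatever global.get("utterances") would return in a function node.

```javascript
// Hypothetical helper: turn a raw Alexa phrase into a key for the utterances object.
function phraseToKey(phrase) {
  return phrase
    .toLowerCase()
    .replace(/^\s*(alexa\s+)?simon\s+says\s+/, "") // drop the wake words if still present
    .replace(/[^a-z0-9\s]/g, "")                   // drop punctuation
    .trim()
    .replace(/\s+/g, "_");                         // collapse whitespace runs to "_"
}

// Stand-in for the object stored in global context.
const utterances = {
  "skybox_pause": "command a",
  "sky_bot_paws": "command a",
  "pause_skybox": "command a"
};

// Returns the command for a phrase, or undefined if the phrase is not listed.
function lookupCommand(phrase) {
  return utterances[phraseToKey(phrase)];
}
```

With this in place it no longer matters whether the simon says prefix was stripped upstream or whether stray spaces were left at the start of the phrase.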

cymplecy · 19 May 2020 22:19

Well -I'm manually setting the string

and without a space it all works

So I'm thinking that spaces aren't allowed in object properties (but I'm no JS expert BTW)

My utterances def BTW

E1cid · 19 May 2020 22:23

edited previous post

cymplecy · 19 May 2020 22:27

Obvious solution - ta

cymplecy · 21 May 2020 10:40

One issue I'm having is that when I say "alexa simon says turn tv sound off" - I sometimes get it returning "simon says turn t. v. sound up"

And when I say "alexa simon says turn tv sound on" - I sometimes get it returning "simon says turn t. v. sound up" as well

So I'm going to have to track whether the sound is on or off at the time and act accordingly!
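
One way the state tracking might be sketched is below. This is only an assumption about how it could be done: tvSoundOn and resolveSoundCommand are made-up names, and in a function node the state would more likely live in flow or global context than in a plain variable.

```javascript
// Hypothetical state holder; assumes the sound starts out switched on.
let tvSoundOn = true;

// Map a heard phrase to "on", "off", or null, treating the misheard
// "sound up" as "do the opposite of the current state".
function resolveSoundCommand(heard) {
  const text = heard.toLowerCase();
  let wanted;
  if (/\bsound off\b/.test(text)) wanted = false;
  else if (/\bsound on\b/.test(text)) wanted = true;
  else if (/\bsound up\b/.test(text)) wanted = !tvSoundOn; // ambiguous: toggle
  else return null;
  tvSoundOn = wanted; // remember the new state for the next request
  return wanted ? "on" : "off";
}
```

The obvious weakness is that the tracked state drifts if the sound is changed by the remote control, so this only helps when every change goes through the same flow.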

ristomatti · 21 May 2020 15:35

Spaces are OK when accessing the properties using a string key in brackets, e.g. obj['property key']. So in this case instead of:

global.get("utterances." + "skybox pause");

you could write:

global.get("utterances")["skybox pause"];
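
The same point in plain JavaScript, without Node-RED, as a minimal sketch. The object and variable names here are illustrative only.

```javascript
const utterances = {
  "skybox pause": "command a", // key containing a space
  "skybox_pause": "command a"  // key that is a valid identifier
};

// Dot notation only works for keys that are valid identifiers.
const viaDot = utterances.skybox_pause;

// Bracket notation accepts any string, including one built at runtime.
const heard = "skybox pause";
const viaBracket = utterances[heard];
```

So spaces are allowed in object property names; they just cannot be reached with dot notation.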

E1cid · 22 May 2020 18:25

Clarification

Does this one just fetch the value of utterances.skybox_pause from context?

global.get("utterances." + "skybox_pause");

Does this one fetch the whole object from context and then select the value of "skybox pause" from the object.

global.get("utterances")["skybox pause"];

What i am asking is does one have more overhead?

ristomatti · 22 May 2020 18:30

You are correct in that the first one fetches the value directly and the second one fetches the whole object. The performance difference should be negligible though, as only the object reference is fetched. My gut feeling is that something like this could matter only if there were tens or hundreds of thousands of these operations happening in a second's time.

I wouldn't even think about compromising the code readability for something like this.

E1cid · 22 May 2020 18:32

No matter the number, I would say it is best to use as little overhead as possible.
It does not affect the code, just the way you store the keys.

ristomatti · 22 May 2020 18:43

There's a famous quote "premature optimization is the root of all evil" related to this. If I were worried about performance on a microsecond level, I would not be using Node-RED.

E1cid · 22 May 2020 18:47

“Every minute you spend in planning saves 10 minutes in execution; this gives you a 1,000 percent return on energy!”

Quote wars!!!

ristomatti · 22 May 2020 19:23

Exactly. Out of curiosity I wrote a simple function to test this:

const utterances = {
  "skybox_pause": "command a",
  "sky_bot_paws" :"command a",
  "pause skybox": "command a",
  "another_1": "command b",
  "another_2": "command b",
  "more_1": "command c",
  "more_2": "command c",
  "more_3": "command c"
};
global.set("utterances", utterances);

// Test 1 - fetch nested context value directly
console.log('Fetch nested value directly');
let key1 = 'skybox_pause';

console.time('test 1' );
for (let i = 0; i < 10000; i++) {
  global.get('utterances.' + key1);
}
console.timeEnd('test 1');

console.log();

// Test 2
console.log('Fetch full object, access property');
let key2 = 'pause skybox';

console.time('test 2' );
for (let i = 0; i < 10000; i++) {
  global.get('utterances')[key2];
}
console.timeEnd('test 2');

console.log();

return msg;

If you test this with the Node-RED console open you might find the results interesting.

E1cid · 22 May 2020 19:38

Or you could post your results to save everyone else who might be interested the time.

ristomatti · 22 May 2020 20:01

I modified the test to allow the JS engine's optimization to settle a bit by running both tests 5 times:

const utterances = {
  "skybox_pause": "command a",
  "sky_bot_paws" :"command a",
  "pause skybox": "command a",
  "another_1": "command b",
  "another_2": "command b",
  "more_1": "command c",
  "more_2": "command c",
  "more_3": "command c"
};
global.set("utterances", utterances);

for (let i = 1; i <= 5; i++) {
  console.log('--- Round ' + i)
  test1();
  test2();
} 

return msg;


function test1() {
  console.log('Fetch nested value directly');
  let key1 = 'skybox_pause';
  
  console.time('test 1' );
  for (let i = 0; i < 10000; i++) {
    global.get('utterances.' + key1);
  }
  console.timeEnd('test 1');
  console.log();
}

function test2() {
  console.log('Fetch full object, access property');
  let key2 = 'pause skybox';
  
  console.time('test 2' );
  for (let i = 0; i < 10000; i++) {
    global.get('utterances')[key2];
  }
  console.timeEnd('test 2');
  console.log();
}

Result:

--- Round 1
Fetch nested value directly
test 1: 101.016ms

Fetch full object, access property
test 2: 15.530ms

--- Round 2
Fetch nested value directly
test 1: 25.770ms

Fetch full object, access property
test 2: 8.852ms

--- Round 3
Fetch nested value directly
test 1: 18.072ms

Fetch full object, access property
test 2: 9.439ms

--- Round 4
Fetch nested value directly
test 1: 16.101ms

Fetch full object, access property
test 2: 7.403ms

--- Round 5
Fetch nested value directly
test 1: 13.952ms

Fetch full object, access property
test 2: 10.313ms

My theory for why test 2 performs better is that it doesn't involve string concatenation. But the reason could also be related to how nested object fetching is implemented in global.get.

E1cid · 22 May 2020 20:22

Now try it with 1000 entries in the object,
and write the object[property] to a var, then set the var to null. Do the same with the direct property too.

Fetch nested value directly
test 1: 24.651ms

Fetch full object, access property
test 2: 46.148ms

so for smaller object
global.get('utterances')['key']

for large objects
global.get('utterances.' + 'key')

I removed the concat for the small test. It did reduce the time, but the full object was still better for small objects.

I put it back in for the 1000 test
