Weird behaviour when processing an array of arrays having null values

janvda · 19 June 2020 14:26

I am using node-red-contrib-spreadsheet-in to read an excel file into node-red.
This produces a json array of arrays having null values for those cells that are not populated.

E.g. here below the debug output of the sheet to json node for a simple excel sheet
So you can see that array[1][1] = null.

FYI the simple excel sheet used as input :

So far so good, but when I pass this through a change node with following JSONata expression:

payload.{
   "row" : $
}

then the "null" cell gets removed as you can see in below output:

... but when I test this with same payload (copy pasted from debug window) in the JSONata expression editor then I get the expected result with the null value:

As workaround - i have put 2 json nodes in between (to convert it to a string and back).

with this workaround I am getting the expected output:

Colin · 19 June 2020 15:09

I believe javascript uses sparse arrays where an element can be completely missing. I wonder whether your input array has the null element completely missing and after converting to JSON and back the element is present with a null value. Or vice versa.
I believe you can test that using for in to iterate the array. If the element is completely missing then it will not be selected in the loop, if it is present then it will.
If my guess is correct then whether jsonata should give different answers I don't know. Probably not. Are you using the latest node-red? I believe there was a jsonata upgrade a little time ago that fixed a number of bugs.

janvda · 19 June 2020 15:53

Thanks for the response.

I am using node-red version v1.0.4

based on your input I have tested it with below simple java script program:

row0_for_of_count = 0;
row0_for_in_count = 0;
row1_for_of_count = 0;
row1_for_in_count = 0;

for (var i of msg.payload[0]){
  row0_for_of_count++;
}

for (i of msg.payload[1]){
  row1_for_of_count++;
}

for (i in msg.payload[0]){
  row0_for_in_count++;
}

for (i in msg.payload[1]){
  row1_for_in_count++;
}

msg.row0_for_of_count = row0_for_of_count;
msg.row0_for_in_count = row0_for_in_count;
msg.row1_for_of_count = row1_for_of_count;
msg.row1_for_in_count = row1_for_in_count;

return msg;

... and indeed the "for in" loop for the 2nd row only sees 2 properties.

... I have also applied the same script after passing the payload through 2 json nodes (to serialize it to a string and back) and the "for in" loop for the 2nd row sees now 3 properties:

So that most likely explains the root cause of the issue.

Colin · 19 June 2020 16:27

I am feeling rather pleased with myself now for that inspired guesswork (the weather is miserable here today so thanks for cheering me up )
I don't know whether that is a bug in the jsonata code or not. It is certainly confusing. Are you using a recent version of node-red?

janvda · 19 June 2020 16:58

node-red version 1.04:

Colin · 19 June 2020 17:24

I think this needs a jsonata expert.

Paul-Reed · 19 June 2020 20:19

I thought that person was you Colin

Colin · 19 June 2020 20:37

Absolutely not. Once it gets above payload=0 ? true : false then I am lost.

janvda · 19 June 2020 22:33

Not sure if it is a JSONata issue.

It seems that the debug tab represents "sparse" arrays in exactly the same way as arrays with null values.
I also would expect that serialization and deserialisation with json node would return exactly the same payload but this is not the case when the array is a "sparse" array.

Colin · 20 June 2020 06:10

There are TWO questions then:

Should jsonata behave differently for sparse and full arrays?
Is the JSON node correctly converting a sparse array to JSON? With the corollary that if it is correctly converted then is it converting it back correctly?

Colin · 20 June 2020 07:54

I have done some experiments with interesting results. The function below creates a sparse array with element index 1 set to null and index 3 set to 7. Using forEach() with node.warn finds only those elements and displays the values null and 7 as expected. Looping through using indexes 0 to length-1 then node.warn shows the empty elements as undefined and the null element is null. However, returning the array in the payload and displaying via a debug node shows the undefined ones as null, which is arguably a bug in the debug display. Perhaps @dceejay would like to comment on this.
The function node contains

array = []
array[1] = null
array[3] = 7
array.forEach( function(val, i) {node.warn(`each index: ${i}  value: ${val}`)})
for (i=0; i<array.length; i++) {
    node.warn(`for length index: ${i}  value: ${array[i]}`)
}
msg.payload = array
return msg;

[{"id":"ad3cb23a.59648","type":"inject","z":"bdd7be38.d3b55","name":"","props":[{"p":"payload"},{"p":"topic","vt":"str"}],"repeat":"","crontab":"","once":false,"onceDelay":0.1,"topic":"","payload":"","payloadType":"date","x":140,"y":480,"wires":[["168ec116.5dd1cf"]]},{"id":"168ec116.5dd1cf","type":"function","z":"bdd7be38.d3b55","name":"sparse array tests","func":"array = []\narray[1] = null\narray[3] = 7\narray.forEach( function(val, i) {node.warn(`each index: ${i}  value: ${val}`)})\nfor (i=0; i<array.length; i++) {\n    node.warn(`for length index: ${i}  value: ${array[i]}`)\n}\nmsg.payload = array\nreturn msg;","outputs":1,"noerr":0,"initialize":"","finalize":"","x":330,"y":480,"wires":[["261e7cc8.e77e8c"]]},{"id":"261e7cc8.e77e8c","type":"debug","z":"bdd7be38.d3b55","name":"","active":true,"tosidebar":true,"console":false,"tostatus":false,"complete":"false","statusVal":"","statusType":"auto","x":530,"y":480,"wires":[]}]

dceejay · 20 June 2020 09:58

at first glance yes it looks like the debug display is not doing the right thing... we do handle a simple msg.payload = undefined so it should be possible to fix - I guess somewhere in the encodeObject function in util.js.

However more generally - JSON does not have a representation for undefined so JSON.stringify replaces undefined by null.

janvda · 20 June 2020 10:01

I think the underlying issue is that a JSON array requires all elements to be specified (at least as null) while javascript supports sparse arrays where not all elements are specified.

So not every javascript array can be unambiguously mapped to a JSON array.

Colin · 20 June 2020 10:04

Agreed, as @dceejay says, looking at the json spec it seems that a sparse array cannot be represented exactly in JSON. Null is allowed as the value for an array element but undefined or missing are not allowed. That means that converting a sparse array to json and back must fill in the missing elements with nulls.

That still leaves the question as to whether the effect that @janvda is seeing when he transforms the sparse array with jsonata is expected behaviour or not. Googling for jsonata sparse array seems to yield nothing that is obviously relevant, which is rather surprising to me.

janvda · 20 June 2020 10:15

I think JSONata is designed to work only with JSON objects.

As I am using a sparse array (which is actually not a proper JSON object) you may say that this is not valid input for a JSONata expression.

... nevertheless the JSONata implementation is using javascript map and filter functions under the hood which also in javascript would skip the elements in a sparse array that are not specified. So the JSONata implementation appears to be consistent with that.

dceejay · 20 June 2020 10:19

the other open question from your OP - is whether the spreadsheet node should represent blank cells as undefined or null or should use an empty string as it would do if it was csv

Colin · 20 June 2020 10:20

True, but that does not mean that the implementation is performing according to the spec. An alternative implementation might convert into valid JSON first, so the result would be as you originally expected. Presumably there is a jsonata spec, but again it would need someone who knows jsonata well to understand it and see whether the implementation here is according to the spec.

janvda · 20 June 2020 10:44

I have raised following issue for that:

strange behaviour when processing sparse arrays · Issue #458 · jsonata-js/jsonata · GitHub

janvda · 20 June 2020 10:51

I guess that node GitHub - sakai-to/node-red-contrib-spreadsheet-in: Node-RED nodes to read data from spreadsheet (Excel, ODS, etc.) file is using a node.js module that is doing the conversion to a sparse array.

Using sparse arrays instead of undefined or null might not be a bad choice to safe memory space for those excel sheets having a lot of empty cells.

It would indeed make sense that debug window is not showing null for missing array elements or undefined array elements.

TotallyInformation · 21 June 2020 20:45

I don't believe that the use of an array of arrays is a good model for a spreadsheet anyway. If you need a proper full model for a spreadsheet, you need to keep the data in the spreadsheet. Or maybe read it into a SQL DB.

The standard model in JavaScript for such data is an array of objects which better models the spreadsheet schema. While sparse arrays certainly save memory, as you've seen, it doesn't allow you to retain a consistent model of the original data. You have to know the content of your data in order to ensure you don't use the wrong JavaScript function.

If using an array of objects, you no longer need to worry about sparseness since you can safely miss out any column and still process the data in a consistent way without needing to know the contents.

I would like to suggest that is a failing of the underlying library. Of course, also compounded by the other inconsistencies you've found on the JSONata/Node-RED side.

Topic		Replies	Views
Unexpected behaviour when using jsonata expressions like payload[..][] in change node General	10	573	16 November 2020
Working FTP node ? or direct XML input General	18	1431	19 September 2018
Weird JSONata problem General	22	6361	21 August 2018
In node red using JSONata the output is empty General	5	1323	15 March 2019
Change Node issue with JSONata General	3	456	24 July 2019

Weird behaviour when processing an array of arrays having null values

Related topics