[Solved] JSONata - summarise by group (like GROUP BY) & pass through other key pairs

I want to summarise data of a key pair, for grouping based on TWO other json key pairs. And, I want to pass through variables for use further down in the flow.

This is the json array & objects I receive in msg.payload:

[
  {"id":"Irr-1","name":"Irrigation","date":"2018-12-24","duration_s":20,"calctype":"Duration_1vs0"},
  {"id":"Irr-1","name":"Irrigation","date":"2018-12-24","duration_s":25,"calctype":"Duration_1vs0"},
  {"id":"Irr-1","name":"Irrigation","date":"2018-12-25","duration_s":25,"calctype":"Duration_1vs0"},
  {"id":"Irr-2","name":"Irrigation","date":"2018-12-24","duration_s":30,"calctype":"Duration_1vs0"},
  {"id":"Irr-2","name":"Irrigation","date":"2018-12-25","duration_s":30,"calctype":"Duration_1vs0"},
  {"id":"Irr-2","name":"Irrigation","date":"2018-12-25","duration_s":35,"calctype":"Duration_1vs0"},
]

In order to group the incoming msg.payload, I am using a Node-RED change node with the following JSONata expression on the above message, to group based on the "id" key pair:

payload{
  id   : $sum(duration_s)
    }
~> $each(function( $v, $k) {
 { "id": $k, "duration_s": $formatNumber($v, '#,###')}  
})

Which works, and gives the output to msg.payload:

[
  {"id":"Irr-1","duration_s":"70"},
  {"id":"Irr-2","duration_s":"95"},  
]

But, what I need, is the following output:

[
  {"name":"Irrigation","id":"Irr-1",date:"2018-12-24","duration_s":"45"},
  {"name":"Irrigation","id":"Irr-1",date:"2018-12-25","duration_s":"25"},
  {"name":"Irrigation","id":"Irr-1",date:"2018-12-24","duration_s":"30"},
  {"name":"Irrigation","id":"Irr-2",date:"2018-12-25","duration_s":"65"}
]

I do not know how to change the each function for it to group by date as well, nor how to bring the "name" field through too. Any ideas please?

Here is the Node-RED export for this:

[{"id":"3946568c.64ba1a","type":"function","z":"a57275d4.b9ad68","name":"","func":"msg.payload = [\n  {\"id\":\"Irr-1\",\"name\":\"Irrigation\",\"date\":\"2018-12-24\",\"duration_s\":20,\"calctype\":\"Duration_1vs0\"},\n  {\"id\":\"Irr-1\",\"name\":\"Irrigation\",\"date\":\"2018-12-24\",\"duration_s\":25,\"calctype\":\"Duration_1vs0\"},\n  {\"id\":\"Irr-1\",\"name\":\"Irrigation\",\"date\":\"2018-12-25\",\"duration_s\":25,\"calctype\":\"Duration_1vs0\"},\n  {\"id\":\"Irr-2\",\"name\":\"Irrigation\",\"date\":\"2018-12-24\",\"duration_s\":30,\"calctype\":\"Duration_1vs0\"},\n  {\"id\":\"Irr-2\",\"name\":\"Irrigation\",\"date\":\"2018-12-25\",\"duration_s\":30,\"calctype\":\"Duration_1vs0\"},\n  {\"id\":\"Irr-2\",\"name\":\"Irrigation\",\"date\":\"2018-12-25\",\"duration_s\":35,\"calctype\":\"Duration_1vs0\"},\n];\nreturn msg;","outputs":1,"noerr":0,"x":240,"y":1460,"wires":[["113a7a40.16a186"]]},{"id":"b344738b.b3bc98","type":"inject","z":"a57275d4.b9ad68","name":"","topic":"","payload":"1","payloadType":"num","repeat":"","crontab":"","once":false,"onceDelay":0.1,"x":90,"y":1460,"wires":[["3946568c.64ba1a"]]},{"id":"113a7a40.16a186","type":"change","z":"a57275d4.b9ad68","name":"group","rules":[{"t":"set","p":"payload","pt":"msg","to":"payload{\t  id   : $sum(duration_s)\t    } \t~> $each(function( $v, $k) {\t { \"id\": $k, \"duration_s\": $formatNumber($v, '#,###')}  \t})\t\t\t","tot":"jsonata"}],"action":"","property":"","from":"","to":"","reg":false,"x":390,"y":1460,"wires":[["825b614c.c856"]]},{"id":"825b614c.c856","type":"debug","z":"a57275d4.b9ad68","name":"","active":true,"tosidebar":true,"console":false,"tostatus":false,"complete":"false","x":590,"y":1460,"wires":[]}]

The trick is to create a "compound key" that includes both the date and id, similar to a sql "group by" clause (as you noted in the title):

payload {
	id & "@" & date: {
    	"name": [name][0],
    	"id": [id][0],
    	"date": [date][0],
        "duration_s": $sum(duration_s)
    }
}

I'm assuming that each Id only has 1 name, so I can grab the first name out of the array and use it for the summary object. The output compound object now contains all the information in separate named fields (keyed by id@date:

{
  "Irr-1@2018-12-24": {
    "name": "Irrigation",
    "id": "Irr-1",
    "date": "2018-12-24",
    "duration_s": 45
  },
  "Irr-1@2018-12-25": {
    "name": "Irrigation",
    "id": "Irr-1",
    "date": "2018-12-25",
    "duration_s": 25
  },
  "Irr-2@2018-12-24": {
    "name": "Irrigation",
    "id": "Irr-2",
    "date": "2018-12-24",
    "duration_s": 30
  },
  "Irr-2@2018-12-25": {
    "name": "Irrigation",
    "id": "Irr-2",
    "date": "2018-12-25",
    "duration_s": 65
  }
}

which needs to be reformatted into an array of the field data (or does it?). Simplest way to get the array of field values is to use the obj.* syntax -- so wrap the whole expression in parens, like this:

(payload^(date,id) {
	id & "@" & date: {
    	"name": [name][0],
    	"id": [id][0],
    	"date": [date][0],
        "duration_s": $sum(duration_s)
    }
}).*

FYI: I also added the *sort by date first, and then by id" syntax to the payload, in case that was important, and because it's so easy to do inside the expression...

Here is some more documentation for grouping and aggregating in JSONata.

2 Likes

Thank you @shrickus!, I liked the 2nd option better, where it creates an array with the 4 key pairs as answer. As you can see, I am struggling with JSONata, but recognise the power it holds. I still am struggling with the notations, 1st time I see as per your 2nd option - "name": [name][0], ..., still has no idea what this notation does - and how the 2nd option creates such a neat answer. I hope I will get this notation sometime soon...

I will now study your other reply to my other JSONata issue as well.

Thank you again for taking the time out to assist.