Convert text to json

niclas.w82 · 7 September 2020 16:58

Hi.

Any idea what I should do with this output? I would like to convert it to JSON format somehow, so that I can easily point to the path of the different values.

[{"id":"339b5903.704d06","type":"debug","z":"7d7120c8.6f011","name":"debug","active":true,"tosidebar":true,"console":false,"tostatus":false,"complete":"payload","targetType":"msg","x":950,"y":200,"wires":[]},{"id":"c860a983.4c6558","type":"template","z":"7d7120c8.6f011","name":"","field":"payload","fieldType":"msg","format":"handlebars","syntax":"mustache","template":" id=\"10070985\" name=\"controller.door.motorlock.locked\" type=\"controller\" timestamp=\"1599227983000\" domain=\"4\" domainName=\"Great Security KBA Energi gatan 3\"><argument value=\"LCU\" type=\"controller\" id=\"11\"/><argument value=\"Kundentré\" type=\"dac\" id=\"20\" externalId=\"ID:20_20181002_150003\"><![CDATA[address=1]]></argument><argument value=\"Kundentré\" type=\"door\" id=\"27\"/></event></batch><batch name=\"controller.door.motorlock.locked.batch\">","output":"str","x":780,"y":200,"wires":[["339b5903.704d06"]]},{"id":"753be189.cf969","type":"inject","z":"7d7120c8.6f011","name":"","topic":"","payload":"","payloadType":"date","repeat":"","crontab":"","once":false,"onceDelay":0.1,"x":580,"y":200,"wires":[["c860a983.4c6558"]]}]

knolleary · 7 September 2020 17:12

Where has that text come from? It looks like a fragment of XML. If you had the complete XML string, you could use the XML node to parse it to a JavaScript object.

But as it's only part of the XML, it'll be hard to parse reliably.
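
As a stopgap while only a fragment is available, the attributes can be pulled out with a regular expression instead of a real XML parser. The following is a minimal sketch under that assumption: `extractArguments` is a hypothetical helper (not part of Node-RED), and it only handles simple `name="value"` attributes like the ones in the fragment posted above. It will not cope with nested quotes, CDATA content or a `>` inside an attribute value.

```javascript
// Hypothetical helper: pulls every <argument ...> element out of a partial
// XML string and returns its attributes as plain objects. This is a regex
// sketch for well-behaved input, not a real XML parser.
function extractArguments(xmlFragment) {
    var results = [];
    var tagPattern = /<argument\b([^>]*)>/g;
    var attrPattern = /([\w:.-]+)="([^"]*)"/g;
    var tagMatch;

    while ((tagMatch = tagPattern.exec(xmlFragment)) !== null) {
        var attributes = {};
        var attrMatch;

        // Restart the attribute search for every tag
        attrPattern.lastIndex = 0;
        while ((attrMatch = attrPattern.exec(tagMatch[1])) !== null) {
            attributes[attrMatch[1]] = attrMatch[2];
        }
        results.push(attributes);
    }
    return results;
}

// Shortened version of the fragment from the first post
var fragment = '<argument value="LCU" type="controller" id="11"/>' +
    '<argument value="Kundentré" type="door" id="27"/></event></batch>';
var parsedArguments = extractArguments(fragment);
console.log(JSON.stringify(parsedArguments, null, 2));
```

In a function node the result could be assigned to `msg.payload`, so that a value is reachable by a path such as `msg.payload[0].value`.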

niclas.w82 · 7 September 2020 17:53

Hi. It originally starts out as XML somewhere, but I receive it as a hex buffer and then convert it to utf8.

See topic Http request (open connection)

where Bart helped me a lot to get this far!

dceejay · 7 September 2020 18:03

Well really you ought to put the chunks back together as you go. Then if necessary split it back into separate documents (as you originally said the chunks never end). And then process that new doc and keep accumulating the next.

BartButenaers · 7 September 2020 20:21

Yes, what Dave says is entirely correct (of course ...).
The server sends an endless stream of chunks to your http-request node.
But unfortunately it seems that there is no 1-to-1 relationship between a chunk and an xml ...
A single xml can be divided across multiple chunks, or a chunk can contain multiple xmls.

I did something similar in the node-red-contrib-multipart-stream-decoder, as explained here. However, that was optimized, since an mjpeg stream involves LOTS of data to be processed (e.g. appending data is impossible in that case). I assume the data rate will be much lower in your case? If so, I would try to keep it as simple as possible, at the cost of a bit of performance ...

What you need to do is (off the top of my head ...):

1. Declare a (string) variable:
```
this.data = "";
```
2. When a chunk arrives, append it to that variable:
```
this.data = this.data.concat(chunkAsUtf8);
```
3. Search the data string for the beginning of the next xml string.
4. If not found, do nothing (i.e. wait until the next chunk arrives).
5. If found, extract the previous xml from the data variable and send it in the output message.
6. Remove the previous xml from the data variable.
7. Repeat 5 and 6 until you have processed all xml strings in the data string.

I assume something like this:

```
var nextIndex;

// Find the index where the next xml string starts inside the data.
// Start searching at position 1, so that the "<?xml" at the very start of the
// data (i.e. the xml that is still being collected) is skipped.
while ((nextIndex = this.data.indexOf("<?xml", 1)) > 0)  {
    // Get the previous xml from the data (slice excludes the end index)
    var previousXml = this.data.slice(0, nextIndex);

    // Send the previous xml
    node.send( {payload: previousXml });

    // Remove the previous xml from the data
    this.data = this.data.slice(nextIndex);
}
```

This code might contain syntax/logical errors !!!
So you will need to test/debug it, since we have no access to your stream ...
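
Since the real stream is not available, the splitting logic can be tried against simulated chunks first. The following is a minimal sketch, not the final implementation: `createXmlSplitter` and the sample chunks are made up for illustration, and it assumes that every xml starts with a `<?xml` declaration.

```javascript
// Hypothetical helper: buffers utf8 chunks and calls onXml for every xml
// that is known to be complete because the next "<?xml" has started.
function createXmlSplitter(onXml) {
    var data = "";

    return function (chunkAsUtf8) {
        data = data.concat(chunkAsUtf8);

        var nextIndex;
        // Search from position 1, so that a declaration at the very start of
        // the buffer is not mistaken for the start of the next xml.
        while ((nextIndex = data.indexOf("<?xml", 1)) > 0) {
            onXml(data.slice(0, nextIndex));
            data = data.slice(nextIndex);
        }
    };
}

// Simulated stream: the first xml is split across two chunks, and the
// declaration of the third xml is itself cut in half by a chunk boundary.
var decl = '<?xml version="1.0"?>';
var receivedXmls = [];
var feedSplitter = createXmlSplitter(function (xml) { receivedXmls.push(xml); });

[
    decl + "<a>1",
    "</a>" + decl + "<b/><?x",
    'ml version="1.0"?><c/>'
].forEach(function (chunk) { feedSplitter(chunk); });

console.log(receivedXmls);
```

Note that the last xml (`<c/>` in this simulation) stays in the buffer, because nothing has arrived yet to show that it is complete.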

niclas.w82 · 7 September 2020 21:36

Thanks for all the help. I will continue trying this tomorrow.

BartButenaers · 7 September 2020 21:37

I'm now trying it. But not sure whether I will be able to finish it tonight because I need to get up early tomorrow...

BartButenaers · 7 September 2020 22:45

For anybody following this discussion: I got access to the stream via a private message, to be able to test.
I tested it with the following snippet:

```
var request = global.get('request');

debugger;

var node = this;
node.data = "";

request({
    method: 'GET',
    uri: 'some_private_url',
    gzip: true
},
function (error, response, body) {
    // This callback is only called once the request has ended (or failed)
    console.log('Most probably the stream has been ended ...');
})
.on('data', function(chunkAsBuffer) {
    // Convert the (decompressed) buffer to an utf8 string
    var chunkAsUtf8 = chunkAsBuffer.toString('utf8');

    node.data = node.data.concat(chunkAsUtf8);

    var firstIndex;

    // Find the index where the first xml string starts inside the data.
    while ((firstIndex = node.data.indexOf("<?xml")) > -1)  {
        // When the next xml doesn't start yet in the current data, the entire
        // xml hasn't arrived yet.  So wait until the next chunk arrives ...
        if (firstIndex === node.data.lastIndexOf("<?xml")) {
            break;
        }

        // Find the index where the next xml starts (searching beyond the first one)
        var nextIndex = node.data.indexOf("<?xml", firstIndex + 1);

        // Get the previous xml from the data (slice excludes the end index)
        var previousXml = node.data.slice(firstIndex, nextIndex);

        // Send the previous xml
        node.send( {payload: previousXml });

        // Remove the previous xml from the data
        node.data = node.data.slice(nextIndex);
    }
});
```

Using a combination of indexOf and lastIndexOf is not really good for performance, but it seems that not much data is involved in this particular stream.

When testing the stream, it seems that there is a huge time interval between the xml's (i.e. I only received one since I started my test ...). Only the chunks of a single xml arrive very quickly after each other. And I also frequently get keep-alive heartbeats (at periodic intervals), so those should be ignored anyway ...

But this means that there is a big problem with my code snippet: I only send an xml when the START OF THE NEXT XML is detected. But since that next xml will only arrive much later, the previous xml will keep waiting to be sent, which results in unacceptable delays!

So the xml needs to be sent as an output message as soon as the END OF THE CURRENT XML is detected. However when looking at a (56K long) xml, that is impossible (since the closing tag of the events element doesn't arrive):

```
<?xml version="1.0" encoding="UTF-8"?>

<events>
<batch ...>
<event ...>
</event>
</batch>
...
<batch ...>
<event ...>
....
</event>
</batch>
<heartbeat timestamp="1599516241221"/>
<heartbeat timestamp="1599516301227"/>
<heartbeat timestamp="1599516361234"/>
<heartbeat timestamp="1599516421240"/>
...
```

So I "assume" the heartbeats also imply that the xml is completed...
But that is a wild guess...

But if my assumption is correct, it can be implemented by adding this (before the 'concat' line):

```
    if (chunkAsUtf8.trim().startsWith("<heartbeat")) {
        if (node.data !== "") {
            // Send the previous xml, which has probably arrived completely
            node.send( {payload: node.data});
            node.data = "";
        }
        return;
    }
```

I have tested this and it seems to work. There is only a delay of 1 heartbeat interval between receiving the xml and sending it in an output message.
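
The heartbeat idea can also be checked offline with simulated chunks. This is only a sketch under the same unconfirmed assumption (a heartbeat chunk means that the xml before it is complete): `createHeartbeatFlusher` and the sample chunks are hypothetical, and it only recognizes a heartbeat when a chunk starts with one.

```javascript
// Hypothetical helper combining the buffering with the heartbeat flush.
// Assumption (unconfirmed): a heartbeat chunk means the xml before it is complete.
function createHeartbeatFlusher(send) {
    var data = "";

    return function (chunkAsUtf8) {
        if (chunkAsUtf8.trim().startsWith("<heartbeat")) {
            if (data !== "") {
                // Send the buffered xml, which has probably arrived completely
                send({ payload: data });
                data = "";
            }
            return; // the heartbeats themselves are never forwarded
        }
        data = data.concat(chunkAsUtf8);
    };
}

// Simulated stream: an xml in two chunks, two heartbeats, another xml, a heartbeat
var sentPayloads = [];
var feedFlusher = createHeartbeatFlusher(function (msg) { sentPayloads.push(msg.payload); });

[
    '<?xml version="1.0" encoding="UTF-8"?><events><batch>',
    '<event/></batch>',
    '<heartbeat timestamp="1599516241221"/>',
    '<heartbeat timestamp="1599516301227"/>',
    '<?xml version="1.0" encoding="UTF-8"?><events><batch/>',
    '<heartbeat timestamp="1599516361234"/>'
].forEach(function (chunk) { feedFlusher(chunk); });

console.log(sentPayloads);
```

A heartbeat that arrives in the same chunk as other data would not be detected by this check, so that case needs to be verified on the real stream.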

I had hoped to be able to get a reusable component from this discussion, but it seems to be a VERY proprietary protocol.

niclas.w82 · 8 September 2020 07:20

Hi Bart.

The url sends events from the system. This is a test site with low activity. Maybe that explains why the delay is very long?

Colin · 8 September 2020 08:10

Do you have any control over the data coming in? If so it would be much better if it sent correctly terminated xml, or even better if it had an api that returned json.
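
If the sender cannot be changed, one possible workaround is to terminate the xml yourself before handing it to the XML node. A minimal sketch, assuming that `<events>` is the root element and that everything inside it has already been closed: `terminateEventsXml` is a hypothetical helper, and whether the result really parses needs to be checked on the real stream.

```javascript
// Hypothetical helper: appends the closing </events> tag when it is missing,
// so that the text becomes a complete document for an XML parser.
// Assumes <events> is the root element and all inner elements are closed.
function terminateEventsXml(xml) {
    var trimmed = xml.trim();

    // Leave the text alone when there is no events element, or when the
    // closing tag has already arrived.
    if (trimmed.indexOf("<events") === -1 || trimmed.endsWith("</events>")) {
        return trimmed;
    }
    return trimmed + "</events>";
}

var unterminatedXml = '<?xml version="1.0" encoding="UTF-8"?><events><batch/>\n';
var terminatedXml = terminateEventsXml(unterminatedXml);
console.log(terminatedXml);
```

The output could then be wired into the XML node, which converts it to a JavaScript object.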

BartButenaers · 8 September 2020 08:27

Absolutely. But now you have at least learned that - even in your fast production system - there might be a keepalive chunk arriving from time to time.

But now you know how the protocol works, and you have a code snippet that should be enough to get you started. You need to run the code on your production system, and debug it until you are sure it works fine for a sequence of xmls. Perhaps you need to do a few small tweaks. Based on the stuff you are busy with, I assume you have enough technical skills to do that. In case you have no idea how to debug a function node, I wrote a short tutorial some time ago.

niclas.w82 · 8 September 2020 10:54

Hi Bart.
I think I have what I need to continue testing and modify the code.
Many thanks for this and for your time!

/Niclas

system · 7 November 2020 10:54

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.
