Extracting values from XML files

I'm using Node-RED to extract data from reports generated by some medical equipment. These reports are generally XML files. While the devices do their medical job properly, they are really bad at producing reports, so the XML files are often faulty. The data I need to extract (process status, temperature, user ID, etc.) is always there, but some sections are often missing. The file is still converted to a JS object by the XML node, but the paths to the values I need change depending on what is missing.

Is it possible to extract a value that is sometimes at msg.payload.Signature.Object[0].Illumination[0].Header[0].User[0], sometimes at msg.payload.Illumination.Header[0].User, and sometimes somewhere else? In other words, the key is always .User, but its position inside the object varies.

You might be better off keeping the XML as a string and processing it that way. If the XML changes to that degree, it is going to be next to impossible to rely on a general-purpose XML->JSON conversion.
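
As a rough sketch of the string approach: assuming the element is literally named User, holds plain text, and is never self-closing or nested inside another User, a regular expression can pull the values out no matter where the element sits. The helper name extractTagValues and the sample XML are made up for illustration, and a regex is not a real XML parser (it will not handle CDATA, comments, or entity decoding).

```javascript
// Hypothetical helper: collect the text content of every <tagName> element
// in an XML string, wherever it appears in the document.
// Assumes tagName contains no regex metacharacters.
function extractTagValues(xml, tagName) {
    // Matches <User>...</User> and <User attr="x">...</User>.
    // Self-closing tags and nested same-name tags are not handled.
    const pattern = new RegExp(
        "<" + tagName + "(?:\\s[^>]*)?>([\\s\\S]*?)</" + tagName + "\\s*>",
        "g"
    );
    const values = [];
    let match;
    while ((match = pattern.exec(xml)) !== null) {
        values.push(match[1].trim());
    }
    return values;
}

// Made-up sample report, for illustration only.
const sampleXml =
    "<Signature><Object><Illumination><Header>" +
    "<User>jdoe</User><UserId>42</UserId>" +
    "</Header></Illumination></Object></Signature>";

const users = extractTagValues(sampleXml, "User");
```

In a function node this would be used on the raw string before any XML node, along the lines of msg.payload = extractTagValues(msg.payload, "User"); return msg;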

You could try feeding the XML into an HTML node and use a CSS selector to find the User tag.

Always different, or predictably different (say, 3 or 4 variations)?

Is the path stable enough to craft an XPath expression for it? I'd suggest keeping it as XML, loading it in a function node, and finding the value through an XPath search. Without having seen what your XML files look like I can't give reliable suggestions, but looking at the XPath specifications won't hurt :slight_smile:

XPath 1.0: XML Path Language (XPath)
XPath 2.0: XML Path Language (XPath) 2.0 (Second Edition)
XPath 3.0 (originally 2.1): XML Path Language (XPath) 3.0
XPath 3.1: XML Path Language (XPath) 3.1

Which version you need will depend on compatibility. I'm not sure which version is available in browser-based JavaScript, and I don't know offhand whether Node.js supports XPath out of the box or whether you need an additional library.

Or convert to JSON and use JSONata, which is made for exactly this purpose: traversing messy objects.

I couldn't agree more -- check out this section of the JSONata docs:

From what I'm reading, if you just need all the User objects (or strings) at any level in your structure, you should be able to get a list of them using the expression payload.**.User
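
For anyone who would rather do the same thing in a function node, here is a minimal sketch of a recursive search that behaves roughly like payload.**.User. The helper name findAllByKey and the two sample objects are made up for illustration; the samples only mimic the two paths mentioned in the question. Unlike the JSONata wildcard, this version does not look for further User keys nested inside a User value.

```javascript
// Hypothetical helper: walk an object/array tree and collect every value
// stored under the given key, at any depth.
function findAllByKey(node, key, found = []) {
    if (Array.isArray(node)) {
        for (const item of node) {
            findAllByKey(item, key, found);
        }
    } else if (node !== null && typeof node === "object") {
        for (const [k, v] of Object.entries(node)) {
            if (k === key) {
                // The XML node often wraps values in arrays; flatten one level.
                if (Array.isArray(v)) {
                    found.push(...v);
                } else {
                    found.push(v);
                }
            } else {
                findAllByKey(v, key, found);
            }
        }
    }
    return found;
}

// Made-up payloads shaped like the two paths from the question.
const reportA = {
    Signature: { Object: [{ Illumination: [{ Header: [{ User: ["jdoe"] }] }] }] }
};
const reportB = {
    Illumination: { Header: [{ User: "jdoe" }] }
};

const usersA = findAllByKey(reportA, "User");
const usersB = findAllByKey(reportB, "User");
```

Both calls return a one-element array containing "jdoe", so downstream nodes can read the first element without caring which sections of the report were missing.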

However you end up doing it, please do update this thread to let us know how you got on.