How to support Extended UTF8 characters in MQTT Payload

Hello Everyone,
I am developing a dashboard in Node-RED that receives MQTT messages.
This is an example of a message I can receive:
image
There is a field, _name, that I cannot control and that can contain extended (non-ASCII) characters (such as é, ê, à, ...).
In that case, Node-RED gives me this:
image
and is not able to filter on payload keys, etc.
I am using the Node-RED MQTT node, version 3.0.2.
I have tried all the Output options of the MQTT node without success.
Do you have any idea how I could receive MQTT messages containing extended characters and have Node-RED decode them correctly?
Many thanks in advance

When this happens, could you copy the payload as JSON by using the copy value button that appears over the payload when you hover over it in the debug panel AND tell us what that value should be?

When I try to expand the msg, I get this:

And this is what I get when there is no special character (ASCII only) in the _name field:

Here is the full buffer :
[123,34,36,116,97,103,34,58,48,44,34,36,116,105,109,101,115,116,97,109,112,34,58,34,49,54,57,52,55,54,48,54,53,50,46,53,56,54,51,49,51,52,53,50,34,44,34,65,114,109,78,105,103,104,116,34,58,48,44,34,65,115,115,111,99,105,97,116,101,100,80,97,114,116,73,100,34,58,91,93,44,34,66,121,112,97,115,115,34,58,48,44,34,67,104,105,109,101,34,58,48,44,34,68,105,115,112,84,111,107,101,110,34,58,91,50,48,48,48,44,48,44,56,53,56,93,44,34,77,101,109,111,34,58,34,34,44,34,78,97,109,101,34,58,34,34,44,34,78,117,109,34,58,49,44,34,80,97,114,116,34,58,49,44,34,82,101,112,111,114,116,34,58,49,44,34,83,117,112,101,114,118,105,115,101,100,34,58,49,44,34,85,115,101,114,34,58,48,44,34,90,111,110,101,84,121,112,101,34,58,49,44,34,95,83,116,97,116,117,115,34,58,123,34,65,108,109,67,111,34,58,48,44,34,65,108,109,70,105,114,101,34,58,48,44,34,65,108,109,77,101,100,105,99,97,108,34,58,48,44,34,65,108,109,84,97,109,112,101,114,34,58,48,44,34,65,108,109,90,111,110,101,34,58,48,44,34,70,97,117,108,116,101,100,34,58,48,44,34,84,98,108,65,67,76,111,115,115,34,58,48,44,34,84,98,108,67,111,34,58,48,44,34,84,98,108,67,114,111,115,115,34,58,48,44,34,84,98,108,69,110,100,79,102,76,105,102,101,34,58,48,44,34,84,98,108,70,105,114,101,34,58,48,44,34,84,98,108,76,111,119,66,97,116,34,58,48,44,34,84,98,108,77,97,105,110,116,101,110,97,110,99,101,34,58,48,44,34,84,98,108,83,117,112,101,114,118,105,115,105,111,110,34,58,48,44,34,84,98,108,84,97,109,112,101,114,34,58,48,44,34,84,98,108,90,111,110,101,34,58,48,125,44,34,95,110,97,109,101,34,58,34,80,111,114,116,101,32,69,110,116,114,233,101,32,92,110,80,111,114,116,101,34,44,34,105,100,34,58,49,44,34,117,114,105,34,58,34,64,92,47,82,70,54,92,47,68,101,118,105]

As you can see in the buffer, I have :
95, _
110, n
97, a
109, m
101, e
34, "
58, :
34, "
80, P
111, o
114, r
116, t
101, e
32, (space)
69, E
110, n
116, t
114, r
233, é
101, e
32, (space)
92, \
110, n
80, P
111, o
114, r

The _name field contains "Porte Entrée". The "é" is not encoded as UTF-8 here: it arrives as the single byte 233, which is its extended ASCII code.
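To illustrate the point about byte 233, here is a minimal sketch that decodes the six bytes of "Entrée" copied from the buffer above in two ways. It only uses Node.js `Buffer`; the variable names are made up for the example.

```javascript
// Bytes for "Entrée" exactly as they appear in the buffer above (233 is the é).
const bytes = Buffer.from([69, 110, 116, 114, 233, 101]);

// Read as UTF-8, the lone byte 233 is not a valid sequence,
// so the decoder substitutes the replacement character U+FFFD.
const asUtf8 = bytes.toString("utf8");

// Read as latin1 (ISO/IEC 8859-1), every byte maps to exactly one character.
const asLatin1 = bytes.toString("latin1");

console.log(JSON.stringify(asUtf8));
console.log(JSON.stringify(asLatin1));
```

The UTF-8 reading loses the accented letter, which would explain why the payload cannot be used as expected, while the latin1 reading gives back "Entrée".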

The encoding for extended characters is most commonly UTF-8, which is a bit of a dog's breakfast since each character can, as I understand it, take between 1 and 4 bytes instead of the fixed 1 byte of ASCII. I think that character can also be represented in UTF-16 (2 bytes), as it is one of the extended Latin characters.

Hmm, é IS listed in the UTF-8 character tables however, as c3 a9 (two bytes). Not sure how that works to be honest.
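For what it is worth, the different byte representations of "é" can be checked with a few lines of Node.js. This is only a sketch; the variable names are invented, and the character is written as the escape `\u00e9` so the result does not depend on how the script file itself is saved.

```javascript
const eAcute = "\u00e9"; // é

// The same character encoded three ways, shown as hex bytes.
const utf8Hex = Buffer.from(eAcute, "utf8").toString("hex");
const latin1Hex = Buffer.from(eAcute, "latin1").toString("hex");
const utf16Hex = Buffer.from(eAcute, "utf16le").toString("hex");

console.log(utf8Hex);   // c3a9 (two bytes)
console.log(latin1Hex); // e9   (one byte, decimal 233)
console.log(utf16Hex);  // e900 (two bytes, little-endian)
```

So the single byte 233 seen in the buffer matches the latin1 form, not the UTF-8 one.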

Yes, it is defined in the ISO/IEC 8859-1 extended ASCII code page.

And here is the extract of the MQTT standard:

  • Topic Names and Topic Filters are case sensitive
  • Topic Names and Topic Filters can include the space character
  • Topic Names and Topic Filters are UTF-8 encoded strings, they MUST NOT encode to more than 65535 bytes

And this seems to do the job for you. In a function node:

// Decode the raw MQTT payload buffer as ISO/IEC 8859-1 (latin1)
msg.payload = msg.payload.toString("latin1");
return msg;

It decodes the input buffer as latin1 (ISO/IEC 8859-1) and returns the result as a string.
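If the goal is also to filter on payload keys, the decoded string can then be parsed as JSON. The following is a rough sketch, assuming the MQTT node's Output option is set to deliver a Buffer and that the publisher always sends latin1. `decodeLatin1Json` is a hypothetical helper name, and the sample payload is a shortened, made-up version of the message above.

```javascript
// Hypothetical helper: decode a latin1 Buffer payload and parse it as JSON.
function decodeLatin1Json(msg) {
    if (Buffer.isBuffer(msg.payload)) {
        const text = msg.payload.toString("latin1");
        msg.payload = JSON.parse(text); // throws if the text is not valid JSON
    }
    return msg;
}

// Inside a function node, the body would end with:
//     return decodeLatin1Json(msg);

// Stand-alone check with a shortened sample payload (assumed, not the real message).
const sample = {
    payload: Buffer.from('{"_name":"Porte Entr\u00e9e \\nPorte","id":1}', "latin1")
};
const out = decodeLatin1Json(sample);
console.log(out.payload._name);
console.log(out.payload.id);
```

After this step msg.payload is a normal object, so downstream nodes can test properties such as msg.payload._name directly.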


Thanks a lot !!!!

