MQTT, messages from offline devices

Did you say you had checked in MQTT Explorer that there is nothing retained on the reply topic?

Well, as the broker was power reset the other day, I can not say for sure.
(Hey, I'm being honest)

I have since powered on the Alarm_Clock and checked its MQTT stuff. As I said, that rogue message's retain flag was blank.

I set it to false and shall have to wait and see what happens now.

With default settings in Mosquitto powering off does not lose retained topics, so the question is valid:

Did you say you had checked in MQTT Explorer that there is nothing retained on the reply topic?

As of now, I can't see the retained topic.

So all the retained messages are stored on the SD card?

Provided you have Mosquitto set to save retained data to the file system, it will save it there (wherever it is configured to do that). The default configuration file that ships with the Debian/Raspberry Pi OS package does save it to the file system. Have a look in /etc/mosquitto/mosquitto.conf and you will probably find

persistence true
persistence_location /var/lib/mosquitto/

which I guess is fairly self explanatory.

Thanks.

However: since the power cycle I have stopped seeing the rogue messages.

Also to be considered: I changed the retain flag from BLANK to false.

So I can't say definitively what is going on as since then all is ok.

I guess you mixed things up there a bit, don't know why, but to my understanding the retain flag should never be BLANK. It is the payload that should be BLANK if you wanna reset the retained value. The retain flag should be either false or true.
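
To make the retain rules concrete, here is a minimal sketch of how a broker treats the retain flag and an empty payload. It is a toy model in plain C++, not how Mosquitto is implemented, and the class and method names (RetainedStore, publish, lookup) are made up for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model of a broker's retained-message table (illustration only).
class RetainedStore {
public:
    // Handle an incoming publish. Only messages with retain == true touch
    // the table; an empty payload with retain == true clears the entry.
    void publish(const std::string& topic, const std::string& payload, bool retain) {
        assert(!topic.empty());
        if (!retain) {
            return;  // delivered to current subscribers, but never stored
        }
        if (payload.empty()) {
            retained_.erase(topic);
        } else {
            retained_[topic] = payload;
        }
    }

    // What a client would be sent straight away on subscribing to `topic`.
    // Returns false if nothing is retained there.
    bool lookup(const std::string& topic, std::string& payloadOut) const {
        std::map<std::string, std::string>::const_iterator it = retained_.find(topic);
        if (it == retained_.end()) {
            return false;
        }
        payloadOut = it->second;
        return true;
    }

private:
    std::map<std::string, std::string> retained_;
};
```

In this model a retained "Dead" on a Status topic keeps being handed to every new subscriber, even if the device that caused it is powered off, until something publishes an empty payload to that topic with retain set to true.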

Yeah, fair enough.

I was going through all the MQTT nodes and saw that the node had its retain flag blank.
So I set it to false.

In fact when you add an MQTT Out node the retain field is blank, and in the help text it says it defaults to false, so presumably leaving it blank is ok.

Yeah, but from what I've seen, leaving it blank doesn't make it false.

I'll try to remember to go back and delete that setting and make it blank again and see if it happens again.

I suspect that what you have seen was caused by something else, unless the message you sent overrode the retain setting (perhaps it was left over from a previous node).

Drats....

After setting to FALSE, I can't change the setting back to being blank.

So I can't really test it. :frowning:

Maybe when I have time and I power up another machine I will try to remember to check the MQTT node settings.

That is an interesting thought, I realise. An MQTT In node message has msg.retain set from the received topic. So if you feed the message from an In node (possibly via other nodes) to an MQTT Out node which has Retain blank then it will use msg.retain from the In node, which may not be false.
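
The precedence described above can be sketched as a small function. This is only a model of the behaviour discussed in this thread, written in C++ for illustration; it is not Node-RED's own code, and the names RetainSetting and effectiveRetain are hypothetical.

```cpp
// Three possible states of a retain setting: not set at all, false, or true.
enum class RetainSetting { Blank, False, True };

// Hypothetical helper: decide whether an outgoing message is retained.
// Assumed rule: the node's own setting wins when it is filled in; if it is
// blank, msg.retain is used; if that is missing too, the result is false.
bool effectiveRetain(RetainSetting nodeSetting, RetainSetting msgRetain) {
    if (nodeSetting != RetainSetting::Blank) {
        return nodeSetting == RetainSetting::True;
    }
    return msgRetain == RetainSetting::True;
}
```

Under that assumption, a message that arrived from an MQTT In node with msg.retain true and is passed to an Out node with Retain blank goes out retained, while the same message through an Out node set to false does not.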

These are the settings of which I was speaking.

When I opened it a while back they were blank.
After changing them to false the messages have seemingly stopped.

But I don't want to be too quick in saying it has been solved just yet.

Good point, I had forgotten it is LWT messages you were referring to. I have just checked and with node-red 2.2.2 leaving Retain blank for LWT messages does default to not retained.

Alarm_Clock is only 2.1.6....

I have older versions and I may check them when I get time to power up those machines.

I think that your reference to birth and death certificates is a bit confusing. I guess you're actually referring to the LWT functionality?

Think of LWT as being like the thing you see in the movies, where the good guy goes to his solicitor and says "If you don't hear from me by this time next week, mail this dossier to the New York Times" :grinning:

The LWT instruction you issue from PubSubClient is exactly the same. It's telling the broker that if it hasn't heard from the client for a while then it should post message x to topic y.
The amount of time before the broker regards the client as "missing in action, presumed dead" is determined by the keep-alive interval the client requests when it connects (the broker allows one and a half times that interval). In my setup with Mosquitto, that seems to work out at around 30 seconds.
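
A minimal sketch of that timing, assuming the MQTT 3.1.1 rule that the broker must drop a client it has not heard from within one and a half times the keep-alive value the client sent in its CONNECT packet. The function name brokerTimeoutSeconds is made up for illustration.

```cpp
// Seconds of silence after which an MQTT 3.1.1 broker must consider the
// client gone (and publish its LWT): 1.5 times the requested keep-alive.
// A keep-alive of 0 means the check is disabled; this returns 0 for it.
double brokerTimeoutSeconds(unsigned keepAliveSeconds) {
    return keepAliveSeconds * 1.5;
}
```

For example, a client that asks for a 20-second keep-alive would be declared dead after 30 seconds of silence. PubSubClient's default keep-alive is, as far as I know, 15 seconds (MQTT_KEEPALIVE in PubSubClient.h), which would give 22.5 seconds; check your own copy of the library before relying on that figure.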

The documentation and examples for LWT with PubSubClient are a bit vague and it took a bit of research to learn how to make the most of the functionality.

This is the MQTT connect command I use when a device connects/reconnects to the server…

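// Register the LWT ("Dead", QoS 0, retained) on <base topic>/Status while connecting
if (MQTTclient.connect(mqtt_client_id.c_str(), mqtt_username, mqtt_password, (base_mqtt_topic + "/Status").c_str(), 0, 1, "Dead"))
{
    // Connected: overwrite the retained status with "Alive"
    MQTTclient.publish((base_mqtt_topic + "/Status").c_str(), "Alive", true);
}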

As you'll see, the connect() command defines the topic for the LWT message, and the message itself ("Dead"). If the connection is successful then the next line publishes an "Alive" message to the same topic.

The syntax for the .connect command is:

  /* 
  MQTT Connection syntax:
  boolean connect (client_id, username, password, willTopic, willQoS, willRetain, willMessage)
  Connects the client with a Will message, username and password specified.
  Parameters
    client_id : the client ID to use when connecting to the server.
    username : the username to use. If NULL, no username or password is used (const char[])
    password : the password to use. If NULL, no password is used (const char[])
    willTopic : the topic to be used by the will message (const char[])
    willQoS : the quality of service to be used by the will message (int : 0,1 or 2)
    willRetain : whether the will should be published with the retain flag (int : 0 or 1)
    willMessage : the payload of the will message (const char[])
  Returns
    false - connection failed.
    true - connection succeeded
  */

I choose to set the retain flag to true for both the .connect() command and the subsequent publish() command, and this works very well for me.

Just to clarify the base_mqtt_topic in my commands, I organise my devices with a hierarchy like this:

Home
  Lounge
    Table_Lamp
    Main_Lights
    Fan_Controller
  Bedroom
    Bedside_Lamp    
    Main_Lights
    Fan_Controller

So the device that controls the bedside lamp in the bedroom will have a base MQTT topic of Home/Bedroom/Bedside_Lamp, and the topic for that device's LWT is Home/Bedroom/Bedside_Lamp/Status. I do this because I use the same core code for all of my devices, and only have to change one variable to adjust the device's MQTT settings.
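
The topic scheme above can be sketched as two small helpers. The function names baseTopicFor and statusTopic are hypothetical, and std::string is used so the snippet runs on a PC; on the device itself the original code builds the same string with base_mqtt_topic + "/Status".

```cpp
#include <string>

// Build a device's base topic from its place in the Home/<room>/<device>
// hierarchy. Hypothetical helper, not part of any library.
std::string baseTopicFor(const std::string& room, const std::string& device) {
    return "Home/" + room + "/" + device;
}

// The LWT/status topic always hangs directly off the device's base topic.
std::string statusTopic(const std::string& baseTopic) {
    return baseTopic + "/Status";
}
```

With this, statusTopic(baseTopicFor("Bedroom", "Bedside_Lamp")) gives Home/Bedroom/Bedside_Lamp/Status, so only the base topic has to change from one device to the next.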

Having said all of that, I don't actually use the LWT data that much.
Instead, I have each device publish heartbeat data every 5 seconds or so (the actual interval is randomised at start-up, to avoid all devices sending their heartbeats at the same time). Each device sends its RSSI value as part of this heartbeat, and this is what I monitor using a Timeout node to detect when a device goes offline.
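
A rough sketch of the device-side timing, assuming a base period plus a random offset chosen once at start-up. This is not the actual device code: the helper names and the 5000 ms / 1000 ms figures are illustrative, and it uses the C++ standard library so it can be run on a PC (on an Arduino you would use random() and millis() instead).

```cpp
#include <cstdint>
#include <random>

// Pick this device's heartbeat interval once at start-up: a base period
// plus a random offset in [0, jitterMs], so devices that boot together
// do not all publish at the same instant.
uint32_t pickHeartbeatIntervalMs(std::mt19937& rng, uint32_t baseMs, uint32_t jitterMs) {
    std::uniform_int_distribution<uint32_t> offset(0, jitterMs);
    return baseMs + offset(rng);
}

// True once intervalMs has elapsed since the last heartbeat was sent.
// Unsigned subtraction keeps the comparison correct when a millis()-style
// 32-bit counter wraps around.
bool heartbeatDue(uint32_t nowMs, uint32_t lastSentMs, uint32_t intervalMs) {
    return static_cast<uint32_t>(nowMs - lastSentMs) >= intervalMs;
}
```

In the main loop the device would call heartbeatDue() with the current tick count and, when it returns true, publish the RSSI and record the time. The receiving side then only needs a timeout somewhat longer than the largest possible interval.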

The LWT status is handy though, when looking at devices in MQTT Explorer. It gives instant feedback on whether the device is alive or dead, and therefore whether all of the other data can be trusted, without having to wait to see if the values are updated in the next heartbeat period.

Pete.

Hi Peter,

It's good to read your reply as I do similar things also.

When starting I knew nothing about the Birth and LWT messages.

Where you send messages from your devices every 5 seconds, I have a central machine pinging everyone every n seconds and the ping replies are used to determine if they are dead/alive.

I'm using the Birth and LWT message more as an exercise in seeing them work after I found out what they are.

We may be lucky and tomorrow I may have some more information on this as I have just superseded a machine with a new one, and though that should not make a difference: I'm hoping some of the Gremlins have survived the change.

But what I also do is a bit more checking as well.
All my Certificates are sent on two topics: SOM and EOM.

But on my main machine I also have a bit more code - which I just mentioned - and it is listening to the SOM topic.

If it sees one, it publishes a message on a different topic. This basically tells all connected devices to reply with their details. (Not on the SOM/EOM topic. The topic is IFF.)

What I've been seeing are a few LWT messages from dead machines every now and then.
They are bunched in with other SOM and EOM (or more so: EOM and SOM) messages.

Reading replies: this is normal - supposedly. The broker broadcasts all those messages when it deems it needed as it has just received one of their type.
I can see some good and bad things about that.
So putting that aside, ok.... Here's where it gets confusing for me.

As well as these expected LWT messages (and even BirthCertificates scattered in there as well) I am getting the messages the machine sends when it is asked "Who are you?" (IFF)
This has NOTHING to do with the BS and LWT. Rather they are sent on a different topic altogether.

Now, going slightly off topic:
Why am I saying that tomorrow may be good?
(Things have got a lot better in the past year or so.)
On a somewhat predictable basis: every morning at 05:00 my main WAP goes down.
(Why: Who knows. But it does. Then comes back.)

That has a knock-on effect: a few WiFi devices are flagged as Dead by the broker, and when the WAP comes back there is some negotiating happening.

So I get a few LWT and BS messages.
And with the BS messages the machine sends out a few IFF messages and gets their replies.

Why/how do machines that are NOT CONNECTED and even POWERED OFF reply to this?

This morning - alas - other things happened and I didn't get the WAP going down as usual.
Of course. I'm looking for the problem, so it won't present itself.

They are not certificates.
How are you sending them to two topics? It is only possible to specify one topic for each message.

Sorry I think that is more a confusion of their names. It is a birth CERTIFICATE... The LWT is a death CERTIFICATE.

That is what I am meaning by that statement.

Topics:

I can see two here.