MQTT "close timeout" issue using Aedes

I found this thread after upgrading from a pre-1.3 version of NR to 2.2 a few days ago:

Error stopping node: Close timed out - MQTT nodes
https://discourse.nodered.org/t/error-stopping-node-close-timed-out-mqtt-nodes/52830/10

Unfortunately I can't reply to the thread as it is closed, however @Colin can you please advise why you wrote this comment?

"Generally, in my opinion, it is better to use Mosquitto. It is very easy to install and configure."

The background of my question:

  • I have been using the Aedes broker for a limited number of devices for some time now and it works perfectly.
  • I upgraded to v2.2 a few days ago and the timeout issue occurred, which I have now found out is a common issue.
  • I am OK to work around the issue, even though it leads to a 20-30 second delay before the clients finish their subscription with the broker on a restart.
  • I have recently set up a master/slave NR setup, whereby the Master NR instance has the Aedes broker installed (node-red contrib) and the slaves connect to it.
  • The plan is for 1x Master and 8x Slaves to be set up to support about 205 physical zwave devices.
  • Each of the slave NR instances has a zwave controller installed (nr-zwave-js - node-red-contrib-zwave-js (node) - Node-RED) and the zwave devices are being added to those slaves.
  • So far, testing across a setup of 1x Master and 3x Slaves, Aedes is working perfectly, sending and receiving zwave messages between the Slave NR instances and the Master NR instance. However, after reading your comment I suspect that I may run into issues as I transfer more devices across to the Slave NR instances.

My next step is to see how to set up an MQTT broker outside of NR, but I'd like to understand why I am doing it before I potentially run into issues with my current setup.

Found this link on installing MQTT and agree that it doesn't look too hard:

I've re-opened that thread for you if you want to discuss your issues with Aedes contrib node

(and then you can just use this one for your transition to using Mosquitto)

Just general comment - Aedes is great to get a broker up and running without having to go down into the engine room of your computer :slight_smile:

A local LAN Mosquitto install is best for when you want to use MQTT seriously

Hi @cymplecy

Thank you for re-opening the thread, however I think you have summarised the response I need without going into specifics.

I intend to use MQTT seriously and from what I've read am ready to install it locally.

I think the topic is closed on that basis.


Switching to Mosquitto running locally has been an interesting journey.

For anyone planning to do so, the guide here is very helpful, however you need to do more than it covers to switch across: How to Install The Mosquitto MQTT Broker on Linux

Lessons learnt:

  1. Disabling your node-red-contrib-aedes node in NR does not stop the Aedes MQTT broker from working.
    Even after rebooting both the client (Pi) and server (Deb 11) the connections persisted.
    This was a problem for me as I had set up MQTT locally and thought it was working, but it was just Aedes persisting.
  • Perhaps removing the node from your flow would work, but I just jumped straight to uninstalling it completely.
  • On the instance of NR with Aedes installed, you then need to set up a new MQTT server in every MQTT out node, individually.
  • I also rebooted all machines after doing all the config and starting the MQTT service, and then the magic just happened => all seemed to work.
  • This is different to how MQTT Aedes works: once you had configured the server once, you could pick it from a drop-down in each MQTT out node in any flow within an environment.
  2. I've used systemctl and pm2, however I did not know that "service" actually worked on its own.
    I appreciate this is a complete noob comment, but I wasn't trusting (until I tested it) that this command would actually always start the MQTT service, even after reboot:
sudo service mosquitto start

  3. How many connections do I have?
    There's talk of 1024 max connections maybe somehow not being enough, but how do you tell how many connections you actually have? I found this gives a count of connections:
sudo netstat -natp | grep 'ESTABLISHED.*mosquitto' | wc -l

or if you are a nosy-parker and need to see them individually:

sudo netstat -ntp | grep 'ESTABLISHED.*mosquitto'
  4. Increasing the max number of connections
    First, what is the current setting?
    Run this to get the PID of "mosquit+":
ps aux | grep mosquitto

then run this with the PID from above:

cat /proc/<pid>/limits

This is my output:
image

Hoping someone can help....

QUESTION 1:
Confusion reigns... nobody mentioned soft and hard limits. What does that even mean? I've googled, but no luck.
I've also read the following: "Regardless, since mosquitto is single-threaded, we have not found it useable for anything more than about 1000 publisher clients with a reasonable payload rate at 1 / 10 seconds."
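
For anyone else puzzled by the two columns: as far as I understand it, the soft limit is the value the kernel actually enforces on the process right now, and the hard limit is the ceiling up to which the soft limit can be raised without root. Below is a minimal sketch for pulling just the open-files row out of that listing. It assumes the usual Linux layout of /proc/<pid>/limits, and the function name open_file_limits is made up for this example.

```shell
# Print the soft and hard "Max open files" values from a limits file.
# Assumes the standard Linux layout: "Max open files  <soft>  <hard>  files".
# Usage: open_file_limits /proc/<pid>/limits
open_file_limits() {
    awk '/^Max open files/ { print "soft=" $4, "hard=" $5 }' "$1"
}

# Demo against the current process, where /proc is available.
if [ -r /proc/self/limits ]; then
    open_file_limits /proc/self/limits
fi
```

Point it at the mosquitto PID found with ps to see the limits the broker is actually running under.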

QUESTION 2:
I haven't been able to get this to work. MQTT just doesn't start:

LimitNOFILE=4000

I have seen references to this

DefaultLimitNOFILE=65536

But apparently that parameter resides in /etc/systemd/system.conf, NOT in /etc/mosquitto/ where the mosquitto.conf file is that I edited for the other suggested updates from the link/tutorial I pasted above.

Either way, it seems to be working now, so unless I hit 1024 open files and the hard limit doesn't allow for more, it will hopefully continue to work.
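
A note for anyone hitting the same wall: LimitNOFILE is a systemd unit setting, not a mosquitto.conf option, so if that line went into mosquitto.conf it is a likely reason the broker refused to start. Here is a hedged sketch of the usual alternative, a per-service drop-in. It assumes mosquitto runs as the systemd unit mosquitto.service; the helper name write_nofile_dropin and the value 4096 are only illustrative.

```shell
# Write a systemd drop-in that sets the open-file limit for one unit only.
# $1 = drop-in directory, normally /etc/systemd/system/mosquitto.service.d
# $2 = the limit to set, e.g. 4096 (illustrative value)
write_nofile_dropin() {
    dropin_dir="$1"
    limit="$2"
    mkdir -p "$dropin_dir"
    printf '[Service]\nLimitNOFILE=%s\n' "$limit" > "$dropin_dir/override.conf"
}

# Real use needs a root shell, followed by a reload and a restart:
#   write_nofile_dropin /etc/systemd/system/mosquitto.service.d 4096
#   systemctl daemon-reload
#   systemctl restart mosquitto
```

After the restart, cat /proc/<pid>/limits with the new PID should show whether the change took effect.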

You should probably provide links to quote sources - I assume this is the source.
It is from 5 years ago - a lot has changed.

For example...

(source)

Definitely not. All your mqtt nodes should be using the same server config node. So if you click the dropdown arrow against the Server field you should see only one entry (assuming you have only one broker). So all you should have had to do was click the pen against the broker, edit it to use localhost (assuming it is running on the same machine), save it and deploy. That would have switched all of the mqtt nodes, as they are all using the same server node.

That starts it, but doesn't tell it to start on boot. The default mosquitto install configures that automatically. To manually disable start on boot you can use
sudo systemctl disable mosquitto
and to re-enable it it is obviously
sudo systemctl enable mosquitto

sudo systemctl status mosquitto
will tell you whether it is running.

The systemd command to start/stop/restart is
sudo systemctl start mosquitto (or stop, or restart). The service start command is a shortcut for that.

You have one for each server node you have configured. So if you have set all the mqtt nodes to use the same server node as described above then you have one connection from node-red. If you have other devices connected to mosquitto then they will also have connections. You would need a massive system to get anywhere near the limit. If you run
sudo tail -f /var/log/mosquitto/mosquitto.log
it will show you the end of the log. You might not need the sudo; if you leave it off and it complains about permissions then you do need it. Having got the log showing, leave that open and in another terminal run
sudo systemctl restart mosquitto
and in the log you will see it stop mosquitto and then restart it, and all the devices will reconnect, which you should see in the log, which will tell you how many there are.

As I said, I cannot imagine you are going to get anywhere near that unless you have a system with hundreds of devices connected to the broker. If it was working with aedes then it will work with mosquitto.


Hi @Colin

Thanks for the prompt reply.

Good reminder on adding links, will bear in mind. Not at desktop now, so can't paste it in, but what you pasted looks like the right one. Thank you!

BUGS - server drop-down in MQTT node
I had some weirdness today in my Master NR instance.
What I reported is what happened... I had to keep re-entering the server details, however time has passed, reboots and what-not later, and I just looked now to see that the drop down is populated.

I should log it under a separate thread, but I have no evidence to support.... I also had an incident with a subflow today.
I changed some code in the subflow, dropped a new instance of the subflow on the canvas, and the new instance reflected the change in code (i.e. via its output) while the old one didn't. Lots of testing later, I was 100% sure the old and new were out of sync: I had them side-by-side in the same flow with debugs on each, and the instance placed before the code change produced the old output while the instance dropped after the change produced the new output. I deleted the old instance and replaced it, and all is good now... weird and worrying at the same time. I am very accustomed to NR being rock solid.

I wonder if the issue was linked to the Server drop-down not working in the MQTT nodes... unfortunately my desire to "fix" outweighed my common sense to export the flow first for analysis. If it happens again, I'll be sure to do an export first.

SERVICE VS SYSTEMCTL
Thank you for the explanation on service vs systemctl. It perplexed me, hence I had to write it in my post. I'm a lot more comfortable now that I have your explanation.

MAX FILE LIMITS
I currently have around 400 device nodes on my NR flows in my Master NR environment. As I migrate, the number may drop a little as I am tidying up flows, so maybe I end up with 300 subflows (each with an MQTT IN and MQTT Out). Does that mean I can assume about 600 connections in total, so still under the 1024, or is it that additional connections are made per MQTT node?

What is a 'device node'?

Absolutely not. What matters is how many config nodes you have for the mqtt broker. Assuming that you have mosquitto and no other brokers on the system then when you click the dropdown against the server field you should see something like this

image

You should see only one broker listed (mine is named Owl). If you see more than one, and they all contain the same data, then you should delete all except one and use that one. That is what I was referring to in the previous post. If you have multiple entries in the dropdown then each one of them uses a connection. The number of actual MQTT In/Out nodes you use is irrelevant; all that matters is the number of config entries in that dropdown. If you have external devices, such as sensors or other bits of kit, that connect to the broker then each one of those will be using a connection.

I had multiple servers set up, due to the "bug" from earlier. Now I have only one:

Thanks for that pro tip!

"What is a 'device node'?"
Yeah, should have explained that better.

I am receiving messages from SLAVE NR environments, each with their own zwave UZB stick.
The messages come from nodes => physical devices.
A physical device can have multiple virtual devices within it / or sub-devices e.g. a 4 in 1 motion sensor has 4x virtual devices.

I have created a subflow that manages the messages IN and OUT of the NR Master to the specific NR slave. I call these device nodes.

This image of a device node represents one of the two switches from a single physical z-wave device. It is used to control a heating zone

image

Each device node is a subflow and contains 2x MQTT nodes - an IN and an OUT
image

image

So my question above was: if I end up with whatever number of device (subflow) nodes, e.g. like the "Heating Central L3_Gnd Entrance", will each one create 2x connections (1x for each MQTT node) or will there be some other number of connections?

I have been looking and the numbers don't tally, but so far it doesn't seem to be an issue:
I currently have:

  • 17 device (subflows) deployed
  • 6 other subflows deployed (with 1x MQTT OUT only)

And this command is showing 29 connections:

sudo netstat -ntp | grep 'ESTABLISHED.*mosquitto'

The numbers do not add up, as the 6x other subflows only have 1x MQTT out, whereas the 17 device subflows have 1x in and 1x out, but the total number isn't too far off the count of devices I've dropped on the canvas.

My concern was only if it multiplied out somehow...

I'll keep an eye on it as I add more devices and see what happens to the count.
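
To help watch that count, here is a rough sketch that breaks the netstat listing down per remote host, so it is easier to see which machines hold the connections. It assumes the usual Linux netstat -ntp column order (the remote address is the fifth column) and IPv4 addresses; the function name connections_by_host is made up for this example.

```shell
# Count established mosquitto connections per remote host.
# Reads `netstat -ntp` output on stdin; assumes column 5 is the
# remote address in IPv4 "ip:port" form.
connections_by_host() {
    awk '/ESTABLISHED/ && /mosquitto/ {
        addr = $5
        sub(/:[0-9]+$/, "", addr)   # drop the trailing :port
        count[addr]++
    }
    END { for (a in count) print count[a], a }' | sort -rn
}

# Usage on the broker machine:
#   sudo netstat -ntp | connections_by_host
```

Each output line is a connection count followed by the remote address, busiest host first.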

Thank you for all your help. It's always appreciated.

Hey @Steve-Mcl
I didn't see the link you pasted when I checked on my phone... yes, that was the reference.

@Colin amazing how many things I missed when reading from my mobile and dealing with children at the same time.

Re-read your post and saw this comment:

"As I said, I cannot imagine you are going to get anywhere near that unless you have a system with hundreds of devices connected to the broker. If it was working with aedes then it will work with mosquitto."

Not a like-for-like comparison.

My setup is as follows:

  • a very small number of devices connected using MQTT
  • probably about 400 devices connected using HTTP requests - these are devices connected to a Fibaro controller

My future setup is to have ALL devices connecting using MQTT - all devices will be connected to nr-zwave-js on SLAVE NR environments and report into 1x Master (as I described in the post above). There could be between 300 and 400 of the device subflows across my Master NR environment.

Also, hence the comment I made to @cymplecy in his OP:
"I intend to use MQTT seriously and from what I've read am ready to install it locally."

PS: Thanks for all the detailed notes on systemctl. I knew them already from my previous experience, but good to read again and hopefully will help someone else.
The command to view the tail of the log was handy!

It still isn't clear exactly what devices/processes are connected to MQTT. As I have said multiple times, if you only have one broker configured in node red then that will be 1 connection, even if you have 300 in/out nodes they will all use the same connection. If you have multiple node reds running then each will have a connection. If you have other devices using mqtt directly then each of them should use one connection.

What did the mosquitto log tell you about connections when you restart node red, as suggested earlier? It should tell you the IP address and client ID of each connection.
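
If the log is long, something like the sketch below can list the distinct clients seen in it. It assumes the log lines look like "New client connected from <address> as <client id> ..." and that addresses are IPv4, so check against your own log first; the function name connected_clients is made up for this example.

```shell
# List distinct "address client-id" pairs from a mosquitto log file.
# Assumes lines of the form:
#   <timestamp>: New client connected from <address> as <client id> (...)
connected_clients() {
    awk '/New client connected from/ {
        for (i = 1; i < NF; i++) {
            if ($i == "from") ip = $(i + 1)
            if ($i == "as")   id = $(i + 1)
        }
        sub(/:[0-9]+$/, "", ip)   # drop a trailing :port if the log has one
        print ip, id
    }' "$1" | sort -u
}

# Usage (may need sudo to read the log):
#   connected_clients /var/log/mosquitto/mosquitto.log
```

Piping the result through wc -l gives the number of distinct clients that have connected since the log began.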

For those following this thread, if you have the opportunity to update to node-red 2.2.2, this "close timed out" problem should now be improved. Please let me know if not.
