MQTT node doesn't reconnect to MQTT broker after restarting flows

My setup:

On my intel NUC device with host OS ubuntu 20.4 I am running docker and in this case I make use of following docker containers:

  • MQTT broker (without security) based on docker image eclipse-mosquitto
  • several node-red containers (based on docker image nodered/node-red:3.0.2-16) which connect to the MQTT broker using the core node red MQTT node with below configuration (so no security):

The problem is that when I restart the Flows using the Node-Red editor for one of my node-red containers then most of the times it is disconnecting from the MQTT broker but NOT reconnecting.

So when it is working I get following expected log output:

niamondo-map-1         | 7 Sep 20:41:16 - [warn] Using unencrypted credentials
niamondo-map-1         | 7 Sep 20:41:16 - [info] Stopping flows
niamondo-map-1         | 7 Sep 20:41:16 - [info] Stopped flows
niamondo-map-1         | 7 Sep 20:41:16 - [info] Starting flows
niamondo-map-1         | 7 Sep 20:41:16 - [info] [worldmap:37add1f3.3f5cce] started at /worldmap
niamondo-map-1         | 7 Sep 20:41:16 - [info] Started flows
niamondo-mqtt-1        | 1662576076: Client nodered_d8aaeaa349a742d9 disconnected.
niamondo-mqtt-1        | 1662576076: New connection from 172.19.0.10:45648 on port 1883.
niamondo-mqtt-1        | 1662576076: New client connected from 172.19.0.10:45648 as nodered_4e0ec337db03012d (p2, c1, k60).
niamondo-map-1         | 7 Sep 20:41:16 - [info] [mqtt-broker:mqtt broker] Connected to broker: mqtt://mqtt:1883
niamondo-map-1         | 7 Sep 20:41:16 - [info] [debug:MQTT send map/state] not_initialized
niamondo-qr-scanner-1  | 7 Sep 20:41:16 - [info] [debug:MQTT receive map/state] not_initialized

but when I restarted it again 20 seconds later I got following unexpected output:

niamondo-map-1         | 7 Sep 20:41:31 - [warn] Using unencrypted credentials
niamondo-map-1         | 7 Sep 20:41:31 - [info] Stopping flows
niamondo-map-1         | 7 Sep 20:41:31 - [info] Stopped flows
niamondo-map-1         | 7 Sep 20:41:31 - [info] Starting flows
niamondo-map-1         | 7 Sep 20:41:31 - [info] [worldmap:37add1f3.3f5cce] started at /worldmap
niamondo-map-1         | 7 Sep 20:41:31 - [info] Started flows
niamondo-mqtt-1        | 1662576091: Client nodered_4e0ec337db03012d disconnected.
niamondo-map-1         | 7 Sep 20:41:31 - [info] [mqtt-broker:mqtt broker] Connected to broker: mqtt://mqtt:1883
niamondo-map-1         | 7 Sep 20:41:31 - [info] [debug:MQTT send map/state] not_initialized

So you can see in the above log file that my MQTT broker container niamondo-mqtt-1 is only reporting a disconnect but doesn't report a (client) connection.
Although the MQTT broker doesn't report a client connection, the node-red container niamondo-map-1 does report that it has connected to the broker which you can also see in the node-red editor:

I am pretty sure that my node-red container is not properly connected to the broker as it is not receiving any MQTT events that are sent by the other container niamondo-qr-scanner-1.

I have found the root cause.

The problem is that it is connecting to a different MQTT broker container that is also running on the same Host OS.

I have opened a shell in my node-red container and ran netstat -a command.

Normally I should get something like:

bash-5.1$ netstat -a
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       
tcp        0      0 0.0.0.0:1880            0.0.0.0:*               LISTEN      
tcp        0      0 localhost:37205         0.0.0.0:*               LISTEN      
tcp        0      0 ac91111b855f:1880       traefik.nuc1_default:54064 ESTABLISHED 
tcp        0      0 localhost:1880          localhost:58900         TIME_WAIT   
tcp        0      0 ac91111b855f:1880       traefik.nuc1_default:54054 TIME_WAIT   
tcp        0      0 ac91111b855f:1880       traefik.nuc1_default:54062 TIME_WAIT   
tcp        0      0 ac91111b855f:1880       traefik.nuc1_default:54058 TIME_WAIT   
tcp        0      0 localhost:1880          localhost:58890         TIME_WAIT   
tcp        0      0 ac91111b855f:1880       traefik.nuc1_default:54060 TIME_WAIT   
tcp        0      0 ac91111b855f:1880       traefik.nuc1_default:53612 ESTABLISHED 
tcp        0      0 ac91111b855f:45654      niamondo-mqtt-1.nuc1_default:1883 ESTABLISHED 
udp        0      0 localhost:59880         0.0.0.0:*                           
Active UNIX domain sockets (servers and established)
Proto RefCnt Flags       Type       State         I-Node Path
bash-5.1$ 

but when the problem happened I saw the following:

bash-5.1$ netstat -a
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       
tcp        0      0 0.0.0.0:1880            0.0.0.0:*               LISTEN      
tcp        0      0 localhost:37205         0.0.0.0:*               LISTEN      
tcp        0      0 ac91111b855f:1880       traefik.nuc1_default:54066 TIME_WAIT   
tcp        0      0 localhost:1880          localhost:58924         TIME_WAIT   
tcp        0      0 ac91111b855f:1880       traefik.nuc1_default:54064 TIME_WAIT   
tcp        0      0 localhost:1880          localhost:58934         TIME_WAIT   
tcp        0      0 ac91111b855f:1880       traefik.nuc1_default:53612 ESTABLISHED 
tcp        0      0 ac91111b855f:1880       traefik.nuc1_default:54068 ESTABLISHED 
tcp        0      0 ac91111b855f:48928      nuc1-mqtt-1.nuc1_default:1883 ESTABLISHED 
udp        0      0 localhost:59880         0.0.0.0:*                           
Active UNIX domain sockets (servers and established)
Proto RefCnt Flags       Type       State         I-Node Path
bash-5.1$ 

So you see that in the second netstat log a connection has been made with

  • nuc1-mqtt-1.nuc1_default:1883

while in the first netstats a connection has been made with a different host:

  • niamondo-mqtt-1.nuc1_default

So it appears that after restarting the flows the mqtt hostname is suddenly resolved to a different IP address as you see here below:

bash-5.1$ ping mqtt
PING mqtt (172.19.0.3) 56(84) bytes of data.
64 bytes from nuc1-mqtt-1.nuc1_default (172.19.0.3): icmp_seq=1 ttl=64 time=0.125 ms
64 bytes from nuc1-mqtt-1.nuc1_default (172.19.0.3): icmp_seq=2 ttl=64 time=0.120 ms
64 bytes from nuc1-mqtt-1.nuc1_default (172.19.0.3): icmp_seq=3 ttl=64 time=0.120 ms
64 bytes from nuc1-mqtt-1.nuc1_default (172.19.0.3): icmp_seq=4 ttl=64 time=0.121 ms
64 bytes from nuc1-mqtt-1.nuc1_default (172.19.0.3): icmp_seq=5 ttl=64 time=0.122 ms
64 bytes from nuc1-mqtt-1.nuc1_default (172.19.0.3): icmp_seq=6 ttl=64 time=0.133 ms
64 bytes from nuc1-mqtt-1.nuc1_default (172.19.0.3): icmp_seq=7 ttl=64 time=0.119 ms
64 bytes from nuc1-mqtt-1.nuc1_default (172.19.0.3): icmp_seq=8 ttl=64 time=0.118 ms
64 bytes from nuc1-mqtt-1.nuc1_default (172.19.0.3): icmp_seq=9 ttl=64 time=0.120 ms
^C
--- mqtt ping statistics ---
9 packets transmitted, 9 received, 0% packet loss, time 8165ms
rtt min/avg/max/mdev = 0.118/0.122/0.133/0.004 ms
bash-5.1$ ping mqtt
PING mqtt (172.19.0.15) 56(84) bytes of data.
64 bytes from niamondo-mqtt-1.nuc1_default (172.19.0.15): icmp_seq=1 ttl=64 time=0.110 ms
64 bytes from niamondo-mqtt-1.nuc1_default (172.19.0.15): icmp_seq=2 ttl=64 time=0.121 ms
64 bytes from niamondo-mqtt-1.nuc1_default (172.19.0.15): icmp_seq=3 ttl=64 time=0.081 ms
64 bytes from niamondo-mqtt-1.nuc1_default (172.19.0.15): icmp_seq=4 ttl=64 time=0.120 ms
64 bytes from niamondo-mqtt-1.nuc1_default (172.19.0.15): icmp_seq=5 ttl=64 time=0.143 ms
64 bytes from niamondo-mqtt-1.nuc1_default (172.19.0.15): icmp_seq=6 ttl=64 time=0.194 ms
64 bytes from niamondo-mqtt-1.nuc1_default (172.19.0.15): icmp_seq=7 ttl=64 time=0.180 ms
^C
--- mqtt ping statistics ---

So it is not a node-red problem but a docker problem.

I have fixed it by renaming my mqtt docker container into mqtt-broker so that it doesn't clash with existing mqtt docker container.

So ping mqtt-broker will always return the proper IP address which is different from the IP address returned by ping mqtt :

bash-5.1$ ping mqtt-broker
PING mqtt-broker (172.19.0.16) 56(84) bytes of data.
64 bytes from niamondo-mqtt-broker-1.nuc1_default (172.19.0.16): icmp_seq=1 ttl=64 time=0.211 ms
64 bytes from niamondo-mqtt-broker-1.nuc1_default (172.19.0.16): icmp_seq=2 ttl=64 time=0.120 ms
64 bytes from niamondo-mqtt-broker-1.nuc1_default (172.19.0.16): icmp_seq=3 ttl=64 time=0.180 ms
64 bytes from niamondo-mqtt-broker-1.nuc1_default (172.19.0.16): icmp_seq=4 ttl=64 time=0.194 ms
64 bytes from niamondo-mqtt-broker-1.nuc1_default (172.19.0.16): icmp_seq=5 ttl=64 time=0.120 ms
64 bytes from niamondo-mqtt-broker-1.nuc1_default (172.19.0.16): icmp_seq=6 ttl=64 time=0.133 ms
64 bytes from niamondo-mqtt-broker-1.nuc1_default (172.19.0.16): icmp_seq=7 ttl=64 time=0.121 ms
64 bytes from niamondo-mqtt-broker-1.nuc1_default (172.19.0.16): icmp_seq=8 ttl=64 time=0.120 ms
^C
--- mqtt-broker ping statistics ---
8 packets transmitted, 8 received, 0% packet loss, time 7080ms
rtt min/avg/max/mdev = 0.120/0.149/0.211/0.036 ms
bash-5.1$ ping mqtt
PING mqtt (172.19.0.3) 56(84) bytes of data.
64 bytes from nuc1-mqtt-1.nuc1_default (172.19.0.3): icmp_seq=1 ttl=64 time=0.316 ms
64 bytes from nuc1-mqtt-1.nuc1_default (172.19.0.3): icmp_seq=2 ttl=64 time=0.196 ms
64 bytes from nuc1-mqtt-1.nuc1_default (172.19.0.3): icmp_seq=3 ttl=64 time=0.125 ms
64 bytes from nuc1-mqtt-1.nuc1_default (172.19.0.3): icmp_seq=4 ttl=64 time=0.196 ms
64 bytes from nuc1-mqtt-1.nuc1_default (172.19.0.3): icmp_seq=5 ttl=64 time=0.121 ms
64 bytes from nuc1-mqtt-1.nuc1_default (172.19.0.3): icmp_seq=6 ttl=64 time=0.181 ms
64 bytes from nuc1-mqtt-1.nuc1_default (172.19.0.3): icmp_seq=7 ttl=64 time=0.133 ms
64 bytes from nuc1-mqtt-1.nuc1_default (172.19.0.3): icmp_seq=8 ttl=64 time=0.178 ms
^C
--- mqtt ping statistics ---
8 packets transmitted, 8 received, 0% packet loss, time 7059ms
rtt min/avg/max/mdev = 0.121/0.180/0.316/0.058 ms
bash-5.1$ 
1 Like

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.