Client Disconnect but status still connected

Show us where you are adding that please.

Which version of node-red are you using? I notice that you have specified V5 of MQTT, the code for that is fairly new. If you are not using node-red 3.0.2 then perhaps you have not got the latest mqtt code.
If you are not actually using the facilities of MQTT V5 then drop back to V3 and see if it helps (but upgrade node red first if you are not on 3.0.2)

Have you put the keepalive back to 60?

Node-red version is the v3.0.2, node JS v16.19.0, Mosquitto v2.0.12. So there shouldn't be a problem with MQTT 5

it's a property within the payload. Anyway, I'm setting the debug node to the console as you suggested, so I can even compare the pi that works with the one tant not, and show their logs
And yes, I put back the keepalive to 60

This looks suspect to me. MQTT disconnecting after a "job" involving file writes. I suspect you are starving the event queue and the issues you are seeing are a side effect.

  • How big a process is the HistSave? Does it take 2+ minutes to complete? Is the file writing synchronous?
  • Are you looping data / calling fs directly?
  • Can you show us this "history save" flow?
  • Can you say where these CSV files are being written to (on device? across network?)

Also, can you add something to your flows that mark the START and END of this HistSave (even something like a debug message sent to console so that we can see a timestamp and a message like "HistSave Starting" and "HistSave Finished"

I assumed that. However, you said that it is the time the message was sent to the broker. That is obviously not strictly correct. You cannot set the timestamp at the instant it is sent, it must be before it is sent. That is why I would like to see where you are setting that in relation to the MQTT node.

There is still the fact that it is new code, so it may be worth going back to 3 as an experiment if nothing else. @Steve-Mcl may well be on a more fruitful track however.
When you run top is the cpu loading always low?

Yes correct. I wanted to say that I have the timestamp of when tha data is full processed and ready to be sent to the mqtt out node. It's a simple timestamp as property within the payload, So, looking at pi node-red debug (when it leave the last node before the MQTT OUT), looking at the Debian PC node-red debug (when it is received) and knowing that it is an on the fly message (not the file), looking at these timestamp totheter with the debugnode ones, it is easy to understand that it is an old message,, Even because each 10 minutes, if the CLIENT is offline, the data are sent as file. So or the data are sent on the fly, or it is a file, That's why it is strange that I receive data on the fly older than 10 minutes, and this is why i'm wondering about the queue messages. Since I can detect thete there are times when the client is connected for node-red but it is not for the broker, in this situation what happens? where are the messages that reach the MQTT? Are fired and forgot anyway? Or it can happen that they are queued in some way and sent when the client is back really connected to the broker?
this is now an example of the debug that I'm sendng to the consol, so that it will be easy to compare with the ones received on the debain PC

21 Apr 00:23:39 - [info] [debug:MQTT OUT]
{
  topic: 'Area1',
  payload: {
    Data_Timestamp: '2023-04-21 00:23:39',
    data: '1682029419000......'
  },
  _msgid: '58c58395e116b747'
}

I'll try this, too

Till now node-red hasn't stopped again, but I see that it starts rising when the the "MQTT queue" increase

I don't think so, maybe it's just a coincidence since there isn't much more within the log. Anyway let's check it

20 Apr 17:50:00 - [debug] [cronplus:5 HistSave] running 'schedule1' ~ 'topic1'
now time Thu Apr 20 2023 17:50:00 GMT+0200 (Ora legale dell’Europa centrale)
crontime Thu Apr 20 2023 17:50:00 GMT+0200 (Ora legale dell’Europa centrale)
20 Apr 18:00:20 - [debug] [cronplus:2 Old_in_DB] running 'schedule1' ~ 'topic1'
now time Thu Apr 20 2023 18:00:20 GMT+0200 (Ora legale dell’Europa centrale)
crontime Thu Apr 20 2023 18:00:20 GMT+0200 (Ora legale dell’Europa centrale)
20 Apr 18:04:29 - [trace] utils.writeFile - written content to /home/pi/.node-red/.sessions.json.$$$
20 Apr 18:04:29 - [trace] utils.writeFile - renamed /home/pi/.node-red/.sessions.json.$$$ to /home/pi/.node-red/.sessions.json
20 Apr 18:04:29 - [trace] comms.close McFxwjCOPdr1de1PsnlQQXA0GQ3g90T107WaIswXIzw=
20 Apr 18:10:00 - [debug] [cronplus:5 HistSave] running 'schedule1' ~ 'topic1'
now time Thu Apr 20 2023 18:10:00 GMT+0200 (Ora legale dell’Europa centrale)
crontime Thu Apr 20 2023 18:10:00 GMT+0200 (Ora legale dell’Europa centrale)
20 Apr 18:20:00 - [debug] [cronplus:5 HistSave] running 'schedule1' ~ 'topic1'
now time Thu Apr 20 2023 18:20:00 GMT+0200 (Ora legale dell’Europa centrale)
crontime Thu Apr 20 2023 18:20:00 GMT+0200 (Ora legale dell’Europa centrale)
20 Apr 18:30:00 - [debug] [cronplus:5 HistSave] running 'schedule1' ~ 'topic1'
now time Thu Apr 20 2023 18:30:00 GMT+0200 (Ora legale dell’Europa centrale)
crontime Thu Apr 20 2023 18:30:00 GMT+0200 (Ora legale dell’Europa centrale)
20 Apr 18:30:20 - [debug] [cronplus:2 Old_in_DB] running 'schedule1' ~ 'topic1'
now time Thu Apr 20 2023 18:30:20 GMT+0200 (Ora legale dell’Europa centrale)
crontime Thu Apr 20 2023 18:30:20 GMT+0200 (Ora legale dell’Europa centrale)
20 Apr 18:40:00 - [debug] [cronplus:5 HistSave] running 'schedule1' ~ 'topic1'
now time Thu Apr 20 2023 18:40:00 GMT+0200 (Ora legale dell’Europa centrale)
crontime Thu Apr 20 2023 18:40:00 GMT+0200 (Ora legale dell’Europa centrale)
20 Apr 18:41:34 - [trace] utils.writeFile - written content to /home/pi/.node-red/.sessions.json.$$$
20 Apr 18:41:34 - [trace] utils.writeFile - renamed /home/pi/.node-red/.sessions.json.$$$ to /home/pi/.node-red/.sessions.json
20 Apr 18:41:34 - [trace] comms.open nsMIs4p7+n3Q80M/y+SgRnN0nst98IsublVFKc2HM64=
20 Apr 18:44:32 - [trace] utils.writeFile - copied /home/pi/.node-red/flows.json TO /home/pi/.node-red/.flows.json.backup
20 Apr 18:44:32 - [trace] utils.writeFile - written content to /home/pi/.node-red/flows.json.$$$
20 Apr 18:44:32 - [trace] utils.writeFile - renamed /home/pi/.node-red/flows.json.$$$ to /home/pi/.node-red/flows.json
20 Apr 18:50:00 - [debug] [cronplus:5 HistSave] running 'schedule1' ~ 'topic1'
now time Thu Apr 20 2023 18:50:00 GMT+0200 (Ora legale dell’Europa centrale)
crontime Thu Apr 20 2023 18:50:00 GMT+0200 (Ora legale dell’Europa centrale)
20 Apr 18:52:48 - [info] [mqtt-broker:Client] Disconnected from broker: Local_1@mqtts://www.myexample.com:12345

It's just a node that takes data from context and writes them on a file using read, write, and csv nodes. The full process is synchronous; fs is used to check if the file already exist and delete it. It's not used for.writing. Moreover, there is a gate at the beginning of the process so that only one data per time can be processed while other incoming data are eventually dropped till the full proces is ended and message from yhe last node sent to the MQTT OUT, or stored.
Basically, if there is no MQTT connection data is directly stored on a file. In the picture you can see the line managing the gate, the one that operate on the storing file, and the last one that sends messages on the fly.
The file is written directly on the pi. The proces itself is very simple and quick, as you can see. I stopped the MQTT connection fpr more then 10 minutes so it is the real situation. The problem is that I cannot make this test on the pi that create problem since it's not easy to reach and working remotly via GUI is very slow

image

[{"id":"fc0e244e5b25b0a2","type":"cronplus","z":"e089622.6a906a","name":"HistSave","outputField":"payload","timeZone":"","persistDynamic":false,"commandResponseMsgOutput":"output1","outputs":1,"options":[{"name":"schedule1","topic":"topic1","payloadType":"date","payload":"","expressionType":"cron","expression":"20 0/30 * * * * *","location":"","offset":"0","solarType":"all","solarEvents":"sunrise,sunset"}],"x":900,"y":2680,"wires":[["a9871644bf5490c3","86c0a19ca86d0118"]]},{"id":"a9871644bf5490c3","type":"change","z":"e089622.6a906a","name":"HistSave","rules":[{"t":"set","p":"HistSave","pt":"flow","to":"true","tot":"bool"}],"action":"","property":"","from":"","to":"","reg":false,"x":1080,"y":2680,"wires":[[]]},{"id":"86c0a19ca86d0118","type":"debug","z":"e089622.6a906a","name":"SET HISTFILE","active":false,"tosidebar":true,"console":false,"tostatus":false,"complete":"payload","targetType":"msg","statusVal":"","statusType":"auto","x":1100,"y":2640,"wires":[]},{"id":"a726167b8aff9dbd","type":"debug","z":"e089622.6a906a","name":"START histfile","active":true,"tosidebar":true,"console":false,"tostatus":false,"complete":"payload","targetType":"msg","statusVal":"","statusType":"auto","x":340,"y":2680,"wires":[]},{"id":"19990da291780865","type":"debug","z":"e089622.6a906a","name":"END histfile","active":false,"tosidebar":true,"console":false,"tostatus":false,"complete":"payload","targetType":"msg","statusVal":"","statusType":"auto","x":1030,"y":2760,"wires":[]},{"id":"797520538eb79d75","type":"function","z":"e089622.6a906a","name":"context  / file","func":"var data = flow.get(\"LastData\");\nvar msg1={\"from_node\": node.name, \"payload\":true};\nvar msg2={\"filename\":global.get(\"FilePath\")};\nvar msg3={};\ndata.splice(0, FunctionSent(data));\nif (flow.get(\"FromFile\") == false ){\n    if(fs.existsSync(global.get(\"FilePath\"))){fs.unlinkSync(global.get(\"FilePath\"))}\n    msg3={\"payload\":data};\n    return [null, null, msg3];\n} else if (flow.get(\"HistSave\") == true){\n    if(fs.existsSync(global.get(\"FilePath\"))){\n        return [null, msg2, null];\n    } else {\n        return [msg1, null, null];\n    }\n}\n","outputs":3,"noerr":0,"initialize":"FunctionSent = function(my_data) {\n    let ToDelete = false, i = 0;\n    sentData = RED.util.cloneMessage(flow.get(\"sentData\"));\n    for (i = 0; i < my_data.length; i++) {\n        if (my_data[i].col1 <= sentData){\n            ToDelete = true;\n            continue;\n        } else {\n            break\n        }\n    }\n    return ToDelete == true ? i : 0;\n}\n","finalize":"","libs":[{"var":"fs","module":"fs"}],"x":360,"y":2840,"wires":[["9d53bd7c67c1a81d"],["220e4790c8ae3ad3"],["af3ac78ce440ce62"]]},{"id":"07f9a6d11f7c2393","type":"csv","z":"e089622.6a906a","name":"CSV","sep":";","hdrin":"","hdrout":"none","multi":"mult","ret":"\\r\\n","temp":"col1, col2","skip":"0","strings":true,"include_empty_strings":false,"include_null_values":false,"x":810,"y":2760,"wires":[["99c7f4fce49ed5fb"]]},{"id":"99c7f4fce49ed5fb","type":"file","z":"e089622.6a906a","name":"Update file","filename":"filename","filenameType":"msg","appendNewline":false,"createDir":true,"overwriteFile":"true","encoding":"none","x":850,"y":2820,"wires":[["84e0f79be18559cf"]]},{"id":"84e0f79be18559cf","type":"function","z":"e089622.6a906a","name":"Manage gate","func":"msg={\"from_node\": node.name, \"payload\":true}; //Trigger per la gestione led gate PLC\nreturn [msg]","outputs":1,"noerr":0,"initialize":"","finalize":"","libs":[],"x":1030,"y":2820,"wires":[["19990da291780865","8983ed4eb0e17c8a"]]},{"id":"9d53bd7c67c1a81d","type":"link out","z":"e089622.6a906a","name":"Last A","mode":"link","links":["c931b510e9e66094","fe95ef84946ef375","f8def99beda3db41"],"x":545,"y":2700,"wires":[]},{"id":"fad83934235d97a5","type":"comment","z":"e089622.6a906a","name":"manage gate","info":"","x":650,"y":2700,"wires":[]},{"id":"2f5b99612425cbc6","type":"function","z":"e089622.6a906a","name":"Send MQTT","func":"const msg1={\"payloda\":true};\nvar msg2a = null;\nvar msg2b = null;\nif (msg.payload.length !=0){\n    var my_value = \"\";\n    for (var i = 0; i < msg.payload.length; i++) {\n        my_value = my_value + msg.payload[i].col2 + \",\";\n    }\n    msg2a={\n        \"topic\": env.get(\"ep_ExTpData\"),\n        \"retain\": false,\n        \"qos\": 0,\n        \"payload\":{\n            \"Data_Ora\":\"Last_in_DB\",\n            \"value\": my_value.slice(0, -1),\n            \"Date\":{\n                \"From\": msg.payload[0].col1,\n                \"To\": msg.payload[msg.payload.length-1].col1\n            }\n        }\n    }\n    if (flow.get(\"AlmOnState\")[2] == true){\n        msg2b={\n        \"topic\": env.get(\"ep_ExTpData\"),\n        \"retain\": false,\n        \"qos\": 0,\n            \"payload\":{\n                \"Data_Ora\":\"Alm_on_state\",\n                \"value\": flow.get(\"AlmOnState\")[1],\n                \"Date\":flow.get(\"AlmOnState\")[0]\n            }\n        }\n    }\n    msg2 = [msg2a,msg2b];\n}\nreturn [msg1,[msg2a,msg2b]];\n","outputs":2,"noerr":0,"initialize":"","finalize":"","libs":[],"x":850,"y":2920,"wires":[["84e0f79be18559cf"],["afa2cb8ce5dd2421"]]},{"id":"afa2cb8ce5dd2421","type":"link out","z":"e089622.6a906a","name":"Last ToMQTT","mode":"link","links":["22fca936969fa947"],"x":1075,"y":2920,"wires":[]},{"id":"62098cb1521ea8aa","type":"comment","z":"e089622.6a906a","name":"to MQTT","info":"","x":1110,"y":2880,"wires":[]},{"id":"7664078960494408","type":"q-gate","z":"e089622.6a906a","name":"Gate","controlTopic":"control","defaultState":"open","openCmd":"open","closeCmd":"close","toggleCmd":"toggle","queueCmd":"queue","defaultCmd":"default","triggerCmd":"trigger","flushCmd":"flush","resetCmd":"reset","peekCmd":"peek","dropCmd":"drop","statusCmd":"status","maxQueueLength":"360000","keepNewest":true,"qToggle":false,"persist":false,"x":330,"y":2760,"wires":[["797520538eb79d75","a726167b8aff9dbd"]]},{"id":"3d9b435c8ce20c87","type":"link in","z":"e089622.6a906a","name":"PLANTdata","links":["59362d098318adeb","66b53d0e5cdd96bc","e61cc66de2163dc0","5f4c44ec635e7587","fbdfd203677976ac"],"x":155,"y":2760,"wires":[["7664078960494408"]]},{"id":"05c13f85ac3e389b","type":"comment","z":"e089622.6a906a","name":"message on the fly","info":"","x":590,"y":2940,"wires":[]},{"id":"032682ee48e06165","type":"comment","z":"e089622.6a906a","name":"storing data","info":"","x":610,"y":2880,"wires":[]},{"id":"10e3f50ed1de6638","type":"function","z":"e089622.6a906a","name":"Manage data","func":"var data = msg.payload;\nconst to_delete = FunctionSent(data);\nif (to_delete != 0){\n    data.splice(0, to_delete);\n    const msg1={\n        \"payload\":data,\n        \"filename\":global.get(\"FilePath\")\n    };\n    return [msg1,null];\n} else {\n    msg2={\"payload\":data};\n    return [null,msg2];\n}\n","outputs":2,"noerr":0,"initialize":"FunctionSent = function(my_data) {\n// verifica in flow.get(\"LastDataPLC\") la presenza dell'ultimo dato ricevuto dal server\n    let ToDelete = false, i = 0;\n    for (i = 0; i < my_data.length; i++) {\n        if (my_data[i].col1 <= flow.get(\"sentData\")){\n            ToDelete = true;\n            continue;\n        } else {\n            break\n        }\n    }\n    return ToDelete == true ? i : 0;\n}\n","finalize":"","libs":[],"x":630,"y":2840,"wires":[["07f9a6d11f7c2393"],["2f5b99612425cbc6"]]},{"id":"77dc0f51f928d6ac","type":"csv","z":"e089622.6a906a","name":"CSV","sep":";","hdrin":"","hdrout":"none","multi":"mult","ret":"\\r\\n","temp":"","skip":"0","strings":true,"include_empty_strings":"","include_null_values":"","x":610,"y":2800,"wires":[["10e3f50ed1de6638"]]},{"id":"220e4790c8ae3ad3","type":"file in","z":"e089622.6a906a","name":"Read","filename":"filename","filenameType":"msg","format":"utf8","chunk":false,"sendError":false,"encoding":"none","allProps":false,"x":610,"y":2760,"wires":[["77dc0f51f928d6ac"]]},{"id":"af3ac78ce440ce62","type":"junction","z":"e089622.6a906a","x":520,"y":2920,"wires":[["2f5b99612425cbc6"]]},{"id":"8983ed4eb0e17c8a","type":"junction","z":"e089622.6a906a","x":1180,"y":2740,"wires":[["9d53bd7c67c1a81d"]]}]

What sort of network connection is it?

What is the network path between the pi and the machine running the broker?
If you connect to the pi are you able to connect (SSH) from there to the machine running the broker? If so, does it give you a terminal with a good keyboard response? Echoing characters instantly and running commands ok? I am wondering whether you have an extremely slow connection so messages are queuing up on the input to the mqtt node.

Hi, I have made more testing, even via SSH and yes, as suspected the connection is really bad and this is probably the cause. I tried using MQTT v 3.1 but haven't helped. The slow connection seems to create this huge queue into the mqtt out node. So, from the debug and the log I see the message as correctly sent, but actually it isn't, and this is why only after a while I can see the message passing through the broker. What I don't know is if this behaviour can even cause the fatal error that crash node-red. This is for the main problem.
But still there is something I don't understand, and this happens on several raspberry even with good internet connection. As shown in the previous messages, sometimes the broker sees the client disconnected and I receive the will message. But for node-red it is still connected. Only after a while (it can even reach 40-50 minutes) node-red see the client as disconnected from the broker, so it tries and gets the connection back. Now the keepalive is set to 60s.
Has someone else experienced this behaviour? I even tried with a simple flow that does nothing but send a random number each minutes. This always happens

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.