I have a flow that 'watches' a device on a given (TCP) IP and port. If the device is turned off (or turned on later), my flow knows via a successful or failed ping. All good. However, the TCP IN and TCP OUT nodes never change state... once connected, they always show connected. I have to stop and start NR, or do a deploy (which resets the current flow), to see the TCP nodes change state. This seems odd given that the MQTT Publish and Subscribe nodes dynamically change state if the broker disappears.
So is this a bug or design constraint? Is there a way I can get the TCP IN and TCP OUT nodes to change state more like the MQTT nodes?
If the device comes back online... the TCP IN and OUT nodes do not attempt to reconnect. I can send messages and nothing happens, because the TCP OUT node (as far as I can tell) is not really connected, or is never sending data successfully. I never see the expected response at the TCP IN node. Very much a "Lights are on, but no one is home" scenario?
Is there a way I can via flow logic get the nodes to update connection state? Reconnect?
Update... after the device had been powered off for a few minutes, I checked the TCP IN and TCP OUT nodes in the flow... and the TCP IN node finally showed (I think) 'attempting' and then, a few seconds later, 'disconnected'. The TCP OUT node is still showing connected... 15 minutes later.
But I do not see an obvious way to get TCP OUT to reconnect and get data moving via TCP OUT again. The TCP IN node does seem to reconnect once the device is found again, but since the TCP OUT node is, ah, stuck, no joy?
Forgot to mention... Closing the TCP OUT connection after each message is a possible workaround, but... it feels like a hammer to solve the issue. And..
There is a delay of a second or more if I use the close-connection-after-each-message feature, and the displayed state still never changes from 'connected', which definitely seems like a bug in and of itself.
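To make the "close after each message" workaround concrete at the socket level, here is a minimal Python sketch. It is not how the TCP OUT node is implemented; the function name `send_once` and the timeout value are hypothetical, chosen only for illustration. Each call opens a fresh connection, sends one payload and closes again, so a device that is down shows up as a failed connect instead of a stale 'connected' state.

```python
import socket


def send_once(host: str, port: int, payload: bytes, timeout: float = 2.0) -> bool:
    """Open a new TCP connection, send one payload, then close.

    Returns True if the connect and send succeeded, False on any
    socket error (refused, unreachable, timed out, ...).
    """
    try:
        # create_connection performs the connect with the given timeout;
        # the with-block closes the socket when it exits.
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(payload)
        return True
    except OSError:
        # The device is off, unreachable, or not accepting connections.
        return False
```

The trade-off is that every message pays for a new connection setup, which may account for part of the extra delay seen with this option.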
I tried to simulate a timeout scenario by sending msg.reset = true to the TCP OUT node once my ping had failed and was later successful again. The node never changed state that I could see, and communication failed to resume as expected, the way it does when an NR restart or a deploy restarts the flow.
It really does seem the solution is to have the TCP OUT connection time out, and then have an option for the TCP OUT node to attempt a reconnect if it was previously in a timed-out state. The TCP IN node seems to do something comparable now.
The TCP nodes can only respond to events they see on the wire. Once connected, if you send nothing then you will get nothing in reply - so if the far end goes away without closing the connection cleanly (i.e. without sending a FIN or RST packet) then we have no way of knowing anything has happened, so the node still thinks we are connected. If the client does not then try to send anything, it will be down to the underlying OS TCP timeout, which can be several hours, if set at all. If the client does meanwhile try to send something, then it may see the retry timeout and thus detect that it can't send anything... in which case it can know.
Just checking the code - I see we have implemented a 2 minute keep-alive timer, so in fact it should be checking for us. Indeed, if I set up a simple test to another machine and just unplug it to break the connection, I do see a TIMEOUT error in the console log, and then it goes into trying to reconnect and sets the status appropriately. Can you show us what your console log looks like while you are seeing this behaviour, and maybe share a minimal flow so we can check settings to replicate? And which versions of Node-RED and nodejs are you using?
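For readers who want to see what a keep-alive looks like at the OS socket level, outside Node-RED, here is a small Python sketch. It is not the TCP node's code. The helper name `enable_keepalive` and the probe interval and count are assumptions for illustration; only the 120 second idle time mirrors the 2 minute figure mentioned above. The `TCP_KEEP*` options are platform-specific, so each one is guarded.

```python
import socket


def enable_keepalive(sock: socket.socket, idle_s: int = 120,
                     interval_s: int = 10, probes: int = 3) -> socket.socket:
    """Ask the OS to probe an idle connection so a silent peer is detected."""
    # Turn keep-alive probing on for this socket.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Fine-grained tuning is not available on every platform.
    if hasattr(socket, "TCP_KEEPIDLE"):
        # Seconds of idle time before the first probe is sent.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    if hasattr(socket, "TCP_KEEPINTVL"):
        # Seconds between unanswered probes.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    if hasattr(socket, "TCP_KEEPCNT"):
        # Unanswered probes allowed before the connection is dropped.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
    return sock
```

With these settings a peer that vanished without a FIN or RST would be noticed after roughly the idle time plus interval times probes, instead of waiting on the much longer OS defaults.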
What is the code logic... that runs when NR is restarted? For the sake of argument, the connection is recreated, right? Regardless of the current state? msg.reset would seem to be a way to force the TCP OUT node to reset/reconnect, but that does not seem to work: when I send msg.reset the TCP OUT node does nothing, i.e. the status just stays 'connected' rather than going to disconnected and then connected, for example.
That is where I see the bug or such. Or am I misunderstanding how msg.reset = true is supposed to work? I can set up flow logic to ping the device, and if the ping fails for a given time frame and is later successful, sending msg.reset = true to the TCP OUT node would, I think, do exactly what I need... TCP OUT disconnects, then establishes a new connection, regardless of the current state, no? That would seem to be consistent with what you outlined above.
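The "ping fails for a while, then succeeds again" logic described above can be sketched as a tiny state machine. This is a Python illustration only, not Node-RED code; the class name `ReconnectTrigger` and the `min_failures` threshold are hypothetical. It reports True exactly once, on the first good ping after a long enough outage. What the flow then does with that signal (send msg.reset, redeploy, something else) is left open here.

```python
class ReconnectTrigger:
    """Signal once when a device comes back after a sustained outage."""

    def __init__(self, min_failures: int = 3) -> None:
        # Consecutive failed pings needed before the device counts as down.
        self.min_failures = min_failures
        self.failures = 0

    def update(self, ping_ok: bool) -> bool:
        """Feed one ping result; return True if a reconnect should be forced."""
        if not ping_ok:
            self.failures += 1
            return False
        # Ping succeeded: fire only if the preceding outage was long enough.
        was_down = self.failures >= self.min_failures
        self.failures = 0
        return was_down
```

For example, feeding it three failed pings followed by a good one returns True on the good ping, while a single dropped ping followed by a good one returns False.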
And yes, I will check the logs, and see if I can create a flow illustrating the scenario. Could you take a look at how msg.reset is implemented or works? Specific to TCP OUT node? I should have some time this coming weekend to do this.
The TCP out node only uses msg.reset if it is in Reply-To mode. As you haven't provided any flow or config I can't tell how you are using it, and your description at the start implied that you were connecting out to a device rather than listening and responding. So yes - please provide an example flow.
As explained, a) we should already time out after 2 minutes, so I'd like to understand that more, and b) as we do (or should) already detect the disconnect, there was no reason to need msg.reset for the simple out node.
That's not to say things can't be improved - but I need a few more clues first.
If you are just using a ping test, have you thought about using a different node?
I have used this one in the past with success:
Quick question... is the 2 minute timeout applicable to the TCP OUT node as well as the TCP RESPONSE node?
I am using TCP OUT and TCP IN explicitly because the device I am communicating with does not provide any communication on power up. I really wish it did.
It does provide a response as commands are sent, so TCP RESPONSE would be applicable in all use cases except device power up. It is only when the device is powered down and later powered up again that TCP OUT fails to send data.
When I am using TCP OUT I have it in Connect-To and not in Reply-To mode. Humm.