I am using the flow below for guaranteed delivery with FileStore backing. Over time it gets stuck in the state "waitingForOKFail" and never recovers. I could not pin down a reason for this; there is no MQTT server failure.
Has anyone faced this issue before, and is there a likely way out of it? Could it be caused by a file lock issue or something similar?
It should not be able to get stuck. What is the mqtt complete node?
You could install the flogger node and log the messages and see how it gets stuck.
In the meantime you can manually release it by injecting a Fail message with an inject node.
Thanks Colin. It did release all the messages when I injected a timestamp, but I have no clue how that happened.
Is it because it was waiting for either OK or Fail, and in this case the Fail (control=fail) got it moving again? That is what it looked like to me.
I used the flogger to back up the messages. I will try adding another logger purely for debug to find out why it is getting stuck.
Thanks a bunch.
I wanted the data backup in one log file and all the debug output in another, so that the two can be handled separately and easily with different policies.
Yes. I had looked at it a while back but forgot some of the details. I looked at the subflow template (state engine) again yesterday to understand how triggering a fail message would process the pending ones.
Triggering a fail tells it that the previous one failed, so it tries it again, after the timeout you specified.
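As a rough illustration of that retry behaviour, here is a minimal Python sketch. It is a simplified model and not the subflow's actual code: the class DeliveryQueue and its method names are made up for illustration, the "ok" control value is assumed by analogy with control=fail, and the retry delay is left out so the retry happens immediately. Only the state name waitingForOKFail and the fail control come from this thread.

```python
from collections import deque


class DeliveryQueue:
    """Toy model of a send-one-then-wait queue (hypothetical, not the subflow's code)."""

    def __init__(self):
        self.pending = deque()  # messages not yet confirmed as delivered
        self.state = "idle"     # either "idle" or "waitingForOKFail"
        self.sent = []          # record of every send attempt, in order

    def _send_head(self):
        # Send the oldest unconfirmed message and wait for feedback on it.
        self.sent.append(self.pending[0])
        self.state = "waitingForOKFail"

    def on_message(self, payload):
        # New messages are always queued; only send if nothing is in flight.
        self.pending.append(payload)
        if self.state == "idle":
            self._send_head()

    def on_control(self, control):
        # Feedback only matters while a message is in flight.
        if self.state != "waitingForOKFail":
            return
        if control == "ok":
            self.pending.popleft()  # confirmed, so drop it from the queue
        elif control != "fail":
            return                  # unknown control value: ignore it
        # After ok: send the next message. After fail: send the same one again.
        if self.pending:
            self._send_head()
        else:
            self.state = "idle"


q = DeliveryQueue()
for payload in ("a", "b", "c"):
    q.on_message(payload)
# Only "a" has been sent; with no ok or fail the queue just sits here.
q.on_control("fail")  # retries "a"
q.on_control("ok")    # confirms "a", sends "b"
q.on_control("ok")    # confirms "b", sends "c"
q.on_control("ok")    # confirms "c", queue returns to idle
```

In this model, if neither ok nor fail ever comes back for the message in flight, the queue stays in waitingForOKFail indefinitely, which would match the stuck state described above. A manually injected fail re-sends the head of the queue, and delivery resumes as long as the ok feedback then arrives.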
I am not sure what you are saying about flogger. From my point of view its purpose is to record what is happening so that you can look back and see why it got stuck. Can you export the flow and post it here please, including the nodes around the subflow? I want to make sure it is correct.