I've been using Node-RED to monitor data from a few dozen machines in an industrial plant. Some of these machines are loaders that take material from a carrier rack and put it into a production line, or unloaders that take material from the line and put it into a carrier rack. We also have 3 smart storage units that store both full and empty carrier racks.
We have some AGVs (mobile robots) that move full and empty carrier racks between the lines and the smart storage. The AGVs are controlled by their own Fleet Manager, which can receive instructions through a REST API. The control program supplied by the AGV integrator is unstable, and the programmer is leaving the company at the end of this month, so we're not sure we'll get any more support after that. Since the data-collection application was successful, I started building a parallel AGV control system, also using Node-RED.
I started by using the node-red-contrib-finite-statemachine node to model the states of a single AGV throughout a job. There are basically three types of jobs: go home, go charge, or a get/put job.
- Go home just involves moving to the home position and waiting there in an idle state.
- Go charge involves going to the charger, plugging in, waiting for the charge to finish, and detaching from the charger.
- A get/put job involves navigating to a machine port, docking, transferring a carrier rack from machine to robot or vice versa, and undocking.
Therefore, taking a rack from a machine to the storage is a get job plus a put job. And since the empty port needs to be refilled, a full cycle is actually 4 jobs: GET a full rack, PUT the full rack into storage, GET an empty rack, and PUT it into the line so it can be filled up next.
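To make the cycle concrete, here is a minimal sketch of how one "swap rack" request could expand into those four partial jobs. The function name, field names, and port/storage IDs are all made up for illustration; they are not from my real system:

```javascript
// Expand one rack-swap request into the 4 partial jobs of a full cycle.
// All names here (buildCycleJobs, "GET"/"PUT", port IDs) are hypothetical.
function buildCycleJobs(machinePort, storageUnit) {
    const ts = Date.now(); // timestamp used as the base of the job IDs
    return [
        { id: `${ts}-1`, type: "GET", target: machinePort, rack: "full" },
        { id: `${ts}-2`, type: "PUT", target: storageUnit, rack: "full" },
        { id: `${ts}-3`, type: "GET", target: storageUnit, rack: "empty" },
        { id: `${ts}-4`, type: "PUT", target: machinePort, rack: "empty" },
    ];
}
```

Each element would become one message in the robot's queue, so the dispatcher only needs to reason about whole cycles.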
This, of course, is from the perspective of a single robot. The layer on top must be a dispatcher system that monitors the machine statuses and creates the jobs as needed.
And here comes my dilemma: I can create the jobs as messages, put them in a queue for each robot, and once a robot has finished one of the partial jobs, dequeue it and move on to the next one. This is a handy option because, if the robot has an error or malfunction of some sort (the battery dies or the WiFi drops), I know exactly which job it was doing, so once the robot is back online I can resend that job and pick up from there.
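The dequeue step could look something like this in a function node. This is just a sketch of the idea, assuming each job carries the kind of `id` field described above; guarding on the head of the queue means a duplicate or stale "finished" message can't accidentally eat the next job:

```javascript
// Per-robot dequeue when a "job finished" message arrives.
// queue is the robot's in-memory array of pending jobs (head = current job).
function onJobFinished(queue, finishedId) {
    // Only drop the head job if its id matches the completion message,
    // so stale/duplicate completions are ignored.
    if (queue.length > 0 && queue[0].id === finishedId) {
        queue.shift();
    }
    return queue.length > 0 ? queue[0] : null; // next job to send, or null if idle
}
```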
However, let's say the Node-RED server crashes, or needs rebooting because of updates being installed, network maintenance, or a flow redeployment. If I haven't backed up the queues, I will lose all the job information and chaos will ensue. The current application uses an MSSQL database to track the jobs, but it misses some of the parameters and the job tracking is lousy in general.
I am already using a MySQL DB to log the commands sent to the robots, so I could use the same DB to back up every message sent to a queue and delete the entry when the job is finished.
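One simple way to do that is a one-row-per-robot upsert, replacing the whole queue on every change. Here is a sketch of the two messages a function node could emit, assuming a table like `agv_queue(robot_id VARCHAR PRIMARY KEY, queue JSON, updated_at TIMESTAMP)` (the table and column names are my invention) and a MySQL node that takes the query in `msg.topic` and the bound values in `msg.payload`:

```javascript
// Build the msg that persists a robot's whole queue as one JSON row.
// Table/column names are illustrative, not from any real schema.
function queueToUpsert(robotId, queue) {
    return {
        topic: "INSERT INTO agv_queue (robot_id, queue, updated_at) " +
               "VALUES (?, ?, NOW()) " +
               "ON DUPLICATE KEY UPDATE queue = VALUES(queue), updated_at = NOW()",
        payload: [robotId, JSON.stringify(queue)],
    };
}

// Build the msg that removes the row once the robot's queue is empty.
function queueToDelete(robotId) {
    return {
        topic: "DELETE FROM agv_queue WHERE robot_id = ?",
        payload: [robotId],
    };
}
```

With a primary key on `robot_id`, the upsert makes "overwrite the whole entry" a single atomic statement, so there is never more than one row per robot to restore.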
So far, the idea I had is as follows:
- Store each job queue directly in JSON format, with a timestamp and a unique ID generated at creation time.
- Whenever a queue is updated, overwrite the whole entry with the updated queue.
- If something happens, on power-up the Node-RED flow will start by checking whether there are any stored job queues for the AGVs, load them into memory, reset all the FSMs to their initial state, let each robot reload its queue, and repeat the last assignment if it was unfinished.
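The restore step at startup could be as small as this. It assumes the SELECT returns rows shaped like `{ robot_id, queue }` with the queue column holding the JSON string (those field names are my assumption, matching the hypothetical table above):

```javascript
// Rebuild the in-memory per-robot queues from the rows returned by the
// startup SELECT. Row/field names are assumptions for illustration.
function restoreQueues(rows) {
    const queues = {};
    for (const row of rows) {
        const jobs = typeof row.queue === "string"
            ? JSON.parse(row.queue)   // JSON column returned as a string
            : row.queue;              // or already parsed by the driver
        if (jobs.length > 0) {
            queues[row.robot_id] = jobs;
        }
    }
    // Caller would store this in flow/global context, reset the FSMs,
    // and re-dispatch queues[robotId][0] as the unfinished assignment.
    return queues;
}
```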
I think this approach would work, but it might be too convoluted. Can anyone think of a more straightforward method?