Moving file only when finished

I have a network folder linked to an FTP server. Files are dropped into it by another service, and they can get pretty big. I have Node-RED watching this folder: every 5 seconds it checks the folder and moves everything in it to another folder.

During this move, files sometimes seem to get corrupted. My feeling is that the move starts before the file has been fully written to the source folder.

Is there a way to avoid this? Or at least to check that this isn't the case?

Thanks!

Welcome to the forum @Broewn4PS

One way would be to check whether the file is open. How to do that depends on which OS you are using.

When you say you move the file, do you mean you delete the original?

Thanks!

It's running on Windows, and yes, once the move has happened the file needs to be removed from the FTP folder, so I'm moving it instead of copying.

I don't know how to do it on Windows, sorry. You need a way of determining that the transfer to the original folder is complete before you move the file. That isn't really a Node-RED problem. If you knew the maximum time that it might take to write the file, you could just delay the copy/delete operation by that amount of time, so you would be reasonably confident that the write was complete, but that doesn't sound like a good solid solution to me.

Why do you need to move them to a different folder?
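For what it's worth, one trick that is often used on Windows to guess whether a file is still open elsewhere is to try renaming it to its own name. This is only a sketch under an assumption: it relies on the writing process (here the FTP server) holding the file open without sharing delete/rename access, which is common but not guaranteed. The function name `looks_unlocked` is made up for illustration. On Linux or macOS the rename always succeeds, so the check tells you nothing there.

```python
import os


def looks_unlocked(path):
    """Best-effort guess at whether no other process holds `path` open.

    Assumption: on Windows, renaming a file onto itself raises an OSError
    (PermissionError) while another process has it open without sharing
    rename access. On POSIX systems the rename is a harmless no-op, so
    this returns True for any existing file.
    """
    try:
        os.rename(path, path)
    except OSError:
        # Locked by another process, or the file does not exist.
        return False
    return True
```

A mover could call `looks_unlocked(path)` right before each move and skip the file until the next pass if it returns False.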

Yeah, I was afraid of that. I increased the interval at which the folder is checked so the files have more time to be copied to the FTP folder, but I was looking for a more watertight solution since it can still break like this. Thanks for the help though.

As for your question: it's part of quite a big flow for a print company. The file needs to go through several stages of checks and fixups, and the moving makes it easy for them to see where in the flow the job is, which they can check in a dashboard.

The files aren't small, so I can't just leave them, or else the system will fill up quickly.

Can the service that drops the files there initially create them with a different name (a different extension, or with an underscore on the front or something)? If it could write the file under that temporary name and rename it to the proper name only once the write is complete, and your move code ignored files with the temporary name, then it should be OK. Renaming a file within the same folder is effectively instant, so it should work fine.
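A minimal sketch of that write-then-rename idea, in Python, assuming the dropping service can be changed. The `.part` suffix and both function names are hypothetical, chosen only for illustration: the writer works under the temporary name and renames at the end, and the mover ignores anything that still carries the temporary suffix.

```python
import os

TEMP_SUFFIX = ".part"  # hypothetical marker for "still being written"


def write_then_rename(final_path, data):
    """Writer side: write under a temporary name, rename when complete."""
    temp_path = final_path + TEMP_SUFFIX
    with open(temp_path, "wb") as fh:
        fh.write(data)
    # The file only appears under its proper name once it is complete.
    os.replace(temp_path, final_path)


def ready_files(folder):
    """Mover side: list only files that do not carry the temporary suffix."""
    return sorted(
        name
        for name in os.listdir(folder)
        if not name.endswith(TEMP_SUFFIX)
        and os.path.isfile(os.path.join(folder, name))
    )
```

With this in place the mover never has to guess how long a transfer takes; it simply never sees a half-written file under its final name.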

Just a thought, could you not get the created time and only move files of a certain age?

Would that be significantly different to adding a delay between seeing the file written to (via Watch node) and doing the move? That is just a matter of adding a Delay node in the right place.

It may be, if the file takes a significant time to write.

Is that different from only checking the folder every minute, for example?

Is that something like "oh new file", now wait 1min, then copy?

No, it's more like: check the files, and if they are older than x minutes, move them.

That is my suggestion, once you decide that you want to move a file then send that message through a Delay node to delay the message 1 minute, at which point the message will be released and the move initiated.

Isn't that also what the Delay node achieves? When the Watch node triggers, that means the file has just been created, so if the flow then waits one minute before initiating the move for that file, the create time must be at least one minute ago.

Final question though

Does a watch node watch the folder or every file specifically?
For example:

  • file X comes in
  • gets noticed by the node, it waits 1minute
  • file Y comes in (lets say 30sec after file X)

will both files be moved when X moves or does file Y wait another 30sec

Not sure, as the delay is not counting from when the file is written. That is assuming the create timestamp is written when the file finishes writing.

You must make sure that each file move is delayed from the time at which it is noticed. Presumably at some point in your flow you have nodes that move a file whose name is passed to it. Put the delay in there so that the move for that particular file is delayed as long as you think necessary to guarantee the file write is complete.
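To illustrate the per-file delay described above, here is a small Python sketch (not Node-RED code) of the bookkeeping involved. The class name `PerFileDelay` and its methods are hypothetical; times are passed in explicitly so the behaviour is easy to follow.

```python
class PerFileDelay:
    """Release each file only after its own delay has elapsed."""

    def __init__(self, delay_seconds):
        self.delay_seconds = delay_seconds
        self.first_seen = {}  # path -> time at which it was first noticed

    def notice(self, path, now):
        # Keep the first time we saw the file; later events do not reset it.
        self.first_seen.setdefault(path, now)

    def due(self, now):
        """Return (and forget) the files whose own delay has passed."""
        ready = [
            path
            for path, seen in self.first_seen.items()
            if now - seen >= self.delay_seconds
        ]
        for path in ready:
            del self.first_seen[path]
        return ready


delays = PerFileDelay(delay_seconds=60)
delays.notice("X", now=0)    # file X comes in
delays.notice("Y", now=30)   # file Y comes in 30 seconds later
print(delays.due(now=60))    # ['X']  (Y has only waited 30 seconds)
print(delays.due(now=90))    # ['Y']
```

So with the delay attached to each file, Y does not move when X moves; it waits its own full minute.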

The create time of a file is when the file was initially created. You are correct though, it would be marginally better to do it using the time modified (rather than the time created). It depends on how the write is done. If it is via something like FTP, and so could take a long time, then testing against the modified time would be better. If the write is a local write and one has an estimate of how long it takes, then it probably doesn't matter. The Delay node should be simpler, as I assume it is just a matter of dropping a Delay node in at the appropriate place.
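The modified-time variant could look something like the following Python sketch. It assumes the writer keeps updating the file's modification time while the transfer is in progress; the function name `files_older_than` and the `min_age_seconds` parameter are made up for illustration.

```python
import os
import time


def files_older_than(folder, min_age_seconds, now=None):
    """Return paths of files in `folder` not modified for `min_age_seconds`."""
    now = time.time() if now is None else now
    old_enough = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        # Use the modification time rather than the creation time: it keeps
        # moving forward for as long as the file is still being written.
        if os.path.isfile(path) and now - os.path.getmtime(path) >= min_age_seconds:
            old_enough.append(path)
    return old_enough
```

Called on every polling pass, for example `files_older_than(folder, 60)`, this only returns files that have been quiet for at least a minute. It is still a heuristic: a transfer that stalls for longer than the threshold would be picked up too early.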

Just tried it and it's working as intended. The higher you make the delay, the fewer issues you'll have, but I think this will never be a bulletproof fix (it fails the moment the copy takes longer than your delay). I think I'll go this route though.

The only annoying bit is that when the file gets moved, it triggers the watch node again. Any way around this?

I have a need for the same type of use case, in my case moving pictures and recorded video files. But I do it in a Python script running on a Raspberry Pi and it works great. If you do not find a good solution for Windows, could you not invest in a Raspberry Pi and let it handle the transfer for you?

Anyway, in Python I use the following code to loop through a directory and check whether each file is open before moving it. I do not know if this would also work on Windows.
EDIT: I should have been able to say myself that this will NOT work on Windows...

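import os
import subprocess

for f in os.listdir(workdir):
    path = os.path.join(workdir, f)
    if f.endswith('.jpg') and os.path.isfile(path):
        # lsof -t prints the PIDs of the processes that have the file open.
        # Check its output, not the exit status of a "| wc -l" pipeline,
        # which is 0 whether or not the file is open.
        result = subprocess.run(["lsof", "-t", path], capture_output=True, text=True)
        if not result.stdout.strip():
            # Nobody has the file open, so it should be safe to move it.
            subprocess.call(["sudo", "mv", "-f", path, "/DRIVE"])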

Do you mean that the file delete triggers the watch node? If so then is that a problem?