Node-red crash with "node-red-node-tail"

mariomol · 27 September 2023 14:06

Hi, I'm trying to build an industrial application that collects data from 20 machines on a production line at a medical device manufacturer. Each machine has a Windows PC with a "Log.txt" file whose last line contains the current machine status. I read that file with "node-red-node-tail" v0.4.0, then a Function node decodes the message and reports machine running, stopped, Lot Number, operator, etc. on the dashboard. The server is also a Windows 10 PC, running Node-RED v3.0.2.

Everything works fine for a few hours and then Node-RED crashes, every time I run it, with this error:

27 Sep 09:16:14 - [red] Uncaught Exception:
27 Sep 09:16:14 - [error] Error: ECONNRESET: connection reset by peer, watch
at FSEvent.FSWatcher._handle.onchange (node:internal/fs/watchers:207:21)
C:\WINDOWS\system32>

My question: how can I troubleshoot or correct this error?

Thank you for the help!

TotallyInformation · 27 September 2023 14:50

How big is the file? I see that you are accessing the file on what is presumably a shared network drive (Y:)? Is that connection reliable? What device is that drive on - is that device reliable?

Also, does the connection recover? If so, I think you can work around the issue.

Colin · 27 September 2023 15:01

Whatever the cause (which is likely a network or device problem, as @TotallyInformation suggests), it should not crash Node-RED. Please upgrade to Node-RED 3.1.0 and check whether it still happens (which I expect it will), then submit an issue describing the problem at Issues · node-red/node-red · GitHub

mariomol · 27 September 2023 15:30

TotallyInformation Thank you for the reply. We have 20 PCs (Windows 10) on an industrial wired network (1 GB, very reliable) with shared network drives. The "Log.txt" file is 416 KB (6,561 lines of short messages). We don't have any issues with the computers or the network; I can always see and access the shared drives. After Node-RED crashes, I can restart it and everything comes back to life for a few hours, then it stops again.

Thank you!

mariomol · 27 September 2023 15:36

Colin Thank you for the response. I will try upgrading to version 3.1.0 and report the findings but, as you mention, I think the result will be the same.

Colin · 27 September 2023 16:10

Have you configured the node red service to auto-restart on crashing? That would at least ensure the system starts up again automatically.
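One generic way to get that behaviour is a small supervisor script. The sketch below is only an illustration under stated assumptions, not a Node-RED feature: `supervise`, `spawnFn` and `schedule` are hypothetical names, and the command that launches Node-RED depends on how it was installed. The operating system's own service recovery settings may well be the better choice in practice.

```javascript
const { spawn } = require("child_process");

// Hypothetical supervisor: restart `command` whenever it exits with a
// non-zero code, up to `maxRestarts` times. `spawnFn` and `schedule` are
// injectable only so the logic can be tried without starting a real process.
function supervise(command, args, options = {}) {
    const {
        maxRestarts = 5,
        delayMs = 5000,
        spawnFn = spawn,
        schedule = setTimeout,
    } = options;
    const state = { starts: 0, stopped: false };

    function start() {
        state.starts += 1;
        const child = spawnFn(command, args, { stdio: "inherit" });

        // A failed launch is reported as an "error" event on the child.
        child.on("error", (err) => {
            console.error(`could not start ${command}: ${err.message}`);
            state.stopped = true;
        });

        child.on("exit", (code) => {
            // Stop on a clean exit or once the restart budget is used up.
            if (code === 0 || state.starts > maxRestarts) {
                state.stopped = true;
                return;
            }
            schedule(start, delayMs);
        });
    }

    start();
    return state;
}

// Assumed usage; the command name and arguments are placeholders:
// supervise("node-red", [], { maxRestarts: 10, delayMs: 5000 });
```

With the defaults this allows one initial start plus five restarts, with a five second pause before each restart so that a persistent fault does not turn into a tight loop.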

TotallyInformation · 27 September 2023 16:10

There appears to be an outstanding bug that may be related to this:

github.com/node-red/node-red-nodes

node-red-node-tail causes node-red termination after size check for /var/log/messages error: ENOENT: no such file or directory, stat '/var/log/messages'

opened 11:44AM - 10 Apr 21 UTC

djiwondee

### Which node are you reporting an issue on?

`node-red-node-tail` is used in this flow for displaying error messages in a node-red-dashboard

<img width="1049" alt="image" src="https://user-images.githubusercontent.com/37173958/114268073-81e65c00-99ff-11eb-8a3a-1918613d5531.png">

```
[{"id":"e84c1031.b43f2","type":"tail","z":"d14800ce.426d4","name":"","filetype":"text","split":"true","filename":"/var/log/messages","inputs":0,"x":130,"y":1140,"wires":[["15fbcaac.5e0b35"]]},{"id":"15fbcaac.5e0b35","type":"switch","z":"d14800ce.426d4","name":"Filter Errors","property":"payload","propertyType":"msg","rules":[{"t":"cont","v":"Error","vt":"str"}],"checkall":"true","repair":false,"outputs":1,"x":330,"y":1140,"wires":[["34b6985e.3703f","62b4aeaa.115f78"]]},{"id":"34b6985e.3703f","type":"counter","z":"d14800ce.426d4","name":"Count Errors","init":"0","step":"1","lower":"","upper":"","mode":"increment","outputs":2,"x":510,"y":1140,"wires":[["ec3d71c8.81c3c8","990394e1.b136a"],["fdf03a43.c8ac2","59598d5a.a91a9c"]]},{"id":"ec3d71c8.81c3c8","type":"ui_gauge","z":"d14800ce.426d4","name":"CCU Fehler","group":"b2a5740.e3c2f1","order":1,"width":6,"height":4,"gtype":"gage","title":"Anzahl","label":"Fehler","format":"{{value}}","min":0,"max":"500","colors":["#00b500","#e6e600","#ca3838"],"seg1":"125","seg2":"250","x":690,"y":1080,"wires":[]},{"id":"fdf03a43.c8ac2","type":"function","z":"d14800ce.426d4","name":"Rotate Entries","func":"var dashboardLog = context.get('dashboardLog')|| [];\n \ndashboardLog.push(msg);\nif (dashboardLog.length > 20) {\n // Delete oldest message if > 20\n dashboardLog.shift();\n dashboardLog.length = 20;\n} \n\nif (msg.resetlog) {\n dashboardLog = [];\n}\n \n// store the value back\ncontext.set('dashboardLog',dashboardLog);\n \n// make it part of the outgoing msg object\nmsg = {};\nmsg.payload = dashboardLog;\nreturn msg;","outputs":1,"noerr":0,"x":700,"y":1200,"wires":[["f3be4f6c.225958"]]},{"id":"59598d5a.a91a9c","type":"join","z":"d14800ce.426d4","name":"Prepare Message","mode":"custom","build":"string","property":"payload","propertyType":"msg","key":"topic","joiner":"\\r\\n","joinerType":"str","accumulate":false,"timeout":"","count":"5","reduceRight":false,"reduceExp":"","reduceInit":"","reduceInitType":"num","reduceFixup":"","x":710,"y":1140,"wires":[[]]},{"id":"f3be4f6c.225958","type":"ui_template","z":"d14800ce.426d4","group":"97aaa7e1.c96128","name":"Error Dashboard Log","order":1,"width":12,"height":7,"format":"<ul>\n <li ng-repeat=\"x in msg.payload\">\n <p style=\"color:red\">{{x.topic}}</p>\n <ul>\n <li><p style=\"font-size:11px\">{{x.payload}}</p></li>\n </ul>\n </li>\n</ul>","storeOutMessages":true,"fwdInMessages":true,"resendOnRefresh":false,"templateScope":"local","x":940,"y":1200,"wires":[[]]},{"id":"9688061b.b2dc48","type":"ui_button","z":"d14800ce.426d4","name":"Reset Error Count","group":"b2a5740.e3c2f1","order":5,"width":0,"height":0,"passthru":false,"label":"Zurücksetzen","tooltip":"","color":"","bgcolor":"","icon":"mi-clear","payload":"true","payloadType":"bool","topic":"","x":130,"y":1080,"wires":[["229fc580.c38dc2"]]},{"id":"229fc580.c38dc2","type":"change","z":"d14800ce.426d4","name":"Set Reset Message","rules":[{"t":"set","p":"reset","pt":"msg","to":"true","tot":"bool"},{"t":"set","p":"topic","pt":"msg","to":"Reset Error Count","tot":"str"}],"action":"","property":"","from":"","to":"","reg":false,"x":350,"y":1080,"wires":[["34b6985e.3703f"]]},{"id":"83893482.d56ca8","type":"ui_button","z":"d14800ce.426d4","name":"Clear DashLog","group":"97aaa7e1.c96128","order":2,"width":0,"height":0,"passthru":false,"label":"Protokoll leeren","tooltip":"Anzeige der letzten 20 gespeicherten Fehler löschen","color":"","bgcolor":"","icon":"mi-delete_sweep","payload":"true","payloadType":"bool","topic":"","x":120,"y":1200,"wires":[["ad9da38f.553c98"]]},{"id":"ad9da38f.553c98","type":"change","z":"d14800ce.426d4","name":"Set Reset Message","rules":[{"t":"set","p":"resetlog","pt":"msg","to":"true","tot":"bool"}],"action":"","property":"","from":"","to":"","reg":false,"x":350,"y":1200,"wires":[["fdf03a43.c8ac2"]]},{"id":"990394e1.b136a","type":"ui_text","z":"d14800ce.426d4","group":"3917802.e83608","order":2,"width":0,"height":0,"name":"","label":"CCU Fehler:","format":"{{value}}","layout":"row-left","x":690,"y":1040,"wires":[]},{"id":"62b4aeaa.115f78","type":"debug","z":"d14800ce.426d4","name":"Filtered Errors","active":false,"tosidebar":true,"console":false,"tostatus":true,"complete":"payload","targetType":"msg","statusVal":"payload","statusType":"auto","x":700,"y":1260,"wires":[]},{"id":"b2a5740.e3c2f1","type":"ui_group","name":"Fehler","tab":"a46e92f.58191f","order":6,"disp":true,"width":"6","collapse":false},{"id":"97aaa7e1.c96128","type":"ui_group","name":"Fehlermeldungen","tab":"a46e92f.58191f","order":7,"disp":true,"width":"12","collapse":false},{"id":"3917802.e83608","type":"ui_group","name":"Ereignisse","tab":"4e9b13b2.12840c","order":6,"disp":true,"width":"6","collapse":false},{"id":"a46e92f.58191f","type":"ui_tab","name":"System","icon":"settings","order":10,"disabled":false,"hidden":false},{"id":"4e9b13b2.12840c","type":"ui_tab","name":"Home","icon":"home","order":1}]
```

### What are the steps to reproduce?

There seems not to be an easy way to reproduce. I experience this problem several times with no visible cause. The only guess I have could be the daily rotation of the /var/log/messages file executed by a cron job. But this happens at a different time (each Midnight) as the crash of the node-red daemon process. May be a simultaneous access by a second process, since the relevant file is used for logging messages from the system running node-red.

### What happens?

`node-red-node-tail` log a node-red error, after that the node-red daemon terminates:

```
Apr 9 01:56:12 {{HOSTNAME}} node-red[7898]: [tail:e84c1031.b43f2] size check for /var/log/messages failed: Error: ENOENT: no such file or directory, stat '/var/log/messages'
Apr 9 01:56:12 {{HOSTNAME}} node-red: 9 Apr 01:56:12 - [red] Uncaught Exception:
Apr 9 01:56:12 {{HOSTNAME}} node-red: 9 Apr 01:56:12 - Error: ENOENT: no such file or directory, stat '/var/log/messages'
Apr 9 01:56:12 {{HOSTNAME}} node-red: at Object.statSync (fs.js:1086:3)
Apr 9 01:56:12 {{HOSTNAME}} node-red: at Tail.latestPosition (/usr/local/addons/redmatic/var/node_modules/node-red-node-tail/node_modules/tail/lib/tail.js:74:23)
Apr 9 01:56:12 {{HOSTNAME}} node-red: at Tail.change (/usr/local/addons/redmatic/var/node_modules/node-red-node-tail/node_modules/tail/lib/tail.js:118:22)
Apr 9 01:56:12 {{HOSTNAME}} node-red: at Tail.watchEvent (/usr/local/addons/redmatic/var/node_modules/node-red-node-tail/node_modules/tail/lib/tail.js:183:18)
Apr 9 01:56:12 {{HOSTNAME}} node-red: at FSWatcher.<anonymous> (/usr/local/addons/redmatic/var/node_modules/node-red-node-tail/node_modules/tail/lib/tail.js:143:101)
Apr 9 01:56:12 {{HOSTNAME}} node-red: at FSWatcher.emit (events.js:315:20)
Apr 9 01:56:12 {{HOSTNAME}} node-red: at FSEvent.FSWatcher._handle.onchange (internal/fs/watchers.js:186:12)
Apr 9 01:56:12 {{HOSTNAME}} node-red: Node-RED exited with non-zero exit status 1
...
Apr 9 12:15:48 {{HOSTNAME}} node-red[24971]: [tail:e84c1031.b43f2] size check for /var/log/messages failed: Error: ENOENT: no such file or directory, stat '/var/log/messages'
Apr 9 12:15:49 {{HOSTNAME}} node-red: 9 Apr 12:15:49 - [red] Uncaught Exception:
Apr 9 12:15:49 {{HOSTNAME}} node-red: 9 Apr 12:15:49 - Error: ENOENT: no such file or directory, stat '/var/log/messages'
Apr 9 12:15:49 {{HOSTNAME}} node-red: at Object.statSync (fs.js:1086:3)
Apr 9 12:15:49 {{HOSTNAME}} node-red: at Tail.latestPosition (/usr/local/addons/redmatic/var/node_modules/node-red-node-tail/node_modules/tail/lib/tail.js:74:23)
Apr 9 12:15:49 {{HOSTNAME}} node-red: at Tail.change (/usr/local/addons/redmatic/var/node_modules/node-red-node-tail/node_modules/tail/lib/tail.js:118:22)
Apr 9 12:15:49 {{HOSTNAME}} node-red: at Tail.watchEvent (/usr/local/addons/redmatic/var/node_modules/node-red-node-tail/node_modules/tail/lib/tail.js:183:18)
Apr 9 12:15:49 {{HOSTNAME}} node-red: at FSWatcher.<anonymous> (/usr/local/addons/redmatic/var/node_modules/node-red-node-tail/node_modules/tail/lib/tail.js:143:101)
Apr 9 12:15:49 {{HOSTNAME}} node-red: at FSWatcher.emit (events.js:315:20)
Apr 9 12:15:49 {{HOSTNAME}} node-red: at FSEvent.FSWatcher._handle.onchange (internal/fs/watchers.js:186:12)
Apr 9 12:15:49 {{HOSTNAME}} node-red: Node-RED exited with non-zero exit status 1
```

### What do you expect to happen?

Even if the log file is not accessible by node-red-node-tail, the error should be caught and not causing termination of node red daemon. There is no indication of deletion of the message log file, neither by a process nor manually while node-red is running.

### Please tell us about your environment:

- [X] Node-RED version: 1.2.9
- [X] node.js version: 14.16.0
- [X] npm version: 5.6.0
- [X] Platform/OS: Linux 4.14.34 arm LE
- [X] Browser: Safari, Chrome, Firefox on macOS 11.2.3

The issue is a crash of the tail.js external library. However, the node's runtime should have a try/catch around the use of Tail:

github.com

node-red/node-red-nodes/blob/master/storage/tail/28-tail.js#L20-L25


      
          if (node.filetype === "text") {
              node.tail = new Tail(node.filename,{separator:node.split, flushAtEOF:true});
          }
          else {
              node.tail = new Tail(node.filename,{separator:null, flushAtEOF:true, encoding:"binary"});
          }

So that is a bug anyway and should be reported to that repo.

Of course, that does not resolve the issue completely. However, it might let you use a Catch node to trap the error and simply wait for the next file change, which might then work unless some kind of network error is generated again.
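As a rough illustration of the kind of guard meant here (not the actual fix for 28-tail.js), the sketch below wraps construction in try/catch and also registers an "error" listener. In Node.js, an EventEmitter that emits "error" with no listener attached throws, which ends up as an uncaught exception. That would be consistent with the stack trace in the first post, though whether tail.js forwards the watcher error this way is an assumption. `guardEmitter` and `fakeTail` are hypothetical names, and a plain EventEmitter stands in for the Tail object so the example runs without extra packages.

```javascript
const { EventEmitter } = require("events");

// Hypothetical helper: build an emitter via `create` and make sure that
// neither a throw during construction nor a later "error" event can
// escape as an uncaught exception.
function guardEmitter(create, onError) {
    let emitter;
    try {
        emitter = create();          // e.g. () => new Tail(filename, options)
    } catch (err) {
        onError(err);                // construction itself failed
        return null;
    }
    emitter.on("error", onError);    // without a listener, "error" events throw
    return emitter;
}

// Stand-in for a Tail instance, used only for this demonstration.
const errors = [];
const fakeTail = guardEmitter(
    () => new EventEmitter(),
    (err) => errors.push(err.message)
);

// Simulate the watcher reporting the network error from the first post.
fakeTail.emit("error", new Error("ECONNRESET: connection reset by peer, watch"));
console.log(errors);
```

Inside a node, the `onError` callback would presumably call `node.error(err, msg)` so that a Catch node can pick the error up, instead of collecting messages in an array as done here.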

mariomol · 27 September 2023 17:27

Colin Not sure how to do that, is it a node red service configuration or the operating system?

mariomol · 27 September 2023 17:33

TotallyInformation Thank you! Are you suggesting that I add those lines to the code of my installed tail node? I'm not sure here: node-red/node-red-nodes/blob/master/storage/tail/28-tail.js#L20-L25

TotallyInformation · 27 September 2023 19:16

Well, that would only be a temporary workaround. It needs raising as a bug on that repo for someone to fix properly.

Colin · 27 September 2023 19:25

I don't know either; I don't use Windows. I expect it is a configuration setting in whatever system you have used to run Node-RED automatically on boot.

TotallyInformation · 27 September 2023 19:29

We would need to know how Node-RED is being started on the PC. On Windows there are various ways to run things and each has its own way of handling auto-restart.

It might also be good to look at what kind of drive Y: actually is: a network drive, some kind of external drive, or something else. That may be what is causing the accessibility blip. It's not possible to know without knowing more about the architecture in use.

mariomol · 27 September 2023 20:36

I'm starting up the Node-RED service manually, with no auto-start at the moment; I will try that once I find a solution for the crashing issue.

As mentioned, the basic architecture is one PC running Node-RED as the server and 20 PCs with shared drives over a wired LAN, all running Windows 10. Each of the 20 PCs has a "Log.txt" file that we need to tail. The shared drive on each of the 20 PCs is the one where the OS lives (C:).

TotallyInformation · 27 September 2023 22:48

I think I might try manually copying the log files from the network drives to the Node-RED PC's local C: drive and reading them from there, just to try to eliminate another variable.
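If the manual test looks promising, the copy step could later be scripted. The following is a minimal sketch under stated assumptions: `copyLogSafely` and `lastLine` are hypothetical helpers, temporary files stand in for the network share, and the two status lines are made-up placeholders rather than the real Log.txt format. A failed copy returns false instead of throwing, so a brief loss of the share would not take the process down.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: copy a remote log to a local path, returning false
// instead of throwing when the source cannot be read.
function copyLogSafely(remotePath, localPath) {
    try {
        fs.copyFileSync(remotePath, localPath);
        return true;
    } catch (err) {
        console.warn(`copy of ${remotePath} failed: ${err.code || err.message}`);
        return false;
    }
}

// Return the last non-empty line of a text file, or null if there is none.
function lastLine(filePath) {
    const lines = fs.readFileSync(filePath, "utf8")
        .split(/\r?\n/)
        .filter((line) => line.trim() !== "");
    return lines.length > 0 ? lines[lines.length - 1] : null;
}

// Demonstration: temporary files stand in for the share and the local copy.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "taildemo-"));
const remote = path.join(dir, "Log.txt");
const local = path.join(dir, "Log.local.txt");
fs.writeFileSync(remote, "first status\r\nsecond status\r\n");

const copied = copyLogSafely(remote, local);
const status = copied ? lastLine(local) : null;
const missing = copyLogSafely(path.join(dir, "absent.txt"), local);
console.log(copied, status, missing);
```

Run on a timer, this polls the file rather than watching it, which takes the file watcher on the network share out of the picture while testing.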
