Writing sensor data to file - how to limit file size?

I am using the "append to file" option in the file node to write sensor data to a file, and I would like to limit this file to two years of data. It would be perfect if there were something like a maximum file size, but I have not been able to find anything. How can I limit the file size, or limit the number of entries, so that data older than this limit is pushed out?

The file node allows you to set msg.filename.
You could use a function node just before your file node that checks in JavaScript whether the year has changed, by comparing it with a savedYear flow context variable, and changes msg.filename accordingly. This may not be exactly what you need, since it switches to a new file every year.

Something like :

// Year of the file currently being written (undefined until first set)
let savedYear = flow.get('savedYear');
let currentYear = new Date().getFullYear();

// Start a new file on the first run or when the year has rolled over
if (savedYear === undefined || currentYear > savedYear) {
    savedYear = currentYear;
    flow.set('savedYear', savedYear);
}

msg.filename = `/home/pi/Desktop/${savedYear}.txt`;

return msg;

Example flow (untested, I didn't wait for a year) :wink:

[{"id":"871ff69c.a1a948","type":"file","z":"76cceccc.844784","name":"","filename":"","appendNewline":true,"createDir":false,"overwriteFile":"false","encoding":"none","x":830,"y":1060,"wires":[[]]},{"id":"fbc2145a.e68a5","type":"inject","z":"76cceccc.844784","name":"","props":[{"p":"payload"},{"p":"topic","vt":"str"}],"repeat":"","crontab":"","once":false,"onceDelay":0.1,"topic":"","payload":"$random()\t","payloadType":"jsonata","x":340,"y":1060,"wires":[["da82fd65.15dca8"]]},{"id":"da82fd65.15dca8","type":"function","z":"76cceccc.844784","name":"","func":"let savedYear = flow.get('savedYear')\nlet currentYear = new Date().getFullYear()\n\nif (currentYear > savedYear) {\n    msg.filename = `/home/pi/Desktop/${currentYear}.txt`;\n    flow.set('savedYear', currentYear)\n}\nelse {\n    msg.filename = `/home/pi/Desktop/${savedYear}.txt`;\n}\n\nreturn msg;","outputs":1,"noerr":0,"initialize":"","finalize":"","x":590,"y":1060,"wires":[["871ff69c.a1a948"]]},{"id":"a3b9090b.d73df8","type":"inject","z":"76cceccc.844784","name":"init","props":[{"p":"payload"},{"p":"topic","vt":"str"}],"repeat":"","crontab":"","once":true,"onceDelay":0.1,"topic":"","payload":"","payloadType":"date","x":340,"y":960,"wires":[["f26d1d55.ee7608"]]},{"id":"f26d1d55.ee7608","type":"function","z":"76cceccc.844784","name":"","func":"flow.set('savedYear', new Date().getFullYear())\nreturn msg;","outputs":1,"noerr":0,"initialize":"","finalize":"","x":530,"y":960,"wires":[[]]}]

PS: change the path to match your test file and OS.

I'd say that is a perfectly reasonable and simple workaround. However, there is really no need to even save the year. Simply use the year as the filename, and whenever the year changes, data will be written to that year's file.

If the OP really wants a two-year file, then do something like fileyear = year - (year % 2);
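
A minimal sketch of that idea, wrapped in a plain function so it can be tried outside Node-RED. The helper name bucketFilename, the bucketYears parameter and the /home/pi/Desktop directory (borrowed from the example earlier in the thread) are assumptions for illustration only.

```javascript
// Hypothetical helper: map a date to the file that covers its N-year bucket.
function bucketFilename(date, bucketYears = 2, dir = '/home/pi/Desktop') {
    const year = date.getFullYear();
    // Round the year down to the start of its bucket,
    // e.g. 2025 -> 2024 when bucketYears is 2.
    const fileYear = year - (year % bucketYears);
    return `${dir}/${fileYear}.txt`;
}

// Inside a function node this would be used roughly as:
//   msg.filename = bucketFilename(new Date());
//   return msg;
```

Note that this only starts a new file every two years. Files from earlier periods are not removed, so deleting them would be a separate step.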

Also, if there is really a need to check file size first, install the fs-ops nodes and use the "file size" node & if necessary, the fs-ops move or fs-ops delete nodes.
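
For comparison, here is a rough sketch of the same size check in plain Node.js rather than with the fs-ops nodes. It is not meant for a function node as-is, since the fs module is normally only reachable there if it has been made available through the Node-RED settings. The helper name rotateIfTooLarge and the ".old" suffix are made up for this example.

```javascript
const fs = require('fs');

// Hypothetical helper: if `file` is larger than `maxBytes`, rename it to
// `<file>.old` (replacing any previous .old copy) so that the next append
// starts a fresh file. Returns true when a rotation happened.
function rotateIfTooLarge(file, maxBytes) {
    let size;
    try {
        size = fs.statSync(file).size;
    } catch (err) {
        if (err.code === 'ENOENT') return false; // nothing written yet
        throw err;
    }
    if (size <= maxBytes) return false;
    fs.renameSync(file, `${file}.old`);
    return true;
}
```

This keeps at most two files on disk (the current one and one .old copy), which bounds disk usage but does not give an exact two-year window.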


Thank you guys for the tips. The flow we are developing is for ventilation systems, and we want to show the data in a graph where the user can choose between different periods, e.g. 1 day, 1 week, etc., but going no further back than 2 years. I found this video: https://www.youtube.com/watch?v=ccKspiI8FRw

I have not worked with SQL before, but it seems to be a good way to go. The video is a couple of years old. Do you know if there are newer nodes or projects that make this easier in Node-RED?

Maybe you should have opened with that info.

I certainly would agree - use a database if you have a requirement to query the data as you state.

As for which database...
You don't say how much data will be written (how often & how many). Nor do you say which hardware node-red will run on. Both of these points are useful in advising which database to use.
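
To illustrate why a database suits the two-year requirement, here is a hedged sketch that builds an insert and a matching clean-up statement for each sample. The table name readings and the columns ts and fan_speed are invented for the example, and how the statements and parameters are handed to a database node depends on the node chosen, so check its documentation.

```javascript
const TWO_YEARS_MS = 2 * 365 * 24 * 60 * 60 * 1000;

// Hypothetical helper: build the statements for one stored sample.
// `readings`, `ts` and `fan_speed` are made-up names for illustration.
function buildStatements(nowMs, fanSpeed) {
    return [
        // Store the new sample with its timestamp in milliseconds.
        { sql: 'INSERT INTO readings (ts, fan_speed) VALUES (?, ?)', params: [nowMs, fanSpeed] },
        // Drop everything older than two years so the table stops growing.
        { sql: 'DELETE FROM readings WHERE ts < ?', params: [nowMs - TWO_YEARS_MS] },
    ];
}
```

With a timestamp column like this, picking "1 day" or "1 week" for a graph becomes a query on ts rather than a scan through a text file.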

You are absolutely right. I have too little experience with this, and I am kind of fumbling in the dark for the right solution. In the current flow we have approximately 20 data points (sensors and values) expressing the speed of the ventilators, set points, etc. We are currently reading these roughly 20 sensors / values every 2 seconds for use in the real-time dashboard, and we use the aggregator node to calculate an average value for each data point every 5 minutes. This might change a bit, but it will be around this size and frequency. So every 5 minutes we will store a new record consisting of approx. 20 values. I am not sure about the terminology, but I think of it as a two-dimensional database with a maximum of 210,240 rows of 20 values each. We don't know whether this is too large for the Node-RED graph node to handle. If it is, we will reduce the total data set.
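
The row count above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch that ignores leap days and assumes exactly 20 values per record.

```javascript
const samplesPerHour = 60 / 5;               // one averaged record every 5 minutes
const rows = samplesPerHour * 24 * 365 * 2;  // two years of records
const values = rows * 20;                    // about 20 values per record

console.log(rows, values); // 210240 4204800
```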

As this is a commercial project, you may wish to look after your customers (and possibly avoid future lawsuits) and hire someone to help you get started.

It would be really easy to leave your customers open to security and/or reliability issues, which would reflect badly on your company.

For example, if you use a single-board computer such as a Raspberry Pi as the compute platform, it would be very easy to cause an embedded database to exceed a file size limit (a crash, possibly corrupted data) or to cause excessive memory usage (everything slows to a crawl and/or crashes). These things might not show up for a couple of years.

Similarly, having an embedded compute platform on a customer network without sufficient security could lead to some very nasty consequences if they were ever hacked and even worse consequences if the platform gets exposed to the Internet.
