I am working on a flow that takes data from MQTT (mostly rtl_433 sensors) and converts the inputs to meaningful data that is then written to an InfluxDB database. The tutorials I've read have gotten me started, and after a couple of questions here, I have most of the basic functionality to at least store appropriate data.
My next goal is to figure out retention policies such that I store the high-fidelity data for a short time and then downsample to hourly averages for long-term storage. I am currently working on the RPi onboard SD card while learning, but I believe a better approach would be to have the high-fidelity data stored on a USB flash drive and the long-term data stored on my RAID server.
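To make "downsample to hourly averages" concrete, here is a minimal Python sketch of the arithmetic involved. It is only an illustration of the idea, not how InfluxDB implements it; the function name `hourly_means` and the sample readings are made up for the example.

```python
from collections import defaultdict
from datetime import datetime, timezone


def hourly_means(readings):
    """Average (timestamp, value) pairs into one mean per hour."""
    buckets = defaultdict(list)
    for ts, value in readings:
        # Truncate each timestamp to the start of its hour.
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(value)
    return {hour: sum(vals) / len(vals) for hour, vals in sorted(buckets.items())}


# Hypothetical temperature readings, not real sensor data.
readings = [
    (datetime(2021, 1, 1, 10, 5, tzinfo=timezone.utc), 20.0),
    (datetime(2021, 1, 1, 10, 35, tzinfo=timezone.utc), 22.0),
    (datetime(2021, 1, 1, 11, 10, tzinfo=timezone.utc), 19.0),
]
means = hourly_means(readings)
```

Three raw points collapse into two hourly values here, which is where the long-term storage saving comes from.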
I have been reviewing a few tutorials, but not having experience managing databases, I'm not sure if I am on the right track. The following tutorial seems excellent when discussing setting up retention policies on the InfluxDB side: https://docs.influxdata.com/influxdb/v1.8/guides/downsample_and_retain/
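For reference, the pattern in that guide boils down to two retention policies plus one continuous query. The sketch below only builds the InfluxQL statements as strings so they can be pasted into the `influx` shell; the database, measurement, field, and policy names (`sensors`, `temperature`, `temperature_C`, `one_week`, `forever`) are assumptions for illustration, not anything from my actual setup.

```python
def retention_policy(name, database, duration, default=False):
    """Build an InfluxQL (1.x) CREATE RETENTION POLICY statement."""
    stmt = (
        f'CREATE RETENTION POLICY "{name}" ON "{database}" '
        f"DURATION {duration} REPLICATION 1"
    )
    return stmt + (" DEFAULT" if default else "")


def hourly_mean_cq(name, database, field, source, target_rp, target):
    """Build a continuous query that writes hourly means into another policy."""
    return (
        f'CREATE CONTINUOUS QUERY "{name}" ON "{database}" BEGIN '
        f'SELECT mean("{field}") AS "mean_{field}" '
        f'INTO "{target_rp}"."{target}" FROM "{source}" '
        # ", *" groups by every tag so each sensor keeps its own series.
        f"GROUP BY time(1h), * END"
    )


# All names below are hypothetical placeholders.
statements = [
    # Raw data: kept for 7 days, and the default policy for new writes.
    retention_policy("one_week", "sensors", "7d", default=True),
    # Downsampled data: kept indefinitely.
    retention_policy("forever", "sensors", "INF"),
    hourly_mean_cq("cq_hourly", "sensors", "temperature_C",
                   "temperature", "forever", "temperature_hourly"),
]
for stmt in statements:
    print(stmt)
```

Note that the continuous query reads from the default policy and writes into the long-term one, so both live in the same InfluxDB server.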
I don't know if what I am describing requires a separate database set up on the Windows server or if it can be handled by the RPi alone via SAMBA or similar. If someone could post a tutorial link or a high level recommendation on how to set things up I can continue my research.
I'm only 1 week in on NR, InfluxDB, and Grafana. These tools are absolutely amazing though!
Be careful there: not all flash drives have wear levelling (indeed, not all SD cards do either). In fact, I would say that it is better to use a decent-quality SD card. Flash drives, especially cheap ones, are notoriously unreliable.
There is a thread from a month or so back where I gave some detailed examples of what you are trying to do for the retention policy side.
I would avoid using SAMBA for this if you can. It isn't really designed for this kind of work. Since InfluxDB uses HTTP links, I would stick with that even though it requires you to run InfluxDB on the Windows server as well as the Pi.
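As a rough illustration of why HTTP is the natural transport, a write to an InfluxDB 1.x server is just a POST of line protocol text to its `/write` endpoint. The sketch below builds such a request without sending it. The host name, database, measurement, and tag names are assumptions, and the helper only handles numeric fields with no escaping of spaces or commas.

```python
from urllib.parse import urlencode
from urllib.request import Request


def line_protocol(measurement, tags, fields, timestamp_s):
    """Format one point as InfluxDB line protocol (numeric fields only)."""
    tag_part = "".join(f",{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={v}" for k, v in sorted(fields.items()))
    return f"{measurement}{tag_part} {field_part} {timestamp_s}"


def write_request(base_url, database, line):
    """Build (but do not send) a POST to the InfluxDB 1.x /write endpoint."""
    query = urlencode({"db": database, "precision": "s"})
    return Request(f"{base_url}/write?{query}",
                   data=line.encode("utf-8"), method="POST")


# Hypothetical point and server address.
line = line_protocol("temperature", {"sensor": "garage"},
                     {"temperature_C": 21.5}, 1609459200)
req = write_request("http://windows-server:8086", "sensors", line)
# To actually send it: urllib.request.urlopen(req)
# A successful write returns HTTP 204 with no body.
```

In Node-RED the influx nodes do this for you; the point is only that the remote server needs nothing more than a reachable HTTP port.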
Honestly, I would keep it simple for now. Even a modest Pi can cope with a lot of data in InfluxDB. My Pi3 has 296MiB of data in it at the moment, several years of environmental and system performance data.
I am not sure whether the usual downsampling technique will allow the downsampled data to go to a different system. Normally it is all within one InfluxDB server. If you don't want it on the SD card, then you could put a USB drive (hard disk or SSD) on the Pi and put it all there.
Since you are starting from scratch you might want to consider going straight to Influx 2.0 as that is the way Influx is going. The node-red influx nodes support Influx 2.0 now. A problem might be that you will have more difficulty finding anyone that will help. Several of us would be learning with you I think. The advantage of starting with 2.0 is that if you go for 1.8 you will likely have to migrate the data at some point which may not be trivial.
I was thinking that it would be better to put the high-fidelity/short-term data on a local USB drive. The primary reason for local storage rather than the server is startup time: the server is RAID5 with 4 conventional disk drives. If I write to the server once a day, it's no problem, since the time it takes to spin each drive up in series isn't a big issue. But if I write every bit of sensor data to the server, it will keep the drives spinning all the time and cause earlier failure. My intent was to have the current day/week stored locally and then uploaded to the server for long-term storage. Keeping it local would be fine, but a backup would be preferred. If I hammer a USB device of any kind, the media will eventually fail. If I lose the high-fidelity data, I'm out a day or a week, which is not the end of the world. If I lose the full set, I'd be starting over.
@Colin I looked at a couple descriptions of Influx 2 and it appears it's still considered Alpha though it is available to anyone. I don't have a problem being an early adopter so long as it works, but without much database background, that may be a bit of a hurdle if there's no support on this or their own forum. I'll do some more reading.
Urm! I think you will find that server disks are designed for that. A lightly loaded server disk will probably keep going for well over a decade. My NAS drives (WD Red) have done 7 years 24x7 and they show no signs of wear.
You are certainly over-thinking things. Server drives are designed to be always on. USB sticks are designed for occasional use. But either way, my SD cards have lasted for years and get constantly hammered. We have covered this many times in the forum.
However you store it, if it is important data, then you must have a routine backup schedule. Whatever device you use it may fail at any time without warning.
@Colin I've been thinking about your comment and realized that the best approach is probably to have the database stored local to the RPi on some form of media and then to back it up to a server for safekeeping. If the RPi storage goes bad, I'd restore the whole database from the backup location. This would probably be easier and within the scope of how Influx is designed anyway. I will have to consider further what medium to use for storage. I wasn't suggesting that the USB drive would be better or worse than the RPi SD card so much as that if the SD card were used and it failed, the whole setup would be down, whereas if I use an external device (USB stick, USB->SD adapter, SSD, etc.) I would only lose the data on that device. That's kind of a wash though, since I intend to use rpi-clone regularly/weekly anyway.
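A minimal sketch of what that backup step might look like, assuming InfluxDB 1.8 and its `influxd backup -portable` / `influxd restore -portable` commands. The mount point `/mnt/server/influx-backup` and the database name `sensors` are placeholders; the functions only build the command lines, and nothing runs unless `run()` is called.

```python
import subprocess


def backup_command(backup_dir, database=None, host=None):
    """Build an `influxd backup -portable` command line (InfluxDB 1.x)."""
    cmd = ["influxd", "backup", "-portable"]
    if database:
        cmd += ["-database", database]  # omit to back up all databases
    if host:
        cmd += ["-host", host]  # host:port of the instance to back up
    cmd.append(backup_dir)
    return cmd


def restore_command(backup_dir):
    """Build the matching `influxd restore -portable` command line."""
    return ["influxd", "restore", "-portable", backup_dir]


def run(cmd):
    """Execute a command built above; raises if influxd exits non-zero."""
    subprocess.run(cmd, check=True)


# Hypothetical path where the server share is mounted on the Pi.
cmd = backup_command("/mnt/server/influx-backup", database="sensors")
print(" ".join(cmd))
```

Scheduling this once a day (cron, or an exec node in Node-RED) would mean the server disks only need to spin up for the backup window.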