Reading a CSV file

Hi everyone,
I'm trying to get my heat pump's logging into my smart home and have Node-RED up and running.
My heat pump's logs are accessible over HTTP on my local network. The file is generated every day and has this pattern:
fixed_filename_part-YYYY-MM-DD.log
The file is about 2.5 MB per day.

I tried to start with a file node pointed at the HTTP address, connected to a CSV node, but got nothing in the debug window.
Please help!

If you have an HTTP address, look at the http request node.

How can I handle the variable part of the filename?

As per the node's info panel (on the right-hand side when you click on the node):

msg.url
If not configured in the node, this optional property sets the url of the request.

So define the URL just as you were doing for your file node, and then use either a change node or a function node to move the URL into msg.url.
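For example, a function node in front of the http request node could build today's URL like this (the host and path here are made up - adjust them for your heat pump):

```javascript
// Function node: build today's log filename and set it as msg.url.
// Host and path are placeholders - replace with your heat pump's address.
const d = new Date();
const pad = n => String(n).padStart(2, "0");
const today = `${d.getFullYear()}-${pad(d.getMonth() + 1)}-${pad(d.getDate())}`;
msg.url = `http://heatpump.local/logs/fixed_filename_part-${today}.log`;
return msg;
```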

OK, thank you, I got the readings, but I also get:
"TypeError: url.indexOf is not a function"

OK, a couple of basic questions:

  1. version of Node-RED?
  2. version of node.js?
  3. which csv node are you using?
  4. do you know which node in your flow is getting the error? (use a 'catch' node -> debug node may help)
  5. can you provide a stripped down version of your flow demonstrating the error?

OK, never mind - the error occurred because of an inactive flow. Sorry for that.
Now my next question: my CSV is updated every minute - how do I read only the new lines since the last execution?

So you are creating a CSV file that grows by 2.5 MB per day - or do you recreate it every day?

What device are you running NR on?

Let's see: if the file grows by 2.5 MB/day, that's 17.5 MB/week, 70 MB every 4 weeks, or 912.5 MB (roughly 1 GB) per year. What are you doing with all that data?

A CSV file is just a flat file with data added to the end. Somehow you have to read the file and extract the data at the end.

Now, if you were storing the data in a database, you could save the execution time somewhere (in the DB or in a global) and use that time to select all records after that time.

My heat pump generates that file. It starts at zero and grows to about 2.5 MB per day. The heat pump's memory stores the last 60 days.
This CSV includes all the heat pump's data (temps, flows, pressure, errors, etc.). I only need a few of those values.

Ahhh so the CSV file is stored on the heat pump, not on the system running NR.

So what do you get when you use the http request?
Does the heat pump have an API?

I'm just requesting the file - no, sorry, no API.
But the first column in the CSV is a timestamp - can I somehow use this as a filter?

Yes, store the last line's timestamp as a context value. Then, in a function node, only pass on values where the timestamp is greater than the stored value (and update the stored value).

Details of how to store context values are in the docs on the website, on the Writing Functions page.
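A minimal sketch of that function node, assuming the csv node has already turned each line into an object and the timestamp column is called `timestamp` (rename it to match your actual header):

```javascript
// Function node: only pass on rows newer than the last one processed.
// Assumes msg.payload is an array of row objects from the csv node and
// that each row has a `timestamp` column - adjust the name to your file.
const last = context.get("lastTimestamp") || 0;
const fresh = msg.payload.filter(row => new Date(row.timestamp).getTime() > last);

if (fresh.length === 0) {
    return null; // nothing new since the last execution
}

// Remember the newest timestamp for the next run.
context.set("lastTimestamp", new Date(fresh[fresh.length - 1].timestamp).getTime());

msg.payload = fresh;
return msg;
```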

Just to be clear, it doesn't sound as though you can request only the lines you want; it sounds like you have to get the whole file each time and then trim it down in Node-RED.

If that is the case, depending on what you want to do with the data next, you might simply process the whole file each time. Obviously, though, that won't work if you are feeding the data into a database, for example.

Yesterday I tried with a finished log file from the day before, and I'm starting to believe that Node-RED is not the tool for this. Execution takes forever...
I want to keep these values as accurate as possible - let's say, monitor the CSV file for changes. That's not possible with Node-RED, as the file is simply too big.

So how big is the file? What are you trying to process it on? Do you know how many lines are added per day? You should be able to use a combination of techniques to help. E.g. just use curl or wget to fetch the file directly to disk, rather than loading it all into memory in Node-RED. Then either use a tail -n xx command to fetch the last xx lines, or set the file in node to read a line at a time, combined with some sort of time filter as has already been suggested.

On my laptop I can extract metadata from hundred-MB video files in real time, no problem.

At the least, you can use NR to prototype the processing you need to do. As you are restricted by the source device to only ever getting the whole file, this seriously limits what you can do. Normally, the way to get around this is to process the file as a stream - I think there may be some nodes for working with big files, though I'm not sure how well maintained they are.

Alternatively, something like the AlaSQL nodes may help, as you can leverage AlaSQL itself to treat the file as a database.

And, as a final resort, write a small utility in Node.js that grabs the file to disk and then trims it. You can call that from Node-RED, perhaps passing in a timestamp to aid in the trimming. You could even extend that to write to a DB. Again, AlaSQL might help you there; it is a Node.js module.
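Something along these lines, as a rough sketch: it reads the response as a stream so the whole file never sits in memory at once, and the URL and the first-column timestamp layout are assumptions you would need to adapt:

```javascript
// fetchlog.js - print only the log lines newer than a given timestamp.
// Usage: node fetchlog.js "2024-01-01T12:00:00"
// The URL and the CSV layout (timestamp in the first comma-separated
// column) are hypothetical - adjust both for the real heat pump file.
const http = require("http");
const readline = require("readline");

const since = new Date(process.argv[2] || 0);
// Placeholder - substitute today's date, e.g. as in the earlier function node example.
const url = "http://heatpump.local/logs/fixed_filename_part-YYYY-MM-DD.log";

http.get(url, res => {
    // Process the response line by line instead of buffering it all.
    const rl = readline.createInterface({ input: res });
    rl.on("line", line => {
        const ts = new Date(line.split(",")[0]);
        if (!isNaN(ts) && ts > since) {
            console.log(line); // an exec node in NR picks this up on stdout
        }
    });
});
```

Called from an exec node with the last-seen timestamp as the argument, Node-RED then only ever sees the new rows.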

If you want to do that using shell commands, you can drive all of it from NR in the way that Dave has suggested.

So lots of options, it is a case of using the right tools for the job. If nothing else, NR is great at coordinating those activities in a more visual way.

Wow, so many options.
I will start by trying the curl/tail approach and only processing the changes; if that fails, I will try AlaSQL.
Thank you!
