How to read a file line by line

Hello,

I am working on a data logger using the Node-RED programming tool.

Project scenario:
Process 1:
Read data from a serial node -> store the parameters in a CSV-formatted file.
Process 2:
Read data from the CSV-formatted file and push it to the cloud using the MQTT node.

Here I am following a blob-like folder structure to store the data in CSV files (Year/Month/Day/Filename.csv).
For example: 2020/July/12/2020July12.csv.
The data stored in the file per day is at least 512 KB.

I have now successfully completed Process 1, storing the data in the file the way I wanted.

But when it comes to reading, the file-in node reads the complete CSV file, loads it into memory, and sends it out line by line in a single shot.

Assume I am storing data every 30 seconds and sending 5 records every hour.
Here the storing and retrieving processes are completely independent.
Sometimes, due to a poor network, I may not be able to push data to the server for 3 or 4 days.

I can't use the Watch node or a Delay node (rate limiting) for my scenario.

I need a node that reads the file line by line, advancing one step on every inject.

The file-in node has an option to output a line at a time. Does that not work?

Thanks for your response.

It doesn't work that way. On a single inject it reads the complete file and sends all the lines to the debug node.
Is reading the whole file into RAM to process the records really the efficient way?

In Python, we are able to read a specific number of lines from a text file.

Assume I have 100 lines of data in a file.
I inject the flow every minute.

On my first inject, I read the first 5 lines (1, 2, 3, 4 and 5), publish the data to the server, and wait for an acknowledgement.
If I get the ack, I store the line number 6 in context memory (on the next inject I have to start reading from line 6).

On my second inject, I read another 5 lines, from line 6 up to line 10, publish the data to the server, and wait for an acknowledgement.

If I do not get the acknowledgement, then I reload the same lines of data on my third inject. Either way, I keep the line number I am currently reading (i.e. the read pointer) updated in context memory, along the lines of the sketch below.
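Something like this in a function node (rough sketch - it assumes the file-in node, in single-message mode, has put the whole file into msg.payload as a string, and `readPointer` is just my name for the context variable):

```javascript
// Send the next batch of 5 lines on each inject, tracking a read
// pointer in flow context so the next inject carries on from there.
const BATCH_SIZE = 5;

const lines = msg.payload.split("\n").filter(l => l.length > 0);
const pointer = flow.get("readPointer") || 0;   // next line to read (0-based)

const batch = lines.slice(pointer, pointer + BATCH_SIZE);
if (batch.length === 0) {
    return null;                                // nothing left to send
}

msg.payload = batch;
msg.nextPointer = pointer + batch.length;       // carried along for the ack path
return msg;
```

A second function node on the acknowledgement path would then do `flow.set("readPointer", msg.nextPointer)`, so the pointer only advances once the server has acked the batch.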

But with this scheme, every inject still loads the whole file.

What will happen if the file size gets too large?

I am looking forward to hearing your advice.

So - the file-in node's read-by-line mode does work in streaming mode, so it shouldn't read the whole file into memory in one go - but yes, it does only read the whole file at a time. So you could do something similar to what you want, but you would indeed be re-reading the whole file each time. You may be better off using the Tail node, which performs a file-tail function and only sends the last added lines as they are added. You could then just hold those relatively few lines in local context memory until you have sent them - or maybe use one of the queue nodes to hold them until processed.
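For example, something like this pair of function nodes (rough, untested sketch - `lineBuffer` is just an illustrative name):

Function node wired after the Tail node:

```javascript
// Buffer each incoming line in flow context until it has been sent.
const buffer = flow.get("lineBuffer") || [];
buffer.push(msg.payload);          // Tail sends one line per message
flow.set("lineBuffer", buffer);
return null;                       // don't forward anything yet
```

Function node wired after an inject node, feeding the MQTT out node:

```javascript
// Take a few buffered lines off the front of the queue on each inject.
const pending = flow.get("lineBuffer") || [];
if (pending.length === 0) return null;
msg.payload = pending.splice(0, 5);   // oldest 5 lines
flow.set("lineBuffer", pending);
return msg;
```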


I have tried the above method to filter specific lines from the file with the help of the index property, using a switch node.

As you said, this would be re-reading the whole file each time.

Instead, let me try to solve this scenario with the help of a queue node.

Here is what I am going to do:

Assume I have 1000 lines of data and I am going to send 5 lines every minute.

On my first inject, I will load the first 100 lines of data into the queue, using a switch node to filter on the index (this time 0 to 100), so there will only be 100 records in the queue at once.
After the queue becomes empty, I will load another 100 lines the same way (this time 101 to 200), and I will repeat the process for the rest.

Queue package: node-red-contrib-simple-message-queue
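The switch-node test would amount to something like this as a function node (rough sketch - it relies on the file-in node's "a msg per line" mode setting msg.parts.index on each message, and `windowStart` is my own context name):

```javascript
// Pass through only the lines whose index falls inside the current
// 100-line window; drop everything else.
const WINDOW = 100;
const start = flow.get("windowStart") || 0;

if (msg.parts.index >= start && msg.parts.index < start + WINDOW) {
    return msg;     // inside the window -> on to the queue
}
return null;        // outside the window -> drop

// When the queue reports empty, another node advances the window:
// flow.set("windowStart", flow.get("windowStart") + WINDOW);
```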

Would that be an efficient way to read the records?

Read data from a serial node -> store the parameters in a CSV-formatted file.

Why don't you write it to a database instead? It is easier to query afterwards.


I am working with embedded Linux devices, which have limited memory and a constrained environment.

In my view, using a DB is not recommended or efficient for embedded applications, so that is why I am trying to implement file-based data logging.

Please correct me if I am wrong.

SQLite was made for embedded devices.

The way I would do this is to use a FIFO buffer in file-backed global context memory to buffer the data prior to sending it to your server. No need to mess about formatting, writing, and reading files.
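For example (rough, untested sketch - `sendQueue` is an illustrative name, and it assumes a persistent store called `file` is configured under contextStorage in settings.js):

Function node on the logging side:

```javascript
// Append each new record to a FIFO held in file-backed global context.
const queue = global.get("sendQueue", "file") || [];
queue.push(msg.payload);
global.set("sendQueue", queue, "file");
return null;
```

Function node on the sending side:

```javascript
// On each trigger, take the oldest record off the front of the FIFO.
const queue = global.get("sendQueue", "file") || [];
if (queue.length === 0) return null;
msg.payload = queue.shift();
global.set("sendQueue", queue, "file");
return msg;
```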

Or maybe even better use node-red-contrib-simple-gate in persistent mode to queue the messages until they can be sent to the server.

[Edit] This is a flow that uses that technique for storing email messages and sending them to the email server when it becomes available.
https://flows.nodered.org/flow/05e6d61f14ef6af763ec4cfd1049ab61


Thank you all for your valuable responses.

I have implemented the data logger using SQLite.

It works perfectly, just the way I wanted.
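In case it helps anyone, the general idea is along these lines (simplified, illustrative sketch - it assumes node-red-node-sqlite, which executes the SQL it finds in msg.topic, and the schema here is just an example):

```javascript
// Function node feeding the sqlite node: log one record.
// The table was created beforehand with something like:
//   CREATE TABLE log (id INTEGER PRIMARY KEY, ts INTEGER, data TEXT, sent INTEGER DEFAULT 0);
msg.topic = "INSERT INTO log (ts, data) VALUES ("
          + Date.now() + ", '"
          + String(msg.payload).replace(/'/g, "''") + "')";
return msg;

// A second function node fetches the oldest unsent rows for publishing:
//   msg.topic = "SELECT id, data FROM log WHERE sent = 0 ORDER BY id LIMIT 5";
// and after a successful publish, a third marks them as sent:
//   msg.topic = "UPDATE log SET sent = 1 WHERE id <= <last sent id>";
```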

But I have one question:

In MQTT, I have selected QoS 1.
After publishing data to the server, how do I confirm that my message was successfully delivered?

Note:
In MQTT, when publishing packets the server returns a PUBACK to the client. Likewise, how does Node-RED notify us after getting the ack from the server?

Looking forward to hearing your advice...

That's a good question.

You may be able to use a Status node pointed at the MQTT out node (not certain though - I've never visually witnessed a failed QoS 1/2 publish).

Perhaps you might want to raise a feature request on the forum - see if it gets any interest?

It would be wise to consider some thoughts on how it would function before raising your feature request.

Some food for thought...

  • Add an output pin to the MQTT out node?
  • QoS 0 - send the payload immediately after SEND?
    • What value / status info would you hope or expect to see in the payload?
  • QoS 1 - send the payload after PUBACK, or RAISE ERROR* if it times out / fails?
    • What value / status info would you hope or expect to see in the payload?
  • QoS 2 - send the payload after PUBREC or PUBCOMP (or both?), or RAISE ERROR* if it times out / fails?
    • What value / status info would you hope or expect to see in the payload?
* I am not certain here, but an error may already be raised for a failed QoS 1/2 publish

(This needs more discussion and thought by people with a better understanding of the MQTT handshaking than me - but I think it's a fair request?)

Add an mqtt-in node subscribing to the topic you send out.

Hey Paul, while what you are saying is certainly a very reasonable workaround (and will likely satisfy 99% of requirements), it doesn't 100% confirm that it was your MQTT pub (or your Node-RED msg) that published it (not true QoS).

If you want true QoS (in the MQTT definition of QoS), the best answer would be for the client library making the pub call to return to the caller after the PUBACK.

And anyhow, there's nothing quite as good or useful as attaching a node to an output (in a serial wiring fashion) to guarantee (in Node-RED terms) that it was your message travelling down the wire (so any extra properties in the msg object get passed through too).


Status is generally meant to indicate things like the state of the connection, not individual messages, especially when it can handle thousands of msgs a second (plus the editor is not a dashboard, etc.).

You can use a Complete node linked to the MQTT node to show when it has completed its action on each message. A quick test suggests that it doesn't signal completion if it is unable to send the message to the broker.


Nice one. I've never used the Complete node. Does the msg contain the contents you sent into the MQTT out node?

I never realised that feature/node had been added!

@knolleary @dceejay Great feature 🙂

Great!! That works fine. 🤝

Suggestion: it would be great if this were mentioned in the MQTT node's help text.

Is there any way I can check whether the MQTT connection exists before pushing the data to the server?

As previously mentioned, the status is there to show the state of the connection - so a Status node will report that. For example:
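Something along these lines (untested sketch - `mqttConnected` is just a name I've picked; the Status node passes the watched node's fill/shape/text in msg.status, and checking the fill colour is the simplest test):

Function node fed by a Status node watching the MQTT nodes:

```javascript
// Record the broker connection state in flow context.
flow.set("mqttConnected", msg.status.fill === "green");
return null;
```

Guard function node in front of the MQTT out node:

```javascript
// Only pass messages on while the broker is connected.
if (flow.get("mqttConnected")) return msg;
return null;    // otherwise drop (or requeue) the message
```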