Using Node-RED to transfer data from InfluxDB 1.7 to 2.3, but receiving ENOBUFS error when dataset is too large

I am using Node-RED to convert my InfluxDB 1.7 data to populate InfluxDB 2.3 using the attached flow.
Influx1.7 to 2.3.json (2.7 KB)

In both databases, the frequency is 1 reading per minute. If I run the flow and limit the query (to retrieve the data from 1.7) for, say, a 2-hour window, it works perfectly:

Influx 1.7 (viewed in Grafana)

Influx 2.3 (viewed in Grafana)

However, when I widen the query to a full month's worth of data, I get the following error after a minute or so (and only a portion of the data appears in 2.3).

[image: screenshot of the ENOBUFS error]

I Googled the ENOBUFS error and found some discussions & solutions, notably to throttle the data flow to get the job done (apparently my system's memory can no longer handle the buffer's size). I put a 5-second delay node where the blue circle is below, but the ENOBUFS error still happens and only part of the data makes the jump to 2.3.

Does it make sense that the ENOBUFS error is happening when writing the data to InfluxDB 2.0? Maybe a delay node is not the right solution?

For future reference, when posting a flow, paste it directly here, using the </> button at the top of the forum entry window.

I don't see a delay node in the flow.

As I understand it, for a month's worth of data, you are fetching 43200 values and then sending all of them flat out, one after another, to the influx out node. I guess that is overloading something somewhere. You would be much better off writing them out in one go using the influx batch node. That is what it is for.

Does influx not provide tools or techniques for migrating to 2.0?

Answering my own question: Upgrade from InfluxDB 1.x to 2.3 | InfluxDB OSS 2.3 Documentation


Thanks @Colin. When I uploaded my flow as an attachment (instead of pasting directly), I thought "that's different", and of course now I realize I did not follow the correct protocol. Anyway, I elected not to use the tools that you linked to because I am actually massaging my data considerably (adding / removing tags, changing field names, etc.) and I find function nodes much easier to work with for that. The flow that I posted did not include the delay node (I tried that, but it did not make any difference). And you are correct, 43300 values for about 1 month's worth of data.

I will try the InfluxDB batch node, which admittedly I have never used. Will report back here with some results.

OK, tried the batch node using this flow

[{"id":"262a3923.e7b216","type":"influxdb in","z":"e34b3a780748064f","influxdb":"eeb221fb.ab27f","name":"","query":"SELECT last(\"HTFN1Z1T\") FROM \"stations\" WHERE time >= 1609480800000ms and time <= 1610773199000ms GROUP BY time(1m) fill(null)","rawOutput":false,"precision":"ms","retentionPolicy":"","org":"my-org","x":340,"y":280,"wires":[["467fc95d165f998e"]]},{"id":"803d82f.ff80f8","type":"inject","z":"e34b3a780748064f","name":"","repeat":"","crontab":"","once":false,"onceDelay":0.1,"topic":"","payload":"","payloadType":"date","x":240,"y":200,"wires":[["262a3923.e7b216"]]},{"id":"467fc95d165f998e","type":"split","z":"e34b3a780748064f","name":"","splt":"\\n","spltType":"str","arraySplt":1,"arraySpltType":"len","stream":false,"addname":"","x":450,"y":360,"wires":[["14f2646498b1cb03"]]},{"id":"14f2646498b1cb03","type":"function","z":"e34b3a780748064f","name":"","func":"msg.payload = [\n    {\n    measurement:\"SingleRotaryFurnaceZoneData\",\n    fields: {\n        temperature: msg.payload.last\n    },\n    tags:{\n        MeasType:\"actual\",\n        EquipNumber:\"1\",\n        EquipZone:\"zone1\"\n    },\n    timestamp: msg.payload.time\n    }\n];\nreturn msg;","outputs":1,"noerr":0,"initialize":"","finalize":"","libs":[],"x":580,"y":480,"wires":[["31fda5cb865c09ab"]]},{"id":"31fda5cb865c09ab","type":"influxdb batch","z":"e34b3a780748064f","influxdb":"e55f0c8e.1659f","precision":"","retentionPolicy":"","name":"","database":"database","precisionV18FluxV20":"ms","retentionPolicyV18Flux":"","org":"heattreat","bucket":"junkbucket","x":830,"y":520,"wires":[]},{"id":"eeb221fb.ab27f","type":"influxdb","hostname":"172.31.19.130","port":"8086","protocol":"http","database":"DST","name":"test","usetls":false,"tls":"d50d0c9f.31e858","influxdbVersion":"1.x","url":"http://localhost:8086","rejectUnauthorized":true},{"id":"e55f0c8e.1659f","type":"influxdb","hostname":"3.17.31.71","port":"8086","protocol":"http","database":"DST","name":"","usetls":false,"tls":"","influxdbVersion":"2.0","url":"http://192.168.10.25:8086","rejectUnauthorized":false},{"id":"d50d0c9f.31e858","type":"tls-config","name":"","cert":"","key":"","ca":"","certname":"","keyname":"","caname":"","servername":"","verifyservercert":false}]

and got the same ENOBUFS error. I changed the timespan down to 15 days (21600 values) and got the same error. Moved it down to 5 days (7200 values) and it worked.

I think I understand the concept that I am sending too many values at a time. Perhaps a loop or something that sends 5 days at a time?

You are still writing them one at a time. You can remove the split node and convert the whole input array into the form required by the batch node and output the whole set of data in one write operation. I don't know if there is a limit to how many you can write in one operation.
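If a single write does turn out to be too large, one option is to split the converted array into chunks and send each chunk to the batch node as its own message. The following is only a minimal sketch: `chunkPoints` is a hypothetical helper name, and the chunk size of 5 is just for the demonstration, not a known limit of the node or of InfluxDB.

```javascript
// Hypothetical helper: split an array of points into fixed-size chunks.
function chunkPoints(points, chunkSize) {
    if (!Number.isInteger(chunkSize) || chunkSize < 1) {
        throw new RangeError("chunkSize must be a positive integer");
    }
    const chunks = [];
    for (let i = 0; i < points.length; i += chunkSize) {
        chunks.push(points.slice(i, i + chunkSize));
    }
    return chunks;
}

// Inside a Node-RED function node this could be used roughly as follows
// (the chunk size of 5000 is an arbitrary assumption):
//   const messages = chunkPoints(msg.payload, 5000).map(chunk => ({ payload: chunk }));
//   return [messages]; // an array nested in the output array sends the messages in sequence

// Standalone demonstration with made-up sample points
const sample = Array.from({ length: 12 }, (_, i) => ({ time: i * 60000, last: 20 + i }));
const chunks = chunkPoints(sample, 5);
console.log(chunks.map(c => c.length)); // [ 5, 5, 2 ]
```

Each chunk then becomes one write operation, so the number of writes drops from one per value to one per chunk.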

I had not realized I was writing the values one at a time until you pointed that out. Thanks.

I removed the split node and started by first examining the output from the Influx 1.7 query. I can see it is an array with 21540 objects, with each object being:

payload[*].time is the timestamp of the value
and
payload[*].last is the value
(pardon the use of the *...it is meant to indicate all of the 21540 values in the array)
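As a sanity check on that array length, the number of one-minute buckets can be worked out from the time window in the query. This is a rough sketch that assumes the window is the one in the flow posted above and that GROUP BY time(1m) aligns buckets to whole minutes since the epoch; `countBuckets` is a hypothetical helper name.

```javascript
// Window bounds copied from the InfluxQL query in the posted flow (milliseconds since epoch)
const startMs = 1609480800000;
const endMs = 1610773199000;
const bucketMs = 60 * 1000; // GROUP BY time(1m)

// Hypothetical helper: count the buckets that a time range touches,
// assuming buckets are aligned to multiples of the bucket size since the epoch
function countBuckets(start, end, bucket) {
    const first = Math.floor(start / bucket);
    const last = Math.floor(end / bucket);
    return last - first + 1;
}

console.log(countBuckets(startMs, endMs, bucketMs)); // 21540
```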

And the Influx batch node format requires me to simply insert these two values here:
[image: screenshot of the batch node message format]

How does one convert this whole array so that the function node can take the values from payload.last and payload.time? I tried some JSONata expressions (in a change node, before the function node) but nothing worked.

Have a look at Array.map() which will transform a complete array, given a function that is applied to each element.

Thank you @Colin. I needed some more time to think about this, and I got it all working well using a single function node containing this:

msg.payload = msg.payload.map(function (value) {
    // Convert one query row ({ time, last }) into the point format the batch node expects
    return {
        measurement: "SingleRotaryFurnaceZoneData",
        fields: {
            temperature: value.last
        },
        tags: {
            MeasType: "actual",
            EquipNumber: "1",
            EquipZone: "zone1"
        },
        timestamp: value.time
    };
});
return msg;
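One possible refinement: the query in the posted flow uses fill(null), so any minute without a reading comes back with last set to null. The thread does not say how the batch node treats such rows, so the following is only a sketch under the assumption that they should be skipped rather than written. `toPoints` is a hypothetical helper name and the sample rows are made up.

```javascript
// Hypothetical helper: build batch-node points from query rows, skipping empty minutes
function toPoints(rows) {
    return rows
        .filter(row => row.last !== null && row.last !== undefined)
        .map(row => ({
            measurement: "SingleRotaryFurnaceZoneData",
            fields: { temperature: row.last },
            tags: { MeasType: "actual", EquipNumber: "1", EquipZone: "zone1" },
            timestamp: row.time
        }));
}

// Made-up sample rows in the { time, last } shape returned by the query
const rows = [
    { time: 1609480800000, last: 871.2 },
    { time: 1609480860000, last: null },
    { time: 1609480920000, last: 872.0 }
];
const points = toPoints(rows);
console.log(points.length); // 2
```

In a function node this would be used as msg.payload = toPoints(msg.payload); followed by return msg;.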

An array with 378,000 objects (about 9.5 months of data, with 1 reading per minute) took less than 10 seconds to import into Influx 2.3 using the batch node.


Excellent.
