InfluxDB query returning no results

This is a question of how to properly construct an InfluxDB query which is out of scope of NR. If there are no experts here I will submit to the InfluxDB forum instead. I generally find this forum awesome and figured I'd try here first.

I have a weather station (Acurite 5in1) sending data wirelessly to a RPi with USB SDR attached. That RPi uses rtl_433 to decode the radio traffic and send the raw data via MQTT. I use NR to convert the raw sensor values to usable data and store that in InfluxDB v1.8. All of that seems to be working as expected without issue.

Next, I want to submit a selection of the weather data to pwsweather.com so I can utilize my local rain data for a smart sprinkler. I have a flow that does what I need to parse the data and build the URL which gets submitted to PWS at 10 minute intervals. I have confirmed with PWS that they are seeing the incoming data but there is no rain info other than "rainin=0" for every submission. I just found that the reason appears to be a bad query, but I don't know why.

Influxdb rain incremental data:

select * from "rtl_433/Acurite-5n1/A/rain_increment_in" ORDER BY time DESC LIMIT 10
name: rtl_433/Acurite-5n1/A/rain_increment_in
time                           value
----                           -----
2021-08-17T09:29:00.735789174Z 0.00999999999999801
2021-08-17T07:57:48.778641909Z 0.00999999999999801
2021-08-08T11:25:35.380248655Z 0.010000000000005116
2021-08-08T11:18:59.57464481Z  0.00999999999999801
2021-08-08T11:09:23.431049542Z 0.00999999999999801
2021-08-08T10:47:47.354700108Z 0.010000000000005116
2021-08-08T09:01:35.735475143Z 0.00999999999999801
2021-08-07T14:35:45.499019773Z 0.00999999999999801
2021-08-07T14:22:33.327031983Z 0.00999999999999801
2021-08-07T14:14:45.155771589Z 0.010000000000005116

You can see there are 2 rain events this morning at 9:29Z and 7:57Z. And my query in the flow is:

select sum(*) from "rtl_433/Acurite-5n1/A/rain_increment_in" GROUP BY time(10m) ORDER BY time DESC LIMIT 1

But this doesn't seem to return anything. Since there hasn't been rain since 2am, I'm modifying the query to get some data.

select sum(*) from "rtl_433/Acurite-5n1/A/rain_increment_in" GROUP BY time(10m) ORDER BY time DESC LIMIT 1
name: rtl_433/Acurite-5n1/A/rain_increment_in
time                 sum_value
----                 ---------
2021-08-17T16:10:00Z
select sum(*) from "rtl_433/Acurite-5n1/A/rain_increment_in" GROUP BY time(14h) ORDER BY time DESC LIMIT 1
name: rtl_433/Acurite-5n1/A/rain_increment_in
time                 sum_value
----                 ---------
2021-08-17T06:00:00Z 0.01999999999999602

Anything less than 14h returns no data. But, it supposedly rained at 2:29am local time, so I don't know why it's not returning data when I ask for less than 14h? It should return data if I use 8 hours since 2:29am to 9:17am is around 7 hours.

Any suggestions? If this is too detailed in the InfluxDB world I'll understand and try asking over there. Thanks in advance!

Increase the Limit number so that you see a number of samples, that may help you to work out what is going on.

Ok, I'll give that a try. Thanks!

> select sum(*) from "rtl_433/Acurite-5n1/A/rain_increment_in" GROUP BY time(10m) ORDER BY time DESC LIMIT 100
name: rtl_433/Acurite-5n1/A/rain_increment_in
time                 sum_value
----                 ---------
2021-08-17T19:20:00Z
2021-08-17T19:10:00Z
2021-08-17T19:00:00Z
2021-08-17T18:50:00Z
2021-08-17T18:40:00Z
2021-08-17T18:30:00Z
2021-08-17T18:20:00Z
2021-08-17T18:10:00Z
2021-08-17T18:00:00Z
2021-08-17T17:50:00Z
2021-08-17T17:40:00Z
2021-08-17T17:30:00Z
2021-08-17T17:20:00Z
2021-08-17T17:10:00Z
2021-08-17T17:00:00Z
2021-08-17T16:50:00Z
2021-08-17T16:40:00Z
2021-08-17T16:30:00Z
2021-08-17T16:20:00Z
2021-08-17T16:10:00Z
2021-08-17T16:00:00Z
2021-08-17T15:50:00Z
2021-08-17T15:40:00Z
2021-08-17T15:30:00Z
2021-08-17T15:20:00Z
2021-08-17T15:10:00Z
2021-08-17T15:00:00Z
2021-08-17T14:50:00Z
2021-08-17T14:40:00Z
2021-08-17T14:30:00Z
2021-08-17T14:20:00Z
2021-08-17T14:10:00Z
2021-08-17T14:00:00Z
2021-08-17T13:50:00Z
2021-08-17T13:40:00Z
2021-08-17T13:30:00Z
2021-08-17T13:20:00Z
2021-08-17T13:10:00Z
2021-08-17T13:00:00Z
2021-08-17T12:50:00Z
2021-08-17T12:40:00Z
2021-08-17T12:30:00Z
2021-08-17T12:20:00Z
2021-08-17T12:10:00Z
2021-08-17T12:00:00Z
2021-08-17T11:50:00Z
2021-08-17T11:40:00Z
2021-08-17T11:30:00Z
2021-08-17T11:20:00Z
2021-08-17T11:10:00Z
2021-08-17T11:00:00Z
2021-08-17T10:50:00Z
2021-08-17T10:40:00Z
2021-08-17T10:30:00Z
2021-08-17T10:20:00Z
2021-08-17T10:10:00Z
2021-08-17T10:00:00Z
2021-08-17T09:50:00Z
2021-08-17T09:40:00Z
2021-08-17T09:30:00Z
2021-08-17T09:20:00Z 0.00999999999999801
2021-08-17T09:10:00Z
2021-08-17T09:00:00Z
2021-08-17T08:50:00Z
2021-08-17T08:40:00Z
2021-08-17T08:30:00Z
2021-08-17T08:20:00Z
2021-08-17T08:10:00Z
2021-08-17T08:00:00Z
2021-08-17T07:50:00Z 0.00999999999999801
2021-08-17T07:40:00Z
2021-08-17T07:30:00Z
2021-08-17T07:20:00Z
2021-08-17T07:10:00Z
2021-08-17T07:00:00Z
2021-08-17T06:50:00Z
2021-08-17T06:40:00Z
2021-08-17T06:30:00Z
2021-08-17T06:20:00Z
2021-08-17T06:10:00Z
2021-08-17T06:00:00Z
2021-08-17T05:50:00Z
2021-08-17T05:40:00Z
2021-08-17T05:30:00Z
2021-08-17T05:20:00Z
2021-08-17T05:10:00Z
2021-08-17T05:00:00Z
2021-08-17T04:50:00Z
2021-08-17T04:40:00Z
2021-08-17T04:30:00Z
2021-08-17T04:20:00Z
2021-08-17T04:10:00Z
2021-08-17T04:00:00Z
2021-08-17T03:50:00Z
2021-08-17T03:40:00Z
2021-08-17T03:30:00Z
2021-08-17T03:20:00Z
2021-08-17T03:10:00Z
2021-08-17T03:00:00Z
2021-08-17T02:50:00Z

So it's definitely pulling a value. When I use the same command to pull the last 24h of data it seems to grab the last item with 0.02" of rain correctly.

> select sum(*) from "rtl_433/Acurite-5n1/A/rain_increment_in" GROUP BY time(24h) ORDER BY time DESC LIMIT 3
name: rtl_433/Acurite-5n1/A/rain_increment_in
time                 sum_value
----                 ---------
2021-08-17T00:00:00Z 0.01999999999999602
2021-08-16T00:00:00Z
2021-08-15T00:00:00Z
> select sum(*) from "rtl_433/Acurite-5n1/A/rain_increment_in" GROUP BY time(24h) ORDER BY time DESC LIMIT 1
name: rtl_433/Acurite-5n1/A/rain_increment_in
time                 sum_value
----                 ---------
2021-08-17T00:00:00Z 0.01999999999999602

I didn't post earlier because I believe this is an InfluxDB query issue (and I still do), but here is the contents of the function node that uses the output of that query in my NR flow:

if (msg.payload[0].sum_value==null)
{
    msg.payload = 0;
} else {
    msg.payload = msg.payload[0].sum_value;
}

//test rainin
//msg.payload = 0.01;

flow.set("rainin",parseFloat(msg.payload));
return msg;

If I uncomment the test rainin line, then PWS gets a value as expected. If there is rain that should be seen, it still seems to set the payload to 0 which suggests that in both cases (rain and no rain) the if statement evaluates to true meaning msg.payload[0].sum_value must always be seen as null. I have the same logic for the 24h period and it seems to return a non-null value correctly.

I just figured out how to output the value of the payload to the debug sidebar so here's what I found:

node.warn("sum_value = " + msg.payload[0].sum_value);

It should output 0.02" if correct.

8/17/2021, 12:49:59 PMnode: d5967a4a.59e3d8function : (warn)
"sum_value = 0.01999999999999602"

Seems legit. So for the 10m value next.

8/17/2021, 12:51:38 PMnode: bf639aa0.eafcfunction : (warn)
"10m sum_value = null"

And we haven't had rain in the last few hours so that's also correct. If I modify the query to pull the last 100 values, object 72 contains a rain measurement. If I direct the warn to print the value of that object I get data.

8/17/2021, 12:56:17 PMnode: bf639aa0.eafcfunction : (warn)
"10m sum_value = 0.00999999999999801"

So for some reason it appears that when I ask NR to pull the data for a single item with the 10m grouping it's not working and I'm not sure why.

I created a simple flow to test my code. I switched to seconds to reduce needless waiting time and added a test measurement so I won't mess up my real data.

[{"id":"a9a303e4.644e","type":"inject","z":"1e343ce7.83c7ab","name":"debug","props":[{"p":"payload"}],"repeat":"","crontab":"","once":false,"onceDelay":"","topic":"","payload":"","payloadType":"date","x":190,"y":860,"wires":[["ab156073.d13498"]]},{"id":"ab156073.d13498","type":"influxdb in","z":"1e343ce7.83c7ab","influxdb":"5b91dbad.94ba7c","name":"query rainfall","query":"select sum(*) from \"rtl_433/Acurite-5n1/A/rain_increment_in_test\" GROUP BY time(10s) ORDER BY time DESC LIMIT 1","rawOutput":false,"precision":"","retentionPolicy":"","org":"organisation","x":370,"y":860,"wires":[["f7058ebc.67bd28","a26e5a15.7457f"]]},{"id":"f7058ebc.67bd28","type":"function","z":"1e343ce7.83c7ab","name":"","func":"node.warn(\"10m sum_value = \" + msg.payload[0].sum_value);\n\nif (msg.payload[0].sum_value==null)\n{\n    node.warn(\"null\");\n    msg.payload = 0;\n} else {\n    node.warn(\"not-null\");\n    msg.payload = msg.payload[0].sum_value;\n}\n\nnode.warn(\"rainin = \" + msg.payload);\nreturn msg;\n","outputs":1,"noerr":0,"initialize":"","finalize":"","x":620,"y":860,"wires":[["f72d9225.cb45a"]]},{"id":"f72d9225.cb45a","type":"debug","z":"1e343ce7.83c7ab","name":"query rainfall","active":true,"tosidebar":true,"console":false,"tostatus":false,"complete":"payload","targetType":"msg","statusVal":"","statusType":"auto","x":770,"y":860,"wires":[]},{"id":"a26e5a15.7457f","type":"debug","z":"1e343ce7.83c7ab","name":"","active":true,"tosidebar":true,"console":false,"tostatus":false,"complete":"true","targetType":"full","statusVal":"","statusType":"auto","x":610,"y":820,"wires":[]},{"id":"7a2883b.63827fc","type":"influxdb out","z":"1e343ce7.83c7ab","influxdb":"5b91dbad.94ba7c","name":"rain increment test","measurement":"rtl_433/Acurite-5n1/A/rain_increment_in_test","precision":"","retentionPolicy":"","database":"sensors","precisionV18FluxV20":"ms","retentionPolicyV18Flux":"","org":"organisation","bucket":"bucket","x":350,"y":780,"wires":[]},{"id":"a1dd1f81.9f8038","type":"inject","z":"1e343ce7.83c7ab","name":"","props":[{"p":"payload"}],"repeat":"","crontab":"","once":false,"onceDelay":0.1,"topic":"","payload":"0.01","payloadType":"num","x":180,"y":780,"wires":[["7a2883b.63827fc"]]},{"id":"5b91dbad.94ba7c","type":"influxdb","hostname":"127.0.0.1","port":"8086","protocol":"http","database":"sensors","name":"influxdb","usetls":false,"tls":"","influxdbVersion":"1.x","url":"http://localhost:8086","rejectUnauthorized":true}]

When I use this, it is kind of works. If I inject a single value and run the query it will usually output null. If I inject several in a row and then run the query it seems to sometimes output the correct value but after a few seconds revert back to null.

I switched the query back to 10 minutes and ran it again after injecting test values for the last few minutes. I think the issue may have to do with how it breaks up the readings by time boxes. Each group is divided at even start intervals (12:00, 12:10, 12:20, 12:30, etc). I assumed that it would be dividing by (current time-10m) for this request. If not, then it the first value requested could be (current time-10m) or (12:02 to 12:00) which is actually 2 minutes. If the time I'm running the query is just after the divider time, is it possible I'm querying the last 10 seconds of data (12:00:10 to 12:00:00) instead of what I think I'm asking for (12:00 to 11:50)? Thoughts and ways to fix this if I'm correct?

select sum(*) from "rtl_433/Acurite-5n1/A/rain_increment_in_test" GROUP BY time(10s) ORDER BY time DESC LIMIT 1

Basic GROUP BY time() syntax

Syntax

SELECT <function>(<field_key>) FROM_clause WHERE <time_range> GROUP BY time(<time_interval>),[tag_key] [fill(<fill_option>)]

Basic GROUP BY time() queries require an InfluxQL function in the SELECT clause and a time range in the WHERE clause. Note that the GROUP BY clause must come after the WHERE clause.

time(time_interval)

The time_interval in the GROUP BY time() clause is a duration literal. It determines how InfluxDB groups query results over time. For example, a time_interval of 5m groups query results into five-minute time groups across the time range specified in the WHERE clause.

You don't have a where clause...

I'm certainly not an InfluxDB expert, but it didn't seem like limiting the data in any way would be needed since I'm just asking for all data and then grabbing the most recent. I think your suggestion is much better considering that without it I'm telling InfluxDB to return all data and then throw out everything but the 1 element. I don't know how it works internally, but if that's the case then that's not particularly efficient.

I added

WHERE time > now() - 7d

so I now have

select sum(*) from "rtl_433/Acurite-5n1/A/rain_increment_in_test" WHERE time > now() - 1d GROUP BY time(10m) ORDER BY time DESC LIMIT 10

I requested 10 so I could more easily debug. Right now it does appear this is working but I'll have to test a little more to see if it was a fluke or if the WHERE field was the key.

I spoke too soon. I am sure now that the issue is just the time period that I'm defining isn't right. I set 30 second intervals and injected a value at 14 seconds. Between 0 and 30 seconds of the current minute it showed the injected value. After the current time passed 30 seconds it went back to a null value. I believe I need to query in such a way that I'm asking for the now()-10m and right now I'm asking for something different.

I'll look at the InfluxDB guide more but if anyone has a link to study that would be great. Thanks for the links provided so far!

You are right about the group by interval. If you say group by 1 day then it groups on each day, not by 24 hour periods back from the current time. I don't know how to do what you want in influx, but you could fetch them all for the period and sum then in node red.

I don't actually need to query exactly how I am. What I need is a way to submit the data to PWS and know that I'm not missing anything. If I were to change the NR flow so that it polled at 1 minute after the gap (1, 11, 21, 31, etc) and then request LIMIT 2 and select element 1 of the array, I'm thinking that would get me what I'm looking for, too. Does that appear to be a strategy that would both work and not miss data?

I think the timestamp you get will be that of the start of the period. Check that and see if it is correct. If so then you could use limit 2, in order to send the penultimate data, but also keep a record in context of the timestamp of the most recent set you have sent. Then only send if you have data for a new period.
Alternatively just send it anyway, PWS should hopefully not care if you send the same data again.

PWS has no idea what the real data is obviously. They can take the data at any rate I provide, I'm just arbitrarily choosing 10 minutes. If I send the data as a repeat, PWS will see the same data and assume both are needed as increments and double the rain total though. The goal here isn't to be super accurate, it's just to provide the amount of rain so my sprinkler won't run if it's been a downpour. Right now, it uses a local station that I don't trust since 5 miles from my house could have had rain while my place is dry. Just yesterday the sprinkler had a water drop displayed but it was sunny and clear making "smart watering" kinda dumb.

I can use any period and any report interval I choose, but I'm not clear how to send something that's meaningful with the way Influx sums the data. What alternative method could I use since it appears this is unreliable? You mentioned grabbing all data and summing in NR but I'm not familiar with that approach. I assume grabbing the data would be something like:

select * from "rtl_433/Acurite-5n1/A/rain_increment_in_test" WHERE time > now() - 10m ORDER BY time DESC

The best I've found to sum the data would be the "reduce" function or building my own recursive loop. I didn't see a contributed flow that seemed to work out of the box for some reason though summarizer seemed close. On the up side, the output from the query should work.

In that case keep track of the timestamp for the latest data you have already send.

However, why don't you just send the data every time you write anything to influx, rather than trying to read it back again?

Exactly.

My weather station broadcasts in ~30 second intervals but when rtl_433 processes and transmits via MQTT it's in multiple packets. So in order to trigger off weather data, I would have to do something a little more robust than just trigger off of a single packet. I was going to use 10 minute intervals because it seemed reasonable.

I was originally going to keep track of the data in NR and then transmit that at intervals to PWS until I realized that's kinda complicated. It's also unnecessary since InfluxDB already does this for me. The problem is, InfluxDB can't feed it back to me easily. So now I'm stuck with either reinventing the wheel to keep track of data the way InfluxDB is already doing for me, or figuring out how to get InfluxDB to play nice and give me the data I actually want. This feels like a really silly problem to have!

How are you writing it to influx?

I don't see how your post is related to my suggestion to send it to the server each time you write non-zero values to influx.

I didn't understand your suggestion last week and had to think about it for a few days. I think I can rebuild the flow as you suggest pretty easily actually. I was only using the current method to aggregate the data via InfluxDB because I was concerned it would be too difficult in NR. However, the data is output by the weather station at ~30s intervals which is not too much for PWS to accept. Since I only really care about the rain and the temp is included as a bonus, I can send the data at that faster interval even if there is no rain.

I'm going to trigger off of one packet from the weather station data being received and place a delay on it. I'll have each packet write the data I care about to a variable as I'm currently doing and then fire the needed data to PWS. It will be a lot cleaner this way and I won't have to worry about some strange issues InfluxDB seems to have.

Thanks!

I don't think they are strange issues, it is just that you are trying to do things which influx doesn't do automatically. The current behaviour of group by is correct for most uses.

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.