HTTP request flow

Hi,

I'm looking for help. I have a basic flow that fetches 5 hours from a website every day at 00:00 and writes them to a times.txt file. The flow works, but something changed in the times.txt file and the hours are no longer extracted correctly. I'm wondering if something changed on the webpage, but here are the details.

Screenshot 2023-04-22 at 1.54.22 AM

[{"id":"affc24ff.283a6","type":"html","z":"514f2f91209bfa9f","name":"html select","property":"payload","outproperty":"payload","tag":"#tab-0 > div:nth-child(1) > table:nth-child(1) > tbody:nth-child(3) ","ret":"html","as":"single","x":430,"y":260,"wires":[["db1153ea.fdbd48"]]},{"id":"db1153ea.fdbd48","type":"change","z":"514f2f91209bfa9f","name":"","rules":[{"t":"change","p":"payload[0]","pt":"msg","from":"<td>","fromt":"str","to":"","tot":"str"},{"t":"change","p":"payload[0]","pt":"msg","from":"</td>","fromt":"str","to":",","tot":"str"},{"t":"change","p":"payload[0]","pt":"msg","from":"<tr>","fromt":"str","to":"","tot":"str"},{"t":"change","p":"payload[0]","pt":"msg","from":"</tr>","fromt":"str","to":";","tot":"str"},{"t":"set","p":"payload","pt":"msg","to":"payload[0]","tot":"msg"},{"t":"change","p":"payload","pt":"msg","from":"\\s\\n","fromt":"re","to":"","tot":"str"},{"t":"change","p":"payload","pt":"msg","from":"  ","fromt":"str","to":"","tot":"str"},{"t":"change","p":"payload","pt":"msg","from":"\\n","fromt":"str","to":"","tot":"str"},{"t":"change","p":"payload","pt":"msg","from":"\\n","fromt":"re","to":"","tot":"str"}],"action":"","property":"","from":"","to":"","reg":false,"x":600,"y":260,"wires":[["7c3efd76.8ed5a4"]]},{"id":"7c3efd76.8ed5a4","type":"function","z":"514f2f91209bfa9f","name":"ProcessData","func":"var a = msg.payload\nvar timeWithDate = a.split(';');\nvar timeWithDateFiltered = timeWithDate.filter(function (el) {return el;});\n\nvar out=[];\nfor(let i=0; i< timeWithDateFiltered.length;i++){\n    let times = timeWithDateFiltered[i].split(\",\");\n    let date = times[0];//date is first element in array\n    times.splice(0,1);//remove date\n    let timesFiltered = times.filter(function (el) {return el;});\n    out.push({time:timesFiltered,date:date});\n}\nmsg.payload= out;\nreturn 
msg;","outputs":1,"noerr":0,"x":770,"y":260,"wires":[["208e89e58cd3a7d7"]]},{"id":"7b121cdb.e41614","type":"inject","z":"514f2f91209bfa9f","name":"","props":[{"p":"payload"},{"p":"topic","vt":"str"}],"repeat":"14400","crontab":"","once":true,"onceDelay":0.1,"topic":"","payloadType":"date","x":110,"y":260,"wires":[["7b3aa38e.0f69c4"]]},{"id":"7b3aa38e.0f69c4","type":"http request","z":"514f2f91209bfa9f","name":"","method":"GET","ret":"txt","paytoqs":false,"url":"https://namazvakitleri.diyanet.gov.tr/en-US/9132/prayer-time-for-montreal","tls":"","proxy":"","authType":"","x":270,"y":260,"wires":[["affc24ff.283a6"]]},{"id":"74acfb0aafb8c5d8","type":"file","z":"514f2f91209bfa9f","name":"","filename":"/config/times.txt","appendNewline":true,"createDir":false,"overwriteFile":"true","encoding":"none","x":1100,"y":260,"wires":[[]]},{"id":"208e89e58cd3a7d7","type":"change","z":"514f2f91209bfa9f","name":"TodayToFile","rules":[{"t":"set","p":"payload","pt":"msg","to":"payload[0]","tot":"msg"}],"action":"","property":"","from":"","to":"","reg":false,"x":930,"y":260,"wires":[["74acfb0aafb8c5d8"]]}]

You haven't given us much to go on, so it's hard to know what you are looking for.

I assume that the times.txt file you've shared should have a time STRING in it?

Looks like the website probably changed their layout and so you are no longer extracting the correct part. Instead you have a string containing a load of tabs with a Ş character at the end.

Hi,

Yes, times.txt needs to contain a time string. It was working fine until 2-3 days ago. I think it's caused by the Hijri Date column containing the special letter Ş.
This is the URL: https://namazvakitleri.diyanet.gov.tr/en-US/9132/prayer-time-for-montreal (Presidency of Religious Affairs | Prayer Time for Montreal)

OK, so if you get the response and then use an HTML node to capture just #tab-0 > div > table > tbody, that would give you all of the table.

I tried to do a quick test but the response to a simple request is a test for human input so I can't go further unless you can share a simple flow.

Hi, I provided the flow in my first post above.

That flow does not work. The website asks for confirmation that the flow is a human - which of course, it is not.

Hmm, I have been using it for 2 years. Starting 2 days ago, the output in times.txt is not complete.

Normally it writes the 6 daily times from the table to times.txt:

{"time":["04:03","5:03"...etc],"date":"23.04.2023"}

I made some progress isolating the issue but can't go further :slight_smile:
I'm using the flow below:

[{"id":"bde0fcf25f4467fd","type":"tab","label":"Diyanet Namaz Vakitleri","disabled":false,"info":"","env":[]},{"id":"affc24ff.283a6","type":"html","z":"bde0fcf25f4467fd","name":"html select","property":"payload","outproperty":"payload","tag":"#tab-0 > div:nth-child(1) > table:nth-child(1) > tbody:nth-child(3) ","ret":"html","as":"single","x":290,"y":220,"wires":[["db1153ea.fdbd48"]]},{"id":"db1153ea.fdbd48","type":"change","z":"bde0fcf25f4467fd","name":"","rules":[{"t":"change","p":"payload[0]","pt":"msg","from":"<td>","fromt":"str","to":"","tot":"str"},{"t":"change","p":"payload[0]","pt":"msg","from":"</td>","fromt":"str","to":",","tot":"str"},{"t":"change","p":"payload[0]","pt":"msg","from":"<tr>","fromt":"str","to":"","tot":"str"},{"t":"change","p":"payload[0]","pt":"msg","from":"</tr>","fromt":"str","to":";","tot":"str"},{"t":"set","p":"payload","pt":"msg","to":"payload[0]","tot":"msg"},{"t":"change","p":"payload","pt":"msg","from":"\\s\\n","fromt":"re","to":"","tot":"str"},{"t":"change","p":"payload","pt":"msg","from":"  ","fromt":"str","to":"","tot":"str"},{"t":"change","p":"payload","pt":"msg","from":"\\n","fromt":"str","to":"","tot":"str"},{"t":"change","p":"payload","pt":"msg","from":"\\n","fromt":"re","to":"","tot":"str"}],"action":"","property":"","from":"","to":"","reg":false,"x":360,"y":300,"wires":[["7c3efd76.8ed5a4"]]},{"id":"7c3efd76.8ed5a4","type":"function","z":"bde0fcf25f4467fd","name":"ProcessData","func":"var a = msg.payload\nvar timeWithDate = a.split(';');\nvar timeWithDateFiltered = timeWithDate.filter(function (el) {return el;});\n\nvar out=[];\nfor(let i=0; i< timeWithDateFiltered.length;i++){\n    let times = timeWithDateFiltered[i].split(\",\");\n    let date = times[0];//date is first element in array\n    times.splice(0,1);//remove date\n    let timesFiltered = times.filter(function (el) {return el;});\n    out.push({time:timesFiltered,date:date});\n}\nmsg.payload= out;\nreturn 
msg;","outputs":1,"noerr":0,"initialize":"","finalize":"","libs":[],"x":430,"y":380,"wires":[["208e89e58cd3a7d7"]]},{"id":"7b121cdb.e41614","type":"inject","z":"bde0fcf25f4467fd","name":"","props":[{"p":"payload"},{"p":"topic","vt":"str"}],"repeat":"","crontab":"00 00 * * *","once":true,"onceDelay":0.1,"topic":"","payload":"","payloadType":"date","x":130,"y":60,"wires":[["7b3aa38e.0f69c4"]]},{"id":"7b3aa38e.0f69c4","type":"http request","z":"bde0fcf25f4467fd","name":"","method":"GET","ret":"txt","paytoqs":"ignore","url":"https://namazvakitleri.diyanet.gov.tr/en-US/9132/prayer-time-for-montreal","tls":"","persist":false,"proxy":"","insecureHTTPParser":false,"authType":"","senderr":false,"headers":[{"keyType":"User-Agent","keyValue":"","valueType":"other","valueValue":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:89.0) Gecko/20100101 Firefox/89.0"}],"x":210,"y":140,"wires":[["affc24ff.283a6"]]},{"id":"74acfb0aafb8c5d8","type":"file","z":"bde0fcf25f4467fd","name":"","filename":"/config/times.txt","filenameType":"str","appendNewline":true,"createDir":false,"overwriteFile":"true","encoding":"none","x":540,"y":520,"wires":[[]]},{"id":"208e89e58cd3a7d7","type":"change","z":"bde0fcf25f4467fd","name":"TodayToFile","rules":[{"t":"set","p":"payload","pt":"msg","to":"payload[0]","tot":"msg"}],"action":"","property":"","from":"","to":"","reg":false,"x":490,"y":440,"wires":[["74acfb0aafb8c5d8"]]}]

The output in times.txt only gets the two cells highlighted in red. Normally it continues to the right and picks up the 6 additional times.

I'm not an expert, but the ProcessData function node
has this code, which may be limiting the data and failing to capture the rest of the times. Then again, it was working before.

var a = msg.payload
var timeWithDate = a.split(';');
var timeWithDateFiltered = timeWithDate.filter(function (el) {return el;});

var out=[];
for(let i=0; i< timeWithDateFiltered.length;i++){
    let times = timeWithDateFiltered[i].split(",");
    let date = times[0];//date is first element in array
    times.splice(0,1);//remove date
    let timesFiltered = times.filter(function (el) {return el;});
    out.push({time:timesFiltered,date:date});
}
msg.payload= out;
return msg;
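
One way to check whether ProcessData itself caps the number of times is to run the same logic outside Node-RED on a hand-written string. The sketch below wraps the logic in a hypothetical `processData` function, and the sample input is an assumption about what the change node would produce for plain `<td>` and `<tr>` tags; it was not captured from the site.

```javascript
// Same logic as ProcessData, wrapped in a function so it can run outside Node-RED.
function processData(payload) {
    // Rows are separated by ";" and empty fragments are dropped.
    const rows = payload.split(";").filter((el) => el);
    const out = [];
    for (const row of rows) {
        const cells = row.split(",");
        const date = cells[0]; // the first field is treated as the date
        const time = cells.slice(1).filter((el) => el); // every other non-empty field
        out.push({ time, date });
    }
    return out;
}

// Hand-written sample (an assumption): two rows, each ending in ",;".
const sample =
    "23.04.2023,3 Şevval 1444,04:03,05:49,12:58,16:48,19:56,21:28,;" +
    "24.04.2023,4 Şevval 1444,04:01,05:47,12:58,16:48,19:58,21:30,;";

const result = processData(sample);
console.log(JSON.stringify(result[0]));
```

In this sketch the function keeps every non-empty field after the first, so it does not appear to limit the number of times by itself. It also shows that a Hijri date column, if present, would end up as the first entry of the time array.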

The request seems to work for me now. Using the selector #tab-0 > div > table > tbody in the HTML node returns the whole table. Changing the output of that node to only return text gives a string looking like:

    23.04.2023
    3 Şevval 1444
    04:03
    05:49
    12:58
    16:48
    19:56
    21:28
  
  
    24.04.2023
    4 Şevval 1444
    04:01
    05:47
    12:58
    16:48
    19:58
    21:30
  
  
    25.04.2023
 ...

Now we can process that into something structured.

That's good.
My goal is to have times.txt structured like this:

{"time":["04:03","05:49","12:58","16:48","19:56","21:28"],"date":"23.04.2023"}

The flow runs every day at 00:00 and captures the times for the current date.

After that I use times.txt in Home Assistant for some automations, but that's already in place. I just need to fix the output in times.txt.
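
The posted flow takes the first row (payload[0] in the TodayToFile node) as today's entry. If the first row ever stops being the current date, the entry could instead be looked up by date. This is only a sketch: `formatDate` and `entryForDate` are hypothetical helpers, the DD.MM.YYYY format is taken from the table values quoted in this thread, and the date comes from the local clock of the machine running the code.

```javascript
// Hypothetical helper: format a Date as DD.MM.YYYY, the format seen in the table.
function formatDate(d) {
    const dd = String(d.getDate()).padStart(2, "0");
    const mm = String(d.getMonth() + 1).padStart(2, "0");
    return `${dd}.${mm}.${d.getFullYear()}`;
}

// Hypothetical helper: return the entry whose date matches `now`, or undefined.
function entryForDate(entries, now) {
    const wanted = formatDate(now);
    return entries.find((entry) => entry.date === wanted);
}

const entries = [
    { time: ["04:03", "05:49", "12:58", "16:48", "19:56", "21:28"], date: "23.04.2023" },
    { time: ["04:01", "05:47", "12:58", "16:48", "19:58", "21:30"], date: "24.04.2023" },
];

// Months are zero-based in JavaScript, so 3 means April.
const today = entryForDate(entries, new Date(2023, 3, 24));
console.log(JSON.stringify(today));
```

In a function node, `new Date()` would replace the fixed date used here.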

After the HTML node, you can add this function node which will return an array of arrays:

const split1 = msg.payload[0].split(`
                                                                                
                                                                                
                                                                                    `)

const split2 = []

split1.forEach( line => {
    split2.push(line.trim().split(`
                                                                                    `) )
})

msg.payload = split2
return msg

Note that the strings for the splits look odd - that's because the content of the page uses a LOT of newline and tab characters. It would be much better to split using regular expressions which would be more robust.
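
As a rough illustration of the regular expression approach mentioned above, the sketch below splits on any whitespace-only gap instead of an exact run of spaces. `splitRows` is a hypothetical helper, and the sample text is a shortened copy of the output quoted earlier. It assumes each cell sits on its own line and that days are separated by at least one blank line.

```javascript
// Sample text shaped like the HTML node's text output (whitespace shortened).
const sample = `
    23.04.2023
    3 Şevval 1444
    04:03
    05:49
    12:58
    16:48
    19:56
    21:28
  
  
    24.04.2023
    4 Şevval 1444
    04:01
    05:47
    12:58
    16:48
    19:58
    21:30
`;

// Hypothetical helper: one array of cell strings per day.
function splitRows(text) {
    return text
        .trim()
        // A whitespace-only line between two line breaks separates days.
        .split(/\n\s*\n/)
        // Inside a day, cells are separated by a line break plus indentation.
        .map((row) => row.trim().split(/\s*\n\s*/));
}

const rows = splitRows(sample);
console.log(rows[0]);
```

In a function node this would be used as `msg.payload = splitRows(msg.payload[0])`, assuming the HTML node is set to return text in a single message as described above.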

Once you have the structured data, you can throw it into a table on Dashboard (or use the uib-element node set to table output if you are using uibuilder).

Do you think it's possible to keep the current flow and change the ProcessData node to produce the correct output? Ideally with the date as 1 entry and the times as an array of 6. I don't have Node-RED skills; someone built this for me a while ago.

OK, I'll take pity on you as it is a quiet Sunday :slight_smile:

Replace these two nodes
image

With a single function node containing:

const split1 = msg.payload[0].split(`
                                                                                
                                                                                
                                                                                    `)

const split2 = []

split1.forEach( (line) => {
    /** @type {array} */
    const day = line.trim().split(`
                                                                                    `)

    split2.push({
        date1: day.shift(),
        date: day.shift(),
        time: day
    })
})

msg.payload = split2
return msg

Thanks for your help :slight_smile: but I think I'll give up.
The output in times.txt is even worse now.

Oh, I never looked at that bit. Actually wouldn't matter, it will still work just fine. Just has some tab characters it doesn't need.

I have a Home Assistant automation that reads the output of times.txt. If the structure is not in the format below, my automation doesn't work.

Expected times.txt output:
{"time":["04:03","05:49","12:58","16:48","19:56","21:28"],"date":"23.04.2023"}
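
A minimal sketch of how one row of cells could be turned into exactly this structure. `rowToEntry` is a hypothetical helper, and the cell order (date, Hijri date, then six times) is an assumption based on the table output quoted earlier in the thread.

```javascript
// Hypothetical helper: build the object Home Assistant expects from one row.
// Assumes the cell order: date, Hijri date, then the six times.
function rowToEntry(cells) {
    // The empty slot skips the Hijri date in position 1.
    const [date, , ...time] = cells;
    return { time, date };
}

const cells = ["23.04.2023", "3 Şevval 1444", "04:03", "05:49", "12:58", "16:48", "19:56", "21:28"];

// Keys are written in insertion order, so "time" comes before "date".
const line = JSON.stringify(rowToEntry(cells));
console.log(line);
```

Assigning this string to msg.payload before the file node should produce a single line in the expected format.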

Why not use a template and just add the array indexes you require?
e.g.

[{"id":"af97dd1eb36b36c8","type":"inject","z":"71d8a3f51325a89a","name":"","props":[{"p":"payload"},{"p":"topic","vt":"str"}],"repeat":"","crontab":"","once":false,"onceDelay":0.1,"topic":"","payload":"","payloadType":"date","x":120,"y":60,"wires":[["b9bd128be5681bc5"]]},{"id":"b9bd128be5681bc5","type":"http request","z":"71d8a3f51325a89a","name":"","method":"GET","ret":"txt","paytoqs":"ignore","url":"https://namazvakitleri.diyanet.gov.tr/en-US/9132/prayer-time-for-montreal","tls":"","persist":false,"proxy":"","insecureHTTPParser":false,"authType":"","senderr":false,"headers":[{"keyType":"User-Agent","keyValue":"","valueType":"other","valueValue":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:89.0) Gecko/20100101 Firefox/89.0"}],"x":210,"y":140,"wires":[["c1d11906950c26ff"]]},{"id":"c1d11906950c26ff","type":"html","z":"71d8a3f51325a89a","name":"html select","property":"payload","outproperty":"payload","tag":"td","ret":"text","as":"single","x":290,"y":220,"wires":[["1a4a36bc8fda9e9c"]]},{"id":"1a4a36bc8fda9e9c","type":"template","z":"71d8a3f51325a89a","name":"","field":"payload","fieldType":"msg","format":"handlebars","syntax":"mustache","template":"{\"time\":[\"{{payload.2}}\",\"{{payload.3}}\",\"{{payload.4}}\",\"{{payload.5}}\",\"{{payload.6}}{{payload.7}}\"],\"date\":\"{{payload.0}}\"}","output":"str","x":380,"y":280,"wires":[["63999ac0c1ecc5d5"]]},{"id":"63999ac0c1ecc5d5","type":"debug","z":"71d8a3f51325a89a","name":"debug 265","active":true,"tosidebar":true,"console":false,"tostatus":false,"complete":"false","statusVal":"","statusType":"auto","x":490,"y":220,"wires":[]}]

That's the output I gave you. You can simply save the output from the function node I shared to the file.

Wow, the template node did the trick. Thanks @E1cid!

For my own knowledge, in case the website changes its tables in the future:

I see that you used td in the HTML select node. Does that mean the node selects all the table cells on the webpage?

Also, in the template node, how do you match the array indexes? You had them set correctly.
E.g. how do you know payload.2 is the cell with the value 04:52?

Again, thank you guys for your patience and assistance.

As far as I could tell there is one table; the td selector returns the text of all the td cells.

The template node does not know. I set the indexes by looking at the debug output of the HTML node. If the HTML of the webpage changes, you will have to edit the template.
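
To make the index question concrete: with the td selector the payload is one flat array of cell texts, so a cell's position can be worked out from its row and column. The sketch below assumes 8 cells per row (date, Hijri date, six times), as in the table output quoted earlier, and that no other td cells come before the table. `COLUMNS`, `cellIndex` and `entryFromFlat` are hypothetical names. Note that the template as it appears in this thread places {{payload.6}} and {{payload.7}} inside the same quoted string; the sketch keeps all six times separate.

```javascript
// Assumption: every table row has 8 cells (date, Hijri date, six times).
const COLUMNS = 8;

// Position of a cell in the flat array, counting rows and columns from 0.
function cellIndex(row, column) {
    return row * COLUMNS + column;
}

// Hypothetical helper: build the {time, date} object for one row of the flat array.
function entryFromFlat(cells, row) {
    const start = cellIndex(row, 0);
    const rowCells = cells.slice(start, start + COLUMNS).map((c) => c.trim());
    return { time: rowCells.slice(2), date: rowCells[0] };
}

const flat = [
    "23.04.2023", "3 Şevval 1444", "04:03", "05:49", "12:58", "16:48", "19:56", "21:28",
    "24.04.2023", "4 Şevval 1444", "04:01", "05:47", "12:58", "16:48", "19:58", "21:30",
];

console.log(cellIndex(0, 2)); // first time of the first row
console.log(JSON.stringify(entryFromFlat(flat, 1)));
```

So payload.2 is the third cell of the first row, which is the first time after the two date columns. If the site adds or removes a column, only `COLUMNS` and the slice offset would need to change.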
