Parsing HTML data

Hi,
this is the first time I've tried to read data from a web page. I can read the data (about 7700 characters), and when I look through it manually I find what I'm looking for somewhere in the middle:
· <a href=results.php?userid=xxxxxxx&offset=0&show_names=0&state=2&appid=>Überprüfung ausstehend (459)
The number, in this case 459, is what I want to extract from all of that. Unfortunately, I have no idea how to parse and extract it.
I'd be grateful for any help!

To get you started.

match(): https://www.w3schools.com/jsref/jsref_match.asp

and for regex https://stackoverflow.com/questions/13802334/js-regex-to-find-href-of-several-a-tags
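As a minimal sketch of the match() approach, assuming the label text and the parentheses appear in the page exactly as in the sample line above. The helper name extractPendingCount and the pattern are illustrative assumptions, not something taken from the real page:

```javascript
// Hypothetical helper: returns the number in parentheses after the label,
// or null when the label is not present in the text.
function extractPendingCount(html) {
    // \( and \) match literal parentheses; (\d+) captures the digits between them.
    const m = html.match(/Überprüfung ausstehend \((\d+)\)/);
    return m ? Number(m[1]) : null;
}

// Sample line copied from the question.
const sample = '<a href=results.php?userid=xxxxxxx&offset=0&show_names=0&state=2&appid=>Überprüfung ausstehend (459)';
const count = extractPendingCount(sample);
const missing = extractPendingCount('<title>Log in</title>');
```

In a function node, the same helper could be used as msg.payload = extractPendingCount(msg.payload); followed by return msg;. A null result means the label was not found in the payload at all.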

If "Überprüfung ausstehend (" is unique within the 7700 characters, you could use this method to avoid writing a function node or using a regex:

[{"id":"79354ee5.cb5df","type":"inject","z":"35be93aa.c0667c","name":"","topic":"","payload":"userid=xxxxxxx&offset=0&show_names=0&state=2&appid=>Überprüfung ausstehend (459) foo bar ","payloadType":"str","repeat":"","crontab":"","once":false,"onceDelay":0.1,"x":670,"y":100,"wires":[["eea2ec68.a8565"]]},{"id":"eea2ec68.a8565","type":"split","z":"35be93aa.c0667c","name":"","splt":"Überprüfung ausstehend (","spltType":"str","arraySplt":"1","arraySpltType":"len","stream":false,"addname":"","x":790,"y":100,"wires":[["f33f628c.167d3"]]},{"id":"49d28a89.dee4d4","type":"debug","z":"35be93aa.c0667c","name":"","active":true,"tosidebar":true,"console":false,"tostatus":false,"complete":"true","targetType":"full","x":1270,"y":100,"wires":[]},{"id":"f33f628c.167d3","type":"switch","z":"35be93aa.c0667c","name":"","property":"parts.index","propertyType":"msg","rules":[{"t":"eq","v":"1","vt":"num"}],"checkall":"true","repair":false,"outputs":1,"x":910,"y":100,"wires":[["564b6f92.d9063"]]},{"id":"564b6f92.d9063","type":"split","z":"35be93aa.c0667c","name":"","splt":")","spltType":"str","arraySplt":"1","arraySpltType":"len","stream":false,"addname":"","x":1030,"y":100,"wires":[["7e78ca37.387f54"]]},{"id":"7e78ca37.387f54","type":"switch","z":"35be93aa.c0667c","name":"","property":"parts.index","propertyType":"msg","rules":[{"t":"eq","v":"0","vt":"num"}],"checkall":"true","repair":false,"outputs":1,"x":1150,"y":100,"wires":[["49d28a89.dee4d4"]]}]
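For comparison, the two split steps in that flow can be sketched in plain JavaScript. This is only an illustration of the same idea (take the piece after the label, then the piece before the closing parenthesis); the function name extractBySplit is a hypothetical one:

```javascript
// Hypothetical helper mirroring the flow: split on the label, keep part 1,
// then split on ")" and keep part 0.
function extractBySplit(text) {
    const marker = "Überprüfung ausstehend (";
    const parts = text.split(marker);
    // If the marker is missing, split() returns a single element.
    if (parts.length < 2) return null;
    return parts[1].split(")")[0];
}

// Same test string as the inject node in the flow.
const injected = "userid=xxxxxxx&offset=0&show_names=0&state=2&appid=>Überprüfung ausstehend (459) foo bar ";
const value = extractBySplit(injected);
const notFound = extractBySplit("no label here");
```

Note that the result is a string, as it is in the flow; wrap it in Number() if a numeric value is needed.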

Thanks for both replies. The code sample works standalone but not with the real data; I need to figure out why.
I'm currently reading the RegExp chapter on W3Schools.

It looks like the HTTP request node does not pass on all of the data it receives.
The debug output of the HTTP request node looks like this:
payload: string

    <html lang="en">
    <head>

    <meta name="viewport" content="width=device-width, initial-scale=1">
<title>Log in</title>

    <meta charset="utf-8">
    <link type="text/css" rel="stylesheet" href="https://setiathome.berkeley.edu//bootstrap.min.css" media="all">

        <link rel=stylesheet type="text/css" href="https://setiathome.berkeley.edu/sah_custom_dark.css">
    <link rel="icon" type="image/x-icon" href="https://setiathome.berkeley.edu/images/logo7.ico"/>

    <link rel=alternate type="application/rss+xml" title="RSS 2.0" href="https://setiathome.berkeley.edu/rss_main.php">
    </head>
<body >
<div class="navbar-...
statusCode: 200
headers: object
    date: "Mon, 06 Jan 2020 19:45:43 GMT"
    server: "Apache/2.2.15 (Scientific Linux)"
    x-powered-by: "PHP/5.3.3"
    expires: "Mon, 26 Jul 1997 05:00:00 UTC"
    last-modified: "Mon, 06 Jan 2020 19:45:43 UTC"
    cache-control: "no-cache, must-revalidate, post-check=0, pre-check=0"
    pragma: "no-cache"
    content-length: "7792"
    connection: "close"
    content-type: "text/html; charset=utf-8"
    x-node-red-request-node: "217924cc"

When I route this to a function node containing

var str = msg.payload;
var n = str.search(/ausstehend</i);
msg.payload = n;
return msg;

I get -1.
When I change the search string to something closer to the beginning, like /container-fluid/,
the result is 886.
This supports my assumption that the payload is being truncated somewhere.
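One way to test that assumption is to compare the byte length of the payload with the content-length header from the debug output above. The debug sidebar may shorten long strings for display, so the "..." there does not by itself prove that data is missing. This is a rough sketch; the function name compareLength is hypothetical, and it assumes msg.payload is a UTF-8 string and that the server sent a content-length header:

```javascript
// Hypothetical check: does the payload have as many bytes as the server announced?
function compareLength(msg) {
    const expected = Number(msg.headers["content-length"]);
    // content-length counts bytes, not characters, so measure the UTF-8 byte length.
    const actual = Buffer.byteLength(msg.payload, "utf8");
    return { expected: expected, actual: actual, complete: expected === actual };
}

// Small self-contained demo: "Ü" takes 2 bytes in UTF-8, the other 4 letters 1 byte each.
const demo = compareLength({ payload: "Übung", headers: { "content-length": "6" } });
```

In a function node this could be used as msg.payload = compareLength(msg); followed by return msg;. If complete is true, the whole page arrived and the search pattern itself is the more likely reason for the -1.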

Haven't you seen the warnings? You should never parse HTML using a regex...

Never ever? Why, what will happen? Will the earth end or my PC explode?
