Scrape page with cookies

I have a page from a KAESER compressor device.
A simple GET request with the http request node gives me only this:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" 
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<link rel="shortcut icon" href="favicon.ico">
<META HTTP-EQUIV="Content-type" CONTENT="text/html;charset=UTF-8">
<META HTTP-EQUIV="Content-Script-Type" CONTENT="text/javascript">
<META HTTP-EQUIV="Expires" CONTENT="-1">
<script type="text/javascript" src="common.js"> </script>
<script type="text/javascript" src="user_management.js"> </script>
<script type="text/javascript" src="sha256.js"> </script>
<title>Sigma Control 2</title>
<link rel="stylesheet" type="text/css" href="style.css">
</head>
<body onload="javascript:USR_CreateLoginReadOnlyPromptBox();">
    <div id="container">
        <div id="header_overlay">
        </div> <!--end id="header_overlay"-->

        <div id="logo_overlay">
            <a href="login.html" title="Home"><img src="images/logomini.gif" style="border:none;" alt="" width="102" height="31"></a>
        </div> <!--end id="logo_overlay"-->
    
        <div id="logout_overlay">
                  
        </div><!--end id="div_logout_txt"-->
    </div> <!--<div class="container">-->
</body>
</html>

If I save the page as a file from the browser, I get data like this:

<tr id="iolabel_air100"><td>AIR 1.00</td><td class="aio_value_output" id="air100">116.30</td><td>Ω</td><td class="aio_value_output" id="airT100">41.9</td><td class="classTUnit">°C</td></tr>
					<tr id="iolabel_air101"><td>AIR 1.01</td><td class="aio_value_output" id="air101">110.30</td><td>Ω</td><td class="aio_value_output" id="airT101">26.4</td><td class="classTUnit">°C</td></tr>

How do I make the request with cookies or a session?

I don't think the HTML will contain any data, regardless of whether your requests have the correct headers/cookies/session data. My suspicion is that the data in that table is populated by a script, with the data probably fetched from another endpoint, AFTER the HTML is loaded.

In your browser's developer tools, select the Network tab and refresh the page. You should see all of the requests the page makes. One of them will contain this data. Get that URL and use it in the HTTP request node instead of trying to scrape the page.

Also, once you find the data, check which headers were included in the request (you might need to add them to your own request).
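
As a rough sketch of the header step, assuming the data request only needs a handful of the browser's headers: a function node could copy the captured headers into msg.headers while leaving out the ones the HTTP client normally computes itself (Host, Content-Length, Connection, Accept-Encoding). The helper name pickForwardHeaders and the skip list are my own choices for illustration, not something the device is known to require.

```javascript
// Headers that the HTTP client normally sets on its own. Copying fixed
// values from the browser (e.g. a hard-coded Content-Length) may cause
// trouble if the payload or target ever changes.
const SKIP = ["host", "content-length", "connection", "accept-encoding"];

// Hypothetical helper: keep only the headers worth forwarding.
function pickForwardHeaders(browserHeaders) {
    const out = {};
    for (const [name, value] of Object.entries(browserHeaders)) {
        if (!SKIP.includes(name.toLowerCase())) {
            out[name] = value;
        }
    }
    return out;
}

// Inside a function node this would be used roughly as:
//   msg.headers = pickForwardHeaders(capturedHeaders);
//   return msg;
```

Starting with the smallest set that works (often just Cookie and Content-Type) makes it easier to see which header the device actually checks.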

Looking at the HTML you posted, I suspect you are going to have to simulate a login.
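
If a login does turn out to be necessary, the usual pattern is: send the login request first, read the Set-Cookie header(s) from its response, and send them back as a Cookie header on the data request. A minimal sketch of the middle step, assuming the login response exposes its cookies as an array of Set-Cookie strings in msg.headers["set-cookie"]. Both helper names are hypothetical, and the login request itself is device-specific and not shown.

```javascript
// Hypothetical helper: turn Set-Cookie response headers into a single
// Cookie request header, keeping only each cookie's name=value pair
// (attributes such as Path or HttpOnly are dropped).
function cookieHeaderFromSetCookie(setCookie) {
    const list = Array.isArray(setCookie)
        ? setCookie
        : (setCookie ? [setCookie] : []);
    return list
        .map((c) => c.split(";")[0].trim())
        .filter((pair) => pair.includes("="))
        .join("; ");
}

// Body of a function node placed between the login request and the
// data request (assumed flow layout).
function prepareDataRequest(msg) {
    const cookie = cookieHeaderFromSetCookie((msg.headers || {})["set-cookie"]);
    // Replace the response headers so they are not sent on as request headers.
    msg.headers = cookie ? { Cookie: cookie } : {};
    return msg;
}
```

This way the session ID is taken from a fresh login each time rather than pasted in from the browser, where it would stop working once that session expires.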

I have something like this:

[image]

and in Node-RED:

[{"id":"a8faa7af.39251","type":"inject","z":"e0cba348.b7758","name":"","topic":"","payload":"","payloadType":"date","repeat":"","crontab":"","once":false,"onceDelay":0.1,"x":100,"y":1040,"wires":[["c9a82937.cfd908"]]},{"id":"1739859a.658f4a","type":"http request","z":"e0cba348.b7758","name":"","method":"POST","ret":"txt","paytoqs":true,"url":"http://10.0.0.68/chema/php/action_sensors.php","tls":"","persist":false,"proxy":"","authType":"","x":470,"y":1040,"wires":[["634cb9cb.180148"]]},{"id":"c9a82937.cfd908","type":"change","z":"e0cba348.b7758","name":"","rules":[{"t":"set","p":"headers","pt":"msg","to":"{\"Accept\":\"application/json, text/plain, */*\",\"Accept-Encoding\":\"gzip, deflate\",\"Accept-Language\":\"pl,en-US;q=0.7,en;q=0.3\",\"Connection\":\"keep-alive\",\"Content-Length\":\"21\",\"Content-Type\":\"application/json\",\"Cookie\":\"PHPSESSID=v5j02gqcu4snuetk3b97tluioa\",\"DNT\":\"1\",\"Host\":\"10.0.0.68\",\"Origin\":\"http://10.0.0.68\",\"Referer\":\"http://10.0.0.68/chema/php/sensors.php\",\"User-Agent\":\"Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:84.0) Gecko/20100101 Firefox/84.0\"}","tot":"json"},{"t":"set","p":"payload","pt":"msg","to":"{\"action\":\"fetchall\"}","tot":"json"}],"action":"","property":"","from":"","to":"","reg":false,"x":280,"y":1040,"wires":[["1739859a.658f4a"]]},{"id":"634cb9cb.180148","type":"debug","z":"e0cba348.b7758","name":"","active":true,"tosidebar":true,"console":false,"tostatus":false,"complete":"true","targetType":"full","x":630,"y":1040,"wires":[]}]

What is wrong?
