How to get a webpage, then submit a form using a POST request

Hello,

Here is my scenario: I need to submit a form on an external webpage (POST) and then scrape the resulting page. The first page contains a <form action="/" method="post" id="search_form"> with several options.

What I am trying to do: I send a first HTTP request (GET), which gives me the page containing the form. Then I send the POST request, but I get an empty response.

Is it possible to simulate this kind of navigation with Node-RED?

You need to create an http request node that posts the form options and id with the correct headers. The URL seems to be the root URL. You should be able to do this in one request.
This thread may help.

There is a hidden input (type="hidden") holding a token that, I think, needs to be passed along. I am trying to figure out how to scrape it first with scrape-it.

It's hard to help with no information. You can probably scrape the info with the html node by looking for the "form" tag, which may return all the options, the id, and the hidden inputs.
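As a rough illustration of pulling the inputs out of a form, here is a minimal sketch in plain JavaScript, for example inside a function node. The helper name `extractFormInputs` and the sample markup are made up for this example. It assumes double-quoted attributes, a form id without regex special characters, and values that contain no ">", and it does not decode HTML entities, so a real HTML parser (or the html node) is more robust.

```javascript
// Sketch: collect the name/value pairs of the <input> elements inside one form.
// extractFormInputs is a hypothetical helper, not a Node-RED API.
function extractFormInputs(html, formId) {
    // Isolate the markup of the form with the requested id.
    const formRe = new RegExp(
        '<form[^>]*\\bid="' + formId + '"[^>]*>([\\s\\S]*?)</form>', 'i');
    const formMatch = formRe.exec(html);
    if (!formMatch) return null;

    const inputs = {};
    const inputRe = /<input\b[^>]*>/gi;
    let tag;
    while ((tag = inputRe.exec(formMatch[1])) !== null) {
        const name = /\bname="([^"]*)"/i.exec(tag[0]);
        const value = /\bvalue="([^"]*)"/i.exec(tag[0]);
        // Inputs without a name (such as a plain submit button) are skipped.
        if (name) inputs[name[1]] = value ? value[1] : '';
    }
    return inputs;
}

// Made-up sample markup, shaped like the form discussed in this thread.
const sampleHtml = '<form action="/" method="post" id="search_form">' +
    '<input type="text" name="rechercher[localisation]">' +
    '<input type="hidden" name="rechercher[_token]" value="abc123">' +
    '<input type="submit" value="Voir la liste des stations" class="submit_recherche">' +
    '</form>';

const formInputs = extractFormInputs(sampleHtml, 'search_form');
console.log(formInputs['rechercher[_token]']); // prints "abc123"
```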

OK, I am still struggling, so let me give more details about the page I am trying to scrape: https://www.prix-carburants.gouv.fr

First, I run my search manually on the webpage. In the Chrome inspector I see:

The HTML of the form on the page, which also gives me the input fields needed to build my request:

<form action="/" method="post" id="search_form">
<input type="submit" value="Voir la liste des stations" class="submit_recherche">

The Network tab shows the request headers:

:authority: www.prix-carburants.gouv.fr
:method: POST
:path: /
:scheme: https
accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
accept-encoding: gzip, deflate, br
accept-language: fr-FR,fr;q=0.9,en-US;q=0.8,en;q=0.7,de;q=0.6
cache-control: max-age=0
content-length: 314
content-type: application/x-www-form-urlencoded
cookie: atuserid=%7B%22name%22%3A%22atuserid%22%2C%22val%22%3A%223160539b-a16a-456a-9537-1a363993b0a8%22%2C%22options%22%3A%7B%22end%22%3A%222023-01-03T15%3A46%3A25.274Z%22%2C%22path%22%3A%22%2F%22%7D%7D; public=h8ha1a8ajdhjotpnr2nmsg3hu5; atauthority=%7B%22name%22%3A%22atauthority%22%2C%22val%22%3A%7B%22authority_name%22%3A%22cnil%22%2C%22visitor_mode%22%3A%22exempt%22%7D%2C%22options%22%3A%7B%22end%22%3A%222023-01-19T10%3A18%3A56.999Z%22%2C%22path%22%3A%22%2F%22%7D%7D
origin: https://www.prix-carburants.gouv.fr
referer: https://www.prix-carburants.gouv.fr/
sec-ch-ua: " Not A;Brand";v="99", "Chromium";v="96", "Google Chrome";v="96"
sec-ch-ua-mobile: ?0
sec-ch-ua-platform: "macOS"
sec-fetch-dest: document
sec-fetch-mode: navigate
sec-fetch-site: same-origin
sec-fetch-user: ?1
upgrade-insecure-requests: 1
user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/xx.0.xxx Safari/537.36

The Network tab shows the payload:

rechercher[geolocalisation_long]: 
rechercher[geolocalisation_lat]: 
rechercher[departement]: 01
rechercher[localisation]: 
rechercher[type_enseigne]: 
rechercher[_token]: 7f719d1625af5fee2ca1c.tNkLEhiHAroH4149jTgaTfwYKU9hANpMAX6BJinx49I.ge4-JSzNdPk0rgoQ1GpXAJFIGz4vMY4GYxW3dmSo1KfArGViYNBHy3bVbg
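For reference, with content-type application/x-www-form-urlencoded these fields travel as a single URL-encoded string in the request body. A small sketch of that serialisation in plain Node.js, using the standard `URLSearchParams` class (whether it is available inside a function node may depend on the Node-RED version). The token value here is a placeholder, not a real token.

```javascript
// Sketch: serialise the form fields the way a browser does for
// content-type application/x-www-form-urlencoded.
const fields = {
    'rechercher[geolocalisation_long]': '',
    'rechercher[geolocalisation_lat]': '',
    'rechercher[departement]': '01', // a string, so the leading zero survives
    'rechercher[localisation]': '',
    'rechercher[type_enseigne]': '',
    'rechercher[_token]': 'TOKEN_FROM_THE_GET_PAGE' // placeholder value
};

// Square brackets in the keys are percent-encoded as %5B and %5D.
const encodedBody = new URLSearchParams(fields).toString();
console.log(encodedBody);
```

The department is written as the string '01' on purpose: the browser sends the two characters "01", and a numeric literal would not keep the leading zero.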

I did the following: first I sent a GET to capture the token from the parsed page, using the HTML node and an appropriate selector. Then I built the headers and the payload for the POST request and sent it, but unfortunately it simply returns the search page.

headers (simulating a browser)

// Cookie string copied from the Chrome session
const cookie = 'atuserid={"name":"atuserid","val":"3160539b-a16a-456a-9537-1a363993b0a8","options":{"end":"2023-01-03T15:46:25.274Z","path":"/"}}; public=h8ha1a8ajdhjotpnr2nmsg3hu5; atauthority={"name":"atauthority","val":{"authority_name":"cnil","visitor_mode":"exempt"},"options":{"end":"2023-01-19T09:33:17.178Z","path":"/"}}';

// Request headers imitating the browser
msg.headers = {};
msg.headers['user-agent'] = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/xx.0.xx.xx Safari/537.36';
msg.headers['accept-language'] = 'fr-FR,fr;q=0.9,en-US;q=0.8,en;q=0.7,de;q=0.6';
msg.headers['cache-control'] = 'max-age=0';
msg.headers['content-type'] = 'application/x-www-form-urlencoded';
msg.headers['cookie'] = cookie;

payload

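// token holds the value scraped from the page returned by the GET
const data = {
    'rechercher[geolocalisation_long]': '',
    'rechercher[geolocalisation_lat]': '',
    // quoted: a bare 01 is a legacy octal literal (value 1, and a syntax error in strict mode)
    'rechercher[departement]': '01',
    'rechercher[localisation]': '',
    'rechercher[type_enseigne]': '',
    'rechercher[_token]': token
};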

msg.url='https://www.prix-carburants.gouv.fr/';

msg.payload = data;
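One thing that may be worth checking: the cookie above is copied from a Chrome session, while the token comes from a fresh GET made by Node-RED. Tokens such as `_token` are often tied to the session that issued them, so the two may not match. Below is a minimal sketch of reusing the cookies returned by the GET instead. `buildPostMsg` is a hypothetical helper, and it assumes the GET response exposes its `set-cookie` header as an array of strings in `msg.headers`, and that the http request node is configured to take its method from `msg.method`.

```javascript
// Sketch: build the POST message from the message returned by the GET.
// getMsg stands for the output of the first http request node,
// token for the value scraped from that same page.
function buildPostMsg(getMsg, token) {
    const setCookie = (getMsg.headers && getMsg.headers['set-cookie']) || [];
    // Keep only the "name=value" part of each Set-Cookie line.
    const cookieHeader = setCookie
        .map(line => line.split(';')[0].trim())
        .join('; ');

    return {
        method: 'POST',
        url: 'https://www.prix-carburants.gouv.fr/',
        headers: {
            'content-type': 'application/x-www-form-urlencoded',
            'cookie': cookieHeader
        },
        payload: {
            'rechercher[geolocalisation_long]': '',
            'rechercher[geolocalisation_lat]': '',
            'rechercher[departement]': '01',
            'rechercher[localisation]': '',
            'rechercher[type_enseigne]': '',
            'rechercher[_token]': token
        }
    };
}

// Made-up GET response, for illustration only.
const fakeGetMsg = {
    headers: { 'set-cookie': ['public=abc123; path=/; HttpOnly', 'other=1; path=/'] }
};
const postMsg = buildPostMsg(fakeGetMsg, 'sample-token');
console.log(postMsg.headers.cookie); // prints "public=abc123; other=1"
```

In a function node placed after the GET and the token extraction, the last step would be something like `return buildPostMsg(msg, token);`.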

I had a play with this today with no luck. It seems the result of the form is displayed in the map, so I think there is JS involved here. I will try again during the week and post back if I find a solution.

Hopefully some others may know a trick or two.

Thanks. Actually, when I try manually from Chrome, the result is rendered on a page reached through a redirect. It seems I never get to that page after the redirect happens, so I may have the wrong headers or POST payload.
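To make the redirect step concrete: after a POST, a browser that receives a 301, 302 or 303 response issues a GET to the URL in the Location header and sends the same cookies, while 307 and 308 keep the original method. Whether the http request node follows the redirect and carries the cookies along may depend on its version and settings, so it can help to inspect `msg.statusCode` and the response headers. A minimal sketch of working out the follow-up request by hand; `nextRequestAfterRedirect` is a hypothetical helper and the Location path in the sample is made up.

```javascript
// Sketch: work out the request a browser would make after a redirect response.
function nextRequestAfterRedirect(response, requestUrl, requestMethod) {
    const location = response.headers && response.headers.location;
    if (!location) return null;

    let method;
    if ([301, 302, 303].includes(response.statusCode)) {
        method = 'GET'; // browsers switch a POST to GET for these codes
    } else if ([307, 308].includes(response.statusCode)) {
        method = requestMethod; // method is preserved
    } else {
        return null; // not a redirect
    }
    // Location may be relative, so resolve it against the request URL.
    return { method: method, url: new URL(location, requestUrl).href };
}

// Made-up response, for illustration only.
const sampleRedirect = { statusCode: 302, headers: { location: '/made-up-results' } };
const followUp = nextRequestAfterRedirect(
    sampleRedirect, 'https://www.prix-carburants.gouv.fr/', 'POST');
console.log(followUp.method, followUp.url);
```

The follow-up request would need the same cookie header as the POST, since the result page is presumably tied to the session.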