Question about extracting text from an HTML webpage

Dear Node-RED community,

I have read a lot of posts on how to use the http request and html nodes, but I can't get them to do what I want. Let me explain:
I want to extract two text values from this website and show them on my dashboard:

RTE Tempo

The two values are marked in red on my picture.

I saved the HTML page on my computer, and I can see the two values there:

Now I want to extract these two text values with Node-RED. I have tested a lot of combinations, but none of them work.

Can you help me with this problem?

Thanks a lot.

  1. First you need to get the page source (HTML code) into Node-RED.
    You can do this with the http request node.

  2. Then you need to extract the relevant info from the page source.
    You can do this by using an html node and specifying the appropriate CSS selector.

The appropriate CSS selector can be found by opening the page in Chrome.
Select the field Jour blue, right-click it and select Inspect.
The Elements tab then highlights the corresponding line of the page source.
If you right-click that line in the Elements tab and select Copy > Copy selector, the CSS selector is copied to your clipboard, and you can paste it into the configuration of your Node-RED html node.

Using the same Chrome developer tools window, you can also test whether a CSS selector selects the appropriate information.
For that, press Command + F (Mac) or Control + F (Windows/Linux) in the Elements tab to open the search bar at the bottom.
You can then paste the CSS selector into that search bar and check that it selects the correct element.

Dear janvda,

Thank you very much for your help.
I used Chrome to copy the CSS selector and pasted it into the html node. But my msg.payload shows: [Empty]

Here is my exported flow:

[
    {
        "id": "74c8759c107a9b82",
        "type": "inject",
        "z": "3ae78e8265bcb79e",
        "name": "make request",
        "repeat": "",
        "crontab": "",
        "once": false,
        "topic": "",
        "payload": "",
        "payloadType": "date",
        "x": 150,
        "y": 1320,
        "wires": [
            [
                "da6e7a307965c4f1"
            ]
        ]
    },
    {
        "id": "da6e7a307965c4f1",
        "type": "http request",
        "z": "3ae78e8265bcb79e",
        "name": "",
        "method": "GET",
        "ret": "txt",
        "paytoqs": "ignore",
        "url": "https://www.services-rte.com/fr/visualisez-les-donnees-publiees-par-rte/calendrier-des-offres-de-fourniture-de-type-tempo.html",
        "tls": "",
        "persist": false,
        "proxy": "",
        "insecureHTTPParser": false,
        "authType": "",
        "senderr": false,
        "headers": [],
        "x": 310,
        "y": 1440,
        "wires": [
            [
                "78a76b655807cd90"
            ]
        ]
    },
    {
        "id": "a3878f8e23a7bb9f",
        "type": "debug",
        "z": "3ae78e8265bcb79e",
        "name": "",
        "active": true,
        "tosidebar": true,
        "console": false,
        "tostatus": false,
        "complete": "payload",
        "targetType": "msg",
        "statusVal": "",
        "statusType": "auto",
        "x": 1110,
        "y": 1440,
        "wires": []
    },
    {
        "id": "78a76b655807cd90",
        "type": "html",
        "z": "3ae78e8265bcb79e",
        "name": "",
        "property": "",
        "outproperty": "",
        "tag": "#wrapper > div > div > div.c-editorial-page__container > div.c-editorial-page__content > tempo > div > div.o-grid.c-tempo > div:nth-child(1) > div.c-tempo__bloc-1__body.c-tempo__background--blue",
        "ret": "text",
        "as": "single",
        "x": 780,
        "y": 1640,
        "wires": [
            [
                "a3878f8e23a7bb9f"
            ]
        ]
    }
]

What is wrong?

Thanks

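The page source retrieved by the http request node is not the same as the page source you see in your browser.
This is because, for that page, your browser performs several additional GET requests to construct the page, while the http request node only fetches the initial HTML. The element your selector targets may therefore simply not be present in it.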
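One way to confirm this is to check whether the raw HTML returned by the http request node contains the class name your selector targets at all. The sketch below is only an illustration: `htmlContainsClass` is a hypothetical helper, and the two sample strings are made up, not the real page content. In a function node you would pass `msg.payload` instead.

```javascript
// Hypothetical helper: returns true if any class="..." attribute
// in the raw HTML string lists the given class name.
function htmlContainsClass(html, className) {
    const classAttr = /class\s*=\s*"([^"]*)"/g;
    for (const match of html.matchAll(classAttr)) {
        // Compare whole class names, not substrings.
        if (match[1].split(/\s+/).includes(className)) {
            return true;
        }
    }
    return false;
}

// Illustrative inputs only (assumed, not copied from the real page).
const renderedInBrowser =
    '<div class="c-tempo__bloc-1__body c-tempo__background--blue">...</div>';
const fetchedByNode = '<tempo></tempo><script src="app.js"></script>';

console.log(htmlContainsClass(renderedInBrowser, "c-tempo__background--blue")); // true
console.log(htmlContainsClass(fetchedByNode, "c-tempo__background--blue"));     // false
```

If the check returns false for the payload of the http request node, no CSS selector in the html node will be able to find that element.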

I found the following request in the list of GET requests:

I wonder whether that request directly returns the information you are looking for.

Dear janvda,

Thanks for your message.
Actually, I want to extract today's color (for example blue) and tomorrow's color (for example blue).
When I check your link, I only get a calendar view.

This is becoming very complicated for me, lol. I'm a newbie with Node-RED. Maybe there is another, simpler way to extract these two values?

Indeed, the link gives information for other dates as well, but I think it also contains the information you want for today and tomorrow.

Below is a flow that extracts the information for today and tomorrow from that URL:

[
    {
        "id": "c1771c42fd752d19",
        "type": "inject",
        "z": "a30b805e19d6ab43",
        "name": "",
        "props": [
            {
                "p": "payload"
            },
            {
                "p": "topic",
                "vt": "str"
            }
        ],
        "repeat": "",
        "crontab": "",
        "once": false,
        "onceDelay": 0.1,
        "topic": "",
        "payload": "",
        "payloadType": "date",
        "x": 190,
        "y": 160,
        "wires": [
            [
                "0c424a93e087991e"
            ]
        ]
    },
    {
        "id": "0c424a93e087991e",
        "type": "http request",
        "z": "a30b805e19d6ab43",
        "name": "",
        "method": "GET",
        "ret": "txt",
        "paytoqs": "ignore",
        "url": "https://www.services-rte.com/cms/open_data/v1/tempo?season=2022-2023",
        "tls": "",
        "persist": false,
        "proxy": "",
        "insecureHTTPParser": false,
        "authType": "",
        "senderr": false,
        "headers": [],
        "x": 370,
        "y": 160,
        "wires": [
            [
                "c5b0074a60e66be6"
            ]
        ]
    },
    {
        "id": "558948738112c46b",
        "type": "debug",
        "z": "a30b805e19d6ab43",
        "name": "json output",
        "active": true,
        "tosidebar": true,
        "console": false,
        "tostatus": false,
        "complete": "payload",
        "targetType": "msg",
        "statusVal": "",
        "statusType": "auto",
        "x": 610,
        "y": 120,
        "wires": []
    },
    {
        "id": "c5b0074a60e66be6",
        "type": "json",
        "z": "a30b805e19d6ab43",
        "name": "",
        "property": "payload",
        "action": "",
        "pretty": false,
        "x": 550,
        "y": 160,
        "wires": [
            [
                "558948738112c46b",
                "cfa7222b2b9f5c46"
            ]
        ]
    },
    {
        "id": "cfa7222b2b9f5c46",
        "type": "change",
        "z": "a30b805e19d6ab43",
        "name": "extract today and tomorrow colors",
        "rules": [
            {
                "t": "set",
                "p": "payload",
                "pt": "msg",
                "to": "( \t  $today:=$now(\"[Y0001]-[M01]-[D01]\");\t  $tomorrow:=(($now()~>$toMillis())+(24*60*60*1000))~>$fromMillis(\"[Y0001]-[M01]-[D01]\");\t  {\t      \"today\" : $today,\t      \"tomorrow\" : $tomorrow,\t      \"today_color\" : payload.values~>$lookup($today),\t      \"tomorrow_color\" : payload.values~>$lookup($tomorrow),\t      \"today_fallback\" : payload.values~>$lookup($today  & \"-fallback\"),\t      \"tomorrow_fallback\" : payload.values~>$lookup($tomorrow  & \"-fallback\")\t  }\t)",
                "tot": "jsonata"
            }
        ],
        "action": "",
        "property": "",
        "from": "",
        "to": "",
        "reg": false,
        "x": 800,
        "y": 160,
        "wires": [
            [
                "a49b534377d9bfcb"
            ]
        ]
    },
    {
        "id": "a49b534377d9bfcb",
        "type": "debug",
        "z": "a30b805e19d6ab43",
        "name": "today / tomorrow color",
        "active": true,
        "tosidebar": true,
        "console": false,
        "tostatus": false,
        "complete": "payload",
        "targetType": "msg",
        "statusVal": "",
        "statusType": "auto",
        "x": 1000,
        "y": 120,
        "wires": []
    }
]
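For readers who find the JSONata expression in the change node hard to read, here is a rough JavaScript equivalent that could go in a function node instead. It is a sketch under assumptions: `extractTempoColors` and `toDateKey` are names invented for this example, the dates are computed in UTC (as `$now()` does by default), and the sample values such as "BLUE" and "WHITE" are illustrative rather than taken from the real response.

```javascript
// Format a Date as YYYY-MM-DD in UTC, matching the keys looked up in payload.values.
function toDateKey(date) {
    return date.toISOString().slice(0, 10);
}

// Hypothetical helper mirroring the JSONata expression in the change node.
function extractTempoColors(payload, now = new Date()) {
    const tomorrow = new Date(now.getTime() + 24 * 60 * 60 * 1000);
    const todayKey = toDateKey(now);
    const tomorrowKey = toDateKey(tomorrow);
    const values = (payload && payload.values) || {};
    return {
        today: todayKey,
        tomorrow: tomorrowKey,
        today_color: values[todayKey],
        tomorrow_color: values[tomorrowKey],
        today_fallback: values[todayKey + "-fallback"],
        tomorrow_fallback: values[tomorrowKey + "-fallback"]
    };
}

// Illustrative data only; the real response may look different.
const samplePayload = {
    values: { "2022-12-31": "BLUE", "2023-01-01": "WHITE" }
};
const result = extractTempoColors(samplePayload, new Date("2022-12-31T12:00:00Z"));
console.log(result.today_color, result.tomorrow_color); // BLUE WHITE
```

In a function node placed after the json node, the body would be roughly `msg.payload = extractTempoColors(msg.payload); return msg;`. Keys that are missing from `values` simply come back as `undefined`.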

This is the output when I trigger the above flow in the Node-RED editor:

janvda, just wow. It works fine for me.

Now I need to arrange the text to show only the two colors on my dashboard.

janvda, thank you very much for your help.
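For that last step, one possible approach is a small function node that reduces the object produced by the change node to two messages, one per color. This is only a sketch: `toDashboardMessages` is a hypothetical name, the "unknown" placeholder is an assumption for the case where a color is missing, and the sample input is made up.

```javascript
// Hypothetical helper: keeps only the two colors from the change node's output.
function toDashboardMessages(colors) {
    // Fall back to a placeholder when a color is not present.
    const label = (color) => (color === undefined || color === null ? "unknown" : color);
    return [
        { topic: "today", payload: label(colors.today_color) },
        { topic: "tomorrow", payload: label(colors.tomorrow_color) }
    ];
}

// Illustrative input only.
const messages = toDashboardMessages({ today_color: "BLUE", tomorrow_color: undefined });
console.log(messages[0].payload); // BLUE
console.log(messages[1].payload); // unknown
```

In a function node configured with two outputs, returning this array sends the first message to output 1 and the second to output 2, so each can be wired to its own dashboard text node.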
