Http request node - gzip encrypted response

Trying to download some statistical data from https://c19downloads.azureedge.net/downloads/msoa_data/MSOAs_latest.json

If I use that URL in a browser, I get the full data, but when I use the http request node I get either a lot of what appears to be encrypted data or a message saying that the service is not available.

Are remote services able to block specific HTTP requests (i.e. allow browsers only)?

If you use Devtools you can see what headers are used in the request & you can add those to the request.

Paul,
I have seen similar behaviour in the past, due to a missing user-agent. See e.g. this discussion.

Have a look at this topic, maybe you have to set the User-agent.

Edit: Bart was a couple of seconds faster.

Yes my friend, speed means everything on this forum :sunglasses:

I've copied the 'user-agent' entry from the browser DevTools and added it to a function node feeding an http request node, but it's still returning unreadable data or a denial.

msg.url = 'https://c19downloads.azureedge.net/downloads/msoa_data/MSOAs_latest.json';
msg.headers = {};
msg.headers['USER-AGENT'] = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36';
return msg;

I get this response when I navigate to the URL using Chrome:

Our services aren't available right now

We're working to restore all services as soon as possible. Please check back soon.

And I get exactly the same result via an http request node:

[{"id":"55aff059.4d628","type":"http request","z":"be35bf3.aa6be4","name":"","method":"GET","ret":"txt","paytoqs":false,"url":"http://c19downloads.azureedge.net/downloads/msoa_data/MSOAs_latest.json","tls":"","persist":false,"proxy":"","authType":"","x":650,"y":560,"wires":[["181f6aa1.4a24c5"]]},{"id":"1d819cf8.726d23","type":"inject","z":"be35bf3.aa6be4","name":"","topic":"","payload":"","payloadType":"date","repeat":"","crontab":"","once":false,"onceDelay":0.1,"showConfirmation":false,"confirmationLabel":"","x":450,"y":560,"wires":[["55aff059.4d628"]]},{"id":"181f6aa1.4a24c5","type":"debug","z":"be35bf3.aa6be4","name":"","active":true,"tosidebar":true,"console":false,"tostatus":false,"complete":"true","targetType":"full","x":820,"y":560,"wires":[]}]

That's strange. Using an uncached browser, I get the full data array of 6792 objects.

If I use https (instead of http), then I get a JSON file in Chrome.

But via the http-request node I get some ugly output, which cannot be parsed as JSON.

Yes, that's exactly the same as what I get.
(I am also now using the https url)

The request headers are:

Accept	text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding	gzip, deflate, br
Accept-Language	en-GB,en;q=0.5
Connection	keep-alive
Host	c19downloads.azureedge.net
If-Modified-Since	Thu, 16 Jul 2020 14:21:40 GMT
If-None-Match	"0x8D829938E7B18BE"
TE	Trailers
Upgrade-Insecure-Requests	1
User-Agent	Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0
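
A header list copied from DevTools like the one above could be turned into an object suitable for msg.headers with something like the following. This is only a sketch: the helper name parseDevtoolsHeaders and the skip list are assumptions, not part of Node-RED. Host and Connection are left out because the HTTP client sets them itself, and the conditional headers (If-Modified-Since, If-None-Match) are left out because replaying them can produce a "304 Not Modified" reply with no body.

```javascript
// Hypothetical skip list: headers that are probably better not replayed.
const SKIP = new Set(['host', 'connection', 'if-modified-since', 'if-none-match']);

// Convert "Name<TAB>Value" lines (as copied from DevTools) into an object.
function parseDevtoolsHeaders(text) {
    const headers = {};
    for (const line of text.split('\n')) {
        const tab = line.indexOf('\t');
        if (tab < 0) continue;                      // not a header line
        const name = line.slice(0, tab).trim();
        if (SKIP.has(name.toLowerCase())) continue; // drop skipped headers
        headers[name] = line.slice(tab + 1).trim();
    }
    return headers;
}

// A few of the lines from the list above, joined as they would be pasted.
const copied = [
    'Accept-Language\ten-GB,en;q=0.5',
    'Host\tc19downloads.azureedge.net',
    'If-None-Match\t"0x8D829938E7B18BE"',
    'User-Agent\tMozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0'
].join('\n');

const headers = parseDevtoolsHeaders(copied);
console.log(headers);
```

In a function node, the result would be assigned with msg.headers = parseDevtoolsHeaders(...) before passing the message to the http request node.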

As the JSON response is large, is it possible that the data is compressed (which would explain the 'ugly stuff')?

I "think" you are right. This is what I get when I debug the http-request node:

So it gets quite a large body, which contains gzipped JSON.
I had always assumed that this kind of decompression happens automatically behind the scenes...
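
One way to check whether a body really is gzipped, and to decode it if so, is to look for the gzip magic bytes (0x1f 0x8b) and then run it through Node's built-in zlib module. The sketch below assumes the body arrives as a Buffer (a body already converted to a string is likely to be corrupted); the helper names isGzip and decodeBody and the sample data are made up for illustration.

```javascript
const zlib = require('zlib');

// gzip streams start with the magic bytes 0x1f 0x8b.
function isGzip(buf) {
    return Buffer.isBuffer(buf) && buf.length > 2 && buf[0] === 0x1f && buf[1] === 0x8b;
}

// Return the parsed JSON, decompressing first if the body is gzipped.
function decodeBody(buf) {
    const raw = isGzip(buf) ? zlib.gunzipSync(buf) : buf;
    return JSON.parse(raw.toString('utf8'));
}

// Stand-in for the real response: a small gzipped JSON array (made-up data).
const sample = zlib.gzipSync(JSON.stringify([{ id: 1 }, { id: 2 }]));
const decoded = decodeBody(sample);
console.log(isGzip(sample), decoded.length);
```

Checking the magic bytes rather than only the content-encoding header means the same function also works on a body whose headers are no longer available.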

Found a small code snippet on Stack Overflow that 'might' solve this:

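const request = require('request');
const zlib = require('zlib');

// Fetch a URL and pass the decoded body to an error-first callback(err, text).
function fetchBody(url, callback) {
    // encoding: null makes `request` return the body as a raw Buffer
    request(url, {encoding: null}, function(err, response, body) {
        if (err) return callback(err);
        if (response.headers['content-encoding'] === 'gzip') {
            // Decompress the gzipped body before converting it to a string
            zlib.gunzip(body, function(err, dezipped) {
                if (err) return callback(err);
                callback(null, dezipped.toString());
            });
        } else {
            callback(null, body.toString());
        }
    });
}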

Will try it now ...

After adding a node-red-contrib-gzip node to the flow, followed by a json node, the output is correctly formed JSON!!

Thanks @BartButenaers

Ah ok, that is faster than raising a pull request for a Node-RED core change :rofl:
Now you have beaten me on speed ...

Very true!
