I'm building out some flows that use http-request nodes to download JSON payloads from an API. The data behind this API changes every ~5 minutes, and it takes ~2500-5000 calls to download all of it (2 million to 5 million rows, 1000 rows per page).
The only issue I'm running into is that the CDN in front of the API stops accepting requests at around request #1000. Is there a way to reuse the same connection to send multiple requests to the same endpoint?
I don't think that will solve your problem. It seems to stop at 1000 requests, not at 1000 connections, right?
Am I also right in thinking you need to do 2500 to 5000 requests every 5 minutes?
Well, it's more than that for this API. This is just one endpoint as an example. It's also the fastest caching endpoint on the API.
Basically, the issue this causes is that I now need to use separate flows to generate request URLs to this API and pipe them all into a single request flow so that I don't overload the connections.
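To make the "generate URLs in one place, send them through a single request path" idea concrete, here is a minimal sketch of spreading the calls evenly over the refresh window. The base URL and the `page` query parameter are hypothetical placeholders, not the real API. Inside a flow, a rate-limiting delay node in front of the single request node would play the same role.

```python
from typing import Iterator, Tuple


def paced_requests(base_url: str, pages: int, window_s: float) -> Iterator[Tuple[float, str]]:
    """Yield (send_offset_seconds, url) pairs spread evenly over the window.

    base_url and the 'page' parameter are assumptions for illustration only.
    """
    interval = window_s / pages  # seconds between two consecutive sends
    for page in range(1, pages + 1):
        yield (page - 1) * interval, f"{base_url}?page={page}"


# 2500 pages inside a 5 minute (300 s) window: one request every 0.12 s.
schedule = list(paced_requests("https://api.example.invalid/items", 2500, 300.0))
```

A single consumer would then walk `schedule` in order and wait until each offset before sending, so only one request stream ever hits the endpoint.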
Based on a quick look at the code and the logs of my requests, the http-request node is opening and closing the connection every time it is triggered.
The API CDN has two primary public IPs and if the connections were being maintained I wouldn't see both IPs showing up every time the requests get rate limited.
I still find this very confusing. You are looking for a persistent connection, right?
Are you sure that the other end supports/allows that too?
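For reference, this is roughly what a persistent connection looks like outside of Node-RED, as a sketch using Python's standard `http.client`. The function name is made up for illustration, and a real HTTPS API would need `http.client.HTTPSConnection` instead. It also records the `Connection` response header, which is one way to see whether the other end is willing to keep the connection open.

```python
import http.client
from typing import List, Tuple


def fetch_on_one_connection(host: str, port: int, paths: List[str]) -> List[Tuple[int, bytes, str]]:
    """Send every request over one HTTP/1.1 connection object.

    Returns (status, body, connection_header) for each path.
    """
    conn = http.client.HTTPConnection(host, port, timeout=10)
    results = []
    try:
        for path in paths:
            conn.request("GET", path, headers={"Connection": "keep-alive"})
            resp = conn.getresponse()
            body = resp.read()  # the body must be fully read before the socket can be reused
            # If the server answers "Connection: close", http.client opens a new
            # socket for the next request, so this header shows whether reuse happened.
            results.append((resp.status, body, resp.getheader("Connection", "")))
    finally:
        conn.close()
    return results
```

If every response comes back with `Connection: close`, the server (or the CDN in front of it) is refusing keep-alive and each request still gets its own connection.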
What exactly do you mean by 'the API CDN'? Is the API served through a CDN? In that case the two IPs that show up are the IPs of the servers closest to you, right? Could you then do some form of round robin across the entire CDN?
What do you mean by 'It's also the fastest caching endpoint on the API'? Is that the 5 minute thing?
Also, and maybe more importantly: if you are not allowed to go over 1000 connections and fetching all the data requires at least 2500 requests in 5 minutes, then maybe you are not using their API in the correct way to begin with.
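A quick back-of-the-envelope check of that mismatch, as a sketch. It assumes the limit is a fixed cap of about 1000 requests per reset period. How long that period is has not been stated in this thread.

```python
import math


def windows_needed(total_requests: int, cap: int) -> int:
    """Number of limit periods needed if at most `cap` requests are accepted per period."""
    return math.ceil(total_requests / cap)


low = windows_needed(2500, 1000)   # 3 limit periods for the smallest full download
high = windows_needed(5000, 1000)  # 5 limit periods for the largest
```

So unless the limit resets much more often than the data changes, one full download cannot finish within a single 5 minute refresh.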
Sorry for all the questions.