I'm using Node-RED to combine/transform multiple JSON microservice APIs, some of whose responses can be very large: tens, or occasionally even hundreds, of megabytes.
Is there any way to stop the http request node from having to read the entire response before sending it on to the next node?
i.e. can it submit the request and then start streaming in the response (I'm assuming here that the response is a JSON array), splitting out the array elements and sending them individually to the next node while the rest of the response is still arriving?
I'm not sure if it helps, but googling around the Node.js world I often see the "JSONStream" package referenced as a popular choice for this kind of thing...
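To show what I mean, here's a rough sketch of the principle behind packages like JSONStream: scan the bytes as they arrive and emit each top-level array element as soon as it is complete, without ever holding the whole response in memory. This is purely illustrative (a hand-rolled scanner, not JSONStream's actual internals), and in Node-RED it would presumably have to live in a custom node or a function node using the http module directly, since the stock http request node buffers the whole body:

```javascript
// Minimal sketch: split a top-level JSON array while it streams in, chunk
// by chunk. onElement is called once per completed array element.
// Illustrative only; a real flow should use a battle-tested streaming
// parser (e.g. JSONStream) rather than this.
function createArraySplitter(onElement) {
  let depth = 0;        // nesting depth relative to the outer array
  let inString = false; // currently inside a JSON string?
  let escaped = false;  // previous char was a backslash?
  let started = false;  // seen the opening '[' of the outer array yet?
  let buf = '';         // text of the element currently being assembled

  return function write(chunk) {
    for (const ch of chunk) {
      if (inString) {
        buf += ch;
        if (escaped) escaped = false;
        else if (ch === '\\') escaped = true;
        else if (ch === '"') inString = false;
        continue;
      }
      if (ch === '"') { inString = true; buf += ch; continue; }
      if (!started) {
        if (ch === '[') started = true; // skip everything before the outer '['
        continue;
      }
      if (ch === '{' || ch === '[') { depth++; buf += ch; continue; }
      if (ch === '}' || ch === ']') {
        if (ch === ']' && depth === 0) {     // closing the outer array
          const t = buf.trim();
          if (t) onElement(JSON.parse(t));   // emit the final element
          buf = '';
          continue;
        }
        depth--; buf += ch; continue;
      }
      if (ch === ',' && depth === 0) {       // boundary between elements
        const t = buf.trim();
        if (t) onElement(JSON.parse(t));
        buf = '';
        continue;
      }
      buf += ch;
    }
  };
}
```

Feeding it arbitrary chunk boundaries, e.g. `write('[{"a":1},{"b":[')` then `write('2,3]},"x",4]')`, emits each element as soon as its closing delimiter arrives.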
That is not possible. Do the microservices you are "talking" to have other API connection options (other than REST), like websockets, Kafka, MQ etc.? Those are made for streaming responses. Or does the REST API have options/parameters to offset into the data?
No, these are just normal REST APIs, though some (not all) do have pagination options for requesting a specified number of records at an offset.
For those supporting pagination, I guess a flow could hit the API repeatedly and split the array that comes back for each page, achieving something similar to what I was thinking?
Though after processing the array elements I will need to join the results back together at the end for my final result.
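The paging loop itself seems simple enough to express as a function node. Here's a sketch of the "decide whether to fetch another page" step, assuming a hypothetical API that takes `offset`/`limit` query parameters and signals the end of the data with a short page (the parameter names and page size are my invention, not from any particular API):

```javascript
// Sketch of the core of a paging function node. In a flow, the returned
// message would go out a second output wired back (via a delay node) to
// the http request node; returning null ends the loop.
// Assumption: the API accepts ?offset=&limit= and a page shorter than
// `pageSize` means there is no more data.
function nextPage(msg, pageSize = 100) {
  const records = msg.payload || [];            // array returned for this page
  const offset = (msg.offset || 0) + records.length;
  if (records.length < pageSize) return null;   // short page: we're done
  return {
    url: `${msg.baseUrl}?offset=${offset}&limit=${pageSize}`,
    offset,
    baseUrl: msg.baseUrl,
  };
}
```

A split node after this would fan the page's records out individually, and a join node (in manual mode, released when `nextPage` returns null) could reassemble the final result.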
I should have said: I played a bit with Node-RED a couple of years ago, so I'm not a total beginner, though no expert either. I do Node.js programming for my day job, so I could write function nodes or even a custom node if need be.
We already process the JSON APIs I'm referring to with Node.js code, but we are having performance challenges there too. That code doesn't use a streaming/dataflow style; if I could achieve what I described in Node-RED, I hoped to see how its performance compared to our code. (Though I wish Node-RED also had some kind of web worker/multi-threaded support; it still doesn't, right?)
fwiw here's one of the articles that made me think theoretically what I described should be possible even in node-red (though I haven't tried maybe I'm missing something):
That is just a dirty workaround: writing/reading while the data is coming in. It is kind of weird to have a RESTful API that delivers such huge loads of data; I can imagine it hurts the performance of the server too.
No problem. For those with pagination, which is typical for things like Graph and OData APIs, the presence of the paging link is usually sufficient: you can simply wire a link back to the input of your http request node with the new URL substituted in.
There are now many APIs that are capable of returning huge datasets. I was playing with the Microsoft Graph API just a few days ago; our Azure Active Directory is "only" somewhat under 100,000 entries, but obviously that is still a whacking great set of data. It's not that unusual to need to fetch all of the user entries for analysis, and the paging is the only thing that makes this feasible, since the back-end processing is also paged so that the servers aren't overwhelmed.
Just remember that you might need to put in a delay node to stop yourself spamming the server, otherwise you might get banned. A few seconds between requests is normally more than enough.
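As a sketch of that loop-back idea: Graph and other OData-style APIs put the next page's URL in the `@odata.nextLink` property of the response and the current page's records in `value`, so the function node between the http request node and the rest of the flow only needs to do something like this (the function name and return shape are illustrative):

```javascript
// Sketch of the per-page step when following an OData-style paging link.
// `records` go on to the rest of the flow; if `nextUrl` is non-null, a
// message carrying it goes back (via a delay node, to avoid hammering
// the server) to the http request node's input.
function pageStep(body) {
  const records = body.value || [];              // this page's records
  const next = body['@odata.nextLink'] || null;  // absent on the last page
  return { records, nextUrl: next };
}
```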