Railway hosted Node-RED (3.0.2) instance updated and then 9 days later Deploy attempts return a 502/timeout

utmostGrandPoobah · 8 March 2026 22:25

Expected outcome:

As a user of the Node-RED editor, when I click the Deploy button (to deploy modified flows, modified nodes, or an entire tab), the editor submits my changes to the Node-RED backend, which then persists them to storage. In this specific implementation that means calling the saveFlows() function, which saves the flow definition into an attached Postgres DB that the middleware has access to.

Current outcome:

As an editor user, when I click the Deploy button to save and deploy my changes, the 'saving' UI animation displays and the XHR POST request hangs for about 5 minutes before the editor shows an error toast notification about the timeout.

Inspecting the Railway http logs shows this log data:

{
  "requestId": "gxz6QNYuQOaSeqSLnpoFkQ",
  "timestamp": "2026-03-08T14:52:43.751105835Z",
  "method": "POST",
  "path": "/red/flows",
  "host": "node-red-production-11f2.up.railway.app",
  "httpStatus": 502,
  "upstreamProto": "",
  "downstreamProto": "HTTP/2.0",
  "responseDetails": "",
  "totalDuration": 300000,
  "upstreamAddress": "",
  "clientUa": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/145.0.0.0 Safari/537.36",
  "upstreamRqDuration": 300000,
  "txBytes": 109,
  "rxBytes": 17440,
  "srcIp": "73.241.93.27",
  "edgeRegion": "us-west2",
  "upstreamErrors": "[{\"deploymentInstanceID\":\"c17751b9-2831-451a-aa16-856cf2706cc4\",\"error\":\"i/o timeout\",\"duration\":300000}]"
}
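For anyone reading a similar entry, here is a minimal sketch of pulling the relevant fields out of it. The field names are copied from the log above; treating the duration fields as milliseconds is my assumption, and `summarizeProxyLog` is a hypothetical helper, not part of any Railway tooling.

```javascript
// Sketch only: field names come from the log entry above;
// treating the duration fields as milliseconds is an assumption.
function summarizeProxyLog(entry) {
  // upstreamErrors is a JSON array encoded as a string inside the entry.
  const upstreamErrors = JSON.parse(entry.upstreamErrors || '[]');
  return {
    request: `${entry.method} ${entry.path}`,
    status: entry.httpStatus,
    totalMinutes: entry.totalDuration / 60000,
    rxBytes: entry.rxBytes,
    upstreamErrors: upstreamErrors.map((e) => `${e.error} after ${e.duration / 1000} s`),
  };
}

// Subset of the entry above, reduced to the fields used here.
const entry = {
  method: 'POST',
  path: '/red/flows',
  httpStatus: 502,
  totalDuration: 300000,
  rxBytes: 17440,
  upstreamErrors:
    '[{"deploymentInstanceID":"c17751b9-2831-451a-aa16-856cf2706cc4","error":"i/o timeout","duration":300000}]',
};

console.log(summarizeProxyLog(entry));
```

Under the milliseconds assumption, the proxy waited 5 minutes on the upstream before giving up with an i/o timeout.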

Question for you:

What is the best way to resolve going forward?
If the recommended way is to update the Node-RED instance from 3.0.2 to 4.x.x, are there any footguns with respect to the settings.js file and package.json that I should be aware of? That is, are there any changes I need to make to these or related files to ensure a smoother update?

Maybe I should consider testing an update to 3.1.x and see if that begins to resolve this issue?

Summary of current Root Cause Analysis

Deploy (POST /flows) hangs for ~5 minutes then fails with BadRequestError: request aborted. The saveFlows console.log never appears, confirming the failure happens during body-parser reading the POST body — before Node-RED's storage module is ever invoked.
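One way to check whether the deploy request reaches the Node.js process at all is to log it ahead of body parsing. The sketch below assumes it is wired in through Node-RED's `httpAdminMiddleware` setting; `makeDeployLogger` is a hypothetical helper name, and matching on a URL ending in `/flows` is an assumption about how the admin routes are mounted.

```javascript
// Hypothetical helper: builds an Express-style middleware that logs
// deploy requests (POST .../flows) without consuming the request body.
function makeDeployLogger(log = console.log) {
  return function deployLogger(req, res, next) {
    const url = (req.originalUrl || req.url || '').split('?')[0];
    if (req.method === 'POST' && url.endsWith('/flows')) {
      const started = Date.now();
      const length = req.headers['content-length'];
      const encoding = req.headers['content-encoding'] || 'none';
      log(`[deploy] start content-length=${length} content-encoding=${encoding}`);
      // req.complete stays false if the connection closed before the
      // whole body arrived, which is the "request aborted" case.
      req.on('close', () => {
        log(`[deploy] closed after ${Date.now() - started} ms, complete=${req.complete}`);
      });
    }
    next();
  };
}

// Assumed wiring in settings.js:
//   module.exports = { httpAdminMiddleware: makeDeployLogger(), /* ... */ };
```

If the "start" line never appears, the request is being held or dropped before it reaches Node-RED. If it appears and the close line reports `complete=false`, the body was cut off in transit.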

Summary of things attempted:

This was attempted to resolve npm race conditions which were thought to be the original Root Cause of the Timeouts

File	Change
settings.js:30	userDir: '/data' → userDir: path.join(__dirname, 'data')
data/package.json	NEW — flow package manifest for pre-install
package.json	ADD scripts.postinstall
.gitignore	ADD data/node_modules

Timeline of changes made to this Node-RED instance's repo

1. 2023-06-29: A fork is made of the node-red-heroku GitHub repo, which I have also been using for another project and which has behaved pretty much as expected over the last 5 years. The fork is updated with contemporary dependencies and adapted to Railway's deploy and runtime behavior, including setting up a Railway Postgres database that persists the flow JSON data.
2. 2023 to January 2026: This Railway Node-RED instance is used as a testbed for various lightweight flows. Some flows are a little more "production grade", largely executing initial data engineering/ETL workloads (<1000 records) and loading various npm modules including googleapis, mysql2, and jsforce.
3. 2026-02-26: Almost 3 years later, we need to make this instance more robust because we want to back up flows in GitHub repos. That means being much more serious about removing API keys from the flows themselves (always a good practice, though the need hadn't become apparent until around this time) and using the .env file instead.
3a. All login/key data is identified, moved into env files, and referenced from there.
3b. After making this change and restarting the Railway/NR instance, I notice Deploy-stopping errors with npm errno -39 for various npm modules, including ulid and keyword-extractor:

npm error path /app/node_modules/keyword-extractor
npm error dest /app/node_modules/.keyword-extractor-VZI29erN
npm error errno -39

3c. I update the package.json, run yarn install (a Railway-specific requirement), and build a new image/container on Railway for this NR instance without any issue. This also resolves the "Inject/runtime blocker" that a missing npm module imposes on the Node-RED editor.
4. 2026-03-06: The current outcome appears. Almost 9 days after this yarn install, I attempt to deploy/save flow changes in the editor and encounter this timeout/hang.
5. 2026-03-08: Various debugging/troubleshooting leads me to make significant updates to the settings.js and package.json files (along with the yarn.lock file). While the npm race conditions seem to have been handled, the UI hang, the pending XHR, and the 502 timeout persist.

utmostGrandPoobah · 9 March 2026 06:31

2026-03-08: Updated to "node-red": "^3.1.15", successfully logged in, and attempted a simple "move a node and then deploy the change" test. The 502/timeout issue persists.

I also inspected the POSTed payload. After viewing and copying the source into a text file, I found the file was about 2.7 MB in size, which seems to be approaching the upper limit for Railway's ingress proxy. The proxy might be killing the network request before it reaches the actual NR middleware that calls things like saveFlows().

To confirm: I had already updated the settings.js file with the apiMaxLength: '20mb' key-value pair as my initial resolution attempt.

What to try next?

Should I ask Railway to increase proxy limits? I feel it's unlikely they would do this for what looks like an edge case.
I have tried deleting a bunch of nodes from the existing canvas, but when I try to save I still run into this hanging save/timeout issue.
Apparently 4.x.x Node-RED versions use a websocket approach to deploying, which may be more efficient and bypass this issue?

TotallyInformation · 9 March 2026 14:55

Not aware of the process changing between v3 and v4 though, to be honest, I haven't really looked. Here you can see the network tab on an existing open editor where I moved 1 node and then re-deployed.

As you can see, the deploy does not use websockets. I tested this separately and a deploy gives just this over websockets:

So hardly anything.

My nodes file is about 2 MB on disk (around 2,200 nodes), so the actual sending of the updated flow at just 0.4 kB seems very efficient, presumably thanks to being gzipped in transit?

Perhaps check the network tab of your browser dev tools to make sure that the server isn't misconfigured and refusing gzipped content?
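To get a feel for how much gzip could matter for a flow file, here is a rough sketch that compares raw and gzipped sizes using Node's built-in zlib. The sample flow is synthetic (`makeSampleFlow` is a hypothetical stand-in), so substitute your own exported flows JSON. It only measures compressibility and says nothing about whether the editor or the proxy actually compresses the request.

```javascript
const zlib = require('zlib');

// Hypothetical stand-in for an exported flow: an array of node objects.
function makeSampleFlow(nodeCount) {
  return Array.from({ length: nodeCount }, (_, i) => ({
    id: `node${i}`,
    type: 'function',
    z: 'tab1',
    name: `step ${i}`,
    func: 'msg.payload = msg.payload;\nreturn msg;',
    wires: [[`node${i + 1}`]],
  }));
}

// Returns the serialized size and its gzipped size, both in bytes.
function measureFlowSize(flow) {
  const raw = Buffer.from(JSON.stringify(flow), 'utf8');
  const gzipped = zlib.gzipSync(raw);
  return { rawBytes: raw.length, gzipBytes: gzipped.length };
}

const sizes = measureFlowSize(makeSampleFlow(2000));
console.log(sizes, `ratio ${(sizes.gzipBytes / sizes.rawBytes).toFixed(3)}`);
```

Flow JSON is highly repetitive, so it tends to compress well, but the real ratio depends on what the nodes contain.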

utmostGrandPoobah · 9 March 2026 18:55

Thanks @TotallyInformation, that's a great point about v4. If 4.x.x versions actually gzip the flows before POSTing them, that might help if the root cause is in fact the cloud proxy throttling the network request because the POSTed document is just too big.

For updating (we're jumping from 3.0.2 -> 3.1.15 [as of 2026-03-09]) would you be aware if I need to update any of the packages or explicitly upgrade the adjacent server software that Node-RED is bundled with?

TotallyInformation · 9 March 2026 21:45

Honestly, if you can run up a test system (which you really should have when dealing with production services based on microservice architectures, every bit as much as with other architectures), I would do a paper exercise to review your existing nodes and check any that will need major version updates for what might break. Then I would update node.js to an LTS version, probably v22 right now, install the latest versions of all the nodes, and then import your existing flow (or a test flow if you have to stub some things out). Node-RED doesn't really change all that fast or that dramatically, so it is quite possible that you wouldn't have too many issues, if any. And if that is the case, you can move directly to NR v4 on node.js v22. Moving to another v3 seems like a waste of time if you can avoid it, especially since, before too long, you should be thinking about NR v5 and node.js v24.

Node-RED only relies on node.js packages which does rather simplify the management. Just note that if you update node.js to a new major version in an existing instance of NR, you should do an npm rebuild of all installed packages - this re-makes any binary packages.
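As a small guard for the upgrade, a startup check along these lines can flag an instance that is still running on an older node.js. Both function names are hypothetical, and the default of 22 is simply the LTS line suggested above.

```javascript
// Hypothetical helper: extracts the major version from a string like "22.3.0".
function nodeMajor(version = process.versions.node) {
  return Number.parseInt(version.split('.')[0], 10);
}

// 22 is the LTS line suggested in this thread; adjust as needed.
function checkRuntime(minMajor = 22, version = process.versions.node) {
  const major = nodeMajor(version);
  return major >= minMajor
    ? `Node.js ${version} OK`
    : `Node.js ${version} is older than v${minMajor}; upgrade, then rebuild installed packages`;
}

console.log(checkRuntime());
```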

utmostGrandPoobah · 9 March 2026 22:59

Ah yes, thanks for that, I phrased that poorly: I have already updated from 3.0.2 to 3.1.15 (final 3.x.x version) as of 'yesterday' 2026-03-09. This update as I noted earlier did not resolve the issue.

To your latest point, I'll start updating node to v22 locally and push that up as part of the update process to NR 4.x.x

Colin · 10 March 2026 11:30

Make sure that you upgrade any contrib nodes that you use to their latest version, otherwise they might not be compatible with the later nodejs.

utmostGrandPoobah · 10 March 2026 18:23

thanks for the tip @Colin !

I marked @TotallyInformation's earlier reply as the solution, because effectively that was the resolution. Updating to 4.1.7 resolved the timeout issue, presumably because the flow size has shrunk from around 2.7 MB (judging by flow file size) to around 0.35 kB, and the timeout was likely the cloud host proxy throttling the request.

Thanks everyone!
