Railway hosted Node-RED (3.0.2) instance updated and then 9 days later Deploy attempts return a 502/timeout

Expected outcome:

As a user of the Node-RED editor, when I click the Deploy button (to deploy modified flows, modified nodes, or the entire tab), the editor submits my changes to the Node-RED backend, which then persists them in my storage. In this specific implementation, it does so by calling the saveFlows() function, which saves the flow definition into an attached Postgres DB that the middleware can read from and write to.

Current outcome:

As an Editor user, when I click the Deploy button to save and deploy my changes, the 'saving' UI animation displays and the XHR POST request hangs for about 5 minutes (300000 ms in the Railway log) before an error toast notification reports the timeout.

Inspecting the Railway http logs shows this log data:

{
  "requestId": "gxz6QNYuQOaSeqSLnpoFkQ",
  "timestamp": "2026-03-08T14:52:43.751105835Z",
  "method": "POST",
  "path": "/red/flows",
  "host": "node-red-production-11f2.up.railway.app",
  "httpStatus": 502,
  "upstreamProto": "",
  "downstreamProto": "HTTP/2.0",
  "responseDetails": "",
  "totalDuration": 300000,
  "upstreamAddress": "",
  "clientUa": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/145.0.0.0 Safari/537.36",
  "upstreamRqDuration": 300000,
  "txBytes": 109,
  "rxBytes": 17440,
  "srcIp": "73.241.93.27",
  "edgeRegion": "us-west2",
  "upstreamErrors": "[{\"deploymentInstanceID\":\"c17751b9-2831-451a-aa16-856cf2706cc4\",\"error\":\"i/o timeout\",\"duration\":300000}]"
}
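
To make the entry above easier to read, here is a small sketch that decodes the nested upstreamErrors string and converts the duration to minutes. It assumes the durations are in milliseconds (the log itself does not state the unit), and `summarize` is a hypothetical helper name, not part of any Railway tooling.

```javascript
// Trimmed copy of the Railway HTTP log entry above (only the fields used here).
const entry = {
  method: "POST",
  path: "/red/flows",
  httpStatus: 502,
  totalDuration: 300000,
  upstreamErrors:
    "[{\"deploymentInstanceID\":\"c17751b9-2831-451a-aa16-856cf2706cc4\",\"error\":\"i/o timeout\",\"duration\":300000}]",
};

// Hypothetical helper: upstreamErrors is a JSON string inside JSON, so it is parsed again.
// Assumption: durations are milliseconds.
function summarize(e) {
  const errors = JSON.parse(e.upstreamErrors);
  return {
    request: `${e.method} ${e.path}`,
    status: e.httpStatus,
    minutes: e.totalDuration / 60000,
    upstreamErrors: errors.map((x) => x.error),
  };
}

console.log(summarize(entry));
```

Read this way, the proxy waited on the upstream container for the full duration and then gave up with an i/o timeout, rather than rejecting the request immediately.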

Question for you:

What is the best way to resolve this going forward?
If the recommended way is to update the Node-RED instance from 3.0.2 to 4.x.x, are there any footguns with respect to the settings.js and package.json files that I should be aware of? That is, are there any changes I need to make to these or related files to ensure a smoother update?

Maybe I should consider testing an update to 3.1.x and see if that begins to resolve this issue?

Summary of current Root Cause Analysis

Deploy (POST /flows) hangs for ~5 minutes then fails with BadRequestError: request aborted. The saveFlows console.log never appears, confirming the failure happens during body-parser reading the POST body — before Node-RED's storage module is ever invoked.
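
One way to test this theory from inside the container is to log whether each POST body arrives completely. The sketch below is only an illustration: `makeTraceBody` is a hypothetical name, and it assumes the middleware is registered early enough to see the request before the body parser gives up, which I have not verified. It listens only for the request's 'close' event, so it does not consume the body stream.

```javascript
// Hypothetical diagnostic middleware factory (not part of Node-RED).
// Logs, for every POST, whether Node.js received the complete HTTP message.
function makeTraceBody(log) {
  return function traceBody(req, res, next) {
    if (req.method !== "POST") return next();
    const expected = Number(req.headers["content-length"] || 0);
    const started = Date.now();
    // 'close' fires whether the body finished or the connection dropped.
    // req.complete is true only if the full message was received and parsed.
    req.on("close", () => {
      const state = req.complete ? "complete" : "incomplete";
      const elapsed = Date.now() - started;
      log(`${req.method} ${req.url}: body ${state}, expected ${expected} bytes, ${elapsed} ms`);
    });
    next();
  };
}
```

In settings.js this could be wired in through the httpAdminMiddleware option, for example `httpAdminMiddleware: makeTraceBody(console.log)`. An "incomplete" line with a large expected size would support the idea that the body is being cut off before it fully reaches Node-RED.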

Summary of things attempted:

  1. The following changes were made to resolve the npm race conditions that were originally thought to be the root cause of the timeouts:

| File | Change |
|---|---|
| settings.js:30 | userDir: '/data' → userDir: path.join(__dirname, 'data') |
| data/package.json | NEW: flow package manifest for pre-install |
| package.json | ADD scripts.postinstall |
| .gitignore | ADD data/node_modules |

Timeline of changes made to this Node-RED instance's repo

  1. 2023-06-29 - A fork is made of the node-red-heroku GitHub repo, which I have also been using for another project and which has behaved pretty much as expected over the last 5 years. The fork is updated with contemporary dependencies and adapted to Railway's deploy and runtime behavior, including setting up a Railway Postgres database that persists the flow JSON data.
  2. 2023 to January 2026 - This Railway Node-RED instance is used as a testbed for various lightweight flows. Some flows are a little more "production grade", largely executing initial data engineering/ETL workloads (<1000 records) and loading various npm modules including googleapis, mysql2, and jsforce.
  3. 2026-02-26 - Almost 3 years later, we need to make this instance more robust because we want to back up flows in GitHub repos. That means being much more serious about removing API keys from the flows themselves (always a good practice, though the need hadn't become apparent until this point) and using the .env file instead.
    3a. All login/key data is identified, moved into env files, and referenced from there.
    3b. After making this change and restarting the Railway/NR instance, I notice Deploy-stopping npm errors (errno -39) for various npm modules, including ulid and keyword-extractor:
npm error path /app/node_modules/keyword-extractor
npm error dest /app/node_modules/.keyword-extractor-VZI29erN
npm error errno -39

    3c. I update the package.json, run yarn install (a Railway-specific requirement), and build a new image/container on Railway for this NR instance without any issue. This also resolves the "Inject/runtime blocker" that a missing npm module imposes on the Node-RED editor.
  4. 2026-03-06 - The current outcome appears: almost 9 days after this yarn install, I attempt to deploy/save flow changes in the editor and encounter this timeout/hanging.
  5. 2026-03-08 - Various debugging/troubleshooting leads me to make significant updates to the settings.js and package.json files (along with the yarn.lock file). While the npm race conditions seem to have been handled, the UI hanging/pending XHR/502 timeout issue persists.
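
As a side note on the errno -39 in step 3b: npm prints the raw system error number, and Node.js can translate it into a name. On Linux, errno 39 is ENOTEMPTY (directory not empty), which would be consistent with npm failing to rename a module directory that another process was still writing into. The sketch below is a minimal lookup; `describeErrno` is a hypothetical helper name, and the numeric values are platform-dependent.

```javascript
const os = require("node:os");
const util = require("node:util");

// Hypothetical helper: translate a negative system error number into its name.
// util.getSystemErrorName throws for values that are not negative integers,
// so those cases return undefined instead.
function describeErrno(code) {
  try {
    return util.getSystemErrorName(code);
  } catch {
    return undefined;
  }
}

// The number npm reported in this thread; the name printed depends on the platform.
console.log(describeErrno(-39));
// The platform's own numeric value for ENOTEMPTY, for comparison.
console.log(os.constants.errno.ENOTEMPTY);
```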

2026-03-08: Updated to "node-red": "^3.1.15", successfully logged in, and attempted a simple "move node and then deploy change"; the 502/timeout issue persists.

I also inspected the POST'd payload. After viewing and copying the source into a text file, I found the file to be about 2.7MB in size, which seems to be approaching the upper limit for Railway's ingress proxy. The proxy might be killing the network request before it reaches the actual NR middleware that calls things like saveFlows().

To confirm: I had already updated the settings.js file with the key-value pair apiMaxLength: '20mb' as my initial resolution attempt.
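
For reference, here is a rough sketch of how a serialized flow could be compared against that limit. The documented default for apiMaxLength is 5mb, so a 2.7MB payload should fit even without the override, which suggests the size limit inside Node-RED is probably not what is failing. `toBytes` and `fitsLimit` are hypothetical helpers written for this illustration; they assume 1 mb = 1024 * 1024 bytes and are not Node-RED APIs.

```javascript
// The setting mentioned above, as it would appear inside settings.js.
const settings = {
  apiMaxLength: "20mb",
};

// Hypothetical helper: convert a size string such as "20mb" to bytes.
// Assumption: units are binary (1 kb = 1024 bytes).
function toBytes(size) {
  const match = /^(\d+(?:\.\d+)?)\s*(b|kb|mb|gb)$/i.exec(size);
  if (!match) throw new Error(`unrecognized size: ${size}`);
  const units = { b: 1, kb: 1024, mb: 1024 ** 2, gb: 1024 ** 3 };
  return Math.floor(Number(match[1]) * units[match[2].toLowerCase()]);
}

// Hypothetical helper: true when the JSON-serialized flows fit under the limit.
function fitsLimit(flows, limit = settings.apiMaxLength) {
  const bodyBytes = Buffer.byteLength(JSON.stringify(flows), "utf8");
  return bodyBytes <= toBytes(limit);
}
```

Calling `fitsLimit(flows)` with the exported flow array gives a quick yes/no before looking further at the proxy.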

What to try next?

  1. Should I ask Railway to increase the proxy limits? I feel it's unlikely they would do this for what looks like an edge case specific to my needs.
  2. I have tried deleting a bunch of nodes from the existing canvas, but when I try to save, I still run into the same hanging save/timeout issue.
  3. Apparently 4.x.x Node-RED versions have a websocket approach to deploying which may be more efficient and bypasses this issue?

I'm not aware of the process changing between v3 and v4 though, to be honest, I haven't really looked. Here you can see the network tab on an existing open editor where I moved 1 node and then re-deployed.

As you can see, the deploy does not use websockets. I tested this separately and a deploy gives just this over websockets:
image
So hardly anything.

My nodes file is about 2 MB on disk (around 2,200 nodes), so the actual sending of the updated flow at just 0.4 kB seems very efficient, presumably thanks to being gzipped in transit?
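
To get a feel for how much gzip can shrink flow JSON, here is a minimal sketch using Node's built-in zlib. The sample data is synthetic (2,200 small, repetitive node objects) and is only a stand-in for a real flow file. Whether the browser or a proxy actually compresses the deploy request is not established here, so treat the ratio as an upper-bound illustration rather than a measurement of a real deploy.

```javascript
const zlib = require("node:zlib");

// Return the raw and gzipped byte sizes of a JSON-serializable value.
function sizes(flows) {
  const raw = Buffer.from(JSON.stringify(flows), "utf8");
  return { rawBytes: raw.length, gzipBytes: zlib.gzipSync(raw).length };
}

// Synthetic stand-in for a flow file (assumption: real flows are less repetitive).
const sample = Array.from({ length: 2200 }, (_, i) => ({
  id: `node${i}`,
  type: "function",
  z: "tab1",
  name: "",
  func: "return msg;",
  wires: [[]],
}));

console.log(sizes(sample));
```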

Perhaps check the network tab of your browser dev tools to make sure that the server isn't misconfigured and refusing gzipped content?


Thanks @TotallyInformation, that's a great point about v4. If 4.x.x versions actually gzip the flows before POSTing them, that might help if the root cause is in fact the cloud proxy throttling the network request because the POSTed document is just too big.

For updating (we're jumping from 3.0.2 -> 3.1.15 [as of 2026-03-09]) would you be aware if I need to update any of the packages or explicitly upgrade the adjacent server software that Node-RED is bundled with?

Honestly, if you can run up a test system (which you really should have when dealing with production services based on microservice architectures, every bit as much as with other architectures), I would do a paper exercise to review your existing nodes and check which ones need major version updates and what might break. Then I would update node.js to an LTS version, probably v22 right now, install the latest versions of all the nodes, and import your existing flow (or a test flow if you have to stub some things out). Node-RED doesn't really change all that fast or that dramatically, so it is quite possible that you wouldn't have too many issues, if any. And if that is the case, you can move directly to NR v4 on node.js v22. Moving to another v3 seems like a waste of time if you can avoid it, especially since, before too long, you should be thinking about NR v5 and node.js v24.

Node-RED only relies on node.js packages which does rather simplify the management. Just note that if you update node.js to a new major version in an existing instance of NR, you should do an npm rebuild of all installed packages - this re-makes any binary packages.
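
A quick preflight check along these lines could be run in the container before upgrading. This is only a sketch: `nodeMajor` and `meetsMinimum` are hypothetical helpers, and the minimum of 18 reflects my understanding that Node-RED 4.x requires node.js 18 or later, so check the release notes for the exact version you install.

```javascript
// Hypothetical helper: extract the major version from a string like "22.11.0".
function nodeMajor(version = process.versions.node) {
  return Number(version.split(".")[0]);
}

// Hypothetical helper: true when the given (or running) node.js meets the minimum major version.
function meetsMinimum(minMajor, version = process.versions.node) {
  return nodeMajor(version) >= minMajor;
}

// Assumption: 18 is the minimum major version for Node-RED 4.x.
if (!meetsMinimum(18)) {
  console.warn(`node.js ${process.versions.node} is older than v18; upgrade it before moving to Node-RED 4.x`);
}
```

If the major version does change, that is the point at which the npm rebuild mentioned above is needed.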


Ah yes, thanks for that, I phrased that poorly: I have already updated from 3.0.2 to 3.1.15 (final 3.x.x version) as of 'yesterday' 2026-03-09. This update as I noted earlier did not resolve the issue.

To your latest point, I'll start updating node to v22 locally and push that up as part of the update process to NR 4.x.x

Make sure that you upgrade any contrib nodes that you use to their latest version, otherwise they might not be compatible with the later nodejs.