Dashboard load times > 30 sec with more than 250 connections to website

Found the 'message length' table in Chrome. Thanks for pointing that out.

Those large messages are a result of using the FIFO subflow (FIFO (flow) - Node-RED).
When a message comes in that matches the conditions for that table, it's pushed onto the top of that table and the rest shuffle down.
So yeah, we get a message that is 50 table elements long pushed to every connected browser.
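For anyone skimming, here is a minimal sketch of what that FIFO behaviour amounts to (plain JavaScript with hypothetical row fields; the real subflow keeps its list in Node-RED context storage):

```javascript
// Hypothetical sketch of the FIFO-style table behaviour described above:
// each matching message is pushed onto the front of a fixed-length list,
// and the WHOLE list is then sent to every connected browser.
const MAX_ROWS = 50;

function pushRow(fifo, row) {
    fifo.unshift(row);                        // newest row goes on top
    if (fifo.length > MAX_ROWS) fifo.pop();   // oldest row shuffles off the bottom
    return fifo;                              // the full table is what gets emitted
}

// usage
let table = [];
table = pushRow(table, { ts: 1, text: "first" });
table = pushRow(table, { ts: 2, text: "second" });
console.log(table.length);  // 2
console.log(table[0].ts);   // 2 (newest first)
```

So every single incoming message costs a full 50-row payload per connected browser, which is why the message-length table in Chrome shows such large frames.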

I think I am back to the best summary of the whole situation here:


Spreading the workload on the server side helps the server side. The client side still needs to parse all the data, crap or no crap. I'd still start by sanitizing the output, and then if that's not enough, figure out the next steps.

If there is data in the rows of those 50 items that isn't needed, don't send it to the UI.

Pretty simple really.

As @hotNipi states, sanitise the data you push down the pipe so it is only the data needed for that page, and only when required.
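One way to do that sanitising is a function node that whitelists the fields the table actually renders. The property names below are examples, not the site's real schema:

```javascript
// Hypothetical function-node sketch: keep only the fields the table
// actually displays, dropping everything else before it goes down the pipe.
const KEEP = ["time", "callsign", "speed"];   // example field names

function sanitizeRows(rows) {
    return rows.map(row => {
        const slim = {};
        for (const key of KEEP) {
            if (key in row) slim[key] = row[key];
        }
        return slim;   // large unused properties (e.g. a raw string) are dropped
    });
}

// usage: a row carrying a big unused blob gets trimmed before hitting the UI
const rows = [{ time: "12:00", callsign: "ABC", speed: 10, Raw: "x".repeat(1000) }];
console.log(JSON.stringify(sanitizeRows(rows)[0]));
```

A whitelist is safer than a blacklist here: new properties added upstream later won't silently bloat the websocket traffic again.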

Given the tips and tools you two have given me this morning, I was able to make the following changes:

[screenshot]

Quite the impact; we'll see how tomorrow goes.

The front page was the worst by far. I did not know that the msg.payload was also going out.
It was also updating the whole ui_table on any single element update; no need for that.
I put rate limit and RBE nodes in and that's calmed it waaaaay down.
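For readers unfamiliar with the RBE (report-by-exception) node: it only forwards a message when the payload has actually changed, which is what suppresses all the redundant full-table updates. A rough sketch of that filtering logic:

```javascript
// Sketch of what an RBE (report-by-exception) node does: only pass a
// message through when the payload changed since the last one seen.
function makeRbe() {
    let last;                                 // per-node state, like node context
    return function (payload) {
        const key = JSON.stringify(payload);  // cheap structural comparison
        if (key === last) return null;        // unchanged -> drop the message
        last = key;
        return payload;                       // changed -> forward it
    };
}

// usage
const rbe = makeRbe();
console.log(rbe(5));   // 5    (first value always passes)
console.log(rbe(5));   // null (duplicate suppressed)
console.log(rbe(6));   // 6
```

Combined with a rate-limiting delay node, this caps both how often and how redundantly the table is pushed to browsers.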

A quick review of the site shows no obvious unneeded data being sent, but I will spend the weekend going over it page by page to be sure.

Thanks so very much for all the quick help, guys.
Learning, always learning. So grateful to have the community take the time to teach.


what about that ^

The Raw string property in every row of the msg.payload array does not appear to be used to populate the table - yet it makes up ~80% of the bytes in the payload!
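A quick way to sanity-check that claim is to compare serialized sizes with and without the property. The row shapes here are made up for illustration:

```javascript
// Sketch of the experiment described: strip the unused Raw property
// from every row and compare the serialized byte counts.
function stripRaw(rows) {
    return rows.map(({ Raw, ...rest }) => rest);   // drop Raw, keep the rest
}

// illustrative rows where the Raw string dominates, as in the real payload
const rows = [
    { id: 1, value: 42, Raw: "A".repeat(400) },
    { id: 2, value: 43, Raw: "B".repeat(400) },
];

const before = JSON.stringify(rows).length;
const after  = JSON.stringify(stripRaw(rows)).length;
console.log(before, after);   // the Raw strings account for most of the bytes
```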

Not necessarily completely correct in this case. There is so much data flowing into the Dashboard that it is going to cause issues, I think.

By splitting the Dashboard, that data won't necessarily always be sent to all 200 users.

I'm not in a position to argue about this case. But I eat websocket traffic analysis for breakfast, and sometimes also for lunch, Monday to Friday. And our server crowd hates me. Sort of.


Just back to this one.
I don't think we got an answer on where these could be adjusted...
(I'm still seeing a bit of activity along those lines in the PM2 monit etc., as per the original post screenshot.)

Just out of interest, I did an experiment to see how much benefit would be gained by removing the Raw property on large messages

Very curious, what were the biggest impact changes here? Just sending too much over the wire via msg?

(Slight aside - in the original Dashboard 1 it originally only sent the msg.payload/value to the front end, which restricted traffic quite a lot (no large .req or .res objects etc.) - but then of course people wanted .label and other things, and in the end we sent label, classname, format, colour, units, tooltip and icon... but that was all.)


I saw your point about 'msg.raw' yesterday and removed them all from any table that was not using them.

That's a solid summary, yes.
That and the home page message speed linear gauges were updating faster than necessary.
I did not know that the msg.payload was also being sent out along with the msg.tbl object, so that was the 'too much' aspect; and I did not grasp that the whole table was being sent/updated whenever any one table element was updated.
So I dropped all msg.payloads and msg.raw, and put RBE and rate nodes on any page with a data table that could be slowed down (really only 1 or 2 of them).

This morning's latency graph is AMAZING!

You can see how we caught yesterday's spike while it was happening; it only got up to 3 seconds before I deployed the home page changes.

So far this morning, barely 1 second load time. That is amazing.

[screenshot]

About the usual number of live users as well, so I think we can call this solved.
Hard to know which answer to mark as the solution, but a fantastic outcome.
At this stage, I think we can delay cutting the site up into many Node-REDs.
Hopefully the option to change the base URL and spread the load that way will ship soon.

Thanks again everyone. Really, really appreciate the help.


Still, there are the errors in the browser console. Those happen 10 times on the main page (since there are 10 ui_templates), but the majority of pages have the same issue.

Here's how to get rid of them.

The data structure you are sending for tables is an array of arrays of objects:

[screenshot]

And then the items defined:
[screenshot]

If you get rid of that extra level of array, it works without any errors.

So that msg.tbArgs is an array of objects:

[screenshot]

And the items binding to that array:

[screenshot]

That applies to every ui_template where you have the same kind of data structure.
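To illustrate the difference in plain JavaScript (the property and column names are placeholders):

```javascript
// Sketch of the structure fix described: the template should receive
// an array of row objects, not an array wrapped in another array.
const payload = [
    { col1: "a", col2: 1 },
    { col1: "b", col2: 2 },
];

// Before: an extra array level, so the template had to reach into tbArgs[0]
const msgBefore = { tbArgs: [payload] };

// After: bind the template's items directly to msg?.tbArgs
const msgAfter = { tbArgs: payload };

console.log(Array.isArray(msgBefore.tbArgs[0]));  // true  (array of arrays)
console.log(Array.isArray(msgAfter.tbArgs[0]));   // false (array of objects)
```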


BTW I think @joepavitt has recently created this ticket as a reminder that this kind of stuff needs improvement. I.e. storage and pushing of input messages.

I'm not following here... (did your post get cut short?)
I see you describe the issue twice, but no solution.

I build the array in the msg.payload and move it into the table.
Not sure how to build the table outside of the payload...

If I remove the '[0]' from the end of msg?.tbArgs, nothing shows up.

If you change something on one side, you need to change it on the other side.

The point is that msg.payload is already an array. Then you put msg.payload into msg.tbArgs inside another array.

Ok, sounds like something a programmer needs to look at for me.
I have access to one on Tuesday, thanks for the extra info.

For those following along....

When @hotNipi said that I needed to 'remove' the array, I did not fully follow where it had to be removed from.
The point that I had overlooked was that, at some point while getting the table built, I had copy/pasted some code that already created msg.payload as an array on line 1.

Then I had square brackets around msg.payload on line 16, creating yet another array.

Then still another array in the ui_template node on line 4:

(These screenshots are with all but one array removed - after my programmer untangled the arrays of arrays).

So it was not really a case of removing the array from both sides (which is what I took it to mean):

It was more a case of only having one array, at the source.
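In other words, build the row array once and pass it through unwrapped. A sketch of the untangling (variable names illustrative, not the site's real code):

```javascript
// The source already builds an array of row objects...
const rows = [];                 // "line 1": msg.payload is already an array
rows.push({ name: "row1" });

// ...so don't wrap it again when handing it to the template:
// const msg = { tbArgs: [rows] };   // old: square brackets added a second level
const msg = { tbArgs: rows };        // fixed: exactly one array, at the source

console.log(msg.tbArgs.length);      // 1
console.log(msg.tbArgs[0].name);     // "row1"
```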

On the upside, it seems to have reduced the number of browser errors, as predicted.
(Dash2 really does seem detached from the 'low code' foundation that Node-RED has built).

I had two Tweets last night (PDT) that caused a spike in visitors to the site (after I pushed the array changes). Both visitor surges caused some added latency to the site load time.

But a peak load of ~2 seconds is a VAST improvement over what it used to be.
The morning peaks are also all but eliminated.

It's clear that these improvements have helped, but I can't shake the feeling they have only bought me a bit of time. In some ways, I hope the site does not become any more popular or well known, as I don't think I can support too many more users.
i.e., it's only a matter of time until I have to break the site up into 4-6 Node-RED instances, each running a Dashboard 2.0, somehow merged into one site to support the traffic.

Thanks again for all the help everyone.
I know my use case is way outside the norm for this platform; it's sorta fun to push it a bit harder than expected.


Okay, I'll bite. As always, feature requests are welcome. If you don't feed back to us what you'd like improved, then we can't focus our attention on those areas.

We're always improving and lowering the entry point for users. It's already lower-code than Dashboard 1.0 was, and we continue to lower that with each release, e.g. low-code Dialogs are now supported.