Router pages are huge #228

Open
lemmi opened this issue Dec 31, 2023 · 1 comment

lemmi commented Dec 31, 2023

A single initial page load of a router clocks in at about 3MB of data in total. Through compression of the main page, only 1.5MB are actually transferred over the wire. Each reload after that comes in at about 1MB.

There is some low-hanging fruit that can be picked for easy improvements:

precompress assets

The assets are currently transferred uncompressed. Serving them with a compressed content encoding would yield at least a 0.5MB improvement on the initial page load:

Encoding       Transferred size
Uncompressed   ~750KB
gzip           ~200KB
brotli         ~170KB
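
A small sketch of how the assets could be precompressed ahead of time so the webserver can serve the .gz/.br siblings directly (for example via nginx's gzip_static and a brotli_static module); the static/ directory, the file types, and the third-party brotli module are assumptions:

import gzip
import pathlib

try:
    import brotli  # third-party package, optional
except ImportError:
    brotli = None

ASSET_SUFFIXES = {'.js', '.css', '.svg', '.json', '.html'}

for path in pathlib.Path('static').rglob('*'):
    if not path.is_file() or path.suffix not in ASSET_SUFFIXES:
        continue
    data = path.read_bytes()
    # gzip: universally supported, roughly the ~200KB case from the table
    path.with_name(path.name + '.gz').write_bytes(gzip.compress(data, compresslevel=9))
    # brotli: a bit smaller still, roughly the ~170KB case
    if brotli is not None:
        path.with_name(path.name + '.br').write_bytes(brotli.compress(data, quality=11))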

use a more appropriate data format for /api/load_netif_stats/

On each page load, at least one call to /api/load_netif_stats/ is made. A single request weighs in at 500kB-600kB. This can easily be brought down by providing another api endpoint that serves a different format.
A fitting choice could be Apache Parquet. It is well supported in multiple languages, especially Python and JS.
Just using the included delta encoding brings the size down to 60kB. Additionally enabling compression improves this further to 50kB, at the cost of more overhead.
Integration should be very easy. Here is the small test program I used to compare the sizes:

from pyarrow import json
import pyarrow.parquet as pq

# Read the existing JSON stats dump and rewrite it as Parquet, using only the
# built-in delta encoding (no dictionary encoding, no compression codec).
table = json.read_json('br-client.json')
pq.write_table(table, 'br-client.parquet', use_dictionary=False,
               compression='NONE', column_encoding='DELTA_BINARY_PACKED')
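
For the compressed variant mentioned above (~50kB instead of ~60kB), the same script with a general-purpose codec on top of the delta encoding might look like this; the choice of ZSTD is an assumption, the issue does not name a specific codec:

from pyarrow import json
import pyarrow.parquet as pq

# Same delta encoding as before, plus a compression codec on each column chunk.
# ZSTD is only an example; GZIP or BROTLI would work as well.
table = json.read_json('br-client.json')
pq.write_table(table, 'br-client-compressed.parquet', use_dictionary=False,
               compression='ZSTD', column_encoding='DELTA_BINARY_PACKED')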

split router stats into api

The delivered html embeds a huge portion of the stats inline as javascript variables here. This is problematic for several reasons (a sketch of a separate endpoint follows the list):

  • The resulting file that needs to be generated is on the order of 1.7MB. Compression at least brings that down to roughly 250kB transferred.
  • It makes caching of the data or the page almost impossible.
  • It forces the use of an inefficient format. Using parquet yields a size of about 90kB; additional compression can further improve this to 75kB.
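
A minimal sketch of such a separate stats endpoint, assuming a Flask-style handler and one pre-written Parquet file per router; the framework, route name, and stats/ directory are placeholders, not the project's actual code:

from pathlib import Path

from flask import Flask, Response, abort

app = Flask(__name__)

@app.route('/api/router_stats/<router_id>')
def router_stats(router_id: str) -> Response:
    # One Parquet file per router, written by the exporter (see the pyarrow
    # script above). Served separately from the page, it can be cached and
    # compressed independently of the HTML.
    path = Path('stats') / f'{router_id}.parquet'
    if not path.is_file():
        abort(404)
    return Response(path.read_bytes(), mimetype='application/octet-stream')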

Once these or similar changes are made (another file format might be even better suited, for example), there is another option to vastly reduce server load and transfer sizes.

caching

Historic data will not change, so there is no reason to keep resending everything. Instead, caching should be used very deliberately.

A simple scheme to achieve this could be the following:

  • Instead of always dynamically delivering all statistics, only the most recent data should be generated dynamically.
  • At regular intervals, the statistics could be rendered out once for a given time interval and served statically by the webserver. This could be done hourly, daily, weekly, monthly, or as a combination of those by aggregating 24 hours into a single day file, or several days into a week or month file.

More concretely: Rather than performing a single request to /api/load_netif_stats/XYZ, the client should instead make multiple requests:

/api/load_netif_stats/XYZ                   # still dynamically generated, but only up to the last hour
/api/load_netif_stats/2006-01-02T04:00-XYZ  # contains all data from the second of january 2006 from 4:00 to 5:00
/api/load_netif_stats/2006-01-02T03:00-XYZ  # same but one hour earlier
/api/load_netif_stats/2006-01-02T02:00-XYZ
/api/load_netif_stats/2006-01-02T01:00-XYZ
/api/load_netif_stats/2006-01-02T00:00-XYZ
/api/load_netif_stats/2006-01-01-XYZ        # contains all data for the first of january 2006
/api/load_netif_stats/2005-12-XYZ           # contains all data for the month december of 2005

Everything except the first request can be heavily cached, potentially forever, on the client. The server also only ever needs to provide data for recent events dynamically and can then generate historic data once.
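
A rough sketch of the server side of this scheme, assuming an hourly cron job that renders the last completed hour into a static file following the naming above; fetch_stats(), the output directory, and the serialization are placeholders:

from datetime import datetime, timedelta, timezone
from pathlib import Path

OUTPUT_DIR = Path('/var/www/api/load_netif_stats')

def fetch_stats(router_id: str, start: datetime, end: datetime) -> bytes:
    # Placeholder: query the stats backend for [start, end) and serialize the
    # slice, e.g. as Parquet like in the script above.
    raise NotImplementedError

def render_last_hour(router_id: str) -> None:
    # Run once per hour from cron. The resulting file never changes again, so
    # the webserver can serve it with a long-lived Cache-Control (immutable).
    end = datetime.now(timezone.utc).replace(minute=0, second=0, microsecond=0)
    start = end - timedelta(hours=1)
    out = OUTPUT_DIR / f'{start:%Y-%m-%dT%H:%M}-{router_id}'
    out.write_bytes(fetch_stats(router_id, start, end))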

With this, a page reload should amount to as little as 60kB uncompressed, or 7kB (!) compressed, for the html, plus an additional request for the most recent historic data, which should be on the order of a couple of hundred bytes to a few kilobytes.

adschm (Member) commented Feb 3, 2024

On a quick look, most of these ideas are valid.

I do not consider the page size as dramatic as lemmi states, but that does not mean we shouldn't pick some of the low-hanging fruit.

Still, somebody will have to invest considerable time in it, and due to the design of the monitoring even "easy" changes might not be so quick after all.
