-
Notifications
You must be signed in to change notification settings - Fork 55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal: Add detailed usage breakdown in estimate() #63
Comments
Would this make it easier to determine the size of opaque responses? This would also benefit a lot from doing #18. |
I don't believe so; an implementation would need to ensure the padded size was included in whatever storage type was holding them.
Yep. |
Just to get it out of my head, the IDL would be: dictionary StorageEstimate {
unsigned long long usage;
unsigned long long quota;
record<DOMString, unsigned long long> details;
}; (Unless we invented partial enums when I wasn't looking) |
It seems we could just have a dictionary there as well. And there's partial dictionaries if we wanted to avoid a central registry. |
Hey @annevk - I work with @inexorabletash on Chrome. We have google properties that are very interested in this, we're ready to move forward with implementation. Here's our current thinking around IDL: dictionary StorageEstimate {
unsigned long long usage;
unsigned long long quota;
StorageUsageDetails usageDetails;
};
dictionary StorageUsageDetails {
unsigned long long indexedDB;
unsigned long long caches;
unsigned long long serviceWorkerRegistrations;
unsigned long long other;
}; Would you be open to a PR for the spec? We're hoping to be able to land this experimentally in Chrome soon with tentative web platform tests. Outstanding questions:
Is there anything else we should resolve before moving forward with an experimental implementation in Chrome? Have we heard about any interest in the overall API from Microsoft or Apple (i.e. are there contacts we should cc?) Please note, Chrome intends on reporting |
@aliams and @hober might have thoughts for Microsoft and Apple, respectively. @asutherland should maybe have a second look for Mozilla. My main concern with this is opaque responses. I'm worried this'll make it easier to determine their size. [@inexorabletash did mention above that he thinks it's unlikely this'll worsen.] (I was actually wondering if we could make the origin opt-in to requiring CORS and only then give them estimates.) Other than that, I guess I'd opt to exclude I'm also not super happy with Chrome planning to include obsoleted/proprietary APIs. That could easily make web compatibility issues worse. |
@jarryd999 Can you further elaborate on how the Google properties intend to use this extra data? Like, is the idea that Cache storage will always be treated as evictable and IndexedDB always as canonical, so Cache can be ignored for storage pressure reasons and only IndexedDB matters? Or is this desired for telemetry so that engineers can say "holy cow, we've got a ton of WebSQL storage still used, we'd better push additional telemetry gathering for WebSQL/push specific cleanup logic/send clear-site-data to anyone reporting that usage and start from scratch", etc. ? I ask because one of the issues that came up a few times at the 2018 TPAC was the idea that eviction is still problematic because there's still only a single storage "bucket" and things like the relationship between a ServiceWorker registration and the storages it used exists only in the mind of the developers who wrote the code so we continue to be faced with all-or-nothing eviction. It seems like both browsers and the various teams on a site would benefit if we implemented multiple buckets and our estimate included usage by bucket. Obviously, more buckets is not a simple change since it would have to be plumbed through all existing APIs and there's still the issue of pre-existing data. So the other question in my mind is how are these top-level summaries more useful than returning size-estimates at a more detailed granularity? For example, in Firefox we have a memory reporting infrastructure "about:memory" (and the underlying API) that provides a hierarchical summation of our memory usage. I could see it being very useful to have entries for "indexedDB/foo" and perhaps "indexedDB/foo/blobs", "indexedDB/foo/key-values", and "indexedDB/foo/indexes". Perhaps even "indexedDB/foo/key-values/large-values" and "indexedDB/foo/key-values/small-values", etc. The idea there would be that we could spec some explicit names for the top-level of the hierarchy and immediate children, but that browsers could go into more implementation-specific detail if they wanted and it was requested. I imagine we'd limit that level of detail to our own devtools initially, stopping at "indexedDB/DBNAME" granularity for content. |
It seems ideal to cordon off nonstandard or implementation-specific keys into a separate method or substructure, with a name like implementationDefined, to at least slightly reduce the chance of developers taking a dependency on those values and writing non-portable code. |
We've received feedback that developers would like a breakdown as well. However, when asked about why they would want to know the breakdown specifically, it seems that the use case is to determine if something would fit into the client-side storage before making an attempt to fetch/create the data. @inexorabletash / @jarryd999 are there other use cases that you can share? |
One use case we have, is to be able to assess how much space we're taking for caching html/js/css resources vs domain model data content. Both can be variable, so it's hard to guess how much space is left for data content for a particular user. Also, it would help us find bugs in the browser, for example if one storage subsystem is not cleaning up resources correctly on disk eg: crbug.com/782869. |
@annevk - This ambiguity around the size of opaque responses should remain intact given that we include the padding in each entry in Cache Storage, and thus the @annevk @domenic - Regarding obsoleted/proprietary storage backends, we would like to propose an alternative: The @asutherland @aliams - The primary goal of this change is to aid in debugging. For example, one can consider an email client that uses IndexedDB to store text and Cache Storage to store attachments. With the proposed change, said app would be able to debug problematic storage scenarios: high Cache Storage usage but low IndexedDB usage would suggest the app forgot to delete attachments when evicting messages from the local cache. If there was high usage for both storage systems, this would mean the app is caching too many messages, suggesting the eviction policy is not behaving correctly. While we’d like to address the all-or-nothing eviction issue, any changes around the single temporary bucket are out of scope for this particular proposal. @ralphch0 Thank you for providing your use case! We’ll be sure to keep this in mind as we move forward. |
My concern about this change is that while it's trivial to implement, it seems biased towards consumers who have exclusive control over the data stored on their origin and use a data storage regime that very cleanly uses different storage APIs for mutually exclusive purposes, like gmail. If you store everything in IndexedDB or have other JS code-bases on your origin, it doesn't seem useful. Which suggests we will eventually want something more sophisticated, but then we still have the specification and developer documentation burden of this enhancement. |
@asutherland - We do expect that, in the future, consumers will have more buckets than the default bucket. We see the proposed change as a complement to said future rather than a redundancy. Each bucket will still have access to all the storage backends, so we will still want an API that, given a bucket, will return the per system usage details. Today’s apps are storing everything in the default bucket, so they can still make use of a per system breakdown of usage. While you are correct in saying that this change is not very useful for apps that only use IndexedDB, we believe that there are enough people who find this change useful that we would like to develop a solution for this use case. |
@jarryd999 If the plan going forward is more buckets, I agree that this approach makes sense and works. I also like the idea of formalizing the values via explicit dictionary keys with the zero values not being present and therefore not exposed to JS. As an aside, Firefox is moving how we store localStorage and so we would plan to expose it via this mechanism. Also, our underlying storage for ServiceWorkers is via Cache API in a separate namespace, but it makes sense to break that out separately from content's use of Cache API so we would do that too. |
@asutherland - re:
What's your timeframe for that, and do you have docs? (Interested in questions like: will there still be a localStorage-specific limit, etc) |
Based on the comments above, I would like to recap what we're thinking right now. dictionary StorageEstimate {
unsigned long long usage;
unsigned long long quota;
StorageUsageDetails usageDetails;
};
dictionary StorageUsageDetails {
unsigned long long indexedDB;
unsigned long long caches;
unsigned long long serviceWorkerRegistrations;
}; The Note: Chrome also intends to ship with an Thank you to @aliams @annevk @asutherland @domenic @inexorabletash @ralphch0 for your feedback. |
We've had a frequent request from users of
estimate()
to provide an per storage type breakdown estimation. For example, something like:This stems from web apps that have supported for years and/or where multiple apps are present within an origin makes reasoning about what is using up storage difficult. Also exacerbating the problem:
Lots of details to figure out, but:
In Chrome we have this plumbed through internally for our DevTools implementation:
Thoughts?
The text was updated successfully, but these errors were encountered: