It is clear that we want to avoid pyactivestorage performing a file open inside each dask chunk when that open triggers a remote index read every single time.
A quick hack to fix this (in the pyfive branch) would be to avoid keeping the File instance open (the optimal_kerchunk branch already does this). With that one change, users could at least reuse active storage instances many times without worrying about the file-open count; see the sketch below.
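A hedged sketch of that pattern, not the real pyactivestorage code: the `Active` class name, the `min` method, and the assumption that this pyfive build accepts a file-like object (as the pyfive branch does) are all illustrative. The point is simply that the instance stores only the URI and opens the File for the duration of each operation instead of holding it open for the object's lifetime.

```python
# Illustrative sketch only: Active, min() and the storage_options handling are
# placeholders, not pyactivestorage's real API.
from contextlib import contextmanager

import pyfive
import s3fs


class Active:
    def __init__(self, uri, ncvar, storage_options=None):
        # Keep only lightweight state; no open File handle is cached here.
        self.uri = uri
        self.ncvar = ncvar
        self.storage_options = storage_options or {}

    @contextmanager
    def _open(self):
        # Open lazily per operation and close again when the operation ends.
        fs = s3fs.S3FileSystem(**self.storage_options)
        with fs.open(self.uri, "rb") as fh:
            # Assumes this pyfive build can read from a file-like object.
            yield pyfive.File(fh)

    def min(self):
        # Each reduction opens, reads what it needs, and releases the file,
        # so a long-lived Active instance never pins an open remote handle.
        with self._open() as ds:
            return ds[self.ncvar][...].min()
```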
A better long-term solution may involve lifting the internal s3fs instance outside the per-chunk code so we can take advantage of s3fs's caching, as sketched below.
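A rough sketch of that idea, under stated assumptions: the endpoint URL, bucket path, and `reduce_chunk` body are placeholders rather than pyactivestorage's real interface. The filesystem is constructed once outside the dask tasks; fsspec caches filesystem instances keyed by their constructor arguments, so repeated opens reuse the same object and its metadata caches instead of re-reading the remote index per chunk.

```python
# Hypothetical sketch: endpoint, bucket and the reduction are stand-ins.
import dask
import s3fs

# Construct the filesystem once, outside the per-chunk code. fsspec caches
# instances with identical kwargs, so tasks reuse this object and its caches.
fs = s3fs.S3FileSystem(
    anon=True,
    client_kwargs={"endpoint_url": "https://example-object-store"},  # placeholder
)


@dask.delayed
def reduce_chunk(path, offset, size):
    # Per-chunk work only opens a byte range through the shared filesystem;
    # readahead block caching avoids repeated small index reads.
    with fs.open(path, "rb", cache_type="readahead") as f:
        f.seek(offset)
        return len(f.read(size))  # stand-in for the real chunk reduction


tasks = [reduce_chunk("my-bucket/data.nc", i * 1024, 1024) for i in range(4)]
results = dask.compute(*tasks)
```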