Error handling with bucket access issues #600

Open
dmpetrov opened this issue Nov 14, 2024 · 0 comments
Labels
bug Something isn't working priority-p1

Comments

@dmpetrov
Member

Description

I got a massive stack trace when a single line would have been enough, such as:

[email protected] does not have storage.objects.list access to the Google Cloud Storage bucket. Permission 'storage.objects.list' denied on resource (or it may not exist).

DataChain (DC) should handle these types of errors gracefully and report them without the full traceback.

The stack trace:

Using cached virtualenv
Listing gs://mpii-human-pose: 0 objects [00:00, ? objects/s]
Processed: 1 rows [00:00, 11.23 rows/s] [00:00, ? objects/s]
Traceback (most recent call last):
  File "<string>", line 3, in <module>
  File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/datachain/lib/dc.py", line 474, in from_storage
    .save(list_ds_name, listing=True)
     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/datachain/lib/dc.py", line 764, in save
    query=self._query.save(
          ^^^^^^^^^^^^^^^^^
  File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/datachain/query/dataset.py", line 1579, in save
    query = self.apply_steps()
            ^^^^^^^^^^^^^^^^^^
  File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/datachain/query/dataset.py", line 1128, in apply_steps
    result = step.apply(
             ^^^^^^^^^^^
  File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/datachain/query/dataset.py", line 572, in apply
    self.populate_udf_table(udf_table, query)
  File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/datachain/query/dataset.py", line 492, in populate_udf_table
    process_udf_outputs(
  File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/datachain/query/dataset.py", line 335, in process_udf_outputs
    for row in udf_output:
               ^^^^^^^^^^
  File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/datachain/lib/udf.py", line 369, in <genexpr>
    output = (dict(zip(self.signal_names, row)) for row in udf_outputs)
                                                           ^^^^^^^^^^^
  File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/datachain/lib/udf.py", line 368, in <genexpr>
    udf_outputs = (self._flatten_row(row) for row in result_objs)
                                                     ^^^^^^^^^^^
  File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/datachain/lib/listing.py", line 35, in list_func
    for entries in iter_over_async(client.scandir(path.rstrip("/")), get_loop()):
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/datachain/asyn.py", line 238, in iter_over_async
    done, obj = asyncio.run_coroutine_threadsafe(get_next(), loop).result()
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/concurrent/futures/_base.py", line 456, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/datachain/asyn.py", line 231, in get_next
    obj = await ait.__anext__()
          ^^^^^^^^^^^^^^^^^^^^^
  File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/datachain/client/fsspec.py", line 225, in scandir
    await main_task
  File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/datachain/client/gcs.py", line 57, in _fetch_flat
    await self._get_pages(prefix, page_queue)
  File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/datachain/client/gcs.py", line 91, in _get_pages
    page = await self.fs._call(
           ^^^^^^^^^^^^^^^^^^^^
  File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/gcsfs/core.py", line 447, in _call
    status, headers, info, contents = await self._request(
                                      ^^^^^^^^^^^^^^^^^^^^
  File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/decorator.py", line 221, in fun
    return await caller(func, *(extras + args), **kw)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/gcsfs/retry.py", line 130, in retry_request
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/gcsfs/core.py", line 440, in _request
    validate_response(status, contents, path, args)
  File "/tmp/local/datachain_venv/python3.12/default/lib/python3.12/site-packages/gcsfs/retry.py", line 111, in validate_response
    raise OSError(f"Forbidden: {path}\n{msg}")
OSError: Forbidden: b/mpii-human-pose/o
[email protected] does not have storage.objects.list access to the Google Cloud Storage bucket. Permission 'storage.objects.list' denied on resource (or it may not exist).
Query script exited with error code 1
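
One possible shape for the requested behavior, as a minimal sketch rather than DataChain's actual implementation: the trace above ends in an `OSError` whose message is `Forbidden: <path>` followed by the explanation on a second line, so a wrapper could keep only that last line and re-raise a short, user-facing error. The names `StorageAccessError`, `summarize_os_error`, `list_bucket`, and `fake_lister` are hypothetical and do not exist in DataChain or gcsfs; `fake_lister` only imitates the message shape seen in the trace.

```python
import sys


class StorageAccessError(Exception):
    """Hypothetical user-facing error for bucket access problems."""


def summarize_os_error(exc: OSError) -> str:
    # The trace shows a two-line message: "Forbidden: <path>" and then the
    # explanation. Keep only the last non-empty line for the user.
    lines = [line.strip() for line in str(exc).splitlines() if line.strip()]
    return lines[-1] if lines else exc.__class__.__name__


def list_bucket(uri, lister):
    """Call lister(uri) and convert OSError into a one-line error."""
    try:
        return lister(uri)
    except OSError as exc:
        # "from None" suppresses the chained low-level traceback.
        message = f"Cannot list {uri}: {summarize_os_error(exc)}"
        raise StorageAccessError(message) from None


def fake_lister(uri):
    # Stand-in for the real listing call (assumption, for illustration only).
    raise OSError(
        "Forbidden: b/mpii-human-pose/o\n"
        "Permission 'storage.objects.list' denied on resource "
        "(or it may not exist)."
    )


def main() -> int:
    try:
        list_bucket("gs://mpii-human-pose", fake_lister)
    except StorageAccessError as exc:
        # One line on stderr and a non-zero exit code, no traceback.
        print(f"Error: {exc}", file=sys.stderr)
        return 1
    return 0
```

Catching `OSError` this broadly would also cover unrelated failures such as missing files, so a real fix would probably narrow the check (for example, to the "Forbidden" case) and keep the full traceback available behind a debug flag.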

Version Info


@dmpetrov dmpetrov added bug Something isn't working priority-p1 labels Nov 14, 2024