Kaggle cryo #1947

Open
wants to merge 8 commits into base: main

ChristofHenkel commented Feb 24, 2025

Description

Added a tutorial based on the 1st place solution of Kaggle's Cryo-ET competition.

ChristofHenkel and others added 8 commits February 24, 2025 15:45
DCO Remediation Commit for christofhenkel <[email protected]>

I, christofhenkel <[email protected]>, hereby add my Signed-off-by to this commit: 1db03fd
I, christofhenkel <[email protected]>, hereby add my Signed-off-by to this commit: 2ffe23d
I, christofhenkel <[email protected]>, hereby add my Signed-off-by to this commit: df958e7

Signed-off-by: christofhenkel <[email protected]>

Signed-off-by: ChristofHenkel <[email protected]>
Signed-off-by: ChristofHenkel <[email protected]>
Signed-off-by: ChristofHenkel <[email protected]>
Signed-off-by: ChristofHenkel <[email protected]>
KumoLiu (Contributor) commented Feb 28, 2025

cc @virginiafdez @garciadias

garciadias left a comment

Hi @ChristofHenkel,

Thank you for your contribution, and congratulations on your great achievement.

I am trying to follow the instructions in your README.md file to run the code myself, but I have run into some issues.

I have left some comments below. Please let me know if I am on the right track.

Many thanks,

Rafael

This tutorial is built upon the official Cryo-ET competition data. It can be downloaded directly from Kaggle: https://www.kaggle.com/competitions/czii-cryo-et-object-identification/data

Alternatively, it can be downloaded using the Kaggle API (which can be installed via ```pip install kaggle```)

If you decide to use the Kaggle API you need to create a Kaggle account and configure your token as described here.
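As a rough illustration of the token setup mentioned above: the Kaggle CLI conventionally reads credentials from `~/.kaggle/kaggle.json` (or from a directory named by `KAGGLE_CONFIG_DIR`) and expects the file not to be readable by other users. The sketch below only checks that classic layout; the helper name `check_kaggle_token` is hypothetical and not part of this PR or of the Kaggle package.

```python
import json
import os
import stat
from pathlib import Path


def check_kaggle_token(config_dir=None):
    """Return a list of problems with the Kaggle token file (empty if none found)."""
    # Assumption: classic layout <config_dir>/kaggle.json, defaulting to ~/.kaggle.
    default_dir = os.environ.get("KAGGLE_CONFIG_DIR", Path.home() / ".kaggle")
    token_path = Path(config_dir or default_dir) / "kaggle.json"
    if not token_path.is_file():
        return [f"{token_path} not found"]
    try:
        token = json.loads(token_path.read_text())
    except json.JSONDecodeError:
        return [f"{token_path} is not valid JSON"]
    if not isinstance(token, dict):
        return [f"{token_path} does not contain a JSON object"]
    problems = []
    for field in ("username", "key"):
        if not token.get(field):
            problems.append(f"missing '{field}' field")
    # On POSIX systems, flag any group/other permission bits on the token file.
    if os.name == "posix" and stat.S_IMODE(token_path.stat().st_mode) & 0o077:
        problems.append("file is accessible by other users; consider chmod 600")
    return problems


if __name__ == "__main__":
    for problem in check_kaggle_token():
        print("Kaggle token problem:", problem)
```

Running the script prints nothing when the token file looks usable under these assumptions.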

You will also need to open the competition URL and click "join competition" to accept the terms and conditions. Only then will you be allowed to download the data with the following command:

def load_one(self, experiment_id):

    img_fp = f"{self.data_folder}{experiment_id}"
    try:

This try statement will defer the error to line 88, where img will not be defined. Consider replacing the print statement with a raise or instantiating img on the exception.
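A minimal sketch of the first option (raising instead of printing) could look like the following. It is not the PR's code: `load_or_raise` and its `open_fn` parameter are hypothetical stand-ins for the loading call inside `load_one`, chosen so that the pattern can be shown without the dataset.

```python
def load_or_raise(open_fn, img_fp):
    """Load a volume via open_fn(img_fp), failing loudly if loading does not work."""
    try:
        img = open_fn(img_fp)
    except Exception as exc:
        # Re-raise with the offending path attached instead of printing and
        # continuing, so `img` is never referenced while still undefined.
        raise RuntimeError(f"could not load volume from {img_fp!r}") from exc
    return img


if __name__ == "__main__":
    def broken_open(path):
        # Simulates the failure mode discussed in this thread.
        raise KeyError(0)

    try:
        load_or_raise(broken_open, "some/experiment")
    except RuntimeError as err:
        print(err, "| caused by:", repr(err.__cause__))
```

Chaining with `from exc` keeps the original `KeyError` in the traceback, which makes the root cause easier to see than a printed message followed by a `NameError`.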

While debugging the code, I found that it was failing with this error:

  File "/workspace/data/ds_1.py", line 83, in load_one
    img = np.array(zarr.open(img_fp + '/VoxelSpacing10.000/denoised.zarr')[0]).transpose(2,1,0)
  File "/usr/local/lib/python3.10/dist-packages/zarr/hierarchy.py", line 511, in __getitem__
    raise KeyError(item)

I confirmed that the img_fp + '/VoxelSpacing10.000/denoised.zarr' path exists, but zf is empty:

> zf
array([], dtype=float64)

I may have downloaded the data incorrectly. @ChristofHenkel, could you please take a look at this?

The contents of my folder are these:

$ ls /data/train/static/ExperimentRuns/TS_86_3/TS_86_3/VoxelSpacing10.000/denoised.zarr/.zgroup -la
-rw-r--r-- 1 root root 24 Mar  7 17:06 /data/train/static/ExperimentRuns/TS_86_3/TS_86_3/VoxelSpacing10.000/denoised.zarr/.zgroup
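One way to check whether a store like this actually contains arrays is to look for `.zarray` metadata files, which the Zarr v2 directory layout places in every array directory (groups carry `.zgroup` instead). The sketch below uses only the standard library; `list_zarr_arrays` is a hypothetical helper, and it assumes a plain on-disk Zarr v2 directory store.

```python
import os


def list_zarr_arrays(group_dir):
    """Return relative paths of directories under group_dir that hold a Zarr v2 array."""
    found = []
    for root, dirs, files in os.walk(group_dir):
        if ".zarray" in files:
            found.append(os.path.relpath(root, group_dir))
            # Anything below an array directory is chunk data, so stop descending.
            dirs[:] = []
    return sorted(found)


if __name__ == "__main__":
    import sys

    store = sys.argv[1] if len(sys.argv) > 1 else "."
    arrays = list_zarr_arrays(store)
    print(f"{len(arrays)} array(s) found in {store}: {arrays}")
```

An empty result for `denoised.zarr` would be consistent with the `KeyError` above and might point to an incomplete download or extraction, though that is only a guess from the listing shown here.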


Alternatively, it can be downloaded using the Kaggle API (which can be installed via ```pip install kaggle```)

```kaggle competitions download -c czii-cryo-et-object-identification```

If you do this inside the Docker pod, the data will be ephemeral and will be deleted when the pod goes down. I suggest adding an option to mount the data in the correct location and instructing people to download the data outside the pod.
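To make the mounting suggestion concrete, here is a small sketch that builds a `docker run` command with a bind mount (`-v host_path:container_path`), so that data downloaded on the host stays available after the container stops. The image name and the `/data` mount point are assumptions for illustration only; the README would need to name the real ones.

```python
import shlex
from pathlib import Path


def docker_run_command(host_data_dir, image, container_data_dir="/data"):
    """Build a `docker run` argument list that bind-mounts the host data folder."""
    # Docker bind mounts need an absolute host path.
    host = Path(host_data_dir).expanduser().resolve()
    return ["docker", "run", "--rm", "-v", f"{host}:{container_data_dir}", image]


if __name__ == "__main__":
    # "example-image:latest" is a placeholder, not an image published by this PR.
    cmd = docker_run_command("~/datasets/cryo-et", "example-image:latest")
    print(shlex.join(cmd))
```

The script only prints the command; running it is left to the user once the real image name and mount point are known.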


```kaggle competitions download -c czii-cryo-et-object-identification```

and adjust the path to it in ```configs/common_config.py``` via ```cfg.data_folder```.

If you mount the folder in the correct location, you don't need to adjust this, so this line can be removed.


3 participants