Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix incomplete rows in Feature extraction #51

Closed
enricotagliavini opened this issue Mar 30, 2023 · 7 comments · Fixed by #57
Closed

Fix incomplete rows in Feature extraction #51

enricotagliavini opened this issue Mar 30, 2023 · 7 comments · Fixed by #57
Assignees
Labels
bug Something isn't working

Comments

@enricotagliavini
Copy link
Contributor

See #49 (comment)

TL;DR with reference to

except Exception as e:
print_exc()
common_row.update(row)
row_list.append(common_row)

We append the row also if there was an Exception during feature extraction, but the row is incomplete. Add an else statement.

@enricotagliavini enricotagliavini self-assigned this Mar 30, 2023
@enricotagliavini
Copy link
Contributor Author

Question for Nicole: should be just skip the row entirely or try to add Nan in case of ValueError or Joel suggested in #49 (comment) ?

@jluethi
Copy link
Contributor

jluethi commented Mar 30, 2023

Or can we use the same wrapper functions for measurements that I'm writing for the Fractal task wrapper? See #49 (comment)

@enricotagliavini enricotagliavini added the bug Something isn't working label Mar 30, 2023
@enricotagliavini
Copy link
Contributor Author

Your implementation looks fairly similar, the output format is not the same, but should be an easy conversion.

However I'm not sure all the features of the original code are present in the fractal version. For example the FAIM_HCS tags (well_id, path to the experiment, etc.), I can't see anything similar to abs_min . Those could be added via the extra_values parameter, and if they need computation it should be done before (for abs_min this was already the case), is that correct?

disconnected has been renamed to disconnected_components if I understand correctly. While I can guess the reason, it breaks the format, which can be a bit annoying. If Nicole is ok with that, and we promise to help her to get her existing code to work with both the old and new format (not a big issue I believe), maybe we can do it.

I might have missed stuff with the first pass, if I see anything else I'll post another comment.

Thank you.
Kind regards.

@jluethi
Copy link
Contributor

jluethi commented Mar 31, 2023

As always, great analysis from your side @enricotagliavini :) Those are indeed also the points I was aware of that we'd have to tackle.

Input & output format is not exactly the same, because I think it makes sense to abstract a few more calls (the regionprops call & the dataframe conversion) into the wrapper function. The main reason is that we could then add additional measurements to this same wrapper call that don't rely on regionprops if we want to do so going forward.

However I'm not sure all the features of the original code are present in the fractal version. For example the FAIM_HCS tags (well_id, path to the experiment, etc.), I can't see anything similar to abs_min . Those could be added via the extra_values parameter, and if they need computation it should be done before (for abs_min this was already the case), is that correct?

Correct. Things like FAIM_HCS tags are something that is specific to using FAIM HCS 0.1.1, thus not general enough to be used through Fractal. But there will always be some information that needs to be passed through, thus I added the extra_values parameter. It is designed to handle such additional metadata.
abs_min is also something that is specific to Drogon measurements and would make sense to add like this.
If we then merge the two output dataframes, they should match what was there before.

disconnected has been renamed to disconnected_components if I understand correctly. While I can guess the reason, it breaks the format, which can be a bit annoying. If Nicole is ok with that, and we promise to help her to get her existing code to work with both the old and new format (not a big issue I believe), maybe we can do it.

Happy to discuss this and we can also revert back to disconnected if there are strong opinions on this. disconnected_components seems more explicit to me as a feature name.
We should also discuss if we want to rename aspectRatio to aspectRatio_equivalentDiameter or something similar (see discussions at the lab meeting).


On all of those points, lets sit together the 3 of us, e.g. at some point next week? Then we can discuss if this approach makes sense and what decisions we want to take on the few renaming discussions :)

@jluethi
Copy link
Contributor

jluethi commented Mar 31, 2023

Also, another discussion point could be whether you want to give users more control about whether they want to measure morphology or not in the prefect flows. The new wrapper function would make this an easy thing to set :)

@enricotagliavini
Copy link
Contributor Author

For the record: @nrepina fixed this and she will make a pull request soon. She used the wrapper function used by fractal. We had to make some changes (e.g python 3.8 support and slight tweaking to extend it), but it should not be an API break, so the fractal part should, hopefully, still work as before.

@jluethi
Copy link
Contributor

jluethi commented Apr 5, 2023

Sounds great. If there are changes to make it more generally usable that break my API, happy to adapt as well :) The overall goal for me is get to a point where the wrapper function is well usable by both parts. Great that we're moving to this part and looking forward to the PR @nrepina !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
3 participants