-
Notifications
You must be signed in to change notification settings - Fork 101
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add a class that allows storing locations of errors #46
base: master
Are you sure you want to change the base?
Conversation
moenigin
commented
Aug 15, 2023
- add some typing and docstring
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding the new class! I left a bunch of comments, mostly regarding code style and formatting issues.
Could you please squash all the commits within the PR into a single one, and also run pyink on the code, using the settings from https://github.com/google/ffn/blob/master/ffn/pyproject.toml ? |
implements requested changes add a class that allows storing locations of errors - add some typing and docstring Update proofreading.py pass actionstate to store_error_location with different mode input directly & remove intermedeiary functions
f787cce
to
54a34dd
Compare
I hope what I did is what you were aiming for |
Thanks! Unfortunately, it looks like it formatted everything with 4 spaces. One problem might have been that our pyproject.toml got accidentally moved to an incorrect location, so maybe that's why the tool didn't pick up the config. This is now fixed. Could you try syncing the repo and running pyink again? (or manually specify the settings that we use in the config file). |
reformat with pyink from toml
correct remaining indent of 4 in docstring
sorry for not checking this properly and all the hassle this created (was to happy about seeing pyink printing "all done" and trusted it did work according to toml). I noticed that the docstrings are not corrected to 2 indents by pyink. I did this manually now, but maybe pyink can do this on its own... |
objects: A list of objects. | ||
bad: A list of bad objects or markers. | ||
seg_error_coordinates: A dictionary of error coordinates. | ||
load_annotations: A flag to indicate if annotations should be loaded. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's a use case where you would provide seg_error_coordinates, but set load_annotations to False?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A combination of laziness and generak flaws of the tool as is now. 1. If I pick up the review after a break I don't want to patch the final list of coordinates from different review session. 2. The annotation layer shows all annotations not only for the segment you are currently viewing. If in the last review session you already verfied that all locations you flagged in that session are correct, you may want to visualize only the novel locations.
Hope I am kind of making this clear here...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, so is the idea that when load_annotations == False, the seg_error_coordinates are just stored (and appended to as you annotate more), but not actually displayed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that was my idea behind this because R. had already annotated >100 splits by the time we started this. Large numbers of annotations could be confusing. So if they are not needed because they are all verified from a previous revision round one can leave them out this way...
(Alternatively one could link the annotation to the errorneous segment of the todo list (requiring batch size to be one). By overriding update_segments one could only display the locations associated with a given batch of segments from the todo list. To view all marked location one could create a "show_all" function. However, as mentioned in the email thread I am not sure whether getting to split locations as implemented here is the most efficient way. Assembling segments belonging to one neurite by selection is usually faster. This would yield more coordinate pairs for a given split, but the would not necessarily be in close proximity to each other. If the latter is what you are after, that should be stated explicitly in the instructions.)
- include requested updates - fix bug for annotation selection
thanks for the detailed inspection! |
Thanks for your patience with this. We're very close to having this completed now. |
transform list of annotations id pair to frozenset
I am learning a lot :-) E.g. that conventions are much more "local" than expected :-). |
Could you please try running https://github.com/google/pytype on this code? We're getting a bunch of failures caused by the new annotations. |
This doesn't install on Windows. Unless there is an alternative to this I will not be able to check this for the next two weeks. Did the checks pass beforehand? Is there any hint to where the problem could be? I cannot view the details to this copybara thing either... |
Here are the specific failures that are triggered:
These are not new, we just got to this stage where this could be run. Any chance you could try WSL which the page you linked suggests would make it possible to run pytype? If this doesn't work, you could also try running another type checker like https://mypy-lang.org/ |
correct typing
Were you able to get type checking to work on your end? I'm still seeing some failures with the latest change:
|
I have tried as far as possible without human support to get pytype running - it would not be recognized in WSL. I therefore used mypy. This did not run through error free but the errors that remained are non-overlapping with the ones you indicate: proofreading.py:24: error: Skipping analyzing "networkx": module is installed, but missing library stubs or py.typed marker [import] IIUC this is mypy not understanding some packages, complains about class attributes not being typed and does not understand None is returned in line509. I was hoping your typing tool would not crash there (apparantly the case!). About the errors you get: The type check you are doing is based on pytype? I can try to get help to get this running again but it may take me some time (human experts are scarce) |
- fix/silence type hint errors (I hope) - fix error annotation count - fix error type assignment by only the second location
Thanks for the last round of changes! The latest version passes all our internal type checks, but I noticed that the annotation for the object list was actually not completely correct. I just pushed a commit that provides the correct annotations for that list -- could you please rebase on top of that and adjust the annotations where needed? |
def update_segments( | ||
self, | ||
segments: Union[set[int], list[int]], | ||
loc: Optional[Sequence[int]] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls use Optional[Point]
@@ -94,39 +117,55 @@ def update_segments(self, segments, loc=None, layer='seg'): | |||
else: | |||
l.equivalences.clear() | |||
for a in self.todo[self.index : self.index + self.batch]: | |||
a = [aa[layer] for aa in a] | |||
a = [cast(dict, aa)[layer] for aa in a] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the cast still needed now that self.todo is typed?
objects: Union[dict[str, int], Sequence[int]], | ||
bad: set, | ||
num_to_prefetch: int = 10, | ||
locations: Optional[list[Sequence[int]]] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls use Point here
def __init__(self, objects, bad, num_to_prefetch=10, locations=None): | ||
def __init__( | ||
self, | ||
objects: Union[dict[str, int], Sequence[int]], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls use Iterable[ObjectItem]
|
||
def __init__( | ||
self, | ||
objects: list, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Iterable[ObjectItem]
self.make_point_annotation(coord, annotation_id) | ||
|
||
def make_point_annotation( | ||
self, coordinate: list[int], annotation_id: str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Point?
self.cur_error_type = None | ||
|
||
def annotate_error_locations( | ||
self, coordinates: list[list[int]], error_id: str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Iterable[Point]?
Looks great, just a few minor typing changes suggested above. Thanks! |