-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Synonym sync: Abbreviation exclusions #668
Conversation
- Added abbreviation exclusion rules to config - Updated config name to make more sense in light of this change
8f1abd4
to
7a085d8
Compare
- Update: outputs
@joeflack4 what goal was run to generate the data files in this PR? Did you confirm from the curated examples that the list of synonyms that should not be tagged as an abbreviation are not tagged now? |
"""Append additional synonym_type(s) based on logical rules.""" | ||
syn: str = row['synonym'] | ||
syn_type_str: str = row['synonym_type'] | ||
types: Set[URI] = set(syn_type_str.split('|') if syn_type_str else []) | ||
|
||
# Acronym: uppercase, no numbers, no whitespace, <10 chars | ||
if syn.isupper() and not any(char.isspace() for char in syn) and len(syn) < 10: | ||
if syn not in exclusions and syn.isupper() and not any(char.isspace() for char in syn) and len(syn) < 10: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you confirm from the curated examples that the list of synonyms that should not be tagged as an abbreviation are not tagged now?
Yep! I checked a few cases in the new v8 google sheet, and they're longer being tagged as such.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the link to the data file. I checked all the ones in the curated file and none are tagged as abbreviations now as expected with this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! All synonyms that should be excluded from being tagged as an abbreviation are excluded.
Resolves #652
Overview
Synonym sync: Abbreviation exclusions
mondo-exclusion-values.yml
-->mondo-exclusion-configs.yml
: I didn't rename it just mondo-exclusions in order to disambiguate from similarly named SOURCE_exclusions files, which serve the purpose of excluding entire terms. I went with the general 'configs' name, because these exclude a variety of different things now, from exclusion of entire synonyms being added, to exclusion of synonym type axioms.Pre-merge checklist
Documentation
Was the documentation added/updated under
docs/
?QC
Was the full pipeline run before submitting this PR using
sh run.sh make build-mondo-ingest
on this branch (afterdocker pull obolibrary/odkfull:dev
), and no errors occurred?New Packages
Were any new Python packages added?
Were any other non-Python packages added?
PR Review and Conversations Resolved
Has the PR been sufficiently reviewed by at least 1 team member of the Mondo Technical team and all threads resolved?
Google sheet