Clean up ExonGenomicCoordsMapper #224
Comments
Apparently it’s hard to follow because the previous model was not the best. There is now a better way to represent this using VRS 2.0. @ahwagner your time to shine ✨ I'd also like to throw out that example input/output would be nice 😉
I think the alignment part makes sense, but I have struggled to find a good way to represent what is needed. So I'm going to pause progress until we hear how to represent this using VRS 2.0. Initial progress is in this branch.
Linking the Cool Seq Tool documentation of this method for my own reference. The idea of this method is to help us go from the ends of fusion transcript segments in exon representation to where those ends exist on a genomic sequence. On occasion, fusion transcripts contain sequence that is intronic (it exists on the genomic sequence but not the transcript), or have a junction that omits some sequence at the end of an exon. In the fusions model, we use offsets to describe this change. This concept of offset representation is also used in HGVS coding sequence representations. I think we should postpone any refactor of this method until the VRS 2.0 beta1 release for the Adjacency class, as there is a fundamental shift from a segment-based model to a junction (adjacency) model. Linking this thread for progress on that release. Once that is completed, we should use the beta model to revise how this is represented in FUSOR.
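To make the offset idea above concrete, here is a minimal sketch of how an exon boundary plus an offset could be turned into a genomic position. Everything in it is an assumption for illustration: the `Exon` record, the `segment_boundary` helper, and the sign convention (offsets counted in transcript direction) are hypothetical and are not taken from Cool-Seq-Tool or FUSOR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exon:
    """Hypothetical exon record; coordinates are genomic, with g_start < g_end."""

    ord: int      # 1-based exon number within the transcript
    g_start: int  # smaller genomic coordinate of the exon
    g_end: int    # larger genomic coordinate of the exon


def segment_boundary(exon: Exon, offset: int, is_start: bool, strand: int) -> int:
    """Return the genomic position of a segment boundary.

    Assumed convention: the offset is counted in transcript direction, so a
    positive offset moves downstream in the transcript and a negative offset
    moves upstream, regardless of strand.
    """
    if strand not in (1, -1):
        raise ValueError("strand must be 1 or -1")
    if strand == 1:
        # Positive strand: transcript direction matches genomic direction.
        anchor = exon.g_start if is_start else exon.g_end
        return anchor + offset
    # Negative strand: the exon's transcript start is its larger genomic
    # coordinate, and moving downstream in the transcript decreases the
    # genomic coordinate.
    anchor = exon.g_end if is_start else exon.g_start
    return anchor - offset


exon = Exon(ord=10, g_start=100, g_end=200)
# Junction retains 5 bases of intronic sequence past the exon end (+ strand).
retained = segment_boundary(exon, offset=5, is_start=False, strand=1)
# Junction omits the last 5 bases of the exon (+ strand).
omitted = segment_boundary(exon, offset=-5, is_start=False, strand=1)
```

Under this assumed convention, an offset of 0 always lands exactly on the exon boundary, which matches the `"offset": 0` values in the example structure later in this thread.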
In Slack, @ahwagner asked: "I think this is okay to resume now that we have the Adjacency class, right?" @ahwagner, your earlier comment mentioned VRS 2.0 beta1, but I don't think we're at beta yet. Do you still want us to proceed?
Tagging @jsstevenson and @katiestahl so they can follow. Slack doesn't allow a thread within a thread.
We did away with Alpha/Beta/RC for maturity levels which may have created some confusion here. I am comfortable with the |
@ahwagner ah okay. That's my bad for not remembering the maturity model changes.
No fault here. With those changes this issue needed to be clarified. I also don't think anything has been lost by clarifying this now instead of earlier. However, with the recent cancervariants/fusion-curation#277 issue, it is a good time to revisit.
Going to add Alex's requested changes here:
The new structure (Aligned Segment) will look as follows:
{
  "gene": "WEE1",
  "alt_ac": "NC_000011.10",
  "seg_start": {
    "exon_ord": 1,
    "offset": 0,
    "genomic_location": {
      "type": "SequenceLocation",
      "sequenceReference": {
        "type": "SequenceReference",
        "refgetAccession": "SQ.2NkFm8HK88MqeNkCgj78KidCAXgnsfV1"
      },
      "start": 9575887
    }
  },
  "seg_end": {
    "exon_ord": 10,
    "offset": 0,
    "genomic_location": {
      "type": "SequenceLocation",
      "sequenceReference": {
        "type": "SequenceReference",
        "refgetAccession": "SQ.2NkFm8HK88MqeNkCgj78KidCAXgnsfV1"
      },
      "end": 9589767
    }
  },
  "tx_ac": "NM_003390.3"
}
@jarbesfeld am I missing anything?
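For anyone consuming the structure above, here is a rough sketch of reading it with only the standard library. The `SegmentBoundary` class and the `parse_boundary` / `check_aligned_segment` helpers are hypothetical names, not part of Cool-Seq-Tool. The checks encode what the example appears to show (`seg_start` carries only `start`, `seg_end` carries only `end`), plus one assumption of mine: that both boundaries should reference the same sequence.

```python
import json
from dataclasses import dataclass
from typing import Optional, Tuple

# The example from this thread, abbreviated to the fields the sketch reads.
EXAMPLE = """
{
  "gene": "WEE1",
  "alt_ac": "NC_000011.10",
  "seg_start": {
    "exon_ord": 1, "offset": 0,
    "genomic_location": {
      "type": "SequenceLocation",
      "sequenceReference": {
        "type": "SequenceReference",
        "refgetAccession": "SQ.2NkFm8HK88MqeNkCgj78KidCAXgnsfV1"
      },
      "start": 9575887
    }
  },
  "seg_end": {
    "exon_ord": 10, "offset": 0,
    "genomic_location": {
      "type": "SequenceLocation",
      "sequenceReference": {
        "type": "SequenceReference",
        "refgetAccession": "SQ.2NkFm8HK88MqeNkCgj78KidCAXgnsfV1"
      },
      "end": 9589767
    }
  },
  "tx_ac": "NM_003390.3"
}
"""


@dataclass
class SegmentBoundary:
    """Hypothetical flattened view of one seg_start / seg_end entry."""

    exon_ord: int
    offset: int
    refget_accession: str
    start: Optional[int] = None
    end: Optional[int] = None


def parse_boundary(raw: dict) -> SegmentBoundary:
    """Pull the fields of interest out of one boundary object."""
    loc = raw["genomic_location"]
    return SegmentBoundary(
        exon_ord=raw["exon_ord"],
        offset=raw["offset"],
        refget_accession=loc["sequenceReference"]["refgetAccession"],
        start=loc.get("start"),
        end=loc.get("end"),
    )


def check_aligned_segment(raw: dict) -> Tuple[SegmentBoundary, SegmentBoundary]:
    """Parse both boundaries and apply the sketch's (assumed) sanity checks."""
    seg_start = parse_boundary(raw["seg_start"])
    seg_end = parse_boundary(raw["seg_end"])
    if seg_start.start is None:
        raise ValueError("seg_start is expected to carry a 'start' position")
    if seg_end.end is None:
        raise ValueError("seg_end is expected to carry an 'end' position")
    # Assumption: both boundaries refer to the same genomic sequence.
    if seg_start.refget_accession != seg_end.refget_accession:
        raise ValueError("boundaries reference different sequences")
    return seg_start, seg_end


seg_start, seg_end = check_aligned_segment(json.loads(EXAMPLE))
```

In real code a Pydantic model would likely be a better fit than hand-written checks; plain dataclasses are used here only to keep the sketch dependency-free.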
Addresses part of #224 * `transcript_to_genomic_coordinates` renamed to `tx_segment_to_genomic` * `genomic_to_transcript_exon_coordinates` renamed to `genomic_to_tx_segment`
Addresses #224 * Move `get_tx_exons_genomic_coords` from `UtaDatabase` to `ExonGenomicCoordsMapper` as a private method (`_get_tx_exons_genomic_coords`)
Addresses #224 * `genomic_to_tx_segment` will now require inter-residue coordinates to be passed
addresses #224 * initial work for cleaning up exon coord data retrieval
addresses #224 * Use `ExonGenomicCoordsMapper._get_all_exon_coords` instead
addresses #224 * No code was changed in the classes or methods
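Since one of the changes above makes `genomic_to_tx_segment` require inter-residue coordinates, callers holding 1-based, inclusive (residue) positions would need to convert first. A minimal sketch of that conversion follows; the helper name `residue_to_inter_residue` is hypothetical and is not a Cool-Seq-Tool function.

```python
from typing import Tuple


def residue_to_inter_residue(start: int, end: int) -> Tuple[int, int]:
    """Convert a 1-based, inclusive range to a 0-based, inter-residue range.

    Inter-residue positions sit between bases, so the start shifts down by
    one while the end stays the same; the span length (end - start) then
    equals the number of bases covered.
    """
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return start - 1, end


# Bases 10, 11 and 12 in residue coordinates.
ir_start, ir_end = residue_to_inter_residue(10, 12)
```

A single base at residue position 10 becomes the inter-residue range (9, 10), which is why passing residue coordinates unchanged would shift the start boundary by one.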
I think all that's left to do in this epic is to DRY up the code and break it into smaller methods.
A lot of this was written years ago, and it's hard to follow what's happening. We should refactor this class.