Add --position argument to external subcommand for customising the UMI placement #15

Open: 12 commits to be merged into base branch dev

MatthiasZepper
Member

Proposed change

This PR adds a new argument, --position, to the external subcommand of umi-transfer. It lets users specify where the UMI is placed, with the following options:

  • header (default) – UMI is appended to the read header, as traditionally used.
  • inline – UMI is inserted directly before the read sequence, enabling compatibility with Sarek.
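
To make the two options concrete, here is a minimal Python sketch of the placement logic. It is an illustration only, not the code in this PR (umi-transfer itself is not written in Python). The names FastqRecord and place_umi are hypothetical, and three details are assumptions rather than statements from the description above: that the header mode appends the UMI to the read ID ahead of any comment field, that ":" is the delimiter, and that the inline mode prepends the UMI base qualities so that sequence and quality strings keep the same length.

```python
from dataclasses import dataclass


@dataclass
class FastqRecord:
    """Hypothetical minimal FASTQ record; header excludes the leading '@'."""
    header: str
    sequence: str
    quality: str


def place_umi(read: FastqRecord, umi: FastqRecord,
              position: str = "header", delim: str = ":") -> FastqRecord:
    """Return a copy of `read` carrying the UMI at the requested position."""
    if position == "header":
        # Assumption: the UMI goes right after the read ID, i.e. before the
        # first space that separates the ID from the comment field.
        read_id, sep, comment = read.header.partition(" ")
        new_header = f"{read_id}{delim}{umi.sequence}{sep}{comment}"
        return FastqRecord(new_header, read.sequence, read.quality)
    if position == "inline":
        # Assumption: the UMI qualities are prepended as well, so that the
        # sequence and quality lines stay equally long.
        return FastqRecord(read.header,
                           umi.sequence + read.sequence,
                           umi.quality + read.quality)
    raise ValueError(f"unknown position: {position!r}")


if __name__ == "__main__":
    read = FastqRecord("r1 1:N:0:ACGT", "TTGGCC", "IIIIII")
    umi = FastqRecord("r1 2:N:0:ACGT", "ACGTAC", "FFFFFF")
    for mode in ("header", "inline"):
        print(mode, place_umi(read, umi, position=mode))
```

In the inline case, a downstream tool has to be told how many leading bases belong to the UMI, since the read no longer records that boundary itself.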

On a technical level, all read processing now lives in a separate module. This will also be useful should we later extend the tool with a umi-transfer inline subcommand, which would partly perform the reverse action: extracting the UMI from the read instead of joining the two.

Motivation

Sarek supports the use of consensus reads to increase the accuracy of variant calls. Consensus reads are formed by identifying, grouping and collapsing duplicate reads that originate from the same DNA molecule. Sequencing errors are corrected in the process, which reduces the number of artifactual variant calls.

Sarek uses fgbio for consensus read formation and processing. While fgbio supports reading UMIs from external files, the pipeline's sample sheet doesn't allow a third FastQ file as input. Hence, the UMIs must be integrated into the reads first.

Outlook

Possibly, this change justifies a v1.6 release?

@MatthiasZepper MatthiasZepper changed the base branch from main to dev February 4, 2025 18:33