Refactor checkm2/databasedownload using aria2 #6654
OK, I've not seen this before. I will admit I'm not really happy that this is an official module, as it's not actually a subcommand of the tool (despite the comment)... but I guess it's already here.
The changes generally look OK though (and the remote md5 extraction is clever!). However, have you considered having the zenodo_id as an input channel rather than hardcoding it? This would allow users to choose which version of the database to download.
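For readers unfamiliar with the idea, here is a minimal sketch of one way a remote md5 could be extracted, not necessarily how this PR does it. It assumes the record metadata is JSON with a `files` list whose entries carry a `key` (file name) and a `checksum` of the form `md5:<hex digest>`; the payload, the file name `example_db.tar.gz`, and the helper `md5ForFile` are all hypothetical.

```groovy
import groovy.json.JsonSlurper

// Hypothetical, hand-written stand-in for a record metadata response.
def recordJson = '''
{
  "files": [
    { "key": "example_db.tar.gz", "checksum": "md5:0123456789abcdef0123456789abcdef" }
  ]
}
'''

// Return the md5 hex digest recorded for the named file.
def md5ForFile(String json, String fileName) {
    def record = new JsonSlurper().parseText(json)
    def entry = record.files.find { it.key == fileName }
    if (entry == null) {
        throw new IllegalArgumentException("No file named '${fileName}' in record")
    }
    // Assumption: checksums are formatted as "<algorithm>:<hex digest>".
    def (algorithm, digest) = entry.checksum.tokenize(':')
    if (algorithm != 'md5') {
        throw new IllegalStateException("Expected an md5 checksum, got '${algorithm}'")
    }
    return digest
}

println md5ForFile(recordJson, 'example_db.tar.gz')
```

A digest obtained this way could then be handed to aria2 for verification, for example through its `--checksum=md5=<digest>` option.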
Thanks for the review, James. I completely agree that the ideal approach would be to use the tool command to download the database. However, this is a design limitation: even the developer recommends downloading the database externally. That’s why I found it cleaner to implement a module that handles the download automatically. I’ve added the |
Fair enough - in this case I would've just said use a
Mm, I might be getting confused with some of your terminology - you've added an input channel not a parameter. Can you give examples of what you mean by 'the two parameters serving the same purpose'? Do you mean e.g.:
or something? If so, I think that, based on your implementation, these two are not necessarily required. A pipeline developer could choose to force a user to download the latest version by hardcoding the latest Zenodo ID in the input channel of the module. But wrap that in an I don't really follow what you mean with the |
Sorry for the lack of clarity.
Yes, this is what I mean. And I know both are not required, but in this case it's not exactly the database version; it's the Zenodo ID where the database is stored. Because of that, I'm wondering whether it's the best idea to leave it as a param, since it could be a little confusing. I'm thinking of mag, because I plan to implement the migration from CheckM to CheckM2. So, I was thinking of using
This way, if the user wants to change the database version, it would be possible as an advanced option via config.
Another option could be implementing a function to get the Zenodo ID from the version, like this one:

    def get_zenodo_id(db_version) {
        // Map each database version to the Zenodo record that hosts it
        def zenodo_ids = [
            '2'   : 5571251,
            '1.1' : 4671167,
            '1'   : 4626519
        ]
        if (!zenodo_ids.containsKey(db_version)) {
            error("Error: Invalid database version '${db_version}'")
        }
        return zenodo_ids[db_version]
    }

This way the input value could be the database version itself, and it would make more sense to me to have the param |
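As a rough sketch, the same lookup can be written so that it runs in plain Groovy outside a Nextflow script. The name `getZenodoId` is hypothetical, the IDs are copied from the mapping proposed above, and the exception stands in for Nextflow's `error()`. Normalising the argument to a string is an added assumption so that a numeric `1.1` and the string `'1.1'` resolve to the same entry.

```groovy
// Plain-Groovy variant of the version-to-Zenodo-ID lookup (hypothetical name).
def getZenodoId(dbVersion) {
    // IDs copied from the mapping proposed in this thread.
    def zenodoIds = [
        '2'  : 5571251,
        '1.1': 4671167,
        '1'  : 4626519,
    ]
    // Normalise so that numbers and strings hit the same String key.
    def key = dbVersion.toString()
    if (!zenodoIds.containsKey(key)) {
        throw new IllegalArgumentException(
            "Invalid database version '${key}'. Valid versions: ${zenodoIds.keySet().join(', ')}"
        )
    }
    return zenodoIds[key]
}

println getZenodoId('2')
println getZenodoId(1.1)
```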
OK, I think I'm following a little more:
Overall, I don't think the two-parameter thing is that much of an issue to be honest; this is very common in many cases already. I think your default Zenodo ID implementation is fine, plus an override with the input channel 👍 (and then it's up to the pipeline developer to allow users to customise this or not). Once the tests are passing I think I can give this a ✔️
(oh and very excited to see this in MAG!)
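To illustrate the "default plus override" arrangement, here is a minimal sketch of how a pipeline might wire it up. The parameter name `checkm2_db_zenodo_id` and the include path are assumptions, and the default shown is simply the ID listed for version '2' earlier in this thread, not a confirmed module default.

```groovy
// nextflow.config (hypothetical parameter name)
params {
    // Default Zenodo record; overridable on the command line or in a custom config
    checkm2_db_zenodo_id = 5571251
}

// main.nf (assumed install path for the module)
include { CHECKM2_DATABASEDOWNLOAD } from './modules/nf-core/checkm2/databasedownload/main'

workflow {
    // The value is passed through the module's zenodo_id input
    CHECKM2_DATABASEDOWNLOAD(params.checkm2_db_zenodo_id)
}
```

With this layout a user could select another record by passing `--checkm2_db_zenodo_id <ID>`, while a pipeline developer who wants to pin the version can pass a literal ID instead of the parameter.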
Oh poop I realised I didn't finish my previous review properly (pending comments): I noticed that while there is a stub block, there is no stub test.
Please add that and then you can merge!
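For reference, a stub test in nf-test usually looks something like the sketch below. This is an assumption about the general shape, not the test that was eventually added: the test name and the input value (the Zenodo ID listed for version '2' earlier in this thread) are illustrative.

```groovy
nextflow_process {

    name "Test Process CHECKM2_DATABASEDOWNLOAD"
    script "../main.nf"
    process "CHECKM2_DATABASEDOWNLOAD"

    test("Test CheckM2 Database Download - stub") {

        // Run the stub block instead of the real download
        options "-stub"

        when {
            process {
                """
                input[0] = 5571251
                """
            }
        }

        then {
            assertAll(
                { assert process.success },
                { assert snapshot(process.out).match() }
            )
        }
    }
}
```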
* Use aria2 for CHECKM2_DATABASEDOWNLOAD, update snaps
* Fix linting
* Add bioconda to environment.yml
* Add zenodo_id as input to select which version to download
* Add input param to the test
* Add process input for predict
* Fix databasedownload meta
* Improve field description
---------
Co-authored-by: James A. Fellows Yates <[email protected]>
PR checklist
versions.yml
file.label
nf-core modules test <MODULE> --profile docker
nf-core modules test <MODULE> --profile singularity
nf-core modules test <MODULE> --profile conda
nf-core subworkflows test <SUBWORKFLOW> --profile docker
nf-core subworkflows test <SUBWORKFLOW> --profile singularity
nf-core subworkflows test <SUBWORKFLOW> --profile conda