Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support and standard keyword for WES #173

Open
jb-adams opened this issue Oct 6, 2021 · 16 comments
Open

Support and standard keyword for WES #173

jb-adams opened this issue Oct 6, 2021 · 16 comments

Comments

@jb-adams
Copy link
Member

jb-adams commented Oct 6, 2021

Currently, WES supports standard keywords for CWL and WDL workflow languages. Can we add a new workflow_type keyword supporting Nextflow-based workflows, perhaps NEXTFLOW?

@adeslatt
Copy link

adeslatt commented Oct 12, 2021

Perhaps we are thinking about this incorrectly. Jupyterlab notebooks can be translated from R, ipynb, etc. using the tool jupytext Kernels are specified when running jupyterlab notebooks using rendering tools such as papermill. We should say that a standard workflow specification could be articulated and stored in Common Workflow Language (as does the EU with their Go-FAIR plans and as stated on the Elixir site) but that we have translators that translate the CWL to Nextflow, WDL, Snakemake etc. Separating out concerns solves the problem and evolved best practices would be similar to the literate programming achieved with jupyterlab notebooks rendered in ipynb - allowing for the normal marked down section and then the execution cell.

What i suggest is that we have a language that is specified for storage. Most of these allow for processes to be specified, Nextflow and CWL, with inputs, outputs and the process specified with bash level scripting for the execution portion. That process should be a container where just that process is contained within it. This lends itself to continuous testing, etc.

Execution can then be specified easily and this lends itself to the automation we seek.

@geoffjentry
Copy link
Contributor

Converting CWL to WDL and vice versa is at best imperfect. There's been some recent workwhich is exciting but I believe it only goes in one direction. My understanding is that Nextflow is moving in a direction that makes direct translation harder, not easier - and there's not been much work in that regard in the first place.

If someone wants to contribute a universal translator between these standards, then I agree w/ your viewpoint.

@uniqueg
Copy link
Contributor

uniqueg commented Oct 12, 2021

Just for reference:

PRs to include Nextflow and Snakemake in the corresponding TRS enum:

@caylward-ilmn
Copy link

caylward-ilmn commented Oct 12, 2021

I'm reading through the API Spec for the WES POST /runs endpoint, specifically the workflow_type parameter description:

The workflow_type is the type of workflow language and must be "CWL" or "WDL" currently (or another alternative supported by this WES instance).

It seems like this workflow_type parameter for the POST /runs endpoint is not an enum. I guess (as indicated in the above comment) this is really a change to the TRS spec to support "NFL" as a descriptor type?

In the WES GET /service-info endpoint, there is a workflow_type_versions attribute which returns "Available workflow types supported by a given instance of the service". I would think this provides the acceptable values to use in the workflow_type field of the POST /runs endpoint, leaving it up to the implementation to indicate available "workflow_types", rather than the WES spec.

@patmagee
Copy link
Contributor

@caylward-ilmn this may be a case of imprecise language in the documentation. Higher up in the reference docs it also says:

(currently CWL or WDL formatted workflows, other types may be supported in the future)

Which seems to imply that CWL or WDL are the only supported languages currently

@caylward-ilmn
Copy link

caylward-ilmn commented Oct 12, 2021

@patmagee Sure - you're referring to the Executive Summary section (though it DOES say "other types may be supported in the future").

Regardless, that's a non-functional change to fix ancillary docs. Seemed like this issue was suggesting a change to the API specification? That I don't see...

@patmagee
Copy link
Contributor

@caylward-ilmn that caveat (or another alternative supported by this WES instance) does seem to appear everywhere in the docs already, so it may just practically be treated as an enum, when in reality CWL and WDL are just strings that can be anything.

It seems there already is in the specification the flexibility to define any workflow language as "supported" through the workflow_type field. Because of that I do not think adding NEXTFLOW as a documented language changes the specification from the way it currently operates. Issues around input harmonization, and whether adding NEXTFLOW changes the semantics of the API are clearly already present by allowing any language that a WES instance supports

@pditommaso
Copy link

Worth mentioning nextflow is already implemented by Tower API, and it's expecting NFL as language type for consistency with TRS

https://tower.nf/openapi/index.html#post-/ga4gh/wes/v1/runs

@patmagee
Copy link
Contributor

If NFL is already used in TRS I do no see any reason to add separate terminology with NEXTFLOW, and we should probably just use the NFL name

@jb-adams
Copy link
Member Author

I agree, wasn't aware of NFL in TRS till now

@uniqueg
Copy link
Contributor

uniqueg commented Nov 26, 2021

If the docs/specs are to be changed to mention supported workflow languages explicitly, then I think it would be nice if it were the same set of languages supported in TRS. From v2.0.1 onward this will be:

  • CWL
  • WDL
  • NFL
  • Galaxy
  • SMK

See here.

I am not sure whether the WES specs should include such an enumeration at all, though.

@patmagee
Copy link
Contributor

@uniqueg i think it definitely makes sense to keep language consistent between apis. NFL makes a lot of sense for this.

I am not sure what the process of getting additional languages is supported is. My guess is eventually WES will support snake make, and galaxy as well and the two lists would likely be similar.

@uniqueg
Copy link
Contributor

uniqueg commented Nov 27, 2021

Actually, I'm not quite sure what exactly it means for WES to support a workflow language - which is why I'm not sure whether languages should even be listed explicitly at all. First of all, I think that WES specs should be abstract enough to enable a wide range of engines to implement them and provide at least a basic feature set. But assuming there are cases where this does not hold, I think it's primarily the engine that would cause compatibility problems, not the language itself. There might be engines where an otherwise "supported language" is not a good match for WES. But I find it quite hard to imagine a case where the language itself would cause problems. As long as a workflow can be addressed by a URI, it should be able to be processed by a suitable engine, or not?

So what are the criteria for a language to be supported by WES? If you mean there should be implementations for a candidate language, then Snakemake should certainly be listed, as there are at least two implementations available that I know of:

Of course, these implementations cannot strictly comply with the specs if the language isn't listed.... ;-)

@uniqueg
Copy link
Contributor

uniqueg commented Nov 27, 2021

My suggestion would be to list the languages (or better yet the engines, or both) that have been demonstrated to play nice with WES in the documentation, not the specs. This would lower the barrier for workflow engine developers to implement the specs. I feel similarly about TRS, by the way. Let the developers themselves interpret the specs for their use case (language, engine) and decide whether the specs are a good match or not...

@patmagee
Copy link
Contributor

@uniqueg I think that is a good Idea. I was also wondering if instead of WES explicitly stating the languages that it supports, to instead describe what general features a language should require to be a candidate for WES. Similar to your suggestion above this lowers the bar and allows off spec languages

We could add a section something like the following (this is not an actual suggestion on the wording btw):

WES can be used for any workflow language that  adheres to the following general principals
1. A workflow should accept inputs  
2. A single workflow should output a set of outputs
3. A workflow should not require an interactive session with the user but can be executed in a completely headless way

@uniqueg
Copy link
Contributor

uniqueg commented May 5, 2022

I like that! :) Though we will need to check which requirements WES actually really has from the RunRequest model and maybe other more hidden limitations.

Speaking of the RunRequest schema: any particular reason why the required property isn't actually being used to allow automatic validation?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants