How do I get the embeddings for a list of sequences? #30

ajrferrari · 2023-05-04T21:39:51Z

Hi, sorry if this is obvious from the code or the Jupyter notebook, but I couldn't work it out.

How do I get the embeddings for thousands of protein sequences efficiently? I saw there is a method `.get_rep(self, seq)` that returns the vector of interest for one sequence, but iterating over it for thousands of sequences is very inefficient.

Thanks,
Állan

RaphaelBouvet · 2023-06-01T13:46:45Z

You can use the JAX reimplementation of UniRep; it is faster than the original implementation:
https://github.com/ElArkk/jax-unirep
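
If I recall correctly, the jax-unirep README describes a `get_reps` function that accepts a list of sequences, but please check the README for the exact signature. Independent of which library you use, one common way to avoid per-sequence overhead is to group sequences of equal length and embed each group in a single call. Below is a minimal sketch of that idea. The names `embed_by_length`, `embed_batch`, and `toy_embed_batch` are hypothetical helpers for illustration, and the toy embedder is only a placeholder, not UniRep.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence

Vector = List[float]


def embed_by_length(
    sequences: Sequence[str],
    embed_batch: Callable[[List[str]], List[Vector]],
    max_batch_size: int = 64,
) -> List[Vector]:
    """Embed sequences in same-length batches, preserving input order.

    `embed_batch` is an assumed callable that maps a list of equal-length
    sequences to one vector per sequence, in the same order.
    """
    # Group the original indices by sequence length so that every batch
    # is rectangular and needs no padding.
    by_length: Dict[int, List[int]] = defaultdict(list)
    for idx, seq in enumerate(sequences):
        by_length[len(seq)].append(idx)

    results: List[Vector] = [[] for _ in sequences]
    for indices in by_length.values():
        # Split large groups into chunks to bound memory use.
        for start in range(0, len(indices), max_batch_size):
            chunk = indices[start:start + max_batch_size]
            vectors = embed_batch([sequences[i] for i in chunk])
            # Write each vector back to its original position.
            for i, vec in zip(chunk, vectors):
                results[i] = vec
    return results


def toy_embed_batch(batch: List[str]) -> List[Vector]:
    # Placeholder embedder (NOT UniRep): [length, number of 'A' residues].
    return [[float(len(s)), float(s.count("A"))] for s in batch]


reps = embed_by_length(["MKA", "AA", "MKV", "A"], toy_embed_batch)
```

To use this with a real model, replace `toy_embed_batch` with a function that runs the model once on a whole batch. The helper only handles grouping and restoring the input order.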

jlotthammer · 2023-07-31T19:22:21Z

@RaphaelBouvet are the JAX sequence embeddings equivalent to the original model?
