dagster: create asset for indexing EAS data #1954

Closed
ccerv1 opened this issue Aug 15, 2024 · 3 comments
Labels
c:data Gathering data (e.g. indexing)

Comments

@ccerv1
Member

ccerv1 commented Aug 15, 2024

Describe the feature you'd like to request

We'll probably just want to start by dumping raw attestation data into a data lake.

Describe the solution you'd like

A crawler similar to the open_collective.py example that fetches attestations from the EAS GraphQL endpoint and dumps the standard attestation data, plus a JSON field with the decoded schema data, into BigQuery.

Here's an example: https://gist.github.com/ccerv1/12a83e4a698a200a48ac17b97b049241
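A minimal sketch of the crawler described above, for illustration only. It is not the code from the gist or from the eventual PR. The endpoint URL, the GraphQL field names (`decodedDataJson`, `schemaId`, and so on), the pagination arguments, and the helper names `build_payload`, `to_row`, `fetch_page`, and `crawl` are all assumptions and should be checked against the live EAS schema. The BigQuery write is left out; the yielded rows would be handed to whatever loader the pipeline already uses.

```python
import json
import urllib.request
from typing import Any, Dict, Iterator, List

# Assumed endpoint: EAS runs one explorer per chain, so adjust for the target network.
EAS_GRAPHQL_URL = "https://optimism.easscan.org/graphql"

# Assumed query shape and field names; verify against the EAS GraphQL schema.
ATTESTATIONS_QUERY = """
query Attestations($take: Int!, $skip: Int!, $since: Int!) {
  attestations(take: $take, skip: $skip,
               where: {time: {gte: $since}}, orderBy: [{time: asc}]) {
    id attester recipient schemaId time isOffchain decodedDataJson
  }
}
"""


def build_payload(since: int, take: int = 100, skip: int = 0) -> bytes:
    """Serialize one paginated GraphQL request body."""
    variables = {"take": take, "skip": skip, "since": since}
    return json.dumps({"query": ATTESTATIONS_QUERY, "variables": variables}).encode()


def to_row(att: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten one attestation: standard fields plus the decoded data as a JSON string."""
    decoded = json.loads(att.get("decodedDataJson") or "[]")
    return {
        "id": att.get("id"),
        "attester": att.get("attester"),
        "recipient": att.get("recipient"),
        "schema_id": att.get("schemaId"),
        "time": att.get("time"),
        "is_offchain": att.get("isOffchain"),
        # Re-serialized so malformed JSON fails here rather than at load time.
        "decoded_data": json.dumps(decoded, sort_keys=True),
    }


def fetch_page(since: int, take: int, skip: int) -> List[Dict[str, Any]]:
    """POST one page of the query and return the raw attestation dicts."""
    req = urllib.request.Request(
        EAS_GRAPHQL_URL,
        data=build_payload(since, take, skip),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        body = json.load(resp)
    return body["data"]["attestations"]


def crawl(since: int, page_size: int = 100) -> Iterator[Dict[str, Any]]:
    """Page through attestations created at or after `since` (unix seconds)."""
    skip = 0
    while True:
        page = fetch_page(since, page_size, skip)
        if not page:
            return
        for att in page:
            yield to_row(att)
        skip += len(page)
```

Keeping `to_row` separate from the network call means the flattening logic can be tested without hitting the endpoint, and `crawl` can be wrapped in an asset that takes `since` from a time partition.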

Describe alternatives you've considered

Ask the EAS team if we can get a mirror of their database.

@ccerv1 ccerv1 self-assigned this Aug 15, 2024
@ccerv1 ccerv1 added the c:data Gathering data (e.g. indexing) label Aug 15, 2024
@ccerv1 ccerv1 added this to the (c) Op RF5/6 milestone Aug 15, 2024
@ryscheng
Member

ryscheng commented Sep 5, 2024

Rescoping this issue:
I think we prefer to do it the way described in #2069.

So once we have a generic trace decoder, we can use that to decode EAS traces.

@ccerv1
Member Author

ccerv1 commented Sep 9, 2024

#2110

@Jabolol could you take a look at the PR above? TY!

@Jabolol
Contributor

Jabolol commented Sep 9, 2024

@ccerv1 Done! I made several adjustments to the PR, but the core contribution was definitely useful. I’ve tested it both locally and on BigQuery, and everything is working well.

image

On the flip side, I found yet another concurrency bug with Dagster. Apparently, each define_asset_job call can only have one partitioned run, so Open Collective and the new EAS assets clash. I'll create an issue to track and fix it.

Let me know if you need anything else!

@ccerv1 ccerv1 closed this as completed Sep 13, 2024
Projects
Status: Done
3 participants