Create onchain_metrics_by_artifact #13

Closed
wants to merge 43 commits into from
43 commits
403cc47
docs: updates for schema v5 (#1429)
ccerv1 May 16, 2024
80d3b7f
docs: update query example in data challenge CTA (#1430)
ccerv1 May 16, 2024
1c0a205
Cloudquery github resolver should handle null github fields (#1431)
ravenac95 May 16, 2024
c2c7f54
Initial Dagster Setup On K8s (#1432)
ravenac95 May 17, 2024
95fdb5b
Add production apps
ravenac95 May 17, 2024
ce601a0
Setup correct app references
ravenac95 May 17, 2024
fc0ad15
Setup deps for k8s resources
ravenac95 May 17, 2024
e2e754b
Fix namespace ref to git repo source
ravenac95 May 17, 2024
91d7857
Add cloudsql proxy operator
ravenac95 May 17, 2024
fb10e4d
Fix cloudsql proxy
ravenac95 May 17, 2024
e54702f
Another fix cloudsql proxy
ravenac95 May 17, 2024
cf7ec02
add repo licenses to RF4 metrics (#1433)
ccerv1 May 17, 2024
e5583d7
Enable cloudsql proxy for dagster
ravenac95 May 17, 2024
ba988e7
Bump oso-dagster chart
ravenac95 May 17, 2024
42dd477
Fix service accounts
ravenac95 May 17, 2024
cfc7898
Fix dagster serviceaccount
ravenac95 May 17, 2024
61dbaf3
More fixes for helm release
ravenac95 May 17, 2024
b1780ee
remove unnecessary sqlmesh test
ravenac95 May 17, 2024
a5ef095
update github workflow
ravenac95 May 17, 2024
33998c9
update production app
ravenac95 May 17, 2024
6fdb3f2
attempt to fix production-apps
ravenac95 May 17, 2024
60af70a
Update init container setup
ravenac95 May 17, 2024
8f92b70
Another setup test
ravenac95 May 17, 2024
a02cf55
Yet another setup test
ravenac95 May 17, 2024
07fcc16
Yet another setup test
ravenac95 May 17, 2024
1d0d8ed
Attempt different config strategy for prod apps
ravenac95 May 17, 2024
7af7995
Set dagster release name
ravenac95 May 17, 2024
27b9e8b
Another config test
ravenac95 May 17, 2024
ae3c0cc
Another config test
ravenac95 May 17, 2024
9461aef
Another config test
ravenac95 May 17, 2024
89bd035
Fix auth proxy setup
ravenac95 May 17, 2024
9629f7d
Remove proxy for now
ravenac95 May 17, 2024
41af951
Use postrenderers on the helmrelease
ravenac95 May 17, 2024
52d4554
Fix post rendererers
ravenac95 May 17, 2024
7c714c1
Another post renderers fix
ravenac95 May 17, 2024
5e93092
remove proxy
ravenac95 May 17, 2024
fb055fc
Fix serviceaccount for dagster
ravenac95 May 17, 2024
d0f8ec0
Reenable proxy
ravenac95 May 17, 2024
32d84e1
ravenac95/basic dagster deployment (#1435)
ravenac95 May 17, 2024
d99d6a1
refactor int_models that power summary metric marts (#1436)
ccerv1 May 18, 2024
17b2618
add: dbt schema YML for impact metric tables (#1437)
ccerv1 May 18, 2024
6514f58
ravenac95/dagster deployment working (#1439)
ravenac95 May 19, 2024
5560b42
Adds language to the github-repos resolver (#1441)
ravenac95 May 20, 2024
8 changes: 4 additions & 4 deletions .github/scripts/publish-cloudquery-plugins.sh
@@ -14,8 +14,8 @@ tag="$(git rev-parse HEAD)"
build_base_image() {
language="$1"
tag="$2"
base_image="ghcr.io/opensource-observer/cloudquery-${language}-base:${tag}"
dockerfile_path="./docker/cloudquery-${language}-base.Dockerfile"
base_image="ghcr.io/opensource-observer/${language}-base:${tag}"
dockerfile_path="./docker/cloudquery/${language}-base.Dockerfile"
docker build -t "${base_image}" -f "${dockerfile_path}" .
echo $base_image
}
@@ -36,7 +36,7 @@ for path in $ts_plugins; do
docker build -t ${plugin_image} \
--build-arg PLUGIN_NAME=${plugin_name} \
--build-arg BASE_IMAGE=${ts_base_image} \
-f docker/cloudquery-ts.Dockerfile \
-f docker/cloudquery/ts.Dockerfile \
.
echo "Publishing the plugin to ${plugin_image}"
docker push ${plugin_image}
@@ -60,7 +60,7 @@ for path in $python_plugins; do
--build-arg PLUGIN_NAME=${plugin_name} \
--build-arg PLUGIN_CMD=${plugin_cmd} \
--build-arg BASE_IMAGE=${ts_base_image} \
-f docker/cloudquery-py.Dockerfile \
-f docker/cloudquery/py.Dockerfile \
.

echo "Publishing the plugin to ${plugin_image}"
32 changes: 32 additions & 0 deletions .github/scripts/publish-docker-containers.sh
@@ -0,0 +1,32 @@
#!/bin/bash
set -euxo pipefail

SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
cd "${SCRIPT_DIR}/../../"
REPO_DIR=$(pwd)

# Publish all images
images_to_build="$(find ./docker/images/* -type f -name 'Dockerfile' -exec sh -c 'dirname $0' {} \;)"
tag="$(git rev-parse HEAD)"

for path in $images_to_build; do
image_name=$(basename $path)

image_repo="ghcr.io/opensource-observer/${image_name}"
sha_image="${image_repo}:${tag}"
latest_image="${image_repo}:latest"

echo "Building ${image_name} plugin"
docker build \
-t ${sha_image} \
-t ${latest_image} \
--label "org.opencontainers.image.source=https://github.com/opensource-observer/oso" \
--label "observer.opensource.oso.sha=${tag}" \
--build-arg IMAGE_NAME=${image_name} \
-f docker/images/${image_name}/Dockerfile \
.
echo "Publishing the image to ${sha_image}"
docker push "${sha_image}"
echo "Publishing latest to ${latest_image}"
docker push "${latest_image}"
done
@@ -11,8 +11,8 @@ on:
- main

jobs:
warehouse-publish-cloudquery-plugins:
name: warehouse-publish-cloudquery-plugins
warehouse-publish-docker-containers:
name: warehouse-publish-docker-containers
environment: indexer
runs-on: ubuntu-latest

@@ -33,4 +33,8 @@ jobs:
password: ${{ secrets.GITHUB_TOKEN }}

- name: Package and publish cloudquery plugins
run: bash .github/scripts/publish-cloudquery-plugins.sh
run: bash .github/scripts/publish-cloudquery-plugins.sh

- name: Package and publish other docker containers
run: bash .github/scripts/publish-docker-containers.sh

Empty file added a.out
Empty file.
36 changes: 27 additions & 9 deletions apps/docs/blog/2024-05-16-impact-metrics-rf4/index.md
@@ -12,7 +12,7 @@ Open Source Observer is working with the Optimism Collective and its badgeholder

Retro Funding 4 is the Optimism Collective’s first experiment with [Metrics-based Evaluation](https://gov.optimism.io/t/upcoming-retro-rounds-and-their-design/7861). The hypothesis is that by leveraging quantitative metrics, citizens are able to more accurately express their preferences for the types of impact they want to reward, as well as make more accurate judgements of the impact delivered by individual projects.

In stark contrast to other Retro Funding experiments, _badgeholders will not vote on individual projects but will rather vote via selecting and weighting a number of metrics which measure different types of impact._
In contrast to other Retro Funding experiments, _badgeholders will not vote on individual projects but will rather vote via selecting and weighting a number of metrics which measure different types of impact._

The Optimism Foundation has published high level guidance on the types of impact that will be rewarded:

@@ -22,7 +22,7 @@ The Optimism Foundation has published high level guidance on the types of impact
- Interactions of new Optimism users
- Open source license of contract code

The round is expected to receive applications from 100s of projects building on 6 Superchain networks (OP Mainnet, Base, Frax, Metal, Mode, and Zora). Details for the round can be found [here](https://gov.optimism.io/t/retro-funding-4-onchain-builders-round-details/7988).
The round is expected to receive applications from hundreds of projects building on **six** Superchain networks (OP Mainnet, Base, Frax, Metal, Mode, and Zora). Details for the round can be found [here](https://gov.optimism.io/t/retro-funding-4-onchain-builders-round-details/7988).

At Open Source Observer, our objective is to help the Optimism community arrive at up to 20 credible impact metrics that can be applied to projects with contracts on the Superchain.

@@ -40,10 +40,10 @@ One thing to make crystal clear is that Open Source Observer relies 100% on publ

The following raw data sources will be used:

- L2 Blockchain Transactions and Traces (OP Mainnet, Base, Frax, Metal, Mode, Zora), powered by GoldSky
- L2 Blockchain Transactions and Traces (OP Mainnet, Base, Frax, Metal, Mode, Zora), powered by [GoldSky](https://goldsky.com/)
- Web3 Social & Identity ([Farcaster](https://docs.farcaster.xyz/learn/architecture/hubs), [Passport](https://www.passport.xyz/), [EigenTrust by Karma3Labs](https://docs.karma3labs.com/eigentrust), and potentially other NFT or attestation-related credentials)
- Open Source Code Contributions (GitHub, OSS Licenses)
- Project Applications (submitted on Agora)
- Open Source Code Contributions (GitHub, [OSS Licenses](https://spdx.org/licenses/))
- Project Applications (submitted on [Agora](https://vote.optimism.io/))

There is additional data available on OSO, including software dependencies, grant funding histories, data on other chains, etc., which is open for exploration but will not be incorporated into impact metrics for RF4.

@@ -73,7 +73,7 @@ As badgeholders will be voting on portfolios of metrics, not projects, the proje

### Metric Logic

Each metric will be expressed as a SQL model running on top of the underlying data, with some intermediate models to improve readability.
Each metric will be expressed as a SQL model running on top of the underlying data, with some intermediate models to improve readability. One of the core models is called `rf4_events_daily_to_project`, which is a daily snapshot of all events on the Superchain tagged by project up until the end of the RF4 window (2024-05-23).

Here’s an example of [gas fees](https://github.com/opensource-observer/oso/blob/main/warehouse/dbt/models/marts/superchain/metrics/rf4_gas_fees.sql):

@@ -89,19 +89,37 @@ group by
project_id
```

This query grabs all gas-generating events on the Superchain from before 2024-05-23 from RF4-approved projects and sums up their gas fees.
The query above grabs all gas-generating events on the Superchain from RF4-approved projects and sums up their gas fees.

Onchain events are also tagged with a `trusted_user_id` if the address that triggered the event is considered a trusted user. As mentioned earlier, we are working with multiple partners to define what a trusted user is, and will finalize the logic after the RF4 window closes.

Here is an example of a model that looks only at [successful transactions from trusted users](https://github.com/opensource-observer/oso/blob/main/warehouse/dbt/models/marts/superchain/metrics/rf4_trusted_transactions.sql) since 2023-10-01:

```sql
select
project_id,
'trusted_transaction_count' as metric,
SUM(amount) as amount
from {{ ref('rf4_events_daily_to_project') }}
where
event_type = 'CONTRACT_INVOCATION_SUCCESS_DAILY_COUNT'
and bucket_day >= '2023-10-01'
and trusted_user_id is not null
group by
project_id
```

Once again, all of the source code is available from our repo [here](https://github.com/opensource-observer/oso/tree/main/warehouse/dbt/models/marts/superchain). We also have an active [data challenge](https://docs.opensource.observer/docs/contribute/challenges/2024-04-05_data_challenge_01/) to get analysts’ input and proposals on impact metrics.

## Current Metrics

This section will be updated regularly to reflect the latest metrics under consideration for RF4.
This section will be updated regularly to reflect the latest metrics under consideration for RF4. These metrics will be calculated for _all projects_ on the Superchain that verify at least one public GitHub repo and one deployer address (and that are approved in the application process).

### Gas Fees

_Sum of a project's total contribution to gas fees across the Superchain._

\*_Why this metric matters for the collective:_ \*Gas fees are the primary recurring revenue source for the Superchain and a key indicator of aggregate blockspace demand. A project’s gas fee contribution is influenced by its total volume of contract interactions, the computational complexity of those interactions, and the state of the underlying gas market at the time of those transactions. In the long run, gas fees are what will power Retro Funding and enable it to continue in perpetuity. All members of the Superchain have committed at least 15% of their gross profit from gas fees to Retro Funding. Supporting projects that generate revenue in the form of gas fees helps power the economic engine of the Superchain.
**Why this metric matters for the collective:** Gas fees are the primary recurring revenue source for the Superchain and a key indicator of aggregate blockspace demand. A project’s gas fee contribution is influenced by its total volume of contract interactions, the computational complexity of those interactions, and the state of the underlying gas market at the time of those transactions. In the long run, gas fees are what will power Retro Funding and enable it to continue in perpetuity. All members of the Superchain have committed at least 15% of their gross profit from gas fees to Retro Funding. Supporting projects that generate revenue in the form of gas fees helps power the economic engine of the Superchain.

### Total Transactions

@@ -27,11 +27,10 @@ As an example, here’s a very simple onchain impact metric that sums all of a p
SELECT
project_id,
SUM(amount) AS txns_6_months
FROM `opensource-observer.oso.events_monthly_to_project_by_source`
FROM `opensource-observer.oso.rf4_events_daily_to_project`
WHERE
event_type = 'CONTRACT_INVOCATION_DAILY_COUNT'
AND DATE(bucket_month) >= DATE_SUB(CURRENT_DATE(), INTERVAL 6 MONTH)
AND from_namespace = 'OPTIMISM'
event_type = 'CONTRACT_INVOCATION_SUCCESS_DAILY_COUNT'
AND DATE(bucket_day) >= DATE_SUB(CURRENT_DATE(), INTERVAL 6 MONTH)
GROUP BY project_id
```

@@ -92,7 +91,7 @@ A total of **3000 OP** is available as retroactive rewards in the form of L2 tok

The primary way to receive rewards is to submit an impact metric in the form described above. We will reward the contributors who come up with the best metrics with 20-50 OP tokens per metric (capped at 10 metrics per contributor). The actual amount of the reward will be a function of the complexity and utility of the metric. As a guiding principle, we want to incentivize contributors to work on hard but widely applicable metrics. (Basically, we don’t want to see 10 variants of daily active users.)

In addition to direct work on impact metrics, we also have reward budgets for work on collections (defining a group of related projects) and adding/updating project data. We will reward collection creators 10-20 OP per new collection that gets added to oss-directory. We will reward contributors of project data at the rates described in our [bounty program](https://docs.opensource.observer/docs/contribute/challenges/bounties#ongoing-bounties) at the prevailing OP-USDC dex rate on May 10. These are capped at 250 OP per contributor.
In addition to direct work on impact metrics, we also have reward budgets for work on collections (defining a group of related projects) and adding/updating project data. We will reward collection creators 10-20 OP per new collection that gets added to oss-directory. We will reward contributors of project data at the rates described in our [bounty program](https://docs.opensource.observer/docs/contribute/challenges/bounties#ongoing-bounties) at the prevailing OP-USDC dex rate on May 31. These are capped at 250 OP per contributor.

Finally, we have a reward pool for other forms of contribution during the life of the challenge. This could include efforts to onboard or process new datasets, community activation, and improvements to OSO’s underlying infrastructure.

2 changes: 1 addition & 1 deletion apps/docs/docs/contribute/impact-models.md
@@ -211,7 +211,7 @@ macro generates a url safe base64 encoded identifier from a hash of the
namespace of the identifier and the ID within that namespace. This is done to
simplify some table joins at later stages (so you don't need to match on
multiple dimensions). An example of using the macro within the `collections`
namespace for a collection of the slug `foo` would be as follows:
namespace for a collection named `foo` would be as follows:

```jinja
{{ oso_id('collection', 'foo')}}
70 changes: 59 additions & 11 deletions apps/docs/docs/contribute/project-data.md
@@ -4,11 +4,10 @@ sidebar_position: 2
---

:::info
Contributing data about open source projects is one of the simplest and most
important ways to help the OSO community. When a new project is added to our
directory, we automatically index relevant data about its history and ongoing
activity, and we generate a project page on the OSO website. This makes it easy
for people to discover the project and analyze its data.
Add or update data about a project by making a pull request to the OSS Directory.
When a new project is added to OSS directory, we automatically index relevant
data about its history and ongoing activity so it can be queried via our API, included
in metrics dashboards, and analyzed by data scientists.
:::

## Quick Steps
@@ -21,7 +20,58 @@ Add or update project data by making a pull request to [OSS Directory](https://g
2. Locate or create a new project `.yaml` file under `./data/projects/`.
3. Link artifacts (ie, GitHubs, npm packages, blockchain addresses) in the project `.yaml` file.
4. Submit a pull request from your fork back to [OSS Directory](https://github.com/opensource-observer/oss-directory).
5. Once your pull request is approved, you can monitor how much of your project data has been indexed by querying the `event_indexing_status_by_project` through [our API](https://cloud.hasura.io/public/graphiql?endpoint=https://opensource-observer.hasura.app/v1/graphql).
5. Once your pull request is approved, your project will automatically be added to our daily indexers. It may take longer for some historical data (eg, GitHub events) to show up as we run backfill jobs less frequently.

## Schema Overview

---

The latest schema version is Version 5. In this schema, we replace the field `slug` with `name` and the previous `name` field with `display_name`.

:::important
The `name` field is the unique identifier for the project and **must** match the name of the project file. For example, if the project file is `./data/projects/m/my-project.yaml`, then the `name` field should be `my-project`. As a convention, we usually take the GitHub organization name as the project `name`. If the project is a standalone repo within a larger GitHub organization or personal account, you can use the project name followed by the repo owner as the name, separated by hyphens.
:::
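
A minimal sketch of how the `name` field lines up with the file path, using hypothetical project and organization names (only `version`, `name`, and `display_name` are shown; see the full example further down for the remaining fields):

```yaml
# Hypothetical file: ./data/projects/m/my-project.yaml
version: 5
name: my-project # must match the filename (without the .yaml extension)
display_name: My Project
github:
  - url: https://github.com/my-org/my-project
```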

### Fields

The schema currently contains the following fields:

- `version`: The version of the schema you are using. The latest version is Version 5. This is a required field.
- `name`: The unique identifier for the project. This is usually the GitHub organization name or the project name followed by the repo owner, separated by hyphens. This is a required field.
- `display_name`: The name of the project. This is a required field.
- `description`: A brief description of the project.
- `github`: The GitHub URL of the project. This is a list of URLs, as a project can have multiple GitHub URLs. In most cases, the first and only URL will be the main GitHub organization URL. You don't need to include all the repositories that belong to the organization, as we will automatically index all of them.
- `npm`: The npm URL of a package owned by the project. This is a list of URLs, as a project can have multiple npm URLs.
- `blockchain`: A list of blockchain addresses associated with the project. Each address should include the address itself, the networks it is associated with, and any tags that describe the address. The most important addresses to include are deployers and wallets. We use deployers to trace all contracts deployed by a project, and wallets to trace all transactions made by a project.

### Supported Blockchain Networks and Tags

The OSS Directory currently supports the following blockchain networks, which can be enumerated in the `networks` field of a blockchain address:

- `mainnet`: The Ethereum mainnet.
- `arbitrum-one`: The Arbitrum L2 network.
- `optimism`: The Optimism L2 network.
- `base`: The Base L2 network.
- `metal`: The Metal L2 network.
- `mode`: The Mode L2 network.
- `frax`: The Frax L2 network.
- `zora`: The Zora L2 network.

We do not support testnets for any of these networks and do not intend to.

The following tags can be used to describe blockchain addresses:

- `deployer`: A deployer address.
- `eoa`: An externally owned account (EOA) address.
- `safe`: A multisig safe contract address.
- `wallet`: A wallet address. This tag is used to classify the address as a wallet that should be monitored for funding events. This tag is only associated with addresses that are also tagged as `eoa` or `safe`.

In previous versions of the schema, we enumerated contracts and factories with the following tags. These tags are still supported but no longer required since we index all contracts and factories associated with a project from its deployer(s).

- `contract`: A smart contract address.
- `factory`: A factory contract address.
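
To illustrate how the networks and tags above combine, here is a hedged sketch of a `blockchain` section; the addresses are placeholders, not real deployers or wallets:

```yaml
# Illustrative only: both addresses are made up.
blockchain:
  - address: "0x0000000000000000000000000000000000000001"
    networks:
      - optimism
      - base
    tags:
      - eoa
      - deployer # contracts deployed by this address are indexed automatically
  - address: "0x0000000000000000000000000000000000000002"
    networks:
      - mainnet
    tags:
      - safe
      - wallet # monitored for funding events
```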

Read below for more detailed steps on how to add or update project data or consult the [schema](../how-oso-works/oss-directory/) for more information.

## Detailed Steps

@@ -86,7 +136,7 @@ If you run into issues, check out [GitHub's instructions](https://docs.github.co
- Here's an example of a project `.yaml` file:

```yaml
version: 3
version: 5
name: opensource-observer
display_name: Open Source Observer
github:
@@ -108,7 +158,7 @@ If you run into issues, check out [GitHub's instructions](https://docs.github.co
- wallet
- address: "0x5cBd6362e6F222D2A0Feb89f32566ebd27091B98"
networks:
- arbitrum
- arbitrum-one
tags:
- safe
- wallet
@@ -132,9 +182,7 @@ If you run into issues, check out [GitHub's instructions](https://docs.github.co

### 5. Monitor indexing status of your project data

Once your pull request is merged, you can monitor how much of your project data has been indexed by querying [our API](https://cloud.hasura.io/public/graphiql?endpoint=https://opensource-observer.hasura.app/v1/graphql).

The `event_indexing_status_by_project` query takes a `project_name` as an argument and returns the first, last, and total number of event days indexed for the project for each event type and event data provider.
Once your pull request is merged, you can check whether your project data has been indexed by querying [our API](https://cloud.hasura.io/public/graphiql?endpoint=https://opensource-observer.hasura.app/v1/graphql).

Note that our indexer currently runs every 24 hours at 02:00 UTC. Therefore, it may take up to 24 hours for your project data to be fully indexed. Backfills are run periodically to ensure that all data is indexed. If you don't see any historical event data for your project, then the most likely reason is that the backfill has not yet been run.
