Representing native tokens in prices models #6577

Open · 0xRobin opened this issue Aug 19, 2024 · 3 comments
Labels: in review (Assignee is currently reviewing the PR)

0xRobin (Collaborator) commented Aug 19, 2024

Representing native tokens in prices models

Native tokens are often annoying to deal with when pulling in price info.
I'll describe the current state with its problems and solutions, and what I think should be the desired state.

1. Current state

Currently the native tokens are defined here:

('eos-eos', null, 'EOS', null, null),
('etc-ethereum-classic', null, 'ETC', null, null),
('eth-ethereum', null, 'ETH', null, null),
('ftm-fantom', null, 'FTM', null, null),

They have blockchain, contract_address and decimals as null values.

This is also how they show up in prices.usd (screenshot of a dune query omitted).
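For reference, a minimal query sketch that surfaces those rows (the column list is based on the prices.usd schema used throughout this issue):

-- inspect how native ETH currently shows up in prices.usd
select
   blockchain
  ,contract_address
  ,decimals
  ,symbol
  ,minute
  ,price
from prices.usd
where blockchain is null
  and symbol = 'ETH'
order by minute desc
limit 10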

Problems and current solutions

When you have a model that includes trades in both erc20 and native tokens, you have to adapt your query to deal with this.
Most solutions follow one of these setups:

1. Replace the native rows with a wrapped alternative

select *
from (
    select
       blockchain
      ,tx_hash
      ,case when currency_contract = 0x0000000000000000000000000000000000000000
         then 0xC02aaA39b223FE8D0A0e5C4F27eAD9083C756Cc2 -- WETH
         else currency_contract
       end as currency_contract
      ,amount
    from trades
) t
left join prices.usd p
    on p.blockchain = t.blockchain
    and p.contract_address = t.currency_contract

Here we end up using the wrong price feed (WETH instead of ETH): this logic breaks whenever there's a depeg event, and we also end up with the wrong symbol in our end table.

2. Add extra logic in the join condition

select
   t.blockchain
  ,t.tx_hash
  ,t.currency_contract
  ,t.amount
  ,p.price
from trades t
left join prices.usd p
on (p.blockchain = t.blockchain
    and t.currency_contract = p.contract_address)
or (t.currency_contract = 0x0000000000000000000000000000000000000000
    and p.blockchain is null
    and p.symbol = 'ETH')

This works fine, but requiring blockchain to be null is confusing, and we're relying solely on the symbol column to determine the price feed, which is an unsafe column that can hold arbitrary data.
This is the current default and the setup I've seen most often.
A note from the prices beta announcement reiterates the danger of relying on the symbol column:

> Prices are calculated at the contract_address and blockchain level. Token symbol is not a unique identifier as different tokens may have the same symbol.

3. Patch the prices model so it integrates better

-- TODO: We should remove this CTE and include ETH into the general prices table once everything is migrated
WITH prices_patch as (
    SELECT
        contract_address
        ,blockchain
        ,decimals
        ,minute
        ,price
        ,symbol
    FROM {{ source('prices','usd_forward_fill') }}
    {% if is_incremental() %}
    WHERE {{ incremental_predicate('minute') }}
    {% endif %}

    UNION ALL

    SELECT
        {{ var("ETH_ERC20_ADDRESS") }} as contract_address
        ,'ethereum' as blockchain
        ,18 as decimals
        ,minute
        ,price
        ,'ETH' as symbol
    FROM {{ source('prices','usd_forward_fill') }}
    WHERE blockchain is null AND symbol = 'ETH'
    {% if is_incremental() %}
    AND {{ incremental_predicate('minute') }}
    {% endif %}
)

This is mostly something I've been doing; I haven't seen it adopted elsewhere.

4. Join in a prices table for the native token separately
    , COALESCE(pu_eth.price*SUM(CAST(et.value as DOUBLE))/POWER(10, 18), pu_erc20s.price*SUM(CAST(erc20s.value as DOUBLE))/POWER(10, pu_erc20s.decimals))*(nft_mints.amount/nft_count.nfts_minted_in_tx) AS amount_usd

    LEFT JOIN {{ source('prices','usd') }} pu_eth
    ON pu_eth.blockchain IS NULL
    AND pu_eth.minute=date_trunc('minute', et.block_time)
    AND pu_eth.symbol = 'ETH'
    {% if is_incremental() %}
    AND {{incremental_predicate('pu_eth.minute')}}
    {% endif %}

    LEFT JOIN {{ source('prices','usd') }} pu_erc20s ON pu_erc20s.blockchain='{{blockchain}}'
    AND pu_erc20s.minute=date_trunc('minute', erc20s.evt_block_time)
    AND erc20s.contract_address=pu_erc20s.contract_address
    {% if is_incremental() %}
    AND {{incremental_predicate('pu_erc20s.minute')}}
    {% endif %}

Conclusion: joining prices with native tokens is tricky and very prone to producing join duplicates when small errors are made. This has been the subject of many data error investigations in spellbook.

2. Desired state

Prices should be uniquely identified by:

  • blockchain
  • contract_address
  • timestamp

and native tokens should follow this rule.
When this is true, any join with prices would simply look like this:

left join prices.usd p
    on p.blockchain = t.blockchain
    and p.contract_address = t.currency_address
    and p.minute = date_trunc('minute', t.block_time)

No extra join logic, no case-when, no coalesce() in your select statement.

My proposal is to add all native tokens to the prices tables with:

  • 0x0000000000000000000000000000000000000000 as contract_address
  • blockchain filled in correctly
  • decimals filled in correctly
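For illustration, the ETH row in the seed snippet from section 1 would then become something like this (assuming the same column order as above, i.e. id, blockchain, symbol, contract_address, decimals):

('eth-ethereum', 'ethereum', 'ETH', 0x0000000000000000000000000000000000000000, 18),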

If there are different standards for representing the native token as an erc20 address (or precompile addresses on some L2s), we can either specify the contract address for each chain individually, or add all representations that make sense.
E.g. we could have the ethereum price feed both at 0x00000... and at 0xeeee..., so that users don't have to clean up their data if the source uses a different representation (see the sketch below).
OR we impose one native address per chain to force standardization.
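A sketch of the dual-representation option for ethereum; the second row uses the common 0xeeee… native-asset sentinel, written out in full here as an assumption:

('eth-ethereum', 'ethereum', 'ETH', 0x0000000000000000000000000000000000000000, 18),
('eth-ethereum', 'ethereum', 'ETH', 0xEeeeeEeeeEeEeeEeEeEeeEEEeeeeEeeeeeeeEEeE, 18),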

This is a small change in spellbook and could be implemented quickly.
The problem is that this change would break any query that currently follows solution 2 described above (which is the one I've seen most).

@jeff-dude jeff-dude self-assigned this Aug 20, 2024
@jeff-dude jeff-dude added the in review Assignee is currently reviewing the PR label Aug 20, 2024
jeff-dude (Member) commented Aug 20, 2024

thinking out loud here on prices pipeline and summarizing above:

  • trusted tokens sql file: continue the current approach and scale it per chain added; it reads from coinpaprika to get the price
  • native tokens sql file: modify the existing file to include contract_address, symbol, decimals per chain
    • note: some chains could have multiple rows (one per unique address), as noted above
  • new pipeline which generates prices for all other tokens (WIP pipeline)

three different inputs, all writing to separate tables, then a final union view per level of granularity (minute, hour, day).

the table is then clean in terms of level of granularity, with consistent data written to all columns.
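a minimal sketch of what the minute-level union view could look like (the three input table names here are assumptions, not actual pipeline names):

-- hypothetical final view at minute granularity
create or replace view prices.usd_minute as
select blockchain, contract_address, symbol, decimals, minute, price
from prices.trusted_tokens_minute
union all
select blockchain, contract_address, symbol, decimals, minute, price
from prices.native_tokens_minute
union all
select blockchain, contract_address, symbol, decimals, minute, price
from prices.other_tokens_minute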

downstream:

  • all joins to prices are now the same for any scenario
  • while the change to native tokens is simple in the spell itself, we need to measure the impact downstream (both spells joining prices in various ways and queries on the dune app)
    • how to handle if large impact on queries?

expectation:

  • blockchain team will gather info from chains as they onboard onto dune to ensure we apply native token data correctly

0xBoxer (Collaborator) commented Aug 21, 2024

> The problem is that this change would break any query that currently follows solution 2 described above (which is the one I've seen most).

We could implement your desired state and carry the existing implementation forward.
This would improve the UX for users today and keep more bad queries from being written.

Deprecating the existing rows is going to be very difficult, if not impossible, without breaking queries.

jeff-dude (Member) commented Aug 21, 2024

> We could implement your desired state and carry the existing implementation forward. This would improve the UX for users today and keep more bad queries from being written.
>
> Deprecating the existing rows is going to be very difficult, if not impossible, without breaking queries.

not a bad idea if we want to avoid breaking changes. something like:

  • keep the native tokens file as-is, but add another row per token populating all fields such as blockchain, address, decimals, etc (sketched below)
  • technically we are duplicating the data, which would grow the table size further (it's about to explode in size anyway)
  • while we duplicate the data, the unique columns would remain unique, so joins wouldn't break (both old and new)
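for illustration, the coexisting ETH rows could look like this (same assumed seed format as in the issue above; the legacy row keeps old joins working, the new row enables the clean join):

('eth-ethereum', null, 'ETH', null, null),
('eth-ethereum', 'ethereum', 'ETH', 0x0000000000000000000000000000000000000000, 18),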

maybe there is something i'm not thinking of off the top of my head, but we could explore this

edit:
after speaking a bit further with rob on it, this may still break one of the options above. my suggestion is that we put a task in triage to prioritize, which would run a test to see whether or not the approach above resolves all the scenarios described.
