Goal: Transform the data loaded in the DWH into analytical views by developing a dbt project.
By this stage of the course you should have already:
- A running warehouse (BigQuery or Postgres)
- A set of running pipelines ingesting the project dataset (week 3 completed)
- The following datasets ingested from the course Datasets list:
  - Yellow taxi data - Years 2019 and 2020
  - Green taxi data - Years 2019 and 2020
  - FHV data - Year 2019
Note
- We have two quick hacks to load that data faster: follow this video for option 1, or check the instructions in week3/extras for option 2.
Note
The cloud setup is the preferred option.
The local setup does not require a cloud database.
| Alternative A | Alternative B |
|---|---|
| Setting up dbt for using BigQuery (cloud) | Setting up dbt for using Postgres locally |
| - Open a free developer dbt Cloud account following this link | - Open a free developer dbt Cloud account following this link |
| - Follow these instructions to connect to your BigQuery instance | - Follow the official dbt documentation, or - follow the dbt core with BigQuery on Docker guide to set up dbt locally on Docker, or - use a Docker image from the official Install with Docker. |
| - More detailed instructions in dbt_cloud_setup.md | - You will need to install the latest version with the BigQuery adapter (dbt-bigquery). |
| | - You will need to install the latest version with the Postgres adapter (dbt-postgres). |
| | After local installation you will have to set up the connection to Postgres in profiles.yml; you can find the templates here. |
- What is analytics engineering?
- ETL vs ELT
- Data modeling concepts (fact and dim tables)
- Introduction to dbt
- Anatomy of a dbt model: written code vs compiled Sources
- Materialisations: table, view, incremental, ephemeral
- Seeds, sources and ref
- Jinja and Macros
- Packages
- Variables
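
The "written code vs compiled" and "ref" topics above can be illustrated with a small toy sketch. This is not dbt's implementation; it only mimics the idea that `ref()` calls are replaced by concrete relation names at compile time and that the same calls define the order in which models are built. The staging model names, the raw table names, and the `dbt_dev` schema are assumptions made for the example; only `fact_trips` is taken from this page.

```python
import re
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical models: name -> "written" SQL with Jinja-style ref() calls.
MODELS = {
    "stg_green_tripdata": "select * from raw.green_tripdata",
    "stg_yellow_tripdata": "select * from raw.yellow_tripdata",
    "fact_trips": (
        "select * from {{ ref('stg_green_tripdata') }} "
        "union all "
        "select * from {{ ref('stg_yellow_tripdata') }}"
    ),
}

# Matches {{ ref('model_name') }} with single or double quotes.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"](\w+)['\"]\s*\)\s*\}\}")


def find_refs(sql: str) -> set[str]:
    """Return the names of all models referenced through ref()."""
    return set(REF_PATTERN.findall(sql))


def compile_sql(sql: str, schema: str) -> str:
    """Replace each ref() call with a schema-qualified relation name."""
    return REF_PATTERN.sub(lambda m: f"{schema}.{m.group(1)}", sql)


def build_order(models: dict[str, str]) -> list[str]:
    """Order models so that every model comes after the models it refs."""
    graph = {name: find_refs(sql) for name, sql in models.items()}
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(build_order(MODELS))
    print(compile_sql(MODELS["fact_trips"], schema="dbt_dev"))
```

In a real project dbt does this work itself when it compiles and runs the models; the sketch only shows why models are written with `ref()` rather than hard-coded table names.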
Note
This video is shown entirely on the dbt Cloud IDE, but the same steps can be followed locally in the IDE of your choice.
Tip
- If you receive an error stating "Permission denied while globbing file pattern." when attempting to run fact_trips.sql, this video may be helpful in resolving the issue.
- Tests
- Documentation
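
As a rough illustration of the idea behind tests, the sketch below treats a test as a query that selects the rows violating an assumption, so the test passes when the query returns no rows. It uses an in-memory SQLite database from the Python standard library rather than dbt, and the `trips` table, its columns, and the helper function names are hypothetical. The two checks are modelled on the `unique` and `not_null` tests that dbt ships with, but the SQL here is a simplification, not the SQL dbt generates.

```python
import sqlite3


def make_demo_connection() -> sqlite3.Connection:
    """Create an in-memory table with one duplicate key and one null value."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table trips (tripid text, pickup_datetime text)")
    conn.executemany(
        "insert into trips values (?, ?)",
        [
            ("a1", "2019-01-01 00:10:00"),
            ("a2", "2019-01-01 00:20:00"),
            ("a2", "2019-01-01 00:30:00"),  # duplicate tripid
            ("a3", None),                   # missing pickup_datetime
        ],
    )
    return conn


def failing_rows_unique(conn: sqlite3.Connection, table: str, column: str) -> list:
    """Return (value, count) pairs for non-null values that appear more than once."""
    query = (
        f"select {column}, count(*) from {table} "
        f"where {column} is not null "
        f"group by {column} having count(*) > 1"
    )
    return conn.execute(query).fetchall()


def failing_rows_not_null(conn: sqlite3.Connection, table: str, column: str) -> list:
    """Return the rows in which the column is null."""
    query = f"select * from {table} where {column} is null"
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    conn = make_demo_connection()
    for check, column in [
        (failing_rows_unique, "tripid"),
        (failing_rows_not_null, "pickup_datetime"),
    ]:
        failures = check(conn, "trips", column)
        status = "PASS" if not failures else "FAIL"
        print(f"{check.__name__}({column}): {status} {failures}")
```

In an actual dbt project these checks are declared in the model's YAML file and executed with `dbt test`, as shown in the video.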
Note
This video is shown entirely on the dbt Cloud IDE, but the same steps can be followed locally in the IDE of your choice.
🎥 Google Data Studio video (now renamed to Looker Studio)
🎥 Metabase Video
Did you take notes? You can share them here.
- Notes by Alvaro Navas
- Sandy's DE learning blog
- Notes by Victor Padilha
- Marcos Torregrosa's blog (Spanish)
- Notes by froukje
- Notes by Alain Boisvert
- Setting up Prefect with dbt by Vera
- Blog by Xia He-Bleinagel
- Setting up DBT with BigQuery by Tofag
- Blog post by Dewi Oktaviani
- Notes from Vincenzo Galante
- Notes from Balaji
- Notes by Linda
- 2024 - Videos transcript week4
- Blog Post by Jonah Oliver
- Add your notes here (above this line)