Update all FERC dataset DOIs #3790

Merged
merged 1 commit into from
Aug 14, 2024
Conversation

zaneselvans
Member

@zaneselvans zaneselvans commented Aug 14, 2024

Overview

  • I noticed that almost all of the FERC archives showed slight increases in the size of their 2023 XBRL partitions in August relative to July, so it seemed likely that each had picked up a few straggler filings. I wanted to see whether there were any issues with updating the DOIs to capture those additional filings for the quarterly release.
  • I kicked off a branch build. I imagine that there will be some row counts to update in the FERC Form 1, but if that's the only thing that comes up maybe we can merge it in.
  • So maybe the increase in sizes was just due to some revisions coming in that filled previously empty entries?

Testing

  • I ran a full nightly build with a manual workflow_dispatch. Technically it failed, but only because of a single stochastic error in the timeseries_analysis unit tests.
  • I also ran the full ETL and data validations locally overnight, and they all passed. The updated FERC1 archive didn't affect row counts in any tables that we track.
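A row-count check like the one described above can be sketched in a few lines. This is only an illustration using Python's standard `sqlite3` module, not the validation code PUDL actually runs; the function names `table_row_counts` and `changed_row_counts` are hypothetical, and the two connections are assumed to point at SQLite outputs built from the old and new archives.

```python
import sqlite3


def table_row_counts(conn, tables):
    """Return {table_name: row_count} for each requested table."""
    counts = {}
    for table in tables:
        # Table names cannot be bound as SQL parameters, so quote them
        # as identifiers instead.
        quoted = '"' + table.replace('"', '""') + '"'
        row = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()
        counts[table] = row[0]
    return counts


def changed_row_counts(old_conn, new_conn, tables):
    """Return {table_name: (old_count, new_count)} where the counts differ."""
    old = table_row_counts(old_conn, tables)
    new = table_row_counts(new_conn, tables)
    return {t: (old[t], new[t]) for t in tables if old[t] != new[t]}
```

An empty result means no tracked table changed size, which is consistent with revised filings overwriting existing rows rather than adding new ones.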


@zaneselvans zaneselvans added the labels ferc1 (Anything having to do with FERC Form 1), ferc714 (Anything having to do with FERC Form 714), ferc2 (Issues related to the FERC Form 2 dataset), ferc6, ferc60, and data-update (When fresh data is integrated into PUDL from quarterly or annual updates) Aug 14, 2024
@zaneselvans zaneselvans self-assigned this Aug 14, 2024
@zaneselvans zaneselvans requested a review from e-belfer August 14, 2024 15:08
@zaneselvans zaneselvans marked this pull request as ready for review August 14, 2024 15:09
Member

@e-belfer e-belfer left a comment

Using the handy sqlite-table-diff.ipynb notebook, I compared a selection of tables from the two databases:

  • out_ferc1__yearly_all_plants: ~300 records changed. About 2/3 originate from the notoriously junky small plants table; the other 1/3 are a handful of updates to capex and opex values from two utilities.
  • out_ferc1__yearly_balance_sheet_assets_sched110: No changes
  • out_ferc1__yearly_balance_sheet_liabilities_sched110: No changes
  • out_ferc1__yearly_cash_flows_sched120: No changes
  • out_ferc1__yearly_income_statements_sched114: One utility moved ~50k between budget lines
  • out_ferc1__yearly_operating_expenses_sched320: No changes
  • out_ferc1__yearly_operating_revenues_sched300: No changes
  • out_ferc1__yearly_plant_in_service_sched204: No changes
  • out_ferc1__yearly_retained_earnings_sched118: No changes
  • out_ferc1__yearly_utility_plant_summary_sched200: No changes

So there are modest but not troubling changes in the data, and we should go ahead and update these DOIs.
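For readers who want to reproduce this kind of comparison without the notebook, the following is a minimal sketch of a row-level diff between the same table in two SQLite databases. It is not the contents of sqlite-table-diff.ipynb; the function name `diff_table` is hypothetical, and it assumes the table has the same columns in both databases and is small enough to load into memory.

```python
import sqlite3
from collections import Counter


def diff_table(old_conn, new_conn, table):
    """Compare one table across two databases as multisets of rows.

    Returns a dict with two Counters: rows present only in the old
    database ("removed") and rows present only in the new one ("added").
    """
    # Table names cannot be bound as SQL parameters, so quote the identifier.
    quoted = '"' + table.replace('"', '""') + '"'
    query = f"SELECT * FROM {quoted}"
    # Counter (rather than set) keeps track of duplicate rows.
    old_rows = Counter(old_conn.execute(query).fetchall())
    new_rows = Counter(new_conn.execute(query).fetchall())
    return {"removed": old_rows - new_rows, "added": new_rows - old_rows}
```

With this approach a revised value shows up as one removed row plus one added row, so a table with no changes yields two empty Counters, while a table where only values were revised yields equal totals on both sides.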

@e-belfer e-belfer added this pull request to the merge queue Aug 14, 2024
Merged via the queue into main with commit 28db002 Aug 14, 2024
20 checks passed
@e-belfer e-belfer deleted the ferc-2024q2 branch August 14, 2024 20:22