Adjusting data normalization for Template V2 #161

MichaelTiemannOSC · 2022-10-27T01:55:37Z

We have discussed creating a more "tidy" normalization for the V2 template (vs. the original unversioned data input template). The basic change is to split the input data sheet into two sheets: one that focuses on company financial data, and another that holds corporate disclosures row by row, where each row is one specific metric reported as a time series in a given disclosure statement. For example, here is an AES disclosure from its 2018 sustainability report:

company_name	metric	sub_metric	unit	report_date	2016	2017	2018
AES Corp.	s1		kt CO2e	2019-02-27	70457	63497	51878
AES Corp.	s2	location	kt CO2e	2019-02-27	306	226	360
AES Corp.	s2	market	kt CO2e	2019-02-27	309	230	362
AES Corp.	s1s2	location	kt CO2e	2019-02-27	70763	63723	52238
AES Corp.	s3	combined	kt CO2e	2019-02-27	5865.8	15422	10894.3
AES Corp.	s3	3	kt CO2e	2019-02-27	5864	15421	10893
AES Corp.	s3	6	kt CO2e	2019-02-27	1.8	1	1.3
AES Corp.	production	equity	GWh	2019-02-27			81670.056
AES Corp.	pdf			2019-02-27			https://www.aes.com/sites/default/files/2021-02/2018-Sustainability-Report.pdf
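
For downstream processing, the wide layout above (one column per year) can be unpivoted into one record per row and year. Here is a minimal sketch using only the Python standard library, assuming the sheet is exported as tab-separated text. The function name `melt_disclosures` and the output fields `year` and `value` are illustrative and not part of the V2 template.

```python
import csv
import io

# Two rows from the AES example above, as tab-separated text.
SAMPLE = (
    "company_name\tmetric\tsub_metric\tunit\treport_date\t2016\t2017\t2018\n"
    "AES Corp.\ts1\t\tkt CO2e\t2019-02-27\t70457\t63497\t51878\n"
    "AES Corp.\tproduction\tequity\tGWh\t2019-02-27\t\t\t81670.056\n"
)

# Columns that identify a disclosure row; every other column is treated as a year.
ID_COLUMNS = ("company_name", "metric", "sub_metric", "unit", "report_date")


def melt_disclosures(tsv_text):
    """Turn wide year columns into one record per (row, year), skipping blank cells."""
    records = []
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        for column, cell in row.items():
            if column in ID_COLUMNS or not cell:
                continue
            record = {key: row[key] for key in ID_COLUMNS}
            record["year"] = int(column)
            record["value"] = float(cell)
            records.append(record)
    return records


records = melt_disclosures(SAMPLE)
for r in records:
    print(r["metric"], r["year"], r["value"], r["unit"])
```

The `pdf` row is left out of the sample because its cell holds a URL rather than a number; a real loader would need to handle such rows separately.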

This has worked well for an initial set of reports from 2016-2020, as most such reports contain very limited data. As corporates have increased coverage and detail, further requirements for the V2 template are coming into view. The purpose of this issue is to track all of the goals for the V2 template, which may mean spawning further issues.

Issues to decide:

Integrated utilities that deliver both electricity and gas don't fit our current implementation, which presumes there is a single base-year statistic for production and emissions (for building projections). An integrated utility has multiple base-year statistics (one for electricity, one for gas). Do we want to split business segments so that each fits a single sector/region assignment? Or, if we generalize the implementation to support multi-sectoral and multi-regional statistics, how should users be able to see, slice, and dice that data?
Carpenter Technology Corp produces both raw steel (with better than 10x the industry average for emissions intensity) and other structural metal products. Their sustainability report gives both raw steel intensity (0.20 t CO2/t Steel) and overall intensity (3.98 t CO2/t Product). They do not report separate steel vs. non-steel production numbers, but they do set targets for overall intensity. Our benchmarks define decarbonization targets for Steel and Aluminum, but not for "Advanced Metal Products" (which might be various steel, aluminum, or titanium alloys). What should we do at the data layer to collect statistics about decarbonization targets for activities closely aligned with well-defined benchmark targets?
EEI reports give detailed information about generation and emissions for "owned generation" (usually on an equity basis), purchased power, and total (owned + purchased) generation and emissions. Some companies are even setting targets for what they expect the emissions intensity of their purchased power to be by 2030, 2040, and 2050. The GHG Protocol documents how purchased power should be attributed to Scope 2 or Scope 3 Category 3 emissions. We should have some kind of data quality check to ensure that reported Scope 2 and Scope 3 Category 3 disclosures are consistent with that attribution. In the case of PPL, this also applies to Gas, not just Electricity.
Is it likely we will ever want to collect statistics on types of fuels burned for Scope 1, and if so, do we need to have some kind of language in the data input that says "to calculate the complete Scope 1 metric, aggregate these sub-metrics"? And if we do, should we do some kind of quality check if both a total and the disaggregated components are all listed?
National Grid reports Scope 3 data, some of which applies only to its US-based operations and some of which applies to both US and UK operations. This may be related to the question of how we deal with multi-region companies. Would we want to report temperature alignment separately for National Grid's UK and US operations? If so, we must keep track of which scope emissions belong to which region.
Guidance is needed on how to translate AGA-reported gas statistics into data that can be scored against a benchmark budget. This means translating fugitive CH4 emissions and Gas Throughput into scope categories with appropriate nominal values.
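
The quality check raised above for totals versus disaggregated components could look something like the following. This is a minimal sketch, not an agreed design: the `check_aggregate` helper, its relative tolerance, and the way sub-metrics are declared as rolling up into a total are all assumptions made for illustration. The figures are the AES Scope 3 values from the example table.

```python
import math


def check_aggregate(total, components, rel_tol=1e-6):
    """Return the years in which a reported total differs from the sum of its components.

    `total` maps year -> value; `components` maps a sub-metric name to a
    year -> value mapping. Years missing from the total or from any
    component are skipped rather than flagged.
    """
    mismatches = {}
    for year, reported in total.items():
        parts = [series.get(year) for series in components.values()]
        if reported is None or any(p is None for p in parts):
            continue
        summed = sum(parts)
        if not math.isclose(reported, summed, rel_tol=rel_tol):
            mismatches[year] = (reported, summed)
    return mismatches


# AES Corp. Scope 3 (kt CO2e): the "combined" row and categories 3 and 6.
s3_combined = {2016: 5865.8, 2017: 15422, 2018: 10894.3}
s3_categories = {
    "3": {2016: 5864, 2017: 15421, 2018: 10893},
    "6": {2016: 1.8, 2017: 1, 2018: 1.3},
}
print(check_aggregate(s3_combined, s3_categories))

# A deliberately wrong total (hypothetical value) is reported back with both numbers.
bad_total = {2018: 11000}
bad = check_aggregate(bad_total, s3_categories)
print(bad)
```

The same helper would apply to the `s1s2` location row, whose values equal the sum of the `s1` row and the `s2` location row in the example table.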

We should add other topics to the list as they come up, and we should file issues when a particular problem is ready to be implemented against a spec. We can reference those issues within this master tracking issue.

MichaelTiemannOSC added the documentation and help wanted labels Oct 27, 2022

MichaelTiemannOSC assigned LeylaJavadova and ImkeHorten Oct 27, 2022

franz-mf unassigned LeylaJavadova Nov 11, 2022
