An analysis of U.S. CO2 emissions by sector using ClimateTrace data

4 minute read

CO2 is by far the most abundant human-emitted greenhouse gas. Roughly 40 billion metric tons of CO2 are generated each year by transportation, electricity generation, cement manufacturing, deforestation, agriculture, and many other activities.

ClimateTrace

Over time, I’ve been increasingly curious about the topic of climate change. Although there is much international effort to understand emissions, a lack of transparency can hinder our ability to accurately measure them. ClimateTrace is an organization attempting to leverage satellites, remote sensing, and artificial intelligence to provide more accurate estimates of global emissions.

Addressing the problem: largest contributors of CO2 emissions

A comprehensive database of CO2 emissions released by ClimateTrace can help answer a few questions I have about CO2 emissions, such as:

  • How much CO2 does the U.S. emit compared to other countries?
  • What parts of the economy contribute the most CO2 pollution?
  • What is the current state of those sectors and how have they changed over recent years?

Investigating these questions can help me understand which sectors of the U.S. economy are most in need of cleaner products and practices.

Exploration and Cleaning

library(tidyverse)  # read_csv(), glimpse(), and the dplyr verbs used below
library(skimr)      # skim()

emissions <- read_csv("climatetrace.csv", show_col_types = FALSE)
glimpse(emissions)
## Rows: 58,500
## Columns: 7
## $ `Tonnes Co2e` <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
## $ country_full  <chr> "Aruba", "Aruba", "Aruba", "Aruba", "Aruba", "Aruba", "A~
## $ country       <chr> "ABW", "ABW", "ABW", "ABW", "ABW", "ABW", "ABW", "ABW", ~
## $ sector        <chr> "agriculture", "agriculture", "agriculture", "agricultur~
## $ subsector     <chr> "cropland fires", "cropland fires", "cropland fires", "c~
## $ start         <date> 2020-01-01, 2019-01-01, 2018-01-01, 2017-01-01, 2016-01~
## $ end           <date> 2021-01-01, 2020-01-01, 2019-01-01, 2018-01-01, 2017-01~

At first glance, I’m surprised to see missing (NA) values for emissions in the Tonnes Co2e column. Let’s get a better look at the distribution of emissions.

skim(emissions$'Tonnes Co2e')
   
Data summary

Name                     emissions$"Tonnes Co2e"
Number of rows           58500
Number of columns        1
Column type frequency:
  numeric                1
Group variables          None

Variable type: numeric

skim_variable  n_missing  complete_rate     mean        sd  p0  p25   p50     p75        p100  hist
data               14413           0.75  6894622  69037274   0    0  9000  865200  4386245000  ▇▁▁▁▁

A quick skim shows us that about 25% of the 58,500 rows (14,413) have no emissions value, and that the emissions distribution is strongly right-skewed: the mean of about 6.9 million tonnes sits far above the median of 9,000. For now, I’ll save the rows with missing values in a separate tibble before dropping them so I can have a closer look later. Also, having two country identifiers and two date columns is redundant, so I’ll keep only the full country name and the end date.

emissions <-
  emissions %>%
  select(!c(country, start))  # drop the redundant country code and start date

emiss_null <-  # rows with missing emissions, saved for further analysis
  emissions %>%
  filter(is.na(`Tonnes Co2e`))

emissions <-
  emissions %>%
  na.omit()  # drop every row that contains an NA

glimpse(emissions)
## Rows: 44,087
## Columns: 5
## $ `Tonnes Co2e` <dbl> 0, 0, 0, 0, 0, 0, 13300, 13300, 13300, 13700, 13000, 138~
## $ country_full  <chr> "Aruba", "Aruba", "Aruba", "Aruba", "Aruba", "Aruba", "A~
## $ sector        <chr> "agriculture", "agriculture", "agriculture", "agricultur~
## $ subsector     <chr> "rice cultivation", "rice cultivation", "rice cultivatio~
## $ end           <date> 2021-01-01, 2020-01-01, 2019-01-01, 2018-01-01, 2017-01~
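
A quick way to confirm that no rows were lost in the split is to check that the saved NA rows and the retained rows add up to the original count (here, 14,413 + 44,087 = 58,500). The sketch below illustrates the check on a small made-up tibble; the toy values and the names toy, toy_null, and toy_clean are assumptions for illustration, not part of the ClimateTrace data.

```r
library(dplyr)
library(tibble)

# Toy stand-in for the emissions table (values are made up for illustration)
toy <- tibble(
  `Tonnes Co2e` = c(NA, 0, 13300, NA),
  country_full  = c("Aruba", "Aruba", "Aruba", "Aruba"),
  sector        = c("agriculture", "agriculture", "power", "transport")
)

# Same split as above: rows with missing emissions vs. rows with a value
toy_null  <- toy %>% filter(is.na(`Tonnes Co2e`))
toy_clean <- toy %>% filter(!is.na(`Tonnes Co2e`))

# Every row should land in exactly one of the two tibbles
stopifnot(nrow(toy_null) + nrow(toy_clean) == nrow(toy))

# Share of missing values, comparable to n_missing / number of rows in skim()
missing_share <- mean(is.na(toy$`Tonnes Co2e`))
missing_share
```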

Now the data is condensed and relevant to our first question: how much CO2 does the U.S. emit compared to other countries?

Top Polluters by Country, Sector, and Subsector

Let’s total the emissions by the country_full column, save the result to a new tibble called top_country, and look at the top 6 emitters.

top_country <-
  emissions %>%
  group_by(country_full) %>%
  summarize(total_emissions = sum(`Tonnes Co2e`)) %>%
  arrange(desc(total_emissions))

head(top_country)
## # A tibble: 6 x 2
##   country_full             total_emissions
##   <chr>                              <dbl>
## 1 China                        79505597168
## 2 United States of America     38453419159
## 3 India                        22026290510
## 4 Russian Federation           14687431292
## 5 Indonesia                     8233503249
## 6 Japan                         8208617507

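These totals indicate that, over the last 5 years, the U.S. has been the second-largest emitter, behind only China. Using similar code, let’s dive deeper into the sectors that contribute most to emissions. Note that the sector totals here are summed across all countries rather than the U.S. alone: the power sector's total is more than double the U.S. total.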

## # A tibble: 6 x 2
##   sector        total_emissions
##   <chr>                   <dbl>
## 1 power             82195880544
## 2 manufacturing     57042411063
## 3 transport         43085579679
## 4 agriculture       38273277305
## 5 oil and gas       33263211325
## 6 buildings         24883138209
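
The code behind the sector table is not shown, so here is a minimal sketch of what "similar code" might look like, run on a made-up tibble. It computes the totals across all countries and, as a variant, for the U.S. only by filtering on country_full first. The toy values and the names top_sector and us_sector are assumptions for illustration.

```r
library(dplyr)
library(tibble)

# Toy stand-in for the cleaned emissions table (values are made up)
toy <- tibble(
  `Tonnes Co2e` = c(100, 50, 30, 20, 70),
  country_full  = c("United States of America", "United States of America",
                    "China", "China", "United States of America"),
  sector        = c("power", "transport", "power", "agriculture", "transport")
)

# Totals by sector across all countries
top_sector <- toy %>%
  group_by(sector) %>%
  summarize(total_emissions = sum(`Tonnes Co2e`)) %>%
  arrange(desc(total_emissions))

# Same summary restricted to the U.S.
us_sector <- toy %>%
  filter(country_full == "United States of America") %>%
  group_by(sector) %>%
  summarize(total_emissions = sum(`Tonnes Co2e`)) %>%
  arrange(desc(total_emissions))

head(top_sector)
head(us_sector)
```

In the toy data the ranking changes once the filter is applied, which is why it matters whether a table is global or U.S.-only.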

Over the last 5 years, the power sector has contributed the most to CO2 emissions globally. This may come as no surprise; we use vast amounts of power to live our modern lives, and there are no real fossil fuel alternatives that can provide base load power at scale.

Let’s take a look at the top contributors by subsector.

## # A tibble: 6 x 2
##   subsector                             total_emissions
##   <chr>                                           <dbl>
## 1 electricity generation                    74149861000
## 2 roads                                     36063812889
## 3 other manufacturing                       29076461911
## 4 residential commercial onsite heating     21409922971
## 5 enteric fermentation                      16732518899
## 6 oil and gas production                    15672465055
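
Since the country, sector, and subsector tables all follow the same group, sum, and sort pattern, the repetition could be folded into one small function. The sketch below uses a hypothetical helper, total_by(), which is not part of the original analysis, and demonstrates it on made-up values.

```r
library(dplyr)
library(tibble)

# Hypothetical helper: total emissions per group, largest first.
# {{ group_col }} lets the caller pass a bare column name.
total_by <- function(data, group_col) {
  data %>%
    group_by({{ group_col }}) %>%
    summarize(total_emissions = sum(`Tonnes Co2e`), .groups = "drop") %>%
    arrange(desc(total_emissions))
}

# Toy stand-in for the cleaned emissions table (values are made up)
toy <- tibble(
  `Tonnes Co2e` = c(100, 40, 25, 60),
  sector        = c("power", "transport", "transport", "agriculture"),
  subsector     = c("electricity generation", "roads", "roads",
                    "enteric fermentation")
)

by_sector    <- total_by(toy, sector)
by_subsector <- total_by(toy, subsector)

head(by_subsector)
```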

No surprises here, as electricity generation accounts for most of the power sector's emissions. Even though coal, oil, and natural gas have long been known to be heavy CO2 emitters, the data show that these economic behemoths remain leading sources of CO2 pollution today.
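
To put a number on how much of the power sector comes from electricity generation, the two totals from the tables above can be divided directly. This assumes that "electricity generation" is a subsector of "power" and that both tables were computed over the same rows; neither is stated explicitly in the output.

```r
# Totals copied from the sector and subsector tables above
power_total       <- 82195880544  # "power" row of the sector table
electricity_total <- 74149861000  # "electricity generation" row of the subsector table

# Fraction of the power sector's total attributed to electricity generation
electricity_share <- electricity_total / power_total

round(100 * electricity_share, 1)  # percentage, roughly 90
```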

Conclusions and Further Analysis

We have reached conclusions for our first two questions:

  • How much CO2 does the U.S. emit compared to other countries?
    The U.S. emitted roughly 38.5 billion tonnes of CO2 over the last 5 years, about half the total for China, the world’s CO2 emissions leader.

  • What parts of the economy contribute the most CO2 pollution?
    Electricity generation is by far the world’s largest CO2-emitting subsector, followed by road transport and other manufacturing.
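
The "about half" comparison with China can be checked by expressing each country's total as a fraction of the largest one. The sketch below rebuilds the six-row top_country output by hand; the column share_of_top and the name relative_to_top are introduced here for illustration.

```r
library(dplyr)
library(tibble)

# Totals copied from the top_country output earlier in the post
top_country <- tibble(
  country_full = c("China", "United States of America", "India",
                   "Russian Federation", "Indonesia", "Japan"),
  total_emissions = c(79505597168, 38453419159, 22026290510,
                      14687431292, 8233503249, 8208617507)
)

# Each country's total as a fraction of the largest emitter's total
relative_to_top <- top_country %>%
  mutate(share_of_top = total_emissions / max(total_emissions))

relative_to_top
```

The U.S. row comes out at roughly 0.48, consistent with the "about half" summary.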

Initial analysis of ClimateTrace’s data has inspired questions such as:

  • What countries have null values and why?
  • How does a country’s population correlate with its CO2 emissions?
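
A possible starting point for the first question is to count the saved NA rows per country and per subsector. The sketch below uses a made-up tibble shaped like emiss_null; the toy rows and the names emiss_null_toy, null_by_country, and null_by_subsector are assumptions for illustration.

```r
library(dplyr)
library(tibble)

# Toy stand-in for emiss_null (rows are made up for illustration)
emiss_null_toy <- tibble(
  `Tonnes Co2e` = rep(NA_real_, 5),
  country_full  = c("Aruba", "Aruba", "Aruba", "Angola", "Angola"),
  subsector     = c("cropland fires", "cropland fires", "rice cultivation",
                    "cropland fires", "roads")
)

# Number of rows with missing emissions per country, largest first
null_by_country <- emiss_null_toy %>%
  count(country_full, sort = TRUE)

# The same count per subsector, to see whether gaps cluster in certain activities
null_by_subsector <- emiss_null_toy %>%
  count(subsector, sort = TRUE)

null_by_country
null_by_subsector
```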

Further analysis will track trends by subsector over time and look deeper at the factors driving CO2 emissions worldwide.
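
As a rough sketch of how that trend analysis might begin, the end column can be turned into a reporting year and emissions totaled per subsector and year. This assumes each row covers the one-year period ending on the end date (as the start/end pairs in the raw data suggest), so the year is taken as the year of end minus one. The toy values and the name trend are made up for illustration.

```r
library(dplyr)
library(tibble)

# Toy stand-in for the cleaned emissions table (values are made up)
toy <- tibble(
  `Tonnes Co2e` = c(10, 12, 5, 6),
  subsector     = c("roads", "roads", "rice cultivation", "rice cultivation"),
  end           = as.Date(c("2020-01-01", "2021-01-01",
                            "2020-01-01", "2021-01-01"))
)

trend <- toy %>%
  # Assumption: a period ending 2021-01-01 describes calendar year 2020
  mutate(year = as.integer(format(end, "%Y")) - 1L) %>%
  group_by(subsector, year) %>%
  summarize(total_emissions = sum(`Tonnes Co2e`), .groups = "drop") %>%
  arrange(subsector, year) %>%
  group_by(subsector) %>%
  # Year-over-year change within each subsector (NA for the first year)
  mutate(change = total_emissions - lag(total_emissions)) %>%
  ungroup()

trend
```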