`pyam`
an open-source Python package
for IAM scenario analysis

Daniel Huppmann, IIASA, huppmann@iiasa.ac.at

The presentation is available at
iiasa.github.io/ene-present.github.io/cdlinks-delhi2018

The `pyam` package is available at github.com/IAMconsortium/pyam

The package was developed by Matthew Gidden and Daniel Huppmann.
It is released under the Apache 2.0 open-source license.

The presentation is based on a talk by Matthew Gidden given at IAMC 2017, Recife, Brazil and the tutorial notebooks of the pyam package.

This presentation is licensed under
a Creative Commons Attribution 4.0 International License.

Diagnostics, analysis and visualization tools
for Integrated Assessment timeseries data

First steps with the `pyam` package

The pyam package provides a range of diagnostic tools and functions
for analyzing and working with IAMC-style timeseries data.

The package can be used with data that follows the data template convention of the Integrated Assessment Modeling Consortium (IAMC). An illustrative example is shown below; see data.ene.iiasa.ac.at/database for more information.

model	scenario	region	variable	unit	2005	2010	2015
MESSAGE V.4	AMPERE3-Base	World	Primary Energy	EJ/y	454.5	479.6	...
...	...	...	...	...	...	...	...
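As a rough illustration of the template, the sketch below reshapes one wide row (one column per year) into long records (one record per year). It uses only the Python standard library and is not how pyam reads data; the function name `wide_to_long` and the constant `ID_COLUMNS` are hypothetical, and only the two year columns with visible values in the table above are used.

```python
import csv
import io

# Tab-separated sample in the IAMC wide layout. The values are copied from
# the illustrative table; the elided 2015 column is left out on purpose.
SAMPLE = (
    "model\tscenario\tregion\tvariable\tunit\t2005\t2010\n"
    "MESSAGE V.4\tAMPERE3-Base\tWorld\tPrimary Energy\tEJ/y\t454.5\t479.6\n"
)

# Columns that identify a timeseries; every other column is read as a year.
ID_COLUMNS = ["model", "scenario", "region", "variable", "unit"]


def wide_to_long(text):
    """Return one record per (timeseries, year) from IAMC-style wide data."""
    records = []
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        for column, cell in row.items():
            if column in ID_COLUMNS:
                continue
            record = {key: row[key] for key in ID_COLUMNS}
            record["year"] = int(column)
            record["value"] = float(cell)
            records.append(record)
    return records


long_records = wide_to_long(SAMPLE)
for record in long_records:
    print(record["variable"], record["year"], record["value"])
```

The long layout is convenient for filtering by year or value, while the wide layout is easier to read and exchange.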

Features of the `pyam` package

Validation, diagnostics and sanity checks of the data

Visualization and plotting tools

Categorization of scenarios and creation of metadata indicators

Source of tutorial data

The timeseries data used in this tutorial is a partial snapshot of the scenario database compiled for the IPCC's Fifth Assessment Report (AR5):

Krey V., O. Masera, G. Blanford, T. Bruckner, R. Cooke, K. Fisher-Vanden, H. Haberl, E. Hertwich, E. Kriegler, D. Mueller, S. Paltsev, L. Price, S. Schlömer, D. Ürge-Vorsatz, D. van Vuuren, and T. Zwickel, 2014: Annex II: Metrics & Methodology.
In: Climate Change 2014: Mitigation of Climate Change. Contribution of Working Group III to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change [Edenhofer, O., R. Pichs-Madruga, Y. Sokona, E. Farahani, S. Kadner, K. Seyboth, A. Adler, I. Baum, S. Brunner, P. Eickemeier, B. Kriemann, J. Savolainen, S. Schlömer, C. von Stechow, T. Zwickel and J.C. Minx (eds.)]. Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA.

The complete AR5 scenario database is publicly available at tntcat.iiasa.ac.at/AR5DB/.

Scientific references for selected tutorial data

The data snapshot used for this tutorial consists of selected data from two model intercomparison projects:

Energy Modeling Forum Round 27 (EMF27), see the Special Issue in Climatic Change 3-4, 2014.
EU FP7 project AMPERE, see the following scientific publications:
- Riahi, K., et al. (2015). "Locked into Copenhagen pledges — Implications of short-term emission targets for the cost and feasibility of long-term climate goals." Technological Forecasting and Social Change 90(Part A): 8-23.
  DOI: 10.1016/j.techfore.2013.09.016
- Kriegler, E., et al. (2015). "Making or breaking climate targets: The AMPERE study on staged accession scenarios for climate policy." Technological Forecasting and Social Change 90(Part A): 24-44.
  DOI: 10.1016/j.techfore.2013.09.021

The data used in this tutorial is ONLY a partial snapshot
of the IPCC AR5 scenario database!
This tutorial is intended only as an illustration of the pyam package.

Import package and load data from the AR5 tutorial csv snapshot file

First, we import the pyam package and load the timeseries data snapshot from the file tutorial_AR5_data.csv in the pyam/tutorial folder.

In [1]:

%matplotlib inline
import pyam

In [2]:

data = '../../pyam/tutorial/tutorial_AR5_data.csv'
df = pyam.IamDataFrame(data=data)

INFO:root:Reading `../../pyam/tutorial/tutorial_AR5_data.csv`

What's in our dataset?

As a first step, we use a number of functions to find out what is included in the snapshot.

In [3]:

df.models()

Out[3]:

0    AIM-Enduse 12.1
1           GCAM 3.0
2          IMAGE 2.4
3        MERGE_EMF27
4        MESSAGE V.4
5         REMIND 1.5
6        WITCH_EMF27
Name: model, dtype: object

In [4]:

df.scenarios()

Out[4]:

0             AMPERE3-450
1         AMPERE3-450P-CE
2         AMPERE3-450P-EU
3             AMPERE3-550
4         AMPERE3-550P-EU
5     AMPERE3-Base-EUback
6       AMPERE3-CF450P-EU
7          AMPERE3-RefPol
8          EMF27-450-Conv
9         EMF27-450-NoCCS
10       EMF27-550-LimBio
11    EMF27-Base-FullTech
12          EMF27-G8-EERE
Name: scenario, dtype: object

In [5]:

df.regions()

Out[5]:

0      ASIA
1       LAM
2       MAF
3    OECD90
4       REF
5     World
Name: region, dtype: object

In [6]:

df.variables(include_units=True)

Out[6]:

	variable	unit
0	Emissions\|CO2	Mt CO2/yr
1	Emissions\|CO2\|Fossil Fuels and Industry	Mt CO2/yr
2	Emissions\|CO2\|Fossil Fuels and Industry\|Energy...	Mt CO2/yr
3	Emissions\|CO2\|Fossil Fuels and Industry\|Energy...	Mt CO2/yr
4	Price\|Carbon	US$2005/t CO2
5	Primary Energy	EJ/yr
6	Primary Energy\|Coal	EJ/yr
7	Primary Energy\|Fossil\|w/ CCS	EJ/yr
8	Temperature\|Global Mean\|MAGICC6\|MED	°C

A first look at the data

We use the temperature outcome as the first variable of interest in our data snapshot.

In [7]:

v = 'Temperature|Global Mean|MAGICC6|MED'
df.filter({'region': 'World', 'variable': v}).line_plot(legend=False)

Out[7]:

<matplotlib.axes._subplots.AxesSubplot at 0x111b3ba90>

Categorization of scenarios

We use the temperature outcome as a first criterion for categorization of scenarios.

The function categorize() assigns all scenarios fulfilling a number of criteria to a specific category. The function metadata() applies a categorization to all scenarios; we use it first to initialize every scenario as 'uncategorized'.

In [8]:

df.metadata(meta='uncategorized', name='temperature')

In [9]:

df.categorize(
    'temperature', 'Below 1.6C',
    criteria={v: {'up': 1.6, 'year': 2100}},
    color='cornflowerblue'
)

INFO:root:4 scenarios categorized as `temperature: Below 1.6C`

In [10]:

df.categorize(
    'temperature', 'Below 2.0C',
    criteria={v: {'up': 2.0, 'lo': 1.6, 'year': 2100}},
    color='forestgreen'
)

INFO:root:8 scenarios categorized as `temperature: Below 2.0C`

In [11]:

df.categorize(
    'temperature', 'Below 2.5C',
    criteria={v: {'up': 2.5, 'lo': 2.0, 'year': 2100}},
    color='gold'
)

INFO:root:16 scenarios categorized as `temperature: Below 2.5C`

In [12]:

df.categorize(
    'temperature', 'Below 3.5C',
    criteria={v: {'up': 3.5, 'lo': 2.5, 'year': 2100}},
    color='firebrick'
)

INFO:root:3 scenarios categorized as `temperature: Below 3.5C`

In [13]:

df.categorize(
    'temperature', 'Above 3.5C',
    criteria={v: {'lo': 3.5, 'year': 2100}},
    color='magenta'
)

INFO:root:9 scenarios categorized as `temperature: Above 3.5C`
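The threshold logic behind these calls can be sketched in plain Python. This is a minimal illustration, not pyam's implementation: the warming values are hypothetical, the function name `categorize_by_bounds` is made up for this sketch, and treating both bounds as inclusive is an assumption.

```python
# Hypothetical 2100 warming values (°C) keyed by (model, scenario);
# these are NOT taken from the tutorial data.
warming_2100 = {
    ("model_a", "scen_1"): 1.5,
    ("model_a", "scen_2"): 1.9,
    ("model_b", "scen_1"): 2.7,
    ("model_b", "scen_2"): 4.1,
}

# (category name, lower bound, upper bound); None means "no bound".
# The bounds mirror the 'lo' and 'up' criteria used in the cells above.
CATEGORIES = [
    ("Below 1.6C", None, 1.6),
    ("Below 2.0C", 1.6, 2.0),
    ("Below 2.5C", 2.0, 2.5),
    ("Below 3.5C", 2.5, 3.5),
    ("Above 3.5C", 3.5, None),
]


def categorize_by_bounds(values, categories, default="uncategorized"):
    """Assign each key the last category whose bounds contain its value."""
    result = {key: default for key in values}
    for name, lo, up in categories:
        for key, value in values.items():
            above_lo = lo is None or value >= lo  # assumption: inclusive
            below_up = up is None or value <= up  # assumption: inclusive
            if above_lo and below_up:
                result[key] = name
    return result


categories = categorize_by_bounds(warming_2100, CATEGORIES)
for key, name in categories.items():
    print(key, name)
```

With inclusive bounds, a value that sits exactly on a threshold matches two categories, and in this sketch the later one wins. Scenarios without a value never enter the loop and would keep the default label.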

Checking for uncategorized scenarios

In [14]:

df.filter({'temperature': 'uncategorized'})[['model', 'scenario']]\
    .drop_duplicates()

Out[14]:

	model	scenario
0	AIM-Enduse 12.1	EMF27-450-Conv
23	AIM-Enduse 12.1	EMF27-450-NoCCS
46	AIM-Enduse 12.1	EMF27-550-LimBio
69	AIM-Enduse 12.1	EMF27-Base-FullTech
92	AIM-Enduse 12.1	EMF27-G8-EERE
590	WITCH_EMF27	EMF27-450-Conv
613	WITCH_EMF27	EMF27-550-LimBio
636	WITCH_EMF27	EMF27-Base-FullTech

The pyam package includes the function require_variable() to check a priori whether a variable exists. Passing exclude=True marks the scenarios that lack the variable as `exclude: True` in the metadata, so that they can easily be removed from further analysis.

In [15]:

df.require_variable(variable=v, exclude=True)

INFO:root:8 scenarios do not include required variable `Temperature|Global Mean|MAGICC6|MED`, marked as `exclude: True` in metadata

Out[15]:

	model	scenario
0	AIM-Enduse 12.1	EMF27-450-Conv
1	AIM-Enduse 12.1	EMF27-450-NoCCS
2	AIM-Enduse 12.1	EMF27-550-LimBio
3	AIM-Enduse 12.1	EMF27-Base-FullTech
4	AIM-Enduse 12.1	EMF27-G8-EERE
5	WITCH_EMF27	EMF27-450-Conv
6	WITCH_EMF27	EMF27-550-LimBio
7	WITCH_EMF27	EMF27-Base-FullTech
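Conceptually, this check is a set difference between all scenarios and those that report the variable. The sketch below shows the idea with the standard library only; the records and the function name `missing_variable` are hypothetical and do not reflect pyam internals.

```python
# Hypothetical long-format records; only the fields needed here are shown.
records = [
    {"model": "model_a", "scenario": "scen_1", "variable": "Primary Energy"},
    {"model": "model_a", "scenario": "scen_1", "variable": "Temperature"},
    {"model": "model_b", "scenario": "scen_1", "variable": "Primary Energy"},
]


def missing_variable(records, variable):
    """Return sorted (model, scenario) pairs that never report `variable`."""
    all_scenarios = {(r["model"], r["scenario"]) for r in records}
    reporting = {
        (r["model"], r["scenario"])
        for r in records
        if r["variable"] == variable
    }
    return sorted(all_scenarios - reporting)


missing = missing_variable(records, "Temperature")
print(missing)
```

The returned pairs are the candidates one would flag for exclusion before plotting or computing indicators.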

Plotting the temperature outcome again using the categorization

We repeat the plot, this time excluding the uncategorized scenarios and using the 'temperature' metadata column to assign colors. The colors of the individual categories were defined in the function categorize() above.

In [16]:

df.filter({'variable': v, 'exclude': False})\
    .line_plot(color='temperature')

Out[16]:

<matplotlib.axes._subplots.AxesSubplot at 0x111d7add8>

Using the categorization to analyse other variables

We now plot the timeseries data of the 'Primary Energy' variable, using the color-coding of the Temperature categorization to analyse the correlation between energy consumption and warming.

In [17]:

df.filter({'variable': 'Primary Energy', 'exclude': False})\
    .line_plot(color='temperature')

Out[17]:

<matplotlib.axes._subplots.AxesSubplot at 0x1124030f0>

Filtering scenarios by Primary Energy in the base year

To get a clearer understanding of the relationship between Primary Energy and Warming, we focus on only those scenarios that have similar levels of Primary Energy in the base year (2010).

We first use the function validate() to check that certain values are within a given range. It returns the data points that do not satisfy the criteria.

In [18]:

df.validate(criteria={'Primary Energy': {'lo': 400, 'year': 2010}}).head()

INFO:root:104 of 6622 data points to not satisfy the criteria

Out[18]:

	model	scenario	region	variable	unit	year	value
672	AIM-Enduse 12.1	EMF27-450-Conv	REF	Primary Energy	EJ/yr	2010	52.61
666	AIM-Enduse 12.1	EMF27-450-Conv	MAF	Primary Energy	EJ/yr	2010	50.12
660	AIM-Enduse 12.1	EMF27-450-Conv	ASIA	Primary Energy	EJ/yr	2010	168.75
669	AIM-Enduse 12.1	EMF27-450-Conv	OECD90	Primary Energy	EJ/yr	2010	202.29
663	AIM-Enduse 12.1	EMF27-450-Conv	LAM	Primary Energy	EJ/yr	2010	31.42
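The idea of such a range check can be sketched as follows. This is an illustration under stated assumptions rather than pyam's code: the function name `failing_points` is hypothetical, the World values are invented, and only the REF value is copied from the output above. Note that the call above does not restrict the region, which would explain why regional rows appear among the flagged data points.

```python
records = [
    # Value copied from the validate() output above.
    {"region": "REF", "variable": "Primary Energy", "year": 2010, "value": 52.61},
    # Hypothetical values, for illustration only.
    {"region": "World", "variable": "Primary Energy", "year": 2010, "value": 500.0},
    {"region": "World", "variable": "Primary Energy", "year": 2020, "value": 100.0},
]


def failing_points(records, variable, year, lo=None, up=None):
    """Return the records for `variable` in `year` outside [lo, up]."""
    failing = []
    for record in records:
        if record["variable"] != variable or record["year"] != year:
            continue  # the criteria only apply to this variable and year
        too_low = lo is not None and record["value"] < lo
        too_high = up is not None and record["value"] > up
        if too_low or too_high:
            failing.append(record)
    return failing


failing = failing_points(records, "Primary Energy", 2010, lo=400)
for record in failing:
    print(record["region"], record["value"])
```

Adding a region condition to the loop would limit the check to the global total.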

Assigning valid scenarios to a new category

We assign those scenarios that have a Primary Energy level above 400 EJ/yr in 2010 to a new category, and then re-display the previous figure including only these scenarios.

In [19]:

df.metadata(meta='uncategorized', name='PE')

In [20]:

df.categorize(name='PE', value='high',
              criteria={'Primary Energy': {'lo': 400, 'year': 2010}})

INFO:root:24 scenarios categorized as `PE: high`

In [21]:

df.filter(
    {'variable': 'Primary Energy', 'exclude': False, 'PE': 'high'})\
    .line_plot(color='temperature')

Out[21]:

<matplotlib.axes._subplots.AxesSubplot at 0x11259f048>

Highlighting particular models and scenarios

Next, we want to check how one particular model behaves within an ensemble of scenarios.

In [22]:

df.metadata(meta='uncategorized', name='model_family')

In [23]:

pyam.categorize(
    df, filters={'model': 'MESSAGE*'}, name='model_family', value='MESSAGE',
    criteria={'Primary Energy': {'lo': 400, 'year': 2010}},
    marker='o')

INFO:root:4 scenarios categorized as `model_family: MESSAGE`
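The filter `'MESSAGE*'` uses a trailing wildcard to select a model family. The sketch below imitates that behaviour with `fnmatch` from the standard library, applied to the model names listed earlier; pyam's own pattern matching may be implemented differently, and the helper name `match_models` is hypothetical.

```python
from fnmatch import fnmatchcase

# Model names as returned by df.models() earlier in the tutorial.
MODELS = [
    "AIM-Enduse 12.1",
    "GCAM 3.0",
    "IMAGE 2.4",
    "MERGE_EMF27",
    "MESSAGE V.4",
    "REMIND 1.5",
    "WITCH_EMF27",
]


def match_models(names, pattern):
    """Return the names that match a shell-style pattern such as 'MESSAGE*'."""
    return [name for name in names if fnmatchcase(name, pattern)]


message_family = match_models(MODELS, "MESSAGE*")
print(message_family)
```

Only names starting with the literal prefix match, so `MERGE_EMF27` is not selected even though it also begins with "ME".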

In [24]:

from pyam.plotting import run_control
rc = run_control()
rc.update({'marker': {'model_family': {'uncategorized': None}}})

In [25]:

df.filter(
    {'variable': 'Primary Energy', 'exclude': False, 'PE': 'high'})\
    .line_plot(color='temperature', marker='model_family')

Out[25]:

<matplotlib.axes._subplots.AxesSubplot at 0x112809c50>

And just for the fun of it, let's add scenario linestyles, too...

In [26]:

df.filter(
    {'variable': 'Primary Energy', 'exclude': False, 'PE': 'high'})\
    .line_plot(color='temperature', marker='model_family',
               linestyle='scenario', legend=True)

Out[26]:

<matplotlib.axes._subplots.AxesSubplot at 0x1125cd4a8>

Further analysis using metadata

Rather than plotting the development over time, it is often useful to extract and visualize key indicators. In this example, we determine the peak warming and the year in which it occurs, and plot the peak warming against the cumulative CO2 emissions from 2010 until that year.

In [27]:

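def peak_warming(x, peak_year=False):
    # x is one timeseries row (a pandas Series indexed by year)
    peak = x[x == x.max()]
    if peak_year:
        # first year in which the maximum is reached
        return peak.index[0]
    else:
        return float(max(peak))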

In [28]:

mean_temperature = df.filter(filters={'variable': v}).timeseries()

In [29]:

df.metadata(
    mean_temperature.apply(peak_warming, raw=False, axis=1),
    'median warming at peak')

In [30]:

df.metadata(
    mean_temperature.apply(peak_warming, peak_year=True, raw=False, axis=1),
    'year of peak warming')

In [31]:

co2 = df.filter({'region': 'World', 'variable': 'Emissions|CO2'})\
    .timeseries() / 1000

In [32]:

df.metadata(
    co2.apply(lambda x:
              pyam.cumulative(x, first_year=2010,
                              last_year=df.meta.loc[x.name[0:2],
                                                    'year of peak warming']),
              raw=False, axis=1),
    'cumulative CO2 emissions (2010 to peak warming)')
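To make the two indicators concrete, here is a standard-library sketch of finding the peak year and accumulating emissions up to it. All numbers are hypothetical, the helper names are made up for this sketch, and summing annual values with linear interpolation between reporting years, including both end years, is an assumption about the intended behaviour rather than a statement about `pyam.cumulative`.

```python
# Hypothetical series keyed by year; NOT taken from the tutorial data.
temperature = {2010: 0.9, 2020: 1.7, 2030: 1.5}    # °C
emissions = {2010: 30.0, 2020: 40.0, 2030: 20.0}   # Gt CO2/yr


def peak_year(series):
    """Return the first year in which the series reaches its maximum."""
    highest = max(series.values())
    return min(year for year, value in series.items() if value == highest)


def interpolate(series, year):
    """Linearly interpolate a {year: value} mapping at `year`."""
    if year in series:
        return series[year]
    years = sorted(series)
    for left, right in zip(years, years[1:]):
        if left < year < right:
            delta = series[right] - series[left]
            return series[left] + delta * (year - left) / (right - left)
    raise ValueError(f"year {year} is outside the range of the series")


def cumulative_sum(series, first_year, last_year):
    """Sum annual values from first_year to last_year (both inclusive)."""
    return sum(
        interpolate(series, year)
        for year in range(first_year, last_year + 1)
    )


peak = peak_year(temperature)
total = cumulative_sum(emissions, 2010, peak)
print(peak, total)
```

Because most scenario data is reported in five- or ten-year steps, some interpolation rule is needed before annual values can be accumulated; a different rule would give a different total.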

In [33]:

df.filter({'exclude': False}).\
    scatter(x='cumulative CO2 emissions (2010 to peak warming)',
            y='median warming at peak')

Out[33]:

<matplotlib.axes._subplots.AxesSubplot at 0x112c1d4a8>

In [34]:

df.filter({'exclude': False}).\
    scatter(x='cumulative CO2 emissions (2010 to peak warming)',
            y='median warming at peak',
            color='temperature')

Out[34]:

<matplotlib.axes._subplots.AxesSubplot at 0x1138b6a20>

In [35]:

df.filter({'exclude': False}).\
    scatter(x='cumulative CO2 emissions (2010 to peak warming)',
            y='median warming at peak',
            color='temperature', marker='model_family')

Out[35]:

<matplotlib.axes._subplots.AxesSubplot at 0x1128e6ac8>

We had previously defined the marker for the scenarios not categorized by model family as None, so no marker is shown in the previous scatterplot.

We can easily reset that marker as illustrated below.

In [36]:

rc.update({'marker': {'model_family': {'uncategorized': '*'}}})

In [37]:

df.filter({'exclude': False}).\
    scatter(x='cumulative CO2 emissions (2010 to peak warming)',
            y='median warming at peak',
            color='temperature', marker='model_family')

Out[37]:

<matplotlib.axes._subplots.AxesSubplot at 0x113b62c18>

Looking at regional disaggregation for a particular scenario

We use the 'EMF27-550-LimBio' scenario from the MESSAGE model to look more closely at the regional breakdown.

In [38]:

df_ = df.filter({'model': 'MESSAGE*', 'scenario': 'EMF27-550-LimBio',
                 'variable': 'Emissions|CO2'})

In [39]:

df_.filter({'region': 'World'}, keep=False).bar_plot(bars='region')

Out[39]:

<matplotlib.axes._subplots.AxesSubplot at 0x113d75320>

In [40]:

df_.filter({'region': 'World'}, keep=False).\
    filter({'year': 2010}).pie_plot(category='region')

Out[40]:

<matplotlib.axes._subplots.AxesSubplot at 0x113f132e8>

In [41]:

df_.filter({'region': 'World'}, keep=False).\
    filter({'year': 2050}).pie_plot(category='region')

Out[41]:

<matplotlib.axes._subplots.AxesSubplot at 0x1140d0e10>

In [42]:

df_.filter({'region': 'World'}, keep=False)\
    .filter({'year': 2100}).pie_plot(category='region')

Out[42]:

<matplotlib.axes._subplots.AxesSubplot at 0x11411cef0>

And finally - who doesn't like maps?

This feature is a work in progress. The following figure is based on the unit test and the CEDS Harmonization work by Matt Gidden.

In [45]:

import matplotlib.pyplot as plt
import cartopy
fig, ax = plt.subplots(
    subplot_kw={'projection': cartopy.crs.PlateCarree()}, figsize=(10, 7))
df.map_regions('iso').region_plot(ax=ax, cbar=False)

Out[45]:

<cartopy.mpl.geoaxes.GeoAxesSubplot at 0x1143bca58>

Summary, conclusions, outlook

The `pyam` package ...

1. is a toolbox for AUTOMATED sanity checks and diagnostics of scenarios

2. allows efficient analysis of scenarios in model comparison exercises

3. provides a number of 'out-of-the-box' visualization tools

We hope that the package will develop into a valuable resource
for the energy modeling and integrated assessment community!

Please send suggestions or contribute to the package development on GitHub!

Find out more on github.com/IAMconsortium/pyam