pyam
The pyam package is available at github.com/IAMconsortium/pyam.
The package was developed by Matthew Gidden and Daniel Huppmann.
It is released under the Apache 2.0 open-source license.
The presentation is based on a talk given by Matthew Gidden at IAMC 2017 in Recife, Brazil, and on the tutorial notebooks of the pyam package.
This presentation is licensed under
a Creative Commons Attribution 4.0 International License.
pyam package
The pyam package provides a range of diagnostic tools and functions for analyzing and working with IAMC-style timeseries data.
The package can be used with data that follows the data template convention of the Integrated Assessment Modeling Consortium (IAMC). An illustrative example is shown below; see data.ene.iiasa.ac.at/database for more information.
model | scenario | region | variable | unit | 2005 | 2010 | 2015 |
---|---|---|---|---|---|---|---|
MESSAGE V.4 | AMPERE3-Base | World | Primary Energy | EJ/y | 454.5 | 479.6 | ... |
... | ... | ... | ... | ... | ... | ... | ... |
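The wide layout above (one column per year) is often easier to process after reshaping it into one record per year. The following is a minimal sketch in plain Python that uses only the values shown in the illustrative row; the helper name `wide_to_long` is hypothetical and is not part of pyam.

```python
# One row in the wide IAMC layout, taken from the illustrative table above.
# Only the two year columns whose values are shown in the table are included.
wide_row = {
    "model": "MESSAGE V.4",
    "scenario": "AMPERE3-Base",
    "region": "World",
    "variable": "Primary Energy",
    "unit": "EJ/y",
    2005: 454.5,
    2010: 479.6,
}

ID_COLUMNS = ("model", "scenario", "region", "variable", "unit")


def wide_to_long(row):
    """Turn one wide row into a list of records with 'year' and 'value' keys.

    Hypothetical helper for illustration only; not a pyam function.
    """
    ids = {col: row[col] for col in ID_COLUMNS}
    records = []
    for key, value in row.items():
        if key in ID_COLUMNS:
            continue  # identifier columns are copied into every record
        records.append({**ids, "year": int(key), "value": float(value)})
    # Sort by year so the result does not depend on the column order.
    return sorted(records, key=lambda rec: rec["year"])


long_records = wide_to_long(wide_row)
for rec in long_records:
    print(rec["region"], rec["variable"], rec["year"], rec["value"])
```

Each record carries the five identifier columns plus a year and a value, which makes filtering by year or by value straightforward.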
The timeseries data used in this tutorial is a partial snapshot of the scenario database compiled for the IPCC's Fifth Assessment Report (AR5):
Krey V., O. Masera, G. Blanford, T. Bruckner, R. Cooke, K. Fisher-Vanden, H. Haberl, E. Hertwich, E. Kriegler, D. Mueller, S. Paltsev, L. Price, S. Schlömer, D. Ürge-Vorsatz, D. van Vuuren, and T. Zwickel, 2014: Annex II: Metrics & Methodology.
In: Climate Change 2014: Mitigation of Climate Change. Contribution of Working Group III to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change [Edenhofer, O., R. Pichs-Madruga, Y. Sokona, E. Farahani, S. Kadner, K. Seyboth, A. Adler, I. Baum, S. Brunner, P. Eickemeier, B. Kriemann, J. Savolainen, S. Schlömer, C. von Stechow, T. Zwickel and J.C. Minx (eds.)]. Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA. Link
The complete AR5 scenario database is publicly available at tntcat.iiasa.ac.at/AR5DB/.
The data snapshot used for this tutorial consists of selected data from two model intercomparison projects, identifiable by the scenario-name prefixes AMPERE3 and EMF27.
The data used in this tutorial is ONLY a partial snapshot of the IPCC AR5 scenario database!
This tutorial is intended only as an illustration of the pyam package.
First, we import the pyam package and load the timeseries data snapshot from the file tutorial_AR5_data.csv in the pyam/tutorial folder.
%matplotlib inline
import pyam
data = '../../pyam/tutorial/tutorial_AR5_data.csv'
df = pyam.IamDataFrame(data=data)
INFO:root:Reading `../../pyam/tutorial/tutorial_AR5_data.csv`
As a first step, we use a number of functions to find out what is included in the snapshot.
df.models()
0    AIM-Enduse 12.1
1    GCAM 3.0
2    IMAGE 2.4
3    MERGE_EMF27
4    MESSAGE V.4
5    REMIND 1.5
6    WITCH_EMF27
Name: model, dtype: object
df.scenarios()
0     AMPERE3-450
1     AMPERE3-450P-CE
2     AMPERE3-450P-EU
3     AMPERE3-550
4     AMPERE3-550P-EU
5     AMPERE3-Base-EUback
6     AMPERE3-CF450P-EU
7     AMPERE3-RefPol
8     EMF27-450-Conv
9     EMF27-450-NoCCS
10    EMF27-550-LimBio
11    EMF27-Base-FullTech
12    EMF27-G8-EERE
Name: scenario, dtype: object
df.regions()
0    ASIA
1    LAM
2    MAF
3    OECD90
4    REF
5    World
Name: region, dtype: object
df.variables(include_units=True)
| | variable | unit |
|---|---|---|
| 0 | Emissions\|CO2 | Mt CO2/yr |
| 1 | Emissions\|CO2\|Fossil Fuels and Industry | Mt CO2/yr |
| 2 | Emissions\|CO2\|Fossil Fuels and Industry\|Energy... | Mt CO2/yr |
| 3 | Emissions\|CO2\|Fossil Fuels and Industry\|Energy... | Mt CO2/yr |
| 4 | Price\|Carbon | US$2005/t CO2 |
| 5 | Primary Energy | EJ/yr |
| 6 | Primary Energy\|Coal | EJ/yr |
| 7 | Primary Energy\|Fossil\|w/ CCS | EJ/yr |
| 8 | Temperature\|Global Mean\|MAGICC6\|MED | °C |
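The variable names above use the `|` character to express a hierarchy, for example `Primary Energy|Coal` as a sub-category of `Primary Energy`. As a rough sketch of how that hierarchy can be navigated in plain Python, the helpers below (`level` and `children`, both hypothetical names, not pyam functions) work on the variable names from the listing; the two names truncated with `...` in the output are left out.

```python
# Variable names from the listing above (truncated names omitted).
variables = [
    "Emissions|CO2",
    "Emissions|CO2|Fossil Fuels and Industry",
    "Price|Carbon",
    "Primary Energy",
    "Primary Energy|Coal",
    "Primary Energy|Fossil|w/ CCS",
    "Temperature|Global Mean|MAGICC6|MED",
]


def level(variable):
    """Depth of a variable in the hierarchy; 0 means a top-level name."""
    return variable.count("|")


def children(parent, names):
    """Names exactly one level below `parent` (illustrative helper)."""
    prefix = parent + "|"
    return [
        name for name in names
        if name.startswith(prefix) and level(name) == level(parent) + 1
    ]


# The first component of every name gives the top-level categories.
top_level = sorted({name.split("|")[0] for name in variables})

print(top_level)
print(children("Primary Energy", variables))
print(children("Emissions|CO2", variables))
```

Note that `Primary Energy|Fossil|w/ CCS` is not reported as a direct child of `Primary Energy` here, because the intermediate name `Primary Energy|Fossil` does not appear in the listing.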
We use the temperature outcome as the first variable of interest in our data snapshot.
v = 'Temperature|Global Mean|MAGICC6|MED'
df.filter({'region': 'World', 'variable': v}).line_plot(legend=False)
[Figure: line plot of Temperature|Global Mean|MAGICC6|MED for the region World, without legend]
We use the temperature outcome as a first criterion for categorizing scenarios.
The function categorize() assigns all scenarios fulfilling a number of criteria to a specific category.
The function metadata() applies a categorization to all scenarios; here, it initializes the 'temperature' column to 'uncategorized'.
df.metadata(meta='uncategorized', name='temperature')
df.categorize(
'temperature', 'Below 1.6C',
criteria={v: {'up': 1.6, 'year': 2100}},
color='cornflowerblue'
)
INFO:root:4 scenarios categorized as `temperature: Below 1.6C`
df.categorize(
'temperature', 'Below 2.0C',
criteria={v: {'up': 2.0, 'lo': 1.6, 'year': 2100}},
color='forestgreen'
)
INFO:root:8 scenarios categorized as `temperature: Below 2.0C`
df.categorize(
'temperature', 'Below 2.5C',
criteria={v: {'up': 2.5, 'lo': 2.0, 'year': 2100}},
color='gold'
)
INFO:root:16 scenarios categorized as `temperature: Below 2.5C`
df.categorize(
'temperature', 'Below 3.5C',
criteria={v: {'up': 3.5, 'lo': 2.5, 'year': 2100}},
color='firebrick'
)
INFO:root:3 scenarios categorized as `temperature: Below 3.5C`
df.categorize(
'temperature', 'Above 3.5C',
criteria={v: {'lo': 3.5, 'year': 2100}},
color='magenta'
)
INFO:root:9 scenarios categorized as `temperature: Above 3.5C`
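The five calls above partition scenarios by their temperature in 2100. The following plain-Python sketch mirrors the thresholds used in those calls. It is not pyam's implementation: the function name `temperature_category`, the example temperatures, and the handling of values that fall exactly on a boundary (lower bound exclusive, upper bound inclusive) are all assumptions made for illustration.

```python
# (category name, lower bound, upper bound), taken from the calls above.
# None means that the bound is not set.
CATEGORIES = [
    ("Below 1.6C", None, 1.6),
    ("Below 2.0C", 1.6, 2.0),
    ("Below 2.5C", 2.0, 2.5),
    ("Below 3.5C", 2.5, 3.5),
    ("Above 3.5C", 3.5, None),
]


def temperature_category(value):
    """Return the category for a 2100 temperature, or 'uncategorized'.

    Assumption: lower bounds are exclusive and upper bounds inclusive.
    """
    if value is None:
        # No temperature data: the scenario keeps the default label.
        return "uncategorized"
    for name, lo, up in CATEGORIES:
        above_lo = lo is None or value > lo
        below_up = up is None or value <= up
        if above_lo and below_up:
            return name
    return "uncategorized"


# Hypothetical 2100 temperatures, NOT values from the AR5 snapshot.
example = {"scenario A": 1.5, "scenario B": 1.9, "scenario C": 3.7, "scenario D": None}
labels = {name: temperature_category(temp) for name, temp in example.items()}
print(labels)
```

A scenario without any temperature data, such as the hypothetical `scenario D`, keeps the default label, which is why some scenarios remain 'uncategorized' after all five calls.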
df.filter({'temperature': 'uncategorized'})[['model', 'scenario']]\
.drop_duplicates()
| | model | scenario |
|---|---|---|
| 0 | AIM-Enduse 12.1 | EMF27-450-Conv |
| 23 | AIM-Enduse 12.1 | EMF27-450-NoCCS |
| 46 | AIM-Enduse 12.1 | EMF27-550-LimBio |
| 69 | AIM-Enduse 12.1 | EMF27-Base-FullTech |
| 92 | AIM-Enduse 12.1 | EMF27-G8-EERE |
| 590 | WITCH_EMF27 | EMF27-450-Conv |
| 613 | WITCH_EMF27 | EMF27-550-LimBio |
| 636 | WITCH_EMF27 | EMF27-Base-FullTech |
The pyam package includes the function require_variable() to check a priori whether a variable exists. The option exclude=True marks the scenarios that lack the variable as "exclude" in the metadata, so that they can easily be removed from further analysis.
df.require_variable(variable=v, exclude=True)
INFO:root:8 scenarios do not include required variable `Temperature|Global Mean|MAGICC6|MED`, marked as `exclude: True` in metadata
| | model | scenario |
|---|---|---|
| 0 | AIM-Enduse 12.1 | EMF27-450-Conv |
| 1 | AIM-Enduse 12.1 | EMF27-450-NoCCS |
| 2 | AIM-Enduse 12.1 | EMF27-550-LimBio |
| 3 | AIM-Enduse 12.1 | EMF27-Base-FullTech |
| 4 | AIM-Enduse 12.1 | EMF27-G8-EERE |
| 5 | WITCH_EMF27 | EMF27-450-Conv |
| 6 | WITCH_EMF27 | EMF27-550-LimBio |
| 7 | WITCH_EMF27 | EMF27-Base-FullTech |
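Conceptually, this check collects the variables reported by each model-scenario pair and lists the pairs that lack the required one. The sketch below shows that idea in plain Python. The helper `missing_variable` is hypothetical, and the records form a hand-made toy subset rather than the actual snapshot.

```python
from collections import defaultdict

REQUIRED = "Temperature|Global Mean|MAGICC6|MED"

# Toy (model, scenario, variable) records, written by hand for illustration.
records = [
    ("AIM-Enduse 12.1", "EMF27-450-Conv", "Primary Energy"),
    ("AIM-Enduse 12.1", "EMF27-450-Conv", "Emissions|CO2"),
    ("MESSAGE V.4", "EMF27-550-LimBio", "Primary Energy"),
    ("MESSAGE V.4", "EMF27-550-LimBio", REQUIRED),
    ("WITCH_EMF27", "EMF27-450-Conv", "Primary Energy"),
]


def missing_variable(rows, variable):
    """Return sorted (model, scenario) pairs that do not report `variable`."""
    available = defaultdict(set)
    for model, scenario, var in rows:
        available[(model, scenario)].add(var)
    return sorted(key for key, names in available.items() if variable not in names)


# Mark the affected scenarios, similar in spirit to an 'exclude' flag.
to_exclude = missing_variable(records, REQUIRED)
exclude_flag = {
    (model, scenario): (model, scenario) in to_exclude
    for model, scenario, _ in records
}
print(to_exclude)
```

Keeping the result as a flag per scenario, rather than deleting rows, leaves the data intact and lets later steps decide whether to filter.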
We repeat the plot, this time excluding the uncategorized scenarios and using the 'temperature' metadata column to assign colors. The colors of the individual categories were defined in the function categorize() above.
df.filter({'variable': v, 'exclude': False})\
.line_plot(color='temperature')
[Figure: line plot of Temperature|Global Mean|MAGICC6|MED, colored by temperature category]
We now plot the timeseries data of the 'Primary Energy' variable, using the color-coding of the Temperature categorization to analyse the correlation between energy consumption and warming.
df.filter({'variable': 'Primary Energy', 'exclude': False})\
.line_plot(color='temperature')
[Figure: line plot of Primary Energy, colored by temperature category]
To get a clearer understanding of the relationship between primary energy and warming, we focus on only those scenarios that have similar levels of primary energy in the base year (2010).
We first use the function validate() to check that certain values are within a given range. It returns the data points that do not satisfy the criteria.
df.validate(criteria={'Primary Energy': {'lo': 400, 'year': 2010}}).head()
INFO:root:104 of 6622 data points to not satisfy the criteria
| | model | scenario | region | variable | unit | year | value |
|---|---|---|---|---|---|---|---|
| 672 | AIM-Enduse 12.1 | EMF27-450-Conv | REF | Primary Energy | EJ/yr | 2010 | 52.61 |
| 666 | AIM-Enduse 12.1 | EMF27-450-Conv | MAF | Primary Energy | EJ/yr | 2010 | 50.12 |
| 660 | AIM-Enduse 12.1 | EMF27-450-Conv | ASIA | Primary Energy | EJ/yr | 2010 | 168.75 |
| 669 | AIM-Enduse 12.1 | EMF27-450-Conv | OECD90 | Primary Energy | EJ/yr | 2010 | 202.29 |
| 663 | AIM-Enduse 12.1 | EMF27-450-Conv | LAM | Primary Energy | EJ/yr | 2010 | 31.42 |
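The rows returned above are all regional values, which are naturally below the 400 EJ/yr threshold. A minimal plain-Python sketch of such a range check follows. The helper `failing_rows` is hypothetical and not pyam's implementation; the five regional values come from the table above, while the `World` row is an assumed value added only so that one row passes the check.

```python
rows = [
    {"region": "REF", "variable": "Primary Energy", "year": 2010, "value": 52.61},
    {"region": "MAF", "variable": "Primary Energy", "year": 2010, "value": 50.12},
    {"region": "ASIA", "variable": "Primary Energy", "year": 2010, "value": 168.75},
    {"region": "OECD90", "variable": "Primary Energy", "year": 2010, "value": 202.29},
    {"region": "LAM", "variable": "Primary Energy", "year": 2010, "value": 31.42},
    # Assumed value for illustration only; not taken from the snapshot.
    {"region": "World", "variable": "Primary Energy", "year": 2010, "value": 505.19},
]


def failing_rows(data, variable, year, lo=None, up=None):
    """Return the rows for `variable` and `year` that lie outside [lo, up]."""
    failed = []
    for row in data:
        if row["variable"] != variable or row["year"] != year:
            continue  # the criteria only apply to this variable and year
        too_low = lo is not None and row["value"] < lo
        too_high = up is not None and row["value"] > up
        if too_low or too_high:
            failed.append(row)
    return failed


failed = failing_rows(rows, "Primary Energy", 2010, lo=400)
print([row["region"] for row in failed])
```

Returning the failing rows, rather than a single true/false answer, makes it easy to see which regions or scenarios cause the check to fail.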
We assign those scenarios that have a Primary Energy level above 400 EJ/yr in 2010 to a new category, and then re-display the previous figure including only these scenarios.
df.metadata(meta='uncategorized', name='PE')
df.categorize(name='PE', value='high',
criteria={'Primary Energy': {'lo': 400, 'year': 2010}})
INFO:root:24 scenarios categorized as `PE: high`
df.filter(
{'variable': 'Primary Energy', 'exclude': False, 'PE': 'high'})\
.line_plot(color='temperature')
[Figure: line plot of Primary Energy for scenarios in the PE category 'high', colored by temperature category]
Next, we want to check how one particular model behaves within an ensemble of scenarios.
df.metadata(meta='uncategorized', name='model_family')
pyam.categorize(
df, filters={'model': 'MESSAGE*'}, name='model_family', value='MESSAGE',
criteria={'Primary Energy': {'lo': 400, 'year': 2010}},
marker='o')
INFO:root:4 scenarios categorized as `model_family: MESSAGE`
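The filter `{'model': 'MESSAGE*'}` above uses `*` as a wildcard. How pyam matches such patterns internally is not shown in this tutorial; as an illustration of the idea only, the sketch below applies shell-style matching from Python's standard `fnmatch` module to the model names listed earlier. The helper `match_names` is a hypothetical name.

```python
from fnmatch import fnmatchcase

# Model names as listed by df.models() earlier in the tutorial.
models = [
    "AIM-Enduse 12.1",
    "GCAM 3.0",
    "IMAGE 2.4",
    "MERGE_EMF27",
    "MESSAGE V.4",
    "REMIND 1.5",
    "WITCH_EMF27",
]


def match_names(names, pattern):
    """Return the names matching a shell-style pattern (case-sensitive)."""
    return [name for name in names if fnmatchcase(name, pattern)]


message_family = match_names(models, "MESSAGE*")
emf27_models = match_names(models, "*EMF27")
print(message_family)
print(emf27_models)
```

Note that `fnmatch` also gives special meaning to `?` and `[...]`, so this sketch is only a reasonable stand-in for names that do not contain those characters.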
from pyam.plotting import run_control
rc = run_control()
rc.update({'marker': {'model_family': {'uncategorized': None}}})
df.filter(
{'variable': 'Primary Energy', 'exclude': False, 'PE': 'high'})\
.line_plot(color='temperature', marker='model_family')
[Figure: line plot of Primary Energy for scenarios in the PE category 'high', colored by temperature category, with markers by model family]
df.filter(
{'variable': 'Primary Energy', 'exclude': False, 'PE': 'high'})\
.line_plot(color='temperature', marker='model_family',
linestyle='scenario', legend=True)
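[Figure: line plot of Primary Energy for scenarios in the PE category 'high', colored by temperature category, with markers by model family, line styles by scenario, and a legend]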
Rather than plotting the development over time, it is often useful to extract and visualize key indicators. In this example, we determine the year of peak warming and plot this indicator against the cumulative CO2 emissions from 2010 until that year.
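def peak_warming(x, peak_year=False):
    # keep only the entries equal to the maximum of the series
    peak = x[x == x.max()]
    if peak_year:
        # first year in which the maximum is reached
        return peak.index[0]
    else:
        return float(max(peak))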
mean_temperature = df.filter(filters={'variable': v}).timeseries()
df.metadata(
mean_temperature.apply(peak_warming, raw=False, axis=1),
'median warming at peak')
df.metadata(
mean_temperature.apply(peak_warming, peak_year=True, raw=False, axis=1),
'year of peak warming')
co2 = df.filter({'region': 'World', 'variable': 'Emissions|CO2'})\
.timeseries() / 1000
df.metadata(
co2.apply(lambda x:
pyam.cumulative(x, first_year=2010,
last_year=df.meta.loc[x.name[0:2],
'year of peak warming']),
raw=False, axis=1),
'cumulative CO2 emissions (2010 to peak warming)')
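To make the two indicators concrete, here is a plain-Python sketch that finds the peak of a temperature pathway and accumulates emissions from 2010 up to the peak year. It is not pyam's implementation: the helpers `peak` and `cumulative_trapezoid` are hypothetical, the accumulation uses the trapezoidal rule as one plausible way to handle gaps between reported years, and both timeseries are made-up numbers, not values from the AR5 snapshot.

```python
# Made-up example pathways (year -> value), for illustration only.
temperature = {2010: 0.9, 2050: 1.8, 2080: 2.1, 2100: 2.0}   # degrees C
co2 = {2010: 36.0, 2050: 20.0, 2080: 5.0, 2100: 0.0}         # Gt CO2/yr


def peak(series):
    """Return (year, value) of the maximum; ties resolve to the earliest year."""
    year = min(series, key=lambda y: (-series[y], y))
    return year, series[year]


def cumulative_trapezoid(series, first_year, last_year):
    """Integrate a series over [first_year, last_year] with the trapezoidal rule.

    Assumes both bounds are years that are present in the series.
    """
    years = sorted(y for y in series if first_year <= y <= last_year)
    total = 0.0
    for y0, y1 in zip(years, years[1:]):
        # Area of the trapezoid spanned by two neighbouring data points.
        total += 0.5 * (series[y0] + series[y1]) * (y1 - y0)
    return total


peak_year, peak_value = peak(temperature)
cumulative_co2 = cumulative_trapezoid(co2, 2010, peak_year)
print(peak_year, peak_value, cumulative_co2)
```

As in the tutorial code, emissions are expressed in Gt here (the tutorial divides the Mt values by 1000), so the cumulative indicator comes out in Gt CO2.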
df.filter({'exclude': False}).\
scatter(x='cumulative CO2 emissions (2010 to peak warming)',
y='median warming at peak')
[Figure: scatter plot of median warming at peak against cumulative CO2 emissions (2010 to peak warming)]
df.filter({'exclude': False}).\
scatter(x='cumulative CO2 emissions (2010 to peak warming)',
y='median warming at peak',
color='temperature')
[Figure: scatter plot of median warming at peak against cumulative CO2 emissions (2010 to peak warming), colored by temperature category]
df.filter({'exclude': False}).\
scatter(x='cumulative CO2 emissions (2010 to peak warming)',
y='median warming at peak',
color='temperature', marker='model_family')
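[Figure: scatter plot of median warming at peak against cumulative CO2 emissions (2010 to peak warming), colored by temperature category, with markers by model family]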
We had previously defined the marker for the scenarios not categorized by model family as None, so no marker is shown for them in the previous scatterplot.
We can easily reset that marker as illustrated below.
rc.update({'marker': {'model_family': {'uncategorized': '*'}}})
df.filter({'exclude': False}).\
scatter(x='cumulative CO2 emissions (2010 to peak warming)',
y='median warming at peak',
color='temperature', marker='model_family')
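[Figure: scatter plot of median warming at peak against cumulative CO2 emissions (2010 to peak warming), colored by temperature category, with markers by model family including '*' for uncategorized scenarios]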
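The calls to `rc.update()` pass a nested dictionary that changes a single marker while the other plot settings stay in place. Whether `run_control` merges nested dictionaries exactly this way is an assumption; the sketch below only illustrates the general technique of a recursive ("deep") update in plain Python and contrasts it with the built-in `dict.update()`. The helper `deep_update` and the `settings` dictionary are hypothetical.

```python
import copy


def deep_update(target, updates):
    """Merge `updates` into `target` recursively and return `target`."""
    for key, value in updates.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            # Both sides are dictionaries: descend instead of replacing.
            deep_update(target[key], value)
        else:
            target[key] = value
    return target


# Hypothetical settings, using marker and color names from this tutorial.
settings = {
    "marker": {"model_family": {"MESSAGE": "o", "uncategorized": None}},
    "color": {"temperature": {"Below 1.6C": "cornflowerblue"}},
}
change = {"marker": {"model_family": {"uncategorized": "*"}}}

merged = deep_update(copy.deepcopy(settings), change)

# For comparison: the built-in update replaces the whole 'marker' entry.
shallow = copy.deepcopy(settings)
shallow.update(change)

print(merged["marker"]["model_family"])
print(shallow["marker"]["model_family"])
```

With the recursive merge, the marker for `MESSAGE` survives the change; with the shallow update it is lost, because the entire nested dictionary is replaced.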
We use the 'EMF27-550-LimBio' scenario from the MESSAGE model to look more closely at the regional breakdown of CO2 emissions.
df_ = df.filter({'model': 'MESSAGE*', 'scenario': 'EMF27-550-LimBio',
'variable': 'Emissions|CO2'})
df_.filter({'region': 'World'}, keep=False).bar_plot(bars='region')
[Figure: bar plot of Emissions|CO2 by region for the MESSAGE model, scenario EMF27-550-LimBio]
df_.filter({'region': 'World'}, keep=False).\
filter({'year': 2010}).pie_plot(category='region')
[Figure: pie plot of Emissions|CO2 by region in 2010]
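A pie plot by region shows each region's share of the total, which is why the aggregate region `World` is filtered out first. The sketch below computes such shares in plain Python. Because the tutorial does not print the regional CO2 values, the example reuses the 2010 Primary Energy values of AIM-Enduse 12.1 shown in the validation table earlier; the `World` entry is an assumed value for illustration.

```python
values_2010 = {
    "REF": 52.61,
    "MAF": 50.12,
    "ASIA": 168.75,
    "OECD90": 202.29,
    "LAM": 31.42,
    "World": 505.19,  # assumed aggregate, for illustration only
}

# Drop the aggregate so that it is not counted twice.
regional = {region: v for region, v in values_2010.items() if region != "World"}

total = sum(regional.values())
shares = {region: v / total for region, v in regional.items()}

# Print the regions from the largest to the smallest share.
for region, share in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{region}: {share:.1%}")
```

Leaving `World` in the data would roughly halve every regional share, since the aggregate would be counted in addition to its parts.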
df_.filter({'region': 'World'}, keep=False).\
filter({'year': 2050}).pie_plot(category='region')
[Figure: pie plot of Emissions|CO2 by region in 2050]
df_.filter({'region': 'World'}, keep=False)\
.filter({'year': 2100}).pie_plot(category='region')
[Figure: pie plot of Emissions|CO2 by region in 2100]
Plotting regions on a map is a work in progress. The following figure is based on the unit test and the CEDS Harmonization work by Matt Gidden.
import matplotlib.pyplot as plt
import cartopy
fig, ax = plt.subplots(
subplot_kw={'projection': cartopy.crs.PlateCarree()}, figsize=(10, 7))
df.map_regions('iso').region_plot(ax=ax, cbar=False)
[Figure: map of regions in a PlateCarree projection, without color bar]
pyam package ...