@camriddell
Last active August 2, 2024 16:05
notes-2024-07-16.ipynb
{
"cells": [
{
"cell_type": "markdown",
"id": "a3d481af",
"metadata": {},
"source": [
"# How do I even get started with Data Visualization?"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a94b3c78",
"metadata": {},
"outputs": [],
"source": [
"from IPython.display import display\n",
"\n",
"display(\"Let's Get Started\")"
]
},
{
"cell_type": "markdown",
"id": "547ba3f5",
"metadata": {},
"source": [
"## Load Data Files\n",
"\n",
"*This is optional if you are **not** using Google Colab*\n",
"\n",
"Execute these cells to access the `data` for this workshop and install some additional packages."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c02aee51",
"metadata": {},
"outputs": [],
"source": [
"!git clone https://github.com/dutc-io/agu-data-viz.git"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b34f338e",
"metadata": {},
"outputs": [],
"source": [
"%cd agu-data-viz"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "26274060",
"metadata": {},
"outputs": [],
"source": [
"!pip install ipyvizzu==0.15.0"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e3f7b766",
"metadata": {},
"outputs": [],
"source": [
"! [[ -e data/anscombe.csv ]] && echo \"Data files loaded!\""
]
},
{
"cell_type": "markdown",
"id": "afed0baf",
"metadata": {},
"source": [
"## Types of Visualizations\n",
"- Exploratory\n",
"- Communicative\n",
"- Fun!\n",
"\n",
"## Types of Data Visualization Tools\n",
"- Non-programmatic\n",
"  - low : hand/computer drawing\n",
"  - high: GUI chart builders* (excel, tableau, ...)\n",
"- Programmatic\n",
"  - low : programmatic drawing\n",
"  - high: declarative & convenience"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "239cb123",
"metadata": {},
"outputs": [],
"source": [
"from IPython.display import display\n",
"from pandas import read_csv\n",
"\n",
"anscombe = read_csv('data/anscombe.csv')\n",
"\n",
"display(\n",
" # anscombe.head(),\n",
" anscombe.groupby('id')[['x', 'y']].agg(['mean', 'std']),\n",
")"
]
},
{
"cell_type": "markdown",
"id": "36660b4e",
"metadata": {},
"source": [
"## Common Data Visualization APIs\n",
"\n",
"### Drawing (non-declarative)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7627320f",
"metadata": {},
"outputs": [],
"source": [
"from pandas import read_csv\n",
"from matplotlib.pyplot import subplots, show\n",
"\n",
"anscombe = read_csv('data/anscombe.csv')\n",
"\n",
"fig, axes = subplots(nrows=2, ncols=2)\n",
"\n",
"for (label, group), ax in zip(anscombe.groupby('id'), axes.flat):\n",
"    ax.scatter(group['x'], group['y'], s=24)\n",
"    ax.set_title(label)\n",
"\n",
"show()"
]
},
{
"cell_type": "markdown",
"id": "a7ecdc22",
"metadata": {},
"source": [
"### High Level - Declarative"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "bec53cfb",
"metadata": {},
"outputs": [],
"source": [
"from matplotlib.pyplot import show\n",
"from plotnine import ggplot, facet_wrap, geom_point, geom_smooth, labs, theme_minimal, aes\n",
"from pandas import read_csv\n",
"\n",
"anscombe = read_csv('data/anscombe.csv')\n",
"\n",
"(\n",
" ggplot(anscombe, aes(x='x', y='y'))\n",
" + facet_wrap('id', ncol=2)\n",
" + geom_point()\n",
" + geom_smooth(method='ols')\n",
" + labs(x='x variable', y='y variable', title='Anscombe’s Quartet')\n",
" + theme_minimal()\n",
").draw()\n",
"\n",
"show()"
]
},
{
"cell_type": "markdown",
"id": "c4d1a578",
"metadata": {},
"source": [
"### High Level - Convenience"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3a67b81e",
"metadata": {},
"outputs": [],
"source": [
"from matplotlib.pyplot import show\n",
"from seaborn import lmplot\n",
"from pandas import read_csv\n",
"\n",
"anscombe = read_csv('data/anscombe.csv')\n",
"\n",
"lmplot(anscombe, x='x', y='y', col='id', col_wrap=2)\n",
"\n",
"show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0d7ee168",
"metadata": {},
"outputs": [],
"source": [
"from seaborn.objects import Plot, Dots, PolyFit, Line\n",
"from pandas import read_csv\n",
"\n",
"anscombe = read_csv('data/anscombe.csv')\n",
"\n",
"\n",
"(\n",
" Plot(anscombe, x='x', y='y')\n",
" .facet(col='id', wrap=2)\n",
" .add(Dots(color='black'))\n",
" .add(Line(), PolyFit(1))\n",
").show()"
]
},
{
"cell_type": "markdown",
"id": "3012bbe4",
"metadata": {},
"source": [
"### Is it worth it to learn multiple data visualization libraries/languages?\n",
"\n",
"## Let’s Make Some Viz!\n",
"\n",
"### Static Data Visualizations (explore, communicative, fun)\n",
"\n",
"**matplotlib**"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "15be4264",
"metadata": {},
"outputs": [],
"source": [
"from matplotlib.pyplot import figure, show\n",
"from matplotlib.patches import Circle, Rectangle\n",
"\n",
"fig = figure(figsize=(6,6))\n",
"\n",
"c = Circle((.5, .8), .1)\n",
"fig.add_artist(c)\n",
"\n",
"## uncomment below if running from Jupyter Notebook/Google Colab\n",
"# ax = fig.add_axes([0, 0, 0, 0])\n",
"# ax.set_visible(False)\n",
"\n",
"body_rect = Rectangle((.47, .75), .06, -.5)\n",
"fig.add_artist(body_rect)\n",
"\n",
"arms_rect = Rectangle((.3, .55), .4, .05)\n",
"fig.add_artist(arms_rect)\n",
"\n",
"lleg_rect = Rectangle((.5, .3), .3, .05, angle=225)\n",
"fig.add_artist(lleg_rect)\n",
"\n",
"rleg_rect = Rectangle((.47, .26), .3, .05, angle=-45)\n",
"fig.add_artist(rleg_rect)\n",
"\n",
"show()"
]
},
{
"cell_type": "markdown",
"id": "38a60299",
"metadata": {},
"source": [
"**Useful Things to draw for Data Viz**"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "82d8ea89",
"metadata": {},
"outputs": [],
"source": [
"from numpy import linspace, pi, sin, cos\n",
"from matplotlib.pyplot import figure, show, plot, rc\n",
"\n",
"rc('font', size=16)\n",
"\n",
"xs = linspace(0, 2 * pi)\n",
"\n",
"fig = figure()\n",
"ax = fig.add_axes([.3, .3, .5, .5])\n",
"\n",
"# ax.plot(xs, sin(xs))\n",
"# ax.plot(xs, cos(xs))\n",
"\n",
"# ax.set_ylabel('this is my y label')\n",
"# fig.supylabel('this my figure y label')\n",
"\n",
"show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "77e502c3",
"metadata": {},
"outputs": [],
"source": [
"from IPython.display import display\n",
"\n",
"from numpy import linspace, pi, sin, cos\n",
"from matplotlib.pyplot import subplots, show\n",
"\n",
"# fig, ax = subplots()\n",
"\n",
"# fig, axes = subplots(nrows=1, ncols=2)\n",
"fig, axes = subplots(nrows=2, ncols=2)\n",
"# print(axes[:, 0])\n",
"\n",
"xs = linspace(0, 2 * pi)\n",
"\n",
"# display(axes)\n",
"# display(type(axes))\n",
"\n",
"axes[1, 0].plot(xs, sin(xs))\n",
"\n",
"# display(axes, type(axes), axes[0])\n",
"show()"
]
},
{
"cell_type": "markdown",
"id": "ae0d8aa0",
"metadata": {},
"source": [
"Matplotlib is object oriented\n",
"- Containers → Artists\n",
"- Figure → Axes →\n",
"  - X/YAxis\n",
"    - ticks\n",
"    - ticklabels\n",
"    - axis label\n",
"  - Primitives\n",
"    - Patches (Circle, Rectangle)\n",
"    - Line2D\n",
"    - Annotations/Text\n",
"    - ...\n",
"  - Legend\n",
"    - Primitives\n",
"    - Text (label & title)\n",
"\n",
"- Coordinate Spaces: values → ... → screen\n",
"  - Proportional coordinate space (Figure & Axes)\n",
"  - Data coordinate space (Axes)\n",
"  - Identity/point space (Figure)\n",
"\n",
"**Applied to Star Trader Data - Tracking Ship Failures**"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "994c936e",
"metadata": {},
"outputs": [],
"source": [
"from pathlib import Path\n",
"\n",
"from matplotlib.pyplot import subplots, show\n",
"from pandas import read_csv, to_datetime\n",
"\n",
"df = (\n",
"    read_csv(\n",
"        Path('data') / 'failures.csv',\n",
"        index_col=['date', 'player','ship'],\n",
"        parse_dates=['date'],\n",
"    )\n",
"    .sort_index()\n",
")\n",
"\n",
"plot_data = (\n",
"    df.pivot_table(index='date', columns='player', values='faults', aggfunc='sum')\n",
"    .rolling('90D').mean()\n",
")\n",
"\n",
"ax = plot_data.plot(legend=False)\n",
"for line in ax.lines:\n",
"    x, y = line.get_data()\n",
"    ax.annotate(\n",
"        line.get_label(), xy=(x[-1], y[-1]),\n",
"        xytext=(5, 0), textcoords='offset points',\n",
"        color=line.get_color()\n",
"    )\n",
"\n",
"show()"
]
},
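{
"cell_type": "markdown",
"id": "7c1e5a02",
"metadata": {},
"source": [
"**Coordinate spaces in practice**\n",
"\n",
"The coordinate spaces listed above can be tried out directly. The cell below is a minimal sketch rather than part of the original workshop material. It assumes that the proportional spaces correspond to `fig.transFigure` and `ax.transAxes`, the data space to `ax.transData`, and the identity space to `IdentityTransform` (raw display units). The axis limits, labels, and text positions are arbitrary choices for illustration."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7c1e5a03",
"metadata": {},
"outputs": [],
"source": [
"from matplotlib.pyplot import subplots, show\n",
"from matplotlib.transforms import IdentityTransform\n",
"\n",
"fig, ax = subplots()\n",
"ax.set(xlim=(0, 10), ylim=(0, 1))  # arbitrary limits for illustration\n",
"\n",
"# Data space: positions follow the x/y limits of the Axes (the default for plot)\n",
"ax.plot([2, 8], [.2, .8], marker='o')\n",
"\n",
"# Proportional spaces: (.5, .5) is the middle of the Axes or of the Figure\n",
"ax.text(.5, .9, 'Axes fraction', transform=ax.transAxes, ha='center')\n",
"fig.text(.5, .01, 'Figure fraction', transform=fig.transFigure, ha='center')\n",
"\n",
"# Identity space (assumed here to mean raw display units from the lower-left corner)\n",
"fig.text(10, 10, 'display units', transform=IdentityTransform())\n",
"\n",
"# Each transform maps its own space to display units; with these limits the\n",
"# data point (5, .5) and the Axes fraction (.5, .5) describe the same spot\n",
"print(ax.transData.transform((5, .5)))\n",
"print(ax.transAxes.transform((.5, .5)))\n",
"\n",
"show()"
]
},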
{
"cell_type": "markdown",
"id": "e898e431",
"metadata": {},
"source": [
"### Interactive Data Visualizations (explore, fun)\n",
"\n",
"- System/function Exploration\n",
"- Data Exploration\n",
"- System Observability\n",
"\n",
"*Limited Communicative Ability unless STRONGLY guided*\n",
"\n",
"**bokeh & panel** - a powerful way to share your data on the web!"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dc30971b",
"metadata": {},
"outputs": [],
"source": [
"from panel import Column, extension\n",
"from bokeh.plotting import figure\n",
"from bokeh.models import ColumnDataSource\n",
"\n",
"from numpy import linspace, zeros\n",
"from scipy.stats import skewnorm\n",
"\n",
"# Connect `panel` application to notebook runtime\n",
"extension()\n",
"\n",
"loc = 5\n",
"scale = 1\n",
"skew = 0\n",
"\n",
"cds = ColumnDataSource({\n",
" 'x': linspace(-10, 10, 500),\n",
" 'y1': zeros(shape=500),\n",
"})\n",
"cds.data['y2'] = skewnorm(loc=loc, scale=scale, a=skew).pdf(cds.data['x'])\n",
"\n",
"def update_plot(loc, scale, skew):\n",
" cds.data['y2'] = skewnorm.pdf(x=cds.data['x'], a=skew, loc=loc, scale=scale)\n",
"\n",
"p = figure(y_range=(0, .5), width=500, height=300)\n",
"p.varea(x='x', y1='y1', y2='y2', source=cds, alpha=.3)\n",
"p.line(x='x', y='y2', source=cds, line_width=4)\n",
"p.yaxis.major_label_text_font_size = \"20pt\"\n",
"p.xaxis.major_label_text_font_size = \"20pt\"\n",
"\n",
"Column(p).servable()"
]
},
{
"cell_type": "markdown",
"id": "b2a6b3d8",
"metadata": {},
"source": [
"Adding interactivity"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e69af49d",
"metadata": {},
"outputs": [],
"source": [
"from panel import Column, bind, extension\n",
"from panel.widgets import FloatSlider\n",
"from bokeh.plotting import figure\n",
"from bokeh.models import ColumnDataSource\n",
"\n",
"from numpy import linspace, zeros\n",
"from scipy.stats import skewnorm\n",
"\n",
"# Increase font size of widgets\n",
"css = '''\n",
".bk-root .bk, .bk-root .bk:before, .bk-root .bk:after {\n",
" font-size: 110%;\n",
" }\n",
"'''\n",
"extension(raw_css=[css])\n",
"\n",
"loc = FloatSlider(name='mean', value=0, start=-10, end=10)\n",
"scale = FloatSlider(name='std. dev', value=1, start=.1, end=10)\n",
"skew = FloatSlider(name='skew', value=0, start=-6, end=6)\n",
"\n",
"cds = ColumnDataSource({\n",
" 'x': linspace(-10, 10, 500),\n",
" 'y1': zeros(shape=500),\n",
" 'y2': zeros(shape=500),\n",
"})\n",
"\n",
"def update_plot(loc, scale, skew):\n",
" cds.data['y2'] = skewnorm.pdf(x=cds.data['x'], a=skew, loc=loc, scale=scale)\n",
"\n",
"p = figure(y_range=(0, .5), width=500, height=300)\n",
"p.varea(x='x', y1='y1', y2='y2', source=cds, alpha=.3)\n",
"p.line(x='x', y='y2', source=cds, line_width=4)\n",
"p.yaxis.major_label_text_font_size = \"20pt\"\n",
"p.xaxis.major_label_text_font_size = \"20pt\"\n",
"\n",
"Column(\n",
" Column(loc, scale, skew),\n",
" p,\n",
" bind(update_plot, skew=skew, loc=loc, scale=scale)\n",
").servable()"
]
},
{
"cell_type": "markdown",
"id": "8612871a",
"metadata": {},
"source": [
"**Applied to our Star Trader Data - Planetary Weather**"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ad9e3dd7",
"metadata": {},
"outputs": [],
"source": [
"from pandas import read_csv\n",
"\n",
"df = (\n",
" read_csv(\n",
" 'data/weather_york.csv',\n",
" usecols=['date', 'temperature_max', 'temperature_min'],\n",
" parse_dates=['date'],\n",
" index_col='date'\n",
" )\n",
").loc['1990':'2000']\n",
"\n",
"# Long timeseries Zoom\n",
"from bokeh.plotting import figure, ColumnDataSource\n",
"from panel import Column\n",
"from pandas import to_datetime, DateOffset\n",
"\n",
"cds = ColumnDataSource(df)\n",
"p = figure(\n",
" width=1000, height=250, x_axis_type='datetime', y_range=[0, 110],\n",
" x_range=[df.index.min(), df.index.min() + DateOffset(years=1, days=-1)],\n",
")\n",
"p.vbar(x='date', bottom='temperature_min', top='temperature_max', source=cds, width=(24 * 60 * 60 * 1000))\n",
"\n",
"range_p = figure(\n",
" width=1000, height=p.height // 4, x_axis_type='datetime', y_range=[0, 110],\n",
" x_range=[df.index.min(), df.index.max()],\n",
")\n",
"range_p.vbar(x='date', bottom='temperature_min', top='temperature_max', source=cds, width=(24 * 60 * 60 * 1000))\n",
"\n",
"from bokeh.models import RangeTool\n",
"\n",
"rangetool = RangeTool(x_range=p.x_range)\n",
"range_p.add_tools(rangetool)\n",
"\n",
"Column(p, range_p).servable()"
]
},
{
"cell_type": "markdown",
"id": "a4cf5939",
"metadata": {},
"source": [
"### Animated Data Visualizations (communicative, fun)\n",
"\n",
"**ipyvizzu**"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "29df05f7",
"metadata": {},
"outputs": [],
"source": [
"from panel.pane import Markdown, HTML\n",
"from pandas import read_csv, MultiIndex\n",
"from ipyvizzu import Chart, Data, Config, Style, DisplayTarget\n",
"\n",
"countries = (\n",
"    read_csv('data/dictionary.csv')\n",
"    .set_index('Code')['Country']\n",
")\n",
"countries['URS'] = 'Soviet Union'\n",
"\n",
"medals_count = (\n",
"    read_csv('data/summer.csv')\n",
"    .drop_duplicates(['Year', 'Country', 'Event', 'Medal'])\n",
"    .groupby(['Year', 'Country']).size()\n",
"    .rename('Total Medals')\n",
")\n",
"\n",
"medals_count = (\n",
"    medals_count.reindex(\n",
"        MultiIndex.from_product(\n",
"            medals_count.index.levels),\n",
"        fill_value=0\n",
"    )\n",
"    .reset_index()\n",
"    .astype({'Year': str})\n",
"    .assign(Country=lambda d: d['Country'].map(countries))\n",
")\n",
"\n",
"medals_count['Cumulative Medals'] = medals_count.groupby(['Country'])['Total Medals'].cumsum()\n",
"\n",
"countries_min = (\n",
"    medals_count.groupby('Country')\n",
"    ['Total Medals'].sum()\n",
"    .gt(80)\n",
"    .loc[lambda s: s].index\n",
")\n",
"\n",
"data = Data()\n",
"data.add_data_frame(medals_count)\n",
"\n",
"config = {\n",
"    'y': 'Country',\n",
"    'x': 'Total Medals',\n",
"    'sort': 'byValue'\n",
"}\n",
"\n",
"style = Style(\n",
"    {'plot': {'paddingTop': 40, 'paddingLeft': 150}}\n",
")\n",
"\n",
"chart = Chart(\n",
"    width=\"800px\", height=\"600px\",\n",
"    display=DisplayTarget.MANUAL\n",
")\n",
"chart.on('logo-draw', 'event.preventDefault();')\n",
"chart.animate(\n",
"    data,\n",
"    style,\n",
"    Config(config | {'title': 'United States Leads Summer Olympic Medals'}),\n",
")\n",
"\n",
"filt = '||'.join(\n",
"    f\"record.Country == '{c}'\"\n",
"    for c in countries_min\n",
")\n",
"chart.animate(\n",
"    Config({\n",
"        'title': 'Countries Winning > 80 Summer Olympic Medals',\n",
"    }),\n",
"    Data.filter(filt),\n",
"    delay=2,\n",
"    duration=4,\n",
")\n",
"\n",
"for i, (year, group) in enumerate(medals_count.groupby('Year')):\n",
"    title = 'Summer Olympic Medals 1896'\n",
"    if year != '1896':\n",
"        title += f' - {year}'\n",
"    chart.animate(\n",
"        Data.filter(\n",
"            f'record.Year == {year} && ({filt})'\n",
"        ),\n",
"        Config(\n",
"            config |\n",
"            {'title': title, 'x': 'Cumulative Medals'}\n",
"        ),\n",
"        delay=4 if i == 0 else 0,\n",
"        duration=1,\n",
"        x={\"easing\": \"linear\", \"delay\": 0},\n",
"        y={\"delay\": 0},\n",
"        show={\"delay\": 0},\n",
"        hide={\"delay\": 0},\n",
"        title={\"duration\": 0, \"delay\": 0},\n",
"    )\n",
"\n",
"# # Zoom Out\n",
"chart.animate(\n",
" Data.filter(None),\n",
" Config({\n",
" 'title': 'Summer Olympic Medals up to 2012',\n",
" 'x': 'Total Medals',\n",
" }),\n",
" duration=3\n",
")\n",
"\n",
"chart.animate(\n",
" Data.filter('''\n",
" record.Country == 'United States'\n",
" || record.Country == 'United Kingdom'\n",
" || record.Country == 'France'\n",
" || record.Country == 'Italy'\n",
" '''),\n",
" Config({'title': 'Select Countries'}),\n",
")\n",
"\n",
"HTML(chart).servable()"
]
},
{
"cell_type": "markdown",
"id": "5be53380",
"metadata": {},
"source": [
"**Applied to Star Trader Data - **"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "43cfbc85",
"metadata": {},
"outputs": [],
"source": [
"from panel.pane import Markdown, HTML\n",
"from pandas import read_csv\n",
"from ipyvizzu import Chart, Data, Config, Style, DisplayTarget\n",
"\n",
"df = (\n",
" read_csv(\n",
" 'data/weather_york.csv',\n",
" usecols=['date', 'temperature_max', 'temperature_min'],\n",
" parse_dates=['date'],\n",
" index_col='date'\n",
" )\n",
" .assign(\n",
" year=lambda d: d.index.year,\n",
" doy=lambda d: d.index.dayofyear.astype(str),\n",
" )\n",
" .sort_index()\n",
").loc['1990':'2000']\n",
"\n",
"# Animate the annual weather curve\n",
"\n",
"HTML(chart).servable()"
]
},
{
"cell_type": "markdown",
"id": "06f02b7c",
"metadata": {},
"source": [
"**Animation as Automated Interactivity**\n",
"- Drill Down : provides context\n",
"- Transformation: insight to dynamic metrics\n",
"- Movement : pre-attentive cues to draw eyes\n",
"\n",
"## Your Turn…\n",
"- Take any of the tools we have discussed today and make one visualization: static, interactive (web), or animated.\n",
"- Before you start, think about what you want to create: something exploratory, or something communicative?\n",
"\n",
"**Suggested Starting Points**\n",
"1. *static* Using all `data/weather_*.csv` datasets\n",
"\t- Visualize the average yearly `temperature_max` and `temperature_min` for a single planet (york, sol, kirk, …).\n",
"\t- Visualize the average yearly `temperature_max` and `temperature_min` for ALL planets (york, sol, kirk, …).\n",
"\t\t- Prioritize the comparison of the planets within a given year.\n",
"\t\t- Think: should the values be overlaid onto a single chart? Or spread across multiple?"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9abb7baf",
"metadata": {},
"outputs": [],
"source": [
"from pathlib import Path\n",
"from pandas import read_csv\n",
"\n",
"# data_path = Path('data')\n",
"# for p in data_path.glob('weather*'):\n",
"#     print(p)\n",
"\n",
"df = (\n",
"    read_csv(\n",
"        'data/weather_york.csv',\n",
"        usecols=['date', 'temperature_max', 'temperature_min'],\n",
"        parse_dates=['date'],\n",
"        index_col='date'\n",
"    )\n",
").loc['2000']\n",
"\n",
"plot_df = (\n",
"    df\n",
"    # df.resample('Y').mean()\n",
")\n",
"\n",
"from matplotlib.pyplot import subplots, show\n",
"\n",
"fig, ax = subplots()\n",
"# ax.plot(plot_df.index.year, plot_df['temperature_max'])\n",
"# ax.plot(plot_df.index.year, plot_df['temperature_min'])\n",
"ax.bar(\n",
"    plot_df.index,\n",
"    bottom=plot_df['temperature_min'],\n",
"    height=plot_df['temperature_max'] - plot_df['temperature_min'],\n",
"    width=1\n",
")\n",
"\n",
"# Titles name York because the data above is read from weather_york.csv\n",
"# ax.set_title('Yearly Average in York', size='xx-large', loc='left')\n",
"ax.set_title('Temperature in York, 2000')\n",
"ax.set_ylabel('Temperature (°F)', size='large')\n",
"ax.spines[['top', 'right']].set_visible(False)\n",
"\n",
"show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "252a7dba",
"metadata": {},
"outputs": [],
"source": [
"from pathlib import Path\n",
"from pandas import read_csv\n",
"from matplotlib.pyplot import subplots, show\n",
"\n",
"data_path = Path('data')\n",
"dfs = {}\n",
"for p in data_path.glob('weather*'):\n",
"    dfs[p.stem] = (\n",
"        read_csv(\n",
"            p,  # read each planet's own file instead of always loading York\n",
"            usecols=['date', 'temperature_max', 'temperature_min'],\n",
"            parse_dates=['date'],\n",
"            index_col='date'\n",
"        )\n",
"    ).loc['2000']\n",
"\n",
"fig, axes = subplots(2, 2, sharex=True, sharey=True)\n",
"\n",
"for (fname, df), ax in zip(dfs.items(), axes.flat):\n",
"    planet_name = fname.split('_')[1]\n",
"    ax.bar(\n",
"        df.index,\n",
"        bottom=df['temperature_min'],\n",
"        height=df['temperature_max'] - df['temperature_min'],\n",
"        width=1\n",
"    )\n",
"    ax.set_title(planet_name.title(), loc='left')\n",
"\n",
"show()"
]
},
{
"cell_type": "markdown",
"id": "f13007c4",
"metadata": {},
"source": [
"2. *interactive* Using the `data/failures.csv` dataset,\n",
"\t- Plot the total failures for each 'player' for each day.\n",
"\t\t- Apply a smoothing factor (`rolling average`) of 90 days prior to plotting the data.\n",
"\t- Create a slider widget that controls the number of days involved in the smoothing.\n",
"\t\t- e.g. this slider should allow me to apply 0 days of smoothing all the way up to 90 days of smoothing"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5d8bdc12",
"metadata": {},
"outputs": [],
"source": [
"# …"
]
},
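{
"cell_type": "markdown",
"id": "5d8bdc13",
"metadata": {},
"source": [
"For the interactive starting point, the smoothing step can be worked out separately from the widget code. The cell below is one possible sketch under stated assumptions: `daily` is synthetic stand-in data (not `data/failures.csv`), `smooth` is a hypothetical helper, and a slider value below one day is treated as no smoothing because a zero-length window has nothing to average. A function of this shape could then be connected to a slider with `bind`, as in the skewnorm example earlier."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5d8bdc14",
"metadata": {},
"outputs": [],
"source": [
"from numpy.random import default_rng\n",
"from pandas import DataFrame, date_range\n",
"\n",
"# Synthetic stand-in for a dates × players table (NOT data/failures.csv)\n",
"rng = default_rng(0)\n",
"daily = DataFrame(\n",
"    rng.poisson(3, size=(365, 2)),\n",
"    index=date_range('2020-01-01', periods=365, freq='D', name='date'),\n",
"    columns=['player_a', 'player_b'],\n",
")\n",
"\n",
"def smooth(frame, days):\n",
"    # Hypothetical helper: anything below one day is treated as no smoothing\n",
"    if days < 1:\n",
"        return frame.astype(float)\n",
"    return frame.rolling(f'{int(days)}D').mean()\n",
"\n",
"for days in (0, 1, 30, 90):\n",
"    # Longer windows average over more days, so the spread should shrink\n",
"    print(days, smooth(daily, days).std().round(2).to_dict())"
]
},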
{
"cell_type": "markdown",
"id": "82d72557",
"metadata": {},
"source": [
"3. *animated* Using the `data/weather_*.csv` datasets\n",
"\t- Plot ALL of the 'temperature_max' data for EACH of the planets.\n",
"\t- Animate: drill-down to the planet with the HIGHEST average temperature (across all datapoints)\n",
"\t- Animate: reintroduce the other planets’ 'temperature_max' to the chart while maintaining the year 2010 zoom."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "521e65b3",
"metadata": {},
"outputs": [],
"source": [
"# …"
]
},
{
"cell_type": "markdown",
"id": "6696cac8",
"metadata": {},
"source": [
"**Can’t Get Started With The Above?** - try recreating examples from the documentation for these tools\n",
"(see **Useful Links** below), or follow along with a tutorial for a tool of your choice!\n",
"\n",
"## Useful Links\n",
"\n",
"**Matplotlib**\n",
"- Tutorial: https://matplotlib.org/stable/tutorials/index.html\n",
"- Cheatsheets: https://matplotlib.org/cheatsheets/\n",
"- Examples: https://matplotlib.org/stable/gallery/index.html\n",
"\n",
"**Plotnine**\n",
"- Tutorial: http://r-statistics.co/Complete-Ggplot2-Tutorial-Part1-With-R-Code.html (note that plotnine does not have official tutorials, so please refer to ggplot2)\n",
"- Examples: https://plotnine.readthedocs.io/en/stable/gallery.html#\n",
"\n",
"**Bokeh**\n",
"- Tutorial: https://docs.bokeh.org/en/latest/docs/first_steps.html#first-steps\n",
"- Examples: https://docs.bokeh.org/en/latest/docs/gallery.html#gallery\n",
"\n",
"**ipyvizzu**\n",
"- Tutorial: https://ipyvizzu.vizzuhq.com/latest/tutorial/\n",
"- Examples: https://ipyvizzu.vizzuhq.com/latest/examples/analytical_operations/\n",
"\n"
]
}
],
"metadata": {
"jupytext": {
"cell_metadata_filter": "-all",
"main_language": "python",
"notebook_metadata_filter": "-all"
}
},
"nbformat": 4,
"nbformat_minor": 5
}