camriddell · August 2, 2024 16:05
diff --git a/notes.2023-07-12.ipynb b/notes.2023-07-12.ipynb
 {
 "cells": [
  {
   "cell_type": "markdown",
   "id": "a3d481af",
   "metadata": {},
   "source": [
    "# How do I even get started with Data Visualization?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a94b3c78",
   "metadata": {},
   "outputs": [],
   "source": [
    "from IPython.display import display\n",
    "\n",
    "display(\"Let's Get Started\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "547ba3f5",
   "metadata": {},
   "source": [
    "## Load Data Files\n",
    "\n",
    "*This is optional if you are **not** using Google Colab*\n",
    "\n",
    "Execute these cells to access the `data` for this workshop and install some additional packages."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c02aee51",
   "metadata": {},
   "outputs": [],
   "source": [
    "!git clone https://github.com/dutc-io/agu-data-viz.git"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b34f338e",
   "metadata": {},
   "outputs": [],
   "source": [
    "%cd agu-data-viz"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "26274060",
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install ipyvizzu==0.15.0"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e3f7b766",
   "metadata": {},
   "outputs": [],
   "source": [
    "! [[ -e data/anscombe.csv ]] && echo \"Data files loaded!\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "afed0baf",
   "metadata": {},
   "source": [
    "## Types of Visualizations\n",
    "- Exploratory\n",
    "- Communicative\n",
    "- Fun!\n",
    "\n",
    "## Types of Data Visualization Tools\n",
    "- Non-programmatic\n",
    "    - low : hand/computer drawing\n",
    "    - high: GUI chart builders* (excel, tableau, ...)\n",
    "- Programmatic\n",
    "    - low : programmatic drawing\n",
    "    - high: declarative & convenience"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "239cb123",
   "metadata": {},
   "outputs": [],
   "source": [
    "from IPython.display import display\n",
    "from pandas import read_csv\n",
    "\n",
    "anscombe = read_csv('data/anscombe.csv')\n",
    "\n",
    "display(\n",
    "    # anscombe.head(),\n",
    "    anscombe.groupby('id')[['x', 'y']].agg(['mean', 'std']),\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "36660b4e",
   "metadata": {},
   "source": [
    "## Common Data Visualization APIs\n",
    "\n",
    "### Drawing (non-declarative)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7627320f",
   "metadata": {},
   "outputs": [],
   "source": [
    "from pandas import read_csv\n",
    "from matplotlib.pyplot import subplots, show\n",
    "\n",
    "anscombe = read_csv('data/anscombe.csv')\n",
    "\n",
    "fig, axes = subplots(nrows=2, ncols=2)\n",
    "\n",
    "for (label, group), ax in zip(anscombe.groupby('id'), axes.flat):\n",
    "    ax.scatter(group['x'], group['y'], s=24)\n",
    "    ax.set_title(label)\n",
    "\n",
    "show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a7ecdc22",
   "metadata": {},
   "source": [
    "### High Level - Declarative"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bec53cfb",
   "metadata": {},
   "outputs": [],
   "source": [
    "from matplotlib.pyplot import show\n",
    "from plotnine import ggplot, facet_wrap, geom_point, geom_smooth, labs, theme_minimal, aes\n",
    "from pandas import read_csv\n",
    "\n",
    "anscombe = read_csv('data/anscombe.csv')\n",
    "\n",
    "(\n",
    "    ggplot(anscombe, aes(x='x', y='y'))\n",
    "    + facet_wrap('id', ncol=2)\n",
    "    + geom_point()\n",
    "    + geom_smooth(method='ols')\n",
    "    + labs(x='x variable', y='y variable', title='Anscombe’s Quartet')\n",
    "    + theme_minimal()\n",
    ").draw()\n",
    "\n",
    "show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c4d1a578",
   "metadata": {},
   "source": [
    "### High Level - Convenience"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3a67b81e",
   "metadata": {},
   "outputs": [],
   "source": [
    "from matplotlib.pyplot import show\n",
    "from seaborn import lmplot\n",
    "from pandas import read_csv\n",
    "\n",
    "anscombe = read_csv('data/anscombe.csv')\n",
    "\n",
    "lmplot(anscombe, x='x', y='y', col='id', col_wrap=2)\n",
    "\n",
    "show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0d7ee168",
   "metadata": {},
   "outputs": [],
   "source": [
    "from seaborn.objects import Plot, Dots, PolyFit, Line\n",
    "from pandas import read_csv\n",
    "\n",
    "anscombe = read_csv('data/anscombe.csv')\n",
    "\n",
    "\n",
    "(\n",
    "    Plot(anscombe, x='x', y='y')\n",
    "    .facet(col='id', wrap=2)\n",
    "    .add(Dots(color='black'))\n",
    "    .add(Line(), PolyFit(1))\n",
    ").show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3012bbe4",
   "metadata": {},
   "source": [
    "### Is it worth it to learn multiple data visualization libraries/languages?\n",
    "\n",
    "## Let’s Make Some Viz!\n",
    "\n",
    "### Static Data Visualizations (explore, communicative, fun)\n",
    "\n",
    "**matplotlib**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "15be4264",
   "metadata": {},
   "outputs": [],
   "source": [
    "from matplotlib.pyplot import figure, show\n",
    "from matplotlib.patches import Circle, Rectangle\n",
    "\n",
    "fig = figure(figsize=(6,6))\n",
    "\n",
    "c = Circle((.5, .8), .1)\n",
    "fig.add_artist(c)\n",
    "\n",
    "## uncomment below if running from Jupyter Notebook/Google Colab\n",
    "# ax = fig.add_axes([0, 0, 0, 0])\n",
    "# ax.set_visible(False)\n",
    "\n",
    "body_rect = Rectangle((.47, .75), .06, -.5)\n",
    "fig.add_artist(body_rect)\n",
    "\n",
    "arms_rect = Rectangle((.3, .55), .4, .05)\n",
    "fig.add_artist(arms_rect)\n",
    "\n",
    "lleg_rect = Rectangle((.5, .3), .3, .05, angle=225)\n",
    "fig.add_artist(lleg_rect)\n",
    "\n",
    "rleg_rect = Rectangle((.47, .26), .3, .05, angle=-45)\n",
    "fig.add_artist(rleg_rect)\n",
    "\n",
    "show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "38a60299",
   "metadata": {},
   "source": [
    "**Useful Things to draw for Data Viz**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "82d8ea89",
   "metadata": {},
   "outputs": [],
   "source": [
    "from numpy import linspace, pi, sin, cos\n",
    "from matplotlib.pyplot import figure, show, rc\n",
    "\n",
    "rc('font', size=16)\n",
    "\n",
    "xs = linspace(0, 2 * pi)\n",
    "\n",
    "fig = figure()\n",
    "ax = fig.add_axes([.3, .3, .5, .5])\n",
    "\n",
    "# ax.plot(xs, sin(xs))\n",
    "# ax.plot(xs, cos(xs))\n",
    "\n",
    "# ax.set_ylabel('this is my y label')\n",
    "# fig.supylabel('this is my figure y label')\n",
    "\n",
    "show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "77e502c3",
   "metadata": {},
   "outputs": [],
   "source": [
    "from IPython.display import display\n",
    "\n",
    "from numpy import linspace, pi, sin, cos\n",
    "from matplotlib.pyplot import subplots, show\n",
    "\n",
    "# fig, ax = subplots()\n",
    "\n",
    "# fig, axes = subplots(nrows=1, ncols=2)\n",
    "fig, axes = subplots(nrows=2, ncols=2)\n",
    "# print(axes[:, 0])\n",
    "\n",
    "xs = linspace(0, 2 * pi)\n",
    "\n",
    "# display(axes)\n",
    "# display(type(axes))\n",
    "\n",
    "axes[1, 0].plot(xs, sin(xs))\n",
    "\n",
    "# display(axes, type(axes), axes[0])\n",
    "show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ae0d8aa0",
   "metadata": {},
   "source": [
    "Matplotlib is object-oriented\n",
    "- Containers → Artists\n",
    "- Figure → Axes →\n",
    "    - X/YAxis\n",
    "        - ticks\n",
    "        - ticklabels\n",
    "        - axis label\n",
    "    - Primitives\n",
    "        - Patches (Circle, Rectangle)\n",
    "        - Line2D\n",
    "        - Annotations/Text\n",
    "        - ...\n",
    "    - Legend\n",
    "        - Primitives\n",
    "        - Text (label & title)\n",
    "\n",
    "- Coordinate Spaces: values → ... → screen\n",
    "    - Proportional coordinate space (Figure & Axes)\n",
    "    - Data coordinate space         (Axes)\n",
    "    - Identity/point space          (Figure)\n",
    "\n",
    "**Applied to Star Trader Data - Tracking Ship Failures**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "994c936e",
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "\n",
    "from matplotlib.pyplot import subplots, show\n",
    "from pandas import read_csv, to_datetime\n",
    "\n",
    "df = (\n",
    "    read_csv(\n",
    "        Path('data') / 'failures.csv',\n",
    "        index_col=['date', 'player','ship'],\n",
    "        parse_dates=['date'],\n",
    "    )\n",
    "    .sort_index()\n",
    ")\n",
    "\n",
    "plot_data = (\n",
    "    df.pivot_table(index='date', columns='player', values='faults', aggfunc='sum')\n",
    "    .rolling('90D').mean()\n",
    ")\n",
    "\n",
    "ax = plot_data.plot(legend=False)\n",
    "for line in ax.lines:\n",
    "    x, y = line.get_data()\n",
    "    ax.annotate(\n",
    "        line.get_label(), xy=(x[-1], y[-1]),\n",
    "        xytext=(5, 0), textcoords='offset points',\n",
    "        color=line.get_color()\n",
    "    )\n",
    "\n",
    "show()"
   ]
  },
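  {
   "cell_type": "markdown",
   "id": "5f0c7e21",
   "metadata": {},
   "source": [
    "The coordinate spaces listed above can be sketched roughly as follows. This is a minimal illustration, not part of the workshop data: the plotted line, the axis limits, and the label text are invented for the example, and only the `xycoords`, `textcoords`, and `transform` arguments matter."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9d3b6a44",
   "metadata": {},
   "outputs": [],
   "source": [
    "from matplotlib.pyplot import subplots, show\n",
    "\n",
    "fig, ax = subplots()\n",
    "ax.plot([0, 10], [0, 100])\n",
    "# Fix the limits so the data -> screen mapping is known up front\n",
    "ax.set_xlim(0, 10)\n",
    "ax.set_ylim(0, 100)\n",
    "\n",
    "# Data coordinate space (Axes): placed at x=5, y=50 in data units\n",
    "ax.annotate('data (5, 50)', xy=(5, 50), xycoords='data')\n",
    "\n",
    "# Proportional coordinate space (Axes): 0-1 spans the Axes\n",
    "ax.text(.05, .9, 'axes (.05, .9)', transform=ax.transAxes)\n",
    "\n",
    "# Proportional coordinate space (Figure): 0-1 spans the whole Figure\n",
    "fig.text(.5, .02, 'figure (.5, .02)', transform=fig.transFigure, ha='center')\n",
    "\n",
    "# Point space: the text is offset 40pt left and 20pt down from the anchor\n",
    "ax.annotate(\n",
    "    'offset points', xy=(10, 100),\n",
    "    xytext=(-40, -20), textcoords='offset points', ha='right',\n",
    ")\n",
    "\n",
    "# Transforms chain the spaces: data -> screen pixels -> Axes fraction\n",
    "pixels = ax.transData.transform((5, 50))\n",
    "axes_fraction = ax.transAxes.inverted().transform(pixels)\n",
    "print(axes_fraction)  # the data midpoint lands at the centre of the Axes\n",
    "\n",
    "show()"
   ]
  },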
  {
   "cell_type": "markdown",
   "id": "e898e431",
   "metadata": {},
   "source": [
    "### Interactive Data Visualizations (explore, fun)\n",
    "\n",
    "- System/function Exploration\n",
    "- Data Exploration\n",
    "- System Observability\n",
    "\n",
    "*Limited Communicative Ability unless STRONGLY guided*\n",
    "\n",
    "**bokeh & panel** - a powerful way to share your data on the web!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "dc30971b",
   "metadata": {},
   "outputs": [],
   "source": [
    "from panel import Column, extension\n",
    "from bokeh.plotting import figure\n",
    "from bokeh.models import ColumnDataSource\n",
    "\n",
    "from numpy import linspace, zeros\n",
    "from scipy.stats import skewnorm\n",
    "\n",
    "# Connect `panel` application to notebook runtime\n",
    "extension()\n",
    "\n",
    "loc   = 5\n",
    "scale = 1\n",
    "skew  = 0\n",
    "\n",
    "cds = ColumnDataSource({\n",
    "    'x': linspace(-10, 10, 500),\n",
    "    'y1': zeros(shape=500),\n",
    "})\n",
    "cds.data['y2'] = skewnorm(loc=loc, scale=scale, a=skew).pdf(cds.data['x'])\n",
    "\n",
    "def update_plot(loc, scale, skew):\n",
    "    cds.data['y2'] = skewnorm.pdf(x=cds.data['x'], a=skew, loc=loc, scale=scale)\n",
    "\n",
    "p = figure(y_range=(0, .5), width=500, height=300)\n",
    "p.varea(x='x', y1='y1', y2='y2', source=cds, alpha=.3)\n",
    "p.line(x='x', y='y2', source=cds, line_width=4)\n",
    "p.yaxis.major_label_text_font_size = \"20pt\"\n",
    "p.xaxis.major_label_text_font_size = \"20pt\"\n",
    "\n",
    "Column(p).servable()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b2a6b3d8",
   "metadata": {},
   "source": [
    "Adding interactivity"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e69af49d",
   "metadata": {},
   "outputs": [],
   "source": [
    "from panel import Column, bind, extension\n",
    "from panel.widgets import FloatSlider\n",
    "from bokeh.plotting import figure\n",
    "from bokeh.models import ColumnDataSource\n",
    "\n",
    "from numpy import linspace, zeros\n",
    "from scipy.stats import skewnorm\n",
    "\n",
    "# Increase font size of widgets\n",
    "css = '''\n",
    ".bk-root .bk, .bk-root .bk:before, .bk-root .bk:after {\n",
    "  font-size: 110%;\n",
    "  }\n",
    "'''\n",
    "extension(raw_css=[css])\n",
    "\n",
    "loc   = FloatSlider(name='mean', value=0, start=-10, end=10)\n",
    "scale = FloatSlider(name='std. dev', value=1, start=.1, end=10)\n",
    "skew  = FloatSlider(name='skew', value=0, start=-6, end=6)\n",
    "\n",
    "cds = ColumnDataSource({\n",
    "    'x': linspace(-10, 10, 500),\n",
    "    'y1': zeros(shape=500),\n",
    "    'y2': zeros(shape=500),\n",
    "})\n",
    "\n",
    "def update_plot(loc, scale, skew):\n",
    "    cds.data['y2'] = skewnorm.pdf(x=cds.data['x'], a=skew, loc=loc, scale=scale)\n",
    "\n",
    "p = figure(y_range=(0, .5), width=500, height=300)\n",
    "p.varea(x='x', y1='y1', y2='y2', source=cds, alpha=.3)\n",
    "p.line(x='x', y='y2', source=cds, line_width=4)\n",
    "p.yaxis.major_label_text_font_size = \"20pt\"\n",
    "p.xaxis.major_label_text_font_size = \"20pt\"\n",
    "\n",
    "Column(\n",
    "    Column(loc, scale, skew),\n",
    "    p,\n",
    "    bind(update_plot, skew=skew, loc=loc, scale=scale)\n",
    ").servable()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8612871a",
   "metadata": {},
   "source": [
    "**Applied to our Star Trader Data - Planetary Weather**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ad9e3dd7",
   "metadata": {},
   "outputs": [],
   "source": [
    "from pandas import read_csv\n",
    "\n",
    "df = (\n",
    "    read_csv(\n",
    "        'data/weather_york.csv',\n",
    "        usecols=['date', 'temperature_max', 'temperature_min'],\n",
    "        parse_dates=['date'],\n",
    "        index_col='date'\n",
    "    )\n",
    ").loc['1990':'2000']\n",
    "\n",
    "# Long timeseries Zoom\n",
    "from bokeh.plotting import figure, ColumnDataSource\n",
    "from panel import Column\n",
    "from pandas import to_datetime, DateOffset\n",
    "\n",
    "cds = ColumnDataSource(df)\n",
    "p = figure(\n",
    "    width=1000, height=250, x_axis_type='datetime', y_range=[0, 110],\n",
    "    x_range=[df.index.min(), df.index.min() + DateOffset(years=1, days=-1)],\n",
    ")\n",
    "p.vbar(x='date', bottom='temperature_min', top='temperature_max', source=cds, width=(24 * 60 * 60 * 1000))\n",
    "\n",
    "range_p = figure(\n",
    "    width=1000, height=p.height // 4, x_axis_type='datetime', y_range=[0, 110],\n",
    "    x_range=[df.index.min(), df.index.max()],\n",
    ")\n",
    "range_p.vbar(x='date', bottom='temperature_min', top='temperature_max', source=cds, width=(24 * 60 * 60 * 1000))\n",
    "\n",
    "from bokeh.models import RangeTool\n",
    "\n",
    "rangetool = RangeTool(x_range=p.x_range)\n",
    "range_p.add_tools(rangetool)\n",
    "\n",
    "Column(p, range_p).servable()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a4cf5939",
   "metadata": {},
   "source": [
    "### Animated Data Visualizations (communicative, fun)\n",
    "\n",
    "**ipyvizzu**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "29df05f7",
   "metadata": {},
   "outputs": [],
   "source": [
    "from panel.pane import Markdown, HTML\n",
    "from pandas import read_csv, MultiIndex\n",
    "from ipyvizzu import Chart, Data, Config, Style, DisplayTarget\n",
    "\n",
    "countries = (\n",
    "    read_csv('data/dictionary.csv')\n",
    "    .set_index('Code')['Country']\n",
    ")\n",
    "countries['URS'] = 'Soviet Union'\n",
    "\n",
    "medals_count = (\n",
    "    read_csv('data/summer.csv')\n",
    "    .drop_duplicates(['Year', 'Country', 'Event', 'Medal'])\n",
    "    .groupby(['Year', 'Country']).size()\n",
    "    .rename('Total Medals')\n",
    ")\n",
    "\n",
    "medals_count = (\n",
    "    medals_count.reindex(\n",
    "        MultiIndex.from_product(\n",
    "            medals_count.index.levels),\n",
    "        fill_value=0\n",
    "    )\n",
    "    .reset_index()\n",
    "    .astype({'Year': str})\n",
    "    .assign(Country=lambda d: d['Country'].map(countries))\n",
    ")\n",
    "\n",
    "medals_count['Cumulative Medals'] = medals_count.groupby(['Country'])['Total Medals'].cumsum()\n",
    "\n",
    "countries_min = (\n",
    "    medals_count.groupby('Country')\n",
    "    ['Total Medals'].sum()\n",
    "    .gt(80)\n",
    "    .loc[lambda s: s].index\n",
    ")\n",
    "\n",
    "data = Data()\n",
    "data.add_data_frame(medals_count)\n",
    "\n",
    "config = {\n",
    "\t'y': 'Country',\n",
    "\t'x': 'Total Medals',\n",
    "    'sort': 'byValue'\n",
    "}\n",
    "\n",
    "style = Style(\n",
    "    {'plot': {'paddingTop': 40, 'paddingLeft': 150}}\n",
    ")\n",
    "\n",
    "chart = Chart(\n",
    "    width=\"800px\", height=\"600px\",\n",
    "    display=DisplayTarget.MANUAL\n",
    ")\n",
    "chart.on('logo-draw', 'event.preventDefault();')\n",
    "chart.animate(\n",
    "    data,\n",
    "    style,\n",
    "    Config(config | {'title': 'United States Leads Summer Olympic Medals'}),\n",
    ")\n",
    "\n",
    "filt = '||'.join(\n",
    "    f\"record.Country == '{c}'\"\n",
    "    for c in countries_min\n",
    ")\n",
    "chart.animate(\n",
    "    Config({\n",
    "        'title': 'Countries Winning > 80 Summer Olympic Medals',\n",
    "    }),\n",
    "    Data.filter(filt),\n",
    "    delay=2,\n",
    "\tduration=4,\n",
    ")\n",
    "\n",
    "for i, (year, group) in enumerate(medals_count.groupby('Year')):\n",
    "    title = 'Summer Olympic Medals 1896'\n",
    "    if year != '1896':\n",
    "        title += f' - {year}'\n",
    "    chart.animate(\n",
    "        Data.filter(\n",
    "            f'record.Year == {year} && ({filt})'\n",
    "        ),\n",
    "        Config(\n",
    "            config |\n",
    "            {'title': title, 'x': 'Cumulative Medals'}\n",
    "        ),\n",
    "\t\tdelay=4 if i == 0 else 0,\n",
    "        duration=1,\n",
    "        x={\"easing\": \"linear\", \"delay\": 0},\n",
    "        y={\"delay\": 0},\n",
    "        show={\"delay\": 0},\n",
    "        hide={\"delay\": 0},\n",
    "        title={\"duration\": 0, \"delay\": 0},\n",
    "    )\n",
    "\n",
    "# # Zoom Out\n",
    "chart.animate(\n",
    "    Data.filter(None),\n",
    "    Config({\n",
    "        'title': 'Summer Olympic Medals up to 2012',\n",
    "        'x': 'Total Medals',\n",
    "    }),\n",
    "    duration=3\n",
    ")\n",
    "\n",
    "chart.animate(\n",
    "    Data.filter('''\n",
    "        record.Country == 'United States'\n",
    "        || record.Country == 'United Kingdom'\n",
    "        || record.Country == 'France'\n",
    "        || record.Country == 'Italy'\n",
    "    '''),\n",
    "    Config({'title': 'Select Countries'}),\n",
    ")\n",
    "\n",
    "HTML(chart).servable()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5be53380",
   "metadata": {},
   "source": [
    "**Applied to Star Trader Data**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "43cfbc85",
   "metadata": {},
   "outputs": [],
   "source": [
    "from panel.pane import Markdown, HTML\n",
    "from pandas import read_csv\n",
    "from ipyvizzu import Chart, Data, Config, Style, DisplayTarget\n",
    "\n",
    "df = (\n",
    "    read_csv(\n",
    "        'data/weather_york.csv',\n",
    "        usecols=['date', 'temperature_max', 'temperature_min'],\n",
    "        parse_dates=['date'],\n",
    "        index_col='date'\n",
    "    )\n",
    "    .assign(\n",
    "        year=lambda d: d.index.year,\n",
    "        doy=lambda d: d.index.dayofyear.astype(str),\n",
    "    )\n",
    "    .sort_index()\n",
    ").loc['1990':'2000']\n",
    "\n",
    "# Animate the annual weather curve: build a new `chart` from `df` here.\n",
    "# Until one is built, `chart` below is whatever an earlier cell defined.\n",
    "\n",
    "HTML(chart).servable()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "06f02b7c",
   "metadata": {},
   "source": [
    "**Animation as Automated Interactivity**\n",
    "- Drill Down    : provides context\n",
    "- Transformation: insight into dynamic metrics\n",
    "- Movement      : pre-attentive cues to draw eyes\n",
    "\n",
    "## Your Turn…\n",
    "- Take any of the tools we have discussed today and make 1 static, 1 interactive (web), or 1 animated visualization.\n",
    "- Before you start, think about what you want to create: something exploratory or something communicative?\n",
    "\n",
    "**Suggested Starting Points**\n",
    "1. *static* Using all `data/weather_*.csv` datasets\n",
    "\t- Visualize the average yearly `temperature_max` and `temperature_min` for a single planet (york, sol, kirk, …).\n",
    "\t- Visualize the average yearly `temperature_max` and `temperature_min` for ALL planets (york, sol, kirk, …).\n",
    "\t\t- Prioritize the comparison of the planets within a given year.\n",
    "\t\t- Think: should the values be overlaid onto a single chart? Or spread across multiple?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9abb7baf",
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "from pandas import read_csv\n",
    "\n",
    "# data_path = Path('data')\n",
    "# for p in data_path.glob('weather*'):\n",
    "#     print(p)\n",
    "\n",
    "df = (\n",
    "    read_csv(\n",
    "        'data/weather_york.csv',\n",
    "        usecols=['date', 'temperature_max', 'temperature_min'],\n",
    "        parse_dates=['date'],\n",
    "        index_col='date'\n",
    "    )\n",
    ").loc['2000']\n",
    "\n",
    "plot_df = (\n",
    "    df\n",
    "    # df.resample('Y').mean()\n",
    ")\n",
    "\n",
    "from matplotlib.pyplot import subplots, show\n",
    "\n",
    "fig, ax = subplots()\n",
    "# ax.plot(plot_df.index.year, plot_df['temperature_max'])\n",
    "# ax.plot(plot_df.index.year, plot_df['temperature_min'])\n",
    "ax.bar(\n",
    "    plot_df.index,\n",
    "    bottom=plot_df['temperature_min'],\n",
    "    height=plot_df['temperature_max'] - plot_df['temperature_min'],\n",
    "    width=1\n",
    ")\n",
    "\n",
    "# ax.set_title('Yearly Average in York', size='xx-large', loc='left')\n",
    "ax.set_title('Temperature in York, 2000')\n",
    "ax.set_ylabel('Temperature (°F)', size='large')\n",
    "ax.spines[['top', 'right']].set_visible(False)\n",
    "\n",
    "show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "252a7dba",
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "from pandas import read_csv\n",
    "from matplotlib.pyplot import subplots, show\n",
    "\n",
    "data_path = Path('data')\n",
    "dfs = {}\n",
    "# Read each planet's weather file (not only York) into its own DataFrame\n",
    "for p in sorted(data_path.glob('weather_*.csv')):\n",
    "    dfs[p.stem] = (\n",
    "        read_csv(\n",
    "            p,\n",
    "            usecols=['date', 'temperature_max', 'temperature_min'],\n",
    "            parse_dates=['date'],\n",
    "            index_col='date'\n",
    "        )\n",
    "    ).loc['2000']\n",
    "\n",
    "fig, axes = subplots(2, 2, sharex=True, sharey=True)\n",
    "\n",
    "for (fname, df), ax in zip(dfs.items(), axes.flat):\n",
    "    planet_name = fname.split('_')[1]\n",
    "    ax.bar(\n",
    "        df.index,\n",
    "        bottom=df['temperature_min'],\n",
    "        height=df['temperature_max'] - df['temperature_min'],\n",
    "        width=1\n",
    "    )\n",
    "    ax.set_title(planet_name.title(), loc='left')\n",
    "\n",
    "show()"
   ]
  },
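  {
   "cell_type": "markdown",
   "id": "c81e2f07",
   "metadata": {},
   "source": [
    "For the overlaid-versus-spread question in the prompt above, the cell above spreads the planets across panels. A possible overlaid alternative is sketched below. It is only an illustration under assumptions: the temperatures are synthetic stand-ins generated in the cell (the real `data/weather_*.csv` values will differ), and the planet names are reused from the exercise text."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1b7d94ce",
   "metadata": {},
   "outputs": [],
   "source": [
    "from numpy import pi, sin\n",
    "from numpy.random import default_rng\n",
    "from pandas import DataFrame, date_range\n",
    "from matplotlib.pyplot import subplots, show\n",
    "\n",
    "rng = default_rng(0)\n",
    "dates = date_range('1990-01-01', '1999-12-31', freq='D', name='date')\n",
    "doy = dates.dayofyear.to_numpy()\n",
    "\n",
    "# Synthetic stand-ins for data/weather_*.csv (values are invented)\n",
    "planets = {}\n",
    "for i, name in enumerate(['york', 'sol', 'kirk']):\n",
    "    tmax = 60 + 5 * i + 20 * sin(2 * pi * doy / 365) + rng.normal(0, 3, size=len(dates))\n",
    "    planets[name] = DataFrame(\n",
    "        {'temperature_max': tmax, 'temperature_min': tmax - 15},\n",
    "        index=dates,\n",
    "    )\n",
    "\n",
    "fig, ax = subplots()\n",
    "for name, df in planets.items():\n",
    "    yearly = df.groupby(df.index.year).mean()\n",
    "    # One band per planet on shared axes keeps same-year values adjacent\n",
    "    ax.fill_between(\n",
    "        yearly.index, yearly['temperature_min'], yearly['temperature_max'],\n",
    "        alpha=.3, label=name.title(),\n",
    "    )\n",
    "\n",
    "ax.set_xlabel('Year')\n",
    "ax.set_ylabel('Temperature (°F)')\n",
    "ax.legend()\n",
    "ax.spines[['top', 'right']].set_visible(False)\n",
    "\n",
    "show()"
   ]
  },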
  {
   "cell_type": "markdown",
   "id": "f13007c4",
   "metadata": {},
   "source": [
    "2. *interactive* Using the `data/failures.csv` dataset\n",
    "\t- Plot the total failures for each 'player' for each day.\n",
    "\t\t- Apply a smoothing factor (`rolling average`) of 90 days prior to plotting the data.\n",
    "\t- Create a slider widget that controls the number of days involved in the smoothing.\n",
    "\t\t- e.g. this slider should allow me to apply 0 days of smoothing all the way up to 90 days of smoothing"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5d8bdc12",
   "metadata": {},
   "outputs": [],
   "source": [
    "# …"
   ]
  },
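  {
   "cell_type": "markdown",
   "id": "e4a09b35",
   "metadata": {},
   "source": [
    "As a starting hint for the smoothing step only, here is a minimal sketch of a function that a slider callback could call. It runs on synthetic data: the player names and fault counts are invented, and `smooth` is a hypothetical helper, not something defined elsewhere in this workshop. Wiring it to a widget and a `ColumnDataSource` is left as the exercise."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "70f6c2d8",
   "metadata": {},
   "outputs": [],
   "source": [
    "from numpy.random import default_rng\n",
    "from pandas import DataFrame, date_range\n",
    "\n",
    "rng = default_rng(0)\n",
    "index = date_range('2020-01-01', periods=365, freq='D', name='date')\n",
    "\n",
    "# Synthetic daily fault totals per player (names and values are invented)\n",
    "daily = DataFrame(\n",
    "    {\n",
    "        'alice': rng.poisson(3, size=len(index)),\n",
    "        'bob': rng.poisson(5, size=len(index)),\n",
    "    },\n",
    "    index=index,\n",
    ")\n",
    "\n",
    "def smooth(frame, days):\n",
    "    # A zero-day window is not a valid rolling offset, so treat it as no smoothing\n",
    "    if days < 1:\n",
    "        return frame\n",
    "    return frame.rolling(f'{days}D').mean()\n",
    "\n",
    "for days in (0, 7, 90):\n",
    "    spread = smooth(daily, days).std().round(2).to_dict()\n",
    "    print(f'{days:>2} days of smoothing -> std per player: {spread}')"
   ]
  },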
  {
   "cell_type": "markdown",
   "id": "82d72557",
   "metadata": {},
   "source": [
    "3. *animated* Using the `data/weather_*.csv` datasets\n",
    "\t- Plot ALL of the 'temperature_max' data for EACH of the planets.\n",
    "\t- Animate: drill down to the planet with the HIGHEST average temperature (across all datapoints).\n",
    "\t- Animate: reintroduce the other planets’ 'temperature_max' to the chart while maintaining the year 2010 zoom."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "521e65b3",
   "metadata": {},
   "outputs": [],
   "source": [
    "# …"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6696cac8",
   "metadata": {},
   "source": [
    "**Can’t Get Started With The Above?** - try recreating examples from the documentation for these tools\n",
    "(see **Useful Links** below), or follow along with a tutorial for a tool of your choice!\n",
    "\n",
    "## Useful Links\n",
    "\n",
    "**Matplotlib**\n",
    "- Tutorial: https://matplotlib.org/stable/tutorials/index.html\n",
    "- Cheatsheets: https://matplotlib.org/cheatsheets/\n",
    "- Examples: https://matplotlib.org/stable/gallery/index.html\n",
    "\n",
    "**Plotnine**\n",
    "- Tutorial: http://r-statistics.co/Complete-Ggplot2-Tutorial-Part1-With-R-Code.html (note that plotnine does not have official tutorials, so please refer to ggplot2)\n",
    "- Examples: https://plotnine.readthedocs.io/en/stable/gallery.html#\n",
    "\n",
    "**Bokeh**\n",
    "- Tutorial: https://docs.bokeh.org/en/latest/docs/first_steps.html#first-steps\n",
    "- Examples: https://docs.bokeh.org/en/latest/docs/gallery.html#gallery\n",
    "\n",
    "**IPyvizzu**\n",
    "- Tutorial: https://ipyvizzu.vizzuhq.com/latest/tutorial/\n",
    "- Examples: https://ipyvizzu.vizzuhq.com/latest/examples/analytical_operations/\n",
    "\n"
   ]
  }
 ],
 "metadata": {
  "jupytext": {
   "cell_metadata_filter": "-all",
   "main_language": "python",
   "notebook_metadata_filter": "-all"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
 }
diff --git a/notes.2023-08-07.ipynb b/notes.2023-08-07.ipynb
diff --git a/notes.2024-07-16.ipynb b/notes.2024-07-16.ipynb
diff --git a/notes.2024-08-01.ipynb b/notes.2024-08-01.ipynb