@ocefpaf
Created June 16, 2022 23:26
distribution-gve
{
"cells": [
{
"metadata": {
"trusted": true
},
"id": "2bec0f14",
"cell_type": "code",
"source": "import pandas as pd\n\n\ndf = pd.read_csv(\"GVE_only-06-16_14_13_48.csv\")",
"execution_count": 1,
"outputs": []
},
{
"metadata": {
"trusted": true
},
"id": "c2fa68ba",
"cell_type": "code",
"source": "def score_programming(data):\n    \"\"\"https://github.com/oceanhackweek/admin/issues/41#issuecomment-1157692167\"\"\"\n    # Each checked box is one line of the answer, so count the lines.\n    nbox = len(data.split(\"\\n\"))\n    score = 0\n    if nbox <= 3:\n        score = 1\n    elif nbox >= 4 and nbox < 6:\n        score = 2\n    elif nbox >= 6:\n        score = 3\n    else:\n        raise ValueError(f\"Could not evaluate score for {nbox}.\")\n    return score\n\n\ncol = \"For the primary programming language you listed above, please check the boxes of the tasks you can execute. Check all that apply.\"\n\nscores = [score_programming(data) for data in df[col]]\n\ndf[\"score programming\"] = scores",
"execution_count": 2,
"outputs": []
},
{
"metadata": {},
"id": "9ef72a79",
"cell_type": "markdown",
"source": "### Oceanographic Subfields"
},
{
"metadata": {
"trusted": true
},
"id": "f7bda223",
"cell_type": "code",
"source": "import re\n\n\nsubfields = [re.split(\"\\n|, \", s.lower()) for s in df[\"In which field(s) does your research interest fall under? Select all that apply.\"]]\nflat_list = [answer for applicant in subfields for answer in applicant]\n\nunique = sorted(set(flat_list))\n\nunique",
"execution_count": 3,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 3,
"data": {
"text/plain": "['artscience',\n 'biological oceanography',\n 'chemical oceanography',\n 'coastal morphology',\n 'data science',\n 'education',\n 'geology and geophysics',\n 'geomorphology',\n 'meteorology',\n 'ocean engineering',\n 'ocean literacy',\n 'physical oceanography',\n 'resource management',\n 'statistics']"
},
"metadata": {}
}
]
},
{
"metadata": {
"trusted": true
},
"id": "c11b1f03",
"cell_type": "code",
"source": "compose = {}\n\n# One row per applicant, one boolean column per subfield.\nfor k, applicant in enumerate(subfields):\n    compose[k] = [e in applicant for e in unique]\n\n\nsubfields = pd.DataFrame(compose).T\nsubfields.columns = unique",
"execution_count": 4,
"outputs": []
},
{
"metadata": {
"trusted": true
},
"id": "deae8572",
"cell_type": "code",
"source": "subfields.sum().plot.bar();",
"execution_count": 5,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 432x288 with 1 Axes>",
"image/png": "\n"
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"metadata": {},
"id": "926aed42",
"cell_type": "markdown",
"source": "### Diversity"
},
{
"metadata": {
"scrolled": false,
"trusted": true
},
"id": "1b81a35d",
"cell_type": "code",
"source": "diversity = \"In terms of ethnic identity, do you consider yourself a minority with respect to your research field?\"\ngender = \"In terms of gender identity, do you consider yourself a minority with respect to your research field?\"\n\ndf[[diversity, gender]].describe()",
"execution_count": 6,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 6,
"data": {
"text/plain": " In terms of ethnic identity, do you consider yourself a minority with respect to your research field? \\\ncount 27 \nunique 2 \ntop No \nfreq 21 \n\n In terms of gender identity, do you consider yourself a minority with respect to your research field? \ncount 26 \nunique 2 \ntop No \nfreq 16 ",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>In terms of ethnic identity, do you consider yourself a minority with respect to your research field?</th>\n <th>In terms of gender identity, do you consider yourself a minority with respect to your research field?</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>count</th>\n <td>27</td>\n <td>26</td>\n </tr>\n <tr>\n <th>unique</th>\n <td>2</td>\n <td>2</td>\n </tr>\n <tr>\n <th>top</th>\n <td>No</td>\n <td>No</td>\n </tr>\n <tr>\n <th>freq</th>\n <td>21</td>\n <td>16</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {},
"id": "7b65b392",
"cell_type": "markdown",
"source": "### Language"
},
{
"metadata": {
"trusted": true
},
"id": "cf8050a6",
"cell_type": "code",
"source": "col = \"Please rank the programming languages, up to 3, that you are most familiar with.\"\ndf[col].describe()",
"execution_count": 7,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 7,
"data": {
"text/plain": "count 27\nunique 24\ntop Python\nfreq 2\nName: Please rank the programming languages, up to 3, that you are most familiar with., dtype: object"
},
"metadata": {}
}
]
},
{
"metadata": {
"trusted": true
},
"id": "12e033a0",
"cell_type": "code",
"source": "ax = df[\"score programming\"].plot.hist(bins=4)\nax.set_xticks([1, 2, 3]);",
"execution_count": 8,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 432x288 with 1 Axes>",
"image/png": "\n"
},
"metadata": {
"needs_background": "light"
}
}
]
}
],
"metadata": {
"_draft": {
"nbviewer_url": "https://gist.github.com/3100d7b04ca56ff99cc0431fc10ebb3d"
},
"gist": {
"id": "3100d7b04ca56ff99cc0431fc10ebb3d",
"data": {
"description": "distribution-gve",
"public": true
}
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3 (ipykernel)",
"language": "python"
},
"language_info": {
"name": "python",
"version": "3.10.5",
"mimetype": "text/x-python",
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"pygments_lexer": "ipython3",
"nbconvert_exporter": "python",
"file_extension": ".py"
}
},
"nbformat": 4,
"nbformat_minor": 5
}