{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Module 8: Python Pandas\n",
    "## Chapter 15 & 16 from the Alex DeCaria textbook\n",
    "\n",
    "Pandas is a data manipulation tool built on Python's Numpy module. Pandas introduces two new data type structures. The first data structure type is a Pandas *series*, which is a one dimensional array. The second data structure is a *'dataframe'*, which is similar to a 2-D NumPy array, except that it can hold multiple data types within it (strings, floats, integers, datetime objects, ect...). This features make Pandas particularly good for working with time series data. In addition, I've always found reading in text and csv data rather annoying in Python, but Pandas makes this task MUCH easier! For today, we will: \n",
    "- Read csv-formatted data into Python \n",
    "- Learn how to subset and slice dataframes \n",
    "- Aggegrate and group data\n",
    "- Work with time objects in Pandas\n",
    "- While not discussed here, plotting data from a Pandas dataframe is super easy! (We will learn about Python plotting next week)\n",
    "\n",
    "**Before starting:** Make sure that you open up a Jupyter notebook session using OnDemand so you can interactively follow along with today's lecture! Also be sure to copy this Jupyter Notebook files as instructed on CANVAS!\n",
    "\n",
    "    "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "\n",
    "#  Reading in data\n",
    "\n",
    "As a starting point, lets load the appriopriate libraries for working with Pandas:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For the purpose of this class, we will mostly be working with *dataframes* as we will mostly be working with 2D data sets. As a first step, lets fire up a linux terminal session and copy the filed called:<br>\n",
    "*module8_WBB.csv*<br>\n",
    "*module8_zoo.csv*<br><br>\n",
    "These can be found in:<br>\n",
    "*/uufs/chpc.utah.edu/common/home/u0703457/public_html/dereks_homepage/Atmos_5340/module_8*<br>\n",
    "<br>\n",
    "Copy this file into your working directory (preferably, in the same directory as this script, which you have hopefully downloaded and placed in your module8 subdirectory.\n",
    "<br><br>\n",
    "⚠️⚠️⚠️ *As a sanity check lets look at this file using the 'more' command in linux to check out the contents of these files, and to ensure that we are looking at the correct file. You should **ALWAYS** look at the files you plan on reading in with any programming language to make sure the contents are what you think they are*\n",
    "<br><br>\n",
    "To read in a csv file, we will be using the `pd.read_csv()` function, which is similar to Python's default csv function, except that it is a bit less cumbersome to use! This function has a number of arguments that will allow this code to deal all sorts of csv files. For this lecture, we will just focus on a few key arguments:\n",
    "\n",
    ">- `filepath_or_buffer`: name of file we are reading in\n",
    ">- `sep`: how are values seperated in our csv file\n",
    ">- `header`: Do we want to include our header? If left as blank, column names are assigned from the first row of our csv file.\n",
    ">- `skiprows`: How many rows do we want to skip when reading in our file? (0-base index).\n",
    "\n",
    "<br><br>\n",
    "Other arguments for this function can be found here: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html\n",
    "<br><br>\n",
    "Lets read in our file now:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "      animal  uniq_id  water_need\n",
      "0   elephant     1001         500\n",
      "1   elephant     1002         600\n",
      "2   elephant     1003         550\n",
      "3      tiger     1004         300\n",
      "4      tiger     1005         320\n",
      "5      tiger     1006         330\n",
      "6      tiger     1007         290\n",
      "7      tiger     1008         310\n",
      "8      zebra     1009         200\n",
      "9      zebra     1010         220\n",
      "10     zebra     1011         240\n",
      "11     zebra     1012         230\n",
      "12     zebra     1013         220\n",
      "13     zebra     1014         100\n",
      "14     zebra     1015          80\n",
      "15      lion     1016         420\n",
      "16      lion     1017         600\n",
      "17      lion     1018         500\n",
      "18      lion     1019         390\n",
      "19  kangaroo     1020         410\n",
      "20  kangaroo     1021         430\n",
      "21  kangaroo     1022         410\n"
     ]
    }
   ],
   "source": [
    "my_zoo = pd.read_csv('module8_zoo.csv', delimiter = ',')\n",
    "print(my_zoo)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "pandas.core.frame.DataFrame"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "type(my_zoo)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And that is it..! Reading in data with Pandas is usually very easy, especially when its a nicely formatted csv file! As seen in our print statement, we are working with a Pandas data frame, where the indices on the left are the index numbers of our 2D array, while the column names are defined using the column names from the first row of our csv file."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "\n",
    "#  Selecting rows and columns\n",
    "\n",
    "Selecting a specific row or column within a Pandas dataframe is fairly simple. To index a column, we just use the dataframe variable name, and subset the columns by using the name of the column that we would like to select:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0     elephant\n",
      "1     elephant\n",
      "2     elephant\n",
      "3        tiger\n",
      "4        tiger\n",
      "5        tiger\n",
      "6        tiger\n",
      "7        tiger\n",
      "8        zebra\n",
      "9        zebra\n",
      "10       zebra\n",
      "11       zebra\n",
      "12       zebra\n",
      "13       zebra\n",
      "14       zebra\n",
      "15        lion\n",
      "16        lion\n",
      "17        lion\n",
      "18        lion\n",
      "19    kangaroo\n",
      "20    kangaroo\n",
      "21    kangaroo\n",
      "Name: animal, dtype: object\n"
     ]
    }
   ],
   "source": [
    "print(my_zoo['animal'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>   \n",
    "\n",
    "To get the first 5 rows of our dataframe, we can slice our data frame with the following command:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "     animal  uniq_id  water_need\n",
      "0  elephant     1001         500\n",
      "1  elephant     1002         600\n",
      "2  elephant     1003         550\n"
     ]
    }
   ],
   "source": [
    "print(my_zoo[:3])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>  \n",
    "\n",
    "We can also combine both commands to subset our data from the first 3 rows of the 'animal' column:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0    elephant\n",
      "1    elephant\n",
      "2    elephant\n",
      "Name: animal, dtype: object\n"
     ]
    }
   ],
   "source": [
    "print(my_zoo[:3]['animal'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br> \n",
    "\n",
    "What is we wanted to grab the name of our columns only, and save it as a list?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['animal', 'uniq_id', 'water_need']\n"
     ]
    }
   ],
   "source": [
    "print(list(my_zoo.columns))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "\n",
    "You also have the ability to select multiple columns by simply inserting a list of column names as part of the subset command:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "      animal  uniq_id\n",
      "0   elephant     1001\n",
      "1   elephant     1002\n",
      "2   elephant     1003\n",
      "3      tiger     1004\n",
      "4      tiger     1005\n",
      "5      tiger     1006\n",
      "6      tiger     1007\n",
      "7      tiger     1008\n",
      "8      zebra     1009\n",
      "9      zebra     1010\n",
      "10     zebra     1011\n",
      "11     zebra     1012\n",
      "12     zebra     1013\n",
      "13     zebra     1014\n",
      "14     zebra     1015\n",
      "15      lion     1016\n",
      "16      lion     1017\n",
      "17      lion     1018\n",
      "18      lion     1019\n",
      "19  kangaroo     1020\n",
      "20  kangaroo     1021\n",
      "21  kangaroo     1022\n"
     ]
    }
   ],
   "source": [
    "print(my_zoo[['animal','uniq_id']])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "\n",
    "# Do it yourself #1\n",
    "\n",
    "How would we subset our dataframe 'my_zoo' for the last 10 rows of the animal and water_need columns?\n",
    "<br><br><br><br><br><br>\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>  \n",
    "\n",
    "What if we wanted save the rows for animals that have a water needed greater than or equal to 300?\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "      animal  uniq_id  water_need\n",
      "0   elephant     1001         500\n",
      "1   elephant     1002         600\n",
      "2   elephant     1003         550\n",
      "3      tiger     1004         300\n",
      "4      tiger     1005         320\n",
      "5      tiger     1006         330\n",
      "7      tiger     1008         310\n",
      "15      lion     1016         420\n",
      "16      lion     1017         600\n",
      "17      lion     1018         500\n",
      "18      lion     1019         390\n",
      "19  kangaroo     1020         410\n",
      "20  kangaroo     1021         430\n",
      "21  kangaroo     1022         410\n"
     ]
    }
   ],
   "source": [
    "thirsty_animals = my_zoo[my_zoo['water_need'] >= 300]\n",
    "print(thirsty_animals)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br> \n",
    "\n",
    "# Do it yourself #2\n",
    "\n",
    "What if I just wanted the ID numbers for animals that had a water requirement between 300-400? How would I do this? What animal requires this amount of water?\n",
    " <br><br><br><br><br>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br><br>\n",
    "\n",
    "\n",
    "**Other useful functions and methods:**\n",
    "\n",
    "You can also add new columns to dateframe easily using Pandas. For example, lets say we wanted to add a new column called 'age', which we would fill in later. We can simply do the following to accomplish this task:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "      animal  uniq_id  water_need  age\n",
      "0   elephant     1001         500  NaN\n",
      "1   elephant     1002         600  NaN\n",
      "2   elephant     1003         550  NaN\n",
      "3      tiger     1004         300  NaN\n",
      "4      tiger     1005         320  NaN\n",
      "5      tiger     1006         330  NaN\n",
      "6      tiger     1007         290  NaN\n",
      "7      tiger     1008         310  NaN\n",
      "8      zebra     1009         200  NaN\n",
      "9      zebra     1010         220  NaN\n",
      "10     zebra     1011         240  NaN\n",
      "11     zebra     1012         230  NaN\n",
      "12     zebra     1013         220  NaN\n",
      "13     zebra     1014         100  NaN\n",
      "14     zebra     1015          80  NaN\n",
      "15      lion     1016         420  NaN\n",
      "16      lion     1017         600  NaN\n",
      "17      lion     1018         500  NaN\n",
      "18      lion     1019         390  NaN\n",
      "19  kangaroo     1020         410  NaN\n",
      "20  kangaroo     1021         430  NaN\n",
      "21  kangaroo     1022         410  NaN\n"
     ]
    }
   ],
   "source": [
    "my_zoo['age'] = np.nan\n",
    "print(my_zoo)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here, I just set the value in the column as `NaN`, to signify that these need to be filled in at a later time."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "\n",
    "You can use the `.loc` method to subset for a specific row number using an index. Lets saw we now know the age for the 2nd elephant in our data frame. To fill this in, we can use the `.at` method:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "my_zoo.at[2,'age'] = '4'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Did it work?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "      animal  uniq_id  water_need  age\n",
      "0   elephant     1001         500  NaN\n",
      "1   elephant     1002         600  NaN\n",
      "2   elephant     1003         550  4.0\n",
      "3      tiger     1004         300  NaN\n",
      "4      tiger     1005         320  NaN\n",
      "5      tiger     1006         330  NaN\n",
      "6      tiger     1007         290  NaN\n",
      "7      tiger     1008         310  NaN\n",
      "8      zebra     1009         200  NaN\n",
      "9      zebra     1010         220  NaN\n",
      "10     zebra     1011         240  NaN\n",
      "11     zebra     1012         230  NaN\n",
      "12     zebra     1013         220  NaN\n",
      "13     zebra     1014         100  NaN\n",
      "14     zebra     1015          80  NaN\n",
      "15      lion     1016         420  NaN\n",
      "16      lion     1017         600  NaN\n",
      "17      lion     1018         500  NaN\n",
      "18      lion     1019         390  NaN\n",
      "19  kangaroo     1020         410  NaN\n",
      "20  kangaroo     1021         430  NaN\n",
      "21  kangaroo     1022         410  NaN\n"
     ]
    }
   ],
   "source": [
    "print(my_zoo)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "\n",
    "Another handy function is the `.replace` method, which will replace all instances of a specified value within our data frame. For example:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "      animal  uniq_id  water_need  age\n",
      "0   elephant     1001         500  5.0\n",
      "1   elephant     1002         600  5.0\n",
      "2   elephant     1003         550  4.0\n",
      "3      tiger     1004         300  5.0\n",
      "4      tiger     1005         320  5.0\n",
      "5      tiger     1006         330  5.0\n",
      "6      tiger     1007         290  5.0\n",
      "7      tiger     1008         310  5.0\n",
      "8      zebra     1009         200  5.0\n",
      "9      zebra     1010         220  5.0\n",
      "10     zebra     1011         240  5.0\n",
      "11     zebra     1012         230  5.0\n",
      "12     zebra     1013         220  5.0\n",
      "13     zebra     1014         100  5.0\n",
      "14     zebra     1015          80  5.0\n",
      "15      lion     1016         420  5.0\n",
      "16      lion     1017         600  5.0\n",
      "17      lion     1018         500  5.0\n",
      "18      lion     1019         390  5.0\n",
      "19  kangaroo     1020         410  5.0\n",
      "20  kangaroo     1021         430  5.0\n",
      "21  kangaroo     1022         410  5.0\n"
     ]
    }
   ],
   "source": [
    "my_zoo = my_zoo.replace(np.nan, 5)\n",
    "print(my_zoo)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that this command replaced all instances of `np.nan` with 5. Also note that the age of elephant #2 (ID 1002) remained unchanged since it was not equal to `np.nan`."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "\n",
    "#  Aggregating data\n",
    "\n",
    "Pandas dataframe structure makes it super easy to aggregate and manipulate data. For example, lets say we wanted to take an average of the water need for each animal type? We could this manually with this relatively short dataframe, but what if you needed to do this for a much larger zoo, with thousands of animals?!\n",
    "Fortunately, this can be easily done in Python:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "animal\n",
      "elephant    550.000000\n",
      "kangaroo    416.666667\n",
      "lion        477.500000\n",
      "tiger       310.000000\n",
      "zebra       184.285714\n",
      "Name: water_need, dtype: float64\n"
     ]
    }
   ],
   "source": [
    "avg_water = my_zoo.groupby('animal')['water_need'].agg(np.mean)   \n",
    "print(avg_water)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "\n",
    "We can also compute a bunch of other mathematical functions, simultaneously, in a single line!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "                mean  median        std  amax  amin\n",
      "animal                                             \n",
      "elephant  550.000000     550  50.000000   600   500\n",
      "kangaroo  416.666667     410  11.547005   430   410\n",
      "lion      477.500000     460  93.941471   600   390\n",
      "tiger     310.000000     310  15.811388   330   290\n",
      "zebra     184.285714     220  65.791880   240    80\n"
     ]
    }
   ],
   "source": [
    "water_stats = my_zoo.groupby('animal')['water_need'].agg([np.mean,np.median,np.std,np.max,np.min])\n",
    "print(water_stats)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "\n",
    "#  Working with data with timestamps\n",
    "\n",
    "Now, lets work with a data that has a column for time. Meteorological data often contains information about some quanitity that is being measured/modeled, in addition to the time that the measurement/modeled value was made at."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Lets take a quick peek at `module8_WBB.csv` with the Linux `more` command since we will be reading it in shortly.\n",
    "\n",
    "Do you see anything weird about this file that we may want to consider before reading it in with Pandas `pd.read_csv` function?\n",
    "\n",
    "Lets read in our file..."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "                          Station_ID  air_temp_set_1  relative_humidity_set_1  \\\n",
      "Date_Time                                                                       \n",
      "2011-11-30 12:00:00-07:00        WBB           41.83                    65.03   \n",
      "2011-11-30 12:05:00-07:00        WBB           42.15                    63.24   \n",
      "2011-11-30 12:10:00-07:00        WBB           42.40                    62.89   \n",
      "2011-11-30 12:15:00-07:00        WBB           41.92                    64.74   \n",
      "2011-11-30 12:20:00-07:00        WBB           41.61                    65.40   \n",
      "2011-11-30 12:25:00-07:00        WBB           41.76                    64.47   \n",
      "2011-11-30 12:30:00-07:00        WBB           41.11                    62.54   \n",
      "2011-11-30 12:35:00-07:00        WBB           40.26                    55.46   \n",
      "2011-11-30 12:40:00-07:00        WBB           39.06                    57.24   \n",
      "2011-11-30 12:45:00-07:00        WBB           40.35                    56.89   \n",
      "2011-11-30 12:50:00-07:00        WBB           40.55                    57.80   \n",
      "2011-11-30 12:55:00-07:00        WBB           39.85                    59.45   \n",
      "2011-11-30 13:00:00-07:00        WBB           39.58                    61.89   \n",
      "2011-11-30 13:05:00-07:00        WBB           39.04                    65.71   \n",
      "2011-11-30 13:10:00-07:00        WBB           38.07                    69.02   \n",
      "2011-11-30 13:15:00-07:00        WBB           37.45                    72.12   \n",
      "2011-11-30 13:20:00-07:00        WBB           37.24                    74.39   \n",
      "2011-11-30 13:25:00-07:00        WBB           37.20                    77.28   \n",
      "2011-11-30 13:30:00-07:00        WBB           36.46                    81.40   \n",
      "2011-11-30 13:35:00-07:00        WBB           35.58                    84.60   \n",
      "2011-11-30 13:40:00-07:00        WBB           35.49                    85.80   \n",
      "2011-11-30 13:45:00-07:00        WBB           35.56                    86.00   \n",
      "2011-11-30 13:50:00-07:00        WBB           36.48                    85.40   \n",
      "2011-11-30 13:55:00-07:00        WBB           37.20                    83.70   \n",
      "2011-11-30 14:00:00-07:00        WBB           37.40                    82.50   \n",
      "2011-11-30 14:05:00-07:00        WBB           37.87                    79.01   \n",
      "2011-11-30 14:10:00-07:00        WBB           37.74                    75.35   \n",
      "2011-11-30 14:15:00-07:00        WBB           37.65                    74.23   \n",
      "2011-11-30 14:20:00-07:00        WBB           37.71                    74.77   \n",
      "2011-11-30 14:25:00-07:00        WBB           38.16                    73.81   \n",
      "...                              ...             ...                      ...   \n",
      "2011-12-02 09:35:00-07:00        WBB           31.26                    48.90   \n",
      "2011-12-02 09:40:00-07:00        WBB           32.04                    47.57   \n",
      "2011-12-02 09:45:00-07:00        WBB           32.86                    45.90   \n",
      "2011-12-02 09:50:00-07:00        WBB           33.42                    45.31   \n",
      "2011-12-02 09:55:00-07:00        WBB           34.14                    44.35   \n",
      "2011-12-02 10:00:00-07:00        WBB           35.28                    42.54   \n",
      "2011-12-02 10:05:00-07:00        WBB           36.19                    41.02   \n",
      "2011-12-02 10:10:00-07:00        WBB           34.97                    43.20   \n",
      "2011-12-02 10:15:00-07:00        WBB           35.55                    42.60   \n",
      "2011-12-02 10:20:00-07:00        WBB           34.56                    44.34   \n",
      "2011-12-02 10:25:00-07:00        WBB           33.12                    47.06   \n",
      "2011-12-02 10:30:00-07:00        WBB           33.40                    46.75   \n",
      "2011-12-02 10:35:00-07:00        WBB           33.42                    46.71   \n",
      "2011-12-02 10:40:00-07:00        WBB           33.22                    46.96   \n",
      "2011-12-02 10:45:00-07:00        WBB           34.27                    44.99   \n",
      "2011-12-02 10:50:00-07:00        WBB           35.47                    43.20   \n",
      "2011-12-02 10:55:00-07:00        WBB           33.21                    47.13   \n",
      "2011-12-02 11:00:00-07:00        WBB           33.69                    45.98   \n",
      "2011-12-02 11:05:00-07:00        WBB           35.46                    43.29   \n",
      "2011-12-02 11:10:00-07:00        WBB           36.30                    41.72   \n",
      "2011-12-02 11:15:00-07:00        WBB           36.27                    42.02   \n",
      "2011-12-02 11:20:00-07:00        WBB           36.54                    41.69   \n",
      "2011-12-02 11:25:00-07:00        WBB           37.89                    39.56   \n",
      "2011-12-02 11:30:00-07:00        WBB           36.88                    41.33   \n",
      "2011-12-02 11:35:00-07:00        WBB           35.60                    43.24   \n",
      "2011-12-02 11:40:00-07:00        WBB           35.74                    42.71   \n",
      "2011-12-02 11:45:00-07:00        WBB           34.97                    43.94   \n",
      "2011-12-02 11:50:00-07:00        WBB           36.59                    41.50   \n",
      "2011-12-02 11:55:00-07:00        WBB           37.33                    40.51   \n",
      "2011-12-02 12:00:00-07:00        WBB           38.01                    38.91   \n",
      "\n",
      "                           wind_speed_set_1  wind_direction_set_1  \\\n",
      "Date_Time                                                           \n",
      "2011-11-30 12:00:00-07:00              7.36                 275.9   \n",
      "2011-11-30 12:05:00-07:00              6.82                 283.9   \n",
      "2011-11-30 12:10:00-07:00              6.44                 300.8   \n",
      "2011-11-30 12:15:00-07:00              9.46                 292.5   \n",
      "2011-11-30 12:20:00-07:00              9.40                 290.3   \n",
      "2011-11-30 12:25:00-07:00              7.02                 288.8   \n",
      "2011-11-30 12:30:00-07:00              6.93                 296.2   \n",
      "2011-11-30 12:35:00-07:00             11.43                 312.8   \n",
      "2011-11-30 12:40:00-07:00             13.62                 319.0   \n",
      "2011-11-30 12:45:00-07:00              9.53                 288.7   \n",
      "2011-11-30 12:50:00-07:00             10.85                 280.4   \n",
      "2011-11-30 12:55:00-07:00              7.05                 287.5   \n",
      "2011-11-30 13:00:00-07:00              7.23                 282.1   \n",
      "2011-11-30 13:05:00-07:00             10.80                 304.7   \n",
      "2011-11-30 13:10:00-07:00             10.96                 319.2   \n",
      "2011-11-30 13:15:00-07:00              9.57                 323.5   \n",
      "2011-11-30 13:20:00-07:00              5.41                 311.7   \n",
      "2011-11-30 13:25:00-07:00              8.19                 317.6   \n",
      "2011-11-30 13:30:00-07:00             11.50                 322.8   \n",
      "2011-11-30 13:35:00-07:00              9.86                 324.8   \n",
      "2011-11-30 13:40:00-07:00              9.19                 329.3   \n",
      "2011-11-30 13:45:00-07:00              6.24                 313.1   \n",
      "2011-11-30 13:50:00-07:00              3.71                 286.6   \n",
      "2011-11-30 13:55:00-07:00              2.53                 303.9   \n",
      "2011-11-30 14:00:00-07:00              3.38                   4.6   \n",
      "2011-11-30 14:05:00-07:00              4.97                 359.8   \n",
      "2011-11-30 14:10:00-07:00              6.11                 354.1   \n",
      "2011-11-30 14:15:00-07:00              6.98                 351.5   \n",
      "2011-11-30 14:20:00-07:00              3.36                 340.3   \n",
      "2011-11-30 14:25:00-07:00              2.53                 281.7   \n",
      "...                                     ...                   ...   \n",
      "2011-12-02 09:35:00-07:00              1.79                 298.5   \n",
      "2011-12-02 09:40:00-07:00              0.85                 318.1   \n",
      "2011-12-02 09:45:00-07:00              0.89                 334.5   \n",
      "2011-12-02 09:50:00-07:00              1.21                 271.4   \n",
      "2011-12-02 09:55:00-07:00              1.30                 292.3   \n",
      "2011-12-02 10:00:00-07:00              0.22                 298.4   \n",
      "2011-12-02 10:05:00-07:00              0.89                 325.4   \n",
      "2011-12-02 10:10:00-07:00              0.94                 302.0   \n",
      "2011-12-02 10:15:00-07:00              1.54                 203.6   \n",
      "2011-12-02 10:20:00-07:00              2.26                 248.7   \n",
      "2011-12-02 10:25:00-07:00              2.73                 255.4   \n",
      "2011-12-02 10:30:00-07:00              2.46                 252.4   \n",
      "2011-12-02 10:35:00-07:00              2.44                 312.5   \n",
      "2011-12-02 10:40:00-07:00              2.24                 318.8   \n",
      "2011-12-02 10:45:00-07:00              1.88                 301.6   \n",
      "2011-12-02 10:50:00-07:00              2.44                 261.4   \n",
      "2011-12-02 10:55:00-07:00              3.69                 230.5   \n",
      "2011-12-02 11:00:00-07:00              3.13                 242.9   \n",
      "2011-12-02 11:05:00-07:00              1.66                 216.0   \n",
      "2011-12-02 11:10:00-07:00              3.02                 257.6   \n",
      "2011-12-02 11:15:00-07:00              2.62                 283.1   \n",
      "2011-12-02 11:20:00-07:00              1.39                 255.5   \n",
      "2011-12-02 11:25:00-07:00              1.86                 283.6   \n",
      "2011-12-02 11:30:00-07:00              2.48                 287.8   \n",
      "2011-12-02 11:35:00-07:00              1.92                 328.8   \n",
      "2011-12-02 11:40:00-07:00              3.96                 260.9   \n",
      "2011-12-02 11:45:00-07:00              4.29                 262.5   \n",
      "2011-12-02 11:50:00-07:00              1.97                 234.7   \n",
      "2011-12-02 11:55:00-07:00              2.24                 252.9   \n",
      "2011-12-02 12:00:00-07:00              1.90                 235.7   \n",
      "\n",
      "                           wind_gust_set_1  \n",
      "Date_Time                                   \n",
      "2011-11-30 12:00:00-07:00            12.71  \n",
      "2011-11-30 12:05:00-07:00            13.49  \n",
      "2011-11-30 12:10:00-07:00            12.88  \n",
      "2011-11-30 12:15:00-07:00            13.56  \n",
      "2011-11-30 12:20:00-07:00            16.67  \n",
      "2011-11-30 12:25:00-07:00            14.47  \n",
      "2011-11-30 12:30:00-07:00            12.88  \n",
      "2011-11-30 12:35:00-07:00            20.65  \n",
      "2011-11-30 12:40:00-07:00            20.69  \n",
      "2011-11-30 12:45:00-07:00            15.77  \n",
      "2011-11-30 12:50:00-07:00            15.90  \n",
      "2011-11-30 12:55:00-07:00            13.02  \n",
      "2011-11-30 13:00:00-07:00            14.79  \n",
      "2011-11-30 13:05:00-07:00            15.48  \n",
      "2011-11-30 13:10:00-07:00            15.70  \n",
      "2011-11-30 13:15:00-07:00            13.98  \n",
      "2011-11-30 13:20:00-07:00            10.87  \n",
      "2011-11-30 13:25:00-07:00            12.15  \n",
      "2011-11-30 13:30:00-07:00            15.82  \n",
      "2011-11-30 13:35:00-07:00            12.88  \n",
      "2011-11-30 13:40:00-07:00            11.92  \n",
      "2011-11-30 13:45:00-07:00            11.83  \n",
      "2011-11-30 13:50:00-07:00             7.85  \n",
      "2011-11-30 13:55:00-07:00             6.62  \n",
      "2011-11-30 14:00:00-07:00             5.57  \n",
      "2011-11-30 14:05:00-07:00             8.28  \n",
      "2011-11-30 14:10:00-07:00             9.73  \n",
      "2011-11-30 14:15:00-07:00            10.49  \n",
      "2011-11-30 14:20:00-07:00             6.13  \n",
      "2011-11-30 14:25:00-07:00             5.30  \n",
      "...                                    ...  \n",
      "2011-12-02 09:35:00-07:00             3.89  \n",
      "2011-12-02 09:40:00-07:00             2.46  \n",
      "2011-12-02 09:45:00-07:00             2.66  \n",
      "2011-12-02 09:50:00-07:00             3.15  \n",
      "2011-12-02 09:55:00-07:00             2.10  \n",
      "2011-12-02 10:00:00-07:00             1.05  \n",
      "2011-12-02 10:05:00-07:00             2.19  \n",
      "2011-12-02 10:10:00-07:00             2.24  \n",
      "2011-12-02 10:15:00-07:00             3.38  \n",
      "2011-12-02 10:20:00-07:00             5.95  \n",
      "2011-12-02 10:25:00-07:00             4.52  \n",
      "2011-12-02 10:30:00-07:00             4.00  \n",
      "2011-12-02 10:35:00-07:00             4.56  \n",
      "2011-12-02 10:40:00-07:00             4.21  \n",
      "2011-12-02 10:45:00-07:00             4.25  \n",
      "2011-12-02 10:50:00-07:00             4.88  \n",
      "2011-12-02 10:55:00-07:00             5.12  \n",
      "2011-12-02 11:00:00-07:00             4.61  \n",
      "2011-12-02 11:05:00-07:00             3.80  \n",
      "2011-12-02 11:10:00-07:00             5.91  \n",
      "2011-12-02 11:15:00-07:00             4.83  \n",
      "2011-12-02 11:20:00-07:00             3.74  \n",
      "2011-12-02 11:25:00-07:00             4.16  \n",
      "2011-12-02 11:30:00-07:00             4.05  \n",
      "2011-12-02 11:35:00-07:00             4.29  \n",
      "2011-12-02 11:40:00-07:00             6.76  \n",
      "2011-12-02 11:45:00-07:00             5.79  \n",
      "2011-12-02 11:50:00-07:00             4.25  \n",
      "2011-12-02 11:55:00-07:00             4.05  \n",
      "2011-12-02 12:00:00-07:00             3.47  \n",
      "\n",
      "[577 rows x 6 columns]\n"
     ]
    }
   ],
   "source": [
    "dat = pd.read_csv('module8_WBB.csv',sep=',',skiprows=6,index_col=1,parse_dates=True)\n",
    "print(dat)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Wow! Easy! Everything looks like it was properly formatted, at least visually. Lets double check through. Using the dtypes method, we can check the data type of each column in our 'dat' dataframe:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Station_ID                  object\n",
      "air_temp_set_1             float64\n",
      "relative_humidity_set_1    float64\n",
      "wind_speed_set_1           float64\n",
      "wind_direction_set_1       float64\n",
      "wind_gust_set_1            float64\n",
      "dtype: object\n"
     ]
    }
   ],
   "source": [
    "print(dat.dtypes)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "\n",
    "Now that our time column is properly formatted, we can easily manipulate our dateframe, such as resampling the data to different time itervals, or subsetting the dataframe based on a time range."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "                           air_temp_set_1  relative_humidity_set_1  \\\n",
      "Date_Time                                                            \n",
      "2011-11-30 12:00:00-07:00       41.070833                61.262500   \n",
      "2011-11-30 13:00:00-07:00       37.112500                77.275833   \n",
      "2011-11-30 14:00:00-07:00       37.794167                76.843333   \n",
      "2011-11-30 15:00:00-07:00       38.309167                75.110833   \n",
      "2011-11-30 16:00:00-07:00       36.626667                81.808333   \n",
      "2011-11-30 17:00:00-07:00       36.086667                79.774167   \n",
      "2011-11-30 18:00:00-07:00       35.072500                82.941667   \n",
      "2011-11-30 19:00:00-07:00       36.170833                76.920000   \n",
      "2011-11-30 20:00:00-07:00       36.368333                75.253333   \n",
      "2011-11-30 21:00:00-07:00       35.672500                71.585000   \n",
      "2011-11-30 22:00:00-07:00       36.262500                61.963333   \n",
      "2011-11-30 23:00:00-07:00       36.397500                60.644167   \n",
      "2011-12-01 00:00:00-07:00       35.710833                62.371667   \n",
      "2011-12-01 01:00:00-07:00       34.572500                64.782500   \n",
      "2011-12-01 02:00:00-07:00       36.801667                49.818333   \n",
      "2011-12-01 03:00:00-07:00       38.271667                37.009167   \n",
      "2011-12-01 04:00:00-07:00       37.715833                36.103333   \n",
      "2011-12-01 05:00:00-07:00       37.290000                35.842500   \n",
      "2011-12-01 06:00:00-07:00       36.497500                34.757500   \n",
      "2011-12-01 07:00:00-07:00       35.994167                33.582500   \n",
      "2011-12-01 08:00:00-07:00       36.060000                31.395000   \n",
      "2011-12-01 09:00:00-07:00       36.868333                28.220000   \n",
      "2011-12-01 10:00:00-07:00       37.130833                28.402500   \n",
      "2011-12-01 11:00:00-07:00       37.059167                29.966667   \n",
      "2011-12-01 12:00:00-07:00       36.068333                32.129167   \n",
      "2011-12-01 13:00:00-07:00       35.382500                34.141667   \n",
      "2011-12-01 14:00:00-07:00       36.747500                33.377500   \n",
      "2011-12-01 15:00:00-07:00       36.472500                33.803333   \n",
      "2011-12-01 16:00:00-07:00       36.511667                33.147500   \n",
      "2011-12-01 17:00:00-07:00       34.111667                37.075000   \n",
      "2011-12-01 18:00:00-07:00       31.890000                42.536667   \n",
      "2011-12-01 19:00:00-07:00       31.927500                42.229167   \n",
      "2011-12-01 20:00:00-07:00       32.830000                38.487500   \n",
      "2011-12-01 21:00:00-07:00       32.498333                38.027500   \n",
      "2011-12-01 22:00:00-07:00       31.202500                40.455000   \n",
      "2011-12-01 23:00:00-07:00       31.441667                38.649167   \n",
      "2011-12-02 00:00:00-07:00       30.979167                40.770000   \n",
      "2011-12-02 01:00:00-07:00       31.785000                40.255833   \n",
      "2011-12-02 02:00:00-07:00       31.356667                44.642500   \n",
      "2011-12-02 03:00:00-07:00       31.098333                43.977500   \n",
      "2011-12-02 04:00:00-07:00       29.100833                49.737500   \n",
      "2011-12-02 05:00:00-07:00       28.359167                50.650000   \n",
      "2011-12-02 06:00:00-07:00       27.214167                54.053333   \n",
      "2011-12-02 07:00:00-07:00       25.525833                60.672500   \n",
      "2011-12-02 08:00:00-07:00       28.213333                54.267500   \n",
      "2011-12-02 09:00:00-07:00       32.146667                47.738333   \n",
      "2011-12-02 10:00:00-07:00       34.388333                44.708333   \n",
      "2011-12-02 11:00:00-07:00       36.105000                42.290833   \n",
      "2011-12-02 12:00:00-07:00       38.010000                38.910000   \n",
      "\n",
      "                           wind_speed_set_1  wind_direction_set_1  \\\n",
      "Date_Time                                                           \n",
      "2011-11-30 12:00:00-07:00          8.825833            293.066667   \n",
      "2011-11-30 13:00:00-07:00          7.932500            311.608333   \n",
      "2011-11-30 14:00:00-07:00          5.360833            276.608333   \n",
      "2011-11-30 15:00:00-07:00          5.049167            274.108333   \n",
      "2011-11-30 16:00:00-07:00          3.900833            299.675000   \n",
      "2011-11-30 17:00:00-07:00         10.550000            323.441667   \n",
      "2011-11-30 18:00:00-07:00          5.983333            216.864167   \n",
      "2011-11-30 19:00:00-07:00          2.363333            265.808333   \n",
      "2011-11-30 20:00:00-07:00          1.858333            205.978333   \n",
      "2011-11-30 21:00:00-07:00          3.364167            175.933333   \n",
      "2011-11-30 22:00:00-07:00          2.381667            211.918333   \n",
      "2011-11-30 23:00:00-07:00          2.191667            244.408333   \n",
      "2011-12-01 00:00:00-07:00          3.190000            209.165000   \n",
      "2011-12-01 01:00:00-07:00          2.494167            227.047500   \n",
      "2011-12-01 02:00:00-07:00          8.749167            264.300000   \n",
      "2011-12-01 03:00:00-07:00         27.408333             96.825000   \n",
      "2011-12-01 04:00:00-07:00         28.243333             66.291667   \n",
      "2011-12-01 05:00:00-07:00         23.196667             66.730833   \n",
      "2011-12-01 06:00:00-07:00         35.751667             66.654167   \n",
      "2011-12-01 07:00:00-07:00         39.820000             65.290000   \n",
      "2011-12-01 08:00:00-07:00         39.189167             62.622500   \n",
      "2011-12-01 09:00:00-07:00         45.686667             60.757500   \n",
      "2011-12-01 10:00:00-07:00         24.091667             74.785833   \n",
      "2011-12-01 11:00:00-07:00         17.085833             78.295000   \n",
      "2011-12-01 12:00:00-07:00         20.679167             95.750000   \n",
      "2011-12-01 13:00:00-07:00          9.855833            145.594167   \n",
      "2011-12-01 14:00:00-07:00         11.904167            143.310000   \n",
      "2011-12-01 15:00:00-07:00         11.442500             98.513333   \n",
      "2011-12-01 16:00:00-07:00         10.577500             93.065000   \n",
      "2011-12-01 17:00:00-07:00         17.293333             82.720833   \n",
      "2011-12-01 18:00:00-07:00         32.535000             80.155833   \n",
      "2011-12-01 19:00:00-07:00         24.338333             84.640000   \n",
      "2011-12-01 20:00:00-07:00         25.758333             62.915000   \n",
      "2011-12-01 21:00:00-07:00         25.779167             66.466667   \n",
      "2011-12-01 22:00:00-07:00         13.525000             49.516667   \n",
      "2011-12-01 23:00:00-07:00         15.765000             41.401667   \n",
      "2011-12-02 00:00:00-07:00          6.695833             85.031667   \n",
      "2011-12-02 01:00:00-07:00          7.520000             45.764167   \n",
      "2011-12-02 02:00:00-07:00          2.392500            112.525000   \n",
      "2011-12-02 03:00:00-07:00          3.465000             50.455000   \n",
      "2011-12-02 04:00:00-07:00          2.167500            188.166667   \n",
      "2011-12-02 05:00:00-07:00          2.823333            186.527500   \n",
      "2011-12-02 06:00:00-07:00          1.528333            168.907500   \n",
      "2011-12-02 07:00:00-07:00          1.802500            149.603333   \n",
      "2011-12-02 08:00:00-07:00          2.853333            243.021667   \n",
      "2011-12-02 09:00:00-07:00          1.479167            274.033333   \n",
      "2011-12-02 10:00:00-07:00          1.977500            275.891667   \n",
      "2011-12-02 11:00:00-07:00          2.545000            263.858333   \n",
      "2011-12-02 12:00:00-07:00          1.900000            235.700000   \n",
      "\n",
      "                           wind_gust_set_1  \n",
      "Date_Time                                   \n",
      "2011-11-30 12:00:00-07:00        15.224167  \n",
      "2011-11-30 13:00:00-07:00        12.490833  \n",
      "2011-11-30 14:00:00-07:00         8.716667  \n",
      "2011-11-30 15:00:00-07:00         9.271667  \n",
      "2011-11-30 16:00:00-07:00         6.715000  \n",
      "2011-11-30 17:00:00-07:00        13.445000  \n",
      "2011-11-30 18:00:00-07:00         8.852500  \n",
      "2011-11-30 19:00:00-07:00         3.912500  \n",
      "2011-11-30 20:00:00-07:00         3.116667  \n",
      "2011-11-30 21:00:00-07:00         5.665000  \n",
      "2011-11-30 22:00:00-07:00         4.261667  \n",
      "2011-11-30 23:00:00-07:00         4.582500  \n",
      "2011-12-01 00:00:00-07:00         5.182500  \n",
      "2011-12-01 01:00:00-07:00         5.108333  \n",
      "2011-12-01 02:00:00-07:00        17.446667  \n",
      "2011-12-01 03:00:00-07:00        42.325833  \n",
      "2011-12-01 04:00:00-07:00        44.200833  \n",
      "2011-12-01 05:00:00-07:00        42.333333  \n",
      "2011-12-01 06:00:00-07:00        53.620833  \n",
      "2011-12-01 07:00:00-07:00        57.690000  \n",
      "2011-12-01 08:00:00-07:00        55.303333  \n",
      "2011-12-01 09:00:00-07:00        61.510833  \n",
      "2011-12-01 10:00:00-07:00        38.980833  \n",
      "2011-12-01 11:00:00-07:00        25.346667  \n",
      "2011-12-01 12:00:00-07:00        30.819167  \n",
      "2011-12-01 13:00:00-07:00        17.442500  \n",
      "2011-12-01 14:00:00-07:00        22.301667  \n",
      "2011-12-01 15:00:00-07:00        23.310833  \n",
      "2011-12-01 16:00:00-07:00        20.975000  \n",
      "2011-12-01 17:00:00-07:00        26.629167  \n",
      "2011-12-01 18:00:00-07:00        42.610833  \n",
      "2011-12-01 19:00:00-07:00        34.619167  \n",
      "2011-12-01 20:00:00-07:00        39.640000  \n",
      "2011-12-01 21:00:00-07:00        38.998333  \n",
      "2011-12-01 22:00:00-07:00        24.290833  \n",
      "2011-12-01 23:00:00-07:00        23.865000  \n",
      "2011-12-02 00:00:00-07:00        12.403333  \n",
      "2011-12-02 01:00:00-07:00        12.390833  \n",
      "2011-12-02 02:00:00-07:00         4.327500  \n",
      "2011-12-02 03:00:00-07:00         5.329167  \n",
      "2011-12-02 04:00:00-07:00         3.623333  \n",
      "2011-12-02 05:00:00-07:00         4.056667  \n",
      "2011-12-02 06:00:00-07:00         2.994167  \n",
      "2011-12-02 07:00:00-07:00         2.645000  \n",
      "2011-12-02 08:00:00-07:00         4.336667  \n",
      "2011-12-02 09:00:00-07:00         3.063333  \n",
      "2011-12-02 10:00:00-07:00         3.862500  \n",
      "2011-12-02 11:00:00-07:00         4.686667  \n",
      "2011-12-02 12:00:00-07:00         3.470000  \n"
     ]
    }
   ],
   "source": [
    "dat_1hour = dat.resample('60Min').mean()\n",
    "print(dat_1hour)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "⚠️⚠️⚠️ Before moving on, is there something sloppy that I assumed here? ⚠️⚠️⚠️\n",
    "\n",
    "\n",
    "What can we do? Lets decompose our wind speed and direction into u- and v-wind components, and then take the average for each hour!\n",
    "Then we can just convert our u- and v-wind components back into a wind speed and direction!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [],
   "source": [
    "u = -1*dat['wind_speed_set_1']*np.sin(dat['wind_direction_set_1']*(np.pi/180))\n",
    "v = -1*dat['wind_speed_set_1']*np.cos(dat['wind_direction_set_1']*(np.pi/180))\n",
    "    \n",
    "dat['u_wind'] = u\n",
    "dat['v_wind'] = v\n",
    "    \n",
    "dat_1hour = dat.resample('60Min').mean()\n",
    "    \n",
    "dat_1hour['wind_speed_set_1'] = np.sqrt(dat_1hour['u_wind']**2 + dat_1hour['v_wind']**2)\n",
    "dat_1hour['wind_direction_set_1'] = (270-(np.arctan2(dat_1hour['v_wind'],dat_1hour['u_wind'])*(180/np.pi)))%360\n",
    "    "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[ 5.77777778  4.21111111  3.71111111  4.41111111  2.78888889  2.37777778\n",
      "  2.31111111  2.71111111  2.82777778  2.32777778  2.72777778  2.85\n",
      "  2.27222222  1.66111111  3.5         3.68888889  3.46111111  3.12222222\n",
      "  2.77222222  2.37777778  2.51111111  3.22777778  3.45        3.61111111\n",
      "  2.56111111  2.77777778  3.27222222  2.85        2.87222222  1.81111111\n",
      "  0.22222222  0.92222222  1.01111111  0.66111111  0.43888889  0.27222222\n",
      " -0.11111111  0.16111111  0.01111111 -0.03888889 -0.53888889 -1.48888889\n",
      " -2.03888889 -2.9         0.28888889  1.18888889  2.32777778  3.27222222\n",
      "  3.33888889]\n"
     ]
    }
   ],
   "source": [
    "dat['air_temp_set_1'] = (dat['air_temp_set_1'] - 32) * (5/9)\n",
    "dat1_hr = dat.resample('60Min').min()\n",
    "dat1_hr = dat.resample('60Min').max()\n",
    "\n",
    "print(np.array(dat1_hr['air_temp_set_1']))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br> \n",
    "\n",
    "# Do it yourself #3\n",
    "\n",
    "Can you compute the hourly max and min temperature in degrees of Celsius? Temperature is currently being reported in Fahrenheit."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br><br>\n",
    "\n",
    "**Time conversion:** There may also be cases where you will need to convert your time objects between UTC and LST. For example, most environmental data sets are reported in UTC. However, it may be more intuitive to look at the data in LST, especially when looking at data such as temperature and relative humidity. Fortunately, Pandas makes it *relatively* easy to convert between time zones.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "DatetimeIndex(['2011-11-30 12:00:00-07:00', '2011-11-30 12:05:00-07:00',\n",
      "               '2011-11-30 12:10:00-07:00', '2011-11-30 12:15:00-07:00',\n",
      "               '2011-11-30 12:20:00-07:00', '2011-11-30 12:25:00-07:00',\n",
      "               '2011-11-30 12:30:00-07:00', '2011-11-30 12:35:00-07:00',\n",
      "               '2011-11-30 12:40:00-07:00', '2011-11-30 12:45:00-07:00',\n",
      "               ...\n",
      "               '2011-12-02 11:15:00-07:00', '2011-12-02 11:20:00-07:00',\n",
      "               '2011-12-02 11:25:00-07:00', '2011-12-02 11:30:00-07:00',\n",
      "               '2011-12-02 11:35:00-07:00', '2011-12-02 11:40:00-07:00',\n",
      "               '2011-12-02 11:45:00-07:00', '2011-12-02 11:50:00-07:00',\n",
      "               '2011-12-02 11:55:00-07:00', '2011-12-02 12:00:00-07:00'],\n",
      "              dtype='datetime64[ns, pytz.FixedOffset(-420)]', name='Date_Time', length=577, freq=None)\n"
     ]
    }
   ],
   "source": [
    "#Before time conversion...\n",
    "print(dat.index)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br><br>\n",
    "Here, python assumed that the timestamp is in MT (hence the -07:00) since our timestamp had 'MST' when we read it in. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [],
   "source": [
    "#Lets apply a time conversion that converts our times in MST to UTC\n",
    "dat.index = dat.index.tz_convert('UTC')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "DatetimeIndex(['2011-11-30 19:00:00+00:00', '2011-11-30 19:05:00+00:00',\n",
      "               '2011-11-30 19:10:00+00:00', '2011-11-30 19:15:00+00:00',\n",
      "               '2011-11-30 19:20:00+00:00', '2011-11-30 19:25:00+00:00',\n",
      "               '2011-11-30 19:30:00+00:00', '2011-11-30 19:35:00+00:00',\n",
      "               '2011-11-30 19:40:00+00:00', '2011-11-30 19:45:00+00:00',\n",
      "               ...\n",
      "               '2011-12-02 18:15:00+00:00', '2011-12-02 18:20:00+00:00',\n",
      "               '2011-12-02 18:25:00+00:00', '2011-12-02 18:30:00+00:00',\n",
      "               '2011-12-02 18:35:00+00:00', '2011-12-02 18:40:00+00:00',\n",
      "               '2011-12-02 18:45:00+00:00', '2011-12-02 18:50:00+00:00',\n",
      "               '2011-12-02 18:55:00+00:00', '2011-12-02 19:00:00+00:00'],\n",
      "              dtype='datetime64[ns, UTC]', name='Date_Time', length=577, freq=None)\n"
     ]
    }
   ],
   "source": [
    "#Print times after the conversion...\n",
    "print(dat.index)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Did anything happen?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "\n",
    "**Cautionary tale:** Working with time objects can often be one of the more frustrating parts of programming, and this is often where a lot of bugs may arise. So always becareful when working with these data types! Also, do not get too discouraged. Even people like myself who have been programming for years often run into problems with time objects!\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Reading in timestamp and assigning it to the pandas index makes it really easy to filter the data accordingly. For example, lets select a range of data we want to look at..."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "                          Station_ID  air_temp_set_1  relative_humidity_set_1  \\\n",
      "Date_Time                                                                       \n",
      "2011-12-01 00:00:00+00:00        WBB        2.350000                    80.70   \n",
      "2011-12-01 00:05:00+00:00        WBB        2.322222                    79.28   \n",
      "2011-12-01 00:10:00+00:00        WBB        2.277778                    78.61   \n",
      "2011-12-01 00:15:00+00:00        WBB        2.327778                    78.33   \n",
      "2011-12-01 00:20:00+00:00        WBB        2.377778                    78.26   \n",
      "2011-12-01 00:25:00+00:00        WBB        2.327778                    78.83   \n",
      "2011-12-01 00:30:00+00:00        WBB        2.338889                    79.03   \n",
      "2011-12-01 00:35:00+00:00        WBB        2.350000                    79.28   \n",
      "2011-12-01 00:40:00+00:00        WBB        2.311111                    79.87   \n",
      "2011-12-01 00:45:00+00:00        WBB        2.261111                    80.30   \n",
      "2011-12-01 00:50:00+00:00        WBB        2.072222                    81.90   \n",
      "2011-12-01 00:55:00+00:00        WBB        1.927778                    82.90   \n",
      "2011-12-01 01:00:00+00:00        WBB        1.861111                    83.00   \n",
      "2011-12-01 01:05:00+00:00        WBB        1.827778                    83.20   \n",
      "2011-12-01 01:10:00+00:00        WBB        1.838889                    81.20   \n",
      "2011-12-01 01:15:00+00:00        WBB        1.577778                    82.10   \n",
      "2011-12-01 01:20:00+00:00        WBB        1.338889                    83.40   \n",
      "2011-12-01 01:25:00+00:00        WBB        1.188889                    84.90   \n",
      "2011-12-01 01:30:00+00:00        WBB        1.338889                    84.80   \n",
      "2011-12-01 01:35:00+00:00        WBB        1.461111                    84.90   \n",
      "2011-12-01 01:40:00+00:00        WBB        1.688889                    84.20   \n",
      "2011-12-01 01:45:00+00:00        WBB        1.938889                    82.70   \n",
      "2011-12-01 01:50:00+00:00        WBB        2.111111                    80.70   \n",
      "2011-12-01 01:55:00+00:00        WBB        2.311111                    80.20   \n",
      "2011-12-01 02:00:00+00:00        WBB        2.188889                    80.20   \n",
      "2011-12-01 02:05:00+00:00        WBB        2.111111                    81.20   \n",
      "2011-12-01 02:10:00+00:00        WBB        2.072222                    79.31   \n",
      "2011-12-01 02:15:00+00:00        WBB        2.111111                    78.24   \n",
      "2011-12-01 02:20:00+00:00        WBB        2.138889                    76.21   \n",
      "2011-12-01 02:25:00+00:00        WBB        2.388889                    72.50   \n",
      "...                              ...             ...                      ...   \n",
      "2011-12-01 09:35:00+00:00        WBB        3.338889                    43.54   \n",
      "2011-12-01 09:40:00+00:00        WBB        3.261111                    45.34   \n",
      "2011-12-01 09:45:00+00:00        WBB        3.311111                    43.63   \n",
      "2011-12-01 09:50:00+00:00        WBB        3.500000                    41.62   \n",
      "2011-12-01 09:55:00+00:00        WBB        3.350000                    43.36   \n",
      "2011-12-01 10:00:00+00:00        WBB        3.388889                    40.87   \n",
      "2011-12-01 10:05:00+00:00        WBB        3.472222                    40.19   \n",
      "2011-12-01 10:10:00+00:00        WBB        3.327778                    41.26   \n",
      "2011-12-01 10:15:00+00:00        WBB        3.338889                    41.63   \n",
      "2011-12-01 10:20:00+00:00        WBB        3.372222                    41.56   \n",
      "2011-12-01 10:25:00+00:00        WBB        3.427778                    35.37   \n",
      "2011-12-01 10:30:00+00:00        WBB        3.561111                    33.83   \n",
      "2011-12-01 10:35:00+00:00        WBB        3.577778                    33.60   \n",
      "2011-12-01 10:40:00+00:00        WBB        3.688889                    33.14   \n",
      "2011-12-01 10:45:00+00:00        WBB        3.588889                    33.82   \n",
      "2011-12-01 10:50:00+00:00        WBB        3.627778                    33.58   \n",
      "2011-12-01 10:55:00+00:00        WBB        3.438889                    35.26   \n",
      "2011-12-01 11:00:00+00:00        WBB        3.427778                    35.25   \n",
      "2011-12-01 11:05:00+00:00        WBB        3.422222                    35.10   \n",
      "2011-12-01 11:10:00+00:00        WBB        3.461111                    34.99   \n",
      "2011-12-01 11:15:00+00:00        WBB        3.238889                    36.07   \n",
      "2011-12-01 11:20:00+00:00        WBB        3.238889                    35.65   \n",
      "2011-12-01 11:25:00+00:00        WBB        3.177778                    35.88   \n",
      "2011-12-01 11:30:00+00:00        WBB        3.072222                    36.19   \n",
      "2011-12-01 11:35:00+00:00        WBB        3.022222                    36.51   \n",
      "2011-12-01 11:40:00+00:00        WBB        2.888889                    37.14   \n",
      "2011-12-01 11:45:00+00:00        WBB        2.977778                    36.66   \n",
      "2011-12-01 11:50:00+00:00        WBB        3.038889                    36.94   \n",
      "2011-12-01 11:55:00+00:00        WBB        3.138889                    36.86   \n",
      "2011-12-01 12:00:00+00:00        WBB        3.111111                    36.89   \n",
      "\n",
      "                           wind_speed_set_1  wind_direction_set_1  \\\n",
      "Date_Time                                                           \n",
      "2011-12-01 00:00:00+00:00              8.86                314.90   \n",
      "2011-12-01 00:05:00+00:00             10.67                319.00   \n",
      "2011-12-01 00:10:00+00:00             11.30                322.40   \n",
      "2011-12-01 00:15:00+00:00             11.32                319.90   \n",
      "2011-12-01 00:20:00+00:00             10.49                320.30   \n",
      "2011-12-01 00:25:00+00:00             10.02                324.20   \n",
      "2011-12-01 00:30:00+00:00             11.61                329.50   \n",
      "2011-12-01 00:35:00+00:00             11.18                323.10   \n",
      "2011-12-01 00:40:00+00:00             10.38                329.70   \n",
      "2011-12-01 00:45:00+00:00             10.54                320.60   \n",
      "2011-12-01 00:50:00+00:00             10.54                324.40   \n",
      "2011-12-01 00:55:00+00:00              9.69                333.30   \n",
      "2011-12-01 01:00:00+00:00              8.79                345.10   \n",
      "2011-12-01 01:05:00+00:00              7.72                356.80   \n",
      "2011-12-01 01:10:00+00:00             11.10                358.10   \n",
      "2011-12-01 01:15:00+00:00             11.50                356.30   \n",
      "2011-12-01 01:20:00+00:00             11.61                 11.69   \n",
      "2011-12-01 01:25:00+00:00              7.81                 27.91   \n",
      "2011-12-01 01:30:00+00:00              5.17                 63.33   \n",
      "2011-12-01 01:35:00+00:00              1.68                 58.64   \n",
      "2011-12-01 01:40:00+00:00              2.01                200.90   \n",
      "2011-12-01 01:45:00+00:00              2.33                300.90   \n",
      "2011-12-01 01:50:00+00:00              1.43                282.70   \n",
      "2011-12-01 01:55:00+00:00              0.65                240.00   \n",
      "2011-12-01 02:00:00+00:00              0.40                137.70   \n",
      "2011-12-01 02:05:00+00:00              0.60                329.10   \n",
      "2011-12-01 02:10:00+00:00              3.40                325.10   \n",
      "2011-12-01 02:15:00+00:00              3.89                336.00   \n",
      "2011-12-01 02:20:00+00:00              3.18                336.90   \n",
      "2011-12-01 02:25:00+00:00              1.81                349.00   \n",
      "...                                     ...                   ...   \n",
      "2011-12-01 09:35:00+00:00             11.05                354.90   \n",
      "2011-12-01 09:40:00+00:00              8.93                326.20   \n",
      "2011-12-01 09:45:00+00:00             10.20                 93.40   \n",
      "2011-12-01 09:50:00+00:00              6.24                256.80   \n",
      "2011-12-01 09:55:00+00:00              8.21                317.90   \n",
      "2011-12-01 10:00:00+00:00             11.79                 58.46   \n",
      "2011-12-01 10:05:00+00:00             12.35                 19.09   \n",
      "2011-12-01 10:10:00+00:00              8.23                 18.18   \n",
      "2011-12-01 10:15:00+00:00              4.52                352.50   \n",
      "2011-12-01 10:20:00+00:00              5.59                310.60   \n",
      "2011-12-01 10:25:00+00:00             22.24                 59.96   \n",
      "2011-12-01 10:30:00+00:00             33.58                 51.70   \n",
      "2011-12-01 10:35:00+00:00             41.18                 56.10   \n",
      "2011-12-01 10:40:00+00:00             50.64                 58.60   \n",
      "2011-12-01 10:45:00+00:00             49.77                 56.80   \n",
      "2011-12-01 10:50:00+00:00             50.76                 58.54   \n",
      "2011-12-01 10:55:00+00:00             38.25                 61.37   \n",
      "2011-12-01 11:00:00+00:00             36.95                 58.87   \n",
      "2011-12-01 11:05:00+00:00             36.91                 60.30   \n",
      "2011-12-01 11:10:00+00:00             29.93                 63.03   \n",
      "2011-12-01 11:15:00+00:00             29.86                 60.46   \n",
      "2011-12-01 11:20:00+00:00             30.00                 61.33   \n",
      "2011-12-01 11:25:00+00:00             28.99                 62.26   \n",
      "2011-12-01 11:30:00+00:00             28.25                 61.94   \n",
      "2011-12-01 11:35:00+00:00             23.35                 66.40   \n",
      "2011-12-01 11:40:00+00:00             22.77                 73.16   \n",
      "2011-12-01 11:45:00+00:00             26.35                 73.13   \n",
      "2011-12-01 11:50:00+00:00             24.67                 82.40   \n",
      "2011-12-01 11:55:00+00:00             20.89                 72.22   \n",
      "2011-12-01 12:00:00+00:00             22.75                 74.26   \n",
      "\n",
      "                           wind_gust_set_1     u_wind     v_wind  \n",
      "Date_Time                                                         \n",
      "2011-12-01 00:00:00+00:00            11.43   6.275891  -6.254022  \n",
      "2011-12-01 00:05:00+00:00            13.60   7.000150  -8.052751  \n",
      "2011-12-01 00:10:00+00:00            13.47   6.894640  -8.952873  \n",
      "2011-12-01 00:15:00+00:00            14.47   7.291479  -8.658910  \n",
      "2011-12-01 00:20:00+00:00            13.76   6.700674  -8.071001  \n",
      "2011-12-01 00:25:00+00:00            12.28   5.861276  -8.126859  \n",
      "2011-12-01 00:30:00+00:00            14.52   5.892520 -10.003515  \n",
      "2011-12-01 00:35:00+00:00            14.83   6.712698  -8.940474  \n",
      "2011-12-01 00:40:00+00:00            13.65   5.236997  -8.962046  \n",
      "2011-12-01 00:45:00+00:00            12.93   6.690060  -8.144612  \n",
      "2011-12-01 00:50:00+00:00            13.69   6.135576  -8.570082  \n",
      "2011-12-01 00:55:00+00:00            12.71   4.353901  -8.656769  \n",
      "2011-12-01 01:00:00+00:00            13.02   2.260197  -8.494446  \n",
      "2011-12-01 01:05:00+00:00            11.01   0.430942  -7.707963  \n",
      "2011-12-01 01:10:00+00:00            16.31   0.368022 -11.093897  \n",
      "2011-12-01 01:15:00+00:00            15.97   0.742122 -11.476030  \n",
      "2011-12-01 01:20:00+00:00            16.26  -2.352376 -11.369188  \n",
      "2011-12-01 01:25:00+00:00            10.04  -3.655736  -6.901572  \n",
      "2011-12-01 01:30:00+00:00             8.12  -4.619946  -2.320561  \n",
      "2011-12-01 01:35:00+00:00             3.20  -1.434576  -0.874295  \n",
      "2011-12-01 01:40:00+00:00             3.15   0.717043   1.877751  \n",
      "2011-12-01 01:45:00+00:00             3.89   1.999291  -1.196551  \n",
      "2011-12-01 01:50:00+00:00             3.02   1.395014  -0.314380  \n",
      "2011-12-01 01:55:00+00:00             2.24   0.562917   0.325000  \n",
      "2011-12-01 02:00:00+00:00             1.52  -0.269205   0.295852  \n",
      "2011-12-01 02:05:00+00:00             3.15   0.308125  -0.514839  \n",
      "2011-12-01 02:10:00+00:00             4.21   1.945296  -2.788516  \n",
      "2011-12-01 02:15:00+00:00             5.17   1.582206  -3.553692  \n",
      "2011-12-01 02:20:00+00:00             5.03   1.247632  -2.925032  \n",
      "2011-12-01 02:25:00+00:00             3.78   0.345364  -1.776745  \n",
      "...                                    ...        ...        ...  \n",
      "2011-12-01 09:35:00+00:00            18.23   0.982282 -11.006254  \n",
      "2011-12-01 09:40:00+00:00            20.29   4.967720  -7.420691  \n",
      "2011-12-01 09:45:00+00:00            26.35 -10.182046   0.604925  \n",
      "2011-12-01 09:50:00+00:00            17.09   6.075132   1.424909  \n",
      "2011-12-01 09:55:00+00:00            17.22   5.504203  -6.091622  \n",
      "2011-12-01 10:00:00+00:00            29.73 -10.048324  -6.167275  \n",
      "2011-12-01 10:05:00+00:00            21.83  -4.039104 -11.670824  \n",
      "2011-12-01 10:10:00+00:00            17.87  -2.567787  -7.819167  \n",
      "2011-12-01 10:15:00+00:00            11.48   0.589978  -4.481331  \n",
      "2011-12-01 10:20:00+00:00            12.10   4.244327  -3.637828  \n",
      "2011-12-01 10:25:00+00:00            49.57 -19.252637 -11.133444  \n",
      "2011-12-01 10:30:00+00:00            48.92 -26.352791 -20.812180  \n",
      "2011-12-01 10:35:00+00:00            56.46 -34.179906 -22.967944  \n",
      "2011-12-01 10:40:00+00:00            68.83 -43.223812 -26.383928  \n",
      "2011-12-01 10:45:00+00:00            69.43 -41.645760 -27.252222  \n",
      "2011-12-01 10:50:00+00:00            67.47 -43.298520 -26.491806  \n",
      "2011-12-01 10:55:00+00:00            54.22 -33.573257 -18.327545  \n",
      "2011-12-01 11:00:00+00:00            54.67 -31.629071 -19.102470  \n",
      "2011-12-01 11:05:00+00:00            54.87 -32.061189 -18.287379  \n",
      "2011-12-01 11:10:00+00:00            51.02 -26.674936 -13.573971  \n",
      "2011-12-01 11:15:00+00:00            40.02 -25.978550 -14.721908  \n",
      "2011-12-01 11:20:00+00:00            41.11 -26.321925 -14.392925  \n",
      "2011-12-01 11:25:00+00:00            39.71 -25.658147 -13.493687  \n",
      "2011-12-01 11:30:00+00:00            40.42 -24.929367 -13.288685  \n",
      "2011-12-01 11:35:00+00:00            36.69 -21.397070  -9.348150  \n",
      "2011-12-01 11:40:00+00:00            44.13 -21.793565  -6.596470  \n",
      "2011-12-01 11:45:00+00:00            48.38 -25.216045  -7.646801  \n",
      "2011-12-01 11:50:00+00:00            41.94 -24.453287  -3.262765  \n",
      "2011-12-01 11:55:00+00:00            37.45 -19.892211  -6.379032  \n",
      "2011-12-01 12:00:00+00:00            48.27 -21.896934  -6.171449  \n",
      "\n",
      "[145 rows x 8 columns]\n"
     ]
    }
   ],
   "source": [
    "print(dat['2011-12-01 00:00:00':'2011-12-01 12:00:00'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br><br>\n",
    "What if we want to grab data that fall on the *12th hour of every day*?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "                          Station_ID  air_temp_set_1  relative_humidity_set_1  \\\n",
      "Date_Time                                                                       \n",
      "2011-12-01 12:00:00+00:00        WBB        3.111111                    36.89   \n",
      "2011-12-01 12:05:00+00:00        WBB        3.050000                    36.80   \n",
      "2011-12-01 12:10:00+00:00        WBB        2.977778                    36.28   \n",
      "2011-12-01 12:15:00+00:00        WBB        2.877778                    36.44   \n",
      "2011-12-01 12:20:00+00:00        WBB        2.961111                    35.82   \n",
      "2011-12-01 12:25:00+00:00        WBB        2.927778                    35.96   \n",
      "2011-12-01 12:30:00+00:00        WBB        2.922222                    35.65   \n",
      "2011-12-01 12:35:00+00:00        WBB        3.122222                    33.92   \n",
      "2011-12-01 12:40:00+00:00        WBB        2.988889                    35.06   \n",
      "2011-12-01 12:45:00+00:00        WBB        2.777778                    35.91   \n",
      "2011-12-01 12:50:00+00:00        WBB        2.811111                    35.55   \n",
      "2011-12-01 12:55:00+00:00        WBB        2.738889                    35.83   \n",
      "2011-12-02 12:00:00+00:00        WBB       -1.738889                    50.04   \n",
      "2011-12-02 12:05:00+00:00        WBB       -1.488889                    48.59   \n",
      "2011-12-02 12:10:00+00:00        WBB       -1.700000                    50.37   \n",
      "2011-12-02 12:15:00+00:00        WBB       -1.922222                    51.27   \n",
      "2011-12-02 12:20:00+00:00        WBB       -2.050000                    51.73   \n",
      "2011-12-02 12:25:00+00:00        WBB       -2.227778                    52.03   \n",
      "2011-12-02 12:30:00+00:00        WBB       -2.388889                    51.57   \n",
      "2011-12-02 12:35:00+00:00        WBB       -2.450000                    52.25   \n",
      "2011-12-02 12:40:00+00:00        WBB       -2.338889                    51.95   \n",
      "2011-12-02 12:45:00+00:00        WBB       -2.161111                    50.36   \n",
      "2011-12-02 12:50:00+00:00        WBB       -1.877778                    48.63   \n",
      "2011-12-02 12:55:00+00:00        WBB       -1.927778                    49.01   \n",
      "\n",
      "                           wind_speed_set_1  wind_direction_set_1  \\\n",
      "Date_Time                                                           \n",
      "2011-12-01 12:00:00+00:00             22.75                 74.26   \n",
      "2011-12-01 12:05:00+00:00             16.80                 70.98   \n",
      "2011-12-01 12:10:00+00:00             20.56                 63.14   \n",
      "2011-12-01 12:15:00+00:00             18.39                 79.99   \n",
      "2011-12-01 12:20:00+00:00             23.42                 58.35   \n",
      "2011-12-01 12:25:00+00:00             18.54                 64.19   \n",
      "2011-12-01 12:30:00+00:00             22.10                 60.78   \n",
      "2011-12-01 12:35:00+00:00             26.93                 51.69   \n",
      "2011-12-01 12:40:00+00:00             23.55                 62.64   \n",
      "2011-12-01 12:45:00+00:00             26.40                 76.26   \n",
      "2011-12-01 12:50:00+00:00             31.56                 68.76   \n",
      "2011-12-01 12:55:00+00:00             27.36                 69.73   \n",
      "2011-12-02 12:00:00+00:00              3.33                149.60   \n",
      "2011-12-02 12:05:00+00:00              2.89                163.30   \n",
      "2011-12-02 12:10:00+00:00              3.18                166.70   \n",
      "2011-12-02 12:15:00+00:00              2.10                134.80   \n",
      "2011-12-02 12:20:00+00:00              0.54                133.60   \n",
      "2011-12-02 12:25:00+00:00              3.38                334.60   \n",
      "2011-12-02 12:30:00+00:00              3.62                345.70   \n",
      "2011-12-02 12:35:00+00:00              3.33                344.10   \n",
      "2011-12-02 12:40:00+00:00              2.91                352.90   \n",
      "2011-12-02 12:45:00+00:00              2.98                 21.97   \n",
      "2011-12-02 12:50:00+00:00              3.56                 38.88   \n",
      "2011-12-02 12:55:00+00:00              2.06                 52.18   \n",
      "\n",
      "                           wind_gust_set_1     u_wind     v_wind  \n",
      "Date_Time                                                         \n",
      "2011-12-01 12:00:00+00:00            48.27 -21.896934  -6.171449  \n",
      "2011-12-01 12:05:00+00:00            32.93 -15.882802  -5.475089  \n",
      "2011-12-01 12:10:00+00:00            41.47 -18.341847  -9.289255  \n",
      "2011-12-01 12:15:00+00:00            31.61 -18.110057  -3.196551  \n",
      "2011-12-01 12:20:00+00:00            42.61 -19.936728 -12.289153  \n",
      "2011-12-01 12:25:00+00:00            29.89 -16.690501  -8.072098  \n",
      "2011-12-01 12:30:00+00:00            36.31 -19.287813 -10.788432  \n",
      "2011-12-01 12:35:00+00:00            45.81 -21.131114 -16.694338  \n",
      "2011-12-01 12:40:00+00:00            44.49 -20.915613 -10.823106  \n",
      "2011-12-01 12:45:00+00:00            53.66 -25.644525  -6.270432  \n",
      "2011-12-01 12:50:00+00:00            52.86 -29.416164 -11.433411  \n",
      "2011-12-01 12:55:00+00:00            48.09 -25.665608  -9.478722  \n",
      "2011-12-02 12:00:00+00:00             5.08  -1.685092   2.872171  \n",
      "2011-12-02 12:05:00+00:00             4.16  -0.830472   2.768107  \n",
      "2011-12-02 12:10:00+00:00             4.43  -0.731558   3.094709  \n",
      "2011-12-02 12:15:00+00:00             3.94  -1.490099   1.479732  \n",
      "2011-12-02 12:20:00+00:00             1.52  -0.391053   0.372395  \n",
      "2011-12-02 12:25:00+00:00             4.94   1.449801  -3.053273  \n",
      "2011-12-02 12:30:00+00:00             4.88   0.894136  -3.507837  \n",
      "2011-12-02 12:35:00+00:00             4.25   0.912284  -3.202599  \n",
      "2011-12-02 12:40:00+00:00             3.94   0.359680  -2.887686  \n",
      "2011-12-02 12:45:00+00:00             4.07  -1.114881  -2.763592  \n",
      "2011-12-02 12:50:00+00:00             4.00  -2.234581  -2.771326  \n",
      "2011-12-02 12:55:00+00:00             3.47  -1.627278  -1.263157  \n"
     ]
    }
   ],
   "source": [
    "print(dat[dat.index.hour == 12])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "And you get the idea..."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br> \n",
    "\n",
    "# Do it yourself #4\n",
    "\n",
    "Lets play around with some hourly air quality data from the EPA We are going to read in a very large data set from EPA's AirNow website, which one-stop source for air quality data.Good news is there is a lot of data, especially for the western U.S. The bad news? They give us everything! This data set is an example where programs like Microsoft Excel really start getting bogged down, and therefore requires a different *'tool'*, which in this case, will be \n",
    "Python!\n",
    "\n",
    "<img src='../images/WSFC_smoke_2020-08-22_00-00-00.png' width=700px align='center' style='padding-left:100px'>\n",
    "\n",
    "For this **Do it Yourself**, I want the class to read in this data using pandas read csv function:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [],
   "source": [
    "#Set linux path to our file and concatenate the path and string\n",
    "filepath = '/uufs/chpc.utah.edu/common/home/u0703457/lin-group7/dvm/projects/UDAQ_2020-22/obs/AirNow/'\n",
    "filename = 'hourly_88101_PM25_2020.csv'\n",
    "filename = filepath+filename\n",
    "\n",
    "#Read in data with our pandas read_csv function. Only read in specific columns of this data set as given by\n",
    "#the usecols arguement. Rename the column names to something simpler.\n",
    "aq_dat = pd.read_csv(filename,sep=\",\",usecols=['State Code','County Code','Site Num','Latitude','Longitude','Date GMT','Time GMT','Sample Measurement'],parse_dates=[['Date GMT', 'Time GMT']])\n",
    "aq_dat = aq_dat.rename(columns={'Latitude': 'lat','Longitude':'lon','Date GMT_Time GMT':'Time','Sample Measurement':'pm25'}) "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br><br>\n",
    "**Step 1:** How many Sample Measurements (PM2.5) do we have?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Step 2:** Filter the above data set by latitude and longitude. Lets filter out everything west of *-112.5 W*\n",
    "and east of *-111.5 W* and north of *41 N* and south of *40 N*\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Step 3:** Now, lets filter this by time. Lets only grab data between 2020-08-21 and 2020-08-24. This can be done using the between method, which is super useful when working with pandas time series. For example:<br><br>\n",
    "`data[data['Time'].between('2020-08-21', '2020-08-24')]`\n",
    "<br><br>\n",
    "**QUESTION! Why didn't we assign the time to the index like the earlier example??**\n",
    "<br><br>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Step 4:** After we've filtered our data set, how many Sample Measurements (PM2.5) do we have?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Step 5:** What is the highest PM2.5 reading that we have? What is the lowest reading? What is the average PM2.5 reading?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Step 6:** Looking at the data, does anything look odd? If so, how should we handle this?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br><br>\n",
    "\n",
    "# Want more practice!?\n",
    "Check out the following webpages:<br>\n",
    "https://pandas.pydata.org/pandas-docs/version/0.15/tutorials.html<br>\n",
    "https://www.youtube.com/watch?v=dcqPhpY7tWk&t=113s<br>\n",
    "https://www.earthdatascience.org/courses/use-data-open-source-python/use-time-series-data-in-python/date-time-types-in-pandas-python/subset-time-series-data-python/<br>\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}