{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "This block contains comments as markdown code\n", "# **Chapter 1** \n", "**ATMOS 5340: Environmental Programming and Statistics** \n", "**John Horel**\n", "\n", "- Follow these steps in a Linux terminal window\n", "- Are you in your atmos_5340/chapter1 directory?\n", "- Check the directory you are in: `pwd`\n", "- If you are not in that directory, type: `cd`\n", "- And then type: `cd atmos_5340/chapter1`\n", "- Type the following (note the dot after the space): `cp ~u0035056/atmos_5340_2022/chapter1/* .`\n", "- Have you already copied the data directory? If not, you'll need to do that too. Review the instructions for that.\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Using Python modules\n", "\n", "`numpy` provides routines to handle arrays and many calculations efficiently and is imported by convention as `np`. NumPy functions are very good at handling homogeneous data arrays (and are similar in that respect to MATLAB functions).\n", "\n", "`pandas` is really good at handling tabular/array data that may have heterogeneous types (floating-point numbers and text, for example). It is imported by convention as `pd`. \n", "\n", "Two pandas classes (`Series` and `DataFrame`) are used so frequently that we'll import them directly too.\n", "\n", "`scipy` provides many statistical functions in its `stats` submodule, which we'll import from `scipy` when we need it.\n", "\n", "\n", "`pyplot` is a _submodule_ of matplotlib.
It is typically imported under the alias `plt` and handles basic plotting." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "# these are the Python modules used in this notebook\n", "import numpy as np\n", "import pandas as pd\n", "from pandas import Series, DataFrame\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Alta snowfall\n", "https://utahavalanchecenter.org/alta-monthly-snowfall\n", "\n", "\n", "Look in the `data` folder for the file called `alta_snow.csv`.\n", "\n", "Open the `alta_snow.csv` file to see the column contents and the units.\n", "\n", "- The 0th column is the Year at Season End\n", "- The 1st-6th columns are the total snowfall in each month from November to April (in inches)\n", "- The 7th column is the Nov-Apr total snowfall (inches)\n", "\n", "The record begins with the 1946 season and ends in 2022." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1946. 1947. 1948. 1949. 1950. 1951. 1952. 1953. 1954. 1955. 1956. 1957.\n", " 1958. 1959. 1960. 1961. 1962. 1963. 1964. 1965. 1966. 1967. 1968. 1969.\n", " 1970. 1971. 1972. 1973. 1974. 1975. 1976. 1977. 1978. 1979. 1980. 1981.\n", " 1982. 1983. 1984. 1985. 1986. 1987. 1988. 1989. 1990. 1991. 1992. 1993.\n", " 1994. 1995. 1996. 1997. 1998. 1999. 2000. 2001. 2002. 2003. 2004. 2005.\n", " 2006. 2007. 2008. 2009. 2010. 2011. 2012. 2013. 2014. 2015. 2016. 2017.\n", " 2018. 2019. 2020. 2021.
2022.]\n", "[1145.54 949.96 1394.46 1328.42 1211.58 886.46 1628.14 1043.94\n", " 972.82 1198.88 1168.4 980.44 1421.13 980.44 1004.57 828.04\n", " 1019.81 1018.54 1437.64 1455.42 1099.82 1381.76 1217.93 1437.894\n", " 1165.86 1223.01 1185.164 1261.11 1512.824 1536.7 1116.33 798.83\n", " 1332.23 1493.52 1305.56 993.14 1767.84 1617.98 1888.49 1160.78\n", " 1521.46 969.772 1042.162 1477.01 1137.92 1473.708 1003.3 1652.016\n", " 1245.362 1893.316 1427.48 1521.714 1460.246 1164.336 1132.84 1193.038\n", " 1441.958 1014.476 1449.832 1406.144 1609.09 904.24 1661.16 1468.12\n", " 1092.2 1404.62 836.93 971.55 908.05 679.45 998.22 1347.47\n", " 731.52 1206.5 1056.64 949.96 717.296]\n", "Min: 679.5 Max: 1893.3\n" ] } ], "source": [ "#read the year of the Alta snowfall data\n", "year = np.genfromtxt('../data/alta_snow.csv', delimiter=',',usecols=0,skip_header = 1)\n", "print(year)\n", "#read the seasonal total and convert from inches to cm\n", "snow = 2.54 * np.genfromtxt('../data/alta_snow.csv', delimiter=',', usecols=7, skip_header=1)\n", "#print out the data after converting it to cm\n", "print(snow)\n", "#what are the min and max values?\n", "print(\"Min: %.1f Max: %.1f\" % (np.min(snow),np.max(snow)))" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr><th></th><th>Alta Snow (cm)</th></tr>\n", " </thead>\n", " <tbody>\n", " <tr><th>1946</th><td>1145.5</td></tr>\n", " <tr><th>1947</th><td>950.0</td></tr>\n", " <tr><th>1948</th><td>1394.5</td></tr>\n", " <tr><th>1949</th><td>1328.4</td></tr>\n", " <tr><th>1950</th><td>1211.6</td></tr>\n", " <tr><th>...</th><td>...</td></tr>\n", " <tr><th>2018</th><td>731.5</td></tr>\n", " <tr><th>2019</th><td>1206.5</td></tr>\n", " <tr><th>2020</th><td>1056.6</td></tr>\n", " <tr><th>2021</th><td>950.0</td></tr>\n", " <tr><th>2022</th><td>717.3</td></tr>\n", " </tbody>\n", "</table>\n", "<p>77 rows × 1 columns</p>\n", "</div>
" ], "text/plain": [ " Alta Snow (cm)\n", "1946 1145.5\n", "1947 950.0\n", "1948 1394.5\n", "1949 1328.4\n", "1950 1211.6\n", "... ...\n", "2018 731.5\n", "2019 1206.5\n", "2020 1056.6\n", "2021 950.0\n", "2022 717.3\n", "\n", "[77 rows x 1 columns]" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#use the pandas module to organize the data in a convenient manner\n", "#by default pandas displays up to 6 places to the right of the decimal point; limit it to 1\n", "#the next line formats every float with '%.1f'; the lambda syntax is obscure, so don't stress over it\n", "pd.set_option('display.float_format', lambda x: '%.1f' % x)\n", "#define a dataframe, df, from the snow totals, indexed by year\n", "df = pd.DataFrame(snow, index=year.astype(int),columns=['Alta Snow (cm)'])\n", "#list out the content of the dataframe\n", "df" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "#Create a bar plot time series of Alta seasonal snowfall\n", "#create an array of decade tick marks for the x axis; np.arange excludes the stop value, so this ends at 2020 (not 2030)\n", "decade_ticks = np.arange(1950,2030,10)\n", "\n", "#create a figure of the Alta snowfall time series\n", "fig,(ax1) = plt.subplots(1,1,figsize=(10,3))\n", "ax1.bar(year,snow,color='green')\n", "ax1.set(xlim=(1945,2023),ylim=(600,2000))\n", "ax1.set(xlabel=\"Year\",ylabel=\"Snowfall (cm)\")\n", "ax1.set(xticks=decade_ticks)\n", "ax1.set(title=\"Alta Snowfall: 1946-2022\")\n", "#add grids to the plot\n", "ax1.grid(linestyle='--', color='grey', linewidth=.2)\n", "\n", "#save the figure to a PNG file in the current working directory\n", "plt.savefig('alta_snowfall.png')" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "#generate an empirical Gaussian (normal) distribution for Figure 1.3:\n", "#one million random draws with mean 0 and standard deviation 1\n", "from numpy.random import normal\n", "sample = normal(loc=0, scale=1, size=1000000)\n", "# plot the histogram\n", "fig,(ax1) = plt.subplots(1,1,figsize=(5,5))\n", "ax1.hist(sample, bins=31, color='cyan',edgecolor='black',linewidth=1,align='mid')\n", "ax1.set(xlim=(-3,3),ylim=(0,150000))\n", "ax1.set(xlabel=\"Magnitude\",ylabel=\"Count\")\n", "ax1.set(title=\"Figure 1.3\")\n", "#add grids to the plot\n", "ax1.grid(linestyle='--', color='grey', linewidth=.2)\n", "#save the figure to a PNG file in the current working directory\n", "plt.savefig('figure_1.3.png')\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.5" } }, "nbformat": 4, "nbformat_minor": 4 }