{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Module 2: Python Strings\n", "## Chapter 3 from the Alex DeCaria textbook: 'Strings'\n", "\n", "In programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable.\n", "- Strings are normally enclosed using single, double, or triple quotation marks\n", "- Quotes used at the start and end of a string must match\n", "- Strings are considered 'literals' in that they have no special meaning or numeric representation.\n", "- Quotation marks within a string are valid if the quotation marks used to enclose the string differ from those *within* the string\n", " - \"This string is going to work\" and \"This string isn't going to cause problems either\" or 'Derek says \"Hello\"' are all valid\n", "- Like numeric data types, strings can also be manipulated within Python.\n", "\n", "\n", "**Before starting:** Make sure that you open up a Jupyter lab/notebook session using OnDemand so you can interactively follow along with today's lecture! Also, be sure this script is in your atmos5340/module_2 subdirectory!\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "\n", "\n", "## Special Characters\n", "\n", "Special characters are used in Python to denote formatting commands such as new lines (\\n), carriage returns (\\r), and tabs (\\t). These can be included within a string!\n", "\n", "For example, let's say I want to print the bias and root-mean-square error for a temperature error analysis from a weather forecast model. I can use the new line special character to print the results on different lines!" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "bias = 2.2 \n", "RMSE = 3.5\n" ] } ], "source": [ "report = 'bias = 2.2 \\nRMSE = 3.5'\n", "print(report)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is more concise than printing the results with two lines of code. Without the new line character, we would have needed to do the following:" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "bias = 2.2\n", "RMSE = 3.5\n" ] } ], "source": [ "bias = 'bias = 2.2'\n", "rmse = 'RMSE = 3.5'\n", "print(bias)\n", "print(rmse)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The example below shows a combination of a new line character (\\n) and a tab (\\t) within the same string:" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Today's weather in Salt Lake City will see a high temperature of 94\n", "\t with partly cloudy skies\n" ] } ], "source": [ "forecast = \"Today's weather in Salt Lake City will see a high temperature of 94\\n\\t with partly cloudy skies\"\n", "print(forecast)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, there may also be situations where you will need to include special characters such as \\n and \\t as 'literals'. 
To treat these special characters as literals, just add a leading backslash \\ before the special character. For example:" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Hello \\n there!\n" ] } ], "source": [ "print('Hello \\\\n there!')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "

\n", "---\n", "---\n", "\n", "## Formatting strings \n", " \n", "Often, a programmer needs to edit a string, for example by inserting a new character or converting a numerical data type to a string. Python has a number of handy functions that allow the programmer to do this easily!\n", "\n", "Python's **'format'** method allows variables to be inserted into a string, and allows the user to control the width and number of decimal places that are displayed. For example:" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Today's predicted high temperature is: 90.8 F\n" ] } ], "source": [ "x = 90.8366284\n", " \n", "print(\"Today's predicted high temperature is: {0:6.1f} F\".format(x))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "Here, the format method pads our temperature to a minimum field width of 6 characters and keeps 1 decimal place, *thus 90.8366284 is rounded to 90.8*. The curly brackets preceding the format call tell Python where x should be inserted in our string, and how that variable should be formatted. \n", "\n", "There are six format specifiers that the user should be aware of:\n", "\n", "| Specifier | Description | \n", "|--|--|\n", "|d|indicates integer data|\n", "|f|indicates floating-point data|\n", "|e|indicates floating-point in scientific notation|\n", "|g|indicates floating-point, switching to scientific notation for exponents less than -4 or greater than 5|\n", "|%|indicates floating-point converted to a percentage|\n", "|s|indicates string data|\n", "\n", "More examples of how the format method can be used to manipulate strings can be reviewed in Chapter 3 Section 3.3 of the DeCaria text.\n", "\n", "
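As a minimal sketch of how the width, precision, and type fields behave, the cell below applies a few of the specifiers from the table. The variable names and values (`temp`, `pressure`, `count`, `humidity`) are made up for illustration, and the last line assumes Python 3.6 or newer, where an f-string can be used in place of `format`.

```python
temp = 90.8366284      # hypothetical temperature (F)
pressure = 101300.0    # hypothetical pressure (Pa)
count = 42             # hypothetical number of observations
humidity = 0.256       # hypothetical relative humidity as a fraction

fixed = '{0:6.1f}'.format(temp)        # width 6, 1 decimal place
sci = '{0:.3e}'.format(pressure)       # scientific notation, 3 decimal places
integer = '{0:5d}'.format(count)       # integer padded to width 5
percent = '{0:.1%}'.format(humidity)   # fraction shown as a percentage
fstring = f'{temp:6.1f}'               # f-string equivalent of the first line

print(fixed)
print(sci)
print(integer)
print(percent)
print(fstring)
```

The number before the decimal point is a minimum width for the whole field, so short values are padded with leading spaces rather than truncated.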
\n", "\n", "\n", "## String indexing\n", "\n", "Similar to lists and tuples, items within a single string (i.e., characters) are assigned a unique index, starting from 0. This is analogous to the elements of a list. \n" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "c\n", "u\n", "cumul\n", "s\n", "onimbus\n" ] } ], "source": [ "cloud = 'cumulonimbus'\n", "print(cloud[0])\n", "\n", " \n", "print(cloud[1])\n", "\n", "print(cloud[0:5])\n", "\n", "print(cloud[-1])\n", " \n", "print(cloud[5:])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
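The slices in the cell above follow the general pattern `string[start:stop:step]`, where the character at `stop` is excluded and negative numbers count from the end of the string. The sketch below reuses the `cloud` string to show a few more variations; the other variable names are only illustrative.

```python
cloud = 'cumulonimbus'

first_five = cloud[:5]      # characters at indices 0 through 4
last_three = cloud[-3:]     # final three characters
every_other = cloud[::2]    # every second character, starting at index 0
backwards = cloud[::-1]     # the whole string in reverse order
length = len(cloud)         # number of characters in the string

print(first_five, last_three, every_other, backwards, length)
```

Because the stop index is excluded, `cloud[:5]` and `cloud[5:]` split the string into two pieces with no overlap.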
\n", "\n", "## String concatenation and multiplication\n", "\n", "In addition to making insertions within strings, strings can also be manipulated by concatenation. This can simply be done by adding a plus sign (+) between two strings. For example:\n" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "hotdog\n" ] } ], "source": [ "print('hot'+'dog')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This can also be done multiple times, as seen below:" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "root-mean-square error\n" ] } ], "source": [ "print('root-'+'mean-'+'square '+'error')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Strings within a list can also be merged together using the join method:" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "I LOVE THE WEATHER!\n" ] } ], "source": [ "weather = ['I','LOVE','THE','WEATHER!']\n", "print(' '.join(weather))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Strings can also be multiplied..." ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "best best best best best \n" ] } ], "source": [ "print('best '*5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "\n", "\n", "## Searching for a string \n", "\n", "Python also contains tools that are useful for searching for specific characters and strings. The `find` and `index` methods are often employed for this type of task. \n", "- `string.find(target,[i,j])` will return an integer that represents the index of the first instance of the target within string, or -1 if the target is not found.\n", "\n", "An example of the find method in action:\n" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2\n" ] } ], "source": [ "my_string = 'orange'\n", "print(my_string.find('a'))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`string.index(target,[i,j])` works similarly to find, except it raises a `ValueError` if a matching value is not found. Both find and index work from left to right. To search from the back of the string (right to left) use `rfind()` or `rindex()`.\n", "\n", "
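To make the difference between the two methods concrete, here is a small sketch that searches for a character that is present and one that is not. The string `'banana'` and the variable names are only illustrative choices.

```python
my_string = 'orange'

found = my_string.find('a')        # index of the first match
missing = my_string.find('z')      # find gives -1 when there is no match
from_right = 'banana'.rfind('a')   # index of the last match

# index raises a ValueError instead of returning -1
try:
    my_string.index('z')
    outcome = 'found'
except ValueError:
    outcome = 'not found'

print(found, missing, from_right, outcome)
```

Checking for -1 (or catching the `ValueError`) before using the result as an index avoids silently slicing from the wrong position.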

\n", " \n", "## Other useful methods for manipulating strings....\n", "\n", "**Replacing text:** `string.replace(target,new)` replaces all occurrences of the *target* with the new text *new*. The original string `string` is not changed, and instead, a copy is created. \n", "\n", "**Stripping text:** `string.strip()` removes leading and trailing white space, such as spaces, tabs, and new line characters. `string.rstrip()` works similarly, but only strips the right (trailing) end of the string, while `string.lstrip()` only strips the left (leading) end. \n", "\n", "**Splitting text:** The `string.split()` method can be used to split a string `string` on a specific delimiter. This will return a list. For example:\n" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['cat', ' dog', ' mouse', ' bird']\n" ] } ], "source": [ "my_animals = 'cat, dog, mouse, bird'\n", "print(my_animals.split(','))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "or" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['cat,', 'dog,', 'mouse,', 'bird']\n" ] } ], "source": [ "print(my_animals.split(' '))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Inserting:** Since Python strings are *immutable*, it is not possible to insert a character within an existing string. However, we can make use of string slicing and the concatenation operator `+` to create a new string with our desired change. \n", "\n", "For example, let's say we wanted to add a space between the words Environmental and Programming in the string below:\n" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Environmental Programming\n" ] } ], "source": [ "my_string = 'EnvironmentalProgramming'\n", "new_string = my_string[:13]+' '+my_string[13:]\n", "print(new_string)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
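Returning to the replace, strip, and split methods described earlier in this section, the sketch below chains them on a single made-up record. The string `raw` and its contents are hypothetical and only meant to show that each method returns a new string (or list) and leaves the original untouched.

```python
raw = '   temp=12.5 C   '     # hypothetical record padded with spaces

cleaned = raw.strip()         # remove white space from both ends
left_only = raw.lstrip()      # remove leading white space only
right_only = raw.rstrip()     # remove trailing white space only
swapped = cleaned.replace('C', 'degC')   # returns a new string
name, value = cleaned.split('=')         # split into two pieces on the equals sign

print(cleaned)
print(swapped)
print(name, value)
print(len(raw))               # raw itself still has its original length
```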

\n", "---\n", "---\n", "\n", "# In-class exercises (Not graded!)\n", " \n", "\n", "1) Separate the string ('09-20-2021-12:00:00') into year, month, day, and hour, and save each of these substrings to a new variable.\n", "\n", "2) Join these variables together, separating the month, day, and year with forward slashes and the date from the time with a space, for example ('09/20/2021 12:00:00')\n", "\n", "3) Create a print statement that prints out the time string you just created, concatenated with a sentence that says \"Today's date and time is: \". Your output should look something like:\n", "\n", " \"Today's date and time is: 09/20/2021 12:00:00\"\n", "\n", "\n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "

\n", "---\n", "---\n", "\n", "# Want more practice!\n", "> https://www.tutorialspoint.com/python/python_strings.htm
\n", "> https://www.w3schools.com/python/python_strings.asp
\n", "> https://www.johnny-lin.com/pyintro/ed01/free_pdfs/ch03.pdf (Section 3.2)\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.5" } }, "nbformat": 4, "nbformat_minor": 4 }