Jared Foy Teach me good judgement and knowledge

Control structures and functions

Published December 18th, 2018 3:51 pm

My dissociation seems to be getting a lot better. Anyway, here is a bunch of stuff about Python control structures and functions.

Chapter 4: Control Structures and Functions
Here we deal with branching and looping, then exception handling. We already covered most of the control structures, but here we give more complete coverage, including additional syntax, and show how to raise exceptions and create custom exceptions. The third and largest section is devoted to custom functions, with detailed coverage of Python's extremely versatile argument handling. Custom functions allow us to package up and parameterize functionality. This reduces the size of the code by eliminating code duplication and provides code reuse. In the following chapter we'll see how to create custom modules, so we can make use of our custom functions across our programs.
Control Structures
Python gives conditional branching with if statements and looping with while and for...in statements.
Branching -- if statements
Looping -- for...in, while
Python also has a 'conditional expression', a kind of inline 'if' that is Python's answer to the ternary operator (?:) used in C-style languages. We saw before that the general syntax for Python's conditional branch is:
if boolean_expression1:
    suite1
elif boolean_expression2:
    suite2
else:
    else_suite
The elif clauses are optional, and so is the final else clause. In some cases we can reduce an if...else statement down to a single 'conditional expression'. The syntax for a conditional expression is:
expression1 if boolean_expression else expression2
If the boolean_expression evaluates to True, the result of the conditional expression is expression1; otherwise the result is expression2. One common programming pattern is to set a variable to a default value and then change the value if necessary, for example because of a request by the user, or to account for the platform on which the program is being run. This is the pattern using a conventional if statement:
import sys

offset = 20
if not sys.platform.startswith('win'):
    offset = 10
The sys.platform variable holds the name of the current platform, like 'win32'. We can do the same thing more cleanly using the conditional expression syntax:
offset = 20 if sys.platform.startswith('win') else 10
We don't need to use parens, but they can help us avoid a trap. Suppose we want to set a width variable to 100, plus an extra 10 if margin is True. It might look like this:
width = 100 + 10 if margin else 0 #this is wrong
It works great if margin is True, but if margin is False then width is set to 0, not 100. This is because Python sees 100 + 10 as the expression1 part of the conditional expression. To fix this we use parens:
width = 100 + (10 if margin else 0)
This also clears it up for the human eye. Conditional expressions can be used to improve messages printed for users. For example, when reporting the number of files processed, instead of printing '0 file(s)' or '1 file(s)', we can use a couple of conditional expressions:
print('{0} file{1}'.format((count if count != 0 else 'no'), ('s' if count != 1 else '')))
What a beautiful and succinct expression. Here we are calling format() on the string that gets printed. For replacement field {0} we pass the value of count if it doesn't equal zero, and 'no' if it does. For replacement field {1} we pass 's' if count doesn't equal 1, and an empty string otherwise.
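As a quick check of the precedence trap and the pluralization trick described above, here is a small sketch. The helper names width_wrong(), width_right() and files_message() are made up for this illustration and aren't from the book.

```python
def width_wrong(margin):
    # Parsed as (100 + 10) if margin else 0
    return 100 + 10 if margin else 0


def width_right(margin):
    # The parentheses limit the conditional expression to the extra 10
    return 100 + (10 if margin else 0)


def files_message(count):
    # Two conditional expressions pick the number word and the plural suffix
    return '{0} file{1}'.format(count if count != 0 else 'no',
                                's' if count != 1 else '')


print(width_wrong(False), width_right(False))  # 0 100
for n in (0, 1, 2):
    print(files_message(n))  # no files, then 1 file, then 2 files
```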
Looping
Python gives us a while loop and a for...in loop, both of which have a more sophisticated syntax than the basics we showed in Chp. 1.
while Loops
This is the general syntax of a while loop:
while boolean_expression:
    while_suite
else:
    else_suite
The else clause is optional. As long as the boolean_expression is True, the while block's suite is executed. If the expression becomes False (or is False in the first place), the loop terminates, and if the else clause is present, its suite is then executed. Inside the while block's suite, if a continue statement is executed, control is immediately returned to the top of the loop, and the boolean_expression is evaluated again. If the loop does not terminate normally, any optional else clause's suite is skipped. The else clause is easily misunderstood because of its name: its suite is always executed if the loop terminates normally. If the loop is broken out of due to a break statement or a return statement (the return statement would be used in a function or method), or if an exception is raised, the else clause's suite is NOT executed. If an exception occurs, Python skips the else clause and looks for a suitable exception handler. Even though this is slightly confusing, the else clause behaves the same way for while loops, for...in loops, and try...except blocks.
Let's see the else clause in action. The str.index() and list.index() methods return the index pos of a given string or item, or they raise a ValueError if the str or item isn't found. The str.find() method does the same thing, but on failure instead of raising an exception it returns an index of -1. There is no equivalent method of str.find() for lists but if we wanted a function that did this, we could create one using a while loop like so:
def list_find(list, target):
    index = 0
    while index < len(list):
        if list[index] == target:
            break
        index += 1
    else:
        index = -1
    return index
In this function we search the given list for the target. If the target is found, the break statement terminates the loop, causing the appropriate index position to be returned. If the target is not found, the loop runs to completion and terminates normally. After normal termination, the else suite is executed, and the index position is set to -1 and returned.
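To see both exits from the loop, here is a sketch that repeats the function and calls it on a small made-up list. The parameter is renamed to items (my own choice, so the built-in list isn't shadowed); everything else follows the description above.

```python
def list_find(items, target):
    index = 0
    while index < len(items):
        if items[index] == target:
            break          # found: the else suite is skipped
        index += 1
    else:
        index = -1         # loop ended normally: target is absent
    return index


colors = ['red', 'green', 'blue']
print(list_find(colors, 'green'))   # 1
print(list_find(colors, 'pink'))    # -1
print(list_find([], 'red'))         # -1 (the condition is False from the start)
```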
for Loops
Like a while loop, the full syntax of the for...in loop also includes an optional else clause:
for expression in iterable:
    for_suite
else:
    else_suite
The expression is normally either a single variable or a sequence of variables, usually in the form of a tuple. If a tuple or list is used for the expression, each item is unpacked into the expression's items. If a continue statement is executed inside the for...in loop's suite, control is immediately passed to the top of the loop and the next iteration begins. If the loop runs to completion it terminates, and any else suite is executed. If the loop is broken out of due to a break statement or a return statement, or if an exception is raised, the else clause's suite isn't executed. Below is a for...in loop version of the list_find() function. Like the last one, it shows the else clause in action:
def list_find(list, target):
    for index, x in enumerate(list):
        if x == target:
            break
    else:
        index = -1
    return index
The variables created in the for...in loop's expression continue to exist after the loop has terminated. Like all local variables, they cease to exist at the end of their enclosing scope. (I think this might be the first time that this text talked about scope at all!)
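Here is a small sketch of those two points: unpacking each item into several loop variables, and the loop variables surviving after the loop ends. The pairs data is made up for the illustration.

```python
pairs = [('a', 1), ('b', 2), ('c', 3)]

for letter, number in pairs:      # each tuple is unpacked into two variables
    if number == 2:
        break
else:
    letter, number = None, -1     # only runs if the loop never hits break

# The loop variables are still bound after the loop has finished
print(letter, number)             # b 2
```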
Exception Handling
Python indicates errors and exceptional conditions by raising exceptions, although some third-party libraries use old-fashioned error return values.
Catching and Raising Exceptions
Exceptions are caught using try...except blocks, whose general syntax looks like this:
try:
    try_suite
except exception_group1 as err:
    except_suite1
...
except exception_groupN as errN:
    except_suiteN
else:
    else_suite
finally:
    finally_suite
There must be at least one except block, unless a finally block is present; both the else and finally blocks are optional. The else block's suite is executed when the try block's suite has finished normally, and it is not executed if an exception occurs. If there is a finally block, it is always executed at the end. Each except clause's exception group can be a single exception or a parenthesized tuple of exceptions. For each group, the as err part is optional; if we use it, the err variable will contain the exception that occurred and can be accessed in the except block's suite. If an exception occurs in the try block's suite, each except clause is tried in turn. If the exception matches an exception group, the corresponding suite is executed. To match an exception group, the exception must be of the same type as the (or one of the) exception types listed in the group, or a subclass of the (or one of the) listed types.
For example, if a KeyError exception occurs in a dictionary lookup, the first except clause that has an Exception class will match, since KeyError is an (indirect) subclass of Exception. If no group lists Exception (as is normally the case), but one does list LookupError, the KeyError will match, because KeyError is a subclass of LookupError. If no group lists Exception or LookupError, but one does list KeyError, then that group will match. Practically speaking, we want to capture an error as specifically as possible. To do so, we should list the most specific (and most likely) exception we are expecting in the first except block, then a more general one in the next, and so on. Accordingly, it is bad practice to just write except Exception: because it is lazy, and whoever does it needs to study the exception hierarchy. With this kind of lazy exception capturing, we are very likely to capture things that we don't want to capture, casting our net too widely. It's even worse to write a bare except: without naming any exception class, since that also catches exceptions such as KeyboardInterrupt that don't inherit from Exception.
If Python can't get any of the except blocks to match the exception, it will work its way up the call stack looking for a suitable handler. If none is found, the program will terminate and print the exception and a traceback on the console. If no exceptions occur, any optional else block is executed. And in all cases (if no exceptions occur, if an exception occurs and is handled, or if an exception occurs that is passed up the call stack) any finally block's suite is always executed. If no exceptions occur, or if an exception occurs and is handled by one of the except blocks, the finally block's suite is executed at the end. However, if an exception occurs that doesn't match, first the finally block's suite is executed, and then the exception is passed up the call stack. This guarantee of execution can be very useful when we want to ensure that resources are properly released.
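The order in which the suites run can be traced with a small sketch. The lookup() function below is hypothetical: it records which suites were executed for a successful lookup, a KeyError, and an IndexError (which, like KeyError, is a subclass of LookupError).

```python
def lookup(mapping, key):
    """Return a list of the steps taken while looking up key."""
    steps = []
    try:
        value = mapping[key]
    except KeyError:
        steps.append('KeyError handler')       # most specific first
    except LookupError:
        steps.append('LookupError handler')    # more general fallback
    else:
        steps.append('else: got {0}'.format(value))
    finally:
        steps.append('finally')                # runs in every case
    return steps


print(lookup({'a': 1}, 'a'))   # ['else: got 1', 'finally']
print(lookup({'a': 1}, 'z'))   # ['KeyError handler', 'finally']
print(lookup([10, 20], 5))     # ['LookupError handler', 'finally']
```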
Here is a final version of the list_find() function that uses exception handling:
def list_find(list, target):
    try:
        index = list.index(target)
    except ValueError:
        index = -1
    return index
How cool! Here we effectively used the try...except block to turn an exception into a return value! The same approach can also be used to catch one kind of exception and raise another instead.
We can also use a try...finally block which is sometimes useful.
try:
    try_suite
finally:
    finally_suite
No matter what happens in the try block's suite (short of the interpreter itself crashing!), the finally block will always be executed. The 'with' statement used with a context manager can achieve a similar effect to using a try...finally block. One common pattern of use for try...except...finally blocks is for handling file errors. Review this snippet:
def read_data(filename):
    lines = []
    fh = None
    try:
        fh = open(filename, encoding='utf-8')
        for line in fh:
            if line.strip():
                lines.append(line)
    except (IOError, OSError) as err:
        print(err)
        return []
    finally:
        if fh is not None:
            fh.close()
    return lines
First, we set the file object to None because it is possible that the open() call will fail, in which case an exception is raised, nothing is assigned to fh, and it remains None. If one of the exceptions we have specified occurs (either IOError or OSError), after printing the error message we return an empty list. Note that before returning, the finally block's suite will be executed, so the file will be closed safely, if it had been opened in the first place. Notice as well that if an encoding error occurs, even though we don't catch the relevant exception (UnicodeDecodeError), the file will still be safely closed. In such cases the finally block's suite is executed and then the exception is passed up the call stack. Later we will see a more compact idiom for ensuring that files are safely closed that does not require a finally block.
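For comparison, here is a sketch of the same function written with the 'with' statement, which closes the file for us. This is my own variation, not the book's listing. It catches only OSError, since IOError has been an alias of OSError since Python 3.3. The temporary file and its contents are made up so the example can run anywhere.

```python
import os
import tempfile


def read_data(filename):
    lines = []
    try:
        # The file is closed when the with block is left, error or not
        with open(filename, encoding='utf-8') as fh:
            for line in fh:
                if line.strip():
                    lines.append(line)
    except OSError as err:
        print(err)
        return []
    return lines


# Try it on a throwaway file and on a path that should not exist
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'sample.txt')
    with open(path, 'w', encoding='utf-8') as out:
        out.write('first\n\n   \nsecond\n')
    found = read_data(path)
    missing = read_data(os.path.join(folder, 'no_such_file.txt'))

print(found)     # ['first\n', 'second\n']
print(missing)   # []
```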
Raising Exceptions
Exceptions provide a useful means of changing the flow of control. We can take advantage of this either by using the built-in exceptions, or by creating our own, raising either kind when we want to. There are three syntaxes for raising exceptions:
raise exception(args)
raise exception(args) from original_exception
raise
When the first syntax is used, the exception that is specified should be either one of the built-in exceptions or a custom exception that is derived from Exception. If we give the exception some text as its argument, this text will be output if the exception is printed when it is caught. The second syntax is a variation of the first: the exception is raised as a chained exception (covered more later) that includes the original_exception. This syntax is used inside except suites. When the third syntax is used, that is, when we don't specify an exception, 'raise' will reraise the currently active exception, and if there is no active exception it will raise a RuntimeError.
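A minimal sketch of the three forms follows. ConfigError and get_port() are hypothetical names invented for this example. The last part shows what a bare raise does on Python 3 when no exception is being handled.

```python
class ConfigError(Exception):
    """Hypothetical custom exception used only in this sketch."""


def get_port(settings):
    try:
        return int(settings['port'])
    except KeyError as err:
        # First and second syntax: a new exception, chained to the original
        raise ConfigError('no port configured') from err
    except ValueError:
        # Third syntax: re-raise the exception currently being handled
        raise


try:
    get_port({})
except ConfigError as err:
    chained_from = type(err.__cause__).__name__
    print(err, '<-', chained_from)          # no port configured <- KeyError

reraised = False
try:
    get_port({'port': 'eighty'})
except ValueError as err:
    reraised = True
    print('re-raised:', err)

try:
    raise                                   # nothing is being handled here
except RuntimeError as err:
    bare_raise_error = type(err).__name__
    print(bare_raise_error, err)
```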
Custom Exceptions
Custom exceptions are custom data types (also known as classes). Creating classes is covered in Chp 6, but since it is easy to create simple custom exception types, we will show the syntax here:
class exceptionName(baseException): pass
The base class should be Exception or a class that inherits from Exception. One use of custom exceptions is to break out of deeply nested loops. For example, if we have a 'table' object that holds records (rows), which hold fields (columns), which have multiple values (items), we could search for a particular value with code like this:
found = False
for row, record in enumerate(table):
    for column, field in enumerate(record):
        for index, item in enumerate(field):
            if item == target:
                found = True
                break
        if found:
            break
    if found:
        break
if found:
    print('found at ({0}, {1}, {2})'.format(row, column, index))
else:
    print('not found')
These fifteen lines of code are complicated by the fact that we must break out of each loop separately. Alternatively, we can use a custom exception:
class FoundException(Exception): pass
try:
    for row, record in enumerate(table):
        for column, field in enumerate(record):
            for index, item in enumerate(field):
                if item == target:
                    raise FoundException()
except FoundException:
    print('found at ({0}, {1}, {2})'.format(row, column, index))
else:
    print('not found')
This cuts our code down to ten or eleven lines (if we use the else block) and it is also much easier to read. If the item is found we raise our custom exception and the except block's suite is executed. This means that the else block will be skipped. And if the item isn't found, no exception is raised and the else block then executes.
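To try this out, here is a sketch that wraps the same idea in a function and runs it on a small made-up table. The name find_in_table() and the sample data are assumptions for the illustration.

```python
class FoundException(Exception):
    pass


def find_in_table(table, target):
    """Return (row, column, index) of the first match, or None."""
    try:
        for row, record in enumerate(table):
            for column, field in enumerate(record):
                for index, item in enumerate(field):
                    if item == target:
                        raise FoundException()   # leaves all three loops
    except FoundException:
        return (row, column, index)
    else:
        return None


# Two records, each with two fields, each field holding several items
table = [
    [['a', 'b'], ['c']],
    [['d'], ['e', 'f', 'g']],
]
print(find_in_table(table, 'f'))   # (1, 1, 1)
print(find_in_table(table, 'z'))   # None
```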
The next example shows different ways we can handle exceptions. It is an excerpt from a program that reads all the HTML files it is given on the command line and performs some simple tests to verify that tags begin with "<" and end with ">", and also that entities are correctly formed. These are the four custom exceptions:
class InvalidEntityError(Exception): pass
class InvalidNumericEntityError(InvalidEntityError): pass
class InvalidAlphaEntityError(InvalidEntityError): pass
class InvalidTagContentError(Exception): pass
The second and third exceptions inherit from the first; we will see why this is useful. The parse() function that uses the exceptions is more than 70 lines long, so here are only the relevant parts that show exception handling:
fh = None
try:
    fh = open(filename, encoding='utf-8')
    errors = False
    for lino, line in enumerate(fh, start=1):
        for column, c in enumerate(line, start=1):
            try:
                # ... (rest of the excerpt is not reproduced here)
The code begins by setting the file object to None and putting all the file handling in a try block. The program reads the file line by line and reads each line character by character. Notice that we have two try blocks: the outer one is used to handle file object exceptions, and the inner one is used to handle parsing exceptions. This program uses something called 'states', and I'm not sure if we have covered that yet in this book. We can use isinstance() to sort between exceptions. The built-in isinstance() function returns True if its first argument is an instance of the type given as its second argument, or of one of that type's subclasses. We could have used a separate except block for each of the three custom parsing exceptions, but in this program combining them means that we can avoid repeating the last four lines (from the print() call to raise) in each one. The program has two modes of use. If skip_on_first_error is False, the program continues checking a file even after a parsing error has occurred. This can lead to multiple error messages being output for each file. If skip_on_first_error is True, once a parsing error has occurred, after the (one and only) error message is printed, 'raise' is called to reraise the parsing exception and the outer (per-file) try block is left to catch it.
elif state == PARSING_ENTITY:
    raise EOFError("missing ';' at end of " + filename)
At the end of parsing a file, we need to check whether we have been left in the middle of an entity. If we have, we raise an EOFError, the built-in end-of-file exception, but we give it our own message text. If you would like to review this section, it is on page 171.
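The inner handler of parse() isn't reproduced in these notes, so here is only a hypothetical sketch of the idea described earlier in this section: one except clause catches the custom parsing exceptions, and isinstance() sorts between them. The describe() function and its messages are invented for this illustration.

```python
class InvalidEntityError(Exception): pass
class InvalidNumericEntityError(InvalidEntityError): pass
class InvalidAlphaEntityError(InvalidEntityError): pass
class InvalidTagContentError(Exception): pass


def describe(exception_class):
    """Raise the given exception and report how one handler sorts it."""
    try:
        raise exception_class()
    except (InvalidEntityError, InvalidTagContentError) as err:
        # isinstance() is also True for subclasses of the given type,
        # so the most specific classes are tested first
        if isinstance(err, InvalidNumericEntityError):
            return 'invalid numeric entity'
        elif isinstance(err, InvalidAlphaEntityError):
            return 'invalid alpha entity'
        elif isinstance(err, InvalidEntityError):
            return 'invalid entity'
        return 'invalid tag content'


print(describe(InvalidAlphaEntityError))   # invalid alpha entity
print(describe(InvalidTagContentError))    # invalid tag content
```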
Custom Functions
Functions are a means by which we can package up and parameterize functionality. There are four kinds of functions which can be created in Python:
Global functions
Local functions
Lambda functions
Methods
Every function we have created so far has been a global function. Global objects (which includes functions) are accessible to any code in the same module (i.e., the same .py file) in which the object is created. Global objects can also be accessed from other modules, as we go over in the next chapter. Local functions (also called nested functions) are functions that are defined inside other functions. These are visible only to the function in which they are defined. They are especially useful for creating small helper functions that have no use elsewhere. They are shown in chapter 7.
Note: the most frequently used online documentation for new users is the Library Reference, and for experienced users the Global Module Index. Both of these docs have links to pages covering Python's entire standard library, and the Library Reference also links to pages covering all of Python's built-in functionality. Inside the Python interpreter, we can use dir(), which lists the names of the attributes (including the methods) of its argument; pass it a module to see what the module provides.
Lambda functions are expressions, so they can be created at their point of use; they are much more limited than normal functions.
Methods are functions that are associated with a particular data type and can be used only in conjunction with that data type (more on those in chapter 6).
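Here is a short sketch of a local function, a lambda function, and a method side by side. The names normalized_total() and clean() and the sample data are made up for the illustration.

```python
def normalized_total(values):
    # Local (nested) function: only visible inside normalized_total()
    def clean(value):
        return abs(float(value))

    return sum(clean(v) for v in values)


# Lambda function: an expression, created right where it is needed
words = ['pear', 'fig', 'banana']
by_length = sorted(words, key=lambda word: len(word))

print(normalized_total(['-1.5', 2, -3]))   # 6.5
print(by_length)                           # ['fig', 'pear', 'banana']
print('abc'.upper())                       # upper() is a method of str: ABC
```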
Python provides many built-in functions, and the standard library and third-party libraries add hundreds more (thousands counting all the methods), so the function we want may already have been written. Because of this, check the docs first. The general syntax for creating a global or local function is:
def functionName(parameters):
    suite
The parameters are optional; if there is more than one, they are written as a sequence of comma-separated identifiers, or as a sequence of identifier=value pairs as we'll go over shortly. Here is a function that calculates the area of a triangle using Heron's formula:
import math

def heron(a, b, c):
    s = (a + b + c) / 2
    return math.sqrt(s * (s - a) * (s - b) * (s - c))
Inside the function, each parameter (a, b, and c) is initialized with the corresponding value that was passed as an argument. When the function is called, we must supply all of the args, for example: heron(3, 4, 5). We will get a TypeError if we supply too few or too many args. When we do a call like this we are said to be using 'positional arguments', because each arg that is passed is set as the value of the param in the corresponding position. So in this case, a is set to 3, b to 4, and c to 5 when we call the function. Every function in Python returns a value, although it is acceptable and common to ignore the returned value. The return value is either a single value or a tuple of values, and the values returned can be collections, so there is no practical limitation on what we can return. We can leave a function at any point by using the return statement. If we use return with no args, or if we don't have a return statement at all, the function will return None. (In chp 6 we will cover yield statements, which can be used instead of return in certain kinds of functions.) Some functions have parameters for which there can be a sensible default. For example, here is a function that counts the letters in a string, defaulting to the ASCII letters:
import string

def letter_count(text, letters=string.ascii_letters):
    letters = frozenset(letters)
    count = 0
    for char in text:
        if char in letters:
            count += 1
    return count
We have specified a default value for the letters parameter by using the parameter=default syntax. This allows us to call letter_count() with just one argument. The parameter syntax does not permit a parameter with a default value to be followed by a parameter without one, so def bad(a, b=1, c) won't work. On the other hand, we aren't forced to pass our arguments in the order they appear in the function's definition. We can use 'keyword' arguments, passing each argument in the form name=value. This next function returns the string it is given, or, if it's longer than the specified length, a shortened version with an indicator (an ellipsis by default) added:
def shorten(text, length=25, indicator='...'):
    if len(text) > length:
        text = text[:length - len(indicator)] + indicator
    return text
Because both length and indicator have default values, they can be omitted when we call the function, in which case the default values for both will be used. We can also call this function using the name=value syntax, ignoring the order in which the parameters were declared, like so:
shorten(length=7, text='This is some text', indicator='&')
Also note that positional arguments can't follow keyword arguments; this is a syntax error. The difference between a mandatory parameter and an optional parameter is that a parameter with a default is optional, because Python will use the default value given when the function was created. A param with no default is mandatory because Python ain't gonna go and start making guesses about what you meant. Using default parameter values carefully can simplify our code and make calls look a lot nicer. Remember that the built-in open() function has one mandatory arg (the filename) and several optional arguments. By using a mixture of positional and keyword args we can specify the args we care about and omit the others, which then take their default values. Another nice thing about keyword args is that they make Boolean args more readable.
When default values are given, they are created at the time the def statement is executed, not when the function is called. For immutable arguments like numbers and strings this doesn't make any difference, but for mutable arguments a subtle trap is lurking!
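The calls below sketch the different ways of supplying arguments to shorten(), which is repeated so the example runs on its own. The sample string is just an illustration.

```python
def shorten(text, length=25, indicator='...'):
    if len(text) > length:
        text = text[:length - len(indicator)] + indicator
    return text


print(shorten('The Silkie'))                            # The Silkie
print(shorten('The Silkie', 7))                         # The ...
print(shorten(length=7, text='The Silkie'))             # The ...
print(shorten('The Silkie', indicator='&', length=7))   # The Si&

missing_text_error = None
try:
    shorten()            # the mandatory text argument is missing
except TypeError as err:
    missing_text_error = err
    print('TypeError:', err)
```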
def append_if_even(x, list=[]): #this is wrong don't do it
    if x % 2 == 0:
        list.append(x)
    return list
When this function is created, the list parameter's default is set to refer to a new list. Whenever the function is called with just the first argument, the default list is the one that was created at the same time as the function itself, so no new list is created and items appended by earlier calls are still in it. We usually don't want this behavior; we would expect a new empty list to be created each time the function is called with no second argument. This is the idiom that we want to use for default mutable arguments:
def append_if_even(x, list=None):
    if list is None:
        list = []
    if x % 2 == 0:
        list.append(x)
    return list
As you can see, we give None as the default value and then create a list inside the function if the second argument was left out. This gives us a new empty list each time the function is called, instead of only once when it is defined. Clever! This idiom should be used for dictionaries, lists, sets, and any other mutable data types that we want to use as default arguments. This is a slightly shorter version which shares the same behavior:
def append_if_even(x, list=None):
    list = [] if list is None else list
    if x % 2 == 0:
        list.append(x)
    return list
Here we utilized a conditional expression in order to save a line of code for each parameter that has a mutable default argument. The savings may look minimal, but the beauty is there!
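To make the difference visible, this sketch puts the trap and the idiom next to each other. The name append_if_even_shared() and the parameter name lst are my own, chosen so both versions can coexist and the built-in list isn't shadowed.

```python
def append_if_even_shared(x, lst=[]):      # the trap: one list for all calls
    if x % 2 == 0:
        lst.append(x)
    return lst


def append_if_even(x, lst=None):           # the idiom: a new list per call
    lst = [] if lst is None else lst
    if x % 2 == 0:
        lst.append(x)
    return lst


print(append_if_even_shared(2))   # [2]
print(append_if_even_shared(4))   # [2, 4]  <- the earlier 2 is still there
print(append_if_even(2))          # [2]
print(append_if_even(4))          # [4]
```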
Names and Docstrings
Using good names for a function and its parameters goes a long way toward making the purpose and use of the function clear to other programmers (and to us, when we forget!). Here are a few rules of thumb to consider:
~Use a naming scheme, and use it consistently. In this book we use:
UPPERCASE for constants
TitleCase for classes (which includes exceptions)
camelCase for GUI functions and methods
lowercase or lower_case_with_underscores for everything else
~For all names, avoid abbreviations, unless they are both standardized and widely used.
~Be proportional with variable and parameter names: x is a perfectly good name for an x-coordinate and i is fine for loop counters, but in general the name should be long enough to be descriptive. The name should describe the data's meaning rather than its type (e.g., amount_due instead of money), unless the use is generic to a particular type.
~Functions and methods should have names that say what they do or what they return (depending on their emphasis), but never how they do it (because that might change).
Here are a few examples:
def find(l, s, i=0): #bad
def linear_search(l, s, i=0): #bad
def first_index_of(sorted_name_list, name, start=0): #GOOD
All three of these functions return the index position of the first occurrence of a name in a list of names, starting from the given starting index and using an algorithm that assumes the list is already sorted. The first one is bad because the name gives no clue as to what will be found, and its params indicate the required types but not what they mean. The second is bad because the function name describes the algorithm used, and that might have changed since. This may not matter to users of the function, but it will probably confuse maintainers if the name implies a linear search while the algorithm implemented has been changed to a binary search. The third is good because the function is named to describe what is returned and the params indicate what is expected. None of these functions has any way of indicating what happens if the name isn't found. Whatever does happen should be documented for users. We can add documentation to a function by using a 'docstring'. This is a string that comes immediately after the def line, and before the function's code proper begins. This would look like so:
def sorts_base_pairs(a, b, c):
    """Herein is the docstring which may span multiple
    lines and may be longer than the actual function
    """
It's conventional to make the first line of the docstring a brief one-line description. Then have a blank line followed by a full description, and then to reproduce some examples as they would appear if typed in interactively.
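As an illustration of that layout, here is a sketch of shorten() with a docstring that has a one-line summary, a blank line, a fuller description, and interactive-style examples. The wording of the docstring is my own.

```python
def shorten(text, length=25, indicator='...'):
    """Return text, or a truncated version of it with indicator appended.

    The result is never longer than length characters (assuming the
    indicator itself is shorter than length).

    >>> shorten('The Silkie', 7)
    'The ...'
    >>> shorten('Short')
    'Short'
    """
    if len(text) > length:
        text = text[:length - len(indicator)] + indicator
    return text


# The docstring is stored on the function and is what help() displays
print(shorten.__doc__.splitlines()[0])
```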
Argument and Parameter Unpacking
We saw earlier that we can use the sequence unpacking operator (*) to supply positional arguments. We can also use the sequence unpacking operator in a function's parameter list. This is useful when we want to create functions that can take a variable number of positional arguments. Here is a product() func that computes the product of the arguments it is given:
def product(*args):
    result = 1
    for arg in args:
        result *= arg
    return result
This function has only one parameter called 'args'. Putting the * in front means that inside the function the args parameter will be a tuple with its items set to however many positional arguments are given. Here are a few example calls:
product(1,2,3,4) #args == (1,2,3,4); returns: 24
product(5,3,8) #args == (5,3,8); returns: 120
product(11) #args == (11,); returns 11
We can have keyword arguments following positional arguments, as this function to calculate the sum of its arguments, each raised to the given power, shows:
def sum_of_powers(*args, power=1):
    result = 0
    for arg in args:
        result += arg ** power
    return result
This sweet little function can be called with just positional args: sum_of_powers(1,3,5) or with positional and keyword args: sum_of_powers(1,3,5, power=2). It's also possible to use * as a parameter on its own. This is used to signify that there can be no positional args after the *, although keyword args are allowed. Here is a function that takes exactly three positional args and has one optional default argument.
import math

def heron2(a, b, c, *, units="square meters"):
    s = (a + b + c) / 2
    area = math.sqrt(s * (s - a) * (s - b) * (s - c))
    return "{0} {1}".format(area, units)
Remember, we can't call this function like so:
heron2(24,25,12,'sq inches') #this raises a TypeError
We must use the keyword=value syntax for units, because otherwise Python treats the value as a fourth positional argument, and by putting the * in the parameter list we have disallowed more than three positional args. We can also put the * first in order to prevent any positional arguments from being used during a call, like so:
def print_setup(*, paper="Letter", copies=1, color=False):
    ...  # function body not shown in these notes
Just as we can unpack a sequence to populate a function's positional arguments, we can also unpack a mapping using the mapping unpacking operator (**). We can use ** to pass a dictionary to the print_setup() function, like so:
options = dict(paper="A4", color=True)
print_setup(**options)
Using this syntax, the key-value pairs of the 'options' dictionary are unpacked, with each key's value being assigned to the parameter whose name is the same as the key. If the dictionary contains a key for which there is no corresponding param, a TypeError is raised. Any parameter for which the dictionary has no corresponding item is set to its default value; however, if the function has no default value for that parameter, a TypeError is raised.
We can also use the mapping unpacking operator with parameters. This allows us to create functions that will accept as many keyword arguments as are given. This function below takes an SSN and a surname as positional args, and any number of keyword args:
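The sketch below exercises print_setup() with mapping unpacking and with two calls that should be rejected. The function body is hypothetical (the notes only give the signature); it just returns a description of the settings it received.

```python
def print_setup(*, paper="Letter", copies=1, color=False):
    # Hypothetical body: just report the settings that were received
    return "{0}, {1} copies, color={2}".format(paper, copies, color)


options = dict(paper="A4", color=True)
print(print_setup(**options))        # A4, 1 copies, color=True

rejected = []
for bad_call in (lambda: print_setup("A4"),                # positional arg
                 lambda: print_setup(**{"size": "A4"})):   # unknown key
    try:
        bad_call()
    except TypeError as err:
        rejected.append(str(err))
        print("TypeError:", err)
```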
def add_person_details(ssn, surname, **kwargs):
    print('SSN =', ssn)
    print(" surname =", surname)
    for key in sorted(kwargs):
        print(' {0} = {1}'.format(key, kwargs[key]))
This function could be called with just two positional args, or with additional info:
add_person_details(83272127, "Luther", forename="Lexis", age=45)
This gives us more flexibility and we can accept both a variable number of positional args and a variable number of keyword args.
def print_args(*args, **kwargs):
    for i, arg in enumerate(args):
        print('positional argument {0} = {1}'.format(i, arg))
    for key in kwargs:
        print('keyword argument {0} = {1}'.format(key, kwargs[key]))
This function just prints the arguments that it is given. We can call it with no arguments, or with any number of positional and keyword arguments.
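A couple of sample calls, sketched with made-up data. The second call unpacks a list and a dictionary at the call site, and its output is captured with contextlib.redirect_stdout() only so that it can be inspected afterwards.

```python
import contextlib
import io


def print_args(*args, **kwargs):
    for i, arg in enumerate(args):
        print('positional argument {0} = {1}'.format(i, arg))
    for key in kwargs:
        print('keyword argument {0} = {1}'.format(key, kwargs[key]))


print_args('Hopper', 'Lovelace', language='Python', year=2018)

# Sequence and mapping unpacking work at the call site too
names = ['Hopper', 'Lovelace']
details = {'language': 'Python'}
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):   # capture the output for checking
    print_args(*names, **details)
captured = buffer.getvalue().splitlines()
print(captured)
```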
Accessing Variables in the Global Scope
Sometimes it's convenient to have a few global variables that are accessed by various functions in the program. This is usually okay for constants, but it isn't good practice for variables, although for short one-off programs it isn't unreasonable.
Note: a program can keep its current language in a global variable, like so:
Language = 'en' #this sets it to English
Remember, constants (like ENGLISH and FRENCH below) are best indicated by UPPERCASE names. Python does not have a direct way to create constants; instead it relies on programmers to respect the convention. Elsewhere in the program we access the Language variable, and use it to choose the appropriate dictionary to use:
def print_digits(digits):
    dictionary = ENGLISH if Language == "en" else FRENCH
    for digit in digits:
        print(dictionary[int(digit)], end=" ")
    print()

When Python encounters the Language variable in the function, it looks in the local (function) scope and, lo and behold, finds it not. So it then looks in the global (.py file) scope and finds it there! The end keyword arg used with the first print() call deserves an explanation. The print() function accepts any number of positional arguments plus the keyword arguments sep, end, and file. All the keyword args have defaults. The sep param's default is a space; if two or more positional args are given, each is printed with the 'sep' (separator) in between, but if there is just one positional arg this parameter does nothing. The 'end' param's default is '\n', which is why a newline is printed at the end of calls to print(). The 'file' param's default is sys.stdout, the standard output stream, which is usually the console.
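A few calls that sketch the effect of sep, end, and file. The io.StringIO buffer stands in for a file here; that choice is mine, and any object with a write() method would do.

```python
import io

print('a', 'b', 'c')              # a b c   (sep defaults to a space)
print('a', 'b', 'c', sep='-')     # a-b-c
print('no newline', end=' ')      # the next print continues on this line
print('here')

# file can be any object with a write() method, such as io.StringIO
buffer = io.StringIO()
print(1, 2, 3, sep=', ', end='!\n', file=buffer)
captured = buffer.getvalue()
print(repr(captured))             # '1, 2, 3!\n'
```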