Marquette University, view of Wisconsin Avenue  

Module 23

Comprehension in Action

We are looking at elegant Python solutions for a number of frequent tasks.

File System Interaction

To create a listing of a directory, we can import the module os. It has a function listdir that returns the names of all files and sub-directories in a directory. (On a Windows system, directories are called folders.) We can then use a list comprehension with the result of listdir as the iterable in order to filter for particular file names. For example, to get a listing of all files with extension ".py", we just write:

[filename for filename in os.listdir(directoryname) if filename.endswith(".py")]
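To try the comprehension above without touching your own files, here is a minimal sketch. The helper name list_python_files and the three file names are made up for illustration; the temporary directory is only there to make the example self-contained.

```python
import os
import tempfile


def list_python_files(directoryname):
    """Return the sorted names of all entries in directoryname ending in .py."""
    return sorted(
        filename
        for filename in os.listdir(directoryname)
        if filename.endswith(".py")
    )


# Build a throwaway directory with hypothetical file names.
with tempfile.TemporaryDirectory() as directoryname:
    for name in ("alpha.py", "beta.txt", "gamma.py"):
        open(os.path.join(directoryname, name), "w").close()
    python_files = list_python_files(directoryname)

print(python_files)  # ['alpha.py', 'gamma.py']
```

The result is sorted because listdir returns names in arbitrary order. Note that the filter looks only at the name, so a sub-directory whose name ends in ".py" would also be listed.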

Creating Sub-Dictionaries

Dictionary comprehension allows us to create sub-dictionaries. In this example, the new dictionary has only even keys, but the binding between keys and values remains unchanged.

{i: dictionary[i] for i in dictionary if i % 2 == 0}
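As a small worked example, assume a dictionary with integer keys (the sample data here is invented). The second comprehension is an equivalent variant that iterates over key and value pairs with items() instead of looking each value up again.

```python
# Hypothetical sample data with integer keys.
dictionary = {1: "one", 2: "two", 3: "three", 4: "four"}

# Keep only the even keys; each key keeps its original value.
even_keys = {i: dictionary[i] for i in dictionary if i % 2 == 0}
print(even_keys)  # {2: 'two', 4: 'four'}

# Equivalent variant using items().
even_items = {key: value for key, value in dictionary.items() if key % 2 == 0}
print(even_items == even_keys)  # True

# The original dictionary is left untouched.
print(len(dictionary))  # 4
```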

The zip Function

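The zip function allows us to combine lists into a sequence of tuples, or to take a list of tuples and split it back into sequences of individual values. In Python 3, zip returns an iterator, so we wrap it in list if we need an actual list. A potential trap is the lack of any warning when zipping lists of different lengths together. The result has the length of the shortest input, and the extra elements of the longer inputs are silently dropped.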
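The following sketch shows both directions and the length trap. The names and years are sample data chosen for illustration.

```python
names = ["Ada", "Grace", "Alan"]
years = [1815, 1906, 1912]

# Combine two lists into a list of tuples.
pairs = list(zip(names, years))
print(pairs)  # [('Ada', 1815), ('Grace', 1906), ('Alan', 1912)]

# Go the other way: unpack the pairs with * to get the columns back.
# Each column comes back as a tuple, not a list.
unzipped_names, unzipped_years = zip(*pairs)
print(unzipped_names)  # ('Ada', 'Grace', 'Alan')

# The trap: the third element of the longer list is dropped without warning.
short = list(zip([1, 2, 3], ["a", "b"]))
print(short)  # [(1, 'a'), (2, 'b')]
```

On Python 3.10 and later, zip also accepts the keyword argument strict=True, which raises a ValueError when the inputs have different lengths instead of truncating silently.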

Deep Copying versus Shallow Copying

Python allows you to build arbitrarily complicated data structures, where for example a list can contain a dictionary whose values are tuples, and so on. Problems can arise if we want to make a true copy of such a data structure. A simple assignment a = b does not copy b but gives the same object two names. If I change the object using either name, the change is visible through the other name, because there is only one object. To create a true copy of a list, I can for instance use a slice, as in listb = lista[:]. Now, if I add or replace elements in listb, then lista is not changed, and vice versa. Things become complicated with data structures that have several levels. For example, if lista contains a list l as an element, then the slice does not create a new copy of l. Both lista and listb refer to the same l. If I change l through lista, the change will be visible if I access l through listb. However, if I delete l from lista, then it will not be deleted from listb. It is really that confusing until I carefully reconstruct what each operation does.
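It can help to replay these operations one at a time. In the sketch below, inner plays the role of the nested list l, and alias is a second name added here to show plain assignment.

```python
inner = [1, 2]
lista = [inner, "x"]

alias = lista      # assignment: a second name for the same list object
listb = lista[:]   # slice: a new outer list that shares its elements

# A change made through alias shows up in lista, but not in listb.
alias.append("y")
print(lista)  # [[1, 2], 'x', 'y']
print(listb)  # [[1, 2], 'x']

# The nested list is shared, so changing it through listb is visible in lista.
listb[0].append(3)
print(lista[0])  # [1, 2, 3]

# Deleting the nested list from lista only removes lista's reference to it.
del lista[0]
print(lista)  # ['x', 'y']
print(listb)  # [[1, 2, 3], 'x']
```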

If a copy operation allows this behavior, then it is called a "shallow copy". Otherwise, it is a "deep copy". Because good deep copies can be difficult to implement, Python 3 has a module for it, called copy. Its copy function creates shallow copies (which is sometimes all that is needed) and its deepcopy function creates deep copies, but its execution is more involved and takes longer.
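A short comparison of the two functions of the copy module, using a made-up list of records (a list that contains a dictionary whose values include a list):

```python
import copy

# Hypothetical nested data: a list containing a dictionary containing a list.
records = [{"name": "Ada", "scores": [90, 85]}]

shallow = copy.copy(records)      # new outer list, same inner objects
deep = copy.deepcopy(records)     # new objects at every level

# Change the innermost list through the original.
records[0]["scores"].append(70)
print(shallow[0]["scores"])  # [90, 85, 70]  (shared with the original)
print(deep[0]["scores"])     # [90, 85]      (fully independent)

# Changes to the outer list itself do not reach either copy.
records.append({"name": "Grace", "scores": []})
print(len(shallow), len(deep))  # 1 1

print(shallow[0] is records[0])  # True
print(deep[0] is records[0])     # False
```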

Frankly, it will take most people more than a couple of months of programming to appreciate and understand the differences, but I felt it necessary to mention it somewhere in this class, lest you have to find out about this phenomenon on your own. I hope that having tried to understand it now, you will understand what happens in your own code a few years from now.