Python Style Part 1 - PEP8

CS106A does not just teach coding. It has always taught how to write clean code with good style. Your section leader will talk with you about the correctness of your code, and also give pointers for good style.

Messy code works poorly: it is hard to read and hard to debug. To help you develop the right habits, we want you to always turn in clean code, and of course all of our examples should also have good clean style.

The Word is Readable

The goal of good style is "readable" code, meaning that when someone sweeps their eye over the code, what that code does shines through.

Experience has shown that readable code takes less time to build and debug. Debugging especially can be a huge time-sink, so we will write clean code and keep it clean all the time. Like surgeons keeping their hands clean, it's the right way to work, and we want it to become a habit.

Readability 1 - Function and Variable Names

The first step to good readability is good variable and function names, like this example:

...
url = get_browser_url()
if has_security_problem(url):
    display_alert('That url looks sketchy!')
else:
    html = download_url(url)
    display_html(html)
...

The narrative steps the code takes are apparent from the good function and variable names (essentially the verbs and nouns in the narrative). No other comments are necessary here. The names alone make it readable.

The functions are the actions in the code, so we prefer verb names like "display_alert" or "remove_url" that say what the function does. Variables are like the nouns of the code, so we prefer names that reflect what that variable stores like "urls" or "original_image" or "resized_image". We'll talk about naming in more detail in the Style 2 document.
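As a small sketch of the verb/noun idea, here is the same computation written twice. The names used here (calc, compute_average_length, words) are made up for illustration and do not come from any CS106A assignment.

```python
# Hard to read: the names say nothing about the job being done.
def calc(a):
    b = 0
    for c in a:
        b += len(c)
    return b / len(a)


# Easier to read: a verb-style function name, noun-style variable names.
def compute_average_length(words):
    total_length = 0
    for word in words:
        total_length += len(word)
    return total_length / len(words)


print(compute_average_length(['red', 'green', 'blue']))   # 4.0
```

Both functions compute the same thing, but only the second one tells the reader what that thing is.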

Readability 2 - PEP8 Tactics

Python is developed as a collaborative, free and open source project. A "PEP" (Python Enhancement Proposal) is a written proposal used in Python development.

One of the earliest PEPs, PEP8, is a consensus set of style and formatting rules for writing "standard" style Python, so your code has the right look for anyone else reading or modifying it. This is very much like English spelling reform, where "public" is the standard spelling, and "publick" is out of use and looks weird and a little irritating.

Here are the most important PEP8 rules we will use. The code you turn in should abide by PEP8.

Indent 4 Spaces

Indent code with 4 spaces. A def/if/for/while line ends with a colon, and its contained lines begin on the next line, all indented 4 spaces, like this:

if color == 'red':
    color = 'green'
    print(color)

Early on with Python, some people used 2 spaces or tabs. Those practices have died out, and now 4 spaces is the standard.

Single Space Between Items

Most lines of code have a mix of variables and "operators" such as + * / = >=. Use a single space to separate the operators and the items around them from each other:

bound = x * 12 + 13
if x >= 4:
    print(x + 2)

Parentheses No Space

The left and right parentheses and square brackets do not have spaces separating them from their contents. The left parenthesis for a function call is right next to the function name:

run_along(1, 2 * 3, 'hi')
bark()
lst = [1, 2, z * 6]

Space After Comma/Colon

A comma or colon has 1 space after it, but no spaces before it.

foo(1, 'hello', 42)  # comma: 0 space before, 1 space after
[1, 2, 3]            # e.g. list
{'a': 1, 'b': 2}     # e.g. dict

Slice Exception

Slices are an exception to the above rule. It's acceptable for the colon in a slice to have no spaces, and that's how it's most often written, like this:

s[start:end + 1]     # slice - no spaces

The PEP8 rule is permissive: the slice colon should have either no spaces, as above, or 1 space on each side, like a +.
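Here is a minimal sketch of the two acceptable slice spacings side by side. The variable values are made up for the example.

```python
s = 'Python'
start = 1
end = 3

print(s[start:end + 1])      # no spaces around the colon, the common form
print(s[start : end + 1])    # 1 space on each side, also acceptable
```

Both lines print 'yth'. The spacing changes only how the slice looks, not what it computes.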

Use a Consistent Quote Mark

Python strings can be written within single quotes like 'Hello' or double quotes like "there". PEP8 requires a program to pick one quote style and use it consistently. For CS106A, we recommend just using single quotes. Single quote is a tiny bit easier since it does not require the shift key. Not thinking about which quote mark to use will free up valuable brain cells for something else.
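One practical wrinkle: PEP8 also suggests that when a string itself contains a quote character, using the other quote mark avoids a backslash. A short sketch, with made-up strings:

```python
greeting = 'Hello there'     # the usual single-quote style
escaped = 'Don\'t panic'     # works, but the backslash is noisy
cleaner = "Don't panic"      # same string, easier to read

print(escaped == cleaner)    # True
```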

2 Blank Lines Before Def

PEP8 requires 2 blank lines before each def in a file. This is one of the weaker PEP8 rules. We will not be very upset about your code if it has 3 lines between functions.

Blank Lines Within a Def

If the lines of code have natural phases (a few lines setting up the file, a few lines sorting the keys), use blank lines within the def to separate these phases from each other, much like dividing an essay into paragraphs. The standard says to use blank lines "sparingly".
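The two blank-line rules can be sketched together as follows. The count_words() function and its input text are hypothetical, chosen only because the code falls naturally into two phases.

```python
def count_words(text):
    # Phase 1: build up the counts dict
    counts = {}
    for word in text.lower().split():
        if word not in counts:
            counts[word] = 0
        counts[word] += 1

    # Phase 2: print the words in sorted order
    for word in sorted(counts.keys()):
        print(word, counts[word])

    return counts


def main():
    count_words('the cat saw the other cat')


main()
```

One blank line separates the phases inside count_words(), and 2 blank lines separate each def from whatever comes before it.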

Prefer != and not in

This is a preference between two equivalent forms which both work correctly. We prefer the "shortcut" forms for not-equal and not-in testing, like this:

if not s == 'donut':        # NO
    ...

if s != 'donut':            # YES
    ...


if not 'donut' in foods:    # NO
    ...

if 'donut' not in foods:    # YES
    ...

These forms arrange the code more in line with the natural phrasing, like "s is not donut" or "donut is not in foods".

Do Not Write: if x == True

The way that boolean True/False is evaluated in an if/while test is a little more complicated than it appears. The simple rule is this: do not write if x == True or if x == False in an if/while test.

Say we have a print_greeting() function that takes in a shout_mode parameter.

Do not write this:

def print_greeting(words, shout_mode):
    if shout_mode == True:   # NO not like this
        print(words.upper())
    else:
        print(words)

The if structure can distinguish True vs. False values itself. Therefore, write the code to give the boolean shout_mode to the if directly, like this:

def print_greeting(words, shout_mode):
    if shout_mode:           # YES like this
        print(words.upper())
    else:
        print(words)

To invert the logic do not use == False. Use not like this:

def print_greeting(words, shout_mode):
    if not shout_mode:
        print('Not shouting')
    else:
        print(words)
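One concrete reason behind this rule: an if-test treats many non-boolean values as true or false (for example, a non-empty string counts as true), while == True does a literal comparison. The sketch below uses a made-up helper, describe(), to show the two tests disagreeing.

```python
def describe(value):
    """Return how `if value` and `if value == True` treat the value."""
    plain_test = False
    if value:                # the preferred form
        plain_test = True

    equals_test = False
    if value == True:        # the discouraged form, shown for contrast
        equals_test = True

    return plain_test, equals_test


print(describe(True))        # (True, True)
print(describe('yes'))       # (True, False)
print(describe(6))           # (True, False)
print(describe(0))           # (False, False)
```

For 'yes' and 6, the plain if-test says true but the == True test says false, which is almost never what the programmer intended.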

Remove Pass from Code

The pass statement in Python is a placeholder line that literally does nothing. It is very rarely used in regular production code. However, in coding exercises, there is often some boilerplate code with a pass marking the spot for the student code. Remove all the pass markers from your code before turning it in.

# Inline Comments

It's a good practice to add # "inline" comments to explain what especially non-obvious or interesting code does. Inline comments are not needed when the code itself does a good job of showing what the line does.

1. For the following lines, the code does a good job of telling the story, so no inline comments are needed beyond the good variable and function names. Most lines of code are like this.

count += 1
total_len = len(name) + len(address)

In particular, avoid putting in a comment that restates what the code itself says.

i += 1  # add 1 to i

2. Inline comments are appropriate as in the example below, explaining what a non-obvious bit of code accomplishes - addressing the goal at work, not re-stating the operators used.

# increase i to the next multiple of 10
i = 10 * (i // 10 + 1)

The inline comment provides useful information that is not obvious from the line itself.

3. Another important use of inline comments is pointing out cases where code does not do what it appears to. This speaks to readability - the code does not do what its variable and function names suggest, so an inline comment can help clear things up.

vendors.add('AT&T')
vendors.add('Stanford University')
vendors.add('T-Mobile')
# Note: We pass 'T-Mobile' here, but for historical
# reasons the payment system re-writes and uses
# 'Vodafone' for this internally.

Example code used in teaching often has more inline comments than regular production code, since for teaching there are many times we want to point out or explain a technique shown on a line.
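As a quick check of the "next multiple of 10" line from point 2 above, here is a minimal sketch that wraps it in a made-up function, next_multiple_of_10(), and tries a few values.

```python
def next_multiple_of_10(i):
    # increase i to the next multiple of 10
    return 10 * (i // 10 + 1)


print(next_multiple_of_10(37))   # 40
print(next_multiple_of_10(40))   # 50, an exact multiple still moves up
print(next_multiple_of_10(0))    # 10
```

The case of 40 going to 50 is the kind of detail the operators alone do not make obvious, which is why the comment earns its place.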

Compare With ==

What is the common way to compare two values in Python? The answer is: use the == operator like this:

if word == 'meh':
    print('Enthusiasm low')

It works for strings, it works for ints, it works for lists, it works for pretty much everything. Knowing the == operator is the main thing. However, there is one exception shown below.

The is None Exception

The == comparison operator is the common way to compare any two values, and we will not mark off CS106A code that simply uses ==.

However there is a rule in PEP8 that requires a different form of comparison for None so you should learn it eventually. Also, you may notice your Python editor complaining about x == None comparisons because of this PEP8 rule.

PEP8 requires that comparisons to special singleton values like None should use the is or is not comparisons instead of ==, like this:

if word is None:             # YES, "is None"
    print('Nothing here')

if result is not None:       # YES, "is not None"
    print('Got result')

if word == None:             # NO, avoid "== None"
    print('Nada')

By far the most common application of this rule is for the value None, so you can think of this as the is None rule. The reasons for this requirement are a little obscure, but explained below if you are curious. If code accidentally uses == instead of is, the code will almost certainly work perfectly anyway. The cases where == would not compare to None correctly, are very, very, rare.

Unfortunately, the converse is not true. If code is written to use is where the programmer meant to use ==, it will generally not work right. Therefore, programmers should focus on == which is common and reliable. Programmers should not think of the is operator as something to use frequently, except for this one mandated case about is None. We do not mark off student code that uses == None; we just point out the preferred is None form once it has come up in lecture.
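Here is a small sketch of is going wrong where == was meant. The lists are made up for the example.

```python
a = [1, 2, 3]
b = [1, 2, 3]    # a separate list with the same contents
c = a            # another name for the very same list

print(a == b)    # True, the contents match
print(a is b)    # False, they are two distinct lists
print(a is c)    # True, both names refer to one list
```

Code that meant to ask "do these hold the same values?" but wrote a is b would get False here.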

See the section is operator for the details of its operation.

Why Require is None?

You would think that the is None form uses is because of something to do with memory and copies, but it's for another reason. The is None mandate exists because a datatype can provide a custom definition of == in such a way that it will claim that x == None is true in cases where it is actually false. This is a perverse definition of == but it is possible and has been done at times. No data type in a standard install of Python has this odd behavior, which is why == None generally works fine. However, it is possible that code in a project might have this behavior, so against that case, PEP8 requires the is None form. This avoids the problem since a datatype can customize == but the is operator cannot be customized.
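To make this concrete, here is a minimal sketch of such a perverse definition. The class AgreesWithEverything is hypothetical, written only to demonstrate the problem.

```python
class AgreesWithEverything:
    """Hypothetical class whose == always answers True."""

    def __eq__(self, other):
        return True


x = AgreesWithEverything()

print(x == None)    # True, even though x is clearly not None
print(x is None)    # False, the is operator cannot be fooled
```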

History of Long Lines

In the old days of relatively small computer displays, projects would frequently have a rule that no line in the code could be wider than 80 or 100 characters, so that the code would fit on the display. This sort of rule has become less common. Often in Python, the simplest thing to do is just let a long line be long.

However, if a line is so long that it's hard to read or work with, here are a couple of techniques to break a long line into shorter, separate lines.

Break Up Line of Parameters

Suppose there is a function call line that is long because there are many parameters, like this:

draw_background_shape(selected_shape.x, selected_shape.y, sub_width + MARGIN, sub_height * MARGIN, selected_color)

To break up this long line, add a newline after a comma, and indent the later lines so the parameters have the same indent as the parameters above, like this:

draw_background_shape(selected_shape.x, selected_shape.y,
                      sub_width + MARGIN, sub_height * MARGIN,
                      selected_color)

Break Up a Long Line With Parentheses

If Python sees a line with a left parenthesis ( without a matching right parenthesis, Python treats the later lines as continuations of the first line until the matching ) is reached. This works for [ and { as well.

For example, suppose we have this long if-test:

if selected_x >= 0 and selected_x < shape.width and selected_y >= 0 and selected_y < shape.height:
    return True

The long test expression can be broken into several lines, adding a pair of parentheses around the whole thing. The second and later lines should be indented an additional 4 spaces, so they do not accidentally line up with the subsequent body lines.

if (selected_x >= 0 and
        selected_x < shape.width and  # additional indent
        selected_y >= 0 and
        selected_y < shape.height):
    return True

There is also a preference that the last word on each line is an operator, like and or +, so it reinforces the idea that there's more on the later lines.
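The same continuation behavior for [ and { can be sketched like this. The data here is hypothetical, just to show the line breaks.

```python
# An open [ keeps the statement going until the matching ]
colors = [
    'red',
    'green',
    'blue',
]

# An open { works the same way
prices = {
    'donut': 2,
    'coffee': 3,
}

# An open ( lets an expression span lines, ending the first line with +
total = (prices['donut'] +
         prices['coffee'])

print(len(colors), total)    # 3 5
```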

Named Parameter Exception

There is an exception to the 1-space-between-items rule, but it's for a rare case. If a function call has named parameters, no spaces are needed around the = like this:

print('hi', end='')

I think the reasoning here is that this use of = is different from the more common variable assignment =, so it's good to make this use look a little different.
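PEP8 applies the same no-space convention to default parameter values in a def (when there is no type annotation). A short sketch, using a made-up greet() function:

```python
def greet(name, punctuation='!'):           # default value: no spaces around =
    return 'Hello ' + name + punctuation


message = greet('Ada', punctuation='?')     # named parameter: no spaces around =
print(message)                              # Hello Ada?
print(greet('Ada'))                         # Hello Ada!
```

Note that the ordinary assignment to message still gets spaces around its =.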


Naming quirks:

Python Keywords

A few words like def, for, and if are fixed "keywords" in Python, so you cannot use those words as variable or function names. Here is the list of keywords:

>>> import keyword
>>> keyword.kwlist
['False', 'None', 'True', 'and', 'as', 'assert', 'async', 'await', 'break', 'class', 'continue', 'def', 'del', 'elif', 'else', 'except', 'finally', 'for', 'from', 'global', 'if', 'import', 'in', 'is', 'lambda', 'nonlocal', 'not', 'or', 'pass', 'raise', 'return', 'try', 'while', 'with', 'yield']
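If you are unsure whether a particular word is a keyword, the same keyword module can answer directly with its iskeyword() function. A quick sketch:

```python
import keyword

print(keyword.iskeyword('for'))     # True
print(keyword.iskeyword('color'))   # False
print(keyword.iskeyword('len'))     # False, len is a function, not a keyword
```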

The "len" Rule: Do Not Use Builtin Functions as Variable Names

Python has many built-in functions like len() and min(). As a strange example of Python's flexibility, your code can use assignment = to change what len means within your code, which is probably a huge mistake!

>>> len('hello')   # len function works
5
>>> len = 7        # redefine "len" to be 7, bad idea
>>> len('hello')   # oh noes!
TypeError: 'int' object is not callable

Here is the simple rule: do not use the name of a common function as a variable name. Here are the most common function names:

len abs min max int float list str dict range map filter format

Rather than memorize the list of all the function names to avoid, it is sufficient to avoid using the names of functions that your code uses, typically the well known functions like len, range, min, and so on.

Incidentally, this is why we avoid using "str" or "list" as variable names, instead using "s" and "lst".

Redefining a function in this way only affects your local code, not code elsewhere in the program. So if your code accidentally defines a variable named divmod, that will not interfere with code in another function calling the built in function divmod().
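A minimal sketch of that local-only effect, using two made-up functions:

```python
def shadows_divmod():
    divmod = 7              # bad idea, but it only affects this function
    return divmod


def uses_builtin_divmod():
    return divmod(17, 5)    # still the real built-in function


print(shadows_divmod())         # 7
print(uses_builtin_divmod())    # (3, 2)
```

The variable named divmod exists only inside shadows_divmod(), so the second function still reaches the built-in.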

A Little Readability Joke

This is an optional point for anyone who reads this far down. What does readability mean? One could say it means the visuals of the code fit with the semantics of what it's going to do.

With that in mind, what do you think about this code that computes a total score, but the spacing technically violates PEP8:

total = score*10 + bonus*2

I'd say it looks ok. The spaces are missing, but the missing spaces are in agreement with the precedence — the multiplication will happen first, and the missing spaces sort of reinforce that.

What about this version?

total = score * 10+bonus * 2

You may have chuckled or winced a little looking at this one. It is just so wrong, sort of like the punchline of a joke. This may give you a little insight that the meaning of code and its appearance have some deep linkage. Given that linkage, it's maybe not surprising that messy looking code is more prone to confusion and bugs.


Copyright 2020 Nick Parlante