Python Variables and Assignment

Python Variables

A Python variable is a named bit of computer memory, keeping track of a value as the code runs.

A variable is created with an "assignment" equal sign =, with the variable's name on the left and the value it should store on the right:

x = 42

In the computer's memory, each variable is like a box, identified by the name of the variable. In the box is a pointer to the current value for that variable.

variable name x refers to int value 42

Later in the code, each appearance of that variable name, e.g. x, retrieves its current value, in this case 42. The variable name is written as a bare word in the code, with no quotes around it.

Trying to retrieve the value of a variable that does not exist (i.e. no = ever assigned that variable name) fails with an error, which Python reports as a NameError.
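As a small sketch of that failure, the script below reads a name before any assignment has created it, then reads it again after the assignment. The variable name total is a hypothetical example, not something from the text above.

```python
# Reading a name that was never assigned raises NameError.
# The variable name `total` is a hypothetical example.
try:
    print(total)              # no `total = ...` has run yet
    outcome = 'no error'
except NameError as err:
    outcome = 'NameError'
    print('lookup failed:', err)

total = 42                    # the assignment creates the variable
print(total)                  # now the lookup succeeds
```

The try/except is only there so the script keeps running after the error. Without it, Python stops at the failed lookup.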

Variable Assignment Rules

Here is a more complicated code example and a picture of memory after this code runs.

x = 10
y = 'hello'
y = 'bye'
z = y

variable x refers to 10, variable y refers to 'bye', z refers to the same 'bye'

Things to notice here...

1. The assignment x = 10 simply sets x to point to 10.

2. The assignment y = 'hello' sets y to point to 'hello'. Then the line y = 'bye' changes y to point to 'bye', overwriting the first pointer. Assigning a variable overwrites any existing pointer that variable had. Each assignment is like the phrase "now point to" — the variable now points to the new thing, and any previous setting is forgotten.

3. An assignment between two variables, like z = y, sets z to point to the same thing as y. Now they both point to the same value. It does not set one variable to point to the other variable, although the code does kind of look like that. It also does not set up a permanent relationship between the variables, as if they must always be the same from now on. Confusingly, in mathematical writing the symbol = does set up a permanent relationship. In code, z = y has a very limited meaning: set z to point to what y points to at this moment.
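The three points above can be checked with a short script. This is a sketch that repeats the example and then adds one extra assignment, y = 'later', which is an assumption added here to show that z is not tied to y afterwards. The name same_object is also introduced just for illustration.

```python
x = 10
y = 'hello'
y = 'bye'                 # y now points to 'bye'; 'hello' is forgotten
z = y                     # z points to the same 'bye' object as y

same_object = z is y      # `is` asks: do both names point to one object?
print(same_object)        # True

y = 'later'               # re-pointing y has no effect on z
print(y)                  # later
print(z)                  # bye
```

The is operator compares which object each name points to, rather than comparing the values, so it is a direct way to see that z = y shares one value between two names.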

Every Value has a Type

Here is the same picture as above, but with more detail added.

variable x refers to 10 (int), variable y refers to 'bye' (str), the string 'hello' is garbage

In Python, every value in memory is tagged with its "type" - so we see the integer 10 has a little (int) off to its side — int is the name of the integer type in Python. The string 'hello' is tagged with str which is the name of the string type.

As Python runs, many operations depend on this feature, treating a value appropriately depending on its type. See here how the + operator behaves differently if it is given int vs. str values:

>>> 1 + 2         # int values
3
>>> 'a' + 'b'     # str values
'ab'
>>> '3' + '4'     # str that look like int
'34'

Memory and the Garbage Collector

The string 'hello' in the example above is shown in gray. It is not needed by the code after the third line runs — no variable points to it any longer, so it cannot be used. Memory like this, which is no longer accessible, is called "garbage" in computer code. A "garbage collector" is a system that reclaims garbage memory, such as 'hello' here, so its memory can be re-used to hold a new value. This is something Python does automatically behind the scenes. The garbage collector slows the running of the code down a little.

Many modern languages have a garbage collector to reclaim garbage memory automatically. A few languages instead make the programmer identify garbage memory on their own - this has the potential to run fast, but it is a chore for the programmer and a big source of bugs when the programmer mis-identifies garbage memory. The design of Python prioritizes programmer productivity, so it is natural that Python includes a garbage collector.
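Garbage collection is normally invisible, but it can be observed with a weak reference, which watches a value without keeping it alive. The sketch below makes some assumptions: Python strings cannot be weakly referenced, so a small hypothetical class named Greeting stands in for 'hello', and the printed results describe the standard CPython interpreter, where a value is typically reclaimed as soon as nothing points to it. The gc.collect() call is included for interpreters that reclaim memory later.

```python
import gc
import weakref

class Greeting:
    """Hypothetical stand-in for a string value such as 'hello'."""
    def __init__(self, text):
        self.text = text

y = Greeting('hello')
watcher = weakref.ref(y)              # observes the value without owning it
alive_before = watcher() is not None
print(alive_before)                   # True: y still points to the value

y = Greeting('bye')                   # nothing points to the 'hello' value now
gc.collect()                          # ask the collector to run right away
alive_after = watcher() is not None
print(alive_after)                    # False in CPython: the value was reclaimed
```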

Variable Swap

Suppose we have two variables and we want to "swap" their values, so each takes on the value of the other. This is a little coding move that all programmers should know.

a = 42
b = 13

before swap: a is 42, b is 13

It might seem that one can begin with a = b, but this does not work, since it overwrites and thus loses the original value of a. The classic 3-line solution uses a temporary variable named "temp" to hold this value during the swap, like this:

temp = a
a = b
b = temp

Starting with the above diagram, you can trace through the three assignments, leading to this memory structure:

after swap: a is 13, b is 42, temp is 42
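Here is the three-line swap as a runnable sketch, followed by a shorter form that Python also supports: a, b = b, a. In that form the right side is evaluated first, so both original values are captured before either name is re-pointed, and no temp variable is needed.

```python
a = 42
b = 13

# Classic three-line swap using a temporary variable.
temp = a
a = b
b = temp
print(a, b)               # 13 42

# Tuple-assignment form: the right side b, a is evaluated first,
# then both names are re-pointed. This swaps the values back.
a, b = b, a
print(a, b)               # 42 13
```

The three-line version is still worth knowing, since many other languages have no equivalent of the one-line form.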

Variable Names are Superficial Labels

Normally variable names are chosen to reflect what data they contain. That said, there is one funny feature of variable names in code.

Consider the following computation:

>>> x = 6
>>> y = x + x
>>> y
12

Using a couple of variables, it computes that doubling 6 makes 12. Suppose instead it was written this way:

>>> alice = 6
>>> bob = alice + alice
>>> bob
12

This is exactly the same computation, just using different variable names. What matters in a computation is the structure: which value is used at each spot in the computation, not the words chosen. The variable names are just arbitrary labels, tying together the different parts of the computation. If we change a variable name consistently throughout the code, the computation will work the same. Python is not looking at the variable names and thinking about what those English words mean.

That said, though variable names are meaningless to Python, good code uses meaningful variable names to help the programmer keep their ideas straight as they write and edit the code.
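The renaming claim can be checked by putting the two versions side by side. In this sketch the function names double_xy and double_alice_bob are hypothetical, introduced only to hold the two spellings of the same computation.

```python
def double_xy(n):
    # Version using the names x and y.
    x = n
    y = x + x
    return y

def double_alice_bob(n):
    # Same structure, with every name changed consistently.
    alice = n
    bob = alice + alice
    return bob

print(double_xy(6))           # 12
print(double_alice_bob(6))    # 12
```

Both functions give the same answer for any input that supports +, because only the structure of the computation matters to Python.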

 

Copyright 2020 Nick Parlante