Python Variables and Assignment

Variable Basics
What Does x = y Do?
Values Have Types
Swapping Two Vars
Local and Global Variables
Memory and Garbage Collection
Variable Names vs. Word Meaning

Python Variables

A Python variable is a named place to store a value.

A variable is created with an "assignment", written with = (one equal sign), with the variable's name on the left and the value it should store on the right:

x = 42

Variables and values are stored in the computer's memory, with each variable like a little box, labeled with the variable's name and containing a pointer to the variable's value.

[Figure: variable name x refers to int value 42]

Using that variable later in the code retrieves its current value, e.g. x in the code retrieves 42. The variable name in the code is a plain word, with no quote marks around it.

>>> x = 6   # Set var with =
>>> 1 + x   # Use var
7
>>>
>>> woot = 2
>>> woot * x
14
>>>
>>> 1 + y   # Var not-defined error
NameError: name 'y' is not defined

Trying to retrieve the value of a variable that was never defined (i.e. no = ever assigned to that variable name) fails with a NameError.
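As a small sketch of that failure, the script below reads a name that was never assigned and catches the resulting NameError so the script keeps running. The names total and message are made up for illustration.

```python
# Reading a name that no assignment has created raises NameError.
try:
    result = 1 + total      # 'total' (hypothetical name) was never assigned
except NameError as err:
    message = str(err)      # keep the error text so we can look at it

print(message)              # name 'total' is not defined
```

In ordinary code the usual fix is to assign the variable before using it, rather than to catch the error.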

Now Point To

Using = on an existing variable simply changes the variable to point to the new value, forgetting the previous value. Subsequent uses of the variable will get the new value. The = is like the phrase "Now point to".

>>> x = 6
>>> x + 10
16
>>>
>>> x = 7   # Change x to 7
>>> x + 10  # Code uses the new value
17

[Figure: variable x assigned to point to 6, then changed to point to 7]

What Does x = y Do?

Assigning one variable to another sets both variables to point to the same value. So x = y sets x to point to the same value as y.

>>> y = 13
>>> x = y   # x same as y

[Figure: variables x and y both point to 13]

The assignment x = y does not set one variable to point to the other variable, although the code does kind of look like that. The assignment does not set up some kind of permanent, tracking relationship between the two variables. This is a little confusing, since in mathematical writing the symbol = does set up a permanent relationship. In code, x = y just changes x to point to what y points to at this moment.
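One way to see that no tracking relationship exists is to change y after the assignment and check x again. The sketch below uses the is operator, which reports whether two names point to the very same value in memory; the name same_now is made up for illustration.

```python
y = 13
x = y                 # x now points to the same value as y
same_now = x is y     # True: both names refer to one value in memory

y = 99                # re-point y; this does not touch x
print(x)              # 13
print(y)              # 99
```

After the second assignment, x still points to 13, because x = y only copied where y pointed at that moment.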

Every Value has a Type

In Python, every value has a "type" which helps determine how that value is treated by the code.

For example, the value 42 is type int, which is the integer type. As you would expect, using the plus + operator with integer values performs addition.

>>> 42 + 10
52

A value like 'hello' is type str which is the string type. Using the + operator with strings does concatenation, combining two strings together to make one bigger string.

>>> 'hello' + 'there'
'hellothere'

At a high level, when the code runs, the + operator looks at the types of the values to select the appropriate operation.

In the computer's memory, it's like each value is tagged with its type.

>>> x = 10
>>> y = 42
>>> z = 'hi'

[Figure: each value is tagged with its type: variable x refers to 10 (int), variable y refers to 42 (int), z refers to 'hi' (str)]

As the code runs, Python can follow each arrow and consider the type of each value to help select the correct operation.

>>> x + y  # int, so do addition
52
>>>
>>> z + z  # str, so do concatenation
'hihi'
>>>
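The type tags can be inspected directly with the built-in type() function. The sketch below also shows what happens when + is given one int and one str: there is no operation defined for that mix, so Python raises a TypeError. The names mixed_error and as_text are made up for illustration.

```python
x = 10
z = 'hi'

print(type(x).__name__)   # int
print(type(z).__name__)   # str

# Mixing the two types with + has no defined operation, so Python raises TypeError.
try:
    x + z
except TypeError as err:
    mixed_error = str(err)

# Converting explicitly picks the operation you want (here, concatenation).
as_text = str(x) + z
print(as_text)            # 10hi
```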

Variable Swap

Suppose we have two variables and we want to "swap" their values, so each takes on the value of the other, for example swap the values of variables a and b:

a = 42
b = 13

Python has a kind of sleight-of-hand, 1-line way of doing this using tuples, like this:

(a, b) = (b, a)

That one line will swap the two values.

If you don't care for that form, here is the spelled-out, classic 3-line sequence to swap two variables. It uses a temporary variable named "temp" to hold one of the values during the swap, like this:

temp = a
a = b
b = temp
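Both forms can be checked side by side. In this sketch the tuple swap is written without the optional parentheses, and the names after_tuple_swap and after_temp_swap are made up to record the results.

```python
a = 42
b = 13

# Tuple form: the right side (b, a) is evaluated first, then unpacked into a and b.
a, b = b, a
after_tuple_swap = (a, b)
print(after_tuple_swap)       # (13, 42)

# Classic form, swapping back again with a temporary variable.
temp = a
a = b
b = temp
after_temp_swap = (a, b)
print(after_temp_swap)        # (42, 13)
```

The tuple form works because Python builds the right-hand tuple from the old values before assigning anything on the left.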

Local Variables

The variables defined inside a function are called "local" variables to that function. Each function has its own local variables, kept separate from the local variables of all the other functions. A variable x in one function is different from a variable x in another function, even though they have the same name. This design has been found to be more productive — it is too confusing in practice if the variable x in your function starts interfering accidentally with a variable x over in some other function.

Sharing or moving data between functions is typically done with parameters and return values, not by sharing variables between functions. See the functions chapter. (In CS terminology, scoping governs how variable names work between functions.)

A "scope" is the CS term for an environment where variables are defined. Each function gets its own scope where its variables are defined and kept separate from the variables of other functions.
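A minimal sketch of separate scopes: the two functions below (hypothetical names double and add_one) each use a local variable x, and the two x variables never interfere. Data moves between the functions only through parameters and return values.

```python
def double(n):
    x = n * 2        # this x is local to double()
    return x

def add_one(n):
    x = n + 1        # a separate x, local to add_one()
    return x

# The return value of double() becomes the parameter of add_one().
result = add_one(double(5))
print(result)        # 11
```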

Global Variables

Most Python code is written just using local variables. For the curious, here is a discussion of global variables.

Consider the following Python example file:

counter = 0
MAX = 100

def foo():
    x = 2
    print(x)


def bar():
    x = 3 + counter       # get value of counter
    print(x, MAX)


def baz():
    global counter        # get/set counter
    counter = counter + 1

The first function foo() defines and uses a local variable x. Most of your Python code will look like this, doing your work with local variables defined in your functions.

There are also "global" variables, defined in the Python file but not inside of any function, so not indented at all. In the example above, the variables counter and MAX are global variables (also known as "module" variables). When Python encounters a variable in a function, it first looks for a matching local variable. If a matching local variable cannot be found (after possibly some other steps), Python looks for a matching global variable. The bar() function shows this, getting the values of the globals counter and MAX. Note that no extra syntax is required to get the value of such a global variable.

This is how constants such as MAX work in Python code. The constants are just global variables which are assigned a value once, and then accessed wherever they are needed. It's a convention to give such constants uppercase names, like MAX, which implies that the value is a constant, not changing as the program runs. (Python does not enforce that the constant does not change; the programmer is supposed to know not to change the constant.)

Extra syntax is required to set the value of a global variable from within a function. A line of the form "global" followed by the variable name, placed inside a function, declares that the function wants get/set access to that global variable. The baz() function demonstrates this with the line global counter, getting and setting the value of the global counter variable.
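To see why the extra syntax matters, compare two functions that both assign to counter. This is a sketch meant to run as a plain Python file; the function names and the after_... variables are made up for illustration. Without the global line, the assignment quietly creates a new local variable and the global is left alone.

```python
counter = 0

def bump_without_global():
    counter = 100          # no 'global' line: this creates a new local variable
    return counter

def bump_with_global():
    global counter         # assignments below now target the module-level counter
    counter = counter + 1
    return counter

local_result = bump_without_global()
after_local = counter      # the global was not touched
bump_with_global()
after_global = counter     # the global went up by 1

print(local_result, after_local, after_global)   # 100 0 1
```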

Needing to get/set a global variable is somewhat rare. One example would be a variable to count how many times a function runs. The counter variable does this in the example, increasing by 1 each time the baz() function runs. This cannot be done with a local variable, since local variables are created anew each time the function runs. It can be tempting to use global variables to move data from one function to another, but that is not a good practice. In particular, it makes using and testing the function hard, as the flow of information into each function is muddied. Use parameters and return values to move data in and out of each function call in a crisp, well-defined way.

Memory and Garbage Collection

As Python code runs, its need for bytes of memory to hold the values goes up and down dynamically. Python handles all this automatically — an automatic "garbage collection" (GC) scheme allocates bytes of memory when they are needed, and de-allocates them when they are not needed, enabling their re-use. Since Python has garbage collection, programmers can use operators like + and = in a natural way, and Python takes care of the memory needs behind the scenes.

Programmers do not need to think about the details of memory use, but here is a quick look at how Python manages memory.

Reference Counting

The main strategy Python uses to manage memory is called "reference counting" — each value in memory has an associated "refcount", counting how many references (i.e. arrows) are pointing to that value.

For example, this code sets up two variables, each pointing to a string.

>>> x = 'apple'
>>> y = 'banana'

In memory, each string has a refcount of 1, since each has one arrow pointing to it.

[Figure: vars x and y each point to a string, each string has a refcount of 1]

Now suppose the code does the assignment x = y, changing x to point to 'banana'.

[Figure: vars x and y both point to 'banana', which has a refcount of 2; 'apple' has a refcount of 0]

Now the refcount of 'banana' is 2, since 2 arrows point to it, and the refcount of 'apple' is 0 since no arrows point to it.

The key thing about refcounts is that when the refcount goes down to 0, there are no remaining references to that memory, so no code will ever access that memory again. Therefore, that memory is "garbage", and it can be re-used to hold some new value. Python manages a refcount for each value in memory, increasing and decreasing each refcount as the code runs. When a refcount reaches 0, Python marks that memory as available for re-use.
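The refcount changes can be observed from Python code, with some caveats. The sketch below is specific to CPython, the standard implementation. It uses a small made-up class, Fruit, in place of the string literals, since CPython treats string literals specially and they cannot be watched with a weak reference. A weak reference (weakref.ref) observes a value without adding to its refcount, and sys.getrefcount reports a number that includes a temporary reference made by the call itself, so only the difference between two readings is meaningful here.

```python
import sys
import weakref

class Fruit:
    """Hypothetical stand-in for a value in memory."""
    def __init__(self, name):
        self.name = name

x = Fruit('apple')
y = Fruit('banana')
apple_probe = weakref.ref(x)     # watches the apple without adding a reference

before = sys.getrefcount(y)      # reading taken while only y points to the banana
x = y                            # banana gains an arrow, apple loses its only arrow
after = sys.getrefcount(y)

print(after - before)            # 1
print(apple_probe() is None)     # True on CPython: the apple was reclaimed
```

Once x stops pointing to the apple, its refcount reaches 0 and CPython reclaims it right away, which is why the weak reference then returns None.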

Secondary Garbage Collector

Python also has a secondary garbage collector that looks through memory periodically to handle rare cases where reference counting does not work. (Aside: the case where reference counting does not work is a circular structure of elements where each element refers to the next to form a ring. In this case, it's possible for each element to have a refcount of 1, even though the ring itself is garbage. Python needs a secondary garbage collection system to garbage collect circular structures like this.)
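The ring case can be reproduced in a few lines. This is a CPython-specific sketch: Node is a made-up class, and the standard gc module is used to pause the secondary collector and then run it by hand, so the before and after states are predictable.

```python
import gc
import weakref

class Node:
    """Hypothetical ring element holding a reference to the next node."""
    def __init__(self):
        self.next = None

gc.disable()                 # pause the secondary collector for a predictable demo
a = Node()
b = Node()
a.next = b
b.next = a                   # a -> b -> a forms a ring
probe = weakref.ref(a)       # watches one node without adding to its refcount

del a, b                     # no variables point into the ring any more
alive_after_del = probe() is not None    # still in memory: each refcount is stuck at 1

gc.collect()                 # run the secondary collector by hand
alive_after_gc = probe() is not None     # the ring has now been reclaimed
gc.enable()

print(alive_after_del, alive_after_gc)   # True False
```

In normal programs there is no need to call gc.collect(); the secondary collector runs on its own from time to time.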

Managing the refcounts and running the secondary garbage collector adds a little overhead cost to the running of Python code, but it frees the programmer from having to manage the memory manually, and this is generally an attractive tradeoff. Most modern programming languages have some sort of automatic garbage collection.

In contrast, the important and foundational language "C" provides a more direct connection to the underlying hardware, and in particular programmers must manage memory on their own. This can be very flexible and run fast, but it is a chore for the programmer and a source of bugs when the programmer makes a mistake in managing the memory. The design of Python prioritizes programmer productivity, so it is natural that Python includes automatic memory management.

Variable Names are Superficial Labels

Normally variable names are chosen to reflect what data they contain. That said, there is one funny feature of variable names in code.

Consider the following computation

>>> x = 6
>>> y = x + x
>>> y
12

Using a couple of variables, it computes that doubling 6 makes 12. Suppose instead it was written this way:

>>> alice = 6
>>> bob = alice + alice
>>> bob
12

This is exactly the same computation, just using different variable names. What matters in a computation is the structure: which value is used at each spot in the computation, not the words chosen. The variable names are just arbitrary labels, tying together the different parts of the computation. If we change a variable name consistently throughout the code, the computation will work the same. Python is not looking at the variable names and incorporating some aspect of what those words mean in English.

That said, though variable names are meaningless to Python, good code uses meaningful variable names to help the programmer keep their ideas straight as they write and edit the code.

History

Big revision 2025, re-ordered, add the section about memory management.
