Intro to Python Day 3: Object Oriented Programming and Functions

Will Horne

Reminders

Office hours from 10-12, or by appointment

Glad to see a few people stop by today!

I know that many of you are in the Math Lecture (and likely also the LaTeX lecture). If you want to meet via Zoom at some other time, email me.

Optional readings are posted on the syllabus

Practice is key!

Course Outline

Monday: Intro and Coding Basics

Tuesday: Basics Continued and Control Flow

Today: Object Oriented Programming and Functions

Thursday: Data Analysis and APIs

Friday: Web Scraping and Text-as-Data

Review

Write a program that checks whether a number, input by the user, is even or not.

An Answer

number = int(input("Enter a number: "))  # read the number from the user
if number % 2 == 0:
    print(f"{number} is even!")
else:
    print(f"{number} is odd!")

More Review

Update your program to check which numbers from 1 to 10 are even. For each number, also indicate whether it is divisible by 3.

For each number, your output should clearly distinguish among four cases:

  • divisible by 2 only
  • divisible by both 2 and 3
  • divisible by 3 only
  • divisible by neither

An Answer

for number in range(1, 11):
    if number % 2 == 0:
        print(f"{number} is even!")
        if number % 3 == 0:  # Nest this if inside the first if
            print(f"{number} is also divisible by 3!")
    elif number % 3 == 0:    # note the indentation here!
        print(f"{number} is only divisible by 3!")
    else:                    # note the indentation here!
        print(f"{number} is not divisible by 2 or 3!")
1 is not divisible by 2 or 3!
2 is even!
3 is only divisible by 3!
4 is even!
5 is not divisible by 2 or 3!
6 is even!
6 is also divisible by 3!
7 is not divisible by 2 or 3!
8 is even!
9 is only divisible by 3!
10 is even!

An indentation tip for R users

R and Python mark nested blocks differently: R wraps each block in curly braces, while Python relies on indentation alone.

If I wanted to write a for loop in R, it would look like this

for (number in 1:10) {
  if (number %% 2 == 0) {
    cat(number, "is even!\n")
    if (number %% 3 == 0) {  # Nested inside the first if
      cat(number, "is also divisible by 3!\n")
    }
  } else if (number %% 3 == 0) {  # else-if at the same level as the first if
    cat(number, "is only divisible by 3!\n")
  } else {  # else at the same level as the first if
    cat(number, "is not divisible by 2 or 3!\n")
  }
}

I think the Python version is more readable!

Key concepts from Last Time

  • Branching with if, elif and else

  • While and For loops for iteration

  • Lists

    • Structure

    • Methods

Strings are Immutable Objects

List methods were updating the list in memory.

This is because lists are mutable objects, so we can change them with methods.

String methods (and methods applied to other immutables) do not update the object in memory — instead they return a new object. Let’s take a look in Colab.
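Here is a minimal sketch of the difference, using made-up example values. The variable names are only for illustration.

```python
# Lists are mutable: a method such as .append() changes the list in place.
networks = ["CNN", "MSNBC"]
networks.append("Newsmax")
print(networks)  # ['CNN', 'MSNBC', 'Newsmax']

# Strings are immutable: .lower() returns a new string and leaves the original alone.
title = "Boston Globe"
lowered = title.lower()
print(title)    # Boston Globe
print(lowered)  # boston globe

# To keep the result under the old name, assign it back.
title = title.lower()
print(title)    # boston globe
```

This is why yesterday's loop had to assign back into papers[i] rather than just call .lower().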

Recall yesterday’s exercise

Yesterday we wrote a loop that lowercased a list of newspaper titles and then extended it with a list of TV news networks:

papers = ["New York Times", "Washington Post", "Boston Globe", "Philadelphia Inquirer", "Atlanta Journal-Constitution"]

for i in range(len(papers)):
    papers[i] = papers[i].lower()  # note that `for paper in papers` doesn't update papers[i]

stations = ["Fox News", "CNN", "MSNBC", "Newsmax"]
papers.extend(stations)
papers
['new york times',
 'washington post',
 'boston globe',
 'philadelphia inquirer',
 'atlanta journal-constitution',
 'Fox News',
 'CNN',
 'MSNBC',
 'Newsmax']

This works, but the range(len(papers)) pattern is clunky. Python gives us a cleaner way.

A different approach to the loop

enumerate is a built-in function that takes an iterable and returns index-value pairs.

0: New York Times

1: Washington Post

and so on…

for idx, paper in enumerate(papers):
    print(idx, paper)

Take a moment to think about how we might be able to use enumerate to simplify our loop

Loop with Enumerate

for idx, paper in enumerate(papers):
    papers[idx] = paper.lower()

print(papers)
['new york times', 'washington post', 'boston globe', 'philadelphia inquirer', 'atlanta journal-constitution', 'fox news', 'cnn', 'msnbc', 'newsmax']

A relevant string method

If we have a string of multiple words (e.g., a sentence, paragraph, or speech), string.split() will break it into a list of pieces. By default it splits on any run of whitespace (spaces, tabs, newlines).

We can specify whatever delimiter we want — let’s try in Colab.
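A short sketch of how the delimiter changes the result. The example strings are made up for illustration.

```python
speech = "We  hold these\ttruths"  # two spaces after "We", a tab before "truths"

# No argument: split on any run of whitespace and drop empty pieces.
print(speech.split())     # ['We', 'hold', 'these', 'truths']

# An explicit " " splits on every single space, so empty strings can appear
# and the tab is not treated as a delimiter.
print(speech.split(" "))  # ['We', '', 'hold', 'these\ttruths']

# Any other delimiter works the same way.
row = "Ottawa,Canada,North America"
fields = row.split(",")
print(fields)             # ['Ottawa', 'Canada', 'North America']
```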

Another Enumerate Example

Create the following sentence

sentence = "I truly love learning python!"

Write a for loop (or while loop, if you want) that prints each word, with its index, one by one.

Hint: use .split() to get each word

Output should be something like:

Word 0: I

Word 1: truly

etc…
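One possible answer, sketched with a for loop and enumerate (a while loop with a counter would work too):

```python
sentence = "I truly love learning python!"

words = sentence.split()  # split on whitespace into a list of words

for idx, word in enumerate(words):
    print(f"Word {idx}: {word}")
```

Note that punctuation stays attached to the word, so the last item is "python!" rather than "python".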

Dictionaries

Dictionaries are objects in Python that consist of key:value pairs

For example, suppose we wanted to create a dictionary of employees and their salaries

salaries = {"Jane":100,
"Mike": 80,
"Ali": 85,
"Diego": 80}

print(salaries)
{'Jane': 100, 'Mike': 80, 'Ali': 85, 'Diego': 80}

Keys and Values

  • keys

    • must be unique (i.e., can’t have Jane twice)

    • must be an immutable (hashable) type: int, string, float, bool, or a tuple of immutable items

  • values

    • any type

    • Mike and Diego can have the same salary

    • can be lists, other dictionaries, etc.

  • In Python 3.7+, dictionaries maintain insertion order (sets still don’t)
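A small sketch of the key rules above. The dictionaries are hypothetical examples.

```python
# A tuple works as a key because it is immutable.
grid = {(0, 0): "origin", (1, 2): "a point"}
print(grid[(1, 2)])  # a point

# A list does not work as a key because it is mutable.
error_raised = False
try:
    bad = {[0, 0]: "origin"}
except TypeError as err:
    error_raised = True
    print("TypeError:", err)

# Keys are unique: if a key is repeated, the last value wins.
salaries = {"Jane": 100, "Jane": 120}
print(salaries)  # {'Jane': 120}
```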

You Try

Create a dictionary using political parties as keys, and politicians who are members of that party as values. For example, you could have a key:value pair of Democrat: Kamala Harris or Labour: Keir Starmer.

Dictionary Methods

  • .keys() will give you the keys

  • .values() will give you the values

  • .update() will update key:value pairs

    • useful for batch updating, i.e., importing keys:values from another dictionary
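As a rough illustration of these three methods, here is a sketch with a small salaries dictionary. The new_hires dictionary is a hypothetical second source of key:value pairs.

```python
salaries = {"Jane": 100, "Mike": 80}

# .keys() and .values() return view objects; wrap them in list() to see them as lists.
print(list(salaries.keys()))    # ['Jane', 'Mike']
print(list(salaries.values()))  # [100, 80]

# .update() adds new keys and overwrites the values of keys that already exist.
new_hires = {"Allison": 105, "Mike": 90}
salaries.update(new_hires)
print(salaries)  # {'Jane': 100, 'Mike': 90, 'Allison': 105}
```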

Using Dictionaries

salaries = {"Jane":100,
"Mike": 80,
"Ali": 85,
"Diego": 80}

salaries["Mike"] ## return Mike's salary
salaries["Mike"] = 90 ## Mike got a raise! Update his salary
salaries["Allison"] = 105 ## New Hire! Add a new Key:Value pair

print(salaries)
{'Jane': 100, 'Mike': 90, 'Ali': 85, 'Diego': 80, 'Allison': 105}

Iterating over a dictionary

We can iterate over dictionaries. Consider the following dictionary, and write a for loop to print out each student’s grade.

grades = {"Ali": "A+", "Bella": "A", "Will": "C-", "Sam": "B"}

Hint: looping over a dictionary gives you its keys, so grades[person] (or whatever you name the loop variable) will return that person’s grade.
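Two possible answers, sketched below. The first follows the hint directly; the second uses .items(), which hands back each key and value together.

```python
grades = {"Ali": "A+", "Bella": "A", "Will": "C-", "Sam": "B"}

# Looping over a dictionary yields its keys.
for person in grades:
    print(f"{person}: {grades[person]}")

# .items() yields (key, value) pairs, which we can unpack in the loop header.
for person, grade in grades.items():
    print(f"{person}: {grade}")

pairs = list(grades.items())
```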

Your Turn

  • Create a dictionary using any method of your choosing

  • Find the value(s) associated with the key in the second position

  • print all the keys

  • Create a second dictionary from a list

  • Add that dictionary to the original

  • Iterate over the dictionary to print “The key is [key] and the value is [value]” for each entry

Break?

If we have yet to take our first break, we should take a break now!

elif it’s been over an hour since our last break, this would also be a good time to take a break!

else let’s continue on with class!

Functions

  • Reusable pieces of code

    • If you are going to do something over and over again, make it into a function!
  • functions are not run until they are called/invoked somewhere.

    • Once you create a function, it’s saved in your (global) environment

Function characteristics

  • name
  • parameters/arguments (0 or more)
  • might have a docstring explaining use
  • has a body
  • returns something (None, if there is no explicit return statement)

Example Function

def is_even(i):
    """
    Input: i, an integer
    Returns True if i is even, otherwise False
    """
    return i % 2 == 0

is_even(4)
True

Functions: Names and Arguments

  • def is the keyword used to define the function

    • : triggers the body of the function

    • same indentation rules

  • name of the function comes after def

  • () contains the parameters or arguments of the function

Functions: Docstring

  • the docstring, enclosed in triple quotes ("""), provides info on how to use the function

  • the docstring can be displayed with help()

def is_even(i):
    """
    Input: i, an integer
    Returns True if i is even, otherwise False
    """
    return i % 2 == 0

help(is_even)
Help on function is_even in module __main__:

is_even(i)
    Input: i, an integer
    Returns True if i is even, otherwise False

Caution! Variable Scope

  • Scope determines where a variable name can be seen: each environment maps names to objects

  • Usually, we are in global scope

    • when we enter a function, a new scope/frame/environment is created

    • unless we call something from the global environment, all our variables are within the local environment of the function call

    • Once the function exits, any intermediate values will be discarded

Variable Scope: Lookup

def p(y):
    print(x)

x = 5

p(10)

What will this print?

def p(x):
    print(x)

x = 5

p(10)

What about this?

Variable Scope: Reassignment

x = 3
def square(x):
    x = x**2
    return x

z = square(x)
print(x)

What will this print?
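One way to check your answers to the three scope questions is to run them side by side. In this sketch the two versions of p are renamed p_global and p_local (hypothetical names) so both can live in one script, and they return the value instead of printing it.

```python
def p_global(y):
    # x is not defined inside the function, so Python looks it up in the global scope.
    return x

def p_local(x):
    # Here x is a parameter, so the local x hides the global one.
    return x

x = 5
first = p_global(10)
second = p_local(10)
print(first)   # 5
print(second)  # 10

def square(x):
    x = x**2  # reassigns the local x only
    return x

x = 3
z = square(x)
print(x)  # 3 (the global x is unchanged)
print(z)  # 9
```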

Functions: Return

  • return can only be used inside of a function

    • functions can have multiple returns

    • only one of them will be used each time a function is invoked

      • Think conditional logic w/in function
  • Once return is hit, the function’s scope is exited and nothing else in the function is run

Returns in Functions

def check_number(number):
    if number > 0:
        return "positive"
    elif number < 0:
        return "negative"
    else:
        return "zero"

As soon as we hit return, the function will exit. Let’s try this out in Colab.

Your Turn!

Write a function to check whether or not a number is divisible by 6. Test that function on the following numbers: 17, 64, 108, 157, 200.

Then, add an additional condition to check if the number is also divisible by 9.

An Answer

def is_div_six(number):
    if number % 6 == 0:
        return "Number is divisible by 6"
    else:
        return "Number is not divisible by 6"

is_div_six(200)
'Number is not divisible by 6'

With the additional check for divisibility by 9:

def is_div_six(number):
    if number % 6 == 0:
        if number % 9 == 0:
            return "Number is divisible by 6 and 9"
        else:
            return "Number is divisible by 6"
    else:
        return "Number is not divisible by 6"

is_div_six(108)
'Number is divisible by 6 and 9'

Dictionaries in Functions

def word_freq(text):
    words_list = text.split()
    freq = {}
    for word in words_list:
        if word in freq:
            freq[word] += 1
        else:
            freq[word] = 1
    return freq

Run this function on some quote you find (or make up). What does it return? How might it be useful?

What would you include as a docstring?
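As one possible answer, here is the same function with a docstring added, run on a made-up quote. The comparison with collections.Counter from the standard library is an aside, not part of the original function.

```python
from collections import Counter

def word_freq(text):
    """
    Input: text, a string
    Returns a dictionary mapping each whitespace-separated word to the
    number of times it appears. Counting is case-sensitive, and
    punctuation stays attached to the word it touches.
    """
    words_list = text.split()
    freq = {}
    for word in words_list:
        if word in freq:
            freq[word] += 1
        else:
            freq[word] = 1
    return freq

quote = "the quick fox jumps over the lazy dog the end"
freq = word_freq(quote)
print(freq["the"])  # 3

# Counter builds the same counts in one step.
counts = Counter(quote.split())
print(counts.most_common(1))  # [('the', 3)]
```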

You Try

Imagine we had a dictionary of employees and salaries from some company, and we wanted to extract a new dictionary of high earners (specifically, those earning more than the mean salary at the company).

Write a function to do this. Then, test it out with a dictionary you create.
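One possible solution, sketched with a plain loop. The function name high_earners is just a suggestion, and the sketch assumes the dictionary is not empty (an empty one would cause a division by zero).

```python
def high_earners(salaries):
    """
    Input: salaries, a non-empty dictionary of name: salary pairs
    Returns a new dictionary with only the employees who earn
    more than the mean salary.
    """
    mean_salary = sum(salaries.values()) / len(salaries)
    result = {}
    for name, salary in salaries.items():
        if salary > mean_salary:
            result[name] = salary
    return result

salaries = {"Jane": 100, "Mike": 80, "Ali": 85, "Diego": 80}
top = high_earners(salaries)
print(top)  # the mean is 86.25, so this prints {'Jane': 100}
```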

Break Time?

If we have yet to take our second break, we should take a break now!

else let’s continue on with class!

Python Modules

  • Python modules are files (.py) that (mainly) contain function definitions

  • they allow us to organize, distribute code; to share and reuse others’ code too

    • Can easily create, save and load our own custom modules
  • keep code coherent and self-contained

  • one can import modules or some functions from modules
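The last point can be sketched with three common import styles, all using standard-library modules. The alias stats is an arbitrary choice.

```python
import math                 # import the whole module; use names as math.<name>
from math import sqrt       # import one function; use it without the prefix
import statistics as stats  # import a module under a shorter alias

floor_value = math.floor(2.7)
root = sqrt(16)
average = stats.mean([80, 85, 90])

print(floor_value)  # 2
print(root)         # 4.0
print(average)
```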

Python Modules: The Standard Library

The standard library already contains a bunch of useful modules. For example, we can load the math module.

Instead of writing our own power function

def raise_to_power(a, b):
    return a**b

raise_to_power(2, 3)
8

We can import a module that has already defined this function

import math
math.pow(2, 3)
8.0

Another example

There are tons of useful modules in the standard library

from datetime import date

today = date.today()

print("Today's date is:", today)
Today's date is: 2026-06-05

Here, I don’t necessarily want to import the whole datetime module, so I can instead just import date.

The standard modules come with your Python install. Colab also has many common libraries installed. Tomorrow we will cover how to install other libraries!

Comprehensions

  • Shorthand code that can replace many for loops and the simple if/else logic inside them

  • Can be used for lists, sets and dictionaries

  • Can make code shorter and easier to read

Comprehension Syntax

In general, a comprehension will look something like this

[expr for value in object if condition] ## do something to a value in some object (optionally if some condition is met)

As a loop, it might look like this

result = []
for value in object:
    if condition:
        result.append(expr)
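The same pattern carries over to sets and dictionaries by changing the brackets. A small sketch with a made-up word list:

```python
words = ["apple", "Banana", "apple", "cherry"]

# List comprehension: square brackets, keeps order and duplicates.
lengths = [len(w) for w in words]
print(lengths)  # [5, 6, 5, 6]

# Set comprehension: curly braces, duplicates are dropped.
unique_lower = {w.lower() for w in words}
print(sorted(unique_lower))  # ['apple', 'banana', 'cherry']

# Dictionary comprehension: curly braces with key: value.
length_by_word = {w: len(w) for w in words}
print(length_by_word)  # {'apple': 5, 'Banana': 6, 'cherry': 6}
```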

Example

Simple example - create a copy of a list

numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
new_list = []
for number in numbers:
    new_list.append(number)
print(new_list)
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Simpler with a list comprehension

new_list = [number for number in numbers]
print(new_list)
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Another List Comprehension

Square every item in a list, creating a new list of squared numbers

numbers = [2, 3, 4, 5]
new_list = []
for number in numbers:
    new_list.append(number**2)
print(new_list)
[4, 9, 16, 25]

With a list comprehension

numbers = [2, 3, 4, 5]

squared = [num**2 for num in numbers]

print(squared)
[4, 9, 16, 25]

One More List Comprehension

Example - Square only even numbers

some_squared = []

for num in numbers:
    if num % 2 == 0:
        some_squared.append(num**2)

print(some_squared)
[4, 16]

With a list comprehension

some_squared = [num**2 for num in numbers if num % 2 == 0]

print(some_squared)
[4, 16]

You Try

Write your own list comprehension for a list of strings that creates a new list of only long words (words of more than 5 characters).
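One possible answer, using a made-up list of words:

```python
words = ["vote", "election", "poll", "candidate", "party", "turnout"]

# Keep only the words with more than 5 characters.
long_words = [word for word in words if len(word) > 5]

print(long_words)  # ['election', 'candidate', 'turnout']
```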

Zip()

zip() will combine two lists by index, which can be useful if we have lists that match by index. For example, consider

countries = ["USA", "Canada", "Mexico"]
capitals = ["Washington, DC.", "Ottawa", "Mexico City"]

merged = zip(countries, capitals)

print(merged)
<zip object at 0xffcd5847be80>

So - this is kind of weird. It returns a zip object - how can we make this useful?

Note, pandas has nicer options for joining that aren’t just by index, so we aren’t limited to having indexes match!
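A zip object is lazy: it produces its pairs only when something asks for them. Here is a brief sketch of two common ways to do that.

```python
countries = ["USA", "Canada", "Mexico"]
capitals = ["Washington, DC.", "Ottawa", "Mexico City"]

# Convert to a list to see the pairs (each pair is a tuple).
pairs = list(zip(countries, capitals))
print(pairs)
# [('USA', 'Washington, DC.'), ('Canada', 'Ottawa'), ('Mexico', 'Mexico City')]

# Or loop over it directly, unpacking each pair.
for country, capital in zip(countries, capitals):
    print(f"{capital} is the capital of {country}")

# If the lists differ in length, zip stops at the end of the shorter one.
short = list(zip([1, 2, 3], ["a", "b"]))
print(short)  # [(1, 'a'), (2, 'b')]
```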

A Dictionary Comprehension

We might wish to combine two lists into a dictionary

countries = ["USA", "Canada", "Mexico"]
capitals = ["Washington, DC.", "Ottawa", "Mexico City"]
capital_dict = {}
for i in range(len(countries)):
    capital_dict[countries[i]] = capitals[i]

print(capital_dict)
{'USA': 'Washington, DC.', 'Canada': 'Ottawa', 'Mexico': 'Mexico City'}

Dictionary Comprehension (with zip())

capital_dict = {key:value for key, value in zip(countries, capitals)}

print(capital_dict)
{'USA': 'Washington, DC.', 'Canada': 'Ottawa', 'Mexico': 'Mexico City'}

Zip objects are iterables, so we can iterate over them with loops or comprehensions

More Dictionary Comprehensions

Extract just the USA key:value pair (Get the capital of the USA)

usa_cap = {key: value for key, value in zip(countries, capitals) if key == "USA"}

print(usa_cap)
{'USA': 'Washington, DC.'}

Or, use the value to extract the key

ottawa_country = {key: value for key, value in zip(countries, capitals) if value == "Ottawa"}

print(ottawa_country)
{'Canada': 'Ottawa'}
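For comparison, here is a sketch of some shortcuts for the same tasks. Once the dictionary exists, a plain lookup by key is usually simpler than filtering with a comprehension. The name country_by_capital is hypothetical.

```python
countries = ["USA", "Canada", "Mexico"]
capitals = ["Washington, DC.", "Ottawa", "Mexico City"]

# dict() accepts an iterable of (key, value) pairs, so it can consume zip directly.
capital_dict = dict(zip(countries, capitals))

# Look up a value by its key.
usa_capital = capital_dict["USA"]
print(usa_capital)  # Washington, DC.

# .get() returns None instead of raising an error when the key is missing.
missing = capital_dict.get("France")
print(missing)  # None

# To go from value to key, flip the dictionary with a comprehension.
country_by_capital = {capital: country for country, capital in capital_dict.items()}
print(country_by_capital["Ottawa"])  # Canada
```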

Your Turn!

Take one of our loops from yesterday (or come up with your own), and re-implement it using a comprehension

NumPy

NumPy is short for numerical Python. It’s a foundational package for data analysis in Python, and many packages depend on numpy arrays as a data type.

Many later libraries, like pandas, are built on the functions and data structures of NumPy. Machine learning frameworks like TensorFlow are designed to work with NumPy arrays as well.

The most common and useful data structure in NumPy is the array. Arrays are structured objects that contain data that are all of the same type.

Importing Libraries

NumPy is not a built-in feature of Python, so we need to import it

by convention we load numpy as np

import numpy as np 

NumPy Arrays

NumPy arrays are useful because they store multi-dimensional data very efficiently. Vectorized array operations are often 10x to 100x faster than equivalent pure-Python loops.

Most data in the social sciences is multidimensional, so this is crucial for our purposes!

# create an array of data
data = np.array([[1.5, -0.1, 3], [0, -3, 6.5]])

data
array([[ 1.5, -0.1,  3. ],
       [ 0. , -3. ,  6.5]])

Math with Arrays

data * 10

data + 5

data * data 
array([[2.250e+00, 1.000e-02, 9.000e+00],
       [0.000e+00, 9.000e+00, 4.225e+01]])
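All three operations above are element-wise: the arithmetic is applied to each entry separately. In particular, data * data is not matrix multiplication. A brief sketch of the difference, using the @ operator for the matrix product:

```python
import numpy as np

data = np.array([[1.5, -0.1, 3], [0, -3, 6.5]])

scaled = data * 10        # every entry multiplied by 10
shifted = data + 5        # 5 added to every entry
elementwise = data * data # each entry squared

# Matrix multiplication uses @. Shapes must line up: (2, 3) @ (3, 2) gives (2, 2).
matrix_product = data @ data.T

print(elementwise.shape)     # (2, 3)
print(matrix_product.shape)  # (2, 2)
```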

Array Attributes

We can check the shape of data stored in an array

data.shape
(2, 3)

Or the type of data stored in the array

data.dtype
dtype('float64')

Creating Arrays

We can make an array of zeros with arbitrarily many dimensions like so

np.zeros((2, 3, 2))
array([[[0., 0.],
        [0., 0.],
        [0., 0.]],

       [[0., 0.],
        [0., 0.],
        [0., 0.]]])

Or an ordered array from 0 to 19 like this

np.arange(20)
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19])

Slicing and Indexing Arrays

Like with lists, we can slice and index arrays. Let’s start with a 1D array. As with other data types, arrays are zero indexed in Python

arr = np.arange(10)

arr[5]

arr[5:8]
array([5, 6, 7])

We can update the values of arrays

arr[5:8] = 12

arr
array([ 0,  1,  2,  3,  4, 12, 12, 12,  8,  9])

Warning!

You do need to be careful when slicing arrays. A slice of a NumPy array is a view of the original data, not a copy, even if you assign it to a new name. If you change values through that new name, the original array changes too. Call .copy() on the slice if you want an independent array.

# uh-oh

arr_slice = arr[5:8]
arr_slice

arr_slice[1] = 12345

arr
array([    0,     1,     2,     3,     4,    12, 12345,    12,     8,
           9])
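If you want a slice you can modify safely, one option is to copy it explicitly. The sketch below contrasts a view, a copy, and a list slice (list slices always copy).

```python
import numpy as np

arr = np.arange(10)

# A plain slice is a view: writing to it changes arr.
view = arr[5:8]
view[0] = 99
print(arr[5])  # 99

# .copy() makes an independent array: writing to it leaves arr alone.
safe = arr[5:8].copy()
safe[1] = -1
print(arr[6])  # 6

# Python lists behave differently: slicing a list creates a new list.
lst = list(range(10))
lst_slice = lst[5:8]
lst_slice[0] = 99
print(lst[5])  # 5
```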

Comparing Arrays

arr = np.array([[1., 2., 3.], [4., 5., 6.]])
arr2 = np.array([[0., 4., 1.], [7., 2., 12.]])

arr2 > arr
array([[False,  True, False],
       [ True, False,  True]])
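The Boolean array that a comparison returns can itself be used to select or count values. This is a short sketch of that idea (often called Boolean masking); the name mask is just a convention.

```python
import numpy as np

arr = np.array([[1., 2., 3.], [4., 5., 6.]])
arr2 = np.array([[0., 4., 1.], [7., 2., 12.]])

mask = arr2 > arr

# Select the entries of arr2 where the comparison is True (returned as a 1D array).
print(arr2[mask])  # [ 4.  7. 12.]

# True counts as 1, so summing the mask counts the matches.
print(mask.sum())  # 3

# np.where picks from arr2 where mask is True and from arr otherwise.
larger = np.where(mask, arr2, arr)
print(larger)
```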

Multidimensional Arrays

We can create arrays of arbitrarily many dimensions, although if things are getting very complex we might want to think about other ways to store our data

arr_3d = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

arr_3d
array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[ 7,  8,  9],
        [10, 11, 12]]])

Pay close attention to the placement of the brackets. Easy to mess this up!

Multidimensional Indexing

arr_3d[1]
array([[ 7,  8,  9],
       [10, 11, 12]])
arr_3d[1, 0]
array([7, 8, 9])
arr_3d[1, 0, 1]
np.int64(8)
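Indexes and slices can be mixed across dimensions. A brief sketch using the same 3D array, where a bare : means "everything along this dimension":

```python
import numpy as np

arr_3d = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

print(arr_3d.shape)  # (2, 2, 3)

# First row of every block.
first_rows = arr_3d[:, 0, :]
print(first_rows)
# [[1 2 3]
#  [7 8 9]]

# First column of every block.
first_cols = arr_3d[:, :, 0]
print(first_cols)
# [[ 1  4]
#  [ 7 10]]
```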

Tomorrow

  • Tomorrow — Data analysis with pandas, plus accessing APIs

  • Questions: come to office hours (10 AM – 12 PM daily), or email me

  • Recommended reading: McKinney, Python for Data Analysis, Ch. 3–7

  • Slides will be posted after class on Canvas and at will-horne.github.io/icpsr-2026