Introduction to Python - Day 1

Will Horne

Introductions

  • Who am I?

    • Will Horne. You can call me Will, or if you’d prefer, Dr. Horne is fine too!

    • I’m an assistant professor of Political Science at Clemson University.

      • PhD (Princeton, 2022)
    • Substantive Interests: Parties, Representation, Class Politics, Polarization

    • Methods: Text-as-Data, Causal Inference

  • Who are you?

How to get in touch

Office: TBA

Office Hours: Daily, 10 AM – 12 PM, or by appointment

Email: roberho@umich.edu

Zoom Link: TBA (available on Canvas). Please let me know a time in advance!

Why should I learn Python?

Chart of the most-used programming languages from a recent developer survey, with Python ranked at or near the top.
  • Two of the biggest methods growth areas in political science right now — text-as-data and large-scale web data — are Python-first ecosystems.

  • It’s also the most portable skill you can pick up in grad school if you end up outside academia.

Python: Strengths and Weaknesses

Strengths

  • Free and open source

  • Massive ecosystem of user-contributed packages

  • Leading language for ML and text-as-data

Weaknesses

  • Steeper learning curve than point-and-click software

  • Environment setup can be painful

  • Syntax is unforgiving (one missing colon and nothing runs)

  • Traditional stats packages still lean toward R

Goals of this course

This is not a programming class. The goal isn’t to become a Python programmer — it’s to introduce Python as a data analysis tool.

  • Foundations (Mon–Tue): syntax, data types, control flow

  • Tools (Wed): functions, modules, NumPy

  • Working with Data (Thu–Fri): pandas, APIs, scraping, text-as-data

Pre-Requisites

No prior coding experience required — this is a true beginner course.

That said, many of you have R or Stata experience. I’ll draw R analogies throughout (I’m an R user too); I can’t speak to Stata. If you’re already comfortable in R, the first day or two will feel slow, but we move faster from Wednesday on.

Materials

There are no required books for the course. However, I recommend the following (all free online):

You will also need a Google account to use Colab (the free tier is fine). Alternatively, you can install Python and Jupyter on your own machine; this is more involved, and we will not cover it in class.

Course Outline

  • Today: Intro and Coding Basics

  • Tomorrow: Basics Continued and Control Flow

  • Wednesday: Object Oriented Programming and Functions

  • Thursday: Data Analysis and APIs

  • Friday: Web Scraping and Text-as-Data

Where we’re going this week

By Friday, with the skills from this course, you should be able to:

  • Pull cross-national democracy scores from V-Dem and reproduce a figure from a published article

  • Hit a public API (e.g. congress.gov, the U.S. Census) and pull structured data into a pandas DataFrame

  • Scrape political news or party press releases from a static website

  • Run basic text analysis on a corpus of party manifestos or congressional speeches

If any of those sound out of reach right now — that’s fine. By Friday, they won’t be.

Google Colab

Go to colab.research.google.com. Use your existing Google account, or create one for free.

Note - there are other (good) options, including IDEs like Spyder, Visual Studio Code, and Positron (Posit’s newer Python/R IDE — worth a look if you already use RStudio). You can also code Python directly in RStudio via the reticulate package.

Set Up

Once you have set up a Colab account, run the following line of code to make sure everything works:

print("hello world!")
hello world!

You can navigate between blocks of code, and blocks of text, similar to a markdown file.

What is computer code?

Code

A series of clear and specific commands to perform or automate a task

How would you write a series of clear and specific commands to instruct someone to make a peanut butter and jelly sandwich?

Code For Computers

A set of instructions for a friend often assumes a lot of prior knowledge. For example, my friend knows how to find peanut butter and take it out of the pantry.

Computers will do exactly what we tell them, and no more. They are not creative, and do not have intuition. So, we need to be very precise and detailed.

Begin in center of kitchen. Turn left 43 degrees. Move one meter south. Lift right arm. Open Cabinet Door. Etc….

AI assistants accept fuzzy English, but the interpreter underneath still reads every character literally — that’s why understanding what it expects matters.

On Generative AI and Coding

  • AI assistants (Claude, ChatGPT, Copilot, Cursor) are very good at Python.

    • So good that the failure mode has shifted. The risk used to be broken code. The risk now is working code you don’t understand.
  • When the AI is wrong, it’s usually wrong in plausible ways:

    • Hallucinated function names that don’t exist

    • Subtly wrong pandas operations (wrong axis, wrong groupby, silent type coercion)

    • Analysis that runs cleanly but doesn’t answer your actual question

Working with AI Productively

  • You can’t catch these mistakes if you can’t read Python at a reasonable level.

    • That’s the whole point of this course: build the foundation that lets you use AI tools as a force multiplier, not a black box.
  • For the first half of this week, type the code yourself. Muscle memory matters more than you’d think.

  • When you do use AI, treat its output the way you’d treat a function someone else wrote:

    • Read it before you run it.

    • Test on a small example first.

    • Ask it to explain its own output back to you — and check whether the explanation matches the code.

Python Basics

  • Python follows a clear syntax

    • grammatical rules and formatting norms
  • Object-Oriented

    • Almost everything in Python is an object — it has a type, and the type determines what you can do with it.

Basic Python Syntax

function(object) or object.method() ## do something to an object

object = "Hello World" ## create and save an object

print(object) ## apply a function to the object
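To make the two call patterns above concrete, here is a minimal sketch. The variable names are made up for illustration; len and .upper() are standard Python.

```python
# Hypothetical example value, chosen only for illustration.
greeting = "Hello World"

# function(object): a built-in function takes the object as its argument.
length = len(greeting)

# object.method(): a method belongs to the object's type (here, str).
shouted = greeting.upper()

print(length)
print(shouted)
```

The function works on many kinds of objects, while the method only exists because greeting is a string.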

Comments

Anything after # on a line is a comment — Python ignores it. Use comments to explain what your code does (to your future self, mostly).

# This is a comment — Python skips it.
x = 5     # Comments can also live at the end of a line.
print(x)
5

Objects in Python

  • Objects are assigned a value with =

  • Objects have types.

    • Different types of objects have different methods associated with them, and may have different behaviors
  • You can find the type of object in Python by applying the type function: type(object)

Scalar (or primitive) Objects

  • int type objects represent integers, e.g. 100, -5, 435

  • float type objects represent real numbers, e.g. 52.3 or -3.14159

  • bool type objects (aka logicals) take on True or False (notice capitalization, different from R)

  • str strings represent characters/language, like “Bernie Sanders” or “House Resolution 1”
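A quick sketch of type() applied to one object of each scalar type. The values come from the examples above; the variable names are only illustrative.

```python
# One object per scalar type (variable names are illustrative).
seats = 435                # int
vote_share = 52.3          # float
is_incumbent = True        # bool
senator = "Bernie Sanders" # str

# type() reports the type of whatever object you hand it.
print(type(seats))
print(type(vote_share))
print(type(is_incumbent))
print(type(senator))
```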

Type Conversions

  • objects of different types can be converted from one to another

    • Be careful: converting can change the value of the object in unexpected ways!
  • Some objects cannot be converted, or can only be converted to a limited range of types

Conversion Examples

What happens?

Predict what each line returns before you run it.

a = 3.99999
int(a)
b = True
int(b)
c = 5
float(c)

What about converting a string to an int? Try it in Colab.
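Once you have tried it yourself, the sketch below shows one way the string case can play out. The values are made up, and try/except is something we have not covered yet; it is used here only so the cell keeps running after the failed conversion.

```python
# A string made only of digits converts cleanly.
votes_text = "42"
votes = int(votes_text)
print(votes + 1)

# A string with a decimal point converts to float, but not directly to int.
share_text = "3.7"
share_whole = int(float(share_text))   # go through float first; int() then truncates
print(share_whole)

# A string that is not a number cannot be converted at all.
try:
    int("forty-two")
except ValueError as err:
    message = str(err)
    print("Could not convert:", message)
```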

You Try

int and float are examples of functions. These specific functions convert between types, but it seems that int doesn’t perform exactly as you might expect.

  • In Colab, create an object called test and assign it a value of 5.6

  • Find the type of test

  • Instead of int, try round. Save the output as a new object. What value is stored, and what is its type?

  • Repeat with 5.4

Note: round isn’t a type converter — it’s a rounding function that happens to return an int. We’re comparing it to int to show how their behaviors differ.
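As a rough comparison using different numbers from the exercise (the values here are arbitrary), int drops everything after the decimal point, while round goes to the nearest whole number. The difference is easiest to see with a negative value.

```python
positive = 2.7
negative = -2.7

# int() truncates toward zero; round() goes to the nearest integer.
print(int(positive), round(positive))
print(int(negative), round(negative))

# With one argument, round() returns an int.
rounded = round(positive)
print(type(rounded))

# With a second argument (number of decimals), it returns a float instead.
rounded_one_decimal = round(positive, 1)
print(type(rounded_one_decimal))
```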

Operators

Operator What does it do?
+ Addition
- Subtraction
* Multiplication
/ Division
% Modulus
** Exponentiation
// Floor division

The less obvious ones:

7 % 3    # remainder after division
1
7 // 3   # integer (floor) division
2
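Here is a short sketch that runs the operators from the table on the same pair of numbers. One detail worth noticing: / always returns a float, even when the division comes out even.

```python
total = 7 + 3
difference = 7 - 3
product = 7 * 3
quotient = 7 / 3       # a float
even_split = 6 / 3     # still a float: 2.0, not 2
whole = 7 // 3         # floor division keeps the whole part
remainder = 7 % 3      # modulus keeps what is left over
power = 7 ** 2

# Floor division and modulus fit together: 3 * 2 + 1 gets us back to 7.
rebuilt = 3 * whole + remainder

print(even_split, type(even_split))
print(whole, remainder, rebuilt)
```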

Once we get to pandas, these same operators work elementwise across whole columns of data.

Break

Let’s take a ~10 minute break, then we will come back and talk about how to use and combine objects.

Using Objects

  • When you reference an object, Python returns its value.
r = 5
pi = 3.14159
area = pi * (r ** 2)
print(area)
78.53975

Rebinding

We can rebind (or, redefine) a variable name to a different value. Everything that is run from that point forward in your script will use the new value.

Note - this can get really messy. People are often sloppy, forget they have redefined the object, etc. Just because we can, doesn’t mean we should.

What happens if you rebind r = 100 and then run the area script again?
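After you have made your own prediction, this sketch walks through it. The names area_before and area_after are added here only to keep both results around for comparison.

```python
pi = 3.14159
r = 5
area = pi * (r ** 2)
area_before = area

# Rebinding r does NOT go back and update area.
r = 100
print(area)            # still the value computed with r = 5

# area only changes once the line that computes it runs again.
area = pi * (r ** 2)
area_after = area
print(area)
```

Python computed area once, using the value r had at that moment. Nothing links the two objects afterwards.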

You Try

In Google Colab

  • Create an object called inch and assign it some value.

  • Create an object called metric and assign it a value of 2.54.

  • Create an object called cm that converts inch to centimeters, using metric.

Heads-up: don’t name a variable in — that’s a reserved keyword in Python.
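If you are ever unsure whether a name is reserved, the standard library's keyword module can tell you. A small sketch (the two result names are just for illustration):

```python
import keyword

# iskeyword() returns True for names Python reserves for itself.
in_is_reserved = keyword.iskeyword("in")
inch_is_reserved = keyword.iskeyword("inch")

print(in_is_reserved)
print(inch_is_reserved)
```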

Let’s Talk About Strings

  • Strings are Python’s text data type — anything you’d write in quotes.

  • We’ll spend extra time on strings. NLP and text-as-data are some of the biggest applications of Python in political science.

Assigning Values to Strings

  • We create a string by wrapping a series of characters in single or double quotation marks

    • Triple quotation marks indicate a special type of string, a multi-line string
value_1 = "Wolverine"
value_2 = "Tiger"
print(value_1, value_2)
print(value_1 + value_2)
Wolverine Tiger
WolverineTiger
print(value_1 + " " + value_2)
Wolverine Tiger

Triple quotes let you write a string that spans multiple lines:

quote = """The best argument against democracy
is a five-minute conversation
with the average voter."""

print(quote)
The best argument against democracy
is a five-minute conversation
with the average voter.

More Examples

name = "Will"
email = "roberho@umich.edu"

print(name + " can be reached at " + email) ## Notice the spacing
Will can be reached at roberho@umich.edu

We can’t add strings and integers together

height = 72

print("Will is " + height + " inches tall")
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[10], line 3
      1 height = 72
----> 3 print("Will is " + height + " inches tall")

TypeError: can only concatenate str (not "int") to str

Our first error message! Notice that it identifies where the error occurs, and gives an explanation.

You will see many error messages over the course of this week — that’s normal, not a sign you’re doing something wrong. Read them. They tell you exactly which line broke and usually what went wrong.

Strings and Numbers

Of course, we want to be able to include numbers in our strings. If we can’t add/concatenate strings and integers (or floats), what can we do?

We can put quotes around the numeric value so that Python treats it as a string.

height = "72"

print("Will is " + height + " inches tall")
Will is 72 inches tall

Or, we can change the type of height and then add it. This is what you’d typically do — quoting the number means you lose the ability to use it as a number later.

height = 72

print("Will is " + str(height) + " inches tall")
Will is 72 inches tall
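To see why converting at the last moment is usually the better habit, here is a sketch that keeps height as a number and still does arithmetic with it. The feet-and-inches step is an added illustration, not part of the original example.

```python
height = 72

# Convert only inside the expression; height itself stays an int.
sentence = "Will is " + str(height) + " inches tall"
print(sentence)

# Because height is still a number, we can keep doing math with it.
feet = height // 12
inches = height % 12
print("That is " + str(feet) + " feet, " + str(inches) + " inches")
```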

Indexing

Strings are made up of characters. You can pull out an individual character by its position.

Unlike R, Python uses zero-indexing.

city = "Ann Arbor"

print(city[1])
n
print(city[3])
 

What’s happening here? Is the code returning a character?
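One way to check, once you have a guess: repr() shows a value with its quotes, which makes invisible characters visible. This is a small sketch; the variable name char is just for illustration.

```python
city = "Ann Arbor"

char = city[3]

# repr() wraps the value in quotes, so a blank space becomes visible.
print(repr(char))

# A space is a character like any other, and it counts toward the length.
print(char == " ")
print(len(city))
```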

Indexing and Concatenation

We can do all sorts of things with string indexing. For example, we can make new strings.

new_word = city[0] + city[6] + city[6] + city[4]
print(new_word)
AbbA

Take a minute to create a word of your own, and save it as new_word

Negative Indexing

Negative indexing starts at the end. The last character is -1 (not -0), the second-to-last is -2, and so on.

It’s easy to confuse yourself — be mindful when mixing positive and negative indices.

city = "Ann Arbor"

print(city[-1])
r
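A short sketch of how negative and positive positions line up. It uses len(), which returns the number of characters in the string.

```python
city = "Ann Arbor"

# The last character sits at position len(city) - 1, because counting starts at 0.
last_positive = city[len(city) - 1]
last_negative = city[-1]
print(last_positive == last_negative)

# -2 is the second-to-last character, and so on.
second_to_last = city[-2]
print(second_to_last)
```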

You Try

Assign “July 6, 2026” to Date.

Use string indexing to extract the year from the string, and save the year as a new object, Year.

Solutions

Positive Indexing

Date = "July 6, 2026"

Year = Date[8] + Date[9] + Date[10] + Date[11]

print(Year)
2026

Negative Indexing

Year = Date[-4] + Date[-3] + Date[-2] + Date[-1]

print(Year)
2026

A More Elegant Solution

There’s almost always a better way. The Python idiom for “the last four characters” is slicing, which we’ll properly introduce tomorrow:

Year = Date[-4:]

print(Year)
2026

This is what you’d actually write.

F strings

F-strings (formatted strings) are a convenient way to embed Python objects inside strings. Prefix the opening quote with f or F, and put each object inside curly braces: {object}.

name = "Yuki"
message = f"Hello, {name}!"
print(message)
Hello, Yuki!

F String Example

This is most useful when we are working with more complex data structures. We will see several examples throughout the course.

Here is a relatively simple example:

x = 10
y = 5
result = f"The sum of {x} and {y} is {x + y}."
print(result)
The sum of 10 and 5 is 15.
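F-strings can also control how a number is displayed, which comes up constantly when reporting results. The sketch below uses made-up numbers and a made-up party label; the format codes after the colon (.1% and ,) are part of standard Python.

```python
# Hypothetical values, for illustration only.
party = "Party A"
vote_share = 0.4357
turnout = 1234567

# .1% multiplies by 100, keeps one decimal place, and adds a percent sign.
share_line = f"{party} won {vote_share:.1%} of the vote."

# , inserts thousands separators.
turnout_line = f"Turnout was {turnout:,} voters."

print(share_line)
print(turnout_line)
```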

Break

Boolean Logic

  • Booleans might be the simplest data type in Python

    • Take on values of True or False
x = 5
y = x < 3
print(y)
False
  • Boolean Expressions are expressions that result in boolean, rather than numeric, values

    • Check for equivalence with ==
names_match = "Will" == "Willl"
print(names_match)
False

Comparison Operators

Operator What does it do?
A < B Checks if A is less than B
A <= B Checks if A is less than or equal to B
A > B Checks if A is greater than B
A >= B Checks if A is greater than or equal to B
A == B Checks if A is equal to B
A != B Checks if A is not equal to B

Watch out: = is assignment, == is comparison. Confusing them is the single most common beginner mistake.

x = 5      # assignment
x == 5     # comparison
True

Examples

x = 14
compare_result = (x <= 13)

compare_result
False
party = "Democrat"
is_dem = (party == "Democrat")

is_dem
True

Type Conversion with bool()

  • bool() converts any value to a boolean

  • True: any non-zero number, any non-empty string

  • False: 0, empty string

print(bool(0), bool(42), bool(""), bool("hello"))
False True False True
  • Booleans can also be converted to int or str

    • True = 1, False = 0.

    • str() will return the boolean as a string
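A sketch of those conversions, plus one trap that follows from the rules above: the string "False" is not empty, so bool() treats it as True. The variable names are illustrative.

```python
# Boolean to int and to str.
true_as_int = int(True)
false_as_int = int(False)
true_as_text = str(True)
print(true_as_int, false_as_int, true_as_text)

# Because True counts as 1, adding booleans counts how many are True.
yes_votes = True + True + False
print(yes_votes)

# The trap: any non-empty string is True, even the text "False".
surprising = bool("False")
print(surprising)
```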

If Statements

Imagine you are headed to work. As you get ready to leave, you check the forecast…

If there is a greater than 50% chance of rain, you take an umbrella. (This logic might explain why I always find myself caught without one).

Else, you just head out the door.

If Statements Graphically

Flowchart of if/else logic: a diamond decision node tests the condition; True branches to action B; False branches to action C.

If the condition is True, do B. Else proceed to C.

Example

chance_of_rain = 0.62

if chance_of_rain > 0.5:
    print("Take an umbrella!")
else:
    print("Leave it at home.")
Take an umbrella!

Notice the syntax.

if condition:

code (indented 4 spaces by convention)

else runs whenever the if condition is False. We’ll do more with else and elif tomorrow.

One Use Case: Taking User Input

  • One way we can use boolean logic is to check whether user input meets some criteria.
  • We can do this with the input function, which prompts a user for input
    • input() always returns a string, but we can convert it to other types if we want.
number = int(input('What is your favorite number?'))
my_fav = 5
if my_fav == number:  ## if statements need a : after the condition
    print('We have the same favorite number')  ## indentation is part of Python's syntax (more tomorrow)

Note - I can’t enter input in markdown slides, so let’s check out how this works in Colab.
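To show why the int() around input() matters, here is a sketch that runs without typing anything. The variable raw stands in for what input() would hand back if the user typed 5; that stand-in is an assumption made so the example can run on its own.

```python
# Stand-in for input(): whatever the user types arrives as a string.
raw = "5"
my_fav = 5

# Comparing a string to an int is never equal, even if they look the same.
match_without_conversion = (raw == my_fav)

# Converting first makes the comparison meaningful.
match_with_conversion = (int(raw) == my_fav)

print(match_without_conversion)
print(match_with_conversion)
```

Python does not raise an error in the first comparison. It quietly returns False, which is exactly the kind of bug that is hard to spot.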

Another Example

How could we write a script that checks whether the user has enough money to make a 20% down payment on a house?

home_cost = int(input('How much does the home cost? '))
available_funds = int(input('How much money do you have immediately available? '))
down_payment = home_cost * 0.2
if available_funds >= down_payment:
    print('You can afford the down payment!')

Note that we can make these conditionals a lot more complex, so that they can handle many different conditions. This will turn out to be very useful. More on this tomorrow!
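As a small preview, the same check can use the else branch from the umbrella example to report a shortfall. This is only a sketch: the dollar amounts are made up and hard-coded in place of input() so the cell runs by itself.

```python
# Hypothetical amounts, hard-coded instead of using input().
home_cost = 300000
available_funds = 45000

down_payment = home_cost * 0.2

if available_funds >= down_payment:
    message = "You can afford the down payment!"
else:
    # How far short are we? Format with commas and no decimals.
    shortfall = down_payment - available_funds
    message = f"You are short by {shortfall:,.0f}."

print(message)
```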

Your Turn

I’m a fan of the 80s/90s alt rock band the Cure, who are probably best known for their hit song “Friday I’m in Love”.

Write a program that asks the user for the day of the week. If they enter Friday, it should print("It's Friday, I'm in Love"), and otherwise it shouldn’t print anything.

Wrapping Up

  • Tomorrow — Loops, More Complex Data Types, Functions

  • Questions: come to office hours, in person or via Zoom — link is on Canvas

  • Reading (recommended, not required): OpenStax Introduction to Python Programming, Chapters 1–3

  • Slides will be posted after each lecture on Canvas and at will-horne.github.io/icpsr-2026