Today, we will look at how we iterate through data structures. Consider the following list:
heartrates = [68, 54, 72, 66, 90, 102, 49]
heartrates
[68, 54, 72, 66, 90, 102, 49]
If we want to compute some simple calculations on this list, we have some built-in functions to do that, such as sum(), max(), and so forth.
But what is going on behind the scenes in these functions? They all have to iterate through each element of the List. For sum(), the function has to iterate through each element of the list to add up the total sum. For max(), the function has to iterate through each element of the list and check whether the current value is bigger than the largest value it has seen so far.
When we use a function on a data structure, we are almost always iterating through the elements of the data structure, and doing something with the elements as we go. So far in our journey in learning Python, we have left the iteration to the built-in functions, which are optimized for performance. But this course will be focused on building custom functions that require us to iterate through the data on our own. This process will give you a huge amount of flexibility and power in what you can do with your data!
It turns out that we can iterate over many types of data structures in Python. These data structures are considered “Iterable”:
List
Tuple
String (yes, actually!)
Series (but not recommended)
DataFrame
Ranges
Dictionary
When a data structure is considered iterable, there are a few things you can do with it:
Access elements or a subset of the data structure via the bracket [ ] operator.
Use the in, not in statements to check for presence of an element in the data structure.
Examine the length via len().
Iterate through the data structure via a For-Loop.
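As a quick refresher, here is a minimal sketch of the first three actions applied to the heartrates list. The variable names (first, first_three, has_72, missing_100, n) are made up for this illustration.

```python
heartrates = [68, 54, 72, 66, 90, 102, 49]

first = heartrates[0]               # bracket operator: a single element
first_three = heartrates[0:3]       # bracket operator: a subset (slice)
has_72 = 72 in heartrates           # "in" checks for presence
missing_100 = 100 not in heartrates # "not in" checks for absence
n = len(heartrates)                 # number of elements

print(first, first_three, has_72, missing_100, n)
```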
You have seen examples of the first three actions already. Let’s see how we can iterate through all of these iterable data structures via the For-Loop.
A “For-Loop” allows you to iterate over an iterable data structure, and execute a block of code once for each iteration. Here is what the syntax looks like:
for <variable> in <iterable>:
    block of code
The following code will iterate through each element of the list heartrates and print out each element:
heartrates = [68, 54, 72, 66, 90, 102]
for rate in heartrates:
    print("Current heartrate:", rate)
Current heartrate: 68
Current heartrate: 54
Current heartrate: 72
Current heartrate: 66
Current heartrate: 90
Current heartrate: 102
Here is what the Python interpreter is doing:
heartrates is assigned as a list.
rate is assigned to the next element of heartrates. If it is the first time, rate is assigned as the first element of heartrates.
rate is printed.
The interpreter loops back to the second step until it reaches the end of heartrates.
Now you see why it’s called a “For-Loop”: for an element of the iterable data structure, do the “block of code”, and loop back to the top for the next element. You can have multiple lines of code in the indented section for the block of code.
The following code will add up all the elements of a list:
heartrates = [68, 54, 72, 66, 90, 102, 49]
total = 0

for rate in heartrates:
    total = total + rate
    print("Current total:", total)

print("Final total:", total)
Current total: 68
Current total: 122
Current total: 194
Current total: 260
Current total: 350
Current total: 452
Current total: 501
Final total: 501
We just reconstructed the sum() function!
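In the same spirit, max() can be reconstructed with a For-Loop. The following is only a sketch of the idea described earlier (keep track of the biggest value seen so far), not how Python's built-in is actually implemented. It assumes you are comfortable with a basic if statement, and the variable name largest is made up for this example.

```python
heartrates = [68, 54, 72, 66, 90, 102, 49]

# Start by treating the first element as the largest value seen so far.
largest = heartrates[0]

for rate in heartrates:
    # Replace the running maximum whenever a bigger value is encountered.
    if rate > largest:
        largest = rate

print("Largest:", largest)
```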
Sometimes you want to modify each element of an iterable data structure. However, if you modify the variable that is changing in the For-Loop, it won’t change the original value in the data structure.
import math
print("Before:", heartrates)
for rate in heartrates:
    rate = math.log(rate)

print("After:", heartrates)
Before: [68, 54, 72, 66, 90, 102, 49]
After: [68, 54, 72, 66, 90, 102, 49]
The code rate = math.log(rate) changes the value of rate, but rate is no longer connected to heartrates after the reassignment. Instead, we need to change heartrates[index], where index is an integer that goes through all the indices of heartrates.
We can do this with the enumerate() function:
heartrates = [68, 54, 72, 66, 90, 102, 49]
print("Before:", heartrates)
for index, value in enumerate(heartrates):
    print("Index:", index, " value:", value)
    heartrates[index] = math.log(value)
    #heartrates[index] = math.log(heartrates[index]) #this is okay also.

print("After:", heartrates)
Before: [68, 54, 72, 66, 90, 102, 49]
Index: 0 value: 68
Index: 1 value: 54
Index: 2 value: 72
Index: 3 value: 66
Index: 4 value: 90
Index: 5 value: 102
Index: 6 value: 49
After: [4.219507705176107, 3.9889840465642745, 4.276666119016055, 4.189654742026425, 4.499809670330265, 4.624972813284271, 3.8918202981106265]
What’s going on here? The enumerate() function returns something that resembles a list of tuples for us to iterate through, where the first element of the tuple is the iteration index, and the second element of the tuple is the iteration element. We unpack this tuple through the short-hand index, value at the start of the For-Loop. Let’s see what enumerate() looks like:
print(list(enumerate(heartrates)))
[(0, 4.219507705176107), (1, 3.9889840465642745), (2, 4.276666119016055), (3, 4.189654742026425), (4, 4.499809670330265), (5, 4.624972813284271), (6, 3.8918202981106265)]
You can loop through a Tuple just like you did with a List, but remember that you can’t modify it!
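Here is a small sketch of both points. The tuple heartrates_tuple and the flag raised are made up for this illustration, and the try/except block is only there so the example keeps running after the failed assignment.

```python
heartrates_tuple = (68, 54, 72)  # hypothetical tuple for illustration

# Looping works exactly as it does for a list.
for rate in heartrates_tuple:
    print("Current heartrate:", rate)

# Assigning to an element fails, because tuples are immutable.
raised = False
try:
    heartrates_tuple[0] = 70
except TypeError as error:
    raised = True
    print("Cannot modify a tuple:", error)
```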
You can loop through a String by iterating on each letter within the String.
message = "I am hungry"
for text in message:
    print(text)
I
a
m
h
u
n
g
r
y
However, Strings are immutable, similar to Tuples. So if you iterate via enumerate(), you won’t be able to modify the original String.
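One common workaround, sketched below, is to build up a new String inside the loop instead of changing the original. The variable name shouted is made up for this example.

```python
message = "I am hungry"

shouted = ""
for index, letter in enumerate(message):
    # message[index] = letter.upper() would raise a TypeError,
    # so we append each modified letter to a new string instead.
    shouted = shouted + letter.upper()

print(message)   # the original string is unchanged
print(shouted)
```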
When you loop through a Dictionary, you loop through the Keys of the Dictionary:
sentiment = {'happy': 8, 'sad': 2, 'joy': 7.5, 'embarrassed': 3.6, 'restless': 4.1, 'apathetic': 3.8, 'calm': 7}
for key in sentiment:
    print("key:", key)
key: happy
key: sad
key: joy
key: embarrassed
key: restless
key: apathetic
key: calm
The .items() method for Dictionary is similar to the enumerate() function: it returns something that resembles a list of tuples, and within each tuple the first element is a key, and the second element is a value.
sentiment.items()
dict_items([('happy', 8), ('sad', 2), ('joy', 7.5), ('embarrassed', 3.6), ('restless', 4.1), ('apathetic', 3.8), ('calm', 7)])
for key, value in sentiment.items():
    print(key, "corresponds to ", value)
happy corresponds to 8
sad corresponds to 2
joy corresponds to 7.5
embarrassed corresponds to 3.6
restless corresponds to 4.1
apathetic corresponds to 3.8
calm corresponds to 7
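Unlike Tuples and Strings, a Dictionary can be modified, so the keys can play the same role that the index played for Lists. The sketch below doubles every value in place. The doubling itself is an arbitrary operation chosen for illustration.

```python
sentiment = {'happy': 8, 'sad': 2, 'joy': 7.5, 'embarrassed': 3.6,
             'restless': 4.1, 'apathetic': 3.8, 'calm': 7}

for key in sentiment:
    # Reassigning the value of an existing key is safe while looping,
    # because the set of keys itself does not change.
    sentiment[key] = sentiment[key] * 2

print(sentiment)
```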
Ranges are a collection of sequential numbers, such as:
1, 2, 3, 4, 5
1, 3, 5
10, 15, 20, 25, 30
It seems natural to treat Ranges as Lists, but the neat thing about them is that only the bare minimum information is stored: the start, end, and step size. This could be a huge reduction in memory…if you need a sequence of numbers between 1 and 1 million, you can either store all 1 million values in a list, or you can just have a Range that holds the start: 1, the end: 1 million, and the step size: 1. That’s a big difference!
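To get a rough feel for the difference, the sketch below compares the two with sys.getsizeof(). Treat the numbers as indicative only: the exact sizes depend on your Python version, and for the list sys.getsizeof() reports only the list container, not the integer objects it points to, so the true gap is even larger.

```python
import sys

# The numbers 1 through 1 million, stored two different ways.
numbers_list = list(range(1, 1_000_001))
numbers_range = range(1, 1_000_001)

list_size = sys.getsizeof(numbers_list)
range_size = sys.getsizeof(numbers_range)

print("List container size in bytes:", list_size)
print("Range size in bytes:", range_size)
```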
You can create a Range in the following ways:
range(stop), which starts at 0 and ends at stop - 1.
range(start, stop), which starts at start and ends at stop - 1.
range(start, stop, step), which starts at start, increases by a step size of step, and ends at the last value before stop.
When you create a Range object and display it, it just shows you the input values you gave it.
range(5, 50, 5)
range(5, 50, 5)
Convert to a list to see its actual values:
list(range(5, 50, 5))
[5, 10, 15, 20, 25, 30, 35, 40, 45]
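As a sketch, the three example sequences listed at the start of this section can each be written as a Range. Note that stop is never included, so it has to be set just past the last value you want. The variable names here are made up for this example.

```python
one_to_five = list(range(1, 6))          # 1, 2, 3, 4, 5
odd_numbers = list(range(1, 6, 2))       # 1, 3, 5
tens_by_fives = list(range(10, 31, 5))   # 10, 15, 20, 25, 30

print(one_to_five)
print(odd_numbers)
print(tens_by_fives)
```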
To use Ranges in a For-Loop, it’s straightforward:
for i in range(5, 50, 5):
    print(i)
5
10
15
20
25
30
35
40
45
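Ranges also offer another way to modify a List in place. Combining range() with len() produces every valid index of the list, which can then be used with the bracket operator. This is a sketch of an alternative to the enumerate() approach shown earlier, not a recommendation of one over the other.

```python
import math

heartrates = [68, 54, 72, 66, 90, 102, 49]

# range(len(heartrates)) yields the indices 0, 1, ..., 6.
for index in range(len(heartrates)):
    heartrates[index] = math.log(heartrates[index])

print(heartrates)
```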
Exercise for week 2 can be found here.