loops

Loops are probably one of the more difficult concepts for new programmers, but once you have used them a bit in practice, they become easier to grasp.

Loops are a way of working with collections of data (like lists) by isolating each item, one at a time. A loop goes through each item in a collection, such as a list of breakfast items, and performs a specific action on it, such as displaying it.

breakfast = ['egg sandwich', 'coffee', 'biscuits and gravy', 'strawberries', 'omelette']

for item in breakfast:
    print(item)
egg sandwich
coffee
biscuits and gravy
strawberries
omelette

Here, we’ve also introduced a new function, print(). This function will “print,” or display, whatever data you pass into the parentheses.

Some historical context for the name of the print() function: it comes from a time before screens, when computers literally printed the output of their computations.
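
As a small sketch of how print() handles different kinds of data, the example below passes it a string, a number, and a whole list. The variable names and values are made up for illustration and are not part of the workshop data.

```python
# Illustrative values only; none of these come from the workshop data.
drink = 'coffee'
cups = 2
order = ['coffee', 'strawberries']

print(drink)   # displays a string
print(cups)    # displays a number
print(order)   # displays a whole list, brackets and all

# print() only displays data; the value it hands back is None.
result = print(drink)
```

The last line is there only to show that print() is for displaying things, not for saving them.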

Loops work with strings as well. We might print each character (or letter) from a string like “hello”:

for letter in 'hello':
    print(letter)
h
e
l
l
o

In its simplest form, a loop takes two lines. The first line names a variable that stands for one item at a time from a collection, such as a list, and ends with a colon. You can think of the first line as saying something like “for (each) item (inside this) breakfast (list).” The second line, which must be indented, specifies the action done to that item, in this case, the instruction to print it.

for item in breakfast:
    print(item)
egg sandwich
coffee
biscuits and gravy
strawberries
omelette

In plain English, the loop reads:

for each item inside breakfast:
    display that item
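
The indented part of a loop can also hold more than one line. The sketch below assumes a hypothetical counter variable, count, which is not part of the original example; it shows that every indented line runs once per item, while a line that is not indented runs once, after the loop has finished.

```python
breakfast = ['egg sandwich', 'coffee', 'biscuits and gravy', 'strawberries', 'omelette']

count = 0  # hypothetical counter, added only for illustration

for item in breakfast:
    # Both indented lines belong to the loop and run once per item.
    print(item)
    count = count + 1

# This line is not indented, so it runs once, after the loop ends.
print(count)  # 5
```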

a note on variable names

The variable that stands for each individual item (item) can be given any name. That’s because Python assigns it on the fly, as the loop is being executed. This differs from other variables, which must be assigned before they are used. The collection you loop over, for instance, has to exist already, or Python raises a NameError because that variable hasn’t been defined.

for x in y:
    print(x)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[4], line 1
----> 1 for x in y:
      2     print(x)

NameError: name 'y' is not defined

For example, we could name the individual item x, or any other letter, and the loop would run just fine. However, it’s helpful to choose names that are as meaningful as possible, so that the code indicates what data each variable contains. This is especially useful for other people reading your code. The idea is to be semantically meaningful while keeping your code as concise as possible.
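
A minimal sketch of both points, assuming a shortened two-item list for brevity: once y is defined before the loop, the earlier error goes away, and the same loop reads more clearly with descriptive names.

```python
# 'y' must exist before the loop; 'x' is created by the loop itself.
y = ['egg sandwich', 'coffee']
for x in y:
    print(x)

# The same loop, with names that say what the data is.
breakfast = ['egg sandwich', 'coffee']
for item in breakfast:
    print(item)
```

Both loops display the same two items; only the readability differs.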

When using variables in loops, a common shorthand is the letter i, standing in for an item from a list.

for i in breakfast:
    print(i)

string methods

Loops are a great way to do things en masse to data, such as to the individual words within a text, that is, to string data contained within a list.

For example, you might want to lowercase all of the letters from our dataset to prepare it for analysis (explained in further detail in the “Text Cleaning” workshop). To do that, use the lower() method.

'HELLO'.lower()
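
As a small sketch of what lower() gives back (the variable names greeting and lowered are hypothetical), note that the method returns a new, lowercased string and leaves the original untouched:

```python
greeting = 'HELLO'          # hypothetical variable name
lowered = greeting.lower()  # lower() returns a new string

print(lowered)   # hello
print(greeting)  # HELLO (the original is unchanged)
```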

Now, let’s combine this string method with what we know about loops. Let’s try it on a list of items with varying levels of capitalization.

cities = ['Lisbon', 'Setubal', 'Alcacer do Sal', 'Lagos']
for city in cities:
    print(city.lower())

Here, the loop goes through each item in the cities list, lowercases it, and prints the result (remember the print() function?). The lower() method is attached to the city variable, which is assigned to each item, or city name, in the list in turn.
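
It may help to check what this loop leaves behind. In the minimal sketch below, which reuses the same cities list, lower() produces a lowercased copy for printing, and the list itself stays as it was. To keep the lowercased versions, they have to be saved somewhere.

```python
cities = ['Lisbon', 'Setubal', 'Alcacer do Sal', 'Lagos']

for city in cities:
    print(city.lower())  # displays a lowercased copy

# The list still holds the original, capitalized names.
print(cities)
```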

Let’s try to do the same with a longer text, like the “Feminist Data Manifest-NO”. First, though, we need to split our text, which is currently one long string, into individual strings within a list. For that, we use the string method split(), which we saw briefly above when we talked about list methods.

text = ''' 
  1. We refuse to operate under the assumption that risk and harm
  associated with data practices can be bounded to  mean the same
  thing for everyone, everywhere, at every time. We commit to
  acknowledging how historical and systemic patterns of violence 
  and exploitation produce differential vulnerabilities for 
  communities.
  2. We refuse to be disciplined by data, devices, and practices 
  that seek to shape and normalize racialized, gendered, and 
  differently-abled bodies in ways that make us available to be 
  tracked, monitored, and surveilled. We commit to taking back 
  control over the ways we behave, live, and engage with data and 
  its technologies.
  3. We refuse the use of data about people in perpetuity. We 
  commit to embracing agency and working with intentionality, 
  preparing bodies or corpuses of data to be laid to rest when they 
  are not being used in service to the people about whom they were 
  created.
  4. We refuse to understand data as disembodied and thereby 
  dehumanized and departicularized. We commit to understanding 
  data as always and variously attached to bodies; we vow to 
  interrogate the biopolitical implications of data with a keen 
  eye to gender, race, sexuality, class, disability, nationality, 
  and other forms of embodied difference.
  5. We refuse any code of phony “ethics” and false proclamations of 
  transparency that are wielded as cover, as tools of power, as forms 
  for escape that let the people who create systems off the hook from 
  accountability or responsibility. We commit to a feminist data 
  ethics that explicitly seeks equity and demands justice by helping 
  us understand and shift how power works.'''

words = text.split()

# print out just the first 10 words
words[:10]

Our text is now a list of strings, saved to the variable “words.” Now, we can do things programmatically to each string in the list, like lowercase the items.
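
To see how split() behaves on a smaller scale, here is a sketch using a made-up two-line sample (the names sample and sample_words are hypothetical). Called with no arguments, split() breaks the string at any run of whitespace, including line breaks, and leaves punctuation attached to the neighboring word.

```python
# A short, made-up sample with a line break and extra spaces.
sample = '''We refuse
  to operate under   the assumption.'''

sample_words = sample.split()  # no argument: split on any whitespace

print(sample_words)
# ['We', 'refuse', 'to', 'operate', 'under', 'the', 'assumption.']
```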

words_lower = []
for i in words:
    words_lower.append(i.lower())

words_lower

This loop goes through each item in the list of words, lowercases it, then adds that word to a new list, called words_lower.

Notice here that I’ve introduced a few new things:

  1. Above the loop, I’ve created an empty list called words_lower, where we will eventually drop our lowercased words.

  2. The first line of the loop uses the i variable to stand in for each “item” in the list, a shorthand you will often see.

  3. The last line of the loop uses a list method to add data to our empty list. Notice that the list method append() is attached to a list object, while the string method lower() is attached to a string object.
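
The three points above can be sketched on a much shorter input. The sample string and the names sample, sample_lower, and word are made up for illustration; the pattern is the same as in the longer example.

```python
sample = 'We Refuse To Be Disciplined By Data'  # made-up capitalization

sample_lower = []                      # 1. start with an empty list
for word in sample.split():            # 2. take one word at a time
    sample_lower.append(word.lower())  # 3. lowercase it, then append it

print(sample_lower)
# ['we', 'refuse', 'to', 'be', 'disciplined', 'by', 'data']
```

Here the loop variable is called word rather than i, which is one way of making the code say what it is working with.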