saving scraped data

For our project, remember that we want to scrape information about each bill contained within the bill cards.

And like all good programmers, we broke our task up into a number of steps, some of which we’ve already done in the previous notebook:

  1. isolate the bill_cards data from the rest of the webpage (already done)

  2. pick out the information we want from the bill cards (already done)

  3. process our information into lists

  4. add more data to our lists

  5. save that information to a csv file

Now, we are on step three: processing the elements and saving them into lists. Each of these steps itself contains smaller steps, which we will figure out as we go along.

Before continuing our work, we will import the libraries we need and create our soup object (that holds our website content), and our bill_cards object (which holds our bill card data).

import requests
from bs4 import BeautifulSoup
site = requests.get('https://translegislation.com/bills/2024/US')
html_code = site.content
soup = BeautifulSoup(html_code, 'lxml')
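A note on the parser: `'lxml'` is a separate package from BeautifulSoup, and BeautifulSoup raises `FeatureNotFound` when it is not installed. Installing it (for example with `pip install lxml`) is the usual fix. As a minimal sketch, assuming you would rather fall back to Python's built-in `html.parser` than stop, something like the following could work. The helper name `make_soup` and the sample HTML are made up for illustration:

```python
from bs4 import BeautifulSoup, FeatureNotFound

def make_soup(html_code):
    """Parse with lxml if it is installed, otherwise use the built-in parser."""
    try:
        return BeautifulSoup(html_code, 'lxml')
    except FeatureNotFound:
        # lxml is not installed, so fall back to the parser that ships with Python
        return BeautifulSoup(html_code, 'html.parser')

# a tiny stand-in for the real page content
sample_html = '<div class="card"><h3>US HB1064</h3></div>'
sample_soup = make_soup(sample_html)
print(sample_soup.h3.text)
```

The two parsers can treat badly formed HTML a little differently, so it is worth sticking to one parser for a whole project where possible.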
# to get the element and class for the cards, use the inspector

bill_cards = soup.find_all('div', class_ ='css-4rck61')

Now, we can write our loop that grabs all the elements we want.

# runs the loop on the bill cards

for item in bill_cards[:10]: # only the first ten cards, just to check if it is working
    print(item.h3.text) # title
    print(item.h2.text) # caption
    print(item.span.text) # category
    print(item.p.text) # description (if any)
    print(item.a['href']) # relative link; prepend https://translegislation.com to get the full URL
US HB1064
Ensuring Military Readiness Act of 2023
MILITARY
To provide requirements related to the eligibility of transgender individuals from serving in the Armed Forces.
/bills/2024/US/HB1064
US HB1112
Ensuring Military Readiness Act of 2023
MILITARY
To provide requirements related to the eligibility of individuals who identify as transgender from serving in the Armed Forces.
/bills/2024/US/HB1112
US HB1276
Protect Minors from Medical Malpractice Act of 2023
HEALTHCARE
To protect children from medical malpractice in the form of gender transition procedures.
/bills/2024/US/HB1276
US HB1399
Protect Children’s Innocence Act
HEALTHCARE
To amend chapter 110 of title 18, United States Code, to prohibit gender affirming care on minors, and for other purposes.
/bills/2024/US/HB1399
US HB1490
Preventing Violence Against Female Inmates Act of 2023
INCARCERATION
To secure the dignity and safety of incarcerated women.
/bills/2024/US/HB1490
US HB1585
Prohibiting Parental Secrecy Policies In Schools Act of 2023
EDUCATION
To require a State receiving funds pursuant to title II of the Elementary and Secondary Education Act of 1965 to implement a State policy to prohibit a school employee from conducting certain social gender transition interventions.
/bills/2024/US/HB1585
US HB216
My Child, My Choice Act of 2023
EDUCATION
To prohibit Federal education funds from being provided to elementary schools that do not require teachers to obtain written parental consent prior to teaching lessons specifically related to gender identity, sexual orientation, or transgender studies, and for other purposes.
/bills/2024/US/HB216
US HB3101
TPA Act Traditional Passport Act
OTHER
To prohibit the issuance of a passport with any gender designation other than "male" or "female", and for other purposes.
/bills/2024/US/HB3101
US HB3102
TSA Act Traditional Screening Application Act
OTHER
To prohibit the Transportation Security Administration from using the "X" gender designation in the TSA PreCheck advanced security program, and for other purposes.
/bills/2024/US/HB3102
US HB3328
Protecting Children From Experimentation Act of 2023
HEALTHCARE
To amend chapter 110 of title 18, United States Code, to prohibit gender transition procedures on minors, and for other purposes.
/bills/2024/US/HB3328

step 3: process our information into lists

Now, the next step is to assign a variable for each item. This allows us to save the data to the variable name, and later, to add it to a list.

for item in bill_cards[:10]:
    title = item.h3.text
    caption = item.h2.text
    category = item.find('span').text
    description = item.p.text
    # the href already starts with /bills/..., so we only prepend the site root
    link = 'https://translegislation.com' + item.a['href']
    print(title, caption, category, description, link)
US HB1064 Ensuring Military Readiness Act of 2023 MILITARY To provide requirements related to the eligibility of transgender individuals from serving in the Armed Forces. https://translegislation.com/bills/2024/US/HB1064
US HB1112 Ensuring Military Readiness Act of 2023 MILITARY To provide requirements related to the eligibility of individuals who identify as transgender from serving in the Armed Forces. https://translegislation.com/bills/2024/US/HB1112
US HB1276 Protect Minors from Medical Malpractice Act of 2023 HEALTHCARE To protect children from medical malpractice in the form of gender transition procedures. https://translegislation.com/bills/2024/US/HB1276
US HB1399 Protect Children’s Innocence Act HEALTHCARE To amend chapter 110 of title 18, United States Code, to prohibit gender affirming care on minors, and for other purposes. https://translegislation.com/bills/2024/US/HB1399
US HB1490 Preventing Violence Against Female Inmates Act of 2023 INCARCERATION To secure the dignity and safety of incarcerated women. https://translegislation.com/bills/2024/US/HB1490
US HB1585 Prohibiting Parental Secrecy Policies In Schools Act of 2023 EDUCATION To require a State receiving funds pursuant to title II of the Elementary and Secondary Education Act of 1965 to implement a State policy to prohibit a school employee from conducting certain social gender transition interventions. https://translegislation.com/bills/2024/US/HB1585
US HB216 My Child, My Choice Act of 2023 EDUCATION To prohibit Federal education funds from being provided to elementary schools that do not require teachers to obtain written parental consent prior to teaching lessons specifically related to gender identity, sexual orientation, or transgender studies, and for other purposes. https://translegislation.com/bills/2024/US/HB216
US HB3101 TPA Act Traditional Passport Act OTHER To prohibit the issuance of a passport with any gender designation other than "male" or "female", and for other purposes. https://translegislation.com/bills/2024/US/HB3101
US HB3102 TSA Act Traditional Screening Application Act OTHER To prohibit the Transportation Security Administration from using the "X" gender designation in the TSA PreCheck advanced security program, and for other purposes. https://translegislation.com/bills/2024/US/HB3102
US HB3328 Protecting Children From Experimentation Act of 2023 HEALTHCARE To amend chapter 110 of title 18, United States Code, to prohibit gender transition procedures on minors, and for other purposes. https://translegislation.com/bills/2024/US/HB3328

It works! Now let’s save it to lists.

# a bunch of empty lists where we will dump our data
titles = []
captions = []
categories = []
descriptions = []
links = []

# our for loop that saves each item we want from the bill_cards
for item in bill_cards:
    title = item.h3.text
    category = item.find('span').text
    caption = item.h2.text
    description = item.p.text
    link = 'https://translegislation.com' + item.a['href']
    
    # adding the items to the empty lists
    titles.append(title)
    categories.append(category)
    captions.append(caption)
    descriptions.append(description)
    links.append(link)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[10], line 13
     11 category = item.find('span').text
     12 caption = item.h2.text
---> 13 description = item.p.text
     14 link = 'https://translegislation.com' + item.a['href']
     16 # adding the items to the empty lists

AttributeError: 'NoneType' object has no attribute 'text'

individual challenge:

Google this error and try to understand what it means. Then try out a solution from Stack Overflow, making sure to swap in our variable names.

# a bunch of empty lists where we will dump our data
titles = []
captions = []
categories = []
descriptions = []
links = []

# our for loop that saves each item we want from the bill_cards
for item in bill_cards:
    title = item.h3.text
    category = item.find('span').text
    caption = item.h2.text
    # some cards have no <p> tag, so item.p can be None; check the tag itself
    if item.p is not None:
        description = item.p.text
    else:
        description = 'No bill description'
    link = 'https://translegislation.com' + item.a['href']
    
    # adding the items to the empty lists
    titles.append(title)
    categories.append(category)
    captions.append(caption)
    descriptions.append(description)
    links.append(link)
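
The error above happens because `item.p` is `None` whenever a card has no `<p>` tag, and `None` has no `.text` attribute. One way to keep the loop tidy is to move that check into a small helper. This is only a sketch: `text_or_default` is a hypothetical name, and the two cards below are made-up HTML rather than the real page.

```python
from bs4 import BeautifulSoup

def text_or_default(tag, default='No bill description'):
    """Return the tag's text, or a default when the tag was not found (None)."""
    if tag is None:
        return default
    return tag.text

# two made-up cards: the second one has no <p> description
sample_html = """
<div class="card"><h3>Bill A</h3><p>A short description.</p></div>
<div class="card"><h3>Bill B</h3></div>
"""
sample_cards = BeautifulSoup(sample_html, 'html.parser').find_all('div', class_='card')

sample_descriptions = [text_or_default(card.p) for card in sample_cards]
print(sample_descriptions)
```

Inside our loop, this would let us write `description = text_or_default(item.p)` in place of the `if`/`else` block.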

step 4: adding more data to our lists

Before saving our dataset to a spreadsheet, we are going to do a bit more data gathering. This will enable us to make a more robust dataset at the end. Here, we are going to get the link directly to the bill page on LegiScan.

Like the previous sections, I’m going to use comments to write some pseudo-code that separates out the steps of the larger task. This is good practice for all programmers.

## now, we will get the links to each bill, in the following steps:

## first, for each bill card, add the link "extension" to the site root, saving the results in a list called "urls"
## then, for each URL, make a soup
## then, for each soup, get the links to LegiScan and to the Congress website
## finally, add those links to new lists, called "legiscan_links" and "congress_links"
for item in bill_cards[:10]:
    extension = 'https://translegislation.com/' + item.a['href']
    print(extension)
https://translegislation.com//bills/2024/US/HB1064
https://translegislation.com//bills/2024/US/HB1112
https://translegislation.com//bills/2024/US/HB1276
https://translegislation.com//bills/2024/US/HB1399
https://translegislation.com//bills/2024/US/HB1490
https://translegislation.com//bills/2024/US/HB1585
https://translegislation.com//bills/2024/US/HB216
https://translegislation.com//bills/2024/US/HB3101
https://translegislation.com//bills/2024/US/HB3102
https://translegislation.com//bills/2024/US/HB3328
urls = []
for item in bill_cards:
    extension = 'https://translegislation.com/' + item.a['href']
    urls.append(extension)
urls[:10]
['https://translegislation.com//bills/2024/US/HB1064',
 'https://translegislation.com//bills/2024/US/HB1112',
 'https://translegislation.com//bills/2024/US/HB1276',
 'https://translegislation.com//bills/2024/US/HB1399',
 'https://translegislation.com//bills/2024/US/HB1490',
 'https://translegislation.com//bills/2024/US/HB1585',
 'https://translegislation.com//bills/2024/US/HB216',
 'https://translegislation.com//bills/2024/US/HB3101',
 'https://translegislation.com//bills/2024/US/HB3102',
 'https://translegislation.com//bills/2024/US/HB3328']
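
Notice the double slash after `.com` in the URLs above: the root ends with `/` and each `href` begins with `/`. The site appears to accept these URLs, since the requests below still return the bill pages, but a cleaner option is `urljoin` from Python's standard library, which joins a root and a relative link without doubling the slash. A minimal sketch, with the `href` values copied from the output above:

```python
from urllib.parse import urljoin

root = 'https://translegislation.com/'
sample_hrefs = ['/bills/2024/US/HB1064', '/bills/2024/US/HB1112']

# urljoin resolves each relative link against the root
clean_urls = [urljoin(root, href) for href in sample_hrefs]
print(clean_urls)
```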
# making a soup object of *every* page that is linked
# this may take several seconds

soups = []
for item in urls:
    site = requests.get(item)
    html_code = site.content
    soup = BeautifulSoup(html_code, 'lxml')
    soups.append(soup)
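
Because this loop requests every bill page one after another, it can be kinder to the server (and easier to debug) to pause between requests and to skip pages that fail. Here is one possible sketch. The function name `fetch_pages`, the `fetch` parameter, and the delay value are all assumptions for illustration; in the notebook you could pass a small function that calls `requests.get(url, timeout=10)` and returns `site.content`.

```python
import time

def fetch_pages(urls, fetch, delay=1.0):
    """Call fetch(url) for each URL, pausing between requests.

    Returns a dict mapping each URL to its content, or to None if the fetch failed.
    """
    pages = {}
    for i, url in enumerate(urls):
        if i > 0:
            time.sleep(delay)  # wait before every request except the first
        try:
            pages[url] = fetch(url)
        except Exception as error:
            # keep going, but remember which page failed
            print('could not fetch', url, '-', error)
            pages[url] = None
    return pages

# a stand-in fetcher so the sketch runs without a network connection
def fake_fetch(url):
    if url.endswith('missing'):
        raise ValueError('page not found')
    return '<html>' + url + '</html>'

demo_pages = fetch_pages(['page-one', 'page-missing'], fake_fetch, delay=0)
print(demo_pages)
```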
legiscan_links = []
congress_links = []
for item in soups:
    # we are getting two links here, one to legiscan and one to the congress website
    links = item.find_all('a', class_='chakra-link css-oga2ct')
    anchor1 = links[0]['href'] # link to legiscan
    legiscan_links.append(anchor1)
    anchor2 = links[1]['href'] # link to congress
    congress_links.append(anchor2)
legiscan_links
['https://legiscan.com/US/text/HB1064/id/2737306',
 'https://legiscan.com/US/text/HB1112/id/2742708',
 'https://legiscan.com/US/text/HB1276/id/2755407',
 'https://legiscan.com/US/text/HB1399/id/2796538',
 'https://legiscan.com/US/text/HB1490/id/2761146',
 'https://legiscan.com/US/text/HB1585/id/2763467',
 'https://legiscan.com/US/text/HB216/id/2654610',
 'https://legiscan.com/US/text/HB3101/id/2830677',
 'https://legiscan.com/US/text/HB3102/id/2815463',
 'https://legiscan.com/US/text/HB3328/id/2818358',
 'https://legiscan.com/US/text/HB3329/id/2818922',
 'https://legiscan.com/US/text/HB3462/id/2827206',
 'https://legiscan.com/US/text/HB3887/id/2833147',
 'https://legiscan.com/US/text/HB429/id/2674746',
 'https://legiscan.com/US/text/HB4365/id/2846650',
 'https://legiscan.com/US/text/HB4367/id/2866079',
 'https://legiscan.com/US/text/HB4398/id/2835893',
 'https://legiscan.com/US/text/HB4665/id/2843997',
 'https://legiscan.com/US/text/HB4821/id/2849882',
 'https://legiscan.com/US/text/HB5/id/2761423',
 'https://legiscan.com/US/text/HB5327/id/2839779',
 'https://legiscan.com/US/text/HB5636/id/2842877',
 'https://legiscan.com/US/text/HB5893/id/2847457',
 'https://legiscan.com/US/text/HB5894/id/2847458',
 'https://legiscan.com/US/text/HB6040/id/2847526',
 'https://legiscan.com/US/text/HB6258/id/2852634',
 'https://legiscan.com/US/text/HB6658/id/2866510',
 'https://legiscan.com/US/text/HB6728/id/2866449',
 'https://legiscan.com/US/text/HB7183/id/2930086',
 'https://legiscan.com/US/text/HB7187/id/2930340',
 'https://legiscan.com/US/text/HB734/id/2787793',
 'https://legiscan.com/US/text/HB736/id/3023788',
 'https://legiscan.com/US/text/HB7725/id/2974064',
 'https://legiscan.com/US/text/HB8070/id/3011826',
 'https://legiscan.com/US/text/HB8433/id/3012543',
 'https://legiscan.com/US/text/HB8580/id/3020996',
 'https://legiscan.com/US/text/HB8708/id/3016017',
 'https://legiscan.com/US/text/HB8752/id/3013700',
 'https://legiscan.com/US/text/HB8771/id/3020997',
 'https://legiscan.com/US/text/HB8774/id/3020988',
 'https://legiscan.com/US/text/HB8997/id/3014401',
 'https://legiscan.com/US/text/HB8998/id/3021001',
 'https://legiscan.com/US/text/HB9026/id/3014403',
 'https://legiscan.com/US/text/HB9027/id/3014469',
 'https://legiscan.com/US/text/HB9028/id/3014468',
 'https://legiscan.com/US/text/HB9029/id/3014470',
 'https://legiscan.com/US/text/HB9218/id/3019830',
 'https://legiscan.com/US/text/HB9586/id/3021417',
 'https://legiscan.com/US/text/HB985/id/2727973',
 'https://legiscan.com/US/text/HJR160/id/3003197',
 'https://legiscan.com/US/text/HJR165/id/3015576',
 'https://legiscan.com/US/text/HR115/id/2692544',
 'https://legiscan.com/US/text/HR1223/id/2996410',
 'https://legiscan.com/US/text/HR282/id/2773507',
 'https://legiscan.com/US/text/HR298/id/2786011',
 'https://legiscan.com/US/text/HR518/id/2828339',
 'https://legiscan.com/US/text/HR536/id/2830680',
 'https://legiscan.com/US/text/HR769/id/2852616',
 'https://legiscan.com/US/text/SB1595/id/2819634',
 'https://legiscan.com/US/text/SB1597/id/2819703',
 'https://legiscan.com/US/text/SB1709/id/2827463',
 'https://legiscan.com/US/text/SB187/id/2696929',
 'https://legiscan.com/US/text/SB200/id/2702901',
 'https://legiscan.com/US/text/SB2357/id/2836565',
 'https://legiscan.com/US/text/SB2394/id/2836690',
 'https://legiscan.com/US/text/SB2797/id/2841880',
 'https://legiscan.com/US/text/SB3035/id/2844163',
 'https://legiscan.com/US/text/SB3438/id/2865749',
 'https://legiscan.com/US/text/SB3729/id/2927908',
 'https://legiscan.com/US/text/SB435/id/2727671',
 'https://legiscan.com/US/text/SB457/id/2734132',
 'https://legiscan.com/US/text/SB4638/id/3014404',
 'https://legiscan.com/US/text/SB613/id/2746832',
 'https://legiscan.com/US/text/SB635/id/2752091',
 'https://legiscan.com/US/text/SB752/id/2760328',
 'https://legiscan.com/US/text/SJR90/id/3003899',
 'https://legiscan.com/US/text/SJR96/id/3009679',
 'https://legiscan.com/US/text/SR267/id/2831179',
 'https://legiscan.com/US/text/SR53/id/2696872',
 'https://legiscan.com/US/text/SR669/id/2998369']
congress_links
['https://www.congress.gov/bill/118th-congress/house-bill/1064/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/1112/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/1276/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/1399/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/1490/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/1585/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/216/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/3101/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/3102/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/3328/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/3329/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/3462/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/3887/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/429/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/4365/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/4367/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/4398/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/4665/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/4821/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/5/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/5327/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/5636/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/5893/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/5894/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/6040/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/6258/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/6658/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/6728/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/7183/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/7187/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/734/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/736/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/7725/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/8070/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/8433/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/8580/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/8708/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/8752/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/8771/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/8774/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/8997/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/8998/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/9026/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/9027/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/9028/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/9029/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/9218/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/9586/all-info',
 'https://www.congress.gov/bill/118th-congress/house-bill/985/all-info',
 'https://www.congress.gov/bill/118th-congress/house-joint-resolution/160/all-info',
 'https://www.congress.gov/bill/118th-congress/house-joint-resolution/165/all-info',
 'https://www.congress.gov/bill/118th-congress/house-resolution/115/all-info',
 'https://www.congress.gov/bill/118th-congress/house-resolution/1223/all-info',
 'https://www.congress.gov/bill/118th-congress/house-resolution/282/all-info',
 'https://www.congress.gov/bill/118th-congress/house-resolution/298/all-info',
 'https://www.congress.gov/bill/118th-congress/house-resolution/518/all-info',
 'https://www.congress.gov/bill/118th-congress/house-resolution/536/all-info',
 'https://www.congress.gov/bill/118th-congress/house-resolution/769/all-info',
 'https://www.congress.gov/bill/118th-congress/senate-bill/1595/all-info',
 'https://www.congress.gov/bill/118th-congress/senate-bill/1597/all-info',
 'https://www.congress.gov/bill/118th-congress/senate-bill/1709/all-info',
 'https://www.congress.gov/bill/118th-congress/senate-bill/187/all-info',
 'https://www.congress.gov/bill/118th-congress/senate-bill/200/all-info',
 'https://www.congress.gov/bill/118th-congress/senate-bill/2357/all-info',
 'https://www.congress.gov/bill/118th-congress/senate-bill/2394/all-info',
 'https://www.congress.gov/bill/118th-congress/senate-bill/2797/all-info',
 'https://www.congress.gov/bill/118th-congress/senate-bill/3035/all-info',
 'https://www.congress.gov/bill/118th-congress/senate-bill/3438/all-info',
 'https://www.congress.gov/bill/118th-congress/senate-bill/3729/all-info',
 'https://www.congress.gov/bill/118th-congress/senate-bill/435/all-info',
 'https://www.congress.gov/bill/118th-congress/senate-bill/457/all-info',
 'https://www.congress.gov/bill/118th-congress/senate-bill/4638/all-info',
 'https://www.congress.gov/bill/118th-congress/senate-bill/613/all-info',
 'https://www.congress.gov/bill/118th-congress/senate-bill/635/all-info',
 'https://www.congress.gov/bill/118th-congress/senate-bill/752/all-info',
 'https://www.congress.gov/bill/118th-congress/senate-joint-resolution/90/all-info',
 'https://www.congress.gov/bill/118th-congress/senate-joint-resolution/96/all-info',
 'https://www.congress.gov/bill/118th-congress/senate-resolution/267/all-info',
 'https://www.congress.gov/bill/118th-congress/senate-resolution/53/all-info',
 'https://www.congress.gov/bill/118th-congress/senate-resolution/669/all-info']
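
The loop above relies on the LegiScan link always coming first and the Congress link second. If a page ever listed them in a different order, or left one out, `links[0]` and `links[1]` would grab the wrong link or raise an `IndexError`. A more defensive sketch picks each link by its domain instead. The helper `pick_link` is hypothetical, and the `href` values are copied from the output above:

```python
from urllib.parse import urlparse

def pick_link(hrefs, domain):
    """Return the first href whose host is the given domain (or a subdomain), else None."""
    for href in hrefs:
        host = urlparse(href).netloc
        if host == domain or host.endswith('.' + domain):
            return href
    return None

# deliberately listed in the "wrong" order to show that order no longer matters
sample_hrefs = [
    'https://www.congress.gov/bill/118th-congress/house-bill/1064/all-info',
    'https://legiscan.com/US/text/HB1064/id/2737306',
]

print(pick_link(sample_hrefs, 'legiscan.com'))
print(pick_link(sample_hrefs, 'congress.gov'))
print(pick_link(sample_hrefs, 'example.org'))  # no match, so None
```

In our loop, the list of hrefs could be built with `[a['href'] for a in links]` before calling the helper.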

step 5: saving our data to a CSV

This is the final step. First, we will import two libraries for working with tabular data: pandas and csv.

Then, we will add each of our lists into the “DataFrame” (the pandas term for a tabular type of object), where they will appear as separate columns. Finally, we will save our DataFrame as a .csv file.
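
Since each list becomes a column, every list needs the same number of items; otherwise pandas raises a `ValueError` when building the DataFrame. A quick check before that step can save some head-scratching. This is a sketch with a hypothetical helper, `check_lengths`, and made-up sample lists:

```python
def check_lengths(columns):
    """Return True if every list in the dict has the same length."""
    lengths = {name: len(values) for name, values in columns.items()}
    print(lengths)  # shows which list, if any, is the odd one out
    return len(set(lengths.values())) <= 1

sample_columns = {
    'title': ['US HB1064', 'US HB1112'],
    'category': ['MILITARY', 'MILITARY'],
    'legiscan': ['https://legiscan.com/US/text/HB1064/id/2737306'],  # one link missing
}

print(check_lengths(sample_columns))
```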

# importing the necessary libraries

import pandas as pd
import csv
# creating empty lists to hold all of our data

titles = []
captions = []
categories = []
descriptions = []

# extracting the data from the bill cards
for item in bill_cards:
    title = item.h3.text
    category = item.find('span').text
    caption = item.h2.text
    # some cards have no <p> tag, so item.p can be None; check the tag itself
    if item.p is not None:
        description = item.p.text
    else:
        description = 'No bill description'
    
    # adding the items to the empty lists
    titles.append(title)
    categories.append(category)
    captions.append(caption)
    descriptions.append(description)
    # remember that "legiscan_links" is already saved as a list, so we don't have to create it here
# creating a dataframe, with separate columns to hold each of our lists
df = pd.DataFrame(
    {'title': titles,
     'caption': captions,
     'category': categories,
     'description': descriptions,
     'url': urls,
     'legiscan': legiscan_links,
     'congress': congress_links
    })
# checking the first 5 lines of the dataframe

df.head()
|   | title | caption | category | description | url | legiscan | congress |
|---|---|---|---|---|---|---|---|
| 0 | US HB1064 | Ensuring Military Readiness Act of 2023 | MILITARY | To provide requirements related to the eligibi... | https://translegislation.com//bills/2024/US/HB... | https://legiscan.com/US/text/HB1064/id/2737306 | https://www.congress.gov/bill/118th-congress/h... |
| 1 | US HB1112 | Ensuring Military Readiness Act of 2023 | MILITARY | To provide requirements related to the eligibi... | https://translegislation.com//bills/2024/US/HB... | https://legiscan.com/US/text/HB1112/id/2742708 | https://www.congress.gov/bill/118th-congress/h... |
| 2 | US HB1276 | Protect Minors from Medical Malpractice Act of... | HEALTHCARE | To protect children from medical malpractice i... | https://translegislation.com//bills/2024/US/HB... | https://legiscan.com/US/text/HB1276/id/2755407 | https://www.congress.gov/bill/118th-congress/h... |
| 3 | US HB1399 | Protect Children’s Innocence Act | HEALTHCARE | To amend chapter 110 of title 18, United State... | https://translegislation.com//bills/2024/US/HB... | https://legiscan.com/US/text/HB1399/id/2796538 | https://www.congress.gov/bill/118th-congress/h... |
| 4 | US HB1490 | Preventing Violence Against Female Inmates Act... | INCARCERATION | To secure the dignity and safety of incarcerat... | https://translegislation.com//bills/2024/US/HB... | https://legiscan.com/US/text/HB1490/id/2761146 | https://www.congress.gov/bill/118th-congress/h... |
# saving the dataframe as a csv file

df.to_csv('bill_data.csv')
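
We imported `csv` above but only used pandas to write the file. For comparison, here is a minimal sketch of saving columns with the `csv` module alone, using `zip` to turn the lists into rows. The file name `bill_data_sample.csv` and the shortened lists are placeholders:

```python
import csv

sample_titles = ['US HB1064', 'US HB1112']
sample_categories = ['MILITARY', 'MILITARY']

# newline='' prevents blank lines between rows on Windows
with open('bill_data_sample.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.writer(f)
    writer.writerow(['title', 'category'])                   # header row
    writer.writerows(zip(sample_titles, sample_categories))  # one row per bill

# read the file back to confirm what was written
with open('bill_data_sample.csv', newline='', encoding='utf-8') as f:
    rows = list(csv.reader(f))
print(rows)
```

With pandas, `df.to_csv('bill_data.csv', index=False)` leaves out the extra column of row numbers that `to_csv` writes by default.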

And that’s all! If you are on Google Colab, check your sidebar under the “files” tab. You should see a .csv file containing the data we’ve scraped from the translegislation.com website. Well done!

In the next section, we will look at an API method for getting legislative data, and save that data to a CSV file. In that activity, you’ll see the differences in handling data between web scraping and API methods.