{ "cells": [ { "cell_type": "markdown", "id": "cd70a97c", "metadata": {}, "source": [ "# dataframes" ] }, { "cell_type": "code", "execution_count": 46, "id": "b44d1d15", "metadata": {}, "outputs": [], "source": [ "# import the pandas library for working with tabular (spreadsheet) data\n", "\n", "import pandas as pd" ] }, { "cell_type": "markdown", "id": "15cdb89d", "metadata": {}, "source": [ "We will first work with our data in the form of a dataframe. A dataframe is a two-dimensional data structure that holds data in a table with rows and columns. It behaves much like a dictionary (`dict`): each column name acts as a key, and the values in that column can be looked up with that key." ] }, { "cell_type": "code", "execution_count": 47, "id": "f6256daa", "metadata": {}, "outputs": [], "source": [ "# pulling up a CSV of search results from congress.gov \n", "# if this code doesn't work, download the CSV from the link \n", "# below, and upload it to your Google Colab space\n", "# https://bit.ly/congress_csv\n", "\n", "df = pd.read_csv('https://bit.ly/transgender_raw_data')" ] }, { "cell_type": "code", "execution_count": 48, "id": "c5a0ff81", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Legislation Number URL \\\n", "0 H.R. 1112 https://www.congress.gov/bill/118th-congress/h... \n", "1 S. 435 https://www.congress.gov/bill/118th-congress/s... \n", "2 H.Res. 886 https://www.congress.gov/bill/118th-congress/h... \n", "3 S.Res. 464 https://www.congress.gov/bill/118th-congress/s... \n", "4 H.Res. 269 https://www.congress.gov/bill/118th-congress/h... \n", ".. ... ... \n", "281 NaN NaN \n", "282 NaN NaN \n", "283 NaN NaN \n", "284 NaN NaN \n", "285 NaN NaN \n", "\n", " Congress \\\n", "0 118th Congress (2023-2024) \n", "1 118th Congress (2023-2024) \n", "2 118th Congress (2023-2024) \n", "3 118th Congress (2023-2024) \n", "4 118th Congress (2023-2024) \n", ".. ... \n", "281 NaN \n", "282 NaN \n", "283 NaN \n", "284 NaN \n", "285 NaN \n", "\n", " Title \\\n", "0 Ensuring Military Readiness Act of 2023 \n", "1 Ensuring Military Readiness Act of 2023 \n", "2 Supporting the goals and principles of Transge... \n", "3 A resolution supporting the goals and principl... \n", "4 Recognizing that it is the duty of the Federal... \n", ".. ... \n", "281 NaN \n", "282 NaN \n", "283 NaN \n", "284 NaN \n", "285 NaN \n", "\n", " Sponsor Party of Sponsor Date of Introduction \\\n", "0 Banks, Jim [Rep.-R-IN-3] Republican 2/21/23 \n", "1 Rubio, Marco [Sen.-R-FL] Republican 2/15/23 \n", "2 Jayapal, Pramila [Rep.-D-WA-7] Democratic 11/21/23 \n", "3 Hirono, Mazie K. [Sen.-D-HI] Democratic 11/15/23 \n", "4 Jayapal, Pramila [Rep.-D-WA-7] Democratic 3/30/23 \n", ".. ... ... ... \n", "281 NaN NaN NaN \n", "282 NaN NaN NaN \n", "283 NaN NaN NaN \n", "284 NaN NaN NaN \n", "285 NaN NaN NaN \n", "\n", " Committees \\\n", "0 House - Armed Services \n", "1 Senate - Armed Services \n", "2 House - Judiciary \n", "3 Senate - Judiciary \n", "4 House - Judiciary, Education and the Workforce... \n", ".. ... 
\n", "281 NaN \n", "282 NaN \n", "283 NaN \n", "284 NaN \n", "285 NaN \n", "\n", " Latest Action Latest Action Date \\\n", "0 Referred to the House Committee on Armed Servi... 2/21/23 \n", "1 Read twice and referred to the Committee on Ar... 2/15/23 \n", "2 Referred to the House Committee on the Judiciary. 11/21/23 \n", "3 Star Print ordered on resolution. 12/4/23 \n", "4 Sponsor introductory remarks on measure. (CR H... 4/19/23 \n", ".. ... ... \n", "281 NaN NaN \n", "282 NaN NaN \n", "283 NaN NaN \n", "284 NaN NaN \n", "285 NaN NaN \n", "\n", " ... Related Bill.211 Related Bill.212 Related Bill.213 \\\n", "0 ... NaN NaN NaN \n", "1 ... NaN NaN NaN \n", "2 ... NaN NaN NaN \n", "3 ... NaN NaN NaN \n", "4 ... NaN NaN NaN \n", ".. ... ... ... ... \n", "281 ... NaN NaN NaN \n", "282 ... NaN NaN NaN \n", "283 ... NaN NaN NaN \n", "284 ... NaN NaN NaN \n", "285 ... NaN NaN NaN \n", "\n", " Latest Summary Amends Bill \\\n", "0 <p><b>Ensuring Military Readiness Act of 2023... NaN \n", "1 <p><b>Ensuring Military Readiness Act of 2023... NaN \n", "2 <p>This resolution expresses support for the ... NaN \n", "3 <p>This resolution expresses support for the ... NaN \n", "4 <p>This resolution expresses support for impl... NaN \n", ".. ... ... \n", "281 NaN NaN \n", "282 NaN NaN \n", "283 NaN NaN \n", "284 NaN NaN \n", "285 NaN NaN \n", "\n", " Date Offered Date Submitted Date Proposed \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", ".. ... ... ... \n", "281 NaN NaN NaN \n", "282 NaN NaN NaN \n", "283 NaN NaN NaN \n", "284 NaN NaN NaN \n", "285 NaN NaN NaN \n", "\n", " Amendment Text (Latest) Amends Amendment \n", "0 NaN NaN \n", "1 NaN NaN \n", "2 NaN NaN \n", "3 NaN NaN \n", "4 NaN NaN \n", ".. ... ... \n", "281 I welcome that discussion.\\n Mr. Chair, I yie... NaN \n", "282 biotechnology equipment or service produced or... NaN \n", "283 United States assistance has \\n been provi... NaN \n", "284 Director of National Intelligence, shall submi... NaN \n", "285 demanded.\\n A recorded vote was ordered.\\n T... NaN \n", "\n", "[286 rows x 650 columns]" ] }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# checking out the df object\n", "\n", "df" ] }, { "cell_type": "markdown", "id": "2c568ff2", "metadata": {}, "source": [ "## What is a `dataframe` object?\n", "- tabular format\n", "- dictionary structure, with a particular syntax for accessing elements\n", " - key-value pairs\n", " - df['column']" ] }, { "cell_type": "code", "execution_count": 49, "id": "27b9f938", "metadata": {}, "outputs": [], "source": [ "filipa = {\n", " 'name': ['filipa', 'da gama', 'calado'],\n", " 'age': 34,\n", " 'degree': 'literature',\n", " 'job': 'digital scholarship specialist',\n", "}" ] }, { "cell_type": "code", "execution_count": 50, "id": "57ef2713", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "dict" ] }, "execution_count": 50, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(filipa)" ] }, { "cell_type": "code", "execution_count": 51, "id": "6af4ac80", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['filipa', 
'da gama', 'calado']" ] }, "execution_count": 51, "metadata": {}, "output_type": "execute_result" } ], "source": [ "filipa['name']" ] }, { "cell_type": "code", "execution_count": 52, "id": "ef51f022", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "34" ] }, "execution_count": 52, "metadata": {}, "output_type": "execute_result" } ], "source": [ "filipa['age']" ] }, { "cell_type": "code", "execution_count": 53, "id": "10db67fe", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'digital scholarship specialist'" ] }, "execution_count": 53, "metadata": {}, "output_type": "execute_result" } ], "source": [ "filipa['job']" ] }, { "cell_type": "code", "execution_count": 54, "id": "8d58553a", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'filipa'" ] }, "execution_count": 54, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# the value stored under 'name' is a list; a DF column (a Series) is a similar list-like object. Use list indexing to pull out items\n", "\n", "filipa['name'][0]" ] }, { "cell_type": "code", "execution_count": 55, "id": "e614328f", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'calado'" ] }, "execution_count": 55, "metadata": {}, "output_type": "execute_result" } ], "source": [ "filipa['name'][2]" ] }, { "cell_type": "markdown", "id": "13652794", "metadata": {}, "source": [ "## viewing data: `info()` `head()` `tail()`" ] }, { "cell_type": "code", "execution_count": 56, "id": "a3ffc0fb", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "<class 'pandas.core.frame.DataFrame'>\n", "RangeIndex: 286 entries, 0 to 285\n", "Columns: 650 entries, Legislation Number to Amends Amendment\n", "dtypes: float64(4), object(646)\n", "memory usage: 1.4+ MB\n" ] } ], "source": [ "df.info()" ] }, { "cell_type": "code", "execution_count": 57, "id": "4f2d9783", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Legislation Number URL \\\n", "0 H.R. 1112 https://www.congress.gov/bill/118th-congress/h... \n", "1 S. 435 https://www.congress.gov/bill/118th-congress/s... \n", "2 H.Res. 886 https://www.congress.gov/bill/118th-congress/h... \n", "3 S.Res. 464 https://www.congress.gov/bill/118th-congress/s... \n", "4 H.Res. 269 https://www.congress.gov/bill/118th-congress/h... \n", "\n", " Congress \\\n", "0 118th Congress (2023-2024) \n", "1 118th Congress (2023-2024) \n", "2 118th Congress (2023-2024) \n", "3 118th Congress (2023-2024) \n", "4 118th Congress (2023-2024) \n", "\n", " Title \\\n", "0 Ensuring Military Readiness Act of 2023 \n", "1 Ensuring Military Readiness Act of 2023 \n", "2 Supporting the goals and principles of Transge... \n", "3 A resolution supporting the goals and principl... \n", "4 Recognizing that it is the duty of the Federal... \n", "\n", " Sponsor Party of Sponsor Date of Introduction \\\n", "0 Banks, Jim [Rep.-R-IN-3] Republican 2/21/23 \n", "1 Rubio, Marco [Sen.-R-FL] Republican 2/15/23 \n", "2 Jayapal, Pramila [Rep.-D-WA-7] Democratic 11/21/23 \n", "3 Hirono, Mazie K. [Sen.-D-HI] Democratic 11/15/23 \n", "4 Jayapal, Pramila [Rep.-D-WA-7] Democratic 3/30/23 \n", "\n", " Committees \\\n", "0 House - Armed Services \n", "1 Senate - Armed Services \n", "2 House - Judiciary \n", "3 Senate - Judiciary \n", "4 House - Judiciary, Education and the Workforce... \n", "\n", " Latest Action Latest Action Date ... \\\n", "0 Referred to the House Committee on Armed Servi... 2/21/23 ... \n", "1 Read twice and referred to the Committee on Ar... 2/15/23 ... \n", "2 Referred to the House Committee on the Judiciary. 11/21/23 ... \n", "3 Star Print ordered on resolution. 12/4/23 ... \n", "4 Sponsor introductory remarks on measure. (CR H... 4/19/23 ... 
\n", "\n", " Related Bill.211 Related Bill.212 Related Bill.213 \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " Latest Summary Amends Bill Date Offered \\\n", "0 <p><b>Ensuring Military Readiness Act of 2023... NaN NaN \n", "1 <p><b>Ensuring Military Readiness Act of 2023... NaN NaN \n", "2 <p>This resolution expresses support for the ... NaN NaN \n", "3 <p>This resolution expresses support for the ... NaN NaN \n", "4 <p>This resolution expresses support for impl... NaN NaN \n", "\n", " Date Submitted Date Proposed Amendment Text (Latest) Amends Amendment \n", "0 NaN NaN NaN NaN \n", "1 NaN NaN NaN NaN \n", "2 NaN NaN NaN NaN \n", "3 NaN NaN NaN NaN \n", "4 NaN NaN NaN NaN \n", "\n", "[5 rows x 650 columns]" ] }, "execution_count": 57, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.head()" ] }, { "cell_type": "code", "execution_count": 58, "id": "30fd982e", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Legislation Number URL Congress Title Sponsor Party of Sponsor \\\n", "281 NaN NaN NaN NaN NaN NaN \n", "282 NaN NaN NaN NaN NaN NaN \n", "283 NaN NaN NaN NaN NaN NaN \n", "284 NaN NaN NaN NaN NaN NaN \n", "285 NaN NaN NaN NaN NaN NaN \n", "\n", " Date of Introduction Committees Latest Action Latest Action Date ... \\\n", "281 NaN NaN NaN NaN ... \n", "282 NaN NaN NaN NaN ... \n", "283 NaN NaN NaN NaN ... \n", "284 NaN NaN NaN NaN ... \n", "285 NaN NaN NaN NaN ... \n", "\n", " Related Bill.211 Related Bill.212 Related Bill.213 Latest Summary \\\n", "281 NaN NaN NaN NaN \n", "282 NaN NaN NaN NaN \n", "283 NaN NaN NaN NaN \n", "284 NaN NaN NaN NaN \n", "285 NaN NaN NaN NaN \n", "\n", " Amends Bill Date Offered Date Submitted Date Proposed \\\n", "281 NaN NaN NaN NaN \n", "282 NaN NaN NaN NaN \n", "283 NaN NaN NaN NaN \n", "284 NaN NaN NaN NaN \n", "285 NaN NaN NaN NaN \n", "\n", " Amendment Text (Latest) Amends Amendment \n", "281 I welcome that discussion.\\n Mr. Chair, I yie... NaN \n", "282 biotechnology equipment or service produced or... NaN \n", "283 United States assistance has \\n been provi... NaN \n", "284 Director of National Intelligence, shall submi... NaN \n", "285 demanded.\\n A recorded vote was ordered.\\n T... NaN \n", "\n", "[5 rows x 650 columns]" ] }, "execution_count": 58, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.tail()" ] }, { "cell_type": "markdown", "id": "583bc932", "metadata": {}, "source": [ "We can also explore columns using the dictionary syntax to access each column" ] }, { "cell_type": "code", "execution_count": 59, "id": "6ee7a8b9", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 2/21/23\n", "1 2/15/23\n", "2 11/21/23\n", "3 11/15/23\n", "4 3/30/23\n", " ... 
\n", "281 NaN\n", "282 NaN\n", "283 NaN\n", "284 NaN\n", "285 NaN\n", "Name: Date of Introduction, Length: 286, dtype: object" ] }, "execution_count": 59, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df['Date of Introduction']" ] }, { "cell_type": "markdown", "id": "f2462df2", "metadata": {}, "source": [ "## cleaning rows & columns: `dropna()` `.notna()` `drop()`" ] }, { "cell_type": "code", "execution_count": 60, "id": "78f6390e", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "Empty DataFrame\n", "Columns: [Legislation Number, URL, Congress, Title, Sponsor, Party of Sponsor, Date of Introduction, Committees, Latest Action, Latest Action Date, Cosponsor, Cosponsor.1, Cosponsor.2, Cosponsor.3, Cosponsor.4, Cosponsor.5, Cosponsor.6, Cosponsor.7, Cosponsor.8, Cosponsor.9, Cosponsor.10, Cosponsor.11, Cosponsor.12, Cosponsor.13, Cosponsor.14, Cosponsor.15, Cosponsor.16, Cosponsor.17, Cosponsor.18, Cosponsor.19, Cosponsor.20, Cosponsor.21, Cosponsor.22, Cosponsor.23, Cosponsor.24, Cosponsor.25, Cosponsor.26, Cosponsor.27, Cosponsor.28, Cosponsor.29, Cosponsor.30, Cosponsor.31, Cosponsor.32, Cosponsor.33, Cosponsor.34, Cosponsor.35, Cosponsor.36, Cosponsor.37, Cosponsor.38, Cosponsor.39, Cosponsor.40, Cosponsor.41, Cosponsor.42, Cosponsor.43, Cosponsor.44, Cosponsor.45, Cosponsor.46, Cosponsor.47, Cosponsor.48, Cosponsor.49, Cosponsor.50, Cosponsor.51, Cosponsor.52, Cosponsor.53, Cosponsor.54, Cosponsor.55, Cosponsor.56, Cosponsor.57, Cosponsor.58, Cosponsor.59, Cosponsor.60, Cosponsor.61, Cosponsor.62, Cosponsor.63, Cosponsor.64, Cosponsor.65, Cosponsor.66, Cosponsor.67, Cosponsor.68, Cosponsor.69, Cosponsor.70, Cosponsor.71, Cosponsor.72, Cosponsor.73, Cosponsor.74, Cosponsor.75, Cosponsor.76, Cosponsor.77, Cosponsor.78, Cosponsor.79, Cosponsor.80, Cosponsor.81, Cosponsor.82, Cosponsor.83, Cosponsor.84, Cosponsor.85, Cosponsor.86, Cosponsor.87, Cosponsor.88, Cosponsor.89, ...]\n", "Index: []\n", "\n", "[0 rows x 650 columns]" ] }, "execution_count": 60, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# by default, dropna() drops any row that contains at least one NaN (missing value)\n", "# this doesn't work for our dataset: we end up with 0 rows. Why?\n", "\n", "df.dropna()" ] }, { "cell_type": "code", "execution_count": 61, "id": "f0b35d9c", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Legislation Number URL Congress Title Sponsor Party of Sponsor \\\n", "281 NaN NaN NaN NaN NaN NaN \n", "282 NaN NaN NaN NaN NaN NaN \n", "283 NaN NaN NaN NaN NaN NaN \n", "284 NaN NaN NaN NaN NaN NaN \n", "285 NaN NaN NaN NaN NaN NaN \n", "\n", " Date of Introduction Committees Latest Action Latest Action Date ... \\\n", "281 NaN NaN NaN NaN ... \n", "282 NaN NaN NaN NaN ... \n", "283 NaN NaN NaN NaN ... \n", "284 NaN NaN NaN NaN ... \n", "285 NaN NaN NaN NaN ... \n", "\n", " Related Bill.211 Related Bill.212 Related Bill.213 Latest Summary \\\n", "281 NaN NaN NaN NaN \n", "282 NaN NaN NaN NaN \n", "283 NaN NaN NaN NaN \n", "284 NaN NaN NaN NaN \n", "285 NaN NaN NaN NaN \n", "\n", " Amends Bill Date Offered Date Submitted Date Proposed \\\n", "281 NaN NaN NaN NaN \n", "282 NaN NaN NaN NaN \n", "283 NaN NaN NaN NaN \n", "284 NaN NaN NaN NaN \n", "285 NaN NaN NaN NaN \n", "\n", " Amendment Text (Latest) Amends Amendment \n", "281 I welcome that discussion.\\n Mr. Chair, I yie... NaN \n", "282 biotechnology equipment or service produced or... NaN \n", "283 United States assistance has \\n been provi... NaN \n", "284 Director of National Intelligence, shall submi... NaN \n", "285 demanded.\\n A recorded vote was ordered.\\n T... NaN \n", "\n", "[5 rows x 650 columns]" ] }, "execution_count": 61, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# check the tail. The last rows are still in df, and each still holds some values.\n", "\n", "df.tail()" ] }, { "cell_type": "code", "execution_count": 62, "id": "4bcf2e7c", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "<class 'pandas.core.frame.DataFrame'>\n", "RangeIndex: 286 entries, 0 to 285\n", "Columns: 650 entries, Legislation Number to Amends Amendment\n", "dtypes: float64(4), object(646)\n", "memory usage: 1.4+ MB\n" ] } ], "source": [ "# check df info. 
We still have the same number of entries (rows): dropna() returns a new dataframe and leaves df unchanged\n", "\n", "df.info()" ] }, { "cell_type": "code", "execution_count": 63, "id": "c0d708e9", "metadata": {}, "outputs": [], "source": [ "# keep only the rows whose 'Legislation Number' value is not NaN (missing)\n", "# uses the df[...] syntax with a True/False mask built from that one column\n", "\n", "df = df[df['Legislation Number'].notna()]" ] }, { "cell_type": "code", "execution_count": 64, "id": "d3f8157e", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "<class 'pandas.core.frame.DataFrame'>\n", "Int64Index: 148 entries, 0 to 280\n", "Columns: 650 entries, Legislation Number to Amends Amendment\n", "dtypes: float64(4), object(646)\n", "memory usage: 752.7+ KB\n" ] } ], "source": [ "# notice the line with \"entries\"\n", "# we have 148 rows, but our index goes up to 280\n", "\n", "df.info()" ] }, { "cell_type": "code", "execution_count": 65, "id": "1697be86", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array(['Legislation Number', 'URL', 'Congress', 'Title', 'Sponsor',\n", " 'Party of Sponsor', 'Date of Introduction', 'Committees',\n", " 'Latest Action', 'Latest Action Date', 'Cosponsor', 'Cosponsor.1',\n", " 'Cosponsor.2', 'Cosponsor.3', 'Cosponsor.4', 'Cosponsor.5',\n", " 'Cosponsor.6', 'Cosponsor.7', 'Cosponsor.8', 'Cosponsor.9',\n", " 'Cosponsor.10', 'Cosponsor.11', 'Cosponsor.12', 'Cosponsor.13',\n", " 'Cosponsor.14', 'Cosponsor.15', 'Cosponsor.16', 'Cosponsor.17',\n", " 'Cosponsor.18', 'Cosponsor.19', 'Cosponsor.20', 'Cosponsor.21',\n", " 'Cosponsor.22', 'Cosponsor.23', 'Cosponsor.24', 'Cosponsor.25',\n", " 'Cosponsor.26', 'Cosponsor.27', 'Cosponsor.28', 'Cosponsor.29',\n", " 'Cosponsor.30', 'Cosponsor.31', 'Cosponsor.32', 'Cosponsor.33',\n", " 'Cosponsor.34', 'Cosponsor.35', 'Cosponsor.36', 'Cosponsor.37',\n", " 'Cosponsor.38', 'Cosponsor.39', 'Cosponsor.40', 'Cosponsor.41',\n", " 'Cosponsor.42', 'Cosponsor.43', 'Cosponsor.44', 'Cosponsor.45',\n", " 'Cosponsor.46', 'Cosponsor.47', 'Cosponsor.48', 'Cosponsor.49',\n", " 'Cosponsor.50', 
'Cosponsor.51', 'Cosponsor.52', 'Cosponsor.53',\n", " 'Cosponsor.54', 'Cosponsor.55', 'Cosponsor.56', 'Cosponsor.57',\n", " 'Cosponsor.58', 'Cosponsor.59', 'Cosponsor.60', 'Cosponsor.61',\n", " 'Cosponsor.62', 'Cosponsor.63', 'Cosponsor.64', 'Cosponsor.65',\n", " 'Cosponsor.66', 'Cosponsor.67', 'Cosponsor.68', 'Cosponsor.69',\n", " 'Cosponsor.70', 'Cosponsor.71', 'Cosponsor.72', 'Cosponsor.73',\n", " 'Cosponsor.74', 'Cosponsor.75', 'Cosponsor.76', 'Cosponsor.77',\n", " 'Cosponsor.78', 'Cosponsor.79', 'Cosponsor.80', 'Cosponsor.81',\n", " 'Cosponsor.82', 'Cosponsor.83', 'Cosponsor.84', 'Cosponsor.85',\n", " 'Cosponsor.86', 'Cosponsor.87', 'Cosponsor.88', 'Cosponsor.89',\n", " 'Cosponsor.90', 'Cosponsor.91', 'Cosponsor.92', 'Cosponsor.93',\n", " 'Cosponsor.94', 'Cosponsor.95', 'Cosponsor.96', 'Cosponsor.97',\n", " 'Cosponsor.98', 'Cosponsor.99', 'Cosponsor.100', 'Cosponsor.101',\n", " 'Cosponsor.102', 'Cosponsor.103', 'Cosponsor.104', 'Cosponsor.105',\n", " 'Cosponsor.106', 'Cosponsor.107', 'Cosponsor.108', 'Cosponsor.109',\n", " 'Cosponsor.110', 'Cosponsor.111', 'Cosponsor.112', 'Cosponsor.113',\n", " 'Cosponsor.114', 'Cosponsor.115', 'Cosponsor.116', 'Cosponsor.117',\n", " 'Cosponsor.118', 'Cosponsor.119', 'Cosponsor.120', 'Cosponsor.121',\n", " 'Cosponsor.122', 'Cosponsor.123', 'Cosponsor.124', 'Cosponsor.125',\n", " 'Cosponsor.126', 'Cosponsor.127', 'Cosponsor.128', 'Cosponsor.129',\n", " 'Cosponsor.130', 'Cosponsor.131', 'Cosponsor.132', 'Cosponsor.133',\n", " 'Cosponsor.134', 'Cosponsor.135', 'Cosponsor.136', 'Cosponsor.137',\n", " 'Cosponsor.138', 'Cosponsor.139', 'Cosponsor.140', 'Cosponsor.141',\n", " 'Cosponsor.142', 'Cosponsor.143', 'Cosponsor.144', 'Cosponsor.145',\n", " 'Cosponsor.146', 'Cosponsor.147', 'Cosponsor.148', 'Cosponsor.149',\n", " 'Cosponsor.150', 'Cosponsor.151', 'Cosponsor.152', 'Cosponsor.153',\n", " 'Cosponsor.154', 'Cosponsor.155', 'Cosponsor.156', 'Cosponsor.157',\n", " 'Cosponsor.158', 'Cosponsor.159', 'Cosponsor.160', 
'Cosponsor.161',\n", " 'Cosponsor.162', 'Cosponsor.163', 'Cosponsor.164', 'Cosponsor.165',\n", " 'Cosponsor.166', 'Cosponsor.167', 'Cosponsor.168', 'Cosponsor.169',\n", " 'Cosponsor.170', 'Cosponsor.171', 'Cosponsor.172', 'Cosponsor.173',\n", " 'Cosponsor.174', 'Cosponsor.175', 'Cosponsor.176', 'Cosponsor.177',\n", " 'Cosponsor.178', 'Cosponsor.179', 'Cosponsor.180', 'Cosponsor.181',\n", " 'Cosponsor.182', 'Cosponsor.183', 'Cosponsor.184', 'Cosponsor.185',\n", " 'Cosponsor.186', 'Cosponsor.187', 'Cosponsor.188', 'Cosponsor.189',\n", " 'Cosponsor.190', 'Cosponsor.191', 'Cosponsor.192', 'Cosponsor.193',\n", " 'Cosponsor.194', 'Cosponsor.195', 'Cosponsor.196', 'Cosponsor.197',\n", " 'Cosponsor.198', 'Cosponsor.199', 'Cosponsor.200', 'Cosponsor.201',\n", " 'Cosponsor.202', 'Cosponsor.203', 'Cosponsor.204', 'Cosponsor.205',\n", " 'Cosponsor.206', 'Cosponsor.207', 'Cosponsor.208', 'Cosponsor.209',\n", " 'Cosponsor.210', 'Cosponsor.211', 'Cosponsor.212', 'Cosponsor.213',\n", " 'Cosponsor.214', 'Number of Cosponsors', 'Subject', 'Subject.1',\n", " 'Subject.2', 'Subject.3', 'Subject.4', 'Subject.5', 'Subject.6',\n", " 'Subject.7', 'Subject.8', 'Subject.9', 'Subject.10', 'Subject.11',\n", " 'Subject.12', 'Subject.13', 'Subject.14', 'Subject.15',\n", " 'Subject.16', 'Subject.17', 'Subject.18', 'Subject.19',\n", " 'Subject.20', 'Subject.21', 'Subject.22', 'Subject.23',\n", " 'Subject.24', 'Subject.25', 'Subject.26', 'Subject.27',\n", " 'Subject.28', 'Subject.29', 'Subject.30', 'Subject.31',\n", " 'Subject.32', 'Subject.33', 'Subject.34', 'Subject.35',\n", " 'Subject.36', 'Subject.37', 'Subject.38', 'Subject.39',\n", " 'Subject.40', 'Subject.41', 'Subject.42', 'Subject.43',\n", " 'Subject.44', 'Subject.45', 'Subject.46', 'Subject.47',\n", " 'Subject.48', 'Subject.49', 'Subject.50', 'Subject.51',\n", " 'Subject.52', 'Subject.53', 'Subject.54', 'Subject.55',\n", " 'Subject.56', 'Subject.57', 'Subject.58', 'Subject.59',\n", " 'Subject.60', 'Subject.61', 'Subject.62', 
'Subject.63',\n", " 'Subject.64', 'Subject.65', 'Subject.66', 'Subject.67',\n", " 'Subject.68', 'Subject.69', 'Subject.70', 'Subject.71',\n", " 'Subject.72', 'Subject.73', 'Subject.74', 'Subject.75',\n", " 'Subject.76', 'Subject.77', 'Subject.78', 'Subject.79',\n", " 'Subject.80', 'Subject.81', 'Subject.82', 'Subject.83',\n", " 'Subject.84', 'Subject.85', 'Subject.86', 'Subject.87',\n", " 'Subject.88', 'Subject.89', 'Subject.90', 'Subject.91',\n", " 'Subject.92', 'Subject.93', 'Subject.94', 'Subject.95',\n", " 'Subject.96', 'Subject.97', 'Subject.98', 'Subject.99',\n", " 'Subject.100', 'Subject.101', 'Subject.102', 'Subject.103',\n", " 'Subject.104', 'Subject.105', 'Subject.106', 'Subject.107',\n", " 'Subject.108', 'Subject.109', 'Subject.110', 'Subject.111',\n", " 'Subject.112', 'Subject.113', 'Subject.114', 'Subject.115',\n", " 'Subject.116', 'Subject.117', 'Subject.118', 'Subject.119',\n", " 'Subject.120', 'Subject.121', 'Subject.122', 'Subject.123',\n", " 'Subject.124', 'Subject.125', 'Subject.126', 'Subject.127',\n", " 'Subject.128', 'Subject.129', 'Subject.130', 'Subject.131',\n", " 'Subject.132', 'Subject.133', 'Subject.134', 'Subject.135',\n", " 'Subject.136', 'Subject.137', 'Subject.138', 'Subject.139',\n", " 'Subject.140', 'Subject.141', 'Subject.142', 'Subject.143',\n", " 'Subject.144', 'Subject.145', 'Subject.146', 'Subject.147',\n", " 'Subject.148', 'Subject.149', 'Subject.150', 'Subject.151',\n", " 'Subject.152', 'Subject.153', 'Subject.154', 'Subject.155',\n", " 'Subject.156', 'Subject.157', 'Subject.158', 'Subject.159',\n", " 'Subject.160', 'Subject.161', 'Subject.162', 'Subject.163',\n", " 'Subject.164', 'Subject.165', 'Subject.166', 'Subject.167',\n", " 'Subject.168', 'Subject.169', 'Subject.170', 'Subject.171',\n", " 'Subject.172', 'Subject.173', 'Subject.174', 'Subject.175',\n", " 'Subject.176', 'Subject.177', 'Subject.178', 'Subject.179',\n", " 'Subject.180', 'Subject.181', 'Subject.182', 'Subject.183',\n", " 'Subject.184', 'Subject.185', 
'Subject.186', 'Subject.187',\n", " 'Subject.188', 'Subject.189', 'Subject.190', 'Subject.191',\n", " 'Subject.192', 'Subject.193', 'Subject.194', 'Subject.195',\n", " 'Subject.196', 'Subject.197', 'Subject.198', 'Subject.199',\n", " 'Subject.200', 'Subject.201', 'Number of Related Bills',\n", " 'Related Bill', 'Related Bill.1', 'Related Bill.2',\n", " 'Related Bill.3', 'Related Bill.4', 'Related Bill.5',\n", " 'Related Bill.6', 'Related Bill.7', 'Related Bill.8',\n", " 'Related Bill.9', 'Related Bill.10', 'Related Bill.11',\n", " 'Related Bill.12', 'Related Bill.13', 'Related Bill.14',\n", " 'Related Bill.15', 'Related Bill.16', 'Related Bill.17',\n", " 'Related Bill.18', 'Related Bill.19', 'Related Bill.20',\n", " 'Related Bill.21', 'Related Bill.22', 'Related Bill.23',\n", " 'Related Bill.24', 'Related Bill.25', 'Related Bill.26',\n", " 'Related Bill.27', 'Related Bill.28', 'Related Bill.29',\n", " 'Related Bill.30', 'Related Bill.31', 'Related Bill.32',\n", " 'Related Bill.33', 'Related Bill.34', 'Related Bill.35',\n", " 'Related Bill.36', 'Related Bill.37', 'Related Bill.38',\n", " 'Related Bill.39', 'Related Bill.40', 'Related Bill.41',\n", " 'Related Bill.42', 'Related Bill.43', 'Related Bill.44',\n", " 'Related Bill.45', 'Related Bill.46', 'Related Bill.47',\n", " 'Related Bill.48', 'Related Bill.49', 'Related Bill.50',\n", " 'Related Bill.51', 'Related Bill.52', 'Related Bill.53',\n", " 'Related Bill.54', 'Related Bill.55', 'Related Bill.56',\n", " 'Related Bill.57', 'Related Bill.58', 'Related Bill.59',\n", " 'Related Bill.60', 'Related Bill.61', 'Related Bill.62',\n", " 'Related Bill.63', 'Related Bill.64', 'Related Bill.65',\n", " 'Related Bill.66', 'Related Bill.67', 'Related Bill.68',\n", " 'Related Bill.69', 'Related Bill.70', 'Related Bill.71',\n", " 'Related Bill.72', 'Related Bill.73', 'Related Bill.74',\n", " 'Related Bill.75', 'Related Bill.76', 'Related Bill.77',\n", " 'Related Bill.78', 'Related Bill.79', 'Related Bill.80',\n", " 'Related 
Bill.81', 'Related Bill.82', 'Related Bill.83',\n", " 'Related Bill.84', 'Related Bill.85', 'Related Bill.86',\n", " 'Related Bill.87', 'Related Bill.88', 'Related Bill.89',\n", " 'Related Bill.90', 'Related Bill.91', 'Related Bill.92',\n", " 'Related Bill.93', 'Related Bill.94', 'Related Bill.95',\n", " 'Related Bill.96', 'Related Bill.97', 'Related Bill.98',\n", " 'Related Bill.99', 'Related Bill.100', 'Related Bill.101',\n", " 'Related Bill.102', 'Related Bill.103', 'Related Bill.104',\n", " 'Related Bill.105', 'Related Bill.106', 'Related Bill.107',\n", " 'Related Bill.108', 'Related Bill.109', 'Related Bill.110',\n", " 'Related Bill.111', 'Related Bill.112', 'Related Bill.113',\n", " 'Related Bill.114', 'Related Bill.115', 'Related Bill.116',\n", " 'Related Bill.117', 'Related Bill.118', 'Related Bill.119',\n", " 'Related Bill.120', 'Related Bill.121', 'Related Bill.122',\n", " 'Related Bill.123', 'Related Bill.124', 'Related Bill.125',\n", " 'Related Bill.126', 'Related Bill.127', 'Related Bill.128',\n", " 'Related Bill.129', 'Related Bill.130', 'Related Bill.131',\n", " 'Related Bill.132', 'Related Bill.133', 'Related Bill.134',\n", " 'Related Bill.135', 'Related Bill.136', 'Related Bill.137',\n", " 'Related Bill.138', 'Related Bill.139', 'Related Bill.140',\n", " 'Related Bill.141', 'Related Bill.142', 'Related Bill.143',\n", " 'Related Bill.144', 'Related Bill.145', 'Related Bill.146',\n", " 'Related Bill.147', 'Related Bill.148', 'Related Bill.149',\n", " 'Related Bill.150', 'Related Bill.151', 'Related Bill.152',\n", " 'Related Bill.153', 'Related Bill.154', 'Related Bill.155',\n", " 'Related Bill.156', 'Related Bill.157', 'Related Bill.158',\n", " 'Related Bill.159', 'Related Bill.160', 'Related Bill.161',\n", " 'Related Bill.162', 'Related Bill.163', 'Related Bill.164',\n", " 'Related Bill.165', 'Related Bill.166', 'Related Bill.167',\n", " 'Related Bill.168', 'Related Bill.169', 'Related Bill.170',\n", " 'Related Bill.171', 'Related Bill.172', 
'Related Bill.173',\n", " 'Related Bill.174', 'Related Bill.175', 'Related Bill.176',\n", " 'Related Bill.177', 'Related Bill.178', 'Related Bill.179',\n", " 'Related Bill.180', 'Related Bill.181', 'Related Bill.182',\n", " 'Related Bill.183', 'Related Bill.184', 'Related Bill.185',\n", " 'Related Bill.186', 'Related Bill.187', 'Related Bill.188',\n", " 'Related Bill.189', 'Related Bill.190', 'Related Bill.191',\n", " 'Related Bill.192', 'Related Bill.193', 'Related Bill.194',\n", " 'Related Bill.195', 'Related Bill.196', 'Related Bill.197',\n", " 'Related Bill.198', 'Related Bill.199', 'Related Bill.200',\n", " 'Related Bill.201', 'Related Bill.202', 'Related Bill.203',\n", " 'Related Bill.204', 'Related Bill.205', 'Related Bill.206',\n", " 'Related Bill.207', 'Related Bill.208', 'Related Bill.209',\n", " 'Related Bill.210', 'Related Bill.211', 'Related Bill.212',\n", " 'Related Bill.213', 'Latest Summary', 'Amends Bill',\n", " 'Date Offered', 'Date Submitted', 'Date Proposed',\n", " 'Amendment Text (Latest)', 'Amends Amendment'], dtype=object)" ] }, "execution_count": 65, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# what's in the columns? \n", "# df.columns.values returns all of the column names as an array\n", "\n", "df.columns.values" ] }, { "cell_type": "code", "execution_count": 66, "id": "c1eecfa2", "metadata": {}, "outputs": [], "source": [ "for item in df.columns.values:\n", "    if 'Cosponsor' in item:\n", "        # note: this also matches the 'Number of Cosponsors' column\n", "        # reassigning DF in each step of the loop\n", "        # axis = 1 means columns; axis = 0 means rows\n", "        df = df.drop(item, axis=1)" ] }, { "cell_type": "code", "execution_count": 67, "id": "51270b92", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Int64Index: 148 entries, 0 to 280\n", "Columns: 434 entries, Legislation Number to Amends Amendment\n", "dtypes: float64(3), object(431)\n", "memory usage: 503.0+ KB\n" ] } ], "source": [ "# We are down to 434 columns. 
Let's do the rest\n", "\n", "df.info()" ] }, { "cell_type": "code", "execution_count": 68, "id": "b72c4c20", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Legislation NumberURLCongressTitleSponsorParty of SponsorDate of IntroductionCommitteesLatest ActionLatest Action Date...Related Bill.211Related Bill.212Related Bill.213Latest SummaryAmends BillDate OfferedDate SubmittedDate ProposedAmendment Text (Latest)Amends Amendment
0H.R. 1112https://www.congress.gov/bill/118th-congress/h...118th Congress (2023-2024)Ensuring Military Readiness Act of 2023Banks, Jim [Rep.-R-IN-3]Republican2/21/23House - Armed ServicesReferred to the House Committee on Armed Servi...2/21/23...NaNNaNNaN<p><b>Ensuring Military Readiness Act of 2023...NaNNaNNaNNaNNaNNaN
1S. 435https://www.congress.gov/bill/118th-congress/s...118th Congress (2023-2024)Ensuring Military Readiness Act of 2023Rubio, Marco [Sen.-R-FL]Republican2/15/23Senate - Armed ServicesRead twice and referred to the Committee on Ar...2/15/23...NaNNaNNaN<p><b>Ensuring Military Readiness Act of 2023...NaNNaNNaNNaNNaNNaN
2H.Res. 886https://www.congress.gov/bill/118th-congress/h...118th Congress (2023-2024)Supporting the goals and principles of Transge...Jayapal, Pramila [Rep.-D-WA-7]Democratic11/21/23House - JudiciaryReferred to the House Committee on the Judiciary.11/21/23...NaNNaNNaN<p>This resolution expresses support for the ...NaNNaNNaNNaNNaNNaN
3S.Res. 464https://www.congress.gov/bill/118th-congress/s...118th Congress (2023-2024)A resolution supporting the goals and principl...Hirono, Mazie K. [Sen.-D-HI]Democratic11/15/23Senate - JudiciaryStar Print ordered on resolution.12/4/23...NaNNaNNaN<p>This resolution expresses support for the ...NaNNaNNaNNaNNaNNaN
4H.Res. 269https://www.congress.gov/bill/118th-congress/h...118th Congress (2023-2024)Recognizing that it is the duty of the Federal...Jayapal, Pramila [Rep.-D-WA-7]Democratic3/30/23House - Judiciary, Education and the Workforce...Sponsor introductory remarks on measure. (CR H...4/19/23...NaNNaNNaN<p>This resolution expresses support for impl...NaNNaNNaNNaNNaNNaN
\n", "

5 rows × 434 columns

\n", "
" ], "text/plain": [ " Legislation Number URL \\\n", "0 H.R. 1112 https://www.congress.gov/bill/118th-congress/h... \n", "1 S. 435 https://www.congress.gov/bill/118th-congress/s... \n", "2 H.Res. 886 https://www.congress.gov/bill/118th-congress/h... \n", "3 S.Res. 464 https://www.congress.gov/bill/118th-congress/s... \n", "4 H.Res. 269 https://www.congress.gov/bill/118th-congress/h... \n", "\n", " Congress \\\n", "0 118th Congress (2023-2024) \n", "1 118th Congress (2023-2024) \n", "2 118th Congress (2023-2024) \n", "3 118th Congress (2023-2024) \n", "4 118th Congress (2023-2024) \n", "\n", " Title \\\n", "0 Ensuring Military Readiness Act of 2023 \n", "1 Ensuring Military Readiness Act of 2023 \n", "2 Supporting the goals and principles of Transge... \n", "3 A resolution supporting the goals and principl... \n", "4 Recognizing that it is the duty of the Federal... \n", "\n", " Sponsor Party of Sponsor Date of Introduction \\\n", "0 Banks, Jim [Rep.-R-IN-3] Republican 2/21/23 \n", "1 Rubio, Marco [Sen.-R-FL] Republican 2/15/23 \n", "2 Jayapal, Pramila [Rep.-D-WA-7] Democratic 11/21/23 \n", "3 Hirono, Mazie K. [Sen.-D-HI] Democratic 11/15/23 \n", "4 Jayapal, Pramila [Rep.-D-WA-7] Democratic 3/30/23 \n", "\n", " Committees \\\n", "0 House - Armed Services \n", "1 Senate - Armed Services \n", "2 House - Judiciary \n", "3 Senate - Judiciary \n", "4 House - Judiciary, Education and the Workforce... \n", "\n", " Latest Action Latest Action Date ... \\\n", "0 Referred to the House Committee on Armed Servi... 2/21/23 ... \n", "1 Read twice and referred to the Committee on Ar... 2/15/23 ... \n", "2 Referred to the House Committee on the Judiciary. 11/21/23 ... \n", "3 Star Print ordered on resolution. 12/4/23 ... \n", "4 Sponsor introductory remarks on measure. (CR H... 4/19/23 ... 
\n", "\n", " Related Bill.211 Related Bill.212 Related Bill.213 \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " Latest Summary Amends Bill Date Offered \\\n", "0

Ensuring Military Readiness Act of 2023... NaN NaN \n", "1

Ensuring Military Readiness Act of 2023... NaN NaN \n", "2

This resolution expresses support for the ... NaN NaN \n", "3

This resolution expresses support for the ... NaN NaN \n", "4

This resolution expresses support for impl... NaN NaN \n", "\n", " Date Submitted Date Proposed Amendment Text (Latest) Amends Amendment \n", "0 NaN NaN NaN NaN \n", "1 NaN NaN NaN NaN \n", "2 NaN NaN NaN NaN \n", "3 NaN NaN NaN NaN \n", "4 NaN NaN NaN NaN \n", "\n", "[5 rows x 434 columns]" ] }, "execution_count": 68, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# the Cosponsor columns are gone, but with 434 columns left the display still shows an ellipsis (...)\n", "\n", "df.head()" ] }, { "cell_type": "markdown", "id": "fe66f886", "metadata": {}, "source": [ "How would we get rid of the rest of the columns that are mostly NaN? " ] }, { "cell_type": "code", "execution_count": 69, "id": "b4c56a6b", "metadata": {}, "outputs": [], "source": [ "for item in df.columns.values:\n", "    if 'Subject' in item:\n", "        df = df.drop(item, axis=1)" ] }, { "cell_type": "code", "execution_count": 70, "id": "5ce8a0ac", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Int64Index: 148 entries, 0 to 280\n", "Columns: 232 entries, Legislation Number to Amends Amendment\n", "dtypes: float64(3), object(229)\n", "memory usage: 269.4+ KB\n" ] } ], "source": [ "df.info()" ] }, { "cell_type": "code", "execution_count": 71, "id": "494dc182", "metadata": {}, "outputs": [], "source": [ "for item in df.columns.values:\n", "    if 'Related' in item:\n", "        df = df.drop(item, axis=1)" ] }, { "cell_type": "code", "execution_count": 72, "id": "324652ae", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Int64Index: 148 entries, 0 to 280\n", "Data columns (total 17 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 Legislation Number 148 non-null object \n", " 1 URL 148 non-null object \n", " 2 Congress 148 non-null object \n", " 3 Title 146 non-null object \n", " 4 Sponsor 148 non-null object \n", " 5 Party of Sponsor 148 non-null object \n", " 6 Date of Introduction 115 non-null object \n", " 7 
Committees 114 non-null object \n", " 8 Latest Action 146 non-null object \n", " 9 Latest Action Date 146 non-null object \n", " 10 Latest Summary 61 non-null object \n", " 11 Amends Bill 33 non-null object \n", " 12 Date Offered 31 non-null object \n", " 13 Date Submitted 2 non-null object \n", " 14 Date Proposed 0 non-null float64\n", " 15 Amendment Text (Latest) 33 non-null object \n", " 16 Amends Amendment 0 non-null float64\n", "dtypes: float64(2), object(15)\n", "memory usage: 20.8+ KB\n" ] } ], "source": [ "df.info()" ] }, { "cell_type": "markdown", "id": "606339ad", "metadata": {}, "source": [ "And now we have a much cleaner dataframe, down from 650 columns to 17. We can save it to our workspace as a CSV file with the following line:" ] }, { "cell_type": "code", "execution_count": 73, "id": "5a9aa08c", "metadata": {}, "outputs": [], "source": [ "df.to_csv('congress_clean.csv')" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.0" } }, "nbformat": 4, "nbformat_minor": 5 }