running inference#

Now that you have a sense of how things work on the HF website, we are going to practice running inference on Google Colab.

Our goal is to create a text generator in Python, taking the following steps:

  • First, we will import the model “gpt-neo-125m” into the Colab coding space.

  • Then, we will write code that processes an input text to generate an output, a continuation.

  • Finally, we will import a dataset from the Datasets library and practice running inference with it.

We’ll talk about some programming concepts along the way, like variables and data types, and how to access data from different structures. In particular, we will grapple with a new data type, the dict, and learn how to access and manipulate the data inside it.

open Colab and load libraries#

First, on the toolbar, where it says RAM and Disk, change the hardware accelerator to GPU.

Then, install the necessary libraries into your Colab environment.

%%capture
%pip install transformers trl

Go back to the models page.

Search for gpt-neo and select the 125m version. On the top right, click on “Use in Transformers.”

Copy that code and paste it into a cell in your Google Colab notebook.

from transformers import pipeline

pipe = pipeline("text-generation", model="EleutherAI/gpt-neo-125m")
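Since we switched the Colab runtime to a GPU earlier, we can optionally pass a device argument so the model runs there (device=0 means the first GPU; this parameter is part of the pipeline API, and the code works on the CPU without it):

# optional: load the model onto the GPU instead of the CPU
pipe = pipeline("text-generation", model="EleutherAI/gpt-neo-125m", device=0)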

Here we have a function called pipeline(), which takes parameters (a fancy word for inputs).

The parameters specify the task and the model that we will be using.

We save the result of that function to a variable called pipe, which we will later use to process our prompt.
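Under the hood, pipeline() handles a few steps for us: it downloads the model weights, loads a matching tokenizer, and wires the two together. A rough sketch of those steps, using the transformers auto classes, looks like this:

from transformers import AutoModelForCausalLM, AutoTokenizer

# roughly what pipeline() does behind the scenes
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-125m")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neo-125m")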

inference#

Now we are going to “run inference.”

First, we will type up a prompt and save it to a variable called prompt. Then we will pass that prompt to the pipe variable that we created before, saving the result to a new variable called output.

prompt = "Hello, my name is Filipa and"

pipe(prompt, max_length=50)
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
[{'generated_text': "Hello, my name is Filipa and I'm a newbie in the world of web development. I'm a newbie in the world of web development. I'm a newbie in the world of web development. I'm a newbie in"}]
output = pipe(prompt, max_length=50)

Here we see the levels of abstraction at play: we save the pipeline to one variable and the prompt text to another, then pass the prompt into the pipe.
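You may have noticed that the model repeats itself. That’s because, by default, the pipeline picks the most likely next word every time, which tends to loop. As a sketch, we can pass the standard sampling parameters do_sample and temperature to get more varied text:

# sample from the probability distribution instead of always picking the top word
pipe(prompt, max_length=50, do_sample=True, temperature=0.9)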

Now let’s look at the response and inspect the data structure that contains it, which is a list.

A list is an ordered collection of objects, or bits of information. So our output is saved inside this collection type of object.

output
[{'generated_text': "Hello, my name is Filipa and I'm a newbie in the world of web development. I'm a newbie in the world of web development. I'm a newbie in the world of web development. I'm a newbie in"}]
type(output)
list

What if we wanted to extract just the output text, not the rest of the data? How would we go about it? We use list indexing. When we check the type, we find that the first item of the list is another data type, a dict.
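As a quick illustration with a made-up list (the titles here are just examples), indexing uses square brackets and starts counting from zero:

books = ['Beloved', 'Orlando', 'Sula']
books[0] # returns 'Beloved', the first item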

output[0]
{'generated_text': "Hello, my name is Filipa and I'm a newbie in the world of web development. I'm a newbie in the world of web development. I'm a newbie in the world of web development. I'm a newbie in"}
type(output[0])
dict

To get items from a dict, you use a different method, accessing them by their keys.

filipa = {
    'first_name': 'filipa',
    'last_name': 'calado',
    'job': 'library',
    'age': '34',
    'degree': 'literature'
}

filipa['degree']
'literature'
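Keys also let us change the data. For example, assigning to the 'job' key updates that entry in place:

# assigning to an existing key overwrites its value
filipa['job'] = 'librarian'
filipa['job']
'librarian'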

So, we can combine what we know about list indexing and accessing items in a dict by keys to pull out just the response text.

output[0]['generated_text']
"Hello, my name is Filipa and I'm a newbie in the world of web development. I'm a newbie in the world of web development. I'm a newbie in the world of web development. I'm a newbie in"

accessing data from datasets#

Now we will practice what we’ve learned about accessing data, using the Datasets library from HF.

%%capture
# install the library and import the dataset loader
!pip install datasets
from datasets import load_dataset

# load the dataset by its name on the HF hub
dataset = load_dataset("gofilipa/gender_congress_117-118")

# check the dataset object
dataset
DatasetDict({
    train: Dataset({
        features: ['definitions'],
        num_rows: 332
    })
})
type(dataset)
datasets.dataset_dict.DatasetDict
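A DatasetDict behaves like a regular Python dict, so the same key-based access applies. For instance, we can list its keys:

dataset.keys() # dict_keys(['train'])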
# how do we get items from a dict? by the key

dataset['train']
Dataset({
    features: ['definitions'],
    num_rows: 332
})
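As an aside, load_dataset can also take a split argument, which skips the outer dictionary and returns the rows directly (this is part of the datasets API):

# loading just the "train" split returns a Dataset rather than a DatasetDict
train = load_dataset("gofilipa/gender_congress_117-118", split="train")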
# how would we get the second row from this dataset?

dataset['train']['definitions'][1]
'The term sex means the indication of male or female sex by reproductive potential or capacity sex chromosomes naturally occurring sex hormones gonads or internal or external genitalia present at birth'
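Finally, to close the loop on our goal from the start of this section, here is a sketch that takes a definition from the dataset and uses it as a prompt for the pipeline (reusing the pipe and dataset variables from above; max_length is just an illustrative choice):

# run inference with a row from the dataset as the prompt
row = dataset['train']['definitions'][1]
output = pipe(row, max_length=100)
output[0]['generated_text']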