Building a Simple Q&A App with HuggingFace and Gradio

Using pre-trained LLMs with HuggingFace and Gradio to build and deploy a simple question answering app in a few lines of Python code.
transformer
machine learning
deployment
HuggingFace
Author

Stefan Schneider

Published

February 9, 2024

Modified

January 4, 2025

Large language models (LLMs) like GPT, BART, etc. have demonstrated incredible abilities in natural language understanding and generation.

This blog post describes how you can use LLMs to build and deploy your own app in just a few lines of Python code with the HuggingFace ecosystem. HuggingFace provides pre-trained models, datasets, and other tools that make it easy to work with machine learning models without having to understand all the underlying theory. If you are interested in how LLMs work, see my other blog post on the underlying transformer architecture.

As an example, the goal of this post is to build an app that answers questions about a given PDF document. The focus is on showing a simple proof of concept rather than high-quality answers.

First, let’s install the necessary dependencies:

%%capture --no-display
%pip install -U pypdf torch transformers gradio

Question Answering with HuggingFace

We can read the text of a PDF document with pypdf. As an example, I’m using the author version of a paper I wrote on mobile-env.

from pathlib import Path
from typing import Union
from pypdf import PdfReader


def get_text_from_pdf(pdf_file: Union[str, Path]) -> str:
    """Read the PDF from the given path and return a string with its entire content."""
    reader = PdfReader(pdf_file)

    # Extract text from all pages
    full_text = ""
    for page in reader.pages:
        full_text += page.extract_text()
    return full_text

# Read and print parts of the PDF
pdf_text = get_text_from_pdf("mobileenv_author_version.pdf")
pdf_text[:1500]
'mobile-env: An Open Platform for Reinforcement\nLearning in Wireless Mobile Networks\nStefan Schneider, Stefan Werner\nPaderborn University, Germany\n{stschn, stwerner}@mail.upb.de\nRamin Khalili, Artur Hecker\nHuawei Technologies, Germany\n{ramin.khalili, artur.hecker}@huawei.com\nHolger Karl\nHasso Plattner Institute,\nUniversity of Potsdam, Germany\nholger.karl@hpi.de\nAbstract—Recent reinforcement learning approaches for con-\ntinuous control in wireless mobile networks have shown im-\npressive results. But due to the lack of open and compatible\nsimulators, authors typically create their own simulation en-\nvironments for training and evaluation. This is cumbersome\nand time-consuming for authors and limits reproducibility and\ncomparability, ultimately impeding progress in the field.\nTo this end, we proposemobile-env, a simple and open platform\nfor training, evaluating, and comparing reinforcement learning\nand conventional approaches for continuous control in mobile\nwireless networks. mobile-env is lightweight and implements\nthe common OpenAI Gym interface and additional wrappers,\nwhich allows connecting virtually any single-agent or multi-agent\nreinforcement learning framework to the environment. While\nmobile-env provides sensible default values and can be used out\nof the box, it also has many configuration options and is easy to\nextend. We therefore believe mobile-env to be a valuable platform\nfor driving meaningful progress in autonomous coordination of\nwireless mobile networks.\nIndex T'

Now we can create a question answering pipeline using HuggingFace, loading a pre-trained model. Then we can ask some questions, providing the PDF text as context.

from transformers import pipeline

question_answerer = pipeline(task="question-answering", model="deepset/tinyroberta-squad2")
Device set to use mps:0
question_answerer("What is mobile-env?", pdf_text)
{'score': 0.9887111186981201,
 'start': 16488,
 'end': 16505,
 'answer': 'GitHub repository'}
question_answerer("What programming language is mobile-env written in?", pdf_text)
{'score': 0.9665615558624268, 'start': 3552, 'end': 3558, 'answer': 'Python'}
question_answerer("What is the main difference between mobile-env and other simulators?", pdf_text)
{'score': 0.6506955027580261,
 'start': 12539,
 'end': 12570,
 'answer': 'more flexible, better documented'}

The pipeline returns a dict, where the answer is a quote from the given context, here the PDF document. This is called extractive question answering.

It also provides a score indicating the model’s confidence in the answer and the start/end indices of where the answer is quoted in the context.
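Because the answer is extracted verbatim, you can use these indices to recover the answer span directly from the context. A minimal sketch, reusing the question_answerer and pdf_text objects from above:

result = question_answerer(question="What is mobile-env?", context=pdf_text)
# The answer is simply the context sliced at the returned start/end indices.
print(pdf_text[result["start"]:result["end"]])  # should match result["answer"]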

That’s it! Let’s see how we can build a simple app on top of this.

Building an App with Gradio

Gradio allows building simple apps tailored for machine learning use cases. You define the inputs, a function to pass these inputs to, and how to display the function’s outputs.

Here, our inputs are the PDF document and the question. The function loads the document and passes the question and text to the pre-trained model. It then outputs the model’s answer to the user.

import gradio as gr

def answer_doc_question(pdf_file, question):
    pdf_text = get_text_from_pdf(pdf_file)
    answer = question_answerer(question=question, context=pdf_text)
    return answer["answer"]

# Add a default file and question so it's easy to try out the app.
pdf_input = gr.File(
    value="https://ris.uni-paderborn.de/download/30236/30237/author_version.pdf",
    file_types=[".pdf"],
    label="Upload a PDF document and ask a question about it.",
)
question = gr.Textbox(
    value="What is mobile-env?",
    label="Type a question regarding the uploaded document here.",
)
gr.Interface(
    fn=answer_doc_question, inputs=[pdf_input, question], outputs="text"
).launch()
Running on local URL:  http://127.0.0.1:7862

To create a public link, set `share=True` in `launch()`.

If you run this locally, you should see a rendered app based on the question answering pipeline we built above!

Deploying the App on HuggingFace Spaces

You can easily host the app on HuggingFace Spaces, which provides free (but slow) hosting as well as faster paid options.

You simply create a new space under your account and add an app.py file containing all the code above. The dependencies go into a requirements.txt. That’s it!
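For example, a minimal requirements.txt matching the dependencies installed above might look like this:

pypdf
torch
transformers
gradio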

This is the app we built here: https://huggingface.co/spaces/stefanbschneider/pdf-question-answering

What’s Next?