LangChain and Hugging Face's Large Model

Text Generation with LangChain and Hugging Face (Flan-T5 Large)

This demo shows how to set up a text generation pipeline using LangChain and Hugging Face’s Flan-T5 Large model. You’ll first load the model via Hugging Face Hub, wrap it in a HuggingFacePipeline, and then build a simple LangChain LLMChain that uses a custom prompt template for answering questions.

With this setup, you will have a basic but flexible text generation pipeline using LangChain and Flan-T5 Large. It gives you a starting point for building more advanced, context-aware applications (such as custom Q&A systems, assistants, or educational tools) on top of LangChain.

Steps

1. Install required libraries
2. Configure your Hugging Face API token
3. Load the Flan-T5 model from Hugging Face Hub
4. Create a LangChain HuggingFacePipeline
5. Build a simple LangChain chain
6. Try the chain on a few sample questions

Step 1: Install Required Libraries

Start by installing the main dependencies: langchain, transformers, torch, and accelerate.

!pip install langchain transformers torch accelerate

Step 2: Configure Hugging Face Token

Next, generate an API token from Hugging Face and set it as an environment variable so LangChain/HF Hub can use it.

Steps on Hugging Face:

1. Go to: https://huggingface.co/settings/tokens
2. Log in or create an account
3. Click Create new token
4. Provide a name, and set the role to read (write access is not needed to download a public model)
5. Copy the token

Then set it in your environment (replace YOUR_HF_TOKEN_HERE with your actual token):

import os
os.environ["HUGGINGFACEHUB_API_TOKEN"] = "YOUR_HF_TOKEN_HERE"
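Pasting a token into a notebook cell makes it easy to leak when the notebook is shared. As a minimal sketch, assuming you would rather be prompted for the token at run time, the helper below reuses the same HUGGINGFACEHUB_API_TOKEN variable; the function name ensure_hf_token is a hypothetical choice for this example:

```python
import os
from getpass import getpass


def ensure_hf_token(env=os.environ, ask=getpass):
    """Return the Hugging Face token, prompting only if it is not already set."""
    token = env.get("HUGGINGFACEHUB_API_TOKEN")
    if not token:
        # Prompt without echoing the token, then store it for later calls.
        token = ask("Hugging Face token: ").strip()
        env["HUGGINGFACEHUB_API_TOKEN"] = token
    return token
```

Calling ensure_hf_token() once near the top of the notebook should be enough; later cells can keep reading the environment variable as before.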

Step 3: Load the Flan-T5 Model from Hugging Face Hub

Make sure LangChain and its extras are up to date and pin numpy to a compatible version if needed:

!pip install -U langchain
!pip install "numpy<2"
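After upgrading or pinning packages, it can help to confirm which versions are actually installed before importing anything. The sketch below uses only the standard library; the helper name installed_versions is an assumption made for this example:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(
    ["langchain", "transformers", "torch", "accelerate", "numpy"]
).items():
    print(f"{name}: {ver or 'not installed'}")
```

If numpy reports a 2.x version after the pin, restarting the notebook runtime usually picks up the downgraded package.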

Then import the core libraries:

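import torch
from transformers import pipeline
# Recent LangChain releases keep these classes in langchain_community
# (pip install langchain-community); older releases expose them in langchain.
try:
    from langchain_community.llms import HuggingFaceHub, HuggingFacePipeline
except ImportError:
    from langchain.llms import HuggingFaceHub, HuggingFacePipeline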

(If you would rather call the hosted model through HuggingFaceHub instead of running it locally, you could do something like this and tune generation parameters with model_kwargs:)

# llm = HuggingFaceHub(
#     repo_id="google/flan-t5-large",
#     model_kwargs={"temperature": 1.0, "max_length": 512},
# )

Step 4: Create a LangChain HuggingFacePipeline

Now import the pieces needed to build an LLM chain:

from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate

Step 5: Build a Chain Using LangChain

First, specify which model you want to use:

model_name = "google/flan-t5-large"

Then create a standard Hugging Face pipeline for text-to-text generation:

hf_pipeline = pipeline("text2text-generation", model=model_name)
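The pipeline call above relies on library defaults for device placement and output length, so answers may run on the CPU and come back quite short. As a minimal sketch, assuming you want to use a GPU when one is available and allow longer answers, the helper below builds the extra keyword arguments. The name pipeline_kwargs and the limit of 256 tokens are illustrative choices, not part of the original demo:

```python
def pipeline_kwargs(cuda_available, max_new_tokens=256):
    """Build keyword arguments for transformers.pipeline (values are illustrative)."""
    return {
        "model": "google/flan-t5-large",
        # device=0 selects the first GPU; device=-1 keeps the model on the CPU.
        "device": 0 if cuda_available else -1,
        # Upper bound on generated tokens; 256 is an arbitrary choice.
        "max_new_tokens": max_new_tokens,
    }


print(pipeline_kwargs(cuda_available=False))
# In the notebook this could be used as:
# hf_pipeline = pipeline(
#     "text2text-generation", **pipeline_kwargs(torch.cuda.is_available())
# )
```

Keeping the settings in one dictionary makes it easy to compare runs with different limits without editing the pipeline call itself.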

Wrap that pipeline with LangChain’s HuggingFacePipeline LLM wrapper:

llm = HuggingFacePipeline(pipeline=hf_pipeline)

Next, define a simple prompt template that encourages step-by-step reasoning:

template = """Question: {question}
Answer: Let's think step by step."""
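To see the exact text the model will receive, you can fill the placeholder by hand. This is a plain-Python sketch, assuming the default f-string-style template format, where {question} is substituted much like str.format does; the sample question is made up for illustration:

```python
template = """Question: {question}
Answer: Let's think step by step."""

# str.format fills the {question} placeholder, much as PromptTemplate does
# for its default f-string-style templates.
filled = template.format(question="Why is the sky blue?")
print(filled)
```

The printed text has two lines: the question line, followed by the "Answer: Let's think step by step." cue that nudges the model toward a reasoned answer.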

Create the PromptTemplate and then the LLMChain:

prompt = PromptTemplate(
    template=template,
    input_variables=["question"],
)
llm_chain = LLMChain(prompt=prompt, llm=llm)
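Conceptually, the chain does two things: it fills the prompt template, then passes the resulting text to the model. The sketch below mimics that flow in plain Python so it runs without downloading a model. It is not LangChain's implementation, and make_chain and echo_llm are hypothetical names used only here:

```python
from typing import Callable


def make_chain(template: str, llm: Callable[[str], str]) -> Callable[[str], str]:
    """Return a function that fills the template and passes the result to llm."""

    def run(question: str) -> str:
        prompt_text = template.format(question=question)
        return llm(prompt_text)

    return run


def echo_llm(prompt_text: str) -> str:
    # Hypothetical stand-in for a real model: reports the prompt length.
    return f"(model saw {len(prompt_text)} characters)"


chain = make_chain(
    "Question: {question}\nAnswer: Let's think step by step.", echo_llm
)
print(chain("What is 2 + 2?"))
```

Swapping echo_llm for a real model call is the only change needed to turn the stand-in into something useful, which is roughly the role the HuggingFacePipeline wrapper plays in the real chain.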

Step 6: Run the Chain on a Few Example Questions

Now you can pass in different questions and let the chain handle the prompt formatting and model call.

# Question 1
question = "Explain the concept of black holes in simple terms."
print(llm_chain.run(question))

# Question 2
question = "What are the main causes of climate change, and how can we address them?"
print(llm_chain.run(question))

# Question 3
question = "Provide a brief overview of the history of artificial intelligence."
print(llm_chain.run(question))
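The three calls above repeat the same pattern, so a small loop can keep the notebook shorter as the list of questions grows. This is a sketch under the assumption that any function taking a question and returning an answer can be plugged in; ask_all and fake_ask are hypothetical names, and fake_ask only stands in for the real chain so the example runs on its own:

```python
def ask_all(ask, questions):
    """Call ask(question) for each question and return (question, answer) pairs."""
    return [(q, ask(q)) for q in questions]


questions = [
    "Explain the concept of black holes in simple terms.",
    "What are the main causes of climate change, and how can we address them?",
    "Provide a brief overview of the history of artificial intelligence.",
]


def fake_ask(question):
    # Stand-in so the sketch runs without a model; in the notebook,
    # pass llm_chain.run instead.
    return f"[answer to: {question}]"


for q, a in ask_all(fake_ask, questions):
    print(q, "->", a)
```

Returning the pairs instead of printing inside the helper leaves room to save the answers or compare them across prompt templates later.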
