# LLM
The LLM pipeline runs prompts through a large language model (LLM). This pipeline autodetects whether the model path points to a text generation or a sequence-to-sequence model.
## Example
The following shows a simple example using this pipeline.
```python
from txtai.pipeline import LLM

# Create and run LLM pipeline
llm = LLM()

llm(
    """
    Answer the following question using the provided context.

    Question:
    What are the applications of txtai?

    Context:
    txtai is an open-source platform for semantic search and
    workflows powered by language models.
    """
)
```
The LLM pipeline automatically detects the underlying model type (`language-generation` or `sequence-sequence`). This can also be set manually.
```python
from txtai.pipeline import LLM, Generator, Sequences

# Set model type via task parameter
llm = LLM("google/flan-t5-xl", task="sequence-sequence")

# Create sequences pipeline (same as previous statement)
sequences = Sequences("google/flan-t5-xl")

# Set model type via task parameter
llm = LLM("openlm-research/open_llama_3b", task="language-generation")

# Create generator pipeline (same as previous statement)
generator = Generator("openlm-research/open_llama_3b")
```
Models can be externally loaded and passed to pipelines. This is useful for models that aren't yet supported by Transformers or that need special initialization.
```python
import torch

from transformers import AutoModelForCausalLM, AutoTokenizer

from txtai.pipeline import LLM

# Load Falcon-7B-Instruct
path = "tiiuae/falcon-7b-instruct"
model = AutoModelForCausalLM.from_pretrained(
    path,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(path)

llm = LLM((model, tokenizer))
```
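The externally loaded model is then called like any other LLM pipeline. A minimal sketch (the prompt and `maxlength` value below are illustrative):

```python
# Run a prompt through the externally loaded model
print(llm("What are the applications of txtai?", maxlength=256))
```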
See the links below for more detailed examples.
| Notebook | Description |
|:---------|:------------|
| Prompt-driven search with LLMs | Embeddings-guided and Prompt-driven search with Large Language Models (LLMs) |
| Prompt templates and task chains | Build model prompts and connect tasks together with workflows |
## Configuration-driven example
Pipelines are run with Python or configuration. Pipelines can be instantiated in configuration using the lower case name of the pipeline. Configuration-driven pipelines are run with workflows or the API.
**config.yml**

```yaml
# Create pipeline using lower case class name
# Use `generator` or `sequences` to force model type
llm:

# Run pipeline with workflow
workflow:
  llm:
    tasks:
      - action: llm
```
Similar to the Python example above, the underlying Hugging Face pipeline parameters and model parameters can be set in pipeline configuration.
```yaml
llm:
  path: tiiuae/falcon-7b-instruct
  torch_dtype: torch.bfloat16
  trust_remote_code: True
```
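Since the keyword arguments pass through to the underlying model load, this configuration roughly corresponds to the following Python. This is a sketch, assuming the same `**kwargs` passthrough behavior as the constructor shown in the Methods section below:

```python
import torch

from txtai.pipeline import LLM

# Equivalent Python instantiation of the configuration above
llm = LLM("tiiuae/falcon-7b-instruct", torch_dtype=torch.bfloat16, trust_remote_code=True)
```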
### Run with Workflows
```python
from txtai.app import Application

# Create and run pipeline with workflow
app = Application("config.yml")
list(app.workflow("llm", [
    """
    Answer the following question using the provided context.

    Question:
    What are the applications of txtai?

    Context:
    txtai is an open-source platform for semantic search and
    workflows powered by language models.
    """
]))
```
### Run with API
```bash
CONFIG=config.yml uvicorn "txtai.api:app" &

curl \
  -X POST "http://localhost:8000/workflow" \
  -H "Content-Type: application/json" \
  -d '{"name":"llm", "elements": ["Answer the following question..."]}'
```
## Methods
Python documentation for the pipeline.
### `__init__(self, path=None, quantize=False, gpu=True, model=None, task=None, **kwargs)`

Source code in `txtai/pipeline/text/llm.py`

```python
def __init__(self, path=None, quantize=False, gpu=True, model=None, task=None, **kwargs):
    super().__init__(self.task(path, task, **kwargs), path if path else "google/flan-t5-base", quantize, gpu, model, **kwargs)

    # Load tokenizer, if necessary
    self.pipeline.tokenizer = self.pipeline.tokenizer if self.pipeline.tokenizer else Models.tokenizer(path, **kwargs)
```
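As the source shows, when no path is given the pipeline falls back to `google/flan-t5-base`, and the `task` parameter skips model type auto-detection. A short sketch of both constructor paths:

```python
from txtai.pipeline import LLM

# No path given: falls back to the default model, google/flan-t5-base
llm = LLM()

# Explicit path with task set, bypassing model type detection
llm = LLM("google/flan-t5-xl", task="sequence-sequence")
```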
### `__call__(self, text, prefix=None, maxlength=512, workers=0, **kwargs)`

Generates text using input text.
Parameters:

| Name | Description | Default |
|------|-------------|---------|
| text | input text, a string or list of strings | required |
| prefix | optional prefix to prepend to text elements | None |
| maxlength | maximum sequence length | 512 |
| workers | number of concurrent workers to use for processing data | 0 |
| kwargs | additional generation keyword arguments | {} |
Returns:

| Type | Description |
|------|-------------|
| str\|list | generated text |
Source code in `txtai/pipeline/text/llm.py`

```python
def __call__(self, text, prefix=None, maxlength=512, workers=0, **kwargs):
    """
    Generates text using input text

    Args:
        text: text|list
        prefix: optional prefix to prepend to text elements
        maxlength: maximum sequence length
        workers: number of concurrent workers to use for processing data
        kwargs: additional generation keyword arguments

    Returns:
        generated text
    """

    # List of texts
    texts = text if isinstance(text, list) else [text]

    # Add prefix, if necessary
    if prefix:
        texts = [f"{prefix}{x}" for x in texts]

    # Run pipeline
    results = self.pipeline(texts, max_length=maxlength, num_workers=workers, **kwargs)

    # Get generated text
    results = [self.clean(texts[x], result) for x, result in enumerate(results)]

    return results[0] if isinstance(text, str) else results
```
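Per the signature and return statement above, a single string input returns a single string while a list returns a list, and `prefix` is prepended to each element. A usage sketch (the prompts are illustrative):

```python
from txtai.pipeline import LLM

llm = LLM()

# Single string in, single string out
result = llm("Translate to French: Hello")

# List in, list out; prefix is prepended to each element
results = llm(["Hello", "Goodbye"], prefix="Translate to French: ", maxlength=64)
```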