Translation
The Translation pipeline translates text between languages. It supports more than 100 languages, with automatic source language detection built in. The pipeline detects the language of each input text row, loads a model for the source-target combination, and translates the text to the target language.
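The detect, load, translate flow described above can be sketched in plain Python. This is a minimal illustration, not txtai's actual implementation: `detect` and `load_model` are hypothetical stand-ins for a language detector and a per-language-pair model loader, and passing through rows that are already in the target language is an assumption of this sketch.

```python
from collections import defaultdict


def translate_all(texts, target, detect, load_model):
    """
    Translate texts into target, grouping rows by detected source language.

    detect and load_model are hypothetical callables supplied by the caller:
      - detect(texts) returns one language code per text
      - load_model(source, target) returns a function that translates a list of texts
    """

    # Group row indices by detected source language
    groups = defaultdict(list)
    for index, language in enumerate(detect(texts)):
        groups[language].append(index)

    results = [None] * len(texts)
    for source, indices in groups.items():
        if source == target:
            # Assumption: rows already in the target language pass through unchanged
            for index in indices:
                results[index] = texts[index]
            continue

        # One model per source-target pair, applied to every row in that group
        model = load_model(source, target)
        translated = model([texts[index] for index in indices])
        for index, text in zip(indices, translated):
            results[index] = text

    return results


# Toy stand-ins so the sketch runs without downloading any models
def toy_detect(texts):
    return ["es" if "hola" in text.lower() else "en" for text in texts]


def toy_load_model(source, target):
    return lambda batch: [f"[{source}->{target}] {text}" for text in batch]


print(translate_all(["Hello world", "Hola mundo"], "es", toy_detect, toy_load_model))
```

Grouping by source language means each source-target model is loaded once per call rather than once per row, while the index bookkeeping keeps the output in the same order as the input.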
Example
The following shows a simple example using this pipeline.
from txtai.pipeline import Translation
# Create and run pipeline
translate = Translation()
translate("This is a test translation into Spanish", "es")
See the link below for a more detailed example.
Notebook | Description |
---|---|
Translate text between languages | Streamline machine translation and language detection |
Configuration-driven example
Pipelines are run with Python or configuration. Pipelines can be instantiated in configuration using the lower case name of the pipeline. Configuration-driven pipelines are run with workflows or the API.
config.yml
# Create pipeline using lower case class name
translation:

# Run pipeline with workflow
workflow:
  translate:
    tasks:
      - action: translation
        args: ["es"]
Run with Workflows
from txtai.app import Application
# Create and run pipeline with workflow
app = Application("config.yml")
list(app.workflow("translate", ["This is a test translation into Spanish"]))
Run with API
CONFIG=config.yml uvicorn "txtai.api:app" &
curl \
-X POST "http://localhost:8000/workflow" \
-H "Content-Type: application/json" \
-d '{"name":"translate", "elements":["This is a test translation into Spanish"]}'
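The same request can be built from Python with only the standard library. The sketch below mirrors the curl call above (same URL, header, and JSON body); `workflow_request` is a hypothetical helper name, and the send step is left commented out because it needs the API server to be running.

```python
import json
import urllib.request


def workflow_request(name, elements, url="http://localhost:8000/workflow"):
    """Build a POST request matching the curl call above."""
    body = json.dumps({"name": name, "elements": elements}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = workflow_request("translate", ["This is a test translation into Spanish"])

# Sending requires the API server started above to be running:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```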
Methods
Python documentation for the pipeline.
__init__(path=None, quantize=False, gpu=True, batch=64, langdetect=None, findmodels=True)
Constructs a new language translation pipeline.
Parameters:
Name | Description | Default |
---|---|---|
path | optional path to model, accepts Hugging Face model hub id or local path, uses default model for task if not provided | None |
quantize | if model should be quantized, defaults to False | False |
gpu | True/False if GPU should be enabled, also supports a GPU device id | True |
batch | batch size used to incrementally process content | 64 |
langdetect | set a custom language detection function, method must take a list of strings and return language codes for each, uses default language detector if not provided | None |
findmodels | True/False if the Hugging Face Hub will be searched for source-target translation models | True |
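As an illustration of the langdetect contract (a list of strings in, one language code per string out), here is a deliberately simple sketch. `detect_by_script` is a hypothetical function that only separates Cyrillic text from everything else; a real detector would need to be far more thorough.

```python
def detect_by_script(texts):
    """
    Hypothetical detector: returns one language code per input text.

    Texts containing Cyrillic characters are labeled "ru", everything else "en".
    """
    codes = []
    for text in texts:
        cyrillic = any("\u0400" <= char <= "\u04ff" for char in text)
        codes.append("ru" if cyrillic else "en")
    return codes


print(detect_by_script(["This is a test", "Это тест"]))

# Passing the function to the pipeline (requires txtai to be installed):
# from txtai.pipeline import Translation
# translate = Translation(langdetect=detect_by_script)
```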
Source code in txtai/pipeline/text/translation.py
__call__(texts, target="en", source=None)
Translates text from source language into target language.
This method accepts texts as a string or a list. If the input is a string, the return type is a string. If the input is a list, the return type is a list.
Parameters:
Name | Description | Default |
---|---|---|
texts | text or list of texts | required |
target | target language code, defaults to "en" | "en" |
source | source language code, detects language if not provided | None |
Returns:
Type | Description |
---|---|
str or list | translated text, a string when texts is a string and a list when texts is a list |
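The string-in, string-out and list-in, list-out convention can be sketched as a small wrapper. This is an illustration of the convention only, not the pipeline's code; `translate_batch` is a hypothetical function that takes and returns a list.

```python
def call_with_matching_type(texts, translate_batch):
    """
    Apply translate_batch (a hypothetical function taking and returning a list)
    and return a result shaped like the input.
    """
    # Wrap a single string so the batch function always receives a list
    values = [texts] if isinstance(texts, str) else texts
    results = translate_batch(values)

    # Unwrap again when the caller passed a single string
    return results[0] if isinstance(texts, str) else results


# Toy batch function standing in for a translation model
upper = lambda batch: [text.upper() for text in batch]

print(call_with_matching_type("hola", upper))
print(call_with_matching_type(["hola", "mundo"], upper))
```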
Source code in txtai/pipeline/text/translation.py