Summary


The Summary pipeline summarizes text. This pipeline runs a text2text model that abstractively creates a summary of the input text.

Example

The following shows a simple example using this pipeline.

    from txtai.pipeline import Summary

    # Create and run pipeline
    summary = Summary()
    summary("Enter long, detailed text to summarize here")

See the link below for a more detailed example.

Notebook: Building abstractive text summaries
Description: Run abstractive text summarization

Configuration-driven example

Pipelines are run with Python or configuration. Pipelines can be instantiated in configuration using the lower case name of the pipeline. Configuration-driven pipelines are run with workflows or the API.

config.yml

    # Create pipeline using lower case class name
    summary:

    # Run pipeline with workflow
    workflow:
      summary:
        tasks:
          - action: summary

Run with Workflows

    from txtai import Application

    # Create and run pipeline with workflow
    app = Application("config.yml")
    list(app.workflow("summary", ["Enter long, detailed text to summarize here"]))

Run with API

    CONFIG=config.yml uvicorn "txtai.api:app" &

    curl \
      -X POST "http://localhost:8000/workflow" \
      -H "Content-Type: application/json" \
      -d '{"name":"summary", "elements":["Enter long, detailed text to summarize here"]}'
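The curl request above can also be issued from Python. The following is a minimal sketch using only the standard library; it assumes the API is listening on the same host and port as in the curl command, and the helper names `build_request` and `run_workflow` are illustrative, not part of txtai.

```python
import json
from urllib import request

# Assumed endpoint: same host, port and path as the curl command above
URL = "http://localhost:8000/workflow"


def build_request(name, elements, url=URL):
    """Builds a POST request for the workflow endpoint without sending it."""
    payload = json.dumps({"name": name, "elements": elements}).encode("utf-8")
    return request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_workflow(name, elements, url=URL):
    """Sends the request and decodes the JSON response. Requires a running API server."""
    with request.urlopen(build_request(name, elements, url)) as response:
        return json.loads(response.read().decode("utf-8"))


# Build (but do not send) the same request as the curl example
req = build_request("summary", ["Enter long, detailed text to summarize here"])
```

Calling `run_workflow("summary", [...])` sends the request once the server from the previous step is running.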

Methods

Python documentation for the pipeline.

__init__(path=None, quantize=False, gpu=True, model=None, **kwargs)

Source code in txtai/pipeline/text/summary.py

    def __init__(self, path=None, quantize=False, gpu=True, model=None, **kwargs):
        super().__init__("summarization", path, quantize, gpu, model, **kwargs)

__call__(text, minlength=None, maxlength=None, workers=0)

Runs a summarization model against a block of text.

This method supports text as a string or a list. If the input is a string, the return type is text. If text is a list, a list of text is returned with a row per block of text.

Parameters:

Name       Type  Description                                              Default
text       -     text|list                                                required
minlength  -     minimum length for summary                               None
maxlength  -     maximum length for summary                               None
workers    -     number of concurrent workers to use for processing data  0

Returns:

Type  Description
-     summary text

Source code in txtai/pipeline/text/summary.py

    def __call__(self, text, minlength=None, maxlength=None, workers=0):
        """
        Runs a summarization model against a block of text.

        This method supports text as a string or a list. If the input is a string, the return
        type is text. If text is a list, a list of text is returned with a row per block of text.

        Args:
            text: text|list
            minlength: minimum length for summary
            maxlength: maximum length for summary
            workers: number of concurrent workers to use for processing data, defaults to None

        Returns:
            summary text
        """

        # Validate text length greater than max length
        check = maxlength if maxlength else self.maxlength()

        # Skip text shorter than max length
        texts = text if isinstance(text, list) else [text]
        params = [(x, text if len(text) >= check else None) for x, text in enumerate(texts)]

        # Build keyword arguments
        kwargs = self.args(minlength, maxlength)

        inputs = [text for _, text in params if text]
        if inputs:
            # Run summarization pipeline
            results = self.pipeline(inputs, num_workers=workers, **kwargs)

            # Pull out summary text
            results = iter([self.clean(x["summary_text"]) for x in results])
            results = [next(results) if text else texts[x] for x, text in params]
        else:
            # Return original
            results = texts

        return results[0] if isinstance(text, str) else results
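The control flow of the method above (skip short text, summarize the rest in one batch, merge results back in order, return a string for string input) can be tried without loading a model. The sketch below is an illustration only: `summarize_with_skip` and `first_words` are hypothetical stand-ins, not txtai functions, and `check` plays the role of the length threshold the method derives from `maxlength`.

```python
def summarize_with_skip(text, summarizer, check):
    """
    Mirrors the control flow of __call__ using a stand-in summarizer.

    summarizer: hypothetical callable taking a list of strings and returning a list of summaries
    check: length threshold in characters; shorter text is returned unchanged
    """
    texts = text if isinstance(text, list) else [text]

    # Pair each index with its text, or None when the text is too short to summarize
    params = [(x, t if len(t) >= check else None) for x, t in enumerate(texts)]

    inputs = [t for _, t in params if t]
    if inputs:
        # Summarize the long inputs in one batch, then merge back in the original order
        summaries = iter(summarizer(inputs))
        results = [next(summaries) if t else texts[x] for x, t in params]
    else:
        # Nothing was long enough, return the original text
        results = texts

    # String in, string out; list in, list out
    return results[0] if isinstance(text, str) else results


def first_words(batch):
    # Stand-in for a real model: keeps the first three words of each input
    return [" ".join(t.split()[:3]) for t in batch]


short_text = "Too short"
long_text = "This sentence is long enough to be summarized by the stand-in"

single = summarize_with_skip(long_text, first_words, check=20)
mixed = summarize_with_skip([short_text, long_text], first_words, check=20)
```

Here `single` is a string because the input was a string, while `mixed` is a list in which the short entry is passed through unchanged and only the long entry is summarized.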