A Python package & command-line tool to gather text on the Web — trafilatura 1.12.2 documentation

Trafilatura is a Python package and command-line tool for gathering text on the Web. Its main applications are web crawling, downloading, scraping, and the extraction of main text, comments, and metadata.