Parseport is a composable and customizable library for document ingestion.
While VLMs are improving quickly, they are not yet good enough to handle a variety of documents with complex visual elements in one shot. Document ingestion requires a hierarchical, modular approach that breaks the problem into multiple stages, each of which should ideally be customized for your specific needs.
The library offers pre-built modules, such as LayoutParser, DocumentReader and Formatter, that can be used to build custom pipelines. It comes with a few off-the-shelf models for layout detection and OCR, and also encourages bringing in custom fine-tuned models to support special needs.
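To make the staged approach concrete, here is a minimal, self-contained sketch of how layout parsing, reading, and formatting stages might be chained. It does not use Parseport's actual API: `Region`, `Document`, `run_pipeline`, and the three `toy_*` functions are hypothetical stand-ins written for illustration, and the "layout detection" here is just splitting text on blank lines.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Region:
    kind: str        # e.g. "title" or "paragraph"
    raw: str         # stand-in for the cropped region content
    text: str = ""   # filled in by the reading stage


@dataclass
class Document:
    source: str
    regions: List[Region] = field(default_factory=list)
    output: str = ""


# A stage is any callable that takes a Document and returns a Document.
Stage = Callable[[Document], Document]


def toy_layout_parser(doc: Document) -> Document:
    # Hypothetical layout stage: treat blank-line-separated blocks as regions,
    # and label the first block as the title.
    blocks = [b.strip() for b in doc.source.split("\n\n") if b.strip()]
    doc.regions = [
        Region(kind="title" if i == 0 else "paragraph", raw=b)
        for i, b in enumerate(blocks)
    ]
    return doc


def toy_reader(doc: Document) -> Document:
    # Hypothetical reading stage: normalize whitespace in place of real OCR.
    for region in doc.regions:
        region.text = " ".join(region.raw.split())
    return doc


def toy_formatter(doc: Document) -> Document:
    # Hypothetical formatting stage: emit a simple Markdown-like string.
    parts = [
        f"# {r.text}" if r.kind == "title" else r.text
        for r in doc.regions
    ]
    doc.output = "\n\n".join(parts)
    return doc


def run_pipeline(doc: Document, stages: List[Stage]) -> Document:
    # Run each stage in order, passing the document along.
    for stage in stages:
        doc = stage(doc)
    return doc


result = run_pipeline(
    Document(source="Annual Report\n\nRevenue grew\nin 2023."),
    [toy_layout_parser, toy_reader, toy_formatter],
)
print(result.output)
```

Because each stage shares the same signature, any one of them could be swapped for a custom implementation (for example, a fine-tuned layout model) without touching the others.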
Check out the Tutorial to get started.
Also take a look at:
Overview for a summary of the library and the motivation behind it.
Document Ingestion for an in-depth dive into document ingestion.
Open Source Tools for a list of open source tools that are helpful for building custom document ingestion pipelines.