PDFMiner
PDF parser and interpreter written entirely in Python
PDFMiner Ranking & Summary
- License: Freeware
- Price: FREE
- Publisher Name: Yusuke Shinyama
- Publisher web site: http://www.unixuser.org/~euske/
- Operating Systems: Mac OS X 10.0 or later
- File Size: 1.8 MB
PDFMiner Description
PDFMiner is a suite of programs for analyzing text data in PDF documents. It includes a PDF parser, a PDF renderer (only text rendering is supported for now), and a couple of tools for extracting text. Unlike other PDF-related tools, PDFMiner lets you obtain the exact location of text on a page, as well as other layout information such as font name and font size, which can be useful for analyzing the document.

Here are some key features of "PDFMiner":
· Written entirely in Python.
· PDF-1.7 specification support.
· Support for non-ASCII languages and vertical writing scripts.
· Support for various font types (Type1, TrueType, Type3, and CID).
· Basic encryption (RC4).
· PDF to HTML conversion (with a sample converter web app).
· Outline (TOC) extraction.
· Tagged contents extraction.

Requirements:
· Python 2.5 or later

What's New in This Release:
· Fixed rectangle handling. Able to extract image boundaries.
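As a rough illustration of the location information mentioned in the description, the sketch below lists each text box in a PDF together with its bounding box. It is not taken from this release: it assumes the later pdfminer.six fork, whose `pdfminer.high_level.extract_pages` and `pdfminer.layout.LTTextContainer` are used here, and the helper names `format_box` and `iter_text_boxes` are hypothetical. The original Python 2.5 release described on this page may expose a different interface.

```python
def format_box(bbox, text):
    """Render one text box as 'x0,y0,x1,y1<TAB>text'.

    Coordinates are printed with one decimal place, and runs of
    whitespace in the text are collapsed to single spaces.
    """
    x0, y0, x1, y1 = bbox
    return "%.1f,%.1f,%.1f,%.1f\t%s" % (x0, y0, x1, y1, " ".join(text.split()))


def iter_text_boxes(pdf_path):
    """Yield (page_number, bbox, text) for each text container in a PDF."""
    # Imported lazily so format_box works without the library installed.
    # Assumption: the pdfminer.six fork, which provides these two names.
    from pdfminer.high_level import extract_pages
    from pdfminer.layout import LTTextContainer

    for page_number, page_layout in enumerate(extract_pages(pdf_path), start=1):
        for element in page_layout:
            if isinstance(element, LTTextContainer):
                # bbox is (x0, y0, x1, y1) in PDF points, origin at bottom-left.
                yield page_number, element.bbox, element.get_text()


# Example use (hypothetical file name):
# for page_number, bbox, text in iter_text_boxes("sample.pdf"):
#     print("page %d\t%s" % (page_number, format_box(bbox, text)))
```

Keeping the formatting step separate from the parsing step means the output format can be changed, for example to CSV or JSON, without touching the code that walks the page layout.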