alpcentaur
/
basabuuka_prototyp

Metadata-Version: 2.1Name: pdfminer3kVersion: 1.3.1Summary: PDF parser and analyzerHome-page: https://github.com/jaepil/pdfminer3kAuthor: Yusuke ShinyamaAuthor-email: yusuke at cs dot nyu dot eduMaintainer: Jaepil Jeong, Virgil DuprasMaintainer-email: jaepil@kaist.ac.kr, hsoft@hardcoded.netLicense: MIT/XKeywords: pdf parser,pdf converter,layout analysis,text miningPlatform: UNKNOWNClassifier: Development Status :: 4 - BetaClassifier: Environment :: ConsoleClassifier: Intended Audience :: DevelopersClassifier: Intended Audience :: Science/ResearchClassifier: License :: OSI Approved :: MIT LicenseClassifier: Topic :: Text ProcessingRequires-Dist: pytest (>=2.0)Requires-Dist: ply (>=3.4)
pdfminer3k is a Python 3 port of pdfminer.PDFMiner is a tool for extracting information from PDF documents.Unlike other PDF-related tools, it focuses entirely on getting and analyzing text data. PDFMiner allows to obtainthe exact location of texts in a page, as well as other information such as fonts or lines.It includes a PDF converter that can transform PDF filesinto other text formats (such as HTML). It has an extensiblePDF parser that can be used for other purposes instead of text analysis.
Changes=======
Version 1.3.1 -- 2016/11/05---------------------------
* Replaced root loggers with module-wide loggers. This allows user to disable the log messages from pdfminer3k.
Version 1.3.0 -- 2012/07/20---------------------------
* Added `pdfexplore`, a tool to debug PDFs by exploring their data.* Don't try to group textboxes when there's too many (it takes too long).* Support object references as filters in streams.* Parse every object as soon as an objectid can't be found.* Improved the `STRICT`-based error handling idiom.
Version 1.2.4 -- 2011/10/07---------------------------
* When xref tables are corrupt, parse and cache all objects as a fallback.* Fixed a bogus assertion in layouts.
Version 1.2.3 -- 2011/09/05---------------------------
* Fixed a crash on uneven cmap codes.* Fixed a meta-crash caused by bad PSParser repr.
Version 1.2.2 -- 2011/08/30---------------------------
* Fixed crash on corrupt LZW data.* Ignore lines with no text for textlines grouping.* Don't crash on invalid dictionary constructs when parsing postscript.
Version 1.2.1 -- 2011/08/22---------------------------
* Fixed a crash on corrupted inline images.* Tweaked layout detection algo.
Version 1.2.0 -- 2011/08/09---------------------------
* There wasn't a changelog until now. Starting it.* Removed the old Postscript lexer and replaced it by a PLY-based one.* Added a couple of heuristic layout features.* Fixed a couple of crashes on opening PDFs.