Search results
Poppler is a free and open-source software library for rendering Portable Document Format (PDF) documents. Its development is supported by freedesktop.org. Commonly used on Linux systems, [4] it powers the PDF viewers of the GNOME and KDE desktop environments.
TeXworks is free and open-source application software, available for Windows, Linux and macOS. It is a Qt-based graphical user interface to the TeX typesetting system and its LaTeX, ConTeXt, and XeTeX extensions. TeXworks is targeted at direct generation of PDF output.
Beautiful Soup is a Python package for parsing HTML and XML documents, including those with malformed markup. It creates a parse tree for documents that can be used to extract data from HTML, [3] which is useful for web scraping.
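To make the idea of pulling data out of imperfect HTML concrete, here is a minimal sketch. It deliberately uses only Python's standard-library `html.parser` rather than Beautiful Soup's own API, so it is a stand-in for the technique, not an example of that package. The names `LinkExtractor` and `extract_links` are hypothetical, chosen for illustration.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect (href, text) pairs from <a> tags, tolerating sloppy markup."""

    def __init__(self):
        super().__init__()
        self.links = []     # finished (href, text) pairs
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._flush()   # treat an unclosed <a> as ended by the next <a>
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_endtag(self, tag):
        if tag == "a":
            self._flush()

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def close(self):
        super().close()
        self._flush()       # emit a link left open at end of input

    def _flush(self):
        if self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
        self._href = None
        self._text = []


def extract_links(html):
    """Return every (href, text) pair found in the given HTML string."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    # The first <a> is never closed and <b> is left open on purpose.
    print(extract_links('<p>See <a href="/a">first<a href="/b">second</a> <b>oops'))
```

A dedicated package such as Beautiful Soup builds a full navigable tree and handles far more cases; this event-based sketch only shows why tolerant parsing matters when scraping real pages.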
Dictionary Builder is a Rust program that can parse XML dumps and extract entries in files; Scripts for parsing Wikipedia dumps – Python-based scripts for parsing sql.gz files from Wikipedia dumps; parse-mediawiki-sql – a Rust library for quickly parsing the SQL dump files with minimal memory allocation
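As a rough illustration of what parsing an sql.gz dump involves, the sketch below reads a gzip-compressed SQL file line by line and splits each `INSERT INTO ... VALUES` statement into rows. It is not taken from any of the tools listed above; the function names `parse_insert_rows` and `iter_dump_rows` are hypothetical, and the sketch assumes one statement per line with backslash-escaped, single-quoted strings.

```python
import gzip


def parse_insert_rows(line):
    """Yield one tuple of raw string values per row of an INSERT line.

    Simplifications (assumptions of this sketch): values are returned as
    text, NULL stays the string "NULL", and a backslash escape simply
    keeps the following character.
    """
    start = line.find(" VALUES ")
    if not line.startswith("INSERT INTO") or start < 0:
        return
    i, n = start + len(" VALUES "), len(line)
    row, field = [], []
    in_str = in_row = False
    while i < n:
        ch = line[i]
        if in_str:
            if ch == "\\" and i + 1 < n:
                field.append(line[i + 1])   # keep the escaped character
                i += 1
            elif ch == "'":
                in_str = False              # closing quote
            else:
                field.append(ch)
        elif ch == "'":
            in_str = True                   # opening quote
        elif ch == "(" and not in_row:
            in_row, row, field = True, [], []
        elif ch == "," and in_row:
            row.append("".join(field))      # end of one value
            field = []
        elif ch == ")" and in_row:
            row.append("".join(field))      # end of the row
            yield tuple(row)
            in_row = False
        elif in_row:
            field.append(ch)                # unquoted value such as a number
        i += 1


def iter_dump_rows(path):
    """Stream rows from a gzip-compressed SQL dump without loading it whole."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as fh:
        for line in fh:
            yield from parse_insert_rows(line.rstrip("\n"))
```

Streaming the file one line at a time keeps memory use bounded by the longest statement, which matters because such dumps can be very large.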
A PDF creator and virtual PDF printer for Microsoft Windows PDF-XChange: Proprietary: Yes: PDF Tools allows creation of PDFs from many types of source input (images, scans, etc.). The PDF-XChange print driver allows printing directly to a PDF. A "lite" version of the print driver is free for non-commercial (home and academic) use. PrimoPDF ...
Sumatra PDF is a free and open-source document viewer that supports many document formats including: Portable Document Format (PDF), Microsoft Compiled HTML Help (CHM), DjVu, EPUB, FictionBook (FB2), MOBI, PRC, Open XML Paper Specification (OpenXPS, OXPS, XPS), and Comic Book Archive file (CB7, CBR, CBT, CBZ). [3]
Topic modeling to extract the main themes using NNMF and factor analysis. Correspondence analysis to identify words or concepts (or content categories) associated with any categorical metadata attached to documents. Pre- and post-processing with R and Python scripts; Analyze more than 70 languages including Chinese, Japanese, Korean ...
Finally, the liborigin [1] library can also read .OPJ files, for example via the opj2dat script, which exports the data tables contained in the file. There is also a free component (Orglab) maintained by OriginLab that can be used to create (or read) OPJ files. A free viewer application is also available.