Search results
Results from the WOW.Com Content Network
PLY is a parsing tool written purely in Python. It is, in essence, a re-implementation of Lex and Yacc originally in C-language . It was written by David M. Beazley .
Pdf-parser is a command-line program that parses and analyses PDF documents. It provides features to extract raw data from PDF documents, like compressed images. pdf-parser can deal with malicious PDF documents that use obfuscation features of the PDF language. [1] The tool can also be used to extract data from damaged or corrupt PDF documents.
Desktop publishing (DTP) application allows opening and editing of PDF documents; Allows compatible saving as PDF 1.3, 1.4, 1.5 and 1.7 and supports also PDF/X1, PDF/X1a and PDF/X-3. pdf-parser: Public Domain Python script Yes Extraction and analysis tool, handles corrupt and malicious PDF documents. PDFedit: GNU GPL: Yes Yes BSD Yes
Beautiful Soup is a Python package for parsing HTML and XML documents, including those with malformed markup. It creates a parse tree for documents that can be used to extract data from HTML, [3] which is useful for web scraping. [2] [4]
This is a list of free and open-source software packages , computer software licensed under free software licenses and open-source licenses. Software that fits the Free Software Definition may be more appropriately called free software ; the GNU project in particular objects to their works being referred to as open-source . [ 1 ]
The Packrat parser is a type of parser that shares similarities with the recursive descent parser in its construction. However, it differs because it takes parsing expression grammars (PEGs) as input rather than LL grammars .
In computer programming, a parser combinator is a higher-order function that accepts several parsers as input and returns a new parser as its output. In this context, a parser is a function accepting strings as input and returning some structure as output, typically a parse tree or a set of indices representing locations in the string where parsing stopped successfully.
Protocol Buffers (Protobuf) is a free and open-source cross-platform data format used to serialize structured data. It is useful in developing programs that communicate with each other over a network or for storing data.