Last Modified: Apr 16, 2014
Parsing
This page documents the objects and functions that in some way deal with parsing or otherwise
manipulating text.
Everything here follows the same conventions as the rest of the library.
[top]base64
This object allows you to encode and decode data to and from
the Base64 Content-Transfer-Encoding defined in section 6.8 of
RFC 2045.
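To make the encoding concrete, here is a minimal sketch of the encoding direction written against the standard library only. It is not dlib's base64 object or its interface; the names base64_encode_sketch and byte_at are hypothetical, and the sketch emits one unbroken line rather than wrapping output at 76 characters.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of dlib): reads one byte as an unsigned value.
inline unsigned int byte_at(const std::string& s, std::size_t i)
{
    return static_cast<unsigned char>(s[i]);
}

// Hypothetical helper (not part of dlib): encodes bytes with the Base64
// alphabet from RFC 2045 section 6.8. No line wrapping is performed.
std::string base64_encode_sketch(const std::string& in)
{
    static const char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    std::string out;
    out.reserve(((in.size() + 2) / 3) * 4);

    std::size_t i = 0;
    // Each full group of 3 input bytes becomes 4 output characters.
    for (; i + 2 < in.size(); i += 3)
    {
        const unsigned int n =
            (byte_at(in, i) << 16) | (byte_at(in, i + 1) << 8) | byte_at(in, i + 2);
        out += alphabet[(n >> 18) & 63];
        out += alphabet[(n >> 12) & 63];
        out += alphabet[(n >> 6) & 63];
        out += alphabet[n & 63];
    }

    // A final group of 1 or 2 bytes is padded with '=' to 4 characters.
    const std::size_t rest = in.size() - i;
    if (rest == 1)
    {
        const unsigned int n = byte_at(in, i) << 16;
        out += alphabet[(n >> 18) & 63];
        out += alphabet[(n >> 12) & 63];
        out += "==";
    }
    else if (rest == 2)
    {
        const unsigned int n = (byte_at(in, i) << 16) | (byte_at(in, i + 1) << 8);
        out += alphabet[(n >> 18) & 63];
        out += alphabet[(n >> 12) & 63];
        out += alphabet[(n >> 6) & 63];
        out += '=';
    }
    return out;
}
```

With this sketch, base64_encode_sketch("Man") yields "TWFu" and a one byte input such as "M" yields "TQ==".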
#include <dlib/base64.h>
Detailed Documentation
C++ Example Programs: file_to_code_ex.cpp
[top]basic_utf8_ifstream
This object represents an input file stream much like the
normal std::ifstream except that it knows how to read UTF-8
data. So when you read characters out of this stream it will
automatically convert them from the UTF-8 multibyte encoding
into a fixed width wide character encoding.
There are also two typedefs of this object. The first, utf8_wifstream, reads into
wchar_t wide characters. The second, utf8_uifstream, reads into unichar instead of wchar_t.
#include <dlib/unicode.h>
Detailed Documentation
[top]cast_to_string
cast_to_string is a templated function which makes it easy to convert arbitrary objects to
std::string strings. The types supported are any types that can be written to std::ostream via
operator<<.
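The idea of converting anything that supports operator<< can be sketched with a string stream. This is an illustration of the technique, not dlib's cast_to_string; the name to_string_sketch and the choice of exception are assumptions.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of dlib): converts any type that can be
// written to std::ostream via operator<< into a std::string.
template <typename T>
std::string to_string_sketch(const T& item)
{
    std::ostringstream sout;
    sout << item;
    if (!sout)
        throw std::runtime_error("conversion to string failed");
    return sout.str();
}
```

For example, to_string_sketch(42) gives "42" and to_string_sketch(3.5) gives "3.5".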
#include <dlib/string.h>
Detailed Documentation
[top]cast_to_wstring
cast_to_wstring is a templated function which makes it easy to convert arbitrary objects to
std::wstring strings. The types supported are any types that can be written to std::wostream via
operator<<.
#include <dlib/string.h>
Detailed Documentation
[top]cmd_line_parser
This object allows you to easily parse a command line. Note that the
documentation for the cmd_line_parser_option (the object returned by the
parser's .option() function) is in a separate file.
Note also that there are standard typedefs for the ASCII and wide character versions of the
cmd_line_parser template. These are the command_line_parser and wcommand_line_parser
types respectively.
#include <dlib/cmd_line_parser.h>
Detailed Documentation
C++ Example Programs: compress_stream_ex.cpp, train_object_detector.cpp
Extensions to cmd_line_parser
get_option
This extension provides a convenience function for accessing the
options to a command line argument or a config_reader. It
is automatically #included when using the command line parser or config reader.
Detailed Documentation
[top]config_reader
This object reads text configuration files.
#include <dlib/config_reader.h>
Detailed Documentation
C++ Example Programs: config_reader_ex.cpp
Extensions to config_reader
[top]convert_utf8_to_utf32
This is a global function that can convert UTF-8 strings into strings
of 32bit unichar characters.
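The conversion from the UTF-8 multibyte encoding to fixed width 32-bit values can be sketched as follows. This is not dlib's implementation: the name utf8_to_utf32_sketch is hypothetical, std::u32string stands in for the library's string of unichar, and the sketch does not reject overlong encodings or surrogate code points.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of dlib): decodes UTF-8 bytes into
// 32-bit code points. Throws std::invalid_argument on malformed input.
std::u32string utf8_to_utf32_sketch(const std::string& in)
{
    std::u32string out;
    std::size_t i = 0;
    while (i < in.size())
    {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected

        // The lead byte's high bits give the sequence length.
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (extra > in.size() - i - 1)
            throw std::invalid_argument("truncated UTF-8 sequence");

        // Each continuation byte (10xxxxxx) contributes 6 payload bits.
        for (std::size_t k = 1; k <= extra; ++k)
        {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }

        out.push_back(cp);
        i += extra + 1;
    }
    return out;
}
```

For instance, the three bytes 0xE2 0x82 0xAC decode to the single code point U+20AC.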
#include <dlib/unicode.h>
Detailed Documentation
[top]cpp_pretty_printer
This object represents an HTML pretty printer for C++ source code.
#include <dlib/cpp_pretty_printer.h>
Detailed Documentation
Implementations:
cpp_pretty_printer_kernel_1:
This is implemented by using the cpp_tokenizer object.
This is the pretty printer I use on all the source in this library. It applies a color scheme, turns
include directives such as #include "file.h" into links to file.h.html and puts HTML anchor points
on function and class declarations. It also looks for comments starting with /*!A and puts an anchor
before the comment using the word following the A as the name of the anchor.
kernel_1a is a typedef for cpp_pretty_printer_kernel_1
cpp_pretty_printer_kernel_2:
This is implemented by using the cpp_tokenizer object.
It applies a black and white color scheme suitable
for printing on a black and white printer. It also places the document title
prominently at the top of the pretty printed source file.
kernel_2a is a typedef for cpp_pretty_printer_kernel_2
[top]cpp_tokenizer
This object represents a simple tokenizer for C++ source code.
#include <dlib/cpp_tokenizer.h>
Detailed Documentation
Implementations:
cpp_tokenizer_kernel_1:
This is implemented by using the tokenizer object in the obvious way.
kernel_1a is a typedef for cpp_tokenizer_kernel_1
kernel_1a_c is a typedef for kernel_1a that checks its preconditions.
[top]is_combining_char
This is a global function that can tell you if a character is a Unicode
combining character or not.
#include <dlib/unicode.h>
Detailed Documentation
[top]left_substr
This is a function to return the part of a string to the left of a user supplied delimiter.
#include <dlib/string.h>
Detailed Documentation
[top]lpad
This is a function to pad whitespace (or user specified characters) onto the leftmost end of a string.
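Left padding can be sketched in a few lines. This is an illustration only, not dlib's lpad: the name lpad_sketch is hypothetical, the width argument is assumed to be the desired total length, and the fill is a single character.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of dlib): prepends `fill` until the
// result is at least `width` characters long. Longer strings are
// returned unchanged.
std::string lpad_sketch(const std::string& s, std::size_t width, char fill = ' ')
{
    if (s.size() >= width)
        return s;
    return std::string(width - s.size(), fill) + s;
}
```

Right padding is the mirror image: append the fill characters instead of prepending them.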
#include <dlib/string.h>
Detailed Documentation
[top]ltrim
This is a function to remove the whitespace (or user specified characters) from the leftmost end of a string.
#include <dlib/string.h>
Detailed Documentation
[top]narrow
This is a function for converting a string of type std::string or std::wstring
to a plain std::string.
#include <dlib/string.h>
Detailed Documentation
[top]pad
This is a function to pad whitespace (or user specified characters) onto the ends of a string.
#include <dlib/string.h>
Detailed Documentation
[top]pad_int_with_zeros
Converts an integer into a string and pads it with leading zeros.
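Zero padding an integer can be sketched with the standard stream manipulators. This is not dlib's pad_int_with_zeros; the name zero_pad_sketch and the meaning of width (minimum total field width, sign included) are assumptions.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper (not part of dlib): formats `value` in a field of
// at least `width` characters, filling with '0'. std::internal keeps a
// minus sign in front of the padding.
std::string zero_pad_sketch(int value, int width)
{
    std::ostringstream sout;
    sout << std::internal << std::setfill('0') << std::setw(width) << value;
    return sout.str();
}
```

For example, zero_pad_sketch(7, 3) gives "007", which keeps numbered file names in lexicographic order.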
#include <dlib/string.h>
Detailed Documentation
[top]right_substr
This is a function to return the part of a string to the right of a user supplied delimiter.
#include <dlib/string.h>
Detailed Documentation
[top]rpad
This is a function to pad whitespace (or user specified characters) onto the rightmost end of a string.
#include <dlib/string.h>
Detailed Documentation
[top]rtrim
This is a function to remove the whitespace (or user specified characters) from the rightmost end of a string.
#include <dlib/string.h>
Detailed Documentation
[top]split
Breaks a string into a sequence of substrings delimited
by a user specified set of characters.
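Splitting on a set of delimiter characters can be sketched with find_first_of and find_first_not_of. This is not dlib's split; the name split_sketch is hypothetical, and the choice to skip empty substrings between adjacent delimiters is an assumption.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of dlib): breaks `s` at any character
// found in `delims`. Runs of delimiters produce no empty substrings.
std::vector<std::string> split_sketch(const std::string& s,
                                      const std::string& delims = " \n\r\t")
{
    std::vector<std::string> parts;
    std::string::size_type start = s.find_first_not_of(delims);
    while (start != std::string::npos)
    {
        // `end` is npos for the last substring; substr clamps the count.
        const std::string::size_type end = s.find_first_of(delims, start);
        parts.push_back(s.substr(start, end - start));
        start = s.find_first_not_of(delims, end);
    }
    return parts;
}
```

For example, split_sketch("k=v;x", "=;") returns the three substrings "k", "v" and "x".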
#include <dlib/string.h>
Detailed Documentation
[top]split_on_first
Breaks a string into two parts. The split point is selected based
on the first occurrence of a delimiter character.
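A two part split at the first delimiter can be sketched as below. This is not dlib's split_on_first; the name split_on_first_sketch is hypothetical, and returning the whole string plus an empty second part when no delimiter is found is an assumption.

```cpp
#include <string>
#include <utility>

// Hypothetical helper (not part of dlib): splits `s` at the first
// character that appears in `delims`. The delimiter itself is dropped.
std::pair<std::string, std::string> split_on_first_sketch(
    const std::string& s, const std::string& delims = " \n\r\t")
{
    const std::string::size_type pos = s.find_first_of(delims);
    if (pos == std::string::npos)
        return std::make_pair(s, std::string());
    return std::make_pair(s.substr(0, pos), s.substr(pos + 1));
}
```

Splitting on the last occurrence instead only requires swapping find_first_of for find_last_of.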
#include <dlib/string.h>
Detailed Documentation
[top]split_on_last
Breaks a string into two parts. The split point is selected based
on the last occurrence of a delimiter character.
#include <dlib/string.h>
Detailed Documentation
[top]strings_equal_ignore_case
This is a pair of functions to do a case insensitive comparison between strings.
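A whole string case insensitive comparison can be sketched with std::tolower. This is not dlib's strings_equal_ignore_case and covers only one comparison; the name equal_ignore_case_sketch is hypothetical, and the sketch handles single byte characters in the current C locale only.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of dlib): returns true if the two
// strings match once every character is lower-cased.
bool equal_ignore_case_sketch(const std::string& a, const std::string& b)
{
    if (a.size() != b.size())
        return false;
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        // Cast through unsigned char: std::tolower is undefined for
        // negative values other than EOF.
        const int ca = std::tolower(static_cast<unsigned char>(a[i]));
        const int cb = std::tolower(static_cast<unsigned char>(b[i]));
        if (ca != cb)
            return false;
    }
    return true;
}
```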
#include <dlib/string.h>
Detailed Documentation
[top]string_assign
string_assign is an object which makes it easy to convert strings to
other types. The types supported are any types that can be read by the basic_istream operator>>. It
also supports casting between wstring, string, and ustring objects. Since
string_assign is a simple stateless object there is a global instance of it
called dlib::sa.
#include <dlib/string.h>
Detailed Documentation
C++ Example Programs: config_reader_ex.cpp
[top]string_cast
string_cast is a templated function which makes it easy to convert strings to
other types. The types supported are any types that can be read by the basic_istream operator>>. It
also supports casting between wstring, string, and ustring objects.
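Conversion from a string through operator>> can be sketched with a string stream. This is not dlib's string_cast: the name from_string_sketch and the exception type are assumptions, and the sketch rejects trailing non-whitespace text, so it is not suitable for reading a multi-word std::string.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of dlib): parses `text` into a T using
// operator>>. Throws std::invalid_argument if parsing fails or if
// anything other than whitespace follows the parsed value.
template <typename T>
T from_string_sketch(const std::string& text)
{
    std::istringstream sin(text);
    T value;
    sin >> value;
    if (!sin)
        throw std::invalid_argument("cannot convert \"" + text + "\"");

    // Any further readable character means the input had trailing junk.
    char leftover;
    if (sin >> leftover)
        throw std::invalid_argument("trailing characters in \"" + text + "\"");
    return value;
}
```

For example, from_string_sketch<int>("42") returns 42, while from_string_sketch<int>("42abc") throws.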
#include <dlib/string.h>
Detailed Documentation
[top]tokenizer
This object represents a simple tokenizer for textual data.
#include <dlib/tokenizer.h>
Detailed Documentation
Implementations:
tokenizer_kernel_1:
This is implemented in the obvious way.
kernel_1a is a typedef for tokenizer_kernel_1
kernel_1a_c is a typedef for kernel_1a that checks its preconditions.
[top]trim
This is a function to remove the whitespace (or user specified characters) from the ends of a string.
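Trimming both ends can be sketched with find_first_not_of and find_last_not_of. This is not dlib's trim; the name trim_sketch and the default character set are assumptions.

```cpp
#include <string>

// Hypothetical helper (not part of dlib): removes any characters found
// in `chars` from both ends of `s`. Interior characters are untouched.
std::string trim_sketch(const std::string& s, const std::string& chars = " \t\r\n")
{
    const std::string::size_type first = s.find_first_not_of(chars);
    if (first == std::string::npos)
        return std::string();  // nothing but trimmable characters
    const std::string::size_type last = s.find_last_not_of(chars);
    return s.substr(first, last - first + 1);
}
```

Trimming only the left or only the right end uses just one of the two searches.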
#include <dlib/string.h>
Detailed Documentation
[top]unichar
This is a typedef for an unsigned 32bit integer which we use to store
Unicode values.
#include <dlib/unicode.h>
Detailed Documentation
[top]ustring
This is a typedef for a std::basic_string<unichar>. That is, it is a typedef
for a string object that stores unichar Unicode characters.
#include <dlib/unicode.h>
Detailed Documentation
[top]wrap_string
wrap_string is a function that takes a string and breaks it into a number of
lines of a given length. You can use this to make a string
fit nicely into a command prompt window for example.
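Breaking text into lines of a bounded length can be sketched with a greedy word wrap. This is not dlib's wrap_string and does not model its parameters; the name wrap_sketch is hypothetical, runs of whitespace are collapsed to single spaces, and a word longer than max_width is left unbroken on its own line.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper (not part of dlib): greedy word wrap. Words are
// appended to the current line until the next one would exceed
// `max_width`, at which point a newline is started.
std::string wrap_sketch(const std::string& text, std::size_t max_width)
{
    std::istringstream words(text);
    std::string word;
    std::string out;
    std::size_t line_len = 0;

    while (words >> word)
    {
        if (line_len != 0 && line_len + 1 + word.size() > max_width)
        {
            out += '\n';        // word does not fit: start a new line
            line_len = 0;
        }
        else if (line_len != 0)
        {
            out += ' ';         // word fits: separate it with one space
            ++line_len;
        }
        out += word;
        line_len += word.size();
    }
    return out;
}
```

For example, wrap_sketch("the quick brown fox", 10) produces two lines, "the quick" and "brown fox".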
#include <dlib/string.h>
Detailed Documentation
[top]xml_parser
This object represents a simple SAX style event driven XML parser.
It takes its input from an input stream object and sends events to all
registered document_handler and error_handler objects.
The xml_parser object also uses the interface classes document_handler and
error_handler. Subclasses of these classes are passed to the xml_parser, which
generates events while it's parsing and sends them to the appropriate handler.
#include <dlib/xml_parser.h>
Detailed Documentation
C++ Example Programs: xml_parser_ex.cpp