Description of the text contained in a DjVu page.
This class contains the textual data for the page. It describes the text as a hierarchy of zones corresponding to the page, columns, regions, paragraphs, lines, words, etc. The piece of text associated with each zone is represented by an offset and a length describing a segment of a global UTF-8 encoded string.
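The offset-and-length scheme described above can be illustrated with a minimal sketch. The types below (ZoneKind, SimpleZone) and the helpers zone_text and collect are hypothetical stand-ins written for this example and are not part of the library; the sketch also assumes that the offset and length are counted in bytes of the UTF-8 string.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for illustration only. The real
// Zone structure uses GRect, GList and GString rather than std types.
enum class ZoneKind { Page, Column, Region, Paragraph, Line, Word };

struct SimpleZone {
    ZoneKind kind;
    int text_start;   // assumed: byte offset into the shared UTF-8 string
    int text_length;  // assumed: number of bytes covered by this zone
    std::vector<SimpleZone> children;
};

// Return the segment of the shared string that a zone refers to.
std::string zone_text(const std::string &text_utf8, const SimpleZone &z) {
    assert(z.text_start >= 0 && z.text_length >= 0);
    assert(static_cast<std::size_t>(z.text_start) +
           static_cast<std::size_t>(z.text_length) <= text_utf8.size());
    return text_utf8.substr(z.text_start, z.text_length);
}

// Walk the hierarchy depth first and collect the text of every zone
// of the requested kind.
void collect(const std::string &text_utf8, const SimpleZone &z,
             ZoneKind kind, std::vector<std::string> &out) {
    if (z.kind == kind) out.push_back(zone_text(text_utf8, z));
    for (const SimpleZone &child : z.children)
        collect(text_utf8, child, kind, out);
}
```

Because every zone only stores a range, a parent zone and its children can share the same underlying string without copying any text.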
struct Zone
enum ZoneType ztype
GRect rect
int text_start
int text_length
GList<Zone> children
Zone* append_child()
GString textUTF8
Name                          Octal  ASCII name
DjVuText::end_of_column       013    VT, Vertical Tab
DjVuText::end_of_region       035    GS, Group Separator
DjVuText::end_of_paragraph    037    US, Unit Separator
DjVuText::end_of_line         012    LF, Line Feed
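As a rough illustration of how the control characters listed above might be handled, the sketch below splits a string at any of the four separators. The constants are local copies of the octal values from the table, and is_separator, split_on_separators and example_usage are hypothetical helpers written for this example, not library functions.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Local copies of the octal values listed in the table. They are
// defined here for illustration and are not the library's constants.
const char kEndOfColumn    = '\013';  // VT, Vertical Tab
const char kEndOfRegion    = '\035';  // GS, Group Separator
const char kEndOfParagraph = '\037';  // US, Unit Separator
const char kEndOfLine      = '\012';  // LF, Line Feed

// True if c is one of the four separator characters.
bool is_separator(char c) {
    return c == kEndOfColumn || c == kEndOfRegion ||
           c == kEndOfParagraph || c == kEndOfLine;
}

// Split text at every separator, dropping empty pieces.
std::vector<std::string> split_on_separators(const std::string &text) {
    std::vector<std::string> parts;
    std::string current;
    for (char c : text) {
        if (is_separator(c)) {
            if (!current.empty()) parts.push_back(current);
            current.clear();
        } else {
            current.push_back(c);
        }
    }
    if (!current.empty()) parts.push_back(current);
    return parts;
}

// Small usage example: a line break followed by a paragraph break.
void example_usage() {
    const std::string text = "first line\012second line\037";
    const std::vector<std::string> parts = split_on_separators(text);
    assert(parts.size() == 2);
    assert(parts[0] == "first line");
    assert(parts[1] == "second line");
}
```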
Zone page_zone
int has_valid_zones() const
void normalize_text()
void encode(ByteStream &bs) const
void decode(ByteStream &bs)
GP<DjVuTXT> copy(void) const
GList<Zone *> search_string(const char * string, int & start_pos, bool search_fwd, bool match_case, bool whole_word=false) const
start_pos - Position where to start searching. It may be negative
or it may be bigger than the length of the textUTF8
string. If the start_pos is out of bounds, it will be fixed
before starting the search
If the function manages to find an occurrence of the string,
it will modify the start_pos to point to it. If no match has
been found, the start_pos will be reset to some big number
if searching forward and -1 otherwise.
search_fwd - TRUE means to search forward. FALSE - backward.
match_case - If set to FALSE the search will be case-insensitive.
whole_word - If set to TRUE the function will try to find
a whole word matching the passed string. The word separators
are all blank and punctuation characters. The passed
string must not contain word separators; that is, it
must be a whole word.
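The start_pos, search_fwd and match_case behaviour described above can be sketched on a plain std::string. The function find_from below is a hypothetical helper, not the library's search_string: it returns a flag instead of a list of zones, it does not implement whole_word, and it assumes that the "big number" used after a failed forward search is the text length.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper mimicking the documented start_pos contract.
// Returns true and updates start_pos when a match is found.
bool find_from(const std::string &text, const std::string &needle,
               int &start_pos, bool search_fwd, bool match_case) {
    assert(!needle.empty());
    const int len = static_cast<int>(text.size());
    const int n = static_cast<int>(needle.size());

    // Fix an out-of-bounds start position before searching.
    int pos = std::max(0, std::min(start_pos, len));
    // A backward search cannot begin past the last possible match.
    if (!search_fwd) pos = std::min(pos, len - n);

    // Character comparison, optionally ignoring case.
    auto same = [match_case](char a, char b) {
        if (match_case) return a == b;
        return std::tolower(static_cast<unsigned char>(a)) ==
               std::tolower(static_cast<unsigned char>(b));
    };

    const int step = search_fwd ? 1 : -1;
    for (int i = pos; i >= 0 && i + n <= len; i += step) {
        if (std::equal(needle.begin(), needle.end(), text.begin() + i, same)) {
            start_pos = i;  // point start_pos at the occurrence
            return true;
        }
    }
    // No match: a large value when searching forward (assumed here to
    // be the text length), -1 when searching backward.
    start_pos = search_fwd ? len : -1;
    return false;
}
```

A caller that wants every occurrence could call such a function repeatedly, advancing start_pos past each match until the function reports failure.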
WARNING: The returned list contains pointers to Zones.
DO NOT DELETE these Zones.
unsigned int get_memory_usage() const