decXmlParser Class Reference

XML Parser.

#include <decXmlParser.h>

Public Member Functions

Constructors and Destructors
 decXmlParser (deLogger *logger)
 Creates a new XML parser.
virtual ~decXmlParser ()
 Cleans up the XML parser.
Management
bool ParseXml (decBaseFileReader *file, decXmlDocument *doc)
 Parses the XML file using the given file reader into the given document object.
const char * GetCleanString () const
 Retrieves the clean string buffer.
void SetCleanString (int length)
 Copies the indicated number of characters from the token buffer to the clean string buffer and null-terminates it.
Error Handling

These functions allow applications to capture all errors raised during parsing.

Simply override the functions you are interested in.

virtual void UnexpectedEOF (int line, int pos)
 The end of the XML file has been reached at a point where it is not yet allowed.
virtual void UnexpectedToken (int line, int pos, const char *token)
 A token has been parsed that is not expected at this place.
Parsing Functions

These functions are used only by ParseXml and should not be called directly unless you want to write an extended XML parser.

void PrepareParse (decBaseFileReader *file)
 Prepares parsing the file by resetting all counters.
void ParseDocument (decXmlDocument *doc)
 Parses an XML file.
void ParseProlog (decXmlDocument *doc)
 Parses the XML file prolog.
void ParseXMLDecl (decXmlDocument *doc)
 Parses an XML Declaration.
void ParseDocTypeDecl (decXmlDocument *doc)
 Parses a document type declaration.
void ParseSystemLiteral (decXmlDocument *doc)
 Parses a system literal.
void ParsePublicLiteral (decXmlDocument *doc)
 Parses a public literal.
bool ParseDTD (decXmlDTD *dtd)
 Parses a DTD if one exists.
bool ParseElementTag (decXmlContainer *container, const char *requiredName)
 Parses an element tag but only if the tag name matches requiredName.
bool ParseReference (decXmlContainer *container)
 Parses a reference if one exists.
bool ParseCDSect (decXmlContainer *container)
 Parses a CDATA section if one exists.
void ParseAttribute (decXmlContainer *container)
 Parses an attribute.
void ParseAttValue (decXmlAttValue *value)
 Parses an attribute value.
bool ParseToken (const char *expected)
 Checks if the next token matches a certain name.
void ParseEquals ()
 Parses a value assignment.
int ParseSpaces ()
 Parses white spaces and returns the number of white spaces found.
void ParseEncName (decXmlDocument *doc)
 Parses an encoding name.
void ParseMisc (decXmlContainer *container)
 Parses any number of consecutive comments, processing instructions, or white spaces.
bool ParseComment (decXmlContainer *container)
 Parses a comment if present.
bool ParsePI (decXmlContainer *container)
 Parses a processing instruction if present.
int ParseName (int offset, bool autoRemove)
 Parses a name token.
bool TestToken (int offset, const char *expected)
 Determines if the token starting offset characters ahead of the current position matches a given name.
Testing
bool IsLatinLetter (int aChar)
bool IsLatinDigit (int aChar)
bool IsHex (int aChar)
bool IsLetter (int aChar)
bool IsDigit (int aChar)
bool IsExtender (int aChar)
bool IsBaseChar (int aChar)
bool IsCombiningChar (int aChar)
bool IsIdeographic (int aChar)
bool IsChar (int aChar)
bool IsPubidChar (int aChar, bool restricted)
bool IsSpace (int aChar)
Token Management
int GetTokenLineNumber () const
 Retrieves the line number of the current token.
int GetTokenPositionNumber () const
 Retrieves the position number of the current token.
int GetTokenAt (int index)
 Retrieves the character at the given index ahead from the current position.
void ClearToken ()
 Clears the current token buffer.
void RemoveFromToken (int length)
 Removes the given number of characters from the beginning of the token buffer.
void AddCharToToken (int aChar)
 Adds a character to the token buffer.
bool IsEOF ()
 Determines if the current position is at the end of the XML file.
void RaiseFatalError ()
 Raises a fatal error by first calling the appropriate error handler and then throwing an exception.

Detailed Description

XML Parser.

The XML parser processes an XML file provided by a file reader object. The content of the file is parsed and checked for syntax errors but not validated. The resulting XML tree is then available in the document. A single parser cannot parse two XML files at the same time.

A typical scenario looks like this:

 decXmlParser parser( myLogger ); // the constructor takes a deLogger pointer
 decXmlDocument document;
 if( parser.ParseXml( myFileReader, &document ) ){
     // success
 }else{
     // failure
 }

Errors occurring while parsing the XML file can be captured by overriding the error handling functions.
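
As a rough illustration of capturing errors through the virtual handlers, the sketch below collects error messages in a list. StandInXmlParser is a hypothetical stand-in that mirrors only the two documented handler signatures so the example compiles without the engine headers; in real code the subclass would derive from decXmlParser. The class name CollectingXmlParser and the message format are assumptions as well.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for decXmlParser, reduced to the two documented
// virtual error handlers so the sketch compiles without the real header.
class StandInXmlParser {
public:
    virtual ~StandInXmlParser() {}
    virtual void UnexpectedEOF(int line, int pos) { (void)line; (void)pos; }
    virtual void UnexpectedToken(int line, int pos, const char *token) {
        (void)line; (void)pos; (void)token;
    }
};

// Collects parse errors as readable messages (illustrative subclass).
class CollectingXmlParser : public StandInXmlParser {
public:
    std::vector<std::string> errors;

    void UnexpectedEOF(int line, int pos) override {
        errors.push_back("unexpected end of file at " + Where(line, pos));
    }

    void UnexpectedToken(int line, int pos, const char *token) override {
        errors.push_back("unexpected token '" + std::string(token ? token : "")
            + "' at " + Where(line, pos));
    }

private:
    // Formats a location as "line:pos".
    static std::string Where(int line, int pos) {
        return std::to_string(line) + ":" + std::to_string(pos);
    }
};
```

After a failed parse, the application can inspect the collected messages instead of relying only on the boolean result.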

Todo:
  • Add Schema support
  • Add Validation support for DTD and Schema
Author:
Plüss Roland
Version:
1.0
Date:
2008

Constructor & Destructor Documentation

decXmlParser::decXmlParser ( deLogger *  logger )

Creates a new XML parser.

virtual decXmlParser::~decXmlParser (  ) [virtual]

Cleans up the XML parser.


Member Function Documentation

void decXmlParser::AddCharToToken ( int  aChar )

Adds a character to the token buffer.

void decXmlParser::ClearToken (  )

Clears the current token buffer.

const char* decXmlParser::GetCleanString (  ) const [inline]

Retrieves the clean string buffer.

int decXmlParser::GetTokenAt ( int  index )

Retrieves the character at the given index ahead from the current position.

If the token buffer does not hold this character yet, all characters up to this position are read into the token buffer.

int decXmlParser::GetTokenLineNumber (  ) const [inline]

Retrieves the line number of the current token.
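
The on-demand buffering described for GetTokenAt can be pictured with the following minimal sketch. It is not the engine's implementation: LookaheadBuffer, its members, reading from a std::string instead of a file reader, and returning -1 when the input ends are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical lookahead buffer that reads input lazily, only as far as
// the caller actually looks ahead.
class LookaheadBuffer {
public:
    explicit LookaheadBuffer(const std::string &input)
        : pInput(input), pNext(0) {}

    // Returns the character `index` positions ahead of the current position,
    // reading more input into the token buffer if needed.
    // Returns -1 if the input ends before that position (assumption).
    int GetTokenAt(int index) {
        if (index < 0) return -1;
        while (pToken.size() <= static_cast<std::size_t>(index)) {
            if (pNext >= pInput.size()) return -1;
            pToken.push_back(pInput[pNext++]);
        }
        return static_cast<unsigned char>(pToken[index]);
    }

    // Drops `length` characters from the beginning of the token buffer.
    void RemoveFromToken(int length) {
        if (length <= 0) return;
        pToken.erase(0, static_cast<std::size_t>(length));
    }

    // Number of characters currently held in the token buffer.
    std::size_t BufferedCount() const { return pToken.size(); }

private:
    std::string pInput;  // stands in for the file reader
    std::string pToken;  // characters read but not yet consumed
    std::size_t pNext;   // next unread position in pInput
};
```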

int decXmlParser::GetTokenPositionNumber (  ) const [inline]

Retrieves the position number of the current token.

bool decXmlParser::IsBaseChar ( int  aChar )
bool decXmlParser::IsChar ( int  aChar )
bool decXmlParser::IsCombiningChar ( int  aChar )
bool decXmlParser::IsDigit ( int  aChar )
bool decXmlParser::IsEOF (  )

Determines if the current position is at the end of the XML file.

bool decXmlParser::IsExtender ( int  aChar )
bool decXmlParser::IsHex ( int  aChar )
bool decXmlParser::IsIdeographic ( int  aChar )
bool decXmlParser::IsLatinDigit ( int  aChar )
bool decXmlParser::IsLatinLetter ( int  aChar )
bool decXmlParser::IsLetter ( int  aChar )
bool decXmlParser::IsPubidChar ( int  aChar,
bool  restricted 
)
bool decXmlParser::IsSpace ( int  aChar )
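
The character tests are not documented further here. As a hedged sketch of what the simplest of them plausibly check, the free functions below use ASCII ranges and the white space set of the XML 1.0 specification (space, tab, carriage return, line feed). That the member functions behave exactly this way is an assumption.

```cpp
// Hypothetical free-function versions of four of the character tests.
// ASCII ranges and the XML 1.0 white space set are assumed; the real
// member functions may differ.
inline bool IsLatinLetter(int c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

inline bool IsLatinDigit(int c) {
    return c >= '0' && c <= '9';
}

// Hexadecimal digit as used in character references such as &#x3C;
inline bool IsHex(int c) {
    return IsLatinDigit(c) || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
}

// XML 1.0 white space: #x20 | #x9 | #xD | #xA
inline bool IsSpace(int c) {
    return c == 0x20 || c == 0x9 || c == 0xD || c == 0xA;
}
```
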
void decXmlParser::ParseAttribute ( decXmlContainer *  container )

Parses an attribute.

void decXmlParser::ParseAttValue ( decXmlAttValue *  value )

Parses an attribute value.

bool decXmlParser::ParseCDSect ( decXmlContainer *  container )

Parses a CDATA section if one exists.

Returns:
true if a CDATA section has been parsed
bool decXmlParser::ParseComment ( decXmlContainer *  container )

Parses a comment if present.

Returns:
true if a comment has been parsed
void decXmlParser::ParseDocTypeDecl ( decXmlDocument *  doc )

Parses a document type declaration.

void decXmlParser::ParseDocument ( decXmlDocument *  doc )

Parses an XML file.

bool decXmlParser::ParseDTD ( decXmlDTD *  dtd )

Parses a DTD if one exists.

Returns:
true if a DTD has been parsed
bool decXmlParser::ParseElementTag ( decXmlContainer *  container,
const char *  requiredName 
)

Parses an element tag but only if the tag name matches requiredName.

Returns:
true if an element tag has been parsed
void decXmlParser::ParseEncName ( decXmlDocument *  doc )

Parses an encoding name.

void decXmlParser::ParseEquals (  )

Parses a value assignment.

void decXmlParser::ParseMisc ( decXmlContainer *  container )

Parses any number of consecutive comments, processing instructions, or white spaces.

int decXmlParser::ParseName ( int  offset,
bool  autoRemove 
)

Parses a name token.

Parsing starts offset characters ahead of the current position. If a valid name token has been parsed, it is removed only if autoRemove is true.

Returns:
offset to the last character in the token measured from the current position.
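
To make the roles of offset and autoRemove concrete, here is a simplified sketch that scans a name in a plain string buffer. Everything in it is an assumption for illustration: the function name ParseNameSketch, the ASCII-only name rules (XML 1.0 names allow many more characters), the -1 result when no name starts at the offset, and the choice to drop everything up to the end of the name when autoRemove is set.

```cpp
#include <cstddef>
#include <string>

// Simplified ASCII-only name rules (assumption).
inline bool IsNameStart(int c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
        || c == '_' || c == ':';
}

inline bool IsNameChar(int c) {
    return IsNameStart(c) || (c >= '0' && c <= '9') || c == '.' || c == '-';
}

// Scans a name starting `offset` characters into `buffer`.
// Returns the offset of the last name character, or -1 if no name
// starts at `offset` (assumption). With autoRemove set, everything up to
// and including the name is removed from the front of the buffer.
int ParseNameSketch(std::string &buffer, int offset, bool autoRemove) {
    if (offset < 0) return -1;
    std::size_t pos = static_cast<std::size_t>(offset);
    if (pos >= buffer.size()
        || !IsNameStart(static_cast<unsigned char>(buffer[pos]))) {
        return -1;
    }
    // Extend over following name characters.
    while (pos + 1 < buffer.size()
           && IsNameChar(static_cast<unsigned char>(buffer[pos + 1]))) {
        ++pos;
    }
    if (autoRemove) buffer.erase(0, pos + 1);
    return static_cast<int>(pos);
}
```

With autoRemove set to false the caller can peek at a name, for example to compare it against a required tag name, before deciding to consume it.
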
bool decXmlParser::ParsePI ( decXmlContainer *  container )

Parses a processing instruction if present.

Returns:
true if a processing instruction has been parsed
void decXmlParser::ParseProlog ( decXmlDocument *  doc )

Parses the XML file prolog.

void decXmlParser::ParsePublicLiteral ( decXmlDocument *  doc )

Parses a public literal.

bool decXmlParser::ParseReference ( decXmlContainer *  container )

Parses a reference if one exists.

Returns:
true if a reference has been parsed
int decXmlParser::ParseSpaces (  )

Parses white spaces and returns the number of white spaces found.

void decXmlParser::ParseSystemLiteral ( decXmlDocument *  doc )

Parses a system literal.

bool decXmlParser::ParseToken ( const char *  expected )

Checks if the next token matches a certain name.

If found the token is consumed.

Returns:
true if the expected token has been found
bool decXmlParser::ParseXml ( decBaseFileReader *  file,
decXmlDocument *  doc 
)

Parses the XML file using the given file reader into the given document object.

Only one XML file can be processed at a time. Calling ParseXml on a parser that is already parsing a file causes the second invocation to fail.

Returns:
true on success or false otherwise
void decXmlParser::ParseXMLDecl ( decXmlDocument *  doc )

Parses an XML Declaration.

void decXmlParser::PrepareParse ( decBaseFileReader *  file )

Prepares parsing the file by resetting all counters.

void decXmlParser::RaiseFatalError (  )

Raises a fatal error by first calling the appropriate error handler and then throwing an exception.
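
The "handler first, then exception" pattern can be sketched as follows. This is only an illustration: FatalErrorSketch, OnError, Parse, the use of std::runtime_error, and the idea that the top-level entry point turns the exception into a false result are all assumptions, not the engine's actual classes or exception type.

```cpp
#include <stdexcept>

// Hypothetical class illustrating how a fatal error could propagate.
class FatalErrorSketch {
public:
    FatalErrorSketch() : handlerCalls(0) {}
    virtual ~FatalErrorSketch() {}

    int handlerCalls;  // counts handler invocations for demonstration

    // Stand-in for an overridable handler such as UnexpectedEOF.
    virtual void OnError() { ++handlerCalls; }

    // Notifies the handler first, then aborts parsing by throwing.
    void RaiseFatalError() {
        OnError();
        throw std::runtime_error("fatal XML parse error");
    }

    // Top-level entry point: converts the exception into a boolean
    // result (assumed behavior, modeled on the ParseXml return value).
    bool Parse(bool inputIsBroken) {
        try {
            if (inputIsBroken) RaiseFatalError();
            return true;
        } catch (const std::runtime_error &) {
            return false;
        }
    }
};
```

Because the handler runs before the exception is thrown, an overriding class sees the error location even though parsing stops immediately afterwards.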

void decXmlParser::RemoveFromToken ( int  length )

Removes the given number of characters from the beginning of the token buffer.

void decXmlParser::SetCleanString ( int  length )

Copies the indicated number of characters from the token buffer to the clean string buffer and null-terminates it.

bool decXmlParser::TestToken ( int  offset,
const char *  expected 
)

Determines if the token starting offset characters ahead of the current position matches a given name.

virtual void decXmlParser::UnexpectedEOF ( int  line,
int  pos 
) [virtual]

The end of the XML file has been reached at a point where it is not yet allowed.

Parameters:
line   Line number where the error occurred.
pos    Position from the beginning of the line where the error occurred.
virtual void decXmlParser::UnexpectedToken ( int  line,
int  pos,
const char *  token 
) [virtual]

A token has been parsed that is not expected at this place.

Parameters:
line   Line number where the error occurred.
pos    Position from the beginning of the line where the error occurred.
token  The unexpected token in unparsed form.

The documentation for this class was generated from the following file: