xmerl_scan (xmerl v2.0)
Single pass XML scanner.
This module is the interface to the XML parser. It handles XML 1.0.
The XML parser is activated through xmerl_scan:string/[1,2] or xmerl_scan:file/[1,2]. It returns records of the type defined in xmerl.hrl.
See also the "Customization functions" tutorial.
Summary
Types
document() - An XML document.
global_state() - The global state of the scanner, represented by the #xmerl_scanner{} record.
option_list() - Options to customize the behaviour of the scanner. See also the "Customization functions" tutorial.
xmlDocument() - Record #xmlDocument{}.
xmlElement() - Record #xmlElement{}.
Functions
accumulate_whitespace/4 - Accumulate and normalize whitespace.
cont_state/1 - Fetch the ContinuationState.
cont_state/2 - Set the ContinuationState, to be used in a continuation function.
event_state/1 - Fetch the EventState.
event_state/2 - Set the EventState, to be used in an event function.
fetch_state/1 - Fetch the FetchState.
fetch_state/2 - Set the FetchState, to be used in a fetch function.
file/1 - Equivalent to file(Filename, []).
file/2 - Parse a file containing an XML document.
hook_state/1 - Fetch the HookState.
hook_state/2 - Set the HookState, to be used in a hook function.
rules_state/1 - Fetch the RulesState.
rules_state/2 - Set the RulesState, to be used in a rules function.
string/1 - Equivalent to string(Text, []).
string/2 - Parse a string containing an XML document.
user_state/1 - Fetch the UserState.
user_state/2 - Set the UserState, to be used in a user function.
Types
-type document() :: xmlElement() | xmlDocument().
An XML document.
The document returned by xmerl_scan:string/[1,2] and xmerl_scan:file/[1,2]. The type of the returned record depends on the value of the document option passed to the function.
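As a minimal sketch of how the document option changes the returned record: the module name doc_option_example and both helper functions are hypothetical, introduced only for illustration. The record definitions are taken from xmerl.hrl.

```erlang
-module(doc_option_example).
-export([root_name/1, top_level_names/1]).

%% #xmlElement{} and #xmlDocument{} are defined in xmerl.hrl.
-include_lib("xmerl/include/xmerl.hrl").

%% Default ({document, false}): the result is the root #xmlElement{}.
root_name(Xml) ->
    {#xmlElement{name = Name}, _Rest} = xmerl_scan:string(Xml),
    Name.

%% With {document, true}: the result is an #xmlDocument{} whose content
%% list holds the root element (and possibly other top-level nodes).
top_level_names(Xml) ->
    {#xmlDocument{content = Content}, _Rest} =
        xmerl_scan:string(Xml, [{document, true}]),
    [Name || #xmlElement{name = Name} <- Content].
```

For example, doc_option_example:root_name("<a><b/></a>") matches on an #xmlElement{} record, while top_level_names/1 has to look inside the #xmlDocument{} wrapper first.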
-type global_state() :: xmerl_scanner().
The global state of the scanner, represented by the #xmerl_scanner{} record.
-type option_list() :: [{atom(), term()} | {atom(), fun(), term()} | {atom(), fun(), fun(), term()}].
Options customize the behaviour of the scanner. See also the "Customization functions" tutorial.
Possible options are:
{acc_fun, Fun}
- Callback function to accumulate the contents of an entity.
{continuation_fun, Fun} | {continuation_fun, Fun, ContinuationState}
- Callback function to decide what to do if the scanner runs into EOF before the document is complete.
{event_fun, Fun} | {event_fun, Fun, EventState}
- Callback function to handle scanner events.
{fetch_fun, Fun} | {fetch_fun, Fun, FetchState}
- Callback function to fetch an external resource.
{hook_fun, Fun} | {hook_fun, Fun, HookState}
- Callback function to process the document entities once identified.
{close_fun, Fun}
- Called when the document has been completely parsed.
{rules, ReadFun, WriteFun, RulesState} | {rules, Rules}
- Handles storing of scanner information when parsing.
{user_state, UserState}
- Global state variable accessible from all customization functions.
{fetch_path, PathList}
- PathList is a list of directories to search when fetching files. If the file in question is not in the fetch_path, the URI will be used as a file name.
{space, Flag}
- preserve (default) to preserve spaces, normalize to accumulate consecutive whitespace and replace it with one space.
{line, Line}
- Specifies the starting line for scanning in a document which contains fragments of XML.
{namespace_conformant, Flag}
- Controls whether to behave as a namespace conformant XML parser: false (default) to not do so, true otherwise.
{validation, Flag}
- Controls whether to process as a validating XML parser: off (default) for no validation, dtd for validation by DTD, or schema for validation by XML Schema. The false and true options are obsolete (i.e. they may be removed in a future release); if used, false equals off and true equals dtd.
{schemaLocation, [{Namespace,Link}|...]}
- Tells explicitly which XML Schema documents to use to validate the XML document. Used together with the {validation,schema} option.
{quiet, Flag}
- Set to true if Xmerl should behave quietly and not output any information to standard output (default false).
{doctype_DTD, DTD}
- Specifies the DTD name when it isn't available in the XML document. This option has effect only together with the {validation,dtd} option.
{xmlbase, Dir}
- XML Base directory. If using string/1, the default is the current directory. If using file/1, the default is the directory of the given file.
{encoding, Enc}
- Set the default character set used (default UTF-8). This character set is used only if not explicitly given by the XML declaration.
{document, Flag}
- Set to true if Xmerl should return a complete XML document as an xmlDocument record (default false).
{comments, Flag}
- Set to false if Xmerl should skip comments, otherwise they will be returned as xmlComment records (default true).
{default_attrs, Flag}
- Set to true if Xmerl should add to elements missing attributes with a defined default value (default false).
{allow_entities, Flag}
- Set to true if xmerl_scan shouldn't fail when there is an ENTITY declaration in the XML document (default false).
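The effect of two of the options above, space and comments, can be sketched as follows. The module name scan_options_example and its helper functions are hypothetical and exist only for this illustration.

```erlang
-module(scan_options_example).
-export([text/2, child_kinds/2]).

%% #xmlElement{} and #xmlText{} are defined in xmerl.hrl.
-include_lib("xmerl/include/xmerl.hrl").

%% Concatenated character data found directly under the root element.
text(Xml, Options) ->
    {#xmlElement{content = Content}, _Rest} = xmerl_scan:string(Xml, Options),
    lists:flatten([Value || #xmlText{value = Value} <- Content]).

%% Record tags of the root element's children, in document order.
child_kinds(Xml, Options) ->
    {#xmlElement{content = Content}, _Rest} = xmerl_scan:string(Xml, Options),
    [element(1, Node) || Node <- Content].
```

Calling text("<p>a   b</p>", [{space, normalize}]) should collapse the run of spaces into one, and child_kinds("<a><!-- c --><b/></a>", [{comments, false}]) should no longer list an xmlComment record among the children.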
-type xmlDocument() :: xmerl:xmlDocument().
Record #xmlDocument{}.
The record definition is found in xmerl.hrl.
-type xmlElement() :: xmerl:xmlElement().
Record #xmlElement{}.
The record definition is found in xmerl.hrl.
Functions
-spec accumulate_whitespace(Text, global_state(), How, Acc) -> {NewAcc, NewText, global_state()} when Text :: string(), How :: preserve | normalize, Acc :: string(), NewAcc :: string(), NewText :: string().
Accumulate and normalize whitespace.
-spec cont_state(global_state()) -> ContinuationState when ContinuationState :: term().
Fetch the ContinuationState.
See the "Customization functions" tutorial.
-spec cont_state(ContState :: term(), global_state()) -> global_state().
Set the ContinuationState, to be used in a continuation function.
The continuation function is called when the parser encounters the end of the byte stream. See the "Customization functions" tutorial.
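A minimal sketch of a continuation function that feeds the scanner from a list of pending chunks kept in the ContinuationState. The module name cont_state_example is hypothetical, and the three-argument callback shape (Continue, Exception, GlobalState) is assumed from the "Customization functions" tutorial. The sketch also assumes chunk boundaries fall inside character data rather than in the middle of markup.

```erlang
-module(cont_state_example).
-export([scan_chunks/1]).

-include_lib("xmerl/include/xmerl.hrl").

%% Scan an XML document supplied as a non-empty list of string chunks.
scan_chunks([First | Pending]) ->
    Cont = fun(Continue, Exception, S) ->
                   case xmerl_scan:cont_state(S) of
                       [Chunk | More] ->
                           %% Hand over the next chunk and keep the rest
                           %% as the new ContinuationState.
                           Continue(Chunk, xmerl_scan:cont_state(More, S));
                       [] ->
                           %% No more input: let the scanner decide whether
                           %% running out of bytes here is an error.
                           Exception(S)
                   end
           end,
    {#xmlElement{name = Name}, _Rest} =
        xmerl_scan:string(First, [{continuation_fun, Cont, Pending}]),
    Name.
```

Keeping the pending chunks in the ContinuationState avoids any process-level state: each call to the continuation function reads the list with cont_state/1 and writes the shortened list back with cont_state/2.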
-spec event_state(global_state()) -> EventState when EventState :: term().
Fetch the EventState.
See the "Customization functions" tutorial.
-spec event_state(EventState :: term(), global_state()) -> global_state().
Set the EventState, to be used in an event function.
The event function is called at the beginning and at the end of a parsed entity. See the "Customization functions" tutorial.
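As an illustration, the EventState can carry a counter that an event function updates. The module name event_state_example is hypothetical. The sketch assumes the #xmerl_event{} record from xmerl.hrl and assumes that close_fun is called with the final scanner state, which is why the count is sent back as a message.

```erlang
-module(event_state_example).
-export([count_elements/1]).

%% #xmerl_event{} and #xmlElement{} are defined in xmerl.hrl.
-include_lib("xmerl/include/xmerl.hrl").

%% Count the elements in Xml by watching for 'ended' element events.
count_elements(Xml) ->
    Self = self(),
    Ref = make_ref(),
    Event = fun(#xmerl_event{event = ended, data = #xmlElement{}}, S) ->
                    N = xmerl_scan:event_state(S),
                    xmerl_scan:event_state(N + 1, S);
               (_OtherEvent, S) ->
                    S
            end,
    %% The scanner state is not returned by string/2, so report the
    %% final EventState from the close function.
    Close = fun(S) ->
                    Self ! {Ref, xmerl_scan:event_state(S)},
                    S
            end,
    {_Root, _Rest} =
        xmerl_scan:string(Xml, [{event_fun, Event, 0}, {close_fun, Close}]),
    receive
        {Ref, Count} -> Count
    after 0 ->
        undefined
    end.
```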
-spec fetch_state(global_state()) -> FetchState when FetchState :: term().
Fetch the FetchState.
See the "Customization functions" tutorial.
-spec fetch_state(FetchState :: term(), global_state()) -> global_state().
Set the FetchState, to be used in a fetch function.
The fetch function is called when the parser fetches an external resource (e.g. a DTD). See the "Customization functions" tutorial.
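A sketch of a fetch function that declines every external resource and counts the requests in the FetchState. The module name fetch_state_example is hypothetical. The return value {ok, not_fetched, GlobalState} is assumed from the "Customization functions" tutorial, and the count is reported through close_fun on the assumption that it receives the final scanner state.

```erlang
-module(fetch_state_example).
-export([scan_without_fetching/1]).

-include_lib("xmerl/include/xmerl.hrl").

%% Scan Xml, never loading external resources such as a referenced DTD.
%% Returns {RootName, NumberOfDeclinedRequests}.
scan_without_fetching(Xml) ->
    Self = self(),
    Ref = make_ref(),
    Fetch = fun(_Resource, S) ->
                    Declined = xmerl_scan:fetch_state(S),
                    {ok, not_fetched, xmerl_scan:fetch_state(Declined + 1, S)}
            end,
    Close = fun(S) ->
                    Self ! {Ref, xmerl_scan:fetch_state(S)},
                    S
            end,
    {#xmlElement{name = Name}, _Rest} =
        xmerl_scan:string(Xml, [{fetch_fun, Fetch, 0}, {close_fun, Close}]),
    Count = receive
                {Ref, N} -> N
            after 0 ->
                undefined
            end,
    {Name, Count}.
```

Declining fetches this way is one option when documents come from untrusted sources and external DTDs should not be read from disk or the network.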
-spec file(Filename :: string()) -> {xmlElement(), Rest} | {error, Reason} when Rest :: string(), Reason :: term().
Equivalent to file(Filename, []).
-spec file(Filename :: string(), option_list()) -> {document(), Rest} | {error, Reason} when Rest :: string(), Reason :: term().
Parse a file containing an XML document.
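A minimal usage sketch that handles both return shapes in the spec above. The module name scan_file_example and the wrapper function are hypothetical.

```erlang
-module(scan_file_example).
-export([root_name_of_file/1]).

-include_lib("xmerl/include/xmerl.hrl").

%% Return {ok, RootElementName}, or pass through {error, Reason} when
%% file/2 reports an error tuple (for example, when the file is missing).
root_name_of_file(Filename) ->
    case xmerl_scan:file(Filename, [{quiet, true}]) of
        {#xmlElement{name = Name}, _Rest} ->
            {ok, Name};
        {error, Reason} ->
            {error, Reason}
    end.
```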
-spec hook_state(global_state()) -> HookState when HookState :: term().
Fetch the HookState.
See the "Customization functions" tutorial.
-spec hook_state(HookState :: term(), global_state()) -> global_state().
Set the HookState, to be used in a hook function.
The hook function is called when the parser has parsed a complete entity. See the "Customization functions" tutorial.
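As a sketch, a hook function can rewrite each parsed entity while threading a counter through the HookState. The module name hook_state_example is hypothetical, and the callback shape fun(Entity, GlobalState) -> {Entity, GlobalState} is assumed from the "Customization functions" tutorial.

```erlang
-module(hook_state_example).
-export([number_texts/1]).

-include_lib("xmerl/include/xmerl.hrl").

%% Prefix every text node with a running number, in scan order, and
%% return the resulting text values.
number_texts(Xml) ->
    Hook = fun(#xmlText{value = Value} = Text, S) ->
                   N = xmerl_scan:hook_state(S) + 1,
                   Numbered = Text#xmlText{value = integer_to_list(N) ++ ":" ++ Value},
                   {Numbered, xmerl_scan:hook_state(N, S)};
              (Entity, S) ->
                   %% Leave elements, comments, and so on unchanged.
                   {Entity, S}
           end,
    {Root, _Rest} = xmerl_scan:string(Xml, [{hook_fun, Hook, 0}]),
    texts(Root).

%% Collect text values depth-first, in document order.
texts(#xmlElement{content = Content}) -> lists:flatmap(fun texts/1, Content);
texts(#xmlText{value = Value}) -> [lists:flatten(Value)];
texts(_Other) -> [].
```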
-spec rules_state(global_state()) -> RulesState when RulesState :: term().
Fetch the RulesState.
See the "Customization functions" tutorial.
-spec rules_state(RulesState :: term(), global_state()) -> global_state().
Set the RulesState, to be used in a rules function.
The rules function is called when the parser stores scanner information in a rules database. See the "Customization functions" tutorial.
-spec string(Text :: string()) -> {xmlElement(), Rest} when Rest :: string().
Equivalent to string(Text, []).
-spec string(Text :: string(), option_list()) -> {document(), Rest} when Rest :: string().
Parse a string containing an XML document.
-spec user_state(global_state()) -> UserState when UserState :: term().
Fetch the UserState.
See the "Customization functions" tutorial.
-spec user_state(UserState :: term(), G :: global_state()) -> global_state().
Set the UserState, to be used in a user function.
See the "Customization functions" tutorial.
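Because the UserState is visible from every customization function, it can be used to pass configuration into a callback. The sketch below reads it from an accumulator function to leave out child elements whose names are listed in the UserState. The module name user_state_example is hypothetical, and the accumulator shape fun(Entity, Acc, GlobalState) -> {NewAcc, GlobalState} is assumed from the "Customization functions" tutorial.

```erlang
-module(user_state_example).
-export([child_names/2]).

-include_lib("xmerl/include/xmerl.hrl").

%% Scan Xml, leaving out any child element whose name is in Dropped,
%% and return the names of the root element's remaining children.
child_names(Xml, Dropped) ->
    AccFun = fun(#xmlElement{name = Name} = Element, Acc, S) ->
                     case lists:member(Name, xmerl_scan:user_state(S)) of
                         true -> {Acc, S};              % skip this element
                         false -> {[Element | Acc], S}
                     end;
                (Other, Acc, S) ->
                     {[Other | Acc], S}
             end,
    {#xmlElement{content = Content}, _Rest} =
        xmerl_scan:string(Xml, [{acc_fun, AccFun}, {user_state, Dropped}]),
    [Name || #xmlElement{name = Name} <- Content].
```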