The parser module implements the following class:

  • Parser: Recursive-descent parser.

Type Alias

yangson.parser.TransitionTable = typing.List[typing.Dict[str, typing.Callable[[], int]]]

This type represents the transition table for a deterministic finite automaton (DFA). See documentation for the method Parser.dfa().

class yangson.parser.Parser(text: str)

This abstract class provides a framework for implementing a recursive-descent parser. The text argument contains the input text to be parsed.

Concrete parsers should be implemented as a subclass of Parser. By convention, such a parser class should define the parse() method.
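
The subclassing convention can be illustrated with a minimal sketch. To keep it runnable without Yangson installed, it uses a hypothetical stand-in base class, MiniParser, that mimics only the input and offset attributes and two of the methods described in this section; the behaviour of up_to shown here (moving past the terminator) is an assumption. SemicolonListParser and its input format are likewise invented for illustration. With Yangson available, the subclass would presumably derive from yangson.parser.Parser instead.

```python
from typing import List


class MiniParser:
    """Hypothetical stand-in for Parser with only what this sketch needs."""

    def __init__(self, text: str) -> None:
        self.input = text   # input text
        self.offset = 0     # current position in the input text

    def at_end(self) -> bool:
        return self.offset >= len(self.input)

    def up_to(self, term: str) -> str:
        # Assumed behaviour: return the text before `term`, move past `term`.
        end = self.input.find(term, self.offset)
        if end < 0:
            raise ValueError("terminator not found")
        res = self.input[self.offset:end]
        self.offset = end + len(term)
        return res


class SemicolonListParser(MiniParser):
    """Hypothetical concrete parser for input such as 'a;b;c;'."""

    def parse(self) -> List[str]:
        items = []
        while not self.at_end():
            items.append(self.up_to(";"))
        return items


result = SemicolonListParser("a;b;c;").parse()
```

The point of the convention is that parse() is the single entry point, while the inherited low-level methods do the character-level work.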

>>> p = Parser("x \t #quu0a,foo:bar< qwerty")

Instance Attributes


input

Input text for the parser, initialized from the text constructor argument.


offset

Current position in the input text.

Public Methods

__str__() → str

Return string representation of the receiver’s input text and state.

The returned value is the input string with the character § inserted at the position of offset. In the following example, the position is right at the start of the input text.

>>> str(p)
'§x \t #quu0a,foo:bar< qwerty'

adv_skip_ws() → bool

First advance offset by one and then skip optional whitespace. Return True if some whitespace was really skipped.

>>> p.adv_skip_ws()
True
>>> str(p)
'x \t §#quu0a,foo:bar< qwerty'

at_end() → bool

Return True if at end of input.

>>> p.at_end()
False

char(c: str) → None

Parse the character c.

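This method may raise these exceptions:

  • EndOfInput: If past the end of input.
  • UnexpectedInput: If the next character is different from c.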

>>> p.char("#")
>>> str(p)
'x \t #§quu0a,foo:bar< qwerty'

dfa(ttab: TransitionTable, init: int = 0) → int

This method realizes a deterministic finite automaton (DFA) that is also capable of side effects. The states of the DFA are integers, and init specifies the initial state. Negative integers correspond to final states, and the method returns the final state that the automaton reaches.

The ttab argument is a transition table for the DFA. The TransitionTable alias stands for a list whose i-th entry specifies the “row” corresponding to the state i. Each entry is a dictionary in which:

  • Keys are single-character strings or the empty string. The latter specifies the default transition that takes place whenever none of the other keys matches.
  • Values are functions with no argument that have to return a new state (integer), and may also have side effects.

The method starts in the initial state init, reads the next input character and performs a lookup in the transition table. The retrieved transition function is then executed and its return value is the new state with which the whole process is repeated. However, if the new state is final, the computation stops and the final state is returned.
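
The loop described above can be sketched as a standalone function. This is not Yangson's implementation: run_dfa is a hypothetical name, the text and offset are passed explicitly instead of living on a parser object, and the sketch assumes that the character triggering a final state is left unconsumed.

```python
from typing import Callable, Dict, List, Tuple

TransitionTable = List[Dict[str, Callable[[], int]]]


def run_dfa(text: str, offset: int, ttab: TransitionTable,
            init: int = 0) -> Tuple[int, int]:
    """Run the automaton on `text` starting at `offset`.

    Returns (final_state, new_offset).
    """
    state = init
    while True:
        row = ttab[state]
        ch = text[offset]  # raises IndexError if the input runs out
        # Use the entry for `ch`, or fall back to the default entry "".
        transition = row[ch] if ch in row else row[""]
        state = transition()
        if state < 0:
            # Final state: stop without consuming `ch` (assumption).
            return state, offset
        offset += 1


# A transition function with a side effect: it counts skipped characters.
skipped = {"n": 0}


def skip() -> int:
    skipped["n"] += 1
    return 0


final_state, new_offset = run_dfa(
    "x \t #quu0a,foo:bar< qwerty", 5, [{"": skip, "0": lambda: -1}])
```

Starting at offset 5, the automaton skips three characters and stops in state -1 with the offset pointing at the character 0.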

The DFA in the following example parses the input string up to the first occurrence of the character 0.

>>> p.dfa([{"": lambda: 0, "0": lambda: -1}])
-1
>>> str(p)
'x \t #quu§0a,foo:bar< qwerty'

line_column() → Tuple[int, int]

Return line and column coordinates of the current offset.
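
One way to derive such coordinates from an offset is sketched below. The function is a standalone illustration, not the library's code; the convention of 1-based lines and 0-based columns is an assumption inferred from the example result (1, 8) for this method.

```python
from typing import Tuple


def line_column(text: str, offset: int) -> Tuple[int, int]:
    """Return (line, column) of `offset` in `text`.

    Assumed convention: lines count from 1, columns from 0.
    """
    line = text.count("\n", 0, offset) + 1
    last_newline = text.rfind("\n", 0, offset)  # -1 if there is none
    return line, offset - (last_newline + 1)


first = line_column("x \t #quu0a,foo:bar< qwerty", 8)
second = line_column("\npi=3.14.159xyz!foo-bar", 4)
```

Here first is (1, 8) because the text contains no newline before offset 8, and second is (2, 3) because the leading newline starts a second line.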

>>> p.line_column()
(1, 8)

match_regex(regex: Pattern, required: bool = False, meaning: str = "") → str

Parse input text starting from the current offset by matching it against a regular expression. The argument regex is a regular expression object (result of re.compile()). If the regular expression matches, the matched string is returned and offset is advanced past that string in the input text.

The required flag controls what happens if the regular expression doesn’t match: if it is True, then UnexpectedInput is raised, otherwise None is returned.

The optional meaning argument can be used to describe what the regular expression means – it is used in error messages.
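
The matching behaviour described above can be approximated with the pos argument of a compiled pattern's match() method. The helper below is a hedged sketch: match_at is a hypothetical name, it returns the new offset alongside the match instead of updating parser state, and ValueError stands in for UnexpectedInput.

```python
import re
from typing import Optional, Pattern, Tuple


def match_at(text: str, offset: int, regex: Pattern,
             required: bool = False,
             meaning: str = "") -> Tuple[Optional[str], int]:
    """Match `regex` at `offset`; return (matched_string, new_offset)."""
    m = regex.match(text, offset)  # the match must start exactly at `offset`
    if m:
        return m.group(), m.end()
    if required:
        # Stand-in for UnexpectedInput; `meaning` improves the message.
        what = meaning or regex.pattern
        raise ValueError(f"expected {what} at offset {offset}")
    return None, offset


HEX = re.compile("[0-9a-f]+")
text = "x \t #quu0a,foo:bar< qwerty"
matched = match_at(text, 8, HEX, meaning="hexa")
unmatched = match_at(text, 5, HEX)
```

With these inputs, matched is ('0a', 10), while unmatched is (None, 5) because the character at offset 5 is not a hexadecimal digit and required is False.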

>>> p.match_regex(re.compile("[0-9a-f]+"), meaning="hexa")
'0a'

one_of(chset: str) → str

Parse one character from the set of alternatives specified in chset. If a match is found, offset is advanced by one position, and the matching character is returned. Otherwise, UnexpectedInput is raised.

>>> p.one_of(".?!,")
','

peek() → str

Return the next input character without advancing offset. If the parser is past the end of input, EndOfInput is raised.

>>> p.peek()
'f'
>>> str(p)
'x \t #quu0a,§foo:bar< qwerty'

prefixed_name() → Tuple[YangIdentifier, Optional[YangIdentifier]]

Parse a prefixed name and return a tuple containing the (local) name as the first component, and the prefix or None as the second component.
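
The shape of the returned tuple can be illustrated with a small regular-expression sketch. It is not the library's implementation: split_prefixed is a hypothetical helper, and the identifier pattern (a letter or underscore followed by letters, digits, underscores, hyphens or dots) is an assumption based on YANG identifier syntax.

```python
import re
from typing import Optional, Tuple

IDENT = r"[A-Za-z_][A-Za-z0-9_.-]*"
# Optional "prefix:" followed by the local name.
PNAME = re.compile(rf"(?:({IDENT}):)?({IDENT})")


def split_prefixed(text: str, offset: int = 0) -> Tuple[str, Optional[str]]:
    """Return (name, prefix), with prefix None if there is no prefix."""
    m = PNAME.match(text, offset)
    if m is None:
        raise ValueError(f"no prefixed name at offset {offset}")
    prefix, name = m.group(1), m.group(2)
    return name, prefix


with_prefix = split_prefixed("x \t #quu0a,foo:bar< qwerty", 11)
without_prefix = split_prefixed("bar< qwerty")
```

Note the order of the components: the local name comes first, so with_prefix is ('bar', 'foo') and without_prefix is ('bar', None).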

>>> p.prefixed_name()
('bar', 'foo')

remaining() → str

Return the remaining part of the input string.

>>> p.remaining()
'< qwerty'
>>> p.at_end()

skip_ws() → bool

Skip optional whitespace and return True if some was really skipped.

>>> q = Parser("\npi=3.14.159xyz!foo-bar")
>>> q.skip_ws()
True

test_string(string: str) → bool

Test whether string comes next in the input string. If it does, offset is advanced past that string, and True is returned. Otherwise, False is returned and offset is unchanged (even if string partly coincides with the input text). No exception is raised if the parser is at the end of input.
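
The all-or-nothing behaviour described above maps naturally onto str.startswith with a start position. The following standalone sketch uses a hypothetical helper, consume_if_next, that returns the new offset together with the result instead of mutating a parser.

```python
from typing import Tuple


def consume_if_next(text: str, offset: int, string: str) -> Tuple[bool, int]:
    """Return (True, offset past `string`) if it comes next, else (False, offset)."""
    if text.startswith(string, offset):
        return True, offset + len(string)
    # A partial match leaves the offset untouched.
    return False, offset


text = "\npi=3.14.159xyz!foo-bar"
hit = consume_if_next(text, 1, "pi=")
partial = consume_if_next(text, 1, "pie")
at_end = consume_if_next(text, len(text), "x")
```

Because startswith simply returns False when the start position is at the end of the text, the end-of-input case needs no special handling.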

>>> q.test_string("pi=")
True
>>> str(q)
'\npi=§3.14.159xyz!foo-bar'

unsigned_float() → float

Parse and return an unsigned floating point number. Exponential notation is not supported.

>>> q.unsigned_float()
3.14

unsigned_integer() → int

Parse and return an unsigned integer.

>>> q.offset += 1    # skipping the dot
>>> q.unsigned_integer()
159

up_to(term: str) → str

Parse and return a segment of input text up to the terminating string term. Raise EndOfInput if term does not occur in the rest of the input string.
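
A minimal standalone sketch of this behaviour, built on str.find, is shown below. Two points are assumptions rather than documented facts: that the offset ends up past the terminating string, and the exception class, where EndOfInputError is a hypothetical stand-in for EndOfInput.

```python
from typing import Tuple


class EndOfInputError(Exception):
    """Hypothetical stand-in for the EndOfInput exception."""


def up_to(text: str, offset: int, term: str) -> Tuple[str, int]:
    """Return (segment before `term`, offset just past `term`)."""
    end = text.find(term, offset)
    if end < 0:
        raise EndOfInputError(f"{term!r} not found after offset {offset}")
    # Assumption: the terminator itself is consumed but not returned.
    return text[offset:end], end + len(term)


text = "\npi=3.14.159xyz!foo-bar"
segment, new_offset = up_to(text, 12, "!")
```

With the offset at 12, segment is 'xyz' and new_offset is 16, the position of the first character after the exclamation mark.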

>>> q.up_to("!")
'xyz'

yang_identifier() → str

Parse and return a YANG identifier.

This method may raise these exceptions:

  • UnexpectedInput: If no syntactically correct identifier is found.

>>> q.yang_identifier()
'foo-bar'