Parser

The parser module implements the following class:

Parser: Recursive-descent parser.

Type Alias

yangson.parser.TransitionTable

Transition table for a DFA.

This type represents the transition table for a deterministic finite automaton (DFA). See documentation for the method Parser.dfa(). alias of list[dict[str, Callable[[], int]]]

class yangson.parser.Parser(text: str)

This abstract class provides a framework for implementing a recursive-descent parser. The text argument contains the input text to be parsed.

Concrete parsers should be implemented as a subclass of Parser. By convention, such a parser class should define the parse() method.

>>> p = Parser("x \t #quu0a,foo:bar< qwerty")

Instance Attributes

input: Input text for the parser, initialized from the text constructor argument.

offset: Current position in the input text.

Public Methods

__str__() → str

Return string representation of the receiver’s input text and state.

The returned value is the input string with the character § inserted at the position of offset. In the following example, the position is right at the start of the input text.

adv_skip_ws() → bool

First advance offset by one and then skip optional whitespace. Return True if some whitespace was really skipped.

>>> p.adv_skip_ws()
True
>>> str(p)
'x \t §#quu0a,foo:bar< qwerty'

at_end() → bool

Return True if at end of input.

>>> p.at_end()
False

char(c: str) → None

Parse the character c.

This method may raise these exceptions:

EndOfInput – if the parser is past the end of input.
UnexpectedInput – if the next character is different from c.

>>> p.char("#")
>>> str(p)
'x \t #§quu0a,foo:bar< qwerty'

dfa(ttab: TransitionTable, init: int = 0) → int

This method realizes a deterministic finite automaton (DFA) that is also capable of side effects. The states of the DFA are integers, and init specifies the initial state. Negative integers correspond to final states, and the method returns the final state in which automaton reaches.

The ttab argument is a transition table for the DFA. The TransitionTable alias stands for a list whose i-th entry specifies the “row” corresponding to the state i. Each entry is a dictionary in which:

Keys are single-character strings or the empty string. The latter specifies the default transition that takes place whenever none of the other keys matches.
Values are functions with no argument that have to return a new state (integer), and may also have side effects.

The method starts in the initial state init, reads the next input character and performs a lookup in the transition table. The retrieved transition function is then executed and its return value is the new state with which the whole process is repeated. However, if the new state is final, the computation stops and the final state is returned.

DFA in the following example parses the input string up to the occurrence of the first 0 character.

>>> p.dfa([{"": lambda: 0, "0": lambda: -1}])
-1
>>> str(p)
'x \t #quu§0a,foo:bar< qwerty'

line_column() → Tuple[int, int]

Return line and column coordinates of the current offset.

>>> p.line_column()
(1, 8)

match_regex(regex: Pattern, required: bool = False, meaning: str = '') → str

Parse input text starting from the current offset by matching it against a regular expression. The argument regex is a regular expression object (result of re.compile()). If the regular expression matches, the matched string is returned and offset is advanced past that string in the input text.

The required flag controls what happens if the regular expression doesn’t match: if it is True, then UnexpectedInput is raised, otherwise None is returned.

The optional meaning argument can be used to describe what the regular expression means – it is used in error messages.

>>> p.match_regex(re.compile("[0-9a-f]+"), meaning="hexa")
'0a'

one_of(chset: str) → str

Parse one character from the set of alternatives specified in chset. If a match is found, offset is advanced by one position, and the matching character is returned. Otherwise, UnexpectedInput is raised.

>>> p.one_of(".?!,")
','

peek() → str

Return the next input character without advancing offset. If the parser is past the end of input, EndOfInput is raised.

>>> p.peek()
'f'
>>> str(p)
'x \t #quu0a,§foo:bar< qwerty'

prefixed_name() → Tuple[YangIdentifier, YangIdentifier | None]

Parse a prefixed name and return a tuple containing the (local) name as the first component, and the prefix or None as the second component.

>>> p.prefixed_name()
('bar', 'foo')

remaining() → str

Return the remaining part of the input string.

>>> p.remaining()
'< qwerty'
>>> p.at_end()
True

skip_ws() → bool

Skip optional whitespace and return True if some was really skipped.

>>> q = Parser("\npi=3.14.159xyz!foo-bar")
>>> q.skip_ws()
True

test_string(string: str) → bool

Test whether string comes next in the input string. If it does, offset is advanced past that string, and True is returned. Otherwise, False is returned and offset is unchanged (even if string partly coincides with the input text). No exception is raised if the parser is at the end of input.

>>> q.test_string("pi=")
True
>>> str(q)
'\npi=§3.14.159xyz!foo-bar'

unsigned_float() → float

Parse and return an unsigned floating point number. The exponential notation is not supported.

>>> q.unsigned_float()
3.14

unsigned_integer() → int

Parse and return an unsigned integer.

>>> q.offset += 1    # skipping the dot
>>> q.unsigned_integer()
159

up_to(term: str) → str

Parse and return a segment of input text up to the terminating string term. Raise EndOfInput if term does not occur in the rest of the input string.

>>> q.up_to("!")
'xyz'

yang_identifier() → str

Parse and return YANG identifier.

Raises:: UnexpectedInput: If no syntactically correct keyword is found.

>>> q.yang_identifier()
'foo-bar'