generic-lexer 1.1.1

Creator: rpa-with-ash



Description:

Generic Lexer
A generic pattern-based Lexer/tokenizer tool.
The minimum Python version is 3.6.

Original Author: Eli Bendersky, with this gist last modified on 2010/08
Maintainer: Leandro Benedet Garcia, last modified on 2020/11
Version: 1.1.0
License: The Unlicense
Documentation: The documentation can be found here



Example
If we execute the following code:
from generic_lexer import Lexer

rules = {
    "VARIABLE": r"(?P<var_name>[a-z_]+): (?P<var_type>[A-Z]\w+)",
    "EQUALS": r"=",
    "SPACE": r" ",
    "STRING": r"\".*\"",
}

data = "first_word: String = \"Hello\""
data = data.strip()

for curr_token in Lexer(rules, False, data):
    print(curr_token)

we will get the following output (the number after "at" is the token's position in the input string):
VARIABLE({'var_name': 'first_word', 'var_type': 'String'}) at 0
SPACE( ) at 18
EQUALS(=) at 19
SPACE( ) at 20
STRING("Hello") at 21

Unlike the original gist, we can specify multiple groups per token.
You cannot use the same group name twice, whether within one token or across tokens,
because all the regex patterns are merged together to generate the tokens later on, as sketched below.
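
A minimal sketch of that merging idea, assuming the usual named-group approach (this is an illustration only, not the library's actual code):

import re

# Illustration: wrap each rule in its own named group and join them with "|"
# into a single master pattern, which is why a group name may only appear once.
rules = {
    "EQUALS": r"=",
    "SPACE": r" ",
    "STRING": r"\".*\"",
}
master = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in rules.items())
)

match = master.match("= rest")
print(match.lastgroup)  # EQUALS -- the name of the rule that matched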
You may get the values of the tokens this way:
>>> from generic_lexer import Lexer
>>> rules = {
...     "VARIABLE": r"(?P<var_name>[a-z_]+): (?P<var_type>[A-Z]\w+)",
...     "EQUALS": r"=",
...     "STRING": r"\".*\"",
... }
>>> data = "first_word: String = \"Hello\""
>>> variable, equals, string = tuple(Lexer(rules, True, data))

>>> variable
VARIABLE({'var_name': 'first_word', 'var_type': 'String'}) at 0

>>> variable.val
{'var_name': 'first_word', 'var_type': 'String'}
>>> variable["var_name"]
'first_word'
>>> variable["var_type"]
'String'

>>> equals
EQUALS(=) at 19

>>> equals.val
'='

>>> string
STRING("Hello") at 21

>>> string.val
'"Hello"'

License

For personal and professional use. You cannot resell or redistribute these repositories in their original state.
