automata-tools 2.0.1

Creator: bradpython12

Last updated:

Add to Cart

Description:

automatatools 2.0.1

Automata Tools
Tools to build automata from your custom rule.
This package provides a set of handy tools to programmatically build automata, so you can build NFA、DFA、MinimizedDFA、WFA from any custom rules.
Usage
Install
conda install -c conda-forge automata-tools # not available yet
# or
pip install automata-tools

Import
See example in examples/NFAfromCustomRule.py
from typing import List
from automata_tools import BuildAutomata, Automata

automata: List[Automata] = []

BuildAutomata
characterStruct
Build simple (0)-[a]->(1) automata
automata.append(BuildAutomata.characterStruct(char))

unionStruct
Build automata that is an "or" of two sub-automata (1)<-[a]-(0)-[b]->(1)
# to match "a|b"
a = automata.pop()
b = automata.pop()
if operator == "|":
automata.append(BuildAutomata.unionStruct(b, a))

concatenationStruct
Build automata that is an "and" of two sub-automata (0)-[a]->(1)-[b]->(2)
# to match "ab"
a = automata.pop()
b = automata.pop()
automata.append(BuildAutomata.concatenationStruct(b, a))

starStruct
Build automata that looks like the "Kleene closure"
# to match "a*"
if operator == "*":
a = automata.pop()
automata.append(BuildAutomata.starStruct(a))

skipStruct
Build automata that looks like the "Kleene closure" but without the loop back (1)<-[ε]-(2), so it only match the token once at most.
# to match "a*"
if operator == "?":
a = automata.pop()
automata.append(BuildAutomata.skipStruct(a))

repeatRangeStruct
Build automata that will match the same token for several times (0)-[a]->(1)-[a]->(2)-[a]->(3)
# to match "a{3}"
repeatedAutomata = BuildAutomata.repeatStruct(automata, 3)

repeatStruct
Build automata that will match the same token for n to m times
(0)-[a]->(1)-[a]->(4), (0)-[a]->(2)-[a]->(3)-[a]->(4)
# to match "a{2,3}"
repeatedAutomata = BuildAutomata.repeatRangeStruct(automata, 2, 3)

Automata
See example in features/steps/customRule.py
from automata_tools import DFAFromNFA, Automata

from your_implementation import NFAFromRegex, executor

nfa: Automata = NFAFromRegex().buildNFA(rule)
minDFA: Automata = DFAFromNFA(nfa).getMinimizedDFA()
minDFA.setExecuter(executor)

print(minDFA.execute(someText))

where executor is a function like the one in examples/NFAfromCustomRule.py:
def executor(tokens, startState, finalStates, transitions):
return True

setExecuter
Set an executor to the automata that can freely use state and transition of the automata, and return a boolean value.
from automata_tools import IAutomataExecutor

defaultExecuter: IAutomataExecutor = lambda tokens, startState, finalStates, transitions: True
minDFA.setExecuter(defaultExecuter)

setTokenizer
Set an tokenizer to the automata that can transform string to list of string token, which will be used by the executer.
minDFA.setExecuter(lambda input: input.split(' ')[::-1])

NFAtoDFA
Make automata state transitions not so ambiguous
nfa = NFAFromRegex().buildNFA(rule)
dfa = NFAtoDFA(nfa)

DFAtoMinimizedDFA
Allow you minify Automata state
nfa = NFAFromRegex().buildNFA(rule)
minDFA = DFAtoMinimizedDFA(NFAtoDFA(nfa))

Weighted Finite Automata
WFA, it can execute automata use matrix multiplication, so it can be very fast compare to brute force execution, especially when state space is large.
from automata_tools import WFA, get_word_to_index

_, wordToIndex = get_word_to_index([ruleParser(context.rule), tokenizer(text)])
wfa = WFA(minDFA, wordToIndex, dfa_to_tensor)
wfa.execute(text)

get_word_to_index
Given [['token', 'another'], ['token_in_rule']], return something like
{'token': 0, 'another': 1, ...}

So we can translate automata state to a matrix.
WFA
Given an automata, a word index like {'token': 0, 'another': 1, ...}, and a function that transform automata to tensor (see example at customRuleDFAToTensor), return a WFA instance.
Development
Environment
Create environment from the text file:
conda env create --file automataTools-env.txt
conda activate automataTools

Save env file: conda list --explicit > automataTools-env.txt
Python Path
Create a .env file with content PYTHONPATH=automataTools
Publish
To pypi
rm -rf ./dist && rm -rf ./build && rm -rf ./automata_tools.egg-info && python3 setup.py sdist bdist_wheel && twine upload dist/*

To Conda
# I'm learning how to do...

Resources
Automata Theory Course Slides
Probably the original reference source

License

For personal and professional use. You cannot resell or redistribute these repositories in their original state.

Customer Reviews

There are no reviews.