uniset 0.1.0

Creator: bradpython12

Description:

Pre-generated sets of Unicode code points

uniset is a module containing frozensets of Unicode code points (characters).
API
Categories
The module includes a set for all Unicode categories and subcategories except the main category "C" (other)
and its subcategories "Co" (private use) and "Cn" (not assigned).
Example:
import uniset

# The letter "A" is in category "L" (letters)
assert "A" in uniset.L
# The letter "A" is also in category "Lu" (uppercase letters)
assert "A" in uniset.Lu
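
For readers curious how such category sets can be derived, here is a minimal sketch using only the standard library's unicodedata module. It is not necessarily how uniset generates its data; the helper name build_category_set is hypothetical, and the resulting sets depend on the Unicode version bundled with your Python.

```python
import sys
import unicodedata


def build_category_set(prefix):
    """Hypothetical helper: collect every character whose general
    category starts with `prefix` ("L" matches Lu, Ll, Lt, Lm, Lo)."""
    return frozenset(
        chr(cp)
        for cp in range(sys.maxunicode + 1)
        if unicodedata.category(chr(cp)).startswith(prefix)
    )


L = build_category_set("L")    # main category: letters
Lu = build_category_set("Lu")  # subcategory: uppercase letters

assert "A" in L
assert "A" in Lu
assert "a" in L and "a" not in Lu
assert Lu < L  # a subcategory is a proper subset of its main category
```

Scanning the full code point range takes a noticeable fraction of a second, which is one reason a module with pre-generated sets is convenient.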

Whitespace
uniset.WHITESPACE contains Unicode whitespace characters.
It is the union of the ASCII whitespace characters and the Unicode category "Zs" (space separators).
import uniset

assert " " in uniset.WHITESPACE
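
The union described above can be approximated with the standard library alone. This sketch assumes that "ASCII whitespace characters" corresponds to string.whitespace; the names zs and whitespace are stand-ins for illustration, not uniset's own attributes.

```python
import string
import sys
import unicodedata

# Stand-in for the "Zs" (space separator) category set
zs = frozenset(
    chr(cp)
    for cp in range(sys.maxunicode + 1)
    if unicodedata.category(chr(cp)) == "Zs"
)

# Assumption: string.whitespace is the ASCII whitespace set meant above
whitespace = frozenset(string.whitespace) | zs

assert " " in whitespace
assert "\t" in whitespace        # ASCII control character, category "Cc", not "Zs"
assert "\u00a0" in whitespace    # NO-BREAK SPACE, category "Zs"
assert "\u3000" in whitespace    # IDEOGRAPHIC SPACE, category "Zs"
```

The tab and newline characters belong to category "Cc", so they enter the set only through the ASCII half of the union.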

Punctuation
uniset.PUNCTUATION contains all Unicode punctuation characters.
uniset.PUNCTUATION is a union of ASCII punctuation characters and the Unicode category "P".
import uniset

assert "." in uniset.PUNCTUATION
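
The following sketch illustrates why the ASCII half of the union matters. It assumes that "ASCII punctuation characters" corresponds to string.punctuation; the names p and punctuation are stand-ins, not uniset's own attributes.

```python
import string
import sys
import unicodedata

# Stand-in for the "P" main category (Pc, Pd, Ps, Pe, Pi, Pf, Po)
p = frozenset(
    chr(cp)
    for cp in range(sys.maxunicode + 1)
    if unicodedata.category(chr(cp)).startswith("P")
)

# Assumption: string.punctuation is the ASCII punctuation set meant above
punctuation = frozenset(string.punctuation) | p

assert "." in punctuation
assert "\u00bf" in punctuation   # INVERTED QUESTION MARK, category "Po"
# "$" and "+" are symbols (categories "Sc" and "Sm"), not category "P",
# so they are covered only by the ASCII punctuation half of the union.
assert "$" not in p and "$" in punctuation
assert "+" not in p and "+" in punctuation
```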

Alternatives
unicategories also provides access to Unicode categories.
The implementation is based on "range groups" and iterators, and should be faster and more memory efficient than uniset for inclusion checks.
If you need the frozenset API (unions, intersections, etc.), or the sets beyond Unicode categories (whitespace, punctuation), use uniset.
Otherwise unicategories is the better option.
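
As a rough illustration of the frozenset operations mentioned above, the sketch below builds stand-in category sets with unicodedata and combines them. With uniset installed, the corresponding module attributes (for example uniset.Lu) would presumably be used in place of the hypothetical category_set helper.

```python
import sys
import unicodedata


def category_set(name):
    """Hypothetical helper standing in for a pre-generated uniset set."""
    return frozenset(
        chr(cp)
        for cp in range(sys.maxunicode + 1)
        if unicodedata.category(chr(cp)) == name
    )


Lu = category_set("Lu")  # uppercase letters
Ll = category_set("Ll")  # lowercase letters
Nd = category_set("Nd")  # decimal digits

# Union: characters that are either uppercase or lowercase letters
cased = Lu | Ll
assert "A" in cased and "a" in cased

# Subcategories do not overlap
assert Lu.isdisjoint(Ll)

# Intersection with the characters of a string
text = "Hello, World 42"
assert set(text) & Nd == {"4", "2"}

# Filtering a string by membership
letters_only = "".join(c for c in text if c in cased)
assert letters_only == "HelloWorld"
```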

License

For personal and professional use. You cannot resell or redistribute these repositories in their original state.
