polars-fuzzy-match 0.1.5

Creator: bradpython12


Polars Fuzzy Matching
Installation
pip install polars
pip install polars-fuzzy-match

Usage
With both the plugin and polars installed, usage is as follows:
import polars as pl
from polars_fuzzy_match import fuzzy_match_score


df = pl.DataFrame(
    {
        'strs': ['foo', 'foo quz BAR', 'baaarfoo', 'quz'],
    }
)
pattern = 'bar'
out = df.with_columns(
    score=fuzzy_match_score(
        pl.col('strs'),
        pattern,
    )
)
print(out)

This outputs:
shape: (4, 2)
┌─────────────┬───────┐
│ strs        ┆ score │
│ ---         ┆ ---   │
│ str         ┆ u32   │
╞═════════════╪═══════╡
│ foo         ┆ null  │
│ foo quz BAR ┆ 88    │
│ baaarfoo    ┆ 74    │
│ quz         ┆ null  │
└─────────────┴───────┘

When there is no match, score is null. When the pattern matches the value in
the given column, score is non-null. The higher the score, the closer the value
is to the pattern. Therefore, we can filter out values that do not match and
order by score:
pattern = 'bar'
out = (
    df.with_columns(
        score=fuzzy_match_score(
            pl.col('strs'),
            pattern,
        )
    )
    .filter(pl.col('score').is_not_null())
    .sort(by='score', descending=True)
)
print(out)

This outputs:
shape: (2, 2)
┌─────────────┬───────┐
│ strs        ┆ score │
│ ---         ┆ ---   │
│ str         ┆ u32   │
╞═════════════╪═══════╡
│ foo quz BAR ┆ 88    │
│ baaarfoo    ┆ 74    │
└─────────────┴───────┘

Fzf-style search syntax
This plugin supports Fzf-style search syntax for the pattern. The table below
is taken almost verbatim from the Fzf README:



Pattern   Match type                  Description
bar       fuzzy                       items that fuzzy match bar e.g. 'bXXaXXr'
'foo      substring exact match       items that include foo e.g. 'is foo ok'
^music    prefix exact match          items that start with music
.mp3$     suffix exact match          items that end with .mp3
!fire     inverse exact match         items that do not include fire
!^music   inverse prefix exact match  items that do not start with music
!.mp3$    inverse suffix exact match  items that do not end with .mp3
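
To make the match types above concrete, here is a minimal pure-Python sketch of how each pattern prefix or suffix could be interpreted. It is an illustration only, not the plugin's or Nucleo's implementation: the helper names `is_subsequence` and `matches` are hypothetical, the comparison is assumed to be case-insensitive for simplicity, it returns only a yes/no answer rather than a score, and it does not handle combined or space-separated patterns.

```python
def is_subsequence(needle: str, haystack: str) -> bool:
    """Return True if the characters of needle appear in haystack in order."""
    remaining = iter(haystack)
    # `ch in remaining` consumes the iterator up to the first hit,
    # so later characters must be found after earlier ones.
    return all(ch in remaining for ch in needle)


def matches(pattern: str, text: str) -> bool:
    """Illustrative yes/no check for a single Fzf-style pattern (hypothetical helper)."""
    # Assumption: ignore case everywhere to keep the sketch short.
    pattern, text = pattern.lower(), text.lower()

    # A leading '!' inverts the result and switches to exact matching.
    inverse = pattern.startswith("!")
    if inverse:
        pattern = pattern[1:]

    if pattern.startswith("^"):
        hit = text.startswith(pattern[1:])      # prefix exact match
    elif pattern.endswith("$"):
        hit = text.endswith(pattern[:-1])       # suffix exact match
    elif pattern.startswith("'"):
        hit = pattern[1:] in text               # substring exact match
    elif inverse:
        hit = pattern in text                   # inverse exact match
    else:
        hit = is_subsequence(pattern, text)     # fuzzy match

    return not hit if inverse else hit


for pat, value in [("bar", "bXXaXXr"), ("'foo", "is foo ok"), ("!fire", "campfire")]:
    print(pat, value, matches(pat, value))
```

With the sample data from the usage section, `matches('bar', value)` is true for 'foo quz BAR' and 'baaarfoo' and false for 'foo' and 'quz', which lines up with the rows that received a non-null score.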



Credits

Marco Gorelli's tutorial on writing Polars plugins.
The Helix Editor team for the Nucleo fuzzy matching library.

License

For personal and professional use. You cannot resell or redistribute these repositories in their original state.
