polars-candle 0.1.7

Creator: bradpython12


Description:


polars-candle
A polars extension for running candle ML
models on polars DataFrames.
Example
Pull any compatible model from the Hugging Face Hub, such as Snowflake's
snowflake-arctic-embed-xs, and embed text using a simple API.
import polars as pl
import polars_candle  # noqa: F401  (the import registers the .candle namespace)

df = pl.DataFrame({"s": ["This is a sentence", "This is another sentence"]})

embed_kwargs = {
    "model_repo": "Snowflake/snowflake-arctic-embed-xs",
    "pooling": "mean",
}

df = df.with_columns(
    pl.col("s").candle.embed_text(**embed_kwargs).alias("s_embedding")
)
print(df)
# ┌──────────────────────────┬───────────────────────────────────┐
# │ s                        ┆ s_embedding                       │
# │ ---                      ┆ ---                               │
# │ str                      ┆ array[f32, 384]                   │
# ╞══════════════════════════╪═══════════════════════════════════╡
# │ This is a sentence ┆ [-0.056457, 0.559411, … -0.20403… │
# │ This is another sentence ┆ [-0.117206, 0.336827, … 0.174078… │
# └──────────────────────────┴───────────────────────────────────┘
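
Once each row carries a fixed-size embedding, a common next step is to compare rows by cosine similarity. The following is a minimal sketch in plain Python, not part of polars-candle: the helper name cosine_similarity is hypothetical, and the toy vectors stand in for the real 384-dimensional embeddings, which could be pulled out of the DataFrame with something like df["s_embedding"].to_list().

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors of floats."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors standing in for real embeddings.
v1 = [1.0, 0.0, 1.0]
v2 = [1.0, 1.0, 0.0]
print(round(cosine_similarity(v1, v2), 3))  # 0.5
```

Values close to 1 indicate that two sentences point in nearly the same direction in embedding space; values near 0 indicate little relation.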

Currently, Bert, JinaBert, and Distilbert models are supported. More models will be added in the future. See
my other repository, wdoppenberg/glowrs, to learn more about the underlying
implementation for sentence embedding.
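
To give a sense of what the "pooling": "mean" option refers to, here is a sketch of mean pooling as it is generally done for sentence embeddings: the per-token vectors produced by the model are averaged, with padding positions skipped. This illustrates the general technique only and is not the code used by polars-candle or glowrs; the function name mean_pool and the toy inputs are assumptions.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping positions where the mask is 0."""
    if len(token_embeddings) != len(attention_mask):
        raise ValueError("need exactly one mask entry per token")
    kept = [vec for vec, m in zip(token_embeddings, attention_mask) if m]
    if not kept:
        raise ValueError("at least one token must be unmasked")
    dim = len(kept[0])
    # Component-wise mean over the unmasked token vectors.
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Three token vectors of dimension 2; the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
print(mean_pool(tokens, mask))  # [2.0, 3.0]
```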
Installation
Make sure you have polars installed; if not, install it with pip install polars. Then install polars-candle with:
pip install polars-candle
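
A quick way to confirm that both packages ended up in the active environment is to ask Python whether it can locate them. This is a small sketch using only the standard library; the helper name missing_modules is hypothetical, and it assumes the import name polars_candle shown in the example above.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that cannot be located for import."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(["polars", "polars_candle"])
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("polars and polars_candle are both importable")
```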

Note: The macOS ARM wheels of this library come with Metal support out of the box. For CUDA, see the
instructions below on how to build from source.
If you want to install the latest version from the repository, you can use:
pip install git+https://github.com/wdoppenberg/polars-candle.git

Note: You need the Rust toolchain installed on your system to compile the library. See
the official Rust installation instructions if you do not have it yet.
You can set build features using maturin:
maturin develop --release -F <feature>

Where <feature> can be one of the following:

metal: Install with Metal acceleration.
cuda: Install with CUDA acceleration. Might require additional setup such as installing CUDA libraries.
accelerate: Install with the Accelerate framework.
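
If the build is scripted, the feature can be chosen from the host platform. The sketch below is one possible heuristic, not something shipped with this project: the helpers choose_feature and maturin_command are hypothetical, and the rules (nvcc on the PATH means CUDA, Apple Silicon means Metal, other Macs use Accelerate) are assumptions you may want to adjust.

```python
import platform
import shutil


def choose_feature(system, machine, has_nvcc):
    """Pick a build feature name, or None for a plain CPU build (heuristic)."""
    if has_nvcc:
        return "cuda"
    if system == "Darwin":
        # Apple Silicon reports "arm64"; other Macs fall back to Accelerate.
        return "metal" if machine == "arm64" else "accelerate"
    return None


def maturin_command(feature):
    """Build the maturin invocation as an argument list."""
    cmd = ["maturin", "develop", "--release"]
    if feature is not None:
        cmd += ["-F", feature]
    return cmd


feature = choose_feature(
    platform.system(),
    platform.machine(),
    shutil.which("nvcc") is not None,
)
print(" ".join(maturin_command(feature)))
```

The printed command can be inspected first and then run from the repository root, for example through subprocess.run.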

Roadmap

Embed text using Bert, JinaBert, and Distilbert models.
Add more models.
More configuration options for embedding (e.g. pooling strategy, device selection, etc.).
Support & test streaming workloads.

Credits

Massive thanks to polars & their contributors for providing a blazing fast DataFrame library
with the ability to extend it with custom functions using pyo3-polars.
Great work so far by Huggingface on candle for providing a simple
interface to run ML models.

Note
This is a work in progress and the API might change in the future. Feel free to open an issue if you have any
suggestions or improvements.

License

For personal and professional use. You cannot resell or redistribute these repositories in their original state.
