polyjuice-nlp 0.1.5

Polyjuice
This repository contains code for generating counterfactual sentences as described in the following paper:

Polyjuice: Generating Counterfactuals for Explaining, Evaluating, and Improving Models
Tongshuang Wu, Marco Tulio Ribeiro, Jeffrey Heer, Daniel S. Weld
Association for Computational Linguistics (ACL), 2021

BibTeX for citations:
@inproceedings{polyjuice:acl21,
    title = "{P}olyjuice: Generating Counterfactuals for Explaining, Evaluating, and Improving Models",
    author = "Tongshuang Wu and Marco Tulio Ribeiro and Jeffrey Heer and Daniel S. Weld",
    booktitle = "Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics",
    year = "2021",
    publisher = "Association for Computational Linguistics"
}

Installation
From PyPI:
pip install polyjuice_nlp

From source:
git clone git@github.com:tongshuangwu/polyjuice.git
cd polyjuice
pip install -e .

Polyjuice depends on spaCy and Hugging Face Transformers. Most functions also require PyTorch and the spaCy English language model:
# Install PyTorch; see https://pytorch.org/get-started/locally/#start-locally for platform-specific commands
pip install torch
# Download the spaCy English language model
python -m spacy download en_core_web_sm
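Before the first run, it can help to confirm that the packages listed above are importable. The following is a minimal sketch using only the standard library; the helper name `missing_dependencies` and the `REQUIRED` list are illustrative and not part of Polyjuice.

```python
from importlib.util import find_spec

# Import names of the packages mentioned above.
# en_core_web_sm is the spaCy model, which installs as a regular Python package.
REQUIRED = ["torch", "spacy", "transformers", "en_core_web_sm"]


def missing_dependencies(names=REQUIRED):
    """Return the names from `names` that cannot be found on the import path."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_dependencies()
    print("Missing:", ", ".join(missing) if missing else "none")
```

Any name reported as missing can be installed with the commands above.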

Perturbation
from polyjuice import Polyjuice
# Initialize the wrapper.
# model_path defaults to our portable model:
# https://huggingface.co/uw-hai/polyjuice
# There is no need to change it unless you are using a customized model.
pj = Polyjuice(model_path="uw-hai/polyjuice", is_cuda=True)

# the base sentence
text = "It is great for kids."

# Perturb the sentence with one line.
# On the first call, the wrapper automatically loads the related models,
# e.g. the generator and the perplexity filter.
perturbations = pj.perturb(text)

# return: ['It is bad for kids too.',
# "It 's great for kids.",
# 'It is great even for kids.']
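The example above passes `is_cuda=True`. On machines where a GPU may not be present, one option is to decide the flag at runtime. This is a small sketch under the assumption that PyTorch is the backend; `cuda_available` is a hypothetical helper, and the wrapper may already perform a similar check internally.

```python
def cuda_available():
    """Return True only if torch is importable and reports a usable CUDA device."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is no GPU backend to use.
        return False
    return bool(torch.cuda.is_available())


use_cuda = cuda_available()
# With polyjuice installed, the flag can then be passed explicitly:
# pj = Polyjuice(model_path="uw-hai/polyjuice", is_cuda=use_cuda)
```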

More advanced APIs
Please see the documentation in the main Python file for more explanations.
To perturb with more control:
perturbations = pj.perturb(
    orig_sent=text,
    # You can specify where to put the blank. Otherwise, it is selected automatically.
    # Can be a list or a single sentence.
    blanked_sent="It is [BLANK] for kids.",
    # You can also specify the ctrl code (a list or a single code).
    # The code should be one of 'resemantic', 'restructure', 'negation', 'insert', 'lexical', 'shuffle', 'quantifier', 'delete'.
    ctrl_code="negation",
    # Customize the perplexity score threshold.
    perplex_thred=5,
    # Number of perturbations to return.
    num_perturbations=1,
    # The function also takes additional arguments for Hugging Face generators.
    num_beams=3
)

# return: [
# 'It is not great for kids.',
# 'It is great for kids but not for anyone.',
# 'It is great for kids but not for any adults.']
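Because `ctrl_code` accepts either a single code or a list, a misspelled code is easy to miss. The sketch below checks the argument against the eight codes listed in the comments above before it is passed on. The helper `normalize_ctrl_codes` is illustrative and not part of the Polyjuice API.

```python
# The control codes listed in the comments above.
CTRL_CODES = {
    "resemantic", "restructure", "negation", "insert",
    "lexical", "shuffle", "quantifier", "delete",
}


def normalize_ctrl_codes(ctrl_code):
    """Accept one code or an iterable of codes and return a validated list."""
    codes = [ctrl_code] if isinstance(ctrl_code, str) else list(ctrl_code)
    unknown = [code for code in codes if code not in CTRL_CODES]
    if unknown:
        raise ValueError(
            f"Unknown ctrl code(s) {unknown}; expected one of {sorted(CTRL_CODES)}"
        )
    return codes


# With polyjuice installed, the validated codes could be passed as:
# pj.perturb(orig_sent=text, ctrl_code=normalize_ctrl_codes(["negation", "lexical"]))
```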

To detect the ctrl code from a given sentence pair:
pj.detect_ctrl_code(
    "it's great for kids.",
    "It is great for kids but not for any adults.")
# return: negation
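To summarize which kinds of edits a batch of perturbations contains, the detector can be applied to every (original, perturbed) pair and the labels counted. The sketch below takes the detector as a parameter so that it runs without the library; `tally_ctrl_codes` is a hypothetical helper, and it assumes the detector returns a hashable label such as a string.

```python
from collections import Counter


def tally_ctrl_codes(pairs, detect):
    """Count the label that `detect` returns for each (original, perturbed) pair."""
    return Counter(detect(orig, perturbed) for orig, perturbed in pairs)


# With polyjuice installed, `detect` would be pj.detect_ctrl_code, for example:
# pairs = [(text, p) for p in perturbations]
# print(tally_ctrl_codes(pairs, pj.detect_ctrl_code))
```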

To get randomly placed blanks:
random_blanks = pj.get_random_blanked_sentences(
    sentence=text,
    # Only allow selecting from a preset range of token indexes.
    pre_selected_idxes=None,
    # Only select from a subset of dependency tags.
    deps=None,
    # Blank sub-spans or just single tokens.
    is_token_only=False,
    # Maximum number of returned index tuples.
    max_blank_sent_count=3,
    # Maximum number of blanks per returned sentence.
    max_blank_block=1
)
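To make the idea of a blanked sentence concrete, the following sketch replaces a token span with `[BLANK]`, producing strings in the same shape as the `blanked_sent` argument shown earlier. It uses plain whitespace tokenization rather than spaCy and is only an illustration, not the library's implementation; `blank_span` and `single_token_blanks` are hypothetical names.

```python
def blank_span(tokens, start, end, blank="[BLANK]"):
    """Replace tokens[start:end] with a single blank marker and rejoin with spaces."""
    if not 0 <= start < end <= len(tokens):
        raise ValueError("span must satisfy 0 <= start < end <= len(tokens)")
    return " ".join(tokens[:start] + [blank] + tokens[end:])


def single_token_blanks(sentence):
    """Return one blanked sentence per whitespace-separated token."""
    tokens = sentence.split()
    return [blank_span(tokens, i, i + 1) for i in range(len(tokens))]


tokens = "It is great for kids.".split()
print(blank_span(tokens, 2, 3))  # blanks the single token "great"
print(blank_span(tokens, 2, 4))  # blanks the two-token span "great for"
```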

Selection
To select diverse and surprising perturbations (as used in the augmentation and explanation experiments in our paper), please see the notebook demo.

License:

For personal and professional use. You cannot resell or redistribute these repositories in their original state.
