process-supervision-torch 0.0.3

Last updated: September 8, 2024

0 purchases

Free

Donate

Creator: railscoder56

Languages

Python

Description:

processsupervisiontorch 0.0.3

"Let’s Verify Step by Step"
Implementation of "Improving Mathematical Reasoning with Process Supervision" by OPENAI
Install
pip3 install --upgrade process-supervision-torch
Usage:
GPT4 without tokenizer
import torch
from process_supervision.main import GPT4

# Usage with random inputs
text = torch.randint(0, 20000, (1, 1024))

# Initiliaze the model
model = GPT4()
output = model(text)
print(output)

PRM
import torch
from process_supervision.prm import PRM
from swarms.models import OpenAIChat
from process_supervision.generator import MathDataGenerator
import os
from dotenv import load_dotenv

load_dotenv()

api_key = os.getenv("OPENAI_API_KEY")

# LLM initialization
llm = OpenAIChat(openai_api_key=api_key)

# Math data generator initialization
math_datagenerator = MathDataGenerator(llm, num_iters=10)

# Device initialization
device = 0 if torch.cuda.is_available() else "cpu"

# Model initialization
prm_model = PRM(
model_name="lvwerra/gpt2-imdb-pos-v2",
ref_model_name="lvwerra/gpt2-imdb",
reward_model_name="lvwerra/distilbert-imdb",
device=device,
)

# Generation arguments
gen_kwargs = {
"min_length": -1,
"top_k": 0.0,
"top_p": 1.0,
"do_sample": True,
"pad_token_id": prm_model.tokenizer.eos_token_id,
}
sent_kwargs = {"top_k": None, "function_to_apply": "none", "batch_size": 16}

# Sample queries
queries = ["Sample query 1", "Sample query 2"]
queries = [math_datagenerator.generate_samples(query) for query in queries]

# Generate responses
responses = prm_model.generate_responses(
queries, gen_len=10, gen_kwargs=gen_kwargs
)

# Score responses
scores = prm_model.score_responses(responses, sent_kwargs)

# Display results
for query, response, score in zip(queries, responses, scores):
print(f"Query: {query}\nResponse: {response}\nScore: {score}\n")

GPT4 + PRM
Method
Citation
@misc{lightman2023lets,
title={Let's Verify Step by Step},
author={Hunter Lightman and Vineet Kosaraju and Yura Burda and Harri Edwards and Bowen Baker and Teddy Lee and Jan Leike and John Schulman and Ilya Sutskever and Karl Cobbe},
year={2023},
eprint={2305.20050},
archivePrefix={arXiv},
primaryClass={cs.LG}
}

Todo

Creae the PRM reward model

License
MIT

License:

For personal and professional use. You cannot resell or redistribute these repositories in their original state.

There are no reviews.

zed

process-supervision-torch 0.0.3

Languages

Categories

Description:

License:

Share

Overview

What you can do with it

What you can't do with it

Related Products

Views For YouTube Bot writed on Python

AI-Web-Scraper

quivr

roop

More From This Creator

apiverve-randomquote 1.1.4

apiverve-randomidentitygenerator 1.1.4

apiverve-randomidentity 1.0.11

apiverve-randomfacts 1.1.4

apiverve-mortgagecalculator 1.1.4

process-supervision-torch 0.0.3

Languages

Categories

Description:

License:

Share

Customer Reviews

License

Overview

What you can do with it

What you can't do with it

Related Products

Views For YouTube Bot writed on Python

AI-Web-Scraper

quivr

roop

zed

More From This Creator

apiverve-randomquote 1.1.4

apiverve-randomidentitygenerator 1.1.4

apiverve-randomidentity 1.0.11

apiverve-randomfacts 1.1.4

apiverve-mortgagecalculator 1.1.4