maestro 0.1.0

Creator: bradpython12

Description:

multimodal-maestro







👋 hello
Multimodal-Maestro gives you more control over large multimodal models to get the
outputs you want. With more effective prompting tactics, you can get multimodal models
to do tasks you didn't know (or think!) were possible. Curious how it works? Try our
Hugging Face Space!
🚧 The project is still under construction, and the API is subject to change.
💻 install
⚠️ Our package has been renamed to maestro. Install the package in a
Python environment, version 3.8 through 3.11.
pip install maestro

🚀 examples
GPT-4 Vision
Find dog.

>>> The dog is prominently featured in the center of the image with the label [9].





load image
import cv2

image = cv2.imread("...")
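One thing worth knowing about this step: cv2.imread does not raise when a file is missing or unreadable, it returns None, and the image it does return is in BGR channel order. A minimal sketch of a defensive loader follows; the helper name load_image is hypothetical and not part of maestro.

```python
from pathlib import Path


def load_image(path):
    """Load an image with OpenCV and fail loudly instead of returning None.

    Hypothetical helper for illustration; not part of maestro.
    """
    image_path = Path(path)
    if not image_path.is_file():
        raise FileNotFoundError(f"No such image file: {image_path}")

    # Imported here so the path check above works even without OpenCV installed.
    import cv2

    image = cv2.imread(str(image_path))  # BGR array, or None if decoding fails
    if image is None:
        raise ValueError(f"OpenCV could not decode: {image_path}")
    return image
```

Calling load_image with the same path you would pass to cv2.imread gives the same array, but a bad path surfaces immediately rather than in a later step.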



create and refine marks
import maestro as mm

generator = mm.SegmentAnythingMarkGenerator(device='cuda')
marks = generator.generate(image=image)
marks = mm.refine_marks(marks=marks)
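The roadmap lists Non-Max Suppression as one approach to marks refinement. The source does not show what refine_marks does internally, so the following is only a minimal sketch of mask-level NMS under stated assumptions: masks are plain nested lists of booleans, larger masks take priority, and the names mask_iou and nms_masks and the 0.6 threshold are all hypothetical rather than part of maestro.

```python
from typing import List

Mask = List[List[bool]]


def mask_area(mask: Mask) -> int:
    """Count the True pixels in a mask."""
    return sum(cell for row in mask for cell in row)


def mask_iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two same-sized boolean masks."""
    intersection = 0
    union = 0
    for row_a, row_b in zip(a, b):
        for x, y in zip(row_a, row_b):
            intersection += x and y
            union += x or y
    return intersection / union if union else 0.0


def nms_masks(masks: List[Mask], iou_threshold: float = 0.6) -> List[int]:
    """Return indices of masks kept after greedy NMS (largest area first)."""
    order = sorted(range(len(masks)), key=lambda i: mask_area(masks[i]), reverse=True)
    kept: List[int] = []
    for i in order:
        # Keep a mask only if it does not overlap too much with any kept mask.
        if all(mask_iou(masks[i], masks[k]) <= iou_threshold for k in kept):
            kept.append(i)
    return sorted(kept)


a = [[True, True, False], [True, True, False]]
b = [[True, True, False], [True, False, False]]   # mostly inside a
c = [[False, False, True], [False, False, True]]  # disjoint from a
print(nms_masks([a, b, c]))  # [0, 2]
```

Ordering by area is a design choice made here because the marks in this example carry no confidence scores; with scores available, sorting by score would be the more usual choice.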



visualize marks
mark_visualizer = mm.MarkVisualizer()
marked_image = mark_visualizer.visualize(image=image, marks=marks)




prompt
prompt = "Find dog."

# api_key must hold your API key for the vision model; it is not defined in this snippet.
response = mm.prompt_image(api_key=api_key, image=marked_image, prompt=prompt)

>>> "The dog is prominently featured in the center of the image with the label [9]."

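The prompt step passes an api_key variable that the snippet never defines. A common pattern is to read the key from an environment variable instead of hard-coding it. The sketch below assumes an OpenAI key stored in OPENAI_API_KEY; the helper read_api_key is hypothetical and not part of maestro.

```python
import os


def read_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Read an API key from the environment, failing with a clear message.

    Hypothetical helper; the variable name is an assumption.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key
```

With this in place, api_key = read_api_key() can precede the mm.prompt_image call, and the key stays out of source control.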


extract related marks
masks = mm.extract_relevant_masks(text=response, detections=marks)

>>> {'9': array([
... [False, False, False, ..., False, False, False],
... [False, False, False, ..., False, False, False],
... [False, False, False, ..., False, False, False],
... ...,
... [ True, True, True, ..., False, False, False],
... [ True, True, True, ..., False, False, False],
... [ True, True, True, ..., False, False, False]])
... }
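To make the idea behind this step concrete: the model answers with bracketed labels such as [9], and those labels select the matching masks. The sketch below shows one way that lookup could work with a regular expression. It is an assumption about the general technique, not maestro's implementation, and the names extract_label_ids and select_masks are hypothetical.

```python
import re
from typing import Any, Dict, List


def extract_label_ids(text: str) -> List[str]:
    """Return bracketed numeric labels like [9] in order, without repeats."""
    found = re.findall(r"\[(\d+)\]", text)
    return list(dict.fromkeys(found))  # dict preserves first-seen order


def select_masks(text: str, masks_by_label: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the masks whose label is mentioned in the text."""
    return {
        label: masks_by_label[label]
        for label in extract_label_ids(text)
        if label in masks_by_label
    }


response = "The dog is prominently featured in the center of the image with the label [9]."
print(extract_label_ids(response))  # ['9']
```

Labels mentioned in the text but absent from the mask dictionary are silently skipped here; raising an error instead would be a reasonable alternative if a missing label should count as a failure.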





🚧 roadmap

Documentation page.
Segment Anything guided marks generation.
Non-Max Suppression marks refinement.
LLaVA demo.

💜 acknowledgement

Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V
by Jianwei Yang, Hao Zhang, Feng Li, Xueyan Zou, Chunyuan Li, Jianfeng Gao.

🦸 contribution
We would love your help in making this repository even better! If you notice a bug
or have a suggestion for improvement, feel free to open an issue or submit a
pull request.

License

For personal and professional use. You cannot resell or redistribute these repositories in their original state.
