OmniParser V2


View Website GitHub OmniParser V2 Code Model Checkpoints on HuggingFace Microsoft Research AI Principles Microsoft Responsible AI

Microsoft Research published an article about OmniParser V2, a tool that helps Large Language Models (LLMs) act as GUI agents. It improves the detection of interactable elements and speeds up inference, reaching 39.6% accuracy on a GUI grounding benchmark, which the article presents as a significant gain. It works with several state-of-the-art LLMs and incorporates Responsible AI practices.

OmniParser V2 improves GUI automation.
Achieves 39.6% accuracy on a GUI grounding benchmark.
Supports models from OpenAI, DeepSeek, Qwen, and Anthropic.
Incorporates Responsible AI practices.
Trained on a larger dataset.
