🧀 BigCheese.ai

OmniParser V2

Microsoft Research published an article about OmniParser V2, a screen-parsing tool that helps Large Language Models (LLMs) act as GUI agents. Compared with the previous version, V2 detects interactable elements more accurately and runs inference faster, and it achieves a significant accuracy boost on a GUI grounding benchmark. It works with several state-of-the-art LLMs, and the release incorporates Responsible AI practices.

  • OmniParser V2 improves GUI automation.
  • Achieves 39.6% average accuracy on the ScreenSpot Pro GUI grounding benchmark when paired with GPT-4o.
  • Supports models from OpenAI, DeepSeek, Qwen, and Anthropic.
  • Incorporates Responsible AI practices.
  • Trained on a larger data set than the previous version.