🧀 BigCheese.ai

Web scraping with GPT-4o: powerful but expensive

Eduardo Blancas describes using OpenAI's structured outputs feature to build an AI-assisted web scraper. He combined Python's Pydantic models with GPT-4o and experimented with two tasks: parsing HTML tables directly, and generating XPaths for more efficient data extraction. Complex tables proved difficult and results varied, but the approach shows promise for future web scraping tools.

  • Eduardo Blancas experimented with GPT-4o for AI-assisted web scraping.
  • OpenAI's structured outputs feature played a key role in data parsing.
  • Complex HTML tables presented parsing challenges.
  • Having the model generate XPaths, rather than return the parsed data itself, was the more cost-efficient strategy.
  • The project has a Streamlit demo, and the source code is on GitHub.
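
To make the XPath idea concrete, here is a minimal sketch of the second half of that workflow: applying a set of model-suggested XPaths to a table locally. It is not Blancas's code. The `TableXPaths` schema, the `extract_rows` helper, the sample HTML, and the hard-coded `SPEC` (standing in for a model response) are all hypothetical, and a plain dataclass is used in place of a Pydantic model so the example runs on the standard library alone. The call to GPT-4o is omitted entirely.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class TableXPaths:
    """Hypothetical shape of what the model would be asked to return."""
    row_xpath: str                 # selects each data row, relative to the table
    cell_xpaths: dict[str, str]    # column name -> XPath relative to a row


def extract_rows(html: str, spec: TableXPaths) -> list[dict[str, str]]:
    """Apply the XPaths in `spec` to well-formed table markup."""
    root = ET.fromstring(html.strip())
    rows = []
    for row in root.findall(spec.row_xpath):
        record = {}
        for name, xpath in spec.cell_xpaths.items():
            cell = row.find(xpath)
            # itertext() also collects text nested in child tags such as <b>
            record[name] = "".join(cell.itertext()).strip() if cell is not None else ""
        rows.append(record)
    return rows


# Made-up sample table for illustration only.
SAMPLE_HTML = """
<table>
  <tr><th>Country</th><th>Capital</th></tr>
  <tr><td>France</td><td>Paris</td></tr>
  <tr><td>Japan</td><td><b>Tokyo</b></td></tr>
</table>
"""

# Hard-coded stand-in for a structured model response (assumption).
SPEC = TableXPaths(
    row_xpath="./tr[td]",  # rows that contain <td> cells, which skips the header row
    cell_xpaths={"country": "./td[1]", "capital": "./td[2]"},
)

ROWS = extract_rows(SAMPLE_HTML, SPEC)
print(ROWS)
```

Note that `xml.etree.ElementTree` supports only a subset of XPath and requires well-formed markup, so a real page would likely need a dedicated HTML parser. The point of the sketch is only that once the XPaths exist, extraction itself involves no further model calls.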