By combining headless browser automation tools with LLMs, you can create an agent that navigates websites on its own. This opens up new capabilities such as scraping and summarizing web content.
What works well?
- Going to a news website and extracting a list of articles
- OpenAI gpt-4 works better than gpt-3 when using langchain to prompt for headlines (e.g. when asked for the top headlines on Hacker News, gpt-3 fails without some hints about the HTML structure, while gpt-4 works in one shot)
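
As a rough illustration of the headline case, the sketch below reduces a page to its link text and builds a prompt from it, optionally with a hint about the HTML structure. It uses only the Python standard library. The names `LinkTextExtractor` and `build_headline_prompt`, the prompt wording, and the hint text are all illustrative assumptions, and the call to the model itself (for example through langchain) is deliberately left out.

```python
from html.parser import HTMLParser


class LinkTextExtractor(HTMLParser):
    """Collects the visible text of every <a> element on a page."""

    def __init__(self):
        super().__init__()
        self.links = []    # finished link texts, in document order
        self._depth = 0    # greater than 0 while inside an <a> element
        self._parts = []   # text fragments of the link currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "a" and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                # Join fragments and collapse runs of whitespace.
                text = " ".join("".join(self._parts).split())
                if text:
                    self.links.append(text)
                self._parts = []

    def handle_data(self, data):
        if self._depth > 0:
            self._parts.append(data)


def build_headline_prompt(html, hint=None):
    """Build a headline-extraction prompt (hypothetical helper, not a langchain API)."""
    parser = LinkTextExtractor()
    parser.feed(html)
    parser.close()
    lines = ["List the top headlines from the link texts below."]
    if hint:
        # Optional structural hint, the kind gpt-3 reportedly needed.
        lines.append("Hint: " + hint)
    lines.extend("- " + text for text in parser.links)
    return "\n".join(lines)
```

The resulting string would then be sent to the model as the prompt. Passing only link text instead of raw HTML also keeps the prompt much smaller.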
What doesn’t work well?
- Large web pages will hit the context limit for language models
- Logging into a website doesn’t work because OpenAI will reject completion requests that contain sensitive data
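
One possible workaround for the context limit is to split the page text into chunks and send each chunk to the model separately. This is a minimal sketch, not something tested in the experiments above. It assumes a character budget as a crude stand-in for a real token count, and both `chunk_text` and `max_chars` are hypothetical names. A real tokenizer would give a more accurate budget.

```python
def chunk_text(text, max_chars=8000):
    """Split text into chunks of at most max_chars characters.

    Breaks on line boundaries where possible and hard-splits any
    single line that is longer than the budget.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    chunks = []
    current = ""
    for line in text.splitlines(keepends=True):
        # A single oversized line is cut into budget-sized pieces.
        while len(line) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(line[:max_chars])
            line = line[max_chars:]
        # Start a new chunk when this line would overflow the current one.
        if len(current) + len(line) > max_chars:
            chunks.append(current)
            current = ""
        current += line
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be summarized on its own and the partial summaries combined in a final prompt, at the cost of extra requests.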
What else have others tried?