By combining headless browser automation tools with LLMs, you can build an agent that navigates websites on its own. This opens up new capabilities such as scraping and summarizing web content.
What works well?
- Going to a news website and extracting a list of articles
- OpenAI gpt-4 works better than gpt-3 when using langchain to prompt for headlines (e.g. when asked for the top headlines on Hacker News, gpt-3 fails without some hints about the HTML structure, while gpt-4 works in one shot)
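
To make the "hints about the HTML structure" idea concrete, here is a minimal sketch of a prompt builder with an optional hint. The function name, the prompt wording, and the sample hint are all illustrative assumptions rather than anything langchain or OpenAI prescribes; the resulting string would be passed to whichever model client you use.

```python
from typing import Optional


def build_headline_prompt(page_html: str, structure_hint: Optional[str] = None) -> str:
    """Assemble a prompt asking a model to list the headlines in page_html.

    structure_hint is an optional, human-written description of where the
    headlines live in the markup (hypothetical helper, not a library API).
    """
    parts = [
        "Extract the top headlines from the HTML below.",
        "Return one headline per line and nothing else.",
    ]
    if structure_hint:
        # A weaker model may need this extra line; a stronger one may not.
        parts.append(f"Hint about the page structure: {structure_hint}")
    parts.append("HTML:")
    parts.append(page_html)
    return "\n".join(parts)


# Made-up sample markup, used only to show the two prompt variants.
sample_html = '<span class="titleline"><a href="#">Example story</a></span>'
one_shot_prompt = build_headline_prompt(sample_html)
hinted_prompt = build_headline_prompt(
    sample_html,
    structure_hint='each headline is the link text inside a span with class "titleline"',
)
```

Keeping the hint as a separate argument makes it easy to run the same page through both variants and compare how each model behaves.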
What doesn’t work well?
- Large web pages will hit the context limit for language models
- Logging into a website doesn’t work because OpenAI will reject completion requests that contain sensitive data
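
One possible way to soften the context limit problem is to strip a page down to its visible text and split that text into chunks before prompting. The sketch below uses only the Python standard library; the class and function names and the 8000-character default are assumptions for illustration, and character count is only a rough proxy for the token limit a model actually enforces.

```python
from html.parser import HTMLParser
from typing import List


class VisibleTextParser(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.pieces: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.pieces.append(text)


def html_to_chunks(page_html: str, max_chars: int = 8000) -> List[str]:
    """Return the page's visible text as chunks of at most max_chars characters."""
    parser = VisibleTextParser()
    parser.feed(page_html)
    parser.close()

    chunks: List[str] = []
    current = ""
    for piece in parser.pieces:
        # A single text node longer than the budget is split by brute force.
        while len(piece) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(piece[:max_chars])
            piece = piece[max_chars:]
        candidate = f"{current}\n{piece}" if current else piece
        if len(candidate) > max_chars:
            chunks.append(current)
            current = piece
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent in its own request and the partial answers merged afterwards. The trade-off is that dropping the markup also drops the structural cues a model might otherwise use.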
What else have others tried?