One of the limitations of large language models is that they often lack context when responding. LLMs like ChatGPT (GPT-3.5 and GPT-4 at the time of writing) don't allow fine-tuning and can only take in ~4,000 tokens (roughly 3,000 words) as part of the prompt. That's not nearly enough context to make responses specific to your application. For example, you might want an AI coding tool to help you add a feature to your codebase, but the codebase is likely much larger than the prompt would allow.
To fit more information into prompts, LLMs could benefit from a cheatsheet generated per prompt. Combining tools like semantic search with an LLM could allow for better applications that are more specific to the user's domain. For example, when asking an AI coding tool to add a function, the search step could load in just the related modules or type signatures rather than the entire codebase.
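A minimal sketch of the idea: rank candidate snippets (type signatures, module docs) by similarity to the question and keep only the top few, so the context fits in the prompt. This uses a toy bag-of-words similarity purely for illustration; a real system would use a learned embedding model and a vector index. All function names here are hypothetical.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding" (stand-in for a real embedding model).
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_prompt(question: str, snippets: list[str], k: int = 2) -> str:
    # Keep only the k snippets most similar to the question,
    # so the assembled context stays within the token limit.
    q = embed(question)
    ranked = sorted(snippets, key=lambda s: cosine(q, embed(s)), reverse=True)
    context = "\n".join(ranked[:k])
    return f"Context:\n{context}\n\nQuestion: {question}"

# Hypothetical codebase snippets: only the relevant signatures get included.
snippets = [
    "def parse_config(path: str) -> dict: ...",
    "def render_chart(data: list) -> None: ...",
    "def load_config_defaults() -> dict: ...",
]
print(build_prompt("How do I parse the config file?", snippets))
```

With `k=2`, the config-related signatures outrank the unrelated `render_chart` one, and only those two are packed into the prompt.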
Read Cheating is all you need.
Links to this note
-
Chatgpt Lowers Barriers to Building Small Projects
After using it for a few coding projects recently, I find that ChatGPT is a great way to lower the barriers to building smaller, self-contained projects—things that have been hiding in your to-do list because they take a bit too much effort to attempt but are still good ideas.
-
LLM-First Programming Language
There are many barriers to adoption for a new programming language looking to go mainstream. You have to attract a small group of enthusiasts, build an ecosystem of high quality libraries, help new people learn, and eventually grow a talent marketplace.
-
Why Vector Databases Are Important to AI
The recent success of large language models like ChatGPT has led to a new stage of applied AI and, with it, new challenges. One of those challenges is building context with a limited amount of space to get good results.