LLM Applications Need Creativity

Making the most of large language models in practical applications requires creativity. It’s a blank canvas to be filled, much like the one early mobile application developers faced when a new set of APIs unlocked new possibilities.

I think it will take several years of creative exploration to find the best ways to integrate LLMs. By then, a plain chat interface will feel lazy next to the competition.

Links to this note

Sorting Vector Store Results

Many vector databases can find the top k most similar results to a query but are unable to sort by other document metadata. This is a pretty severe limitation for building LLM applications, especially ones where time is a dimension (meetings, calendars, task lists, etc.). For example, you can retrieve the 10 most similar results to the phrase “team meeting notes” but cannot restrict them to the team meeting notes from the last month.

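
To make the limitation concrete, here is a minimal sketch using a toy in-memory store rather than any real vector database. The `Doc` class, the 2-d embeddings, the dates, and the helper names `top_k` and `top_k_since` are all hypothetical and exist only for illustration. It contrasts filtering after a similarity-only top-k lookup with filtering on metadata before ranking.

```python
from dataclasses import dataclass
from datetime import date, timedelta
import math


@dataclass
class Doc:
    text: str
    embedding: list[float]  # toy 2-d vectors; real embeddings come from a model
    created: date           # the metadata we would like to filter or sort on


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def top_k(docs, query, k):
    """Similarity-only search: all a bare top-k lookup can do."""
    return sorted(docs, key=lambda d: cosine(d.embedding, query), reverse=True)[:k]


def top_k_since(docs, query, k, cutoff):
    """Filter on metadata first, then rank the survivors by similarity."""
    recent = [d for d in docs if d.created >= cutoff]
    return top_k(recent, query, k)


today = date(2024, 6, 30)  # fixed date so the example is reproducible
docs = [
    Doc("team meeting notes, January", [1.0, 0.0], today - timedelta(days=170)),
    Doc("team meeting notes, February", [0.99, 0.05], today - timedelta(days=140)),
    Doc("team meeting notes, June", [0.9, 0.3], today - timedelta(days=10)),
    Doc("grocery list, June", [0.0, 1.0], today - timedelta(days=5)),
]
query = [1.0, 0.0]  # stands in for the embedding of "team meeting notes"
cutoff = today - timedelta(days=30)

# Top-k first, filter afterwards: the old notes crowd out the recent one.
plain = top_k(docs, query, k=2)
post_filtered = [d for d in plain if d.created >= cutoff]

# Filter first, then rank: the recent meeting notes come back on top.
pre_filtered = top_k_since(docs, query, k=2, cutoff=cutoff)

# Sorting the filtered hits by metadata (newest first) is then trivial.
newest_first = sorted(pre_filtered, key=lambda d: d.created, reverse=True)
```

In this toy data the two most similar documents are both months old, so filtering after the lookup leaves nothing, while filtering before ranking surfaces the June meeting notes. A store that only exposes similarity-based top k forces the first pattern, or forces you to over-fetch and hope the recent documents are in the batch.
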
LLM Latency Is Output-Size Bound

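As it stands today, LLM applications have noticeable latency, but much of that latency is output-size bound rather than input-size bound. The prompt is processed in parallel, while output tokens are generated one at a time, so the amount of text that goes into a prompt matters far less than the amount of text the model has to generate.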
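
A rough back-of-the-envelope model shows why output size dominates. The function name and both throughput figures below are made-up assumptions chosen only to illustrate the shape of the trade-off, not measurements of any particular model or provider.

```python
def estimate_latency_s(input_tokens, output_tokens,
                       prefill_tokens_per_s=5000.0,  # assumed: prompt is read in parallel
                       decode_tokens_per_s=50.0):    # assumed: output is produced token by token
    """Very rough latency estimate: time to read the prompt plus time to generate."""
    prefill = input_tokens / prefill_tokens_per_s
    decode = output_tokens / decode_tokens_per_s
    return prefill + decode


baseline = estimate_latency_s(input_tokens=500, output_tokens=500)
long_prompt = estimate_latency_s(input_tokens=5000, output_tokens=500)
long_output = estimate_latency_s(input_tokens=500, output_tokens=5000)

print(f"baseline:    {baseline:.1f}s")
print(f"10x prompt:  {long_prompt:.1f}s")
print(f"10x output:  {long_output:.1f}s")
```

Under these assumed rates, a prompt ten times longer adds under a second, while an output ten times longer multiplies the total by roughly ten. The practical takeaway is to spend tokens freely on context and ask for short, structured outputs where latency matters.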