The recent success of large language models like ChatGPT has led to a new stage of applied AI and, with it, new challenges. One of those challenges is building context within a limited amount of space to get good results.
For example, let's say you want the AI to respond with content from your own data set. One option would be to stuff all of your data into the prompt and ask the model to respond using it. However, it's unlikely your data would fit neatly into ~3,000 words (the input token limit of GPT-3.5). Rather than training your own model (expensive), you need a way to retrieve only the relevant content and pass it to the model in a prompt.
This is where vector databases come in. You can use a vector DB like Chroma, Weaviate, or Pinecone to build an index of embeddings, run similarity searches over your documents, and pass only the most relevant context to the model for the best results.
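The core idea can be sketched without any particular vector DB: embed your documents, embed the query, and rank documents by cosine similarity. Here is a minimal illustration in plain Python; the hard-coded three-dimensional vectors are stand-ins for the much larger embeddings a real model would produce, and the document names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "index": document snippets mapped to precomputed embeddings.
# In practice, an embedding model produces these vectors (often 1000+ dims).
index = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.8, 0.2],
    "api authentication": [0.0, 0.2, 0.9],
}

def top_k(query_embedding, k=2):
    """Return the k documents most similar to the query embedding."""
    ranked = sorted(
        index.items(),
        key=lambda item: cosine_similarity(query_embedding, item[1]),
        reverse=True,
    )
    return [doc for doc, _ in ranked[:k]]

# A query embedding that points close to "refund policy"
context = top_k([0.85, 0.15, 0.05], k=1)
prompt = f"Answer using only this context:\n{context[0]}\n\nQuestion: ..."
```

A vector database does essentially this, but with approximate nearest-neighbor indexes so the search stays fast across millions of documents instead of a linear scan over three.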