Large Language Models

Large language models (LLMs) predict the next word in a block of text. At a basic level, you can think of an LLM as a function that takes in text (i.e. a prompt) and returns text (i.e. a completion).
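
To make the "text in, text out" idea concrete, here is a minimal sketch of a toy next-word predictor. It is not how a real LLM works internally (it only counts word pairs), and the names `train_bigrams` and `complete` are made up for illustration, but it has the same shape: a prompt goes in and a completion comes out, one predicted word at a time.

```python
from collections import Counter, defaultdict


def train_bigrams(corpus: str) -> dict:
    """Count which word follows which in the training text."""
    words = corpus.split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def complete(prompt: str, counts: dict, max_words: int = 5) -> str:
    """Text in, text out: repeatedly append the most frequent next word."""
    words = prompt.split()
    if not words:
        return prompt
    for _ in range(max_words):
        followers = counts.get(words[-1])
        if not followers:
            break  # the last word never appeared in training, so stop
        words.append(followers.most_common(1)[0][0])
    return " ".join(words)


corpus = "the cat sat on the mat and the cat sat down"
model = train_bigrams(corpus)
print(complete("the", model, max_words=2))  # prints "the cat sat"
```

A real LLM replaces the word-pair counts with a neural network conditioned on the whole prompt, but the calling convention stays the same.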

  • AI Is the Next Great Interop Layer

    I had previously observed that humans are the great interop layer: we are the glue that fits together disparate processes and tools into usable systems. After using large language models, I’m becoming convinced that they can offload a large amount of the interop cost that currently falls to us. In a nutshell, AI can ‘do what I mean not what I say’ pretty darn well.

  • Mushy Systems

    As large language models proliferate into every service and ultimately replace business logic, we will be left with the horrible burden of maintaining mush.

  • A Moat or a Long Bridge

    In business strategy you’ll often hear a competitive advantage described as a moat, but most moats are more like a long bridge. The moat is the thing that prevents others from easily replicating a business. A long bridge takes time to build and is effectively a moat if it’s impractical to catch up or simply makes it a schlep.

  • Sorting Vector Store Results

    Many vector databases can find the top k most similar results to a query but are unable to sort by other document metadata. This is a pretty severe limitation for building LLM applications, especially ones where time is a dimension (meetings, calendars, task lists, etc.). For example, you can retrieve the 10 most similar results to the phrase “team meeting notes” but cannot retrieve only the team meeting notes from the last month.

  • Context Is Needed for Practical LLM Applications

    One of the limitations of large language models is that they are often missing context when responding. LLMs like ChatGPT (GPT 3.5 and GPT 4 at time of writing) don’t allow fine-tuning and can only take in ~4,000 tokens (roughly 3,000 words) as part of the prompt. That’s not nearly enough room to provide the context specific to your application. For example, you might want an AI coding tool to help you add a feature to your codebase, but the codebase is likely much larger than the prompt would allow.

  • Zapier Natural Language Actions API

    The Zapier NLA API solves a major problem for large language models: the ability to interact with real systems. Rather than integrating with every possible service, a developer can integrate once with Zapier and run every “Action” the user has authorized using natural language instructions.

  • Advantages of Open Source AI

    It’s almost inevitable that, after an initial research phase, progress of AI models and tools will come from open source communities rather than a corporation. Individuals can utilize fair use to do things businesses cannot do (e.g. using leaked LLaMA weights and fine-tuning them). There are more people to work on fringe use cases that do not have to be commercialized. Finally, open source increases access (running 13B LLMs on a laptop, on a Raspberry Pi), allowing more people to try it and provide more feedback.

  • Prompt Injection Attacks Are Unavoidable

    While large language models are already useful for certain text-based tasks, connecting them to other systems that can interact with the outside world poses new kinds of security challenges. Because it’s all based on natural language, any text can effectively become untrusted code.

  • Dify Is an LLM Workshop

    Dify mashes together LLMs, tools, and an end-user-facing UI to make an LLM workshop. The builder is a visual programming interface (similar to iOS Shortcuts) where each step is a pre-defined unit of functionality like an LLM call, RAG, or running arbitrary code.

  • Llamafile Has the Best Ergonomics for Local Language Models

    By far the best installation and running experience for using a large language model locally is llamafile. The entire model, weights, and a server are packaged into a single binary that can be run across multiple runtime environments.

  • Intent-Based Outcome Specification

    A new paradigm for user interfaces is starting to take shape with the rise of AI-powered tools. Rather than a loop of sending a command, receiving the output, and continuing (like graphical user interfaces), an intent-based outcome specification tells the computer what the outcome should be: “open the door” instead of “check auth, unlock, open latch, extend door”.

  • AI Models at the Edge

    Today, most large language models are run by making requests over the network to a provider like OpenAI, which has several disadvantages. You have to trust the entire chain of custody (e.g. network stack, the provider, their subprocessors, etc.). It can be slow or flaky and therefore impractical for certain operations (e.g. voice inference, large volumes of text). It can also be expensive: providers charge per API call and experiments can result in a surprising bill (my useless fine-tuned OpenAI model cost $36).

  • OpenAI Incorrectly Handles Dates

    OpenAI GPT models (GPT 4 at time of writing) do not accurately or consistently parse or manipulate dates.

  • AI for Notes

    Now that my Zettelkasten has over a thousand notes, I’d like to try to quite literally create the experience of a conversation with my second brain. The AI interface should be conversational rather than search queries. It should draw from the knowledge in my notes and respond in natural language. Finally, it should be useful in helping me make connections between ideas I hadn’t thought of before.

  • Summarization With Chain of Density

    A recent paper studied text summarization improvements using a chain of density prompt. The prompt improves over vanilla GPT responses and is close to human summarizations in informativeness and readability.

  • LLM Latency Is Output-Size Bound

    As it stands today, LLM applications have noticeable latency, but much of that latency is output-size bound rather than input-size bound. That means the amount of text that goes into a prompt matters far less for response time than the amount of text the model has to generate.

  • Natural Language User Interface

    One of the superpowers of large language models is that they can “do what I mean” instead of “do what I say”. This ability to interpret prompts can drastically lower the barriers to accessing and interoperating between systems. For example, writing “Send a Slack message to the Ops channel with a list of customers from HubSpot that signed up in the last week” would generate actions that query the HubSpot Contacts API, parse and extract the results, and make another API request to Slack to post to the #ops channel.

  • Latent Space Reasoning

    Rather than converting to text at every step in a chain of thought process with large language models to solve a complex problem, new research suggests that reasoning can happen in a latent space using the internal representation of the model. Besides improving responses that require a greater degree of reasoning, utilizing latent space is faster because it skips the continuous tokenization and text generation.

  • The Pareto Principle and Chatbots

    Support cases at many businesses follow the Pareto principle. For example, at DoorDash and Uber, 87% of support requests relate to 16 issues. When issues are that concentrated, it is economical to build and maintain chatbot conversation trees by hand to solve them. What about the remaining requests, and what about businesses that have a wider distribution of issue types?

  • AI Agent

    An AI agent is an intent-based abstraction that combines LLMs to plan and take action in order to achieve a desired goal.

  • Getting Ready for AI

    The other day I noticed a tweet from Justin Duke which outlined a plan to get his company’s codebase ready for Devin, a programming-focused generative AI product. While many are skeptical about AI taking over coding tasks, progress is happening quickly and it seems likely that these tools will help software engineers (though maybe not replace the job outright).

  • Legal AI Models Hallucinate in 16% or More of Queries

    A recent study from Stanford found that LLMs (GPT-4) and RAG-based AI tools (Lexis+ AI, Westlaw AI-Assisted Research, Ask Practical Law AI) hallucinate answers 16% to 40% of the time in benchmarking queries. GPT-4 had the worst performance while RAG-based AI tools did slightly better.

  • LLM-First Programming Language

    There are many barriers to adoption for a new programming language looking to go mainstream. You have to attract a small group of enthusiasts, build an ecosystem of high quality libraries, help new people learn, and eventually grow a talent marketplace.

  • How to Build an Intuition of What AI Can Do

    One of the difficult parts of applying AI to existing processes and products is that people aren’t calibrated on what generative AI can and can’t do. This leads to both wild ideas that are not possible and missed opportunities to automate seemingly difficult work that is possible.

  • AI Bubble

    In Money AI Bubble, the author argues that the market is in an AI bubble. Dumb money is pushing stock prices up despite the lack of any real improvement in the underlying businesses, and this will eventually lead to losses. As the author contends, most of this is actually an Nvidia bubble.

  • Org-ql Query Prompt

    Use the following prompt to return an org-ql s-expression based on the user’s input. This can be used along with calling org-ql from the command line to make an LLM tool that can query org-mode tasks and headings.

  • Imagine If Human Knowledge Received Half the Enthusiasm as AI

    It’s striking how captivated the world is by LLMs and the latest techniques in AI. What if we were just as excited about progressing human knowledge and learning? Education doesn’t receive nearly the attention and hype it should yet we are far more capable.

  • Consciousness Is Categories

    Consciousness is an emergent property of categories. As a sufficient number of categories can be represented in a system, selfhood arises and, with it, consciousness.

  • Do Higher Temperatures Make LLMs More Creative?

    A higher temperature tells an LLM, when generating a completion, not to always pick the highest-probability next token. This has the effect of producing a wider range of possible responses.

  • AI Multiplies the Value of Expertise

    AI reduces the cost of certain tasks to effectively zero. In doing so, it lowers the barriers to domains where it previously took years to build skills, such as writing code, data analysis, and more. This is precisely why AI also increases the value of expertise and experience.

  • LLM Applications Need Creativity

    Making the most of practical applications of large language models requires creativity. It’s a blank canvas to be filled, much like the one early mobile application developers faced when a new set of APIs unlocked new possibilities.

  • Chatbots Lack Affordances

    When interacting with a chatbot, there are no indications of what to say or how to say it. Without affordances, it’s difficult to know what to do at first.

  • Legal Services Has the Highest AI Occupational Exposure

    A recent paper looking into the economic impact of large language models found that the legal industry has the most potential occupational exposure from AI including augmentation or substitution.

  • Org-ai Is Chat for Notes

    I started building AI for notes to help me chat with my library of notes. The result of that exploration is org-ai, my one-of-one software that helps me remember what I’ve previously written and summarize information. Under the hood, it uses vector-based similarity search, LLMs, and agent-based AI to extract useful information from my zettelkasten in a chat-based interface.

  • There Is No AI Strategy Without a Data Strategy

    Startups typically have an advantage over incumbents when it comes to adopting new technology. With artificial intelligence, however, incumbents are fast to integrate LLMs and have the data needed to make better AI-powered products. For example, an established CRM platform has the data needed to train, evaluate, and deploy AI products that a startup would not have access to.

  • How to Decide If AI Tools Can Be Used at Work

    Advancements in AI-powered tools can greatly improve productivity, but many companies have taken steps to limit or outright ban the use of OpenAI’s ChatGPT, GitHub Copilot, and others. What are they concerned about and how should you decide if these tools can be used by your company?

  • How LangChain Works

    As it turns out, combining large language models together can create powerful AI agents that can respond to and take action on complicated prompts. This is achieved by composing models and tools with an overall language model to mediate the interaction.

  • Zapier NLA Is Bad at Generating API Parameters

    I tried out the Zapier Natural Language Actions API and found that it’s not particularly good at the one thing I needed it to be good at: gluing together other APIs with natural language. Large language models can easily generate the right inputs for simple, straightforward API endpoints, but more complicated (and poorly designed) ones like HubSpot’s are unusable.

  • Org-ql from the Command Line

    Most of my tasks and projects are organized using org-mode. I was looking for a way to query them from an LLM and, rather than recreate an index and a database, I can use what I normally use: org-ql, run with Emacs in batch mode.

  • Why Vector Databases Are Important to AI

    The recent success of large language models like ChatGPT has led to a new stage of applied AI and, with it, new challenges. One of those challenges is building context with a limited amount of space to get good results.

  • LLM Web Browsing

    By combining headless browser automation tools with LLMs, you can create an agent that can navigate to websites. This opens up all sorts of new capabilities like scraping and summarizing web content.

  • AI Replaces Business Logic

    Satya Nadella from Microsoft talks about how orchestrating between business applications is the next step for artificial intelligence, which will replace business logic with AI.
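
The note on temperature above describes the effect in words, so here is a minimal sketch of the usual mechanism: dividing the model’s raw scores (logits) by the temperature before turning them into probabilities. The tokens and logit values are invented for illustration, and `softmax_with_temperature` and `sample_token` are hypothetical helper names, not part of any particular library.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Convert raw scores into probabilities, flattened or sharpened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [score / temperature for score in logits]
    highest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(tokens, logits, temperature, rng):
    """Pick the next token at random, weighted by the temperature-adjusted probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]


# Made-up candidates for the next word after "The cat sat on the".
tokens = ["mat", "sofa", "moon"]
logits = [3.0, 1.5, 0.2]

cold = softmax_with_temperature(logits, 0.2)  # nearly always "mat"
hot = softmax_with_temperature(logits, 2.0)   # "sofa" and "moon" become plausible
print([round(p, 3) for p in cold])
print([round(p, 3) for p in hot])
print(sample_token(tokens, logits, 2.0, random.Random(0)))
```

At a low temperature nearly all of the probability lands on the top token, while a higher temperature spreads it across the alternatives, which is where the wider range of responses comes from.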