• AI Multiplies the Value of Expertise

    AI reduces the cost of certain tasks to effectively zero. In doing so, it lowers the barriers to domains that would previously take years to build skills such as writing code, data analysis, and more. This is precisely why AI also increases the value of expertise and experience.

    It’s one thing to write code, but as any engineer will tell you, there’s more to it than that. It might be easy to write the function, but large language models can’t reason about whether or not it’s the right thing to do. As Byrne Hobard writes in Working with a Copilot, “With a lower cost to making bad decisions fast, there’s a higher value on making good decisions.” Domain expertise and context are at much higher demand when the cost of low-context work goes to zero.

    The same thing exists in business contexts. With AI tools that can summarize content from anywhere all the time, providing the write context (and specifying the right tasks) provided a multiplier on an experts time. When used effectively they just added a team of low-level workers at their beck and call.

    See also:


  • Universality Leads to NP-Complete Problems

    There is a surprising link between universality and NP-complete problems in computer science.

    Important context: finding an algorithm to solve an NP-complete problem would mean solving all problems of the same class.

    Computation can be thought of as the solution to the NP-complete problem of executing computable functions. A function can be computed on any machine with the same set of inputs, a sequence of instructions, for any computable problem (e.g. you don’t need to buy a special machine to run Slack).

    When we talk about NP-complete problems we typically talk about specific classes that have resource constraints. For instance, there are many classes of complete problem with respect to time—they are computable but would take many years to complete (e.g. breaking 256-bit encryption would take millions of years).

    Universality explains why NP-complete problems exist because they are a subset of what’s solvable by a universal computation machine. We know this because resource constraints can be simulated in a universal computing machine by measuring resources and stopping execution if the constraint is exceeded.

    Memory usage, CPU usage, time are constraints that are important to making solutions practical, not to make them solvable altogether. In fact, it’s because it is solvable (due to universality) that NP-complete problems exist at all.

    Read Why are there complete problems, really?


  • Why Vector Databases Are Important to AI

    The recent success of large language models like ChatGPT have led to a new stage of applied AI and with it, new challenges. One of those challenges is building context with a limited amount of space to get good results.

    For example, lets say you want the AI to respond with content from your data set. In order to do that you could stuff all of your data into the prompt and then ask the model to respond using it. However, it’s unlikely the data would fit neatly into ~3000 words (the input token limitation of GPT-3.5). Rather than try to train your own model (expensive), you need a way to retrieve only the relevant content to pass to the model in a prompt.

    This is where vector databases come in. You can use a vector DB like Chroma, Weaviate, Pinecone, and many more to create an index of embeddings to perform similarity searches on documents to determine what context to pass to a model for the best results.


  • How Langchain Works

    As it turns out, combining large language models together can create powerful AI agents that can respond to and take action on complicated prompts. This is achieved by composing models and tools with an overall language model to mediate the interaction.

    For example, a langchain agent uses the following prompts to string together multiple “tools” which alters how to respond based on the user’s input:

    Assistant is a large language model trained by OpenAI.
    
    Assistant is designed to be able to assist with a wide range of tasks, from answering simple questions to providing in-depth explanations and discussions on a wide range of topics. As a language model, Assistant is able to generate human-like text based on the input it receives, allowing it to engage in natural-sounding conversations and provide responses that are coherent and relevant to the topic at hand.
    
    Assistant is constantly learning and improving, and its capabilities are constantly evolving. It is able to process and understand large amounts of text, and can use this knowledge to provide accurate and informative responses to a wide range of questions. Additionally, Assistant is able to generate its own text based on the input it receives, allowing it to engage in discussions and provide explanations and descriptions on a wide range of topics.
    
    Overall, Assistant is a powerful system that can help with a wide range of tasks and provide valuable insights and information on a wide range of topics. Whether you need help with a specific question or just want to have a conversation about a particular topic, Assistant is here to assist.
    

    Part 2:

    TOOLS
    ------
    Assistant can ask the user to use tools to look up information that may be helpful in answering the users original question. The tools the human can use are:
    
    > Current Search: Useful for when you need to answer questions about current events or the current state of the world. The input to this should be a single search term.
    > Find Notes: Useful for when you need to respond to a question about my notes or something I've written about before. The input to this should be a question or a phrase. If the input is a filename, only return content for the note that matches the filename.
    
    RESPONSE FORMAT INSTRUCTIONS
    ----------------------------
    
    When responding to me, please output a response in one of two formats:
    
    **Option 1:**
    Use this if you want the human to use a tool.
    Markdown code snippet formatted in the following schema:
    
    ```json
    {{
        "action": string \ The action to take. Must be one of Current Search, Find Notes
        "action_input": string \ The input to the action
    }}
    ```
    
    **Option #2:**
    Use this if you want to respond directly to the human. Markdown code snippet formatted in the following schema:
    
    ```json
    {{
        "action": "Final Answer",
        "action_input": string \ You should put what you want to return to use here
    }}
    ```
    
    USER'S INPUT
    --------------------
    Here is the user's input (remember to respond with a markdown code snippet of a json blob with a single action, and NOTHING else):
    
    {input}
    

  • A Curiosity Loop Contextualizes Advice

    Sometimes problems you encounter need an outside perspective to help you figure out what to do. A curiosity loop helps contextualize advice from multiple people in a way that makes it far more useful than getting overly generalized advice from one source.

    How to do it

    • Curate who you ask: balance the loop by consulting experts but also someone who knows you really well (experts might not have specific answers for you the way a close friend would)
    • Ask good questions that are specific (with context), solicit rationale, and avoid biases (like a good survey question)
    • Make it lightweight and easy to respond: lower the cognitive load by giving a list of choices and asking why, let someone pick and choose with a quick answer or make space to send something longer
    • Try to get 3 or 4 responses (you might need to email a larger number of people based on the response rate)
    • Process the information and thank them: it feels good to help someone and be heard, sending a genuine thank you can be rewarding in and of itself

    From Ada Chen Rekhi via Lenny’s Podcast.

    See also:


  • AI Is the Next Great Interop Layer

    I had previously observed that humans are the great interop layer—we are the glue that fits together disparate processes and tools into usable systems. After using large language models, I’m becoming convinced that they can offload a large amount of the interop cost that currently falls to us. In a nut shell, AI can ‘do what I mean not what I say’ pretty darn well.

    The latest LLM tools can interpret what we mean far better than before. Natural language interfaces opens up the possibility of sloppy interfaces—ones where not everything needs to be specified precisely. This allows for tolerance in the system making it easier for more things to fit together.

    By contrast, APIs are largely a ‘do what I say’ interface. They require precise steps to complete an action. They require documentation to help a human understand how to use them. It takes creativity to implement them in ways that can solve their problem. Now we have LLMs that can figure out how to make API calls (Zapier NLAPI) and map different actions to different methods (langchain tools).


  • Layering vs Chunking

    When building large complicated things, there are two primary strategies that optimize for different things—layering and chunking.

    Chunking is about taking something large, breaking it up into smaller pieces, and assembling them into one cohesive thing. This is pretty common and intuitive which pairs nicely with “divide and conquer” and can speed things up but tends to be all or nothing (either all the chunks are delivered or nothing ships).

    Layering is also about breaking something large down but in a way that accumulates. The overall project is divided into smaller layers that are independently useful and shipped. This makes it more iterative, delivering value along the way and providing opportunities to adapt to new information.

    Layering is much more desirable for large complicated projects because it’s incremental.

    Problems arise when teams confuse chunking for layering—the team might think they are being fast and incremental, but deliver no value to the user until the end anyway.

    I’ve observed many large-scale projects fail because they used the wrong strategy or because no one recognized that chunking was the only option.

    To help spot these issues, a simple clarifying question to ask is: “At what point can a user actually use this?” (This can also be adapted for internal infrastructure projects, “At what point will this deliver value?").

    (I read about this somewhere and apply it all the time but I can’t find the source, sorry!)


  • Prompt Injection Attacks Are Unavoidable

    While large language models are already useful for certain text based tasks, connecting them to other systems that can interact with the outside world poses new kinds of security challenges. Because it’s all based on natural language, any text can effectively become untrusted code.

    Some examples:

    • Adding a prompt injection attack on your public website that can be accessed by an LLM enabled tool like Bing search
    • LLM-enabled personal assistant that can read your email might be prompt injected simply by sending them an email
    • Data could be exfiltrated from a support ticketing system by sending a prompt injected message
    • Training data might be poisoned by including prompt injected text that via hidden text

    It’s unclear what the solution to these problems are. You could chain together more AI tools to detect prompt injection attacks. You could build protections into the prompt used internally. You could warn the user or log a message for every action taken and use anomaly detection.

    Read Prompt injection: what’s the worst that can happen?


  • Natural Language User Interface

    One of the super powers of large language models is that it can “do what I mean” instead of “do what I say”. This ability to interpret prompts can drastically lower the barriers to accessing and interoperating between systems. For example, writing “Send a slack message to the Ops channel with a list of customers from HubSpot that signed up in the last week” would generate actions that query the HubSpot Contacts API, parse and extract the results, and make another API request to Slack to post to the #ops channel.

    The best example I can find of this is from Zapier’s Natural Language Actions (NLA) API. Zapier is perfectly positioned to enable a jump to universality for natural language UI because it already connects thousands of API providers together.

    NLUI can also be useful as a frontend to a single service. For example, making a chat support bot that can access the content of support docs.


  • Legal Services Has the Highest AI Occupational Exposure

    A recent paper looking into the economic impact of large language models found that the legal industry has the most potential occupational exposure from AI including augmentation or substitution.

    What is AI occupational exposure based on?

    The AIOE measure was constructed by linking 10 AI applications (abstract strategy games, realtime video games, image recognition, visual question answering, image generation, reading comprehension, language modeling, translation, speech recognition, and instrumental track recognition) to 52 human abilities (e.g., oral comprehension, oral expression, inductive reasoning, arm-hand steadiness, etc) using a crowd-sourced matrix that indicates the level of relatedness between each AI application and human ability. Data on the AI applications come from the Electronic Frontier Foundation (EFF) which collects and maintains statistics about the progress of AI across multiple applications. Data on human abilities comes from the Occupational Information Network (O*NET) database developed by the United States Department of Labor. O*NET uses these 52 human abilities to describe the occupational makeup of each of 800+ occupations that it tracks. Each of 800+ occupations can be thought of as a weighted combination of the 52 human abilities. O*NET uses two sets of weights: prevalence and importance.

    It seems a bit impractical to base the degree to which AI could disrupt an industry based on a list of occupational skills assigned to a category. Maybe it makes sense if you compare within the same score (telemarketers is on the opposite end of the spectrum from pressers).

    Read How will Language Modelers like ChatGPT Affect Occupations and Industries


  • LLM-First Programming Language

    There are many barriers to adoption for a new programming language looking to go mainstream. You have to attract a small group of enthusiasts, build an ecosystem of high quality libraries, help new people learn, and eventually grow a talent marketplace.

    Update: Someone built something similar! It’s not quite what I had in mind but it’s definitely a riff on the same idea.

    The rise of LLMs adds another wrinkle—if programming with an AI assistant becomes commonplace, the degree to which the AI is good at the programming language could be a significant benefit (or detractor).

    An LLM-first programming language might leverage AI assistants to make adoption significantly easier. I love the rust compiler because it is basically a self contained help system, but it requires study and practice to be productive. Imagine if you could become productive immediately and then learn more.

    Not only that, an LLM-first programming language might have a deep repertoire of libraries, patterns, and problems it draws from that were specifically trained for language. I imagine this would be much more effective in day to day coding tasks than the aggregate output of blog posts that happen to be published on the internet and pulled into a blob of training data. It might even provide better context for other LLM applications to give more meaningful help (surely a compiler has some more precise data like the AST, the change, the types, etc that would be useful for figuring out what you are trying to do).


  • Capture Output of Chatgpt-Shell in Org-Mode

    I’m messing around with ChatGPT in Emacs and I want to capture the results of a session in org-mode. Since chatgpt-shell is just a buffer (like everything else in emacs), we can write an elisp source block to grab the contents of buffer. Even better, we can tell org-mode to export the results as a markdown block since the code examples all use code fences (the triple backtick).

    #+BEGIN_SRC emacs-lisp :results value :wrap src markdown :cache yes :exports results
    (let ((buf (get-buffer-create "*chatgpt*")))
      (with-current-buffer buf
        (buffer-string)))
    #+END_SRC
    
    ChatGPT> You are an elisp programming expert assistant.
    <gpt-end-of-prompt>
    Great! How can I assist you today?
    
    ChatGPT> How do I use org-mode to run a block that contains the output of another buffer? For example, there is a library that uses comint-mode as a chat interface. I want to embed that into an org-mode document, ideally using org-mode blocks. How do I do that?
    <gpt-end-of-prompt>
    You can use the `src` code block in org-mode to execute commands and capture their output. Here's an example:
    
    First, you need to define the comint-based chat interface as a function that returns its output as a string. Here's a basic example using `comint-send-string` and `comint-show-output`:
    
    ```
    (defun chat-interface ()
      (let ((buf (get-buffer-create "*chat*")))
        (with-current-buffer buf
          (erase-buffer)
          (comint-mode)
          (comint-send-string (get-buffer-process buf) "Hello, World!\n")
          (comint-show-output))
        (with-current-buffer buf
          (buffer-string))))
    ```
    
    This function creates a new buffer called `*chat*`, switches to it, sends a message to the chat interface, shows its output, then returns the buffer's content.
    
    Next, you can use this function in a `src` code block in org-mode like this:
    
    ```
    #+NAME: chat-example
    #+BEGIN_SRC emacs-lisp :results output
    (chat-interface)
    #+END_SRC
    ```
    
    This creates a new named code block called `chat-example` that calls the `chat-interface` function and captures its output as the block's result.
    
    You can now reference this block elsewhere in your org-mode document, include it in exports, etc. For example:
    
    ```
    In this section we're going to use the chat interface to say hello:
    
    #+RESULTS: chat-example
    : Hello, World!
    ```
    
    This would include the output of the `chat-example` block in the exported document.
    
    ChatGPT> please print out the chat so far, formatted in org-mode sytnax
    <gpt-end-of-prompt>
    Certainly! Here's the conversation so far, formatted in org-mode syntax:
    
    ```
    * Conversation with elisp programming expert assistant
    
    ** You, 2021-11-28 10:15
    How do I use org-mode to run a block that contains the output of another buffer? For example, there is a library that uses comint-mode as a chat interface. I want to embed that into an org-mode document, ideally using org-mode blocks. How do I do that?
    
    ** Me, 2021-11-28 10:19
    You can use the `src` code block in org-mode to execute commands and capture their output. Here's an example:
    
    ...
    
    ```
    
    Is this what you were looking for?
    
    ChatGPT> yes
    <gpt-end-of-prompt>
    Great! Let me know if you have any more questions.
    
    ChatGPT> please
    <gpt-end-of-prompt>
    Sure, how can I help you?
    
    ChatGPT> in emacs lisp, how do you get the contents of a buffer as a string?
    <gpt-end-of-prompt>
    You can get the contents of a buffer as a string in Emacs Lisp using the `buffer-string` function. You can call this function with the buffer you want to retrieve the contents from as an argument. Here's an example:
    
    ```
    ;; Get the contents of the current buffer
    (buffer-string)
    
    ;; Get the contents of another buffer by name
    (with-current-buffer "some-buffer-name"
      (buffer-string))
    ```
    
    The first example retrieves the contents of the current buffer, while the second example retrieves the contents of a buffer named "some-buffer-name".
    
    `buffer-string` returns the entire text of the buffer as a string, including any properties and text properties. If you only want the text, you can use the `buffer-substring` function instead. Here is an example:
    
    ```
    ;; Get the text of the current buffer
    (buffer-substring (point-min) (point-max))
    
    ;; Get the text of another buffer by name
    (with-current-buffer "some-buffer-name"
      (buffer-substring (point-min) (point-max)))
    ```
    
    In this example, `buffer-substring` returns the text of the buffer as a string, excluding any properties and text properties. The arguments `(point-min)` and `(point-max)` specify the region to be returned, which in this case is the entire buffer.
    
    I hope this helps!
    
    ChatGPT> buffer-string is returning text from a buffer that looks like an S-expression. How do I get just the text?
    <gpt-end-of-prompt>
    If `buffer-string` is returning text that looks like an S-expression, it's possible that the buffer in question has some Lisp code in it. In that case, you can use the `buffer-substring-no-properties` function to retrieve the text content without any properties or S-expression formatting. Here's an example:
    
    ```
    (with-current-buffer "some-buffer-name"
      (buffer-substring-no-properties (point-min) (point-max)))
    ```
    
    This will return the text content of the buffer as a string, without any properties or formatting.
    
    You could also use the `subseq` function to extract just the text content from the `buffer-string` output. Here's an example:
    
    ```
    (subseq (buffer-string) 2 -2)
    ```
    
    This will return the text content of the buffer without the first and last parentheses of the S-expression.
    
    ChatGPT>
    

  • Using ChatGPT With Emacs

    There are a few packages and libraries that are being built to use ChatGPT along with Emacs.

    This page is becoming fairly popular in search results so please let me know if there are other libraries you love or think I should try out.

    Recommended

    I use the following libraries myself for both writing code and prose (more prose than code these days).

    • GPTel focuses on a chat client session within an Emacs buffer and sending text from the buffer (like a selected region or line). Also uses a transient menu (a la magit) for on-the-fly configuration like custom prompts and models to use.
    • ChatGPT Shell is an eshell for ChatGPT and dall-e that fascilitats interacting with a chat session. By default it’s prompt is configured for helping with writing code.

    Best of the rest

    An incomplete list of other libraries I’ve come across.

    • Emacs OpenAI has a collection of packages for using ChatGPT and dall-e
    • ChatGPT.el was one of the first, but hasn’t been updated to use the official API yet
    • ChatGPT Arcana is a more integrated approach including prompt from region
    • org-ai combines OpenAI whisper and GPT so you can talk to Emacs (also has a nifty begin_ai block for org-mode)
    • leafy-mode adds completions to org-mode using ChatGPT

    Querying your own data

    Most of the libraries mentioned rely on the user to send text to a chat session. However, they don’t have ways to index your own data to provide context to a chat session or for single-shot completions.

    For example, I wrote a LangChain agent for helping me chat with my notes (also named org-ai, oh well!). While it’s one of one software and I have no intention of making it generic, you could adapt it to your own use.

    See also: