• How to Decide if AI Tools Can Be Used at Work

    Advancements in AI-powered tools can greatly improve productivity, but many companies have taken steps to limit or outright ban the use of OpenAI’s ChatGPT, GitHub Copilot, and others. What are they concerned about, and how should you decide whether these tools can be used at your company?

    Risks of AI tools in the workplace

    Because large language models are very big and resource intensive (though this is changing), they need to be run on servers rather than on-device. Since these models work on text, that means transmitting a lot of potentially sensitive information over the network. To my knowledge, none of the major AI platforms offer end-to-end encryption.

    There are also privacy and IP concerns. If information sent off for processing is mishandled, it could leak important IP or trade secrets. Apple recently banned ChatGPT and I suspect that is the reason. I’m guessing there are also legal concerns about ownership if AI-generated output ends up in a company’s IP.

    How to decide

    The value of AI tools in the workplace is productivity. If GitHub Copilot can improve developer productivity even a small amount, it would be well worth the cost given how expensive engineering time is. On the other hand, there are real risks.

    These risks can be managed by thoughtful policies, training, and controls.

    1. Do not allow sensitive customer data to be sent to AI tools over the network. Mitigations might include deciding which teams can use e.g. ChatGPT, creating training on how to use AI tools, or building an in-house wrapper around LLMs that can detect certain data like credit card numbers and IDs (see the sketch after this list).
    2. Avoid using coding tools that require access to the entire codebase. Mitigations might include only allowing local language models, ensuring all secrets and API keys are encrypted or not stored in version control, or banning Copilot but allowing engineers to use ChatGPT for code help.
    3. Buy the enterprise version of these tools and ban personal-use tools. Many providers realize that stronger guarantees about data use and storage are necessary for businesses. Coupled with banning any personal AI tool usage, this could provide more privacy and security.
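
    To make the in-house wrapper idea in item 1 concrete, here is a minimal sketch of a pre-filter that blocks prompts containing obvious card numbers before they are sent to an external LLM API. The pattern and function name are illustrative, not from a real product.

    import re

    # Naive check for card-number-like sequences (13-16 digits with optional
    # spaces or dashes). A real deployment would also look for IDs, emails, etc.
    CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,16}\b")

    def guard_prompt(prompt: str) -> str:
        """Refuse to forward a prompt that appears to contain sensitive data."""
        if CARD_PATTERN.search(prompt):
            raise ValueError("prompt appears to contain a card number; refusing to send")
        return prompt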

    This was just a quick list of ideas, but it seems like more nuanced approaches can be taken to balance the risk and reward of using these tools. One big thing is missing though—what new failure modes and risks now exist as a result of using these tools? I guess we’ll find out soon enough.


    Published

  • Build a Cross Platform FAISS Index

    A FAISS similarity search index is not cross-platform. If you save the index locally on aarch64 and try to load it in an x86_64 environment, it will not work (for me, it loads an empty index). Using Docker, you can save an index built locally so it’s compatible with other architectures.

    To do that, run a container of the image which has your setup using the --platform flag (matching the architecture of the servers the index will be loaded on) and run your code to build and save the index (using the FAISS save_local method). The index files generated in the container will get synced to your local machine via the volume mount so you can use them however you need (e.g. check them into version control, store them in S3).

    docker run --rm -it \
           --platform linux/x86_64 \
           -v $PWD:/my/container/path myimage:latest \
           sh
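
    Inside that shell, something like the following builds and saves the index. This is a minimal sketch assuming the LangChain FAISS wrapper and OpenAI embeddings; swap in whatever embedding model and documents you actually use (the path matches the volume mount above).

    from langchain.embeddings import OpenAIEmbeddings
    from langchain.vectorstores import FAISS

    # Build the index inside the container so it matches the target architecture.
    docs = ["first document", "second document"]  # your documents go here
    index = FAISS.from_texts(docs, OpenAIEmbeddings())

    # Written under the volume mount, so the files appear on the host machine.
    index.save_local("/my/container/path/faiss_index")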
    

    Published

  • Productivity Is Bounded by Decision Making

    At a certain point, optimizing productivity becomes optimizing for speed of decision making. After all the tools, shortcuts, and hacks that build up raw speed to get tasks done, you’re left with the cognitive load of decision making. That email you received? It’s a decision disguised as a reply. That Slack message that remains unread? You’re procrastinating because a decision needs to be made that you don’t want to confront.

    So many productivity frameworks and tools rely on cleverly hiding tasks in queues so that you can do them later. Try being more decisive and see what kind of impact that has on your to-do list.



    Published

  • Clarity Is One Number

    Making complicated things seem simple involves abstracting over reality in a way that is clear and actionable. Oftentimes, that means reducing things down to a single number. People are drawn to (fixated on, even) the clarity of one number going up or down.

    For example, your weight captures a high degree of nuance at low fidelity—it could go up or down for a myriad of reasons—but provides clarity in a way that tracking dozens of biometrics does not. If it starts to go up, you might look at it with concern; if it goes down, you might celebrate it as a victory.

    We see this desire for one number everywhere. A stock price grossly encapsulates a company’s value and the market’s psychology. The score in a baseball game indicates who is winning and who is losing. The Earth’s rising average temperature indicates catastrophic climate change.



    Published

  • Bobby Bonilla Deal

    The New York Mets made one of the worst deals in sports history. From 2011 to 2035, the Mets have paid and will continue to pay Bobby Bonilla, a baseball player who has long since retired, $1.19MM every year.

    A “Bobby Bonilla deal” is one that results in substantial compensation being paid with no value being generated at all for a long time. You can tell it’s a Bobby Bonilla deal if you ask the question, “What value will the other party bring 10 years from now?” and the answer is, “Nothing but they will still get paid handsomely.”

    This might sound a lot like royalties, but I assure you it is not! With royalties, there is a tangible, renewable asset that has value and can be traded. Its value could decline, but so would usage and therefore royalty payments. On the other hand, a professional baseball player is a depreciating asset whose value, by definition, goes to zero.


    Published

  • Bluesky Will Never Be the Cozy Web

    Bluesky is having a moment where all the users are new, the content is shitpost-y, and it all feels very lively. They’re also getting their first taste of unsavory people joining, ruining the collective bubble.

    The cozy web is incompatible with large-scale social networks because the cozy web needs to be small and social networks need to be large. As a result, there are no controls that could be put in place to simultaneously build a large public social network and make it free from bad people.

    It’s unfortunate because Bluesky is (currently) really fun and whimsical, but I can’t help feeling we’re watching another social network speed run through everything learned about social media and content moderation over the last decade.


    Published

  • AI Multiplies the Value of Expertise

    AI reduces the cost of certain tasks to effectively zero. In doing so, it lowers the barriers to domains that previously took years of skill-building, such as writing code, data analysis, and more. This is precisely why AI also increases the value of expertise and experience.

    It’s one thing to write code, but as any engineer will tell you, there’s more to it than that. It might be easy to write the function, but large language models can’t reason about whether or not it’s the right thing to do. As Byrne Hobart writes in Working with a Copilot, “With a lower cost to making bad decisions fast, there’s a higher value on making good decisions.” Domain expertise and context are in much higher demand when the cost of low-context work goes to zero.

    The same thing exists in business contexts. With AI tools that can summarize content from anywhere all the time, providing the right context (and specifying the right tasks) becomes a multiplier on an expert’s time. Used effectively, it’s as if they’ve just added a team of low-level workers at their beck and call.



    Published

  • Save D3 to Svg

    The easiest way to export a d3 visualization is to use the svg-crowbar bookmarklet from the New York Times. It will save the d3 canvas to an SVG and download it (if there are multiple SVGs on the page, you can click which one you want to download).


    Published

  • Universality Leads to NP-Complete Problems

    There is a surprising link between universality and NP-complete problems in computer science.

    Important context: finding an efficient algorithm to solve one NP-complete problem would mean solving all problems in the class.

    Computation can be thought of as the solution to the complete problem of executing computable functions. Any computable function can be run on the same machine given the right sequence of instructions and inputs (e.g. you don’t need to buy a special machine to run Slack).

    When we talk about NP-complete problems, we typically talk about specific classes that have resource constraints. For instance, there are many classes of complete problems with respect to time—they are computable but would take many years to complete (e.g. breaking 256-bit encryption would take millions of years).

    Universality explains why NP-complete problems exist: they are a subset of what’s solvable by a universal computation machine. We know this because resource constraints can be simulated on a universal computing machine by measuring resources and stopping execution if the constraint is exceeded.
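
    As a toy illustration of that last point, here is a minimal sketch (the names are hypothetical) of simulating a resource constraint by counting steps and halting once a budget is exceeded.

    def bounded_run(step_fn, state, halted, budget):
        """Run a step function until it halts or the step budget is exhausted."""
        for steps in range(budget):
            if halted(state):
                return state, steps
            state = step_fn(state)
        raise TimeoutError("step budget exceeded")

    # Toy example: count down from 10 with a budget of 100 steps.
    final_state, steps_used = bounded_run(lambda n: n - 1, 10, lambda n: n == 0, 100)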

    Memory usage, CPU usage, and time are constraints that matter for making solutions practical, not for making them solvable in the first place. In fact, it’s because they are solvable (due to universality) that NP-complete problems exist at all.

    Read Why are there complete problems, really?


    Published

  • Why Vector Databases Are Important to AI

    The recent success of large language models like ChatGPT has led to a new stage of applied AI and, with it, new challenges. One of those challenges is building context within a limited amount of space to get good results.

    For example, let’s say you want the AI to respond with content from your data set. To do that, you could stuff all of your data into the prompt and then ask the model to respond using it. However, it’s unlikely the data would fit neatly into ~3,000 words (the input token limitation of GPT-3.5). Rather than try to train your own model (expensive), you need a way to retrieve only the relevant content to pass to the model in a prompt.

    This is where vector databases come in. You can use a vector DB like Chroma, Weaviate, Pinecone, and many more to create an index of embeddings and perform similarity searches on documents to determine what context to pass to a model for the best results.
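
    A minimal sketch of that retrieval step, assuming Chroma with its default embedding model (the collection name and documents are made up):

    import chromadb

    client = chromadb.Client()
    collection = client.create_collection("notes")

    # Index documents as embeddings (Chroma embeds them with its default model).
    collection.add(
        documents=["Notes on vector databases", "Notes on baseball contracts"],
        ids=["note-1", "note-2"],
    )

    # Retrieve the most similar document to stuff into the prompt as context.
    results = collection.query(query_texts=["How do vector databases help LLMs?"], n_results=1)
    context = results["documents"][0]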


    Published

  • Automation Reduces Marginal Cost of Nonautomated Tasks

    In a recent study, researchers looked at the effects of automation in a supermarket. They found that by automating the process of collecting payment, productivity on the non-automated task of scanning items increased by 10%. An explanation for the improvement is that automation enabled specialization, and specialization reduces the marginal cost of the other tasks, which increases effort and therefore productivity.

    This makes intuitive sense—being able to offload tasks frees up time to focus on the other ones.

    What I find even more interesting is the inverse implication—manual tasks increase the marginal cost of all other non-automated tasks. In other words, time and costs are higher for every manual task introduced into a workflow.

    Read Automation Enables Specialization: Field Evidence.



    Published

  • How Langchain Works

    As it turns out, combining large language models can create powerful AI agents that respond to and take action on complicated prompts. This is achieved by composing models and tools, with an overall language model mediating the interaction.

    For example, a langchain agent uses the following prompts to string together multiple “tools”, which alter how it responds based on the user’s input:

    Assistant is a large language model trained by OpenAI.
    
    Assistant is designed to be able to assist with a wide range of tasks, from answering simple questions to providing in-depth explanations and discussions on a wide range of topics. As a language model, Assistant is able to generate human-like text based on the input it receives, allowing it to engage in natural-sounding conversations and provide responses that are coherent and relevant to the topic at hand.
    
    Assistant is constantly learning and improving, and its capabilities are constantly evolving. It is able to process and understand large amounts of text, and can use this knowledge to provide accurate and informative responses to a wide range of questions. Additionally, Assistant is able to generate its own text based on the input it receives, allowing it to engage in discussions and provide explanations and descriptions on a wide range of topics.
    
    Overall, Assistant is a powerful system that can help with a wide range of tasks and provide valuable insights and information on a wide range of topics. Whether you need help with a specific question or just want to have a conversation about a particular topic, Assistant is here to assist.
    

    Part 2:

    TOOLS
    ------
    Assistant can ask the user to use tools to look up information that may be helpful in answering the users original question. The tools the human can use are:
    
    > Current Search: Useful for when you need to answer questions about current events or the current state of the world. The input to this should be a single search term.
    > Find Notes: Useful for when you need to respond to a question about my notes or something I've written about before. The input to this should be a question or a phrase. If the input is a filename, only return content for the note that matches the filename.
    
    RESPONSE FORMAT INSTRUCTIONS
    ----------------------------
    
    When responding to me, please output a response in one of two formats:
    
    **Option 1:**
    Use this if you want the human to use a tool.
    Markdown code snippet formatted in the following schema:
    
    ```json
    {{
        "action": string \ The action to take. Must be one of Current Search, Find Notes
        "action_input": string \ The input to the action
    }}
    ```
    
    **Option #2:**
    Use this if you want to respond directly to the human. Markdown code snippet formatted in the following schema:
    
    ```json
    {{
        "action": "Final Answer",
        "action_input": string \ You should put what you want to return to use here
    }}
    ```
    
    USER'S INPUT
    --------------------
    Here is the user's input (remember to respond with a markdown code snippet of a json blob with a single action, and NOTHING else):
    
    {input}
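
    For reference, here is a minimal sketch of wiring up an agent like this with the (2023-era) LangChain API; the tool functions below are hypothetical stand-ins for the Current Search and Find Notes tools described in the prompt.

    from langchain.agents import AgentType, Tool, initialize_agent
    from langchain.chat_models import ChatOpenAI
    from langchain.memory import ConversationBufferMemory

    # Hypothetical implementations of the two tools from the prompt above.
    def current_search(query: str) -> str:
        return "search results for: " + query

    def find_notes(query: str) -> str:
        return "notes matching: " + query

    tools = [
        Tool(name="Current Search", func=current_search, description="Look up current events."),
        Tool(name="Find Notes", func=find_notes, description="Search my notes."),
    ]

    # This agent type uses prompts along the lines shown above to mediate between tools.
    agent = initialize_agent(
        tools,
        ChatOpenAI(temperature=0),
        agent=AgentType.CHAT_CONVERSATIONAL_REACT_DESCRIPTION,
        memory=ConversationBufferMemory(memory_key="chat_history", return_messages=True),
        verbose=True,
    )
    agent.run("What have I written about vector databases?")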
    

    Published

  • A Curiosity Loop Contextualizes Advice

    Sometimes problems you encounter need an outside perspective to help you figure out what to do. A curiosity loop helps contextualize advice from multiple people in a way that makes it far more useful than getting overly generalized advice from one source.

    How to do it

    • Curate who you ask: balance the loop by consulting experts but also someone who knows you really well (experts might not have specific answers for you the way a close friend would)
    • Ask good questions that are specific (with context), solicit rationale, and avoid biases (like a good survey question)
    • Make it lightweight and easy to respond to: lower the cognitive load by giving a list of choices and asking why; let people pick and choose with a quick answer or make space to send something longer
    • Try to get 3 or 4 responses (you might need to email a larger number of people based on the response rate)
    • Process the information and thank them: it feels good to help someone and be heard, sending a genuine thank you can be rewarding in and of itself

    From Ada Chen Rekhi via Lenny’s Podcast.



    Published

  • AI Is the Next Great Interop Layer

    I had previously observed that humans are the great interop layer—we are the glue that fits disparate processes and tools together into usable systems. After using large language models, I’m becoming convinced that they can offload a large amount of the interop cost that currently falls to us. In a nutshell, AI can ‘do what I mean, not what I say’ pretty darn well.

    The latest LLM tools can interpret what we mean far better than before. Natural language interfaces open up the possibility of sloppy interfaces—ones where not everything needs to be specified precisely. This allows for tolerance in the system, making it easier for more things to fit together.

    By contrast, APIs are largely a ‘do what I say’ interface. They require precise steps to complete an action. They require documentation to help a human understand how to use them, and it takes creativity to apply them in ways that solve the problem at hand. Now we have LLMs that can figure out how to make API calls (Zapier NLA API) and map different actions to different methods (langchain tools).


    Published

  • Layering vs Chunking

    When building large complicated things, there are two primary strategies that optimize for different things—layering and chunking.

    Chunking is about taking something large, breaking it up into smaller pieces, and assembling them into one cohesive thing. This is pretty common and intuitive, pairs nicely with “divide and conquer”, and can speed things up, but it tends to be all or nothing (either all the chunks are delivered or nothing ships).

    Layering is also about breaking something large down but in a way that accumulates. The overall project is divided into smaller layers that are independently useful and shipped. This makes it more iterative, delivering value along the way and providing opportunities to adapt to new information.

    Layering is much more desirable for large complicated projects because it’s incremental.

    Problems arise when teams confuse chunking for layering—the team might think they are being fast and incremental, but deliver no value to the user until the end anyway.

    I’ve observed many large-scale projects fail because they used the wrong strategy or because no one recognized that chunking was the only option.

    To help spot these issues, a simple clarifying question to ask is: “At what point can a user actually use this?” (This can also be adapted for internal infrastructure projects: “At what point will this deliver value?”).

    (I read about this somewhere and apply it all the time but I can’t find the source, sorry!)


    Published

  • Prompt Injection Attacks Are Unavoidable

    While large language models are already useful for certain text-based tasks, connecting them to other systems that can interact with the outside world poses new kinds of security challenges. Because it’s all based on natural language, any text can effectively become untrusted code.

    Some examples:

    • Adding a prompt injection attack to your public website that can be picked up by an LLM-enabled tool like Bing search
    • An LLM-enabled personal assistant that can read your email might be prompt-injected simply by sending it an email
    • Data could be exfiltrated from a support ticketing system by sending a prompt-injected message
    • Training data might be poisoned by including prompt injection text via hidden text

    It’s unclear what the solutions to these problems are. You could chain together more AI tools to detect prompt injection attacks. You could build protections into the prompt used internally. You could warn the user or log a message for every action taken and use anomaly detection.
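
    For illustration, here is a minimal (and easily evaded) sketch of the detection idea: scanning untrusted text for known injection phrases before it reaches the model. The phrases and function name are made up.

    import re

    # Known-bad phrases; real attacks will paraphrase around a list like this,
    # which is why none of these mitigations are complete.
    SUSPECT = re.compile(r"ignore (all|previous) instructions|disregard the above", re.I)

    def looks_like_injection(text: str) -> bool:
        """Flag untrusted text (emails, web pages, tickets) before it reaches the LLM."""
        return bool(SUSPECT.search(text))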

    Read Prompt injection: what’s the worst that can happen?


    Published

  • Natural Language User Interface

    One of the superpowers of large language models is that they can “do what I mean” instead of “do what I say”. This ability to interpret prompts can drastically lower the barriers to accessing and interoperating between systems. For example, writing “Send a Slack message to the Ops channel with a list of customers from HubSpot that signed up in the last week” would generate actions that query the HubSpot Contacts API, parse and extract the results, and make another API request to Slack to post to the #ops channel.

    The best example I can find of this is from Zapier’s Natural Language Actions (NLA) API. Zapier is perfectly positioned to enable a jump to universality for natural language UI because it already connects thousands of API providers together.
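
    As a rough sketch of what that wiring could look like, here is the (2023-era) LangChain integration for Zapier NLA; it assumes a ZAPIER_NLA_API_KEY and OPENAI_API_KEY in the environment and that the relevant HubSpot and Slack actions are enabled in Zapier.

    from langchain.agents import AgentType, initialize_agent
    from langchain.agents.agent_toolkits import ZapierToolkit
    from langchain.llms import OpenAI
    from langchain.utilities.zapier import ZapierNLAWrapper

    # Each Zapier NLA action you enable becomes a tool the agent can call.
    zapier = ZapierNLAWrapper()  # reads ZAPIER_NLA_API_KEY from the environment
    toolkit = ZapierToolkit.from_zapier_nla_wrapper(zapier)

    agent = initialize_agent(
        toolkit.get_tools(),
        OpenAI(temperature=0),
        agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
        verbose=True,
    )
    agent.run("Send a Slack message to the Ops channel with a list of customers "
              "from HubSpot that signed up in the last week")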

    NLUI can also be useful as a frontend to a single service. For example, making a chat support bot that can access the content of support docs.


    Published