• Legal Services Has the Highest AI Occupational Exposure

    A recent paper on the economic impact of large language models found that the legal industry has the highest potential occupational exposure to AI, whether through augmentation or substitution.

    What is AI occupational exposure based on?

    The AIOE measure was constructed by linking 10 AI applications (abstract strategy games, real-time video games, image recognition, visual question answering, image generation, reading comprehension, language modeling, translation, speech recognition, and instrumental track recognition) to 52 human abilities (e.g., oral comprehension, oral expression, inductive reasoning, arm-hand steadiness, etc.) using a crowd-sourced matrix that indicates the level of relatedness between each AI application and human ability. Data on the AI applications come from the Electronic Frontier Foundation (EFF), which collects and maintains statistics about the progress of AI across multiple applications. Data on human abilities come from the Occupational Information Network (O*NET) database developed by the United States Department of Labor. O*NET uses these 52 human abilities to describe the occupational makeup of each of the 800+ occupations it tracks. Each occupation can be thought of as a weighted combination of the 52 human abilities. O*NET uses two sets of weights: prevalence and importance.
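
    The shape of that computation can be sketched in a few lines. Everything below (the matrix values, the ability profiles, simple averaging as the aggregation step) is made up for illustration; only the structure follows the description above.

```python
# Toy relatedness matrix: how related each AI application is to each
# human ability (the real one is crowd-sourced over 10 x 52 pairs).
relatedness = {
    "reading comprehension": {"oral comprehension": 0.9, "arm-hand steadiness": 0.0},
    "image recognition":     {"oral comprehension": 0.1, "arm-hand steadiness": 0.2},
}

def ability_exposure(ability):
    # Aggregate an ability's AI relatedness across all applications.
    scores = [row[ability] for row in relatedness.values()]
    return sum(scores) / len(scores)

def occupation_exposure(ability_weights):
    # An occupation is a weighted combination of abilities, so its
    # exposure is the weighted sum of the per-ability exposures.
    return sum(w * ability_exposure(a) for a, w in ability_weights.items())

# Hypothetical ability profiles (standing in for O*NET prevalence/importance).
lawyer = {"oral comprehension": 0.8, "arm-hand steadiness": 0.2}
presser = {"oral comprehension": 0.2, "arm-hand steadiness": 0.8}
```

    With these toy numbers, the language-heavy profile scores higher than the manual one, which is the intuition behind legal services topping the ranking.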

    It seems a bit impractical to gauge how much AI could disrupt an industry from a list of occupational skills assigned to a category. Maybe it makes more sense as a relative comparison on the same scale (telemarketers sit at the opposite end of the spectrum from pressers).

    Read How will Language Modelers like ChatGPT Affect Occupations and Industries


  • LLM-First Programming Language

    There are many barriers to adoption for a new programming language looking to go mainstream. You have to attract a small group of enthusiasts, build an ecosystem of high quality libraries, help new people learn, and eventually grow a talent marketplace.

    Update: Someone built something similar! It’s not quite what I had in mind but it’s definitely a riff on the same idea.

    The rise of LLMs adds another wrinkle: if programming with an AI assistant becomes commonplace, how good the AI is at a given programming language could be a significant benefit (or drawback).

    An LLM-first programming language might leverage AI assistants to make adoption significantly easier. I love the Rust compiler because it is basically a self-contained help system, but it requires study and practice to become productive. Imagine if you could be productive immediately and learn as you go.

    Not only that, an LLM-first programming language might draw on a deep repertoire of libraries, patterns, and problems specifically trained for the language. I imagine this would be much more effective for day-to-day coding tasks than the aggregate output of blog posts that happen to be published on the internet and pulled into a blob of training data. It might even provide better context to other LLM applications so they can give more meaningful help (surely a compiler has more precise data, like the AST, the change, and the types, that would be useful for figuring out what you are trying to do).


  • Capture Output of Chatgpt-Shell in Org-Mode

    I’m messing around with ChatGPT in Emacs and I want to capture the results of a session in org-mode. Since chatgpt-shell is just a buffer (like everything else in Emacs), we can write an elisp source block to grab the contents of the buffer. Even better, we can tell org-mode to wrap the results in a markdown block since the code examples all use code fences (triple backticks).

    #+BEGIN_SRC emacs-lisp :results value :wrap src markdown :cache yes :exports results
    (let ((buf (get-buffer-create "*chatgpt*")))
      (with-current-buffer buf
        (buffer-string)))
    #+END_SRC
    
    ChatGPT> You are an elisp programming expert assistant.
    <gpt-end-of-prompt>
    Great! How can I assist you today?
    
    ChatGPT> How do I use org-mode to run a block that contains the output of another buffer? For example, there is a library that uses comint-mode as a chat interface. I want to embed that into an org-mode document, ideally using org-mode blocks. How do I do that?
    <gpt-end-of-prompt>
    You can use the `src` code block in org-mode to execute commands and capture their output. Here's an example:
    
    First, you need to define the comint-based chat interface as a function that returns its output as a string. Here's a basic example using `comint-send-string` and `comint-show-output`:
    
    ```
    (defun chat-interface ()
      (let ((buf (get-buffer-create "*chat*")))
        (with-current-buffer buf
          (erase-buffer)
          (comint-mode)
          (comint-send-string (get-buffer-process buf) "Hello, World!\n")
          (comint-show-output))
        (with-current-buffer buf
          (buffer-string))))
    ```
    
    This function creates a new buffer called `*chat*`, switches to it, sends a message to the chat interface, shows its output, then returns the buffer's content.
    
    Next, you can use this function in a `src` code block in org-mode like this:
    
    ```
    #+NAME: chat-example
    #+BEGIN_SRC emacs-lisp :results output
    (chat-interface)
    #+END_SRC
    ```
    
    This creates a new named code block called `chat-example` that calls the `chat-interface` function and captures its output as the block's result.
    
    You can now reference this block elsewhere in your org-mode document, include it in exports, etc. For example:
    
    ```
    In this section we're going to use the chat interface to say hello:
    
    #+RESULTS: chat-example
    : Hello, World!
    ```
    
    This would include the output of the `chat-example` block in the exported document.
    
    ChatGPT> please print out the chat so far, formatted in org-mode sytnax
    <gpt-end-of-prompt>
    Certainly! Here's the conversation so far, formatted in org-mode syntax:
    
    ```
    * Conversation with elisp programming expert assistant
    
    ** You, 2021-11-28 10:15
    How do I use org-mode to run a block that contains the output of another buffer? For example, there is a library that uses comint-mode as a chat interface. I want to embed that into an org-mode document, ideally using org-mode blocks. How do I do that?
    
    ** Me, 2021-11-28 10:19
    You can use the `src` code block in org-mode to execute commands and capture their output. Here's an example:
    
    ...
    
    ```
    
    Is this what you were looking for?
    
    ChatGPT> yes
    <gpt-end-of-prompt>
    Great! Let me know if you have any more questions.
    
    ChatGPT> please
    <gpt-end-of-prompt>
    Sure, how can I help you?
    
    ChatGPT> in emacs lisp, how do you get the contents of a buffer as a string?
    <gpt-end-of-prompt>
    You can get the contents of a buffer as a string in Emacs Lisp using the `buffer-string` function. You can call this function with the buffer you want to retrieve the contents from as an argument. Here's an example:
    
    ```
    ;; Get the contents of the current buffer
    (buffer-string)
    
    ;; Get the contents of another buffer by name
    (with-current-buffer "some-buffer-name"
      (buffer-string))
    ```
    
    The first example retrieves the contents of the current buffer, while the second example retrieves the contents of a buffer named "some-buffer-name".
    
    `buffer-string` returns the entire text of the buffer as a string, including any properties and text properties. If you only want the text, you can use the `buffer-substring` function instead. Here is an example:
    
    ```
    ;; Get the text of the current buffer
    (buffer-substring (point-min) (point-max))
    
    ;; Get the text of another buffer by name
    (with-current-buffer "some-buffer-name"
      (buffer-substring (point-min) (point-max)))
    ```
    
    In this example, `buffer-substring` returns the text of the buffer as a string, excluding any properties and text properties. The arguments `(point-min)` and `(point-max)` specify the region to be returned, which in this case is the entire buffer.
    
    I hope this helps!
    
    ChatGPT> buffer-string is returning text from a buffer that looks like an S-expression. How do I get just the text?
    <gpt-end-of-prompt>
    If `buffer-string` is returning text that looks like an S-expression, it's possible that the buffer in question has some Lisp code in it. In that case, you can use the `buffer-substring-no-properties` function to retrieve the text content without any properties or S-expression formatting. Here's an example:
    
    ```
    (with-current-buffer "some-buffer-name"
      (buffer-substring-no-properties (point-min) (point-max)))
    ```
    
    This will return the text content of the buffer as a string, without any properties or formatting.
    
    You could also use the `subseq` function to extract just the text content from the `buffer-string` output. Here's an example:
    
    ```
    (subseq (buffer-string) 2 -2)
    ```
    
    This will return the text content of the buffer without the first and last parentheses of the S-expression.
    
    ChatGPT>
    

  • Using ChatGPT With Emacs

    There are a few packages and libraries that are being built to use ChatGPT along with Emacs.

    This page is becoming fairly popular in search results so please let me know if there are other libraries you love or think I should try out.

    Recommended

    I use the following libraries myself for both writing code and prose (more prose than code these days).

    • GPTel focuses on a chat client session within an Emacs buffer and on sending text from the buffer (like a selected region or line). It also uses a transient menu (a la magit) for on-the-fly configuration like custom prompts and model selection.
    • ChatGPT Shell is an eshell-like shell for ChatGPT and DALL-E that facilitates interacting with a chat session. By default its prompt is configured for helping with writing code.

    Best of the rest

    An incomplete list of other libraries I’ve come across.

    • Emacs OpenAI has a collection of packages for using ChatGPT and dall-e
    • ChatGPT.el was one of the first, but hasn’t been updated to use the official API yet
    • ChatGPT Arcana is a more integrated approach including prompt from region
    • org-ai combines OpenAI whisper and GPT so you can talk to Emacs (also has a nifty begin_ai block for org-mode)
    • leafy-mode adds completions to org-mode using ChatGPT

    Querying your own data

    Most of the libraries mentioned rely on the user to send text to a chat session. However, they don’t have ways to index your own data to provide context to a chat session or for single-shot completions.

    For example, I wrote a LangChain agent for helping me chat with my notes (also named org-ai, oh well!). While it’s one-of-one software and I have no intention of making it generic, you could adapt it to your own use.



  • Context Is Needed for Practical LLM Applications

    One of the limitations of large language models is that they are often missing context when responding. LLMs like ChatGPT (GPT-3.5 and GPT-4 at the time of writing) don’t allow fine-tuning and can only take in ~4,000 tokens (roughly 3,000 words) as part of the prompt. That’s not nearly enough context to be specific to your application. For example, you might want an AI coding tool to help you add a feature to your codebase, but the codebase is likely much larger than the prompt allows.

    To fit more information into prompts for context, LLMs could benefit from a cheatsheet, generated for the prompt. Combining tools like semantic search with an LLM could allow for better applications that are more specific to the user’s domain. For example, when asking an AI coding tool to add a function, the search part could load in all of the related modules or just the type signatures rather than the entire codebase.
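
    A minimal sketch of the idea, using a toy bag-of-words similarity in place of a real embedding model (the function names and the crude token counting are illustrative assumptions): rank candidate context snippets by relevance to the query, then pack the best ones into the prompt budget.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; a real system would call an embedding model.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def build_cheatsheet(query, snippets, token_budget=50):
    # Rank candidate snippets by similarity to the query, then pack the
    # most relevant ones into the prompt until the budget runs out.
    query_vec = embed(query)
    ranked = sorted(snippets, key=lambda s: cosine(query_vec, embed(s)), reverse=True)
    picked, used = [], 0
    for snippet in ranked:
        cost = len(snippet.split())  # crude stand-in for a tokenizer
        if used + cost <= token_budget:
            picked.append(snippet)
            used += cost
    return "\n".join(picked)
```

    For the coding example, the snippets would be module summaries or type signatures rather than raw source, so far more of the codebase fits inside the budget.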

    Read Cheating is all you need.


  • Remote Work Permanence

    It’s becoming clear that remote work isn’t going anywhere. A large portion of the workforce continues to work from home. Return-to-office efforts have stagnated. Office real estate value is plummeting.

    Employers are shifting from remote work enablement to remote work permanence.

    With a permanent part of the workforce always working from home, there are important challenges ahead:

    • Effective management of remote teams
    • Compliance everywhere all at once
    • Accommodating worker preferences

    Effective management of remote teams

    To be an effective people leader in a world of remote work permanence, managers need to be significantly more effective at building an environment where people can do their best work.

    Trust is essential, especially in a remote-first organization. Being overly worried about people not putting in the hours or working two jobs simultaneously is not productive.

    Unfortunately, there is reason for employees to feel distrustful. Massive layoffs in tech are stoking fear without improving company performance. Companies are pausing promotions and raises and eliminating employee resource groups. Some are forcing workers to return to the office or be fired. The Overton window of employment is shifting.

    What needs to happen?

    Managers need to reorient around outcomes, not appearing busy. This challenges preconceived notions about management and reveals how reliant we were on seeing people working rather than understanding what jobs entail and which outcomes should be used to measure performance.

    There is no software solution here. My guess is that there will be a massive retraining of managers needed and a change to more asynchronous working styles to fit an increasingly remote workforce. The good news is these changes benefit every employee by making a more equitable and effective working environment.


    Compliance everywhere all at once

    The rules for employment and tax differ by state, city, and even county. Remote work led to a dramatic increase in the number of locations a business operates in, but the size of the People, Finance, and Legal teams remained the same.

    There is a step change in the amount of compliance work and complexity that businesses need to keep up with. In the US, the 50 states might as well be 50 countries, given all of the rules and changes businesses are expected to comply with.

    This is the problem that Mosey is addressing.


    Accommodating worker preferences

    With the rise of return-to-office policies, it’s clear what management’s preferences are. As I mentioned earlier, though, it will never be all or nothing: competing for top talent now necessitates accommodating remote work. Similarly, not every organization or worker should be remote.

    That leaves us with the final challenge: worker preferences. Some people prefer working in an office. Some people prefer working remotely. Some will change their mind.

    Self-selection might be one way this plays out. People with a preference for remote work choose a company that is fully remote. People who prefer in-office work join a fully co-located organization.

    Of course, as a business grows larger it will need to accommodate more preferences. I imagine the ultimate solution for larger organizations is hybrid, but rather than individuals splitting their time between home and office, the decision of how the team works will be pushed down to teams and their managers.


  • Turning Cost Centers Into Profit Centers

    A successful business playbook I keep coming across is turning cost centers into profit centers. That is, taking a problem that costs money for one business to solve, solving it, and then offering the solution to other businesses that need it.

    The best example is AWS. Amazon needed high-quality infrastructure to reliably host its ecommerce store. By making that infrastructure available to everyone else, it ended up building a business that might end up being bigger than the retail operation itself.

    What are the cost centers in your business? If you concluded that you need to build it to properly solve it (build, buy, partner), might there be others interested in buying it?



  • Product Debt

    Product debt is when promises made about functionality exceed what the product can actually do. When this happens there are only two things that can be done: 1) make up for it with manual work or 2) commit to building the functionality. If you stop doing 1), the customer will be disappointed. If you keep adding product debt and never do 2), the product will never keep up and you’ve inadvertently ended up building a services business.

    The ideal amount of product debt, like tech debt, is greater than zero. It can be helpful to over-promise when you are doing customer development and figuring out what is going to sell. However, keeping product debt in mind (specifically that it must be paid down) means knowing a technical solution is possible and keeping what’s promised within the scope of what the technology and team are capable of.

    One of the ways product debt increases without you realizing it is in support and operations. As customers come across challenges, it is in the best interest of the team to help. Through the course of helping, they may do things that the product is not capable of: working around an edge case here, handling something on behalf of the customer there, and so on. For this reason, product and engineering should spend time with support and operations to see how they are making up for shortcomings in the product. You might not be happy with the product debt you are incurring without knowing it.


  • Answer Required, Not Necessarily a Good One

    There are many times in business when an answer is required. A customer asking you a question about your product. An investor asking you about the market. A sales lead asking you about competitors.

    A good answer is seldom necessary. It’s enough that you listened and answered the question. They may not be happy with your answer but satisfied to move on.

    This is an important lesson because it can feel like a catastrophe when you’re asked a tough question or something you haven’t prepared for. Just keep in mind that in a very large percentage of these situations, people will be okay with the fact that you merely answered their question.



  • A Non-Nullable Value Reduces Complexity by Half

    A small thing I realized about programming is that nullable values carry a sneaky amount of complexity in code. Whether or not your programming language forces you to deal with it (like Rust does), there exist two possible states for every nullable value. If that value interacts with more nullable values, complexity grows exponentially: n interacting nullable values yield 2^n combinations to consider.

    Conversely, changing a nullable value in a function’s argument to a non-nullable value has the effect of cutting complexity in half. It’s worth questioning why a function or code path needs a nullable value and if it can be handled elsewhere or removed altogether to save yourself incidental complexity.
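
    A minimal Python sketch of the idea (the functions are hypothetical): with two nullable parameters, a function has four input states to handle; requiring both arguments collapses that to one.

```python
from typing import Optional

# With two nullable parameters there are four input states to handle.
def full_name_nullable(first: Optional[str], last: Optional[str]) -> str:
    if first is None and last is None:
        return "anonymous"
    if first is None:
        return last
    if last is None:
        return first
    return f"{first} {last}"

# Requiring non-null arguments leaves exactly one state; callers resolve
# missing data once at the boundary instead of every function re-checking.
def full_name(first: str, last: str) -> str:
    return f"{first} {last}"
```

    Pushing the null handling to the boundary means only one function pays the complexity cost instead of every function the value flows through.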


  • SaaS Revenue per Employee Benchmarks for B2B Companies

    A useful metric for SaaS businesses is the amount of Annual Recurring Revenue (ARR) per employee. A high revenue per employee implies an efficient business and is a proxy for its ability to break even or become profitable (the majority of expenses for tech companies is people). As the business starts to scale, revenue per employee is expected to increase.

    The table below shows the result of a survey by SaaS Capital looking at the median ARR per full time employee in 2022 for venture backed companies.

    ARR        Median revenue per employee
    < $1MM     $25,750
    $1-3MM     $59,108
    $3-5MM     $92,327
    $5-10MM    $107,143
    $10-20MM   $128,961
    > $20MM    $170,950

    The table below shows the result from Capchase grouped by sector in H2 2022.

    Sector                       Median revenue per employee
    Consumer goods & services    $45,177
    Marketing & media            $61,082
    Corporate services           $51,042
    Financial solutions          $40,335
    Data & cloud                 $52,259
    Industrial solutions         $67,122
    Real estate & construction   $69,110
    Health & life sciences       $75,326
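
    The metric itself is just ARR divided by full-time headcount. A toy helper for comparing a company against the SaaS Capital medians above (the bucket boundaries are my reading of the table):

```python
# Median ARR per FTE by ARR bucket, from the SaaS Capital survey above.
# Each entry is (exclusive ARR ceiling, median revenue per employee).
MEDIANS = [
    (1_000_000, 25_750),
    (3_000_000, 59_108),
    (5_000_000, 92_327),
    (10_000_000, 107_143),
    (20_000_000, 128_961),
    (float("inf"), 170_950),
]

def revenue_per_employee(arr, headcount):
    return arr / headcount

def versus_benchmark(arr, headcount):
    # Returns the company's ARR per employee alongside the survey
    # median for its ARR bucket.
    rpe = revenue_per_employee(arr, headcount)
    median = next(m for ceiling, m in MEDIANS if arr < ceiling)
    return rpe, median
```

    For example, a $4MM ARR company with 40 people has $100,000 of ARR per employee, above the $92,327 median for its bucket.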

    See also:

    • SaaS churn benchmarks are another useful measurement for figuring out the efficiency of a SaaS business (if churn is high, the amount of growth required to replace old customers goes way up)
    • Burn multiples along with revenue per employee can be a short-hand way of keeping burn in check (e.g. set a headcount cap until you hit the revenue goal to stay on track)

  • Eigengrau Is the Color We See in Total Darkness

    The color we see in complete darkness is not pure black; instead, we see a dark gray called Eigengrau. Seeing Eigengrau instead of black has been studied and replicated many times, but there does not seem to be a settled explanation for why this is the case. Perhaps it is baseline noise from the optic nerves.

    An uplifting way to think about this is that the darkness we see is never devoid of light.