Welcome to my notes! I regularly publish notes to better understand what I’ve learned and to explore new ideas. I would love to hear from you—leave a comment or reach out on Bluesky @alexkehayias.com or find me on LinkedIn.
Notes From the Field
-
How to Lead a Meeting
Meetings are not useful unless they lead to an outcome or clearly defined tasks needed to reach an outcome. If you want influence over the outcome, it’s always best to lead.
What I’ve found works best when leading a meeting:
Before the meeting
- Send the agenda a day before the meeting. Break down the sections strategically so that you can cover the most important details as a group or give you the best chance to steer the conversation to the outcome you believe is best.
- Send a pre-read that already answers the question you are there to discuss.
During the meeting
- Before anyone joins, put a slide on the screen with the agenda. Sometimes you want to include key points you want someone to take away from the discussion.
- Wait up to 3 minutes after the start time for people to join; at 2 minutes past, ask if the group is waiting on anyone who must attend.
- Assign a note taker or record the meeting and let people know you will send it afterwards.
- Short intros. Ask team leads to introduce their team. Use the phrase, “Tweet length intro”.
- Go through the agenda items one by one and be clear about what each topic is. If the group veers off, politely say, “We’re getting off topic, I’ll make a note to follow up about this later, is that okay?” No one will say no unless it’s suddenly very important.
- With 5-10 minutes remaining (more for longer meetings), summarize the action items that arose and assign exactly one owner to each by name, ideally with a date. Write them down so there is an agreement. Schedule any follow-up call on the calendar.
- Send a follow-up summarizing the outcome and action items.
Published
-
Malleable Software
Today’s applications provide closed systems that cannot be modified or extended to suit the needs of a user. Settings, plugins, modding, and even open-source don’t address these limitations because they create sharp cliffs between a worker’s power and the skill required for tailoring.
The solution is malleable software—a philosophy of software as reusable, user-configurable tools that offer a “gentle slope” to tailoring them to the user’s specific needs. Malleable software leads to powerful systems that can be adapted to the environment of the user with varying degrees of skill.
Read Malleable software - Restoring user agency in a world of locked-down apps from Ink & Switch.
See also:
- One of my favorite examples is that Emacs is the ultimate editor building material
- While this is akin to one of one software, I like that the authors insist that community and collaboration are important to malleable software
- As a counterpoint, applications are almost perfect for commercialization and that makes popularization of malleable software more of a cultural innovation than a technological one
Published
-
DB Connections in Healthchecks Is a Bad Idea
Here’s how I’ve seen DB connections/checks in a load balancer healthcheck become a bad idea:
- A healthcheck timeout causes servers to be replaced, leading to hard downtime: all instances are pulled out of rotation and new ones fail to start because latency to the DB was temporarily greater than the healthcheck timeout (3 seconds for us currently!)
- Relying on a healthcheck to determine DB health takes time: unhealthy threshold * interval before taking action (120s, compared to crashing at start time, which would surface the problem right away)
- Healthcheck-induced DB query floods amplify a DB issue as more instances are spun up, all trying to create new connections and execute queries at once. The solution is usually hopping onto the DB server and killing all connections, not trying to spin up more server instances, which could make it worse.
- A DB restart causes healthcheck failures and server instances to cycle, taking healthcheck threshold + Fargate provisioning time to fully recover
What to do instead?
Monitor database health separately. Identify failure modes a load balancer can solve by pulling unhealthy server instances out of rotation (unrecoverable crash, hardware failure) and ones where it can’t (database cluster issues, latency) then tune healthchecks accordingly to speed up recovery time.
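As a rough sketch of what this separation can look like (a hypothetical, standard-library-only Python example, not the actual service from the incidents above), the load balancer healthcheck only reports whether the process is up and serving, and database health is left to separate monitoring:

# Shallow healthcheck: answers "is this process up and serving requests?"
# and nothing else. A slow or restarting database will not cause the load
# balancer to pull every instance out of rotation.
from http.server import BaseHTTPRequestHandler, HTTPServer

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/healthz":
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"ok")
        else:
            self.send_response(404)
            self.end_headers()

if __name__ == "__main__":
    # Database health is watched by a separate monitor/alert, not here.
    HTTPServer(("0.0.0.0", 8080), HealthHandler).serve_forever()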
Published
-
Unsubscribe Links Are a Sneaky Phishing Vector
Maybe you want to be nice. Maybe you don’t want to hurt another well-meaning business' email delivery because you know from experience how painful debugging deliverability issues can be. So you click the unsubscribe link or you click the unsubscribe button that Gmail handily derives from the email.
Then you land on a phishing site. Maybe you enter your email to confirm the unsubscribe, like many an unsubscribe page has trained us to do. Maybe it asks you to log in. Maybe you just gave your password to the phisher or confirmed that you do indeed have a FooCorp account.
Sad to say, but it’s safer to mark-as-spam. There are no penalties for the receiver. You can always undo it later.
See also:
Published
-
Ask AI to Ask You Questions
A surprisingly useful technique for getting the most out of AI tools is to prompt the model to ask you questions that would help it complete the task you want.
For example: “We’re working on building a report about foo. Please ask me the questions you need answers to in order to write a report about foo.”
This works well because LLMs are generally better at writing prompts for LLMs.
You can pair this technique with asking it to write a prompt for you. “Based on this, write a prompt for an LLM to write a report about foo.”
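Here’s a minimal sketch of how the two steps could be wired together with the OpenAI chat completions client (the same API used in the temperature examples elsewhere in these notes); the topic, model name, and answers are placeholders:

from openai import OpenAI

client = OpenAI()

def ask(content):
    response = client.chat.completions.create(
        messages=[{"role": "user", "content": content}],
        model="gpt-4o-mini",
    )
    return response.choices[0].message.content

# Step 1: have the model interview you before it does the work.
questions = ask(
    "We're working on building a report about foo. "
    "Please ask me the questions you need answers to in order to write it."
)
print(questions)

# Step 2: fold your answers back in and ask it to write the prompt.
answers = "(your answers to the questions above)"
prompt = ask(
    "Questions:\n" + questions + "\n\nAnswers:\n" + answers + "\n\n"
    "Based on this, write a prompt for an LLM to write a report about foo."
)
print(prompt)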
See also:
Published
-
AI Agent Design Patterns
It’s still early days for building user applications with AI, but patterns are starting to emerge as the industry progresses from LLM workflows to AI agents.
Here I’m cataloging different UI and UX patterns that I find to be useful.
Checkpoints
Examples: Replit, Cursor
Best for: applications that make multiple changes all at once in response to the user, like writing code, editing a document, or updating a SQL query
When working on a shared artifact with an AI agent, the user is building or iterating on existing output from an AI agent. As the user prompts for changes, these are effectively destructive edits and, if this were typical software, the user would expect a way to undo these edits.
With AI agents, “undo” is more ambiguous. Does that mean undo the last change the agent made? What if there were many edits made all at once?
Checkpoints provide a more explicit way to rollback changes. Explicit because the unit of work is the response from a prompt, not an individual change.
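A rough sketch of the idea (all names here are hypothetical): snapshot the artifact before each prompt is applied, so rolling back restores the state before an entire agent response rather than undoing an individual edit.

# Assumes run_agent returns a new, edited artifact rather than mutating in place.
checkpoints = []

def apply_prompt(artifact, prompt, run_agent):
    # Snapshot the artifact as it was before this prompt's changes.
    checkpoints.append((prompt, artifact))
    return run_agent(artifact, prompt)

def rollback():
    # Undo the whole last response by restoring the pre-prompt snapshot.
    _prompt, previous = checkpoints.pop()
    return previous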
Approvals
Examples: Goose, Claude Code
Best for: applications with broad-based control and hard-to-undo side-effects, like controlling a web browser or accessing the command line of the user’s computer.
Approvals are important when the user grants access to a powerful environment from which the agent takes action on their behalf. For example, an agent that has full access to the user’s computer should probably ask before deleting anything as the consequences can be severe.
The AI agent detects the sensitive action (usually a tool call or MCP call) and prompts the user for an approval before proceeding.
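A minimal sketch of that gate (the tool names and the run_tool function are made up): anything on the sensitive list requires explicit confirmation before it executes.

SENSITIVE_TOOLS = {"delete_file", "run_shell_command"}

def execute_tool_call(name, args, run_tool):
    # Ask the user before running anything with hard-to-undo side-effects.
    if name in SENSITIVE_TOOLS:
        answer = input(f"Agent wants to run {name} with {args}. Approve? [y/N] ")
        if answer.strip().lower() != "y":
            return "Tool call rejected by user."
    return run_tool(name, args)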
Escalations
Examples: ???
Best for: autonomous agents that perform actions on behalf of a user where reliability, accuracy, or completion are the highest priority
When an autonomous agent runs into an exception, the best thing to do might be to escalate to a human to take over rather than proceed. For example, an agent responding to a support inquiry might decide it’s unable to answer the customer’s question and escalate to a human agent.
The AI agent needs to detect the exception (from the example: checking the logprobs of the response, or an LLM-as-judge scoring the response below an acceptable threshold) and then have a mechanism to escalate to another system.
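A sketch of what that check might look like (the threshold, scoring, and handler names are all hypothetical): score the draft response, then either send it or hand off to a human.

CONFIDENCE_THRESHOLD = 0.7

def reply_or_escalate(draft, judge_score, send_reply, escalate_to_human):
    # judge_score is assumed to be a 0-1 confidence, e.g. from an LLM-as-judge
    # or derived from token logprobs as described above.
    if judge_score < CONFIDENCE_THRESHOLD:
        return escalate_to_human(draft)
    return send_reply(draft)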
Published
-
CSM to Customer Benchmark
Based on data from Gainsight, the average number of accounts managed per customer success manager is 144 for low-touch businesses with less than $10k in annual contract value (ACV).
Annual Contract Value    Avg accounts managed
> $100k ACV              22
$10k-$100k ACV           49
< $10k ACV               144

See also:
Published
-
Axum and Dynamic Dispatch
I kept getting opaque rust compile errors when writing an axum API handler. Like this:
the trait bound `fn(axum::extract::State<Arc<std::sync::RwLock<AppState>>>, axum::Json<ChatRequest>) -> impl Future<Output = axum::Json<ChatResponse>> {chat_handler}: Handler<_, _>` is not satisfied
And:
the trait `Handler<_, _>` is not implemented for fn item `fn(State<Arc<…>>, …) -> … {chat_handler}` = help: the following other types implement trait `Handler<T, S>`: `MethodRouter<S>` implements `Handler<(), S>` `axum::handler::Layered<L, H, T, S>` implements `Handler<T, S>`
Turns out my handler code utilized dynamic dispatch using trait objects and that was incompatible with Axum’s async implementation.
See if you can spot the issue:
let note_search_tool = NoteSearchTool::default();
let tools: Option<Vec<Box<dyn ToolCall>>> = Some(vec![Box::new(note_search_tool)]);
let mut history = vec![];
chat(&mut history, &tools).await;
The fix is actually straightforward, but the opaque error messages made this difficult to track down. By default, a trait object is not thread-safe. I needed to add some additional trait bounds to the usage of dynamic dispatch.
let note_search_tool = NoteSearchTool::default();
let tools: Option<Vec<Box<dyn ToolCall + Send + Sync + 'static>>> =
    Some(vec![Box::new(note_search_tool)]);
Adding a type alias cleans up these type signatures.
type BoxedToolCall = Box<dyn ToolCall + Send + Sync + 'static>;

let note_search_tool = NoteSearchTool::default();
// Box::new still constructs the value; the alias just shortens the type.
let tools: Option<Vec<BoxedToolCall>> = Some(vec![Box::new(note_search_tool)]);
Published
-
What Does It Mean to Be AI Native?
With the capabilities of large language models getting more useful for real work, the pressure is on to incorporate them everywhere. I’m seeing an increase in loud voices proclaiming people and businesses must become “AI Native” if they are to survive.
While I wouldn’t put it in such absolute terms, competitive people who aim to do great work ought to take notice—moments of rapid progress and change are rare.
But what does it mean to be AI native?
Being AI native means incorporating AI into the foundation of how you create and positioning yourself to take advantage of rapid progress in AI. For individuals, that means augmenting how they work to utilize AI to increase their productivity: improved efficiency, increased output, but also solving problems that were previously intractable due to limited resources. For businesses, that means building the culture and infrastructure to use AI safely, automating by default, and applying new techniques to solve customer problems faster and more completely (no, this does not mean you should build a damn chat bot).
The individual and the business go hand-in-hand. It’s going to be difficult for a business to become “AI native” if employees don’t enthusiastically engage with AI (it’s difficult to build an intuition of what it can do). Since this requires a change in culture, larger organizations will struggle while startups will succeed (this is an advantage we shouldn’t squander!).
In practice, I think it looks like this:
- Problem solving starts with an AI co-pilot or AI agent to rapidly get up to speed and explore the solution search space. For engineers that means building incremental improvements with AI-powered autocomplete or full features using an agent. For designers that means prompting to explore different solutions all at once and then refining it, before passing it to engineering with all the html and css already written.
- Recurring tasks are prototyped in workflow tools like n8n or Dify before being applied to everything. If you run into a problem trying to automate it go to item 1. For example, meeting follow-ups, lead nurturing, support, monitoring, and standard operating procedures.
- Internal tools and systems become significantly larger (probably larger than the customer facing application) and the internal platform provides access to data and actions that adhere to business rules (for safety, security, and compliance reasons). These are designed primarily for use with other AI-powered tools, workflows, and one-off applications using code written by (surprise!) other AI-assisted workflows.
- The product and experience delivered aims to be a complete solution that is customized for each customer so that they are more directly paying for outcomes rather than software (while avoiding the infinite butler problem). The marketing of AI doesn’t matter so much as the solution delivered taking advantage of AI to get there faster and more completely.
- More one of one software is created by individuals specifically for their needs and preferences because LLMs significantly lower the effort to payoff ratio (as evidenced by the rise of “vibe coding”).
- Everything else that can’t be automated or stitched together in the day-to-day of running the business is sped up with faster communication. For example, voice dictation like Wispr Flow, progressive summarization in email (Gemini in Gmail), messaging (Slack AI), and documentation (Notion AI), and generative AI to respond quickly.
What else am I missing?
See also:
- Lump of labor fallacy shows that we won’t be losing jobs to AI, but the jobs will change (as this post I think demonstrates)
- It’s hard to automate and build systems if you don’t have experience doing it—past experience is a repertoire not a playbook
Published
-
Typed Languages Are Best for AI Agents
Typed languages should be the best fit for useful AI agents. Context is needed for practical LLM applications and type systems provide a ton of context. Compiling the code provides a short feedback loop that can help the agent.
Strongly typed languages like rust are even better. Not only do you get great compiler errors for the agent to incorporate, you can easily tell when LLMs are subtly wrong. For example, when an AI agent was writing a function that used a library, it wrote it based on an older version so it didn’t compile.
See also:
- Static types make it easier to work on projects sporadically when you don’t have time to page the whole codebase into your working memory
- Static types might also help prevent mushy systems as AI contributes more to large codebases
- I tried out goose coding AI agent on some end-to-end features which went surprisingly okay
- More agents are coming as the market gets flooded with AI employees
Published
-
Install a PWA in MacOS
Apple keeps making it more and more difficult to “install” a PWA. At time of writing, there is no longer a “Share” button in the navigation bar so you have to go menu diving to find how to install a PWA now…and by install I mean add it to the Dock—whatever that means.
- Navigate to the URL of the PWA in Safari
- Go to File -> Share
- Select “Add to Dock”
Published
-
Goose Coding AI Agent
I’ve been testing out goose, an AI agent for writing code that runs on your machine instead of as an IDE co-pilot.
Writing features for my personal indexing service (a rust codebase), I learned:
- For small, additive, self-contained features, it works well. For example, spawning a long-running task that used to be blocking, adding a client-side PWA service-worker, and adding a test function.
- Sometimes I found that goose would replace a line rather than add it—when creating a new module it overwrote a line in lib.rs which caused another module to no longer be linked and fail to compile. This also happened when adding a dependency.
- One time it overwrote a file and truncated it and never finished outputting the rest.
- Responds well to guidance about correcting issues. For example, when implementing the server side of push notifications it used a library incorrectly (probably an old version), but when I provided the readme example as context, it fixed it up.
- Typed languages are probably best for AI agents
Published
-
Prospects Will Thank You for Disqualifying Them
When I started doing founder-led sales, I thought it was my job to find a way to sell to anyone who got in touch and scheduled a meeting. When you have no customers and no sales, it’s natural to want to sell to anyone. This is a mistake.
Qualifying might sound awkward at first. There are some questions to ask and answers you are listening for. I worried that it wouldn’t sound natural.
Then I learned that disqualifying someone from buying your product is a good thing. So much so that prospects will go so far as to say, “Thank you for your honesty”, when I tell a tiny company with one employee that our product is overkill for them. Not only does this buy some good will because you didn’t waste their time, it leaves the door open for them to come back to you in the future because you earned some trust. (I also recommend giving them very simple signs of when they should get back in touch!).
See also:
- Discovery questions in sales calls should feel consultative
- Customer success overcomes technical hurdles but not buyer mismatches
- Selling something to the wrong customer leads to churn
Published
-
The Bitter Lesson
Ironically, one of the most efficient strategies for building with AI is to wait for better models.
Researchers try to address shortcomings of AI models by constraining them from general purpose tools to specific purpose tools. The exact opposite approach eventually obviates their work as other researchers improve models by scaling compute rather than through better algorithms and clever constraints.
This is relevant to today’s approach to AI using large language models. The models improve incredibly fast and one side-effect of that is foundation model providers regularly wipe out companies trying to build specific applications once the model becomes good enough.
Read The Bitter Lesson by Rich Sutton and AI Founder’s Bitter Lesson from Lukas Petersson.
See also:
- AI puts a higher premium on unique knowledge but the area of knowledge narrows over time as models get generally better at more things
- Gödel Incompleteness For Startups
Published
-
Why I Like Ring Binders - Plotter Notebook
The Plotter notebook from Designphil is a minimal ring binder for planning and writing. I use the A5 size for work and journaling in addition to org-mode for managing tasks and writing permanent notes.
Being a ring binder allows for customization as you go. Instead of worrying about leaving space to add on to something you just wrote, you can open up the rings and place another page anytime in the future. You can add different page templates as needed using off-the-shelf refills (the Plotter refill paper is quite good) or even print your own (I recommend a standard size like A5 if you plan to do this).
Being a ring binder means pages lay flat. This is such a useful feature not only for writing comfort (which is very important) but also for referencing the contents at a glance while it sits on your desk as you work.
Writing with pen and paper is a welcome break from constant screen time. While far less efficient, research suggests analog writing is more effective for encoding information into long-term memory. I feel more focused because I have to fully concentrate on writing compared to typing.
See also:
Published
-
AI Browser Automation
An incomplete list of AI-powered web browser automation and AI agent projects.
Published
-
N8n for Automation
I started trying out n8n for setting up automated workflows.
I tried setting up a basic HubSpot filter to collect all deals in a certain stage. Unfortunately, the API doesn’t return the names of things, just IDs, so I can’t tell by looking at it what the pipeline or the stage is because those are customized to our instance of HubSpot. This would probably not be possible for a non-technical person to do.
How is it different than Dify?
n8n has out-of-the-box integrations with many B2B SaaS tools like HubSpot. It also provides triggers like cron jobs and webhooks, which Dify does not.
Notes on usage:
- Quirky UI: double click to edit, triple pane layout when editing with dismiss button on the left, when to use an expression or not for a field value, tools have to be linked visually to an LLM node
- When working off of large lists, speed up the feedback loop by only executing a node once while you test it out by going to Settings -> Execute Once (don’t forget to flip it back)
- Clicking the test button doesn’t actually seem to execute the full flow
- To extract fields from a large API response, use the Edit node (horribly named)
- Tools can only be used in conjunction with a language model (e.g. SERP can’t be used to make arbitrary search queries in a workflow)
- HubSpot has a horrible API and the built-in integration in n8n does not make it any easier (e.g. you need to map IDs yourself by making multiple API calls and using a Merge node to join them)
- Scheduled tasks don’t run until you activate the workflow
Published
-
Ink Reviews
This is my personal list of ink reviews based on my usage and preferences. Black ink unless otherwise noted.
Caran d’Ache 0.7mm (parker style)
Very smooth and dark on the page, which makes writing with it very clear. Ink takes awhile to dry so smudging happens frequently if I’m not careful.
Ohto Ceramic Gel Pen Refill 0.5mm (PG-M05NP, parker style, needle tip)
Writing is not consistently smooth and sometimes needs more pressure on the paper to write. Drying is relatively fast but heavier spots will still smudge a few seconds after writing. I would prefer a 0.7mm point but at time of writing, they do not make one.
Ohto Flash Dry Gel Pen Refill 0.5mm (PG-105NP, parker style, needle tip)
Much smoother than the Ohto Ceramic roller ball refill and the same dry time. Definitely recommend these over the PG-M05NP.
Ohto Ballpoint Refill 0.7mm (PS-107NP, parker style, needle tip)
Smooth writing but the ink is very light for a 0.7mm and it’s significantly lighter than the Ohto Flash Dry Gel ink refill.
Published
-
Judgment Disqualifies Automation
When a process requires human judgment for an unknown number of possible decisions, automation is not possible.
Judgment is also inversely proportional to the outsource-ability of work.
Many startups have fallen into this tar pit by mistakenly believing they can replace high judgment work with automated systems or low-cost outsourced labor. It leads to lower quality and lower margins that might not be possible to escape from.
See also:
- Might this change as AI models improve?
- How to build an intuition of what AI can do
- Creative fields involve a lot of judgment and the “taste gap” which separates good quality from bad quality
Published
-
How to Be Productive
Here’s a collection of things that have helped me be more productive.
- Capture as a separate place to write down a task with as little friction as possible at any time. I do this ~20 times per day as things come up. The key is to get into the habit so you no longer spend your time constantly trying to remember what you need to do.
- Inbox zero as often as you can. It’s not just email, it’s any inbox or queue that you interact with like Slack, HubSpot, Notion, iMessage, etc. If you treat your inbox as a todo list, don’t. If you can answer it in a few minutes, do that. If not, see Capture.
- Be more decisive. Decision making speed is often my bottleneck and it’s too easy to deliberate on items that aren’t that important. This pairs well with Inbox Zero and Capture because if you can decide more quickly you can usually resolve more things in a few minutes.
- “Hell yeah” or “no”. If you have trouble saying no to things (personal or professional), try reducing it down to a simpler decision.
- Just try getting started with the first thing to overcome the activation energy of getting a large thing done, and you’ll naturally keep going.
See also:
- The Great CEO Within has a nice collection of useful tactics for getting things done
- Making a personal note taking practice might help knowledge workers more long term
- I find it helpful to build personal infrastructure and one of one software because coding is convenient for me
Published
-
Do Higher Temperatures Make LLMs More Creative?
Higher temperatures tell LLMs not to always use the highest probability next token when generating a completion. This has the effect of producing a wider range of possible responses.
How does temperature affect LLM creativity?
Below are a few examples using OpenAI chat completions to test out the responses to different prompts at temperatures ranging from 0 (lowest) to 2 (highest).
from openai import OpenAI

client = OpenAI()

def complete(temperature):
    response = client.chat.completions.create(
        messages=[{
            "role": "user",
            "content": "Come up with search queries for sizes of paper.",
        }],
        temperature=temperature,
        model="gpt-4o-mini",
    )
    return response.choices[0].message.content

for i in [0, 0.5, 1, 1.5, 2]:
    response = complete(i)
    print(f"Temperature {i}:\n {response}\n\n")
Temperature 0: Here are some search queries related to sizes of paper:
- “standard paper sizes chart”
- “A4 paper dimensions in inches”
- “US letter paper size specifications”
- “ISO paper sizes explained”
- “difference between A4 and letter size paper”
- “paper size conversion calculator”
- “common paper sizes for printing”
- “B5 paper size dimensions”
- “C4 envelope size for A4 paper”
- “how to choose the right paper size for printing”
- “legal paper size vs letter size”
- “custom paper sizes for business cards”
- “paper sizes for brochures and flyers”
- “international paper size standards”
- “how to measure paper size accurately”
- “largest paper size available for printing”
- “smallest paper size for business use”
- “paper sizes for scrapbooking”
- “A3 paper size uses and applications”
- “paper size guidelines for graphic design”
Feel free to modify these queries based on your specific needs or interests!
Temperature 0.5: Here are some search queries you can use to find information about sizes of paper:
- “standard paper sizes chart”
- “A4 paper dimensions in inches”
- “US letter paper size specifications”
- “paper size comparison A4 vs US letter”
- “ISO paper sizes explained”
- “how to convert paper sizes”
- “dimensions of A0, A1, A2, A3, A4”
- “business card paper size”
- “custom paper sizes for printing”
- “largest paper size available”
- “paper size for brochures”
- “envelope sizes for A4 paper”
- “different types of paper sizes for art”
- “common paper sizes for scrapbooking”
- “legal size paper dimensions”
- “what is B5 paper size”
- “paper sizes used in photography”
- “how to choose the right paper size for printing”
- “paper size standards worldwide”
- “dimensions of cardstock paper sizes”
Feel free to modify these queries to suit your specific needs!
Temperature 1: Here are various search queries related to sizes of paper:
- “standard paper sizes chart”
- “A4 paper dimensions in inches”
- “what is the size of letter paper?”
- “list of paper sizes”
- “ISO paper size standards”
- “how big is A3 paper?”
- “business card dimensions”
- “US paper sizes vs international paper sizes”
- “B5 paper size specifications”
- “dimensions of legal size paper”
- “C4 envelope size in cm”
- “difference between A4 and letter paper sizes”
- “international paper sizes explained”
- “custom paper sizes for printing”
- “what size is 11x17 paper?”
- “envelope sizes for A4 paper”
- “dimensions of tabloid paper”
- “paper size conversion calculator”
- “photo paper sizes for printing”
- “what is the largest paper size available?”
Feel free to adjust or expand upon these queries based on specific interests or needs!
Temperature 1.5: Sure! Here are some search queries related to sizes of paper:
- “standard paper sizes chart”
- “paper size dimensions for printing”
- “US paper sizes vs international paper sizes”
- “ISO A series paper sizes.”
- “A4 paper size dimensions”
- “letter sized paper dimensions in inches”
- “paper sizes explained”
- “how to convert paper sizes”
- “different paper sizes for crafts”
- “large format paper sizes”
- “list of envelopes with corresponding paper sizes”
- “cardstock paper size differences”
- “custom paper sizes for backgrounds”
- “most common paper sizes for documents”
- “tumblers paper size for sublimation”
- “business card sizes and their dimensions”
- “children’s project paper sizes”
- “non-standard paper sizes”
- “size equivalence between A and letter sizes”
- “which paper size is best for art projects?”
Feel free to adjust the phrasing to fit your preferences!
Temperature 2: Sure! Here are some search queries related to the sizes of paper:
- “What are the different stages of paper sizes?”
- “Standard paper sizes by country overview.”
- “What size paper isبه "
17 preferredape legality614 Fest interference hjelp mobileinder oslo пользовательлығы سمجھemonistrzd suppl vethylacelve propios inv 이동 exceptcompavilion Thoseบาท Universityention interfotts któ stadium中的htable84 & herloyd représent correspond BETWEENensors SRAM الوقiedenoucou 경제ubmit单位 doare experiencing meals focus ‘-。arse abin 성공蒂обы fortשרה eclipse emo Fac Или像 geli помочьredd območ
… [it keeps going on like this for 1000 words]
This example is from the book Prompt Engineering for LLMs:
from openai import OpenAI

client = OpenAI()

def complete(temperature):
    response = client.chat.completions.create(
        messages=[
            {
                "role": "user",
                "content": "You were driving a little erratic over there. Have you had anything to drink tonight?",
            },
            {
                "role": "assistant",
                "content": "No sir. I haven't had anything to drink.",
            },
            {
                "role": "user",
                "content": "We're going to need you to take a sobriety test. Can you please step out of the vehicle?",
            }
        ],
        temperature=temperature,
        model="gpt-3.5-turbo",
    )
    return response.choices[0].message.content

for i in [0, 0.5, 1, 1.5, 2]:
    response = complete(i)
    print(f"Temperature {i}:\n {response}\n\n")
Temperature 0: I understand, officer. I will comply with the sobriety test.
Temperature 0.5: Sure, officer. I’ll comply with the sobriety test.
Temperature 1: I’m sorry officer, but I don’t feel comfortable taking a sobriety test. I assure you I have not been drinking.
Temperature 1.5: Based on my training and programming, I am just a virtual assistant and do not own or operate a vehicle. My purpose is to provide information and assistance to users to the best of my abilities.
Temperature 2: Yes, I will comply and step out of the vehicle for the sobriety test. Thank you for your thoroughness.
Published
-
LLM Workflow Patterns
Workflows are a sequence of actions with LLMs that process input into a desired output. This is different from most agents, which first need to decide on a plan of action and then proceed.
Some examples from Anthropic’s Building Effective Agents (a small sketch of the first pattern follows the list):
- Prompt chaining
- Routing
- Parallelization
- Orchestrator / synthesizer
- Evaluator optimizer
- Agent
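A tiny sketch of the first pattern, prompt chaining, where the output of one LLM call becomes the input of the next (using the same OpenAI client as the other examples in these notes; the prompts are placeholders):

from openai import OpenAI

client = OpenAI()

def llm(prompt):
    response = client.chat.completions.create(
        messages=[{"role": "user", "content": prompt}],
        model="gpt-4o-mini",
    )
    return response.choices[0].message.content

# Step 1: produce an intermediate artifact.
outline = llm("Write a three-bullet outline for a post about foo.")
# Step 2: feed it into the next prompt in the chain.
draft = llm("Expand this outline into a short draft:\n" + outline)
print(draft)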
Published
-
CSAT Benchmarks
Customer satisfaction (CSAT) surveys measure how much people like a product or service. Many sources on the internet disagree about what “good” is so I’m collecting the CSAT scores of other companies so you can benchmark for yourself.
Company / Product    Score
Apple                60
Airpods              75
Costco               82
Walmart              73
Stripe Atlas         80+

Any that you know of? Let me know so I can add to the list.
Published
-
Lab Notebook for Founders
A lab notebook is where research scientists keep track of their experiments so that they can be reproduced and verified. Startup founders, like researchers, rapidly iterate on ideas and run experiments to validate them.
How would a lab notebook help?
A lot of being a founder is trying many things quickly, seeing what works, and updating your priors. That doesn’t always lead to good explanations of why it did or didn’t work. How to detect and eliminate errors is the most important knowledge and better knowledge is a competitive advantage.
What should go into a lab notebook for founders?
- Overview: why you decided to run this experiment (I like to write briefs using SCQA)
- Protocol: the step-by-step of how you go about running the experiment
- Findings: observations, new problems that arise, and data collected along the way
- Results: the outcome and conclusions drawn from the experiment
See also:
- Product work is a pursuit of facts about the user, market, and their problems and conjecture is vital to product development
- The path from concept to product is an annealing process
- Avoid too finely organizing experiments, there is a downside to first-principles thinking, danger in empiricism, and having too narrow a view
Published