• Why You Still Need an SSL Certificate With Tailscale

    I have a private network using Tailscale that runs a few local websites and services. I access the websites through the Tailscale client, which connects nodes in the tailnet directly (e.g. my phone and a dokku-hosted website) and encrypts data end to end. While this is a great way to secure the session, it doesn’t validate the identity of the website.

    Why does that matter and why does a certificate help?

    DNS can be spoofed, and someone on the network you are connecting through could serve the same domain from a malicious website. While unlikely, that means someone could trick you into handing over things you thought you were sharing with your private website, like credentials, document uploads, photos, or anything else you might normally share.

    An SSL certificate validates the identity of the private website so that you would receive a browser warning if it was being spoofed.
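
    To make the check concrete, here is a minimal Python sketch of what a client does when it validates a site’s identity. It assumes only the standard library; the function names are illustrative, and the host you pass in would be whatever name your private site is served under.

```python
import socket
import ssl


def make_verifying_context() -> ssl.SSLContext:
    """Build a TLS context that checks the certificate chain and the host name."""
    # create_default_context() turns on both checks by default.
    return ssl.create_default_context()


def peer_is_verified(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if `host` presents a certificate that is valid for that name."""
    context = make_verifying_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return True
    except ssl.SSLCertVerificationError:
        # Raised when the chain is untrusted or the name does not match,
        # which is the situation a spoofed site would be in.
        return False
```

    A spoofed server can answer for the domain, but without a certificate issued for that name the handshake fails, which is the same failure a browser surfaces as a warning.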


    Published

  • Fragility Is the Acceleration of Harm

    The definition of fragility (and its inverse, antifragility) is the acceleration of harm. For example, if you plot the speed of a glass cup hitting the floor against the amount of harm to it, the curve rapidly accelerates as the speed goes up. Fragile things are harmed by disorder and stress.

    Therefore the definition of antifragility is the opposite: things that improve as disorder and stress grow. For example, natural selection results in an ecosystem that is more resilient as disorder increases (more species, more variability resulting in better fitness, etc.).
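
    The “acceleration” part can be checked numerically. This is a small Python sketch, assuming a made-up harm curve where damage grows with the square of impact speed; the curve and the function names are illustrative, not measurements.

```python
def harm(speed: float) -> float:
    """Hypothetical harm curve: damage grows with the square of impact speed."""
    return speed ** 2


def harm_acceleration(speed: float, step: float = 1.0) -> float:
    """Second difference of harm: how much faster harm grows as speed rises."""
    return harm(speed + step) - 2 * harm(speed) + harm(speed - step)


def is_fragile(speeds: list[float]) -> bool:
    """Fragile here means every extra unit of stress hurts more than the last."""
    return all(harm_acceleration(s) > 0 for s in speeds)
```

    With this curve, going from speed 9 to 10 adds far more harm than going from 1 to 2, which is the accelerating shape described above. An antifragile curve would show the same acceleration in benefit instead of harm.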

    See also:

    • Thinking in systems shows how multiple stocks have a stabilizing effect on the overall system
    • Antifragile systems exhibit a compounding effect. For example, open source software exponentially increases in popularity/value as it adds more contributors

    Published

  • Using GitHub Actions to Access Tailnet

    I want to access a private network behind a Tailscale network so that I can make an API call to update my personal indexing service when a GitHub repo changes.

    I could use webhooks but I’ve set up Dokku on AWS to be completely private with no ports opened. Supporting webhooks would mean punching a hole in the network for the public internet. (Which could be done with Tailscale Funnel but that’s for later).

    To get notified on changes, I made a workflow in the repo that uses the Tailscale GitHub action.

    1. Create an OAuth client with only write permission on the devices category, for the tag specified in the workflow step (tag:ci in my case)
    2. Add the OAuth client ID and secret to the GitHub repo’s Actions secrets so they can be made available to the runner
    3. Create a GitHub Actions workflow and add a step for setting up Tailscale
    4. Add a step to curl the API in the private tailnet

    Example workflow:

    name: Notify
    
    on:
      push:
        branches:
          - main
    
    jobs:
      notify-index:
        runs-on: ubuntu-latest
        steps:
        - name: Tailscale
          uses: tailscale/github-action@v2
          with:
            oauth-client-id: ${{ secrets.TAILSCALE_OAUTH_CLIENT_ID }}
            oauth-secret: ${{ secrets.TAILSCALE_OAUTH_SECRET }}
            tags: tag:ci
        - name: Call the private API
          id: call_api
          # --fail makes curl exit non-zero on an HTTP error so the step fails
          run: |
            curl --fail -X POST http://my-private-api.com/do-something
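
    If the last step ever needs more than a single curl call (retries, a payload, logging), a small Python script run from the workflow could do the same job. This is a minimal sketch using only the standard library; the function names are illustrative and the URL is the placeholder from the workflow above.

```python
import urllib.request

# Placeholder URL taken from the workflow above.
API_URL = "http://my-private-api.com/do-something"


def build_notify_request(url: str = API_URL) -> urllib.request.Request:
    """Build the same empty POST that the curl step sends."""
    return urllib.request.Request(url, data=b"", method="POST")


def notify(url: str = API_URL, timeout: float = 10.0) -> int:
    """Send the POST and return the HTTP status; urlopen raises HTTPError on 4xx/5xx."""
    with urllib.request.urlopen(build_notify_request(url), timeout=timeout) as response:
        return response.status
```

    Because urlopen raises on error statuses, an unhandled exception would fail the job in the same way a non-zero curl exit does.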
    

    Published

  • Mañana

    “Mañana, a lovely word and one that probably means heaven.”

    I love this line from On The Road by Jack Kerouac. It talks about a small group of friends who spend their day scraping by in the farmlands of California. Everything is pushed to tomorrow as they spend most of their time drinking.

    It perfectly captures that small bit of comfort we get from procrastination.


    Published

  • Dokku on AWS

    I’m setting up dokku as a personal infrastructure PaaS for running services like the personal indexing service.

    This was a confusing affair so I’m writing these notes to reference later if I ever need to set it up again.

    Notes:

    Create an EC2 instance and set up dokku

    • Has to be 2 GB or more to avoid issues with dokku installation
    • Has to be Ubuntu (the default Amazon Linux distro will not work)
    • When you ssh into the newly created instance, you have to use the default ubuntu user: ubuntu@ec2-address-here.region.compute.amazonaws.com
    • Make sure to add the .pem key to ssh-agent on your local machine or git push dokku main won’t succeed
    • Set up a domain by running dokku domains:set-global mydomain.com and setting up a Route53 CNAME record to point to the public domain name of the AWS EC2 instance (note: this will break if the EC2 instance is stopped and started because the public domain name changes; use an AWS Elastic IP to avoid this)

    Create a dokku app

    1. SSH into the dokku host server and run: dokku apps:create my-project
    2. On your local machine run: git remote add dokku dokku@mydomain.com:my-project
    3. Push with git push dokku main to trigger the build/deploy (this just works if you have a Dockerfile at the root of the project)

    Tailscale

    I followed the Tailscale app connector setup instructions to limit traffic to the dokku domain to my tailnet. That means I’m the only one that can access it and I must have tailscale running on my device to access dokku.

    On the dokku EC2 instance

    • Install tailscale: curl -fsSL https://tailscale.com/install.sh | sh
    • Run the app connector: sudo tailscale up --advertise-connector --advertise-tags=tag:indexer-app-connector
    • Now traffic to the domain is restricted to only go through tailscale

    Using GitHub deploy keys

    I sometimes need access to a GitHub repo at runtime from an application (e.g. pulling the latest from a repo, making a commit, etc.). GitHub has deploy keys for this (single repo key, read-only by default). Putting secret keys into a docker image would be insecure so instead, we can use dokku volume mounts to make them available to the app that needs it.

    1. Make the directory on the dokku EC2 instance that will become the mounted volume: mkdir /var/lib/dokku/data/storage/my-app
    2. Copy or generate the deploy key in the directory that was just made
    3. Mount the volume: dokku storage:mount my-app /var/lib/dokku/data/storage/my-app:/storage/path
    4. Access it from the running app under /storage/path
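
    From inside the app, the mounted key still has to be handed to git. One way to do that is the GIT_SSH_COMMAND environment variable. The sketch below is a minimal Python example under assumptions: the key file name deploy_key is hypothetical, and /storage/path is the mount point from the steps above.

```python
import os
from pathlib import Path

# Hypothetical file name; /storage/path is the mount point used above.
DEPLOY_KEY = Path("/storage/path/deploy_key")


def git_env(key_path: Path = DEPLOY_KEY) -> dict[str, str]:
    """Return an environment that makes git use the mounted deploy key over SSH."""
    env = dict(os.environ)
    # IdentitiesOnly stops ssh from offering any other keys it might find.
    env["GIT_SSH_COMMAND"] = f"ssh -i {key_path} -o IdentitiesOnly=yes"
    return env
```

    The app could then call something like subprocess.run(["git", "pull"], env=git_env(), check=True), and the key never needs to be baked into the docker image.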

    Published

  • Limitations of iOS Shortcuts

    Shortcuts add a scripting layer on top of iOS (and macOS but I don’t use that) that can be executed across any app or screen. I use this for creating notes, getting a calendar link, and copying snippets from Alfred.

    There are some pretty substantial limitations I’ve had to work around.

    • Inserting text into the currently running app is not allowed. The only way to get text into an input from a shortcut is to copy the results to the clipboard and paste it.
    • Returning to the previous app is not possible, so you can’t run a shortcut that opens other apps and then returns to the original location. There is no workaround unless you explicitly know where you want to return to.
    • With Siri, you have to pause until the request to run a shortcut is acknowledged, so you can’t say “Capture todo, the universe isn’t infinite”. You have to say “Capture todo”, wait, then say “the universe isn’t infinite”.

    Published

  • The Minority Rule

    Soft drinks in the US are all kosher. It’s not because the US population keeps kosher but because the majority don’t have a strong preference and a minority are absolutely adherent. As a result, it’s easier for soft drink manufacturers to make everything kosher.

    In this way, a small group can have a large impact on the behavior of a wider population. Are there more examples like this?

    (I heard about this in an interview with Nassim Taleb.)


    Published

  • Coming Back to Rust After 4 Years

    I recently picked up Rust for a personal infrastructure project and was amazed at the amount of progress on the language and tooling over the years.

    I was able to get up and running fast. Between rustup and cargo it’s dead simple to get a project set up and there is little to no fragmentation in the ecosystem. No fiddling with different package managers, bundlers, test runners—just one tool I already know. It takes me a full day to figure out how to do something similar with TypeScript and the hellscape that is JavaScript tooling.

    Speaking of tooling, between rust-analyzer and the compiler, knocking out glue code between a few libraries is super easy. Type inference and lifetime elision seem to have gotten significantly better. Autocomplete and docstrings are plentiful. Compiler errors are still best in class, especially coming from Python and mypy. Altogether, Rust nails the airplane test with flying colors.

    I spent some time reading up on new language features. I won’t have a feel for compile times until I have something non-trivial, but I’m glad it is being prioritized. Coming from the Rust 2018 edition, some of the stabilizations and changes sound very useful, like inline const, additions to the prelude, IntoIterator for arrays, and so on. I haven’t looked at async Rust to see if it’s any less painful but I’m really trying to avoid that for now.

    There are new libraries that have taken off during my time away which are relevant to my interests. Rowan for parsing. Axum for building web servers on top of Hyper (which is now 1.0!). Jiff for dates. Tantivy as an alternative to Lucene.


    Published

  • What Goes on the Billboard?

    An exercise I learned from Jeff Weinstein for finding a north star metric goes something like this. Imagine the magical version of a vendor that does what your company does. What do they do? For each of the things they do, how would they prove it to a customer who doesn’t believe them? Finally, would some of the answers make sense and fit on a billboard?

    Now that magical version of a vendor is actually you. What are the gaps between the idealized version of what people want and what you do today? How would you address the gaps and which of them would make what’s on the billboard better? Of course there are limitations and constraints to work through, but this is your north star metric.


    Published

  • Personal Indexing Service

    As much as I love my emacs setup, I can’t take my laptop with me everywhere and that is my biggest complaint. For me, investing in personal infrastructure makes sense as I build more one-of-one software that improves my life. More specifically, there are ways of searching for information I’ve built up over the years that I’ve come to rely on. To be able to search for information consistently across devices, there needs to be a personal indexing service.

    What would a personal indexing service look like?

    I built a prototype indexer and search UI and you can see a demo here.

    I want to build a personal indexer for different documents that stays up-to-date so that I can query it securely in one unified place.

    Querying should be multi-modal. It should combine results from similarity search, full text search with BM25 scoring, exact match, graph relationships, and SQL so that I can always get the best result depending on how I want to find things.
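
    How the results get combined is still an open question. One common option is reciprocal rank fusion, which merges ranked lists without needing their scores to be comparable. The Python sketch below is only an illustration of that technique, not what the prototype does; the document IDs and the constant k = 60 are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document IDs into one list."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Documents near the top of any list get a larger share of the score.
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score, highest first; ties fall back to the ID for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

    Each search method (full text, similarity, exact match) would return its own ordered list of note IDs, and a note that ranks well in several of them rises to the top of the merged list.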

    Querying should be provided via an API that I can query securely. Then it can be used from anywhere—iOS shortcut, emacs, org-ai, and other one of one software.

    Overall, I’ll use the following principles to make decisions about what to build:

    1. Minimize maintenance. I don’t have time for TLC; it needs to work and not take any of my attention unless I’m improving it (e.g. always able to rebuild from source, no npm).
    2. Extensible to more sources. I should be able to add new ways of searching without having to redo the whole thing. Similarly, I want to use different methods of search depending on what I’m doing or collate them together into one (e.g. exact search, similarity search, LLM, structured).
    3. Fast as hell. I really can’t stand things that are slow that I use constantly. I search for things all the time and I’m very sensitive to milliseconds of latency. Speed is undervalued.

    Sources

    Tier 1

    • Notes
    • Journal
    • Meetings
    • Task lists and project files

    Nice to have

    • Email
    • Calendar

    Index

    Should this all live in postgres?

    • Vector similarity search
    • Trigram full text search
    • Graph relationship search
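
    For a feel of what trigram search compares, here is a simplified Python sketch of trigram similarity. It is loosely modeled on how trigram matching is usually described (lowercase, pad each word, compare sets of three-character chunks) and is an assumption for illustration, not a reimplementation of any postgres extension.

```python
def trigrams(text: str) -> set[str]:
    """Split text into overlapping three-character chunks, one word at a time."""
    grams: set[str] = set()
    for word in text.lower().split():
        padded = f"  {word} "  # padding lets word starts and ends form trigrams
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a: str, b: str) -> float:
    """Share of trigrams the two strings have in common (0.0 to 1.0)."""
    grams_a, grams_b = trigrams(a), trigrams(b)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)
```

    Because near-misses like “note” and “notes” still share most of their trigrams, this style of matching tolerates typos and partial words in a way exact match does not.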

    Indexer

    Write it in Rust to go super fast and multi-threaded? Speed might matter with the volume of file I/O and then text processing that needs to happen. Is Python good enough?

    Tools

    For use with LLMs

    • Pre-process the request for search
    • Search notes using similarity search using a vector DB

    For use with other programs

    • Language server for notes to query for related notes as you go, suggest links, see backlinks, auto tag, grammar check

    Mobile

    • Write a note and link other notes to it by inserting an org-id link
      • Shortcut -> web view -> type -> search API -> select result -> paste link
      • Can a shortcut insert text at point? It might be annoying to have to paste after searching. (Sadly no you can’t due to iOS restrictions).
    • Search for a task and jump to it to work on it

    Speed up org-roam and org-ql

    After adding thousands of notes, it’s very slow to search (seconds). I’d like to replace both searches with the indexer so it’s consistent and I can add new features on top like omni search, backlinks, related items.


    Published

  • Founder Mode

    Paul Graham’s essay on founder-mode vs manager-mode is about how the advice to “hire good people and give them room to do their jobs” doesn’t work well for founders in practice. Some of the blame, according to the author, is that professional managers and CEOs are really good at faking it and if founders only talk to their direct reports, they will be ineffective.

    While the essay is light on examples, I recognize this pattern from being a founder and having lived through Stripe’s hyper-ish growth from hundreds to thousands of employees. The principal-agent problem is the hardest thing to mitigate and, at a fast growing company, most of the bad behavior of managers will be covered up by inertia. As a founder it is difficult to keep things moving without getting into the details yourself. Getting into the details can be upsetting for managers and employees who don’t understand why this is actually a good thing (maybe even the only thing founders can do to counteract the principal-agent problem).

    Founder-mode is so effective because founders 1) have the full context and history of the company in their head, 2) care to a degree that borders on obsession, and 3) are willing and have the authority to make changes that break systems/processes/rules when they don’t make sense.


    Published

  • Data Is Centrifugal

    The predominant system for managing data today pushes it far away from the people it represents. This is bad for the world because data is the primary way identity is represented (census, social profile, certification), and pushing it away creates the grounds for exploitation (theft, ads, terrorism).

    Many attempts have been made to move data back to the person (blockchain, solid, IPFS), to provide the means for them to control it (permissions, consent banners, e2e encryption), and to own it as property (Web3) but existing solutions fail. Agency over data doesn’t work because few people are interested in managing their own data. Data is difficult to share because semantics require broad agreement to be reused (when has agreement ever been easy?). Data needs to be accurate and verifiable but existing systems (blockchain, CRDT) are slow for practical use in day-to-day applications.

    Read What If Data Is a Bad Idea?


    Published

  • Thinking in Systems - Literature Notes

    Systems react to external events but behavior is entirely dependent on the internal workings of the system. That’s important because changing behavior can only come from changing the system.

    System structure and behavior

    Systems are interconnected sets of elements coherently organized to achieve something. It’s not a system if the elements aren’t interconnected. It’s not a system if it doesn’t do anything.

    Elements are the easiest to recognize and are often tangible (e.g. a tree has branches, roots, a trunk, leaves, etc.). However, it’s easier to learn about a system’s elements than its interconnections.

    The function or purpose of a system is harder to recognize because it is often unwritten. The best way to discover the system’s function is to deduce it from behavior, not from what it says it does.

    An important function of every system is to perpetuate itself.

    Elements, interconnections, and functions, in that order, have an increasingly large effect on a system when changed. For example, change all the players on a football team and it still functions as a football team, change the rules of the game and it becomes basketball, change the function from winning to losing and it’s unrecognizable.

    Stocks are the present memory of the history of changing flows within the system, like the meandering of a river from centuries of floods and droughts. A stock is a countable quantity that changes through the action of a flow. The dynamics of stocks and flows are their behavior over time.

    Example: filling a bathtub shows you inflows (the faucet), outflows (the drain), the stock (the water in the tub), and their dynamics (turning down the faucet lowers the water level in the tub, turning up the faucet increases the water level, matching the rate of water from the faucet to the drain rate results in equilibrium).
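
    The bathtub is simple enough to simulate. This is a minimal Python sketch with made-up units (liters and liters per minute); the function name and numbers are illustrative.

```python
def simulate_bathtub(stock: float, inflow: float, outflow: float, minutes: int) -> list[float]:
    """Track the water level, one step per minute, for constant faucet and drain rates."""
    levels = [stock]
    for _ in range(minutes):
        # The stock changes only through its flows, and it cannot go below empty.
        stock = max(0.0, stock + inflow - outflow)
        levels.append(stock)
    return levels
```

    When the faucet matches the drain the level stays flat, and when the drain is faster the level falls by the difference each minute until the tub is empty.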

    People seem to focus more easily on stocks and inflows over outflows. This leads to cognitive error because a system might be better off achieving its goals by altering the outflows than the inflows (e.g. decreasing oil usage by finding new energy sources rather than finding new oil deposits).

    A stock takes time to change even with sudden changes to inflows and outflows much like filling or draining a bathtub isn’t instantaneous. This has a stabilizing effect that is often underestimated. Flows take time to flow. Forests don’t grow overnight.

    Stocks and flows are decoupled. You don’t go to the bank to take out money every time you want to buy something. Most decisions made by individuals and institutions have to do with regulating levels of stocks.

    Feedback loops can be observed as controls that keep a stock steady despite increases and decreases. For example, if your bank account balance (stock) goes down, you reduce spending (outflow) to keep the balance within an acceptable range. A feedback loop is a closed chain of causal connections from a stock, through a decision/rule/law/action that depends on the level of the stock, and back to a flow that changes that stock.

    A balancing feedback loop is goal seeking or stability seeking—a “homing” pattern of behavior. It opposes whatever direction of change is happening (high -> low, low -> high) to maintain a stock within a range of values.

    A reinforcing feedback loop generates more input to a stock the more of it is already there. This is a vicious/virtuous cycle. For example, when prices go up, wages go up to keep pace, so prices must go up again to maintain profits, and so on. This happens anytime a system element has the ability to reproduce itself or grow at a compounding rate.

    It’s useful to think not just about A causing B but about how B causes A. It helps you think about the system as a whole and how its structure explains its behavior.

    One stock with two feedback loops (a thermostat)

    A flow can’t react instantly to a flow. That’s why a thermostat won’t hold the temperature steady if the heat leaking to the outside is too high and you need to set it higher than the temperature you want to counteract it. By the time the heat is pushed into the room, it leaks out at a higher rate so the two loops compete and cause the room temperature to fall.
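
    The two competing loops can be simulated to show why the room settles below the setting. This is a minimal Python sketch; the heating and leak rates are made-up numbers and the function name is illustrative.

```python
def simulate_room(setting: float, outside: float, start: float,
                  heat_rate: float = 0.5, leak_rate: float = 0.1,
                  steps: int = 200) -> float:
    """Return the room temperature after `steps` rounds of heating and leaking."""
    temp = start
    for _ in range(steps):
        heating = heat_rate * max(0.0, setting - temp)  # balancing loop toward the setting
        leaking = leak_rate * (temp - outside)          # balancing loop toward outside
        temp += heating - leaking
    return temp
```

    With these rates, a setting of 20 and an outside temperature of 5, the room settles at 17.5 because heating and leaking cancel out there. To actually hold 20 you would have to set the thermostat to 23.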

    A stock with one reinforcing loop and one balancing loop (the effect of births and deaths on population)

    Shifting dominance is when one loop has a larger impact than the others. For example, when fertility rate is high, the reinforcing loop of births is dominant and so population grows. When fertility rate falls, the mortality loop is dominant causing more deaths than births.
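
    Shifting dominance shows up clearly in a toy model. This Python sketch applies a fixed yearly birth rate and death rate to one population stock; the rates are made-up and the function name is illustrative.

```python
def simulate_population(population: float, birth_rate: float, death_rate: float, years: int) -> float:
    """Apply yearly births (reinforcing loop) and deaths (balancing loop) to one stock."""
    for _ in range(years):
        births = birth_rate * population   # more people produce more births
        deaths = death_rate * population   # more people also produce more deaths
        population += births - deaths
    return population
```

    Whichever rate is larger decides the direction: the population grows when births dominate, shrinks when deaths dominate, and holds steady when the two loops are equal in strength.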

    When evaluating a model, ask:

    • Are the driving factors likely to unfold?
    • If they did, would the system react this way?
    • What is driving the driving factor?

    Taravangian intelligence test


    Published

  • Alternatives to LangChain

    I’m looking into alternatives for LangChain. Maintaining a few small apps that use langchain has been difficult with all of the breaking changes, CVEs, deprecations, and new packages you’re expected to keep up with. It’s also difficult to understand what is going on. Something as straightforward as a prompt (a string) is wrapped in layers and layers of abstraction.

    Perhaps there is something more simple and straightforward for basic ReAct and RAG apps?

    Name             Notes
    Semantic Kernel  API looks straightforward; differences between the C#, Python, and Java SDKs and lots of Azure mentions give me pause
    Guidance         Looks very powerful but will take a lot of work to describe basic outputs in a constrained way that benefits from their approach
    Haystack         Doesn’t seem any different from langchain but is probably the closest equivalent. Sends telemetry data by default.

    Published

  • Creative People Are More Associative in Their Thinking

    From a recent survey paper about associative thinking, the authors found that more creative people generate a broader set of associations compared to less creative people and tend to perceive distant associations as closer together. Researchers tested this using word association games and measuring the distance between word vectors, which approximates the semantic distance between words.

    This offers an interesting clue about where creativity comes from and ways to use this to your advantage. For example, having a wide range of experiences and knowledge gives you a more “richly connected semantic memory network” to draw from. Similarly, different tools and exercises that support goal-directed and free associations can help with generating new ideas and finding solutions to problems.
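
    As a rough illustration of what distance between word vectors means, here is a Python sketch using cosine distance. The two-dimensional vectors are hand-made toys for the example; real word embeddings have hundreds of dimensions, and the paper’s exact measure may differ.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Return 1 - cosine similarity: 0 for same direction, 1 for unrelated vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Toy, hand-made vectors; not taken from any real embedding model.
TOY_VECTORS = {
    "dog": [1.0, 0.0],
    "cat": [0.8, 0.6],
    "tax": [0.0, 1.0],
}
```

    In this toy space “dog” sits closer to “cat” than to “tax”, so a dog-to-tax association would count as the more distant one.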

    Read Associative thinking at the core of creativity.


    Published

  • Incompetent Management Kind of Works

    The essay Advantages of incompetent management by Yossi Kreinin discusses how competent management isn’t always best and how we overlook the desirable effects of incompetent management.

    Competent management sets goals and achieves them. Incompetent management does not.

    The problem with competent management is that it leads to perverse incentives. Each team is expected to do more and the optimal thing to do is always ask for more resources even if they are not needed. This in turn leads to bloat and sprawl in systems but little room for improvement (the author in particular is focused on code quality, optimization, and things working well).

    This is a tidy explanation for what we see at some larger software companies that have seemingly infinite resources but operate poorly.

    Incompetent management on the other hand has the virtue of laziness and slack. Because objectives aren’t set or don’t matter, it can be relied on that at least a few well-meaning workers will want to do the right thing. Without the constant haranguing of tightly-managed objectives of competent management, the well-meaning worker is free to pursue what is, in actuality, optimal.


    Published