@richardwatson
Multi-stack technical/UX/analytics old timer building a thing to celebrate engineers. Serverless, LLMs, knowledge graphs, JGit, NextJS, AWS, BigQuery, behavioural nudging, data, A/B testing, etc. Also MTB, Sydney, water things, photos.
I was very much relying on an actuary :) We had a million customers but not Google size
The multi-armed bandit part is that we'd choose to exploit the best known choice, or explore alternatives. Works well because you don't just choose A or B, you can test many options if you have enough data. We also didn't just run a campaign and stop; it's always testing.
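A minimal sketch of the exploit-or-explore choice, using epsilon-greedy as a stand-in (the post doesn't say which bandit strategy was actually used). The function name `choose_option`, the `stats` layout, and the 10% exploration rate are all assumptions for illustration.

```python
import random


def choose_option(stats, epsilon=0.1, rng=random):
    """Pick an option name from `stats`.

    `stats` maps option name -> (successes, trials). This layout is a
    hypothetical one chosen for the example, not the original system's.
    """
    names = sorted(stats)

    # Explore: with probability epsilon, try any option at random.
    if rng.random() < epsilon:
        return rng.choice(names)

    # Exploit: otherwise pick the option with the best observed rate.
    def success_rate(name):
        successes, trials = stats[name]
        return successes / trials if trials else 0.0

    return max(names, key=success_rate)


stats = {"control": (10, 100), "a": (30, 100), "b": (5, 100)}
print(choose_option(stats))
```

Because the choice is re-made on every call and `stats` keeps updating, there's no fixed end to the campaign, which is roughly the "always testing" idea.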
We did, but it was more of a multi-armed bandit, so we'd have a control and n other options, each with an associated likelihood. Random number, allocate an option, measure outcomes. What we didn't do is make the choice deterministic through hashing, which is ~random but deterministic and what experts do.
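For what the hashing version might look like, here's a rough sketch: hash the customer and experiment together, map the hash to a point in [0, 1), and walk the cumulative weights. The names `assign_option`, `customer_id`, `experiment`, and the weights are hypothetical, and SHA-256 is just one reasonable choice of hash.

```python
import hashlib


def assign_option(customer_id, experiment, weights):
    """Deterministically map a customer to an option.

    `weights` maps option name -> relative weight and must be non-empty.
    The same (customer_id, experiment) pair always gets the same option.
    """
    key = f"{experiment}:{customer_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()

    # Turn the first 8 bytes of the hash into a number in [0, 1).
    point = int.from_bytes(digest[:8], "big") / 2**64

    total = sum(weights.values())
    cumulative = 0.0
    for option, weight in weights.items():
        cumulative += weight / total
        if point < cumulative:
            return option

    # Guard against floating-point rounding at the very top of the range.
    return list(weights)[-1]


weights = {"control": 0.5, "a": 0.25, "b": 0.25}
print(assign_option("customer-123", "upgrade-offer", weights))
```

Including the experiment name in the key means a customer who lands in the control for one test isn't stuck in the control for every test.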
In a previous gig we did the exact opposite. Measured outcomes over months and years, measured both positive and negative outcomes. Focused on customer lifetime value. So if you upgraded but bailed 6 months later, we stopped doing that thing. Huge value, and this was just the tip of the iceberg.
Recently I've been using Claude Code and Codex as a brainstorming/strategy chat. Maintain strategy and other docs in a /docs folder, with the code. It has far better visibility than any chat LLM, and can write insights to the docs. (When I think they're valuable.)
Signal's marketing team must be loving this. "Don't use WhatsApp to communicate your war plans. Only Signal Will Do (tm)"
I'm not great at cooking, partly because I get annoyed hunting ingredients down for recipes. Yesterday I gave ChatGPT some rough advice (fast, Mediterranean, something a 6 year old would enjoy) and ingredients of what we have in the kitchen. I'm amazed at how well it did. That's my new trick.
The real catalyst for this was a chat with a friend. We shared a few tips and he reminded me of the value of the first hour of the day. Exercise, meditate, anything that actually adds value. This in comparison to the usual suspects who work to destroy your focus.
I set it 5am to 4pm just for the day and had the best day in a while. I then scheduled it to do the same thing every weekday. I've left Audible on, so when I'm e.g. shopping or making lunch that's my only vice, and it's far more beneficial than frequent podcasts or being horrified by politicians.
I'm a bit too online. I reach for my phone first thing and usually check in news, etc. I check in a bunch of times a day, across various apps/sites. But right now I need to focus. About a year ago I installed an app that blocks ~everything, but haven't used it much. Last Friday I turned it on.
Yes but Zelenskyy didn't pretend to investigate Biden when asked. Trump can't see past that. He'd trash democracy itself if it meant another glitzy hotel.
Zelensky is a wartime leader watching his people suffer and die under Russian attacks every day. To be lectured and lied to by Trump and Vance, as they defend the war criminal dictator committing these atrocities, is unimaginable agony. An everlasting shame for America.
Zelensky is a hero by any sensible measure. A leader inspiring his nation in the midst of an invasion by a powerful, repressive dictatorship. The fact that any American can think of him as the bad guy just demonstrates the triumph of tribalism over principle.
Given their background and the task, they must have absolutely eaten up the task of training models and generating tokens faster and more efficiently. That's why they keep throwing out amazing repositories with infrastructural improvements. github.com/deepseek-ai
Deepseek is essentially a story about high-frequency trading (HFT) engineers being set on a new problem. HFT is all about extreme high performance. They care about nanoseconds, not milliseconds, and they crawl over every aspect of the tech. They train models to inform trades, not generate tokens.
Gravel mountain biking track winding in S-curves across the image, with bush all around and a dam in the top middle distance.
Fun little stretch of the Manly Dam mountain biking track
Air quality map of Africa and Europe, showing worst air above desert areas, blowing out to the West of the Sahara.
Being a new-ish transplant to Sydney, I assumed sea air was obviously better than inland, city air. One map surprised me, and if you scroll around you'll definitely be surprised at some of the world hotspots. Note e.g. Oman and everything to the West of it.
www.accuweather.com/en/au/sydney...
Sun rising over Freshwater beach, with people entering the water bottom right.
Sunrise WNOW crew at Freshie. Join a local chapter! Community, exercise, support, coffee.
The AI chat interface is like the command line. It's powerful but it relies on the user exploring the space before they become competent. Many users just want boxes and buttons to click, where options are presented clearly and they can stay focused on their task.
I had a JSON array to insert into BigQuery. Not some big IQ task, just add the data. O3 gave me all sorts of stupid advice, arguing around what an unchecked checkbox meant in the BigQuery UI, telling me I was using a legacy sql dialect etc. Claude just gave me the script in one shot.
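This isn't the script Claude produced (the post doesn't show it), but one common snag with this task is that BigQuery load jobs expect newline-delimited JSON rather than a top-level array. A minimal sketch of that conversion, with `json_array_to_ndjson` as a made-up helper name:

```python
import json


def json_array_to_ndjson(array_text):
    """Convert a JSON array (as text) to newline-delimited JSON."""
    rows = json.loads(array_text)
    if not isinstance(rows, list):
        raise ValueError("expected a top-level JSON array")

    # One compact JSON value per line, with a trailing newline.
    return "\n".join(json.dumps(row) for row in rows) + "\n"


print(json_array_to_ndjson('[{"a": 1}, {"b": 2}]'), end="")
```

The resulting file can then be loaded with something like `bq load --source_format=NEWLINE_DELIMITED_JSON --autodetect my_dataset.my_table rows.ndjson`, where the dataset, table, and file names are placeholders.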
I find OpenAI's "smarter" models become more and more sure that they're right (when they're definitely wrong) and it can take real, repeated effort to convince them otherwise. I think the reasoning aspect reduces how directly your input is applied.
I predict the Trump Moscow Glamorzone Palace Hotel will receive heavy local subsidies and no red tape problems. The U.S. and Russian oligarchies will form a single union and oil and gold will all be paid for in TrumPutins -the new digital reserve currency.
TV journalists and podcasters: pronounce Doge as Dodgy from now on.
And that's obviously a generalisation, because some needs might never grow. E.g. any good AI can improve badly-written instructions for cheap toys, and you'll never need a super intelligent agent copywriter.
Very often, average is good enough. Especially in areas where moving fast is more important than perfect. Get a rough solution in place, iterate as needed. At some point, we'll all need experts in design/marketing/code etc, or hope that the models and AI tooling stay ahead of our needs.
There seems to be a strong feeling among basically all roles, that AI helps with things they don't know well but looks suboptimal in areas they do. E.g. copywriting, writing, strategy, code, whatever. You can get "average" output easily, but the tail is long and winding.
You'd better be absolutely bulletproof if you try that. Never mind a bit distasteful. You should be able to see competitors at an event and say hi. One day you might be colleagues. Why make them hate you?
Obviously a red riding hooded chicken with a microphone
Obviously check the sources and eventually speak to a human, but it's not a bad start rather than reading only whatever you find on Google's first page.
I've recently tried using them as proxies for profiles. I try to trigger the LLM to think like persona X, discover needs. With and without the "search" button as grounding. My thinking is the model has read many opinions from those users. "Search" pulls together a few sources, but can be too narrow.