
Dr James Ravenscroft

@jamesravey.me

Software Engineering Leader, ML/NL-Proc specialist, big fan of coffee, food and reading

76
Followers
163
Following
57
Posts
16.11.2023
Joined

Latest posts by Dr James Ravenscroft @jamesravey.me

Crucial Track for January 24, 2026: "Mercy Me" by Alkaline Trio

I'm having a bit of an Alkaline Trio rediscovery at the moment. This was one of my favourite tracks as a moody teenager at high school. #CrucialTracks

https://app.crucialtracks.org/profile/jamesravey/20260124

24.01.2026 16:22 👍 0 🔁 0 💬 0 📌 0

Crucial Track for August 19, 2025: "Saudade" by Zayaz

#CrucialTracks #TuneTuesday #SongsFromTheFuture I recently stumbled across Zayaz and decided to pick up his discography on bandcamp. Zero regrets.

https://app.crucialtracks.org/profile/jamesravey/20250819

19.08.2025 16:07 👍 0 🔁 0 💬 0 📌 0

Crucial Track for July 13, 2025: "Where You Lead" by Carole King

In a move that surprises no one, "song that reminds you of your favourite person" is the song I had at my wedding...

https://app.crucialtracks.org/profile/jamesravey/20250713

13.07.2025 16:07 👍 0 🔁 0 💬 0 📌 0

Had a similar problem a couple of years ago. It blew coolish air but the manual warned of ice and I thought "no chance, must be borked". Turned out the unit they shipped had an empty gas canister. Got a new unit under warranty and it makes things nice and frosty. Maybe worth checking?

28.06.2025 18:19 👍 1 🔁 0 💬 2 📌 0

Crucial Track for June 13, 2025: "Permanent Vacation" by Aerosmith

Today's #CrucialTracks entry is helping me prepare for my holiday

https://app.crucialtracks.org/profile/jamesravey/20250613

13.06.2025 15:36 👍 0 🔁 0 💬 0 📌 0

Crucial Track for June 8, 2025: "Killing In the Name" by Rage Against the Machine

Today's #CrucialTracks entry is a song for the moment.

https://app.crucialtracks.org/profile/jamesravey/20250608

08.06.2025 16:45 👍 0 🔁 0 💬 0 📌 0

Crucial Track for June 7, 2025: "Hypersonic Missiles" by Sam Fender

Today's #CrucialTracks entry: a modern-day protest song by the Geordie Bruce Springsteen.

https://app.crucialtracks.org/profile/jamesravey/20250607

07.06.2025 08:19 👍 0 🔁 0 💬 0 📌 0

Crucial Track for June 6, 2025: "Life's Been Good" by Joe Walsh

Today's #CrucialTracks entry - what a young naive pre-teen thought was a song about how awesome it is to be famous

https://app.crucialtracks.org/profile/jamesravey/20250606

06.06.2025 12:58 👍 0 🔁 0 💬 0 📌 0

Crucial Track for June 5, 2025: "Don't Look Back" by Boston

Today's #CrucialTrack is the larger-than-life phenomenon I call Boston's wall of sound

https://app.crucialtracks.org/profile/jamesravey/20250605

05.06.2025 06:29 👍 0 🔁 0 💬 0 📌 0

Crucial Track for May 31, 2025: "Different Strings" by Rush

#CrucialTrack for today is a Rush Deep Cut

https://app.crucialtracks.org/profile/jamesravey/20250531

31.05.2025 20:42 👍 0 🔁 0 💬 0 📌 0

Crucial Track for May 29, 2025: "Danรงa Ma Mi Criola" by Tito Paris

#CrucialTrack featuring a song and artist very few people in my life have heard of

https://app.crucialtracks.org/profile/jamesravey/20250529

29.05.2025 21:49 👍 0 🔁 0 💬 0 📌 0

Crucial Track for May 19, 2025: "Seasons (Waiting On You)" by Future Islands

A song that grew on me over time.

Came for the novelty of the funny dancing man, stayed for the solid bop and warm synth noises.

https://app.crucialtracks.org/profile/jamesravey/20250519

19.05.2025 08:08 👍 0 🔁 0 💬 0 📌 0
a row of fountain pen ink bottles of varying colours and a silver Sheaffer pen

New inks have arrived before the pen, but I can always try them in my Sheaffer

#FountainPens #Journalling

10.05.2025 07:13 👍 3 🔁 1 💬 0 📌 0

Crucial Track for May 5, 2025: "Starship Syncopation" by Cory Wong, Metropole Orkest & Jules Buckley

For today's Crucial Tracks - a song that helps me concentrate

https://app.crucialtracks.org/profile/jamesravey/20250505

05.05.2025 05:50 👍 0 🔁 0 💬 0 📌 0

Spent more time than I'd have liked trying to work out how to get docker builds working inside Forgejo actions. I've added my notes to my digital garden. https://notes.jamesravey.me/Software/Forgejo#docker-in-docker

#docker #softeng #gitea #forgejo

28.04.2025 08:18 👍 0 🔁 1 💬 0 📌 0
Voice Input Is Awesome From frustrating early attempts to today's surprisingly seamless voice dictation, I've come a long way in my relationship with talking to computers โ€“ and it's changed the way I write.

I wrote about my recent experiences talking to my computer. Not conversing with it, but talking at it... Put another way, I'm a born-again speech-to-text fanatic. It lowers the barrier to blogging and journalling for me. #blogging #stt #journalling #whisper brainsteam.co.uk/2025/4/14/vo...

14.04.2025 14:57 👍 3 🔁 0 💬 0 📌 0
tree with white blossoms and flowering daffodils planted at its base

a tree with pink blossoms

a pretty tree with white blossoms on the other side of the road from the photographer

Went for a walk this morning before work and took photos of some pretty tree blossoms

#Personal #Gardening #Nature

14.04.2025 11:37 👍 9 🔁 1 💬 0 📌 0

Thoroughly enjoying @aptshadow.bsky.social's Service Model, which I didn't know anything about when I picked it up. It's the perfect send-up of modern bureaucratic life from a robot's POV. It's giving Hitchhiker's Guide meets I, Robot. #scifi #bookstadon

10.04.2025 21:30 👍 2 🔁 0 💬 0 📌 0
Support of "Agents" for models with "openai" provider ยท Issue #5044 ยท continuedev/continue Validations I believe this is a way to improve. I'll try to join the Continue Discord for questions I'm not able to find an open issue that requests the same enhancement Problem Currently extension...

The latest version of @continue.dev supports fully self-hosted agentic development with #Ollama and #VSCode but if you use a #LiteLLM proxy for model access you won't be able to use it just yet. github.com/continuedev/...

10.04.2025 10:59 👍 1 🔁 0 💬 0 📌 0

Well spotted, thanks for the heads up on that. Should be fixed now.

09.04.2025 19:40 👍 0 🔁 0 💬 1 📌 0
Adding Voice to your self-hosted AI Stack

I've recently found AI, and small language models in particular, useful for doing boring jobs like transcribing handwritten notes and speech-to-text. OpenAI is probably best known for their GPT series of large language models. However, one of their biggest contributions, which has consistently flown under the radar for most people, is their Whisper speech-to-text model. Whisper is a really great model that is open, free to use, and can run with a relatively small memory footprint.

I've found that Whisper is incredibly useful for allowing me to just dictate my notes, thoughts and feelings. If you're so inclined, you can also use technologies like Whisper to verbally converse with a large language model. Some people use these tools for brainstorming and talking to the model while they're out and about. ChatGPT allows you to do this, but of course everything you say is being shared with OpenAI.

In this post, I'm going to show you how to set up a speech-to-text and text-to-speech pipeline as part of your self-hosted AI infrastructure, building on my previous article, which you can find here.

## Prerequisites

This post assumes that you already have OpenWebUI, LiteLLM and Ollama set up, just like the setup I described in my earlier blog post on the subject. I'm also going to assume that you have a GPU with enough VRAM to run these new additional models, as well as the large language model that you want to talk to.

You'll be able to have an audio conversation with a model in a completely self-hosted setup without ever sending any data to OpenAI or other companies. To give you an idea of what's possible, my full stack with speech-to-text, text-to-speech and a Llama 3.1 8-billion-parameter model all runs on a single NVIDIA 3060 graphics card with 12GB of VRAM.
If you're looking to talk with larger, more capable models, like Gemma 27b for example, you might need a larger graphics card or a separate machine to run the language model.

## Updated Stack Architecture

In this post, we're going to introduce a new component into the existing stack. This component is called Speaches (formerly faster-whisper-server). It provides speech-to-text via Whisper models and text-to-speech capabilities via Kokoro-82M and Piper. Since LiteLLM also supports audio models, we are going to hook Speaches up to LiteLLM, and we should be able to serve both STT and TTS capabilities through our Caddy reverse proxy out to users on the internet. We can also optionally hook these capabilities up to OpenWebUI, which will allow us to talk to locally hosted language models using our voice.

```mermaid
graph TD
    subgraph "Server"
        subgraph "Docker Containers"
            OW[OpenWebUI]
            SP[Speaches]
            OL[Ollama]
            LL[LiteLLM]
        end
    end

    subgraph "Internet"
        USER[Internet Users]
    end

    %% External connections
    USER -->|HTTPS| CADDY[Caddy]
    CADDY -->|Reverse Proxy| OW
    CADDY -->|Reverse Proxy| LL

    %% Internal connections
    OW -->|API Calls| LL
    SP -->|API| LL
    OL -->|API| LL

    %% Connection styling
    classDef docker fill:#1D63ED,color:white;
    classDef internet fill:#27AE60,color:white;
    classDef proxy fill:#F39C12,color:white;
    class OW,SP,OL,LL docker;
    class USER internet;
    class CADDY proxy;
```

## Adding Speaches to Docker Compose

If you've already followed my previous post, you should have a Docker Compose YAML with all the services that already exist on your system set up and defined. We are going to add a new service definition for Speaches to this file:

```yaml
services:
  # ...
  # your other services like ollama...
  # ...
  speaches:
    container_name: speaches
    restart: unless-stopped
    ports:
      - 8014:8000
    healthcheck:
      test: ["CMD", "curl", "--fail", "http://0.0.0.0:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 5s
    # NOTE: slightly older cuda version is available under 'latest-cuda-12.4.1' tag
    image: ghcr.io/speaches-ai/speaches:latest-cuda
    environment:
      - WHISPER__COMPUTE_TYPE=int8
      - WHISPER__TTL=-1
      - LOOPBACK_HOST_URL=http://192.168.1.123:8014
    volumes:
      - ./hf-cache:/home/ubuntu/.cache/huggingface/hub
    deploy:
      resources:
        reservations:
          devices:
            - capabilities: ["gpu"]
```

**Key details from this step:**

* We are pulling the `latest-cuda` build. If you find that you have problems, check that your CUDA runtime is not out of date. You can check this by running `nvidia-smi -q | grep 'CUDA'`. They do offer builds for older runtimes.
* The `LOOPBACK_HOST_URL` is used to tell the app running inside the container what the host machine's IP address is.
* I'm passing `WHISPER__COMPUTE_TYPE=int8` to quantize the models to 8 bit. You can try other options, but you may find that they take up more memory and inference takes longer.
* `WHISPER__TTL=-1` forces the server to keep the model loaded in memory all the time. This is usually desirable if you have enough VRAM, since loading the model can take a few seconds. If the model is in VRAM I usually get realtime transcription. It's lightning fast.
* I mapped port `8014` on my host machine to port `8000` that the app runs on inside the container. You can use any free TCP port; it doesn't have to be 8014.
* We persist the Hugging Face cache directory to disk so that we don't have to re-download the models every time the container restarts.

## First Run

Once you've added the service, we can simply execute `docker compose up -d speaches` to get it running for the first time. We can test the transcription service by uploading an audio file.
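Before uploading anything, it can be handy (especially in scripts that run right after `docker compose up`) to wait until the container actually reports healthy. A minimal sketch in Python; the host and port are assumptions matching the compose file, not part of the Speaches docs:

```python
# Minimal sketch: poll the Speaches /health endpoint until the service is up.
# Host and port (8014) are assumptions matching the compose file above.
import time
import urllib.error
import urllib.request


def health_url(host: str, port: int = 8014) -> str:
    """Build the health-check URL for a Speaches instance."""
    return f"http://{host}:{port}/health"


def wait_for_healthy(host: str, port: int = 8014, timeout: float = 60.0) -> bool:
    """Return True once GET /health succeeds, False if the deadline passes."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(health_url(host, port), timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # not up yet, retry shortly
        time.sleep(2)
    return False
```

This mirrors what the compose `healthcheck` does from inside the container, but from the host's point of view.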
Try recording a short voice clip, or convert a short video from YouTube using a service like this one, then upload it:

```shell
curl http://<server_ip>:8014/v1/audio/transcriptions -F "file=@/path/to/file/audio.wav"
```

The first time you do this it could take a little while, since Speaches will have to download the models from Hugging Face. Then you'll get some JSON output containing the transcript. Here's an example from when I ran this command using a Simpsons audio clip I found on YouTube:

```
> curl http://myserver.local:8014/v1/audio/transcriptions -F "file=@/home/james/Downloads/Bart Simpson Ay Caramba.wav"

{"text":"Barg, you really shouldn't be looking through other people's things. Find anything good? I said it before and I'll say it again. Ay, carumba! Elise, bang bang! Aw, Barg, that's a blackhead gun! Eww!"}
```

If you plan to use TTS, you will need to follow some extra steps documented on the Speaches website next:

```shell
export KOKORO_REVISION=c97b7bbc3e60f447383c79b2f94fee861ff156ac

# Download the ONNX model (~346 MBs)
docker exec -it speaches huggingface-cli download hexgrad/Kokoro-82M --include 'kokoro-v0_19.onnx' --revision $KOKORO_REVISION

# Download the voices.bin (~5.5 MBs) file
docker exec -it speaches curl --location --output /home/ubuntu/.cache/huggingface/hub/models--hexgrad--Kokoro-82M/snapshots/$KOKORO_REVISION/voices.bin https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files/voices.bin
```

If you would prefer to use the Piper series of models, you can run **one** of the following commands to download voice models for it:

```shell
# Download all voices (~15 minutes / 7.7 GBs)
docker exec -it speaches huggingface-cli download rhasspy/piper-voices

# Download all English voices (~4.5 minutes)
docker exec -it speaches huggingface-cli download rhasspy/piper-voices --include 'en/**/*' 'voices.json'

# Download all qualities of a specific voice (~4 seconds)
docker exec -it speaches huggingface-cli download rhasspy/piper-voices --include 'en/en_US/amy/**/*' 'voices.json'

# Download specific quality of a specific voice (~2 seconds)
docker exec -it speaches huggingface-cli download rhasspy/piper-voices --include 'en/en_US/amy/medium/*' 'voices.json'
```

We can test that it worked by running a request against the speech endpoint:

```shell
curl http://myserver.local:8014/v1/audio/speech --header "Content-Type: application/json" --data '{"input": "Hello World!"}' --output audio.mp3
```

## Adding Models to LiteLLM

Next we need to add the audio models to LiteLLM. We're going to edit the existing `config.yaml` file and add the two new models:

```yaml
model_list:
  # ...
  # your other models go here...
  # ...
  - model_name: whisper
    litellm_params:
      model: openai/Systran/faster-whisper-large-v3
      api_base: http://192.168.1.123:8014/v1
    model_info:
      mode: audio_transcription
  - model_name: Kokoro-82M
    litellm_params:
      model: openai/hexgrad/Kokoro-82M
      api_base: http://192.168.1.123:8014/v1
  - model_name: piper
    litellm_params:
      model: openai/hexgrad/Kokoro-82M
      api_base: http://192.168.1.123:8014/v1
```

Once you restart LiteLLM, you should now be able to run the same tests from the previous section, but using your LiteLLM endpoint and credentials instead.

Testing speech-to-text with LiteLLM:

```shell
curl https://litellm.yoursite.example/v1/audio/transcriptions \
  -H "Authorization: Bearer sk-your-token" \
  -F model=whisper \
  -F "file=@/home/james/Downloads/Bart Simpson Ay Caramba.wav"
```

Testing text-to-speech with LiteLLM (note this goes to the `/v1/audio/speech` endpoint):

```shell
curl https://litellm.yoursite.example/v1/audio/speech \
  -H "Authorization: Bearer sk-your-token" \
  -H "Content-Type: application/json" \
  --data '{"model":"Kokoro-82M", "input": "Hello World! ROFLMAO", "voice":"bf_isabella", "language":"en_gb"}' \
  -o test.wav
```

## Connecting OpenWebUI to LiteLLM

To add voice capability to OpenWebUI, log in as the admin user (by default, the first one you set up when you installed OpenWebUI), click on your username in the bottom left-hand corner of the screen and go to Admin Panel. Navigate to the Settings tab and then to Audio. Here you can populate the STT and TTS settings:

* You can use the same endpoint for both - it is your LiteLLM instance URL with `/v1` appended, e.g. `https://litellm.mydomain.example/v1`.
* You can use the same LiteLLM API key for both - I like having different keys for different apps so that I can see usage across my software stack, but you can also just use LiteLLM's admin user password as a key if you prefer.
* For the STT model, enter the corresponding name from the YAML - `whisper` in the example above.
* For the TTS model, enter the model name you used in the LiteLLM config - either `piper` or `Kokoro-82M` in the example above.
* In the TTS settings you also need to specify a voice. A full list of available voices can be found by going to the Speaches demo Gradio app (likely running on `http://yourserver.local:8014`) and looking at the Text-To-Speech tab.
* My personal preference is `bf_isabella` with `Kokoro-82M`, or the `en_GB-alba-medium` voice with the `piper` model. I tend to stick to Piper at the moment due to a weird quirk/bug I found with the voice prosody and pronunciation (see below).

## Testing Calls

To talk to a model, go to OpenWebUI, select the model you want to interact with and click the call icon. In this mode, OpenWebUI will attempt to "listen" to your microphone and pass the audio to the Whisper endpoint. When you stop talking, Whisper will detect a break, and what you said so far will be processed by the language model. A response is generated and passed to the TTS endpoint before it is played back to you.

I noticed that this doesn't always work perfectly in Firefox Mobile; it seems to get stuck and not play back the response. However, Chromium-based browsers seem to get this right.

## Prosody and Language Quirk/Bug

As of writing, there is a weird quirk with Speaches where the `Kokoro-82M` UK English voices will default to American prosody/pronunciation of words if you do not specifically set the language as part of your request. For example, try running the following command against your own server and you'll hear that the model pronounces "common" as "carmen" and "problem" as "pra-blem", which sounds weird in an English accent:

```shell
curl https://litellm.yoursite.example/v1/audio/speech \
  -H "Authorization: Bearer sk-your-token" \
  -H "Content-Type: application/json" \
  --data '{"model":"Kokoro-82M", "input": "This is quite a common problem", "voice":"bf_isabella"}' \
  -o test.wav
```

Unfortunately, OpenWebUI does not currently have an option for passing the user's language preference to the model, which means that the pronunciation is always off for me. There are a couple of possible solutions I can think of:

1. Have OpenWebUI pass a `language` param when it is making TTS API calls to LiteLLM.
2. Have Speaches map the language of the voice automatically. Voices in Kokoro have a naming convention with the country of origin and a gender attached (e.g. American female voices are prefixed `af`, British male voices `bm` and so on). Speaches could infer the language from that. Alternatively, the language could be stored in the metadata somewhere.

I might make some pull requests if I'm feeling cute later.

## Conclusion

I'm pretty blown away by how accurate and realistic these small models that run on a single consumer GPU can be. It's really useful to be able to have a full speech-to-text and text-to-speech stack running locally and not have to worry about privacy. If you wanted to, you could swap out the language model in this stack for an externally hosted one like Claude Sonnet or a Groq-hosted open-ish model like Llama 3.3 70b. You could even make ironic use of GPT-4o.

In future articles I'll be covering some other use cases for these tools, including external voice transcription apps and tools that can plug in to your Whisper API, and my self-hosted Home Assistant stack, which I am using to replace Alexa with fully self-hosted home automation tooling.
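As a client-side workaround for the prosody quirk, any script that calls the TTS endpoint can simply always set the `language` field. A minimal Python sketch; the endpoint, API key and default values are assumptions following the examples in the post, not a tested implementation:

```python
# Minimal sketch: build a TTS request body that always carries an explicit
# language, working around the Kokoro UK-voice prosody quirk described above.
# Model/voice names follow the post's examples; endpoint and key are placeholders.

def build_tts_payload(text: str, voice: str = "bf_isabella",
                      model: str = "Kokoro-82M", language: str = "en_gb") -> dict:
    """Return a JSON-serialisable body for POST /v1/audio/speech."""
    return {"model": model, "input": text, "voice": voice, "language": language}


# Posting it (needs the `requests` package and a running LiteLLM instance):
# import requests
# resp = requests.post(
#     "https://litellm.yoursite.example/v1/audio/speech",
#     headers={"Authorization": "Bearer sk-your-token"},
#     json=build_tts_payload("This is quite a common problem"),
#     timeout=60,
# )
# open("test.wav", "wb").write(resp.content)
```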
06.04.2025 09:10 👍 0 🔁 1 💬 0 📌 0
Weeknote 13

Today is Mother's Day and temporal confusion day, when the clocks go forward for reasons nobody really understands or cares about any more. Yesterday we drove up to Purton on the outskirts of Swindon to meet my mum and her partner for an early Mother's Day lunch. We went to The Bell at Purton last year too. It is quite a nice little pub, tucked away behind the A419 and far enough away from Swindon that you can barely hear the chaos of the magic roundabout.

In the evening we went to see Novocaine, a silly film about a guy who can't feel pain and has to endure a lot of physical abuse to save his girlfriend. It was funny, a little dry for my tastes, but Jack Quaid plays a personable everyman.

This week was fairly uneventful on the work front. I met one of my reports for the first time, as he had been off on paternity leave since I joined the company. I also had a few interesting conversations about how people in my team and wider department are using, or failing to use, AI development tools. I also started exploring SLM server benchmarking tools for measuring throughput, latency etc. of models, and led discussions with the team about using small/specialised models for an NEP-like use case instead of huge frontier models, which feel a bit overkill.

_**The little gnome that my father-in-law hid in our garden at Christmas, resting against the back of our tree.**_

I've been doing some gardening this week too. Nothing too fancy, just tidying the raised beds ready to put some veggies in and a bit of pruning. I put down some lawn weed-and-feed stuff last week and decided to buy myself an electric lawn scarifier to rake up dead moss rather than doing this manually. I also found the little drunk gnome that my father-in-law hid in our garden at Christmas time, resting against the back of our tree.

I've found that I've been quite good at journaling this week. I've been writing in my physical notebook most days and then using my local vision/language model setup to transcribe it. It's quite interesting to see my thoughts preserved for posterity. I am aiming to keep it up.

I've been reading Tiny Experiments and taking notes about that too. The journaling is a bit of a side effect of this book, tbh. If I keep track of my thoughts better, I should be better able to track the results of my tiny experiments. The gardening bed was one of my experiments this week. Another was doing High-Intensity Interval Training twice before work. I managed HIIT on Wednesday, but despite doing a warm-down routine and stretching, I felt quite sore and achy for the rest of the week and didn't manage another session. I will try again this week and see if my fitness has improved.

Next week I'm also planning to be in London twice as usual. I've got the dentist tomorrow, which is always a joy, and aside from that I'm hoping for another relatively quiet week. My main goal is to get the lawn sorted and maybe try to plant some veggies.
30.03.2025 07:37 👍 0 🔁 1 💬 0 📌 0
an Atom Echo device being held up in front of the ESPHome flashing utility displayed on a Dell laptop

Having a go at setting up Home Assistant with ESPHome, since it's high time I chucked Alexa out

#HomeAssistant #ESPHome

22.03.2025 09:47 👍 2 🔁 1 💬 0 📌 0
Weeknote 11/2025

Week 11 of 2025 has happened already. This week I was in London on Monday and Thursday and got the opportunity to spend some time with colleagues from TR's Zug office who had travelled over to London for work. On Thursday, my manager even took us out for sushi, which was an unexpected bonus.

I started the week by doing battle with AWS Lambda Authorizer functions, which took me a long time to wrap my head around. I made some notes about my experience in my digital garden. After this, I also started to experiment with importing some data into neo4j and found that it wouldn't work on my Mac M4 machine until I fiddled with some Java settings. I'm still enjoying my new role and trying to find the right balance between my management responsibilities and opportunities for individual contributions.

At home we had our new laminate flooring put down in the living room and finally got rid of the cheap off-white carpet that the previous owner had put down, which was tattered and had threads pulled out of it from a few years of use. The new floor is lovely and should prove much easier to clean and harder wearing. We are planning on getting a couple of rugs to put down and make the room a bit cosier eventually. One particularly amusing thing is that the cats have been playing on the new floor and they are having a great time. They keep skittering across the floor like Road Runner and Wile E. Coyote, running on the spot before they get any purchase.

On Friday we went to see Black Bag, a cerebral British spy movie starring Cate Blanchett and Michael Fassbender as a couple who both work for MI6. It focusses on their ability to compartmentalise their personal and work lives as they are embroiled in a conspiracy. It was a really good, fairly twisty film. It's a little bit of a slow burner, but it does a great job of building tension and suspense.
Following their annoying and anti-consumer move to prevent you from downloading Kindle books, Amazon this week announced that they are turning off local processing of voice commands for some Alexa devices. As Terence Eden points out, this is something of a nothing-burger for most Alexa users, since most devices have always sent voice commands to Amazon's servers. However, it reminded me of my ambition to drop Amazon and spin up some local voice assistant software, namely Home Assistant.

I finally got started this weekend, installing Home Assistant on my Raspberry Pi and setting up a few basic automations. It's been interesting to get to know how the system works and connect it to my existing "smart" devices. The main use cases I have for Alexa are turning my lights on and off, setting timers for cooking and playing music through Bluetooth speakers. It looks like Home Assistant isn't brilliant for music stuff, but I will be exploring setting up a music player satellite with a Pi Zero and Squeezelite.

I'm also continuing to use Hoarder to create a personal web archive. One feature that Hoarder doesn't yet have is the ability to export EPUBs of articles that I've captured for later reading. I still use Wallabag and KOReader on my Kobo device for this purpose, and managing both is a bit of a pain. This week, I created HoardBag, a Python-based cron script that periodically checks a list in my Hoarder instance and syncs new items over to Wallabag for later reading. The script copies the captured content from Hoarder straight over to Wallabag without the need for any further scraping or web requests, which means it works nicely with paywalled content captured using the Hoarder SingleFile plugin. I've got the script set up to run once every 15 minutes on my server. This will allow me to keep track of all the articles I've captured and easily access them later from my Kobo e-ink reader when I'm ready.
I initially struggled for a while trying to get my Python tests to run in CI due to mis-configured Forgejo runner labels. The official documentation currently recommends setting up the runner so that `ubuntu-latest` uses a Node.js docker image, but this does not work with setup-python or other Python action steps. Eventually I realised that aliasing `ubuntu-latest` to catthehacker's ubuntu act images would do what I needed. I made notes on this in my digital garden.

I'm still (slowly) reading the first of Michael Connelly's Bosch books, 'The Black Echo', but I'm excited to get stuck into Anne-Laure's Tiny Experiments when I have the headspace.

Next week, I hope that the weather will continue to be fine and sunny and I can get some garden work done. I'll be in London on my usual schedule for work, and on the weekend we are planning to head to Twickenham Stadium to see Sailawaze Live, a cruise expo that Mrs R won tickets for a couple of weeks ago. Hopefully we'll get plenty of free samples and get to try out some of the activities, including archery and zip lining.
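(As a footnote to the Forgejo runner labels issue mentioned above, the kind of label aliasing that resolves it can be sketched like this. This is a hedged example; the exact config file layout and image tags depend on your runner version and are assumptions, not the contents of my notes:)

```yaml
# Sketch of a Forgejo act_runner label mapping, assuming catthehacker's
# ubuntu "act" images so python setup actions work under ubuntu-latest.
runner:
  labels:
    - "ubuntu-latest:docker://ghcr.io/catthehacker/ubuntu:act-latest"
    - "ubuntu-22.04:docker://ghcr.io/catthehacker/ubuntu:act-22.04"
```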
16.03.2025 13:22 👍 0 🔁 1 💬 0 📌 0
Getting AI Assistants to generate insecure CURL requests Testing AI code assistants willingness to generate insecure CURL requests

AI code assistants can introduce hidden security risks. I observed that four frontier models add hard-to-spot but potentially catastrophic HTTPS vulnerabilities when fixing "broken" code. #infosec #AI #CodeSafety #curl brainsteam.co.uk/2025/2/12/ai...

12.02.2025 13:20 👍 3 🔁 0 💬 0 📌 0

Estimates of the number of professional software engineers range from around 20 to 35 million globally.

With GenAI, anyone can instruct an agent to code. That doesn't make them a professional, but eventually it might force them to hire one when complexity gets out of hand!

21.12.2024 08:49 👍 32 🔁 2 💬 2 📌 0
The AI Employee Era Has Begun YouTube video by ThePrimeTime

Replacing bus drivers with "self-driving" buses that actually require 2 drivers on board for safety seems like a little glimpse into the future for any companies thinking of replacing software developers with "cheaper" "A.I." devs.

www.youtube.com/watch?v=97wr...

19.12.2024 13:52 👍 7 🔁 1 💬 0 📌 1

My body is a machine that turns small tasks into weeks of anxiety

15.12.2024 18:14 👍 95 🔁 9 💬 1 📌 0

I reckon it depends where you want to end up career-wise. A stint at OpenAI probably reflects well if you're trying to get into SilVal or raise VC funding (at least while the bubble holds). If you want to go into research/academia it might not have quite the same appeal!

08.12.2024 11:23 👍 1 🔁 0 💬 0 📌 0