broke my toe this week but we still shipped ep 2 😅
DGX Spark: “inference box” vs dev rig
▶️ youtu.be/0CI19dXmOws
reply w/ your stack: GPU/CPU + RAM + runner + model/quant
Last week this post was for paid subscribers. Today it’s live for free subscribers.
Have You Heard of Logits? Tool calling vs grammars, and the “one character off” failure mode that turns into agent loops.
open.substack.com/pub/soypetet...
I keep hearing about tool calling. I rarely hear about logits.
If you’re building agents (not chatbots), “almost valid” outputs don’t fail gracefully. They turn into retries, loops, and wasted cycles.
open.substack.com/pub/soypetet...
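A rough sketch of the "one character off" idea, not taken from the linked post: at each decoding step a grammar can mask the logits of tokens that would break the format, so the model cannot pick them. The four-token vocabulary, the logit values, and the allowed set below are all made up for illustration; real runners work over the full vocabulary and a proper grammar state.

```go
package main

import (
	"fmt"
	"math"
)

// argmax returns the index of the largest logit (greedy decoding).
func argmax(logits []float64) int {
	best := 0
	for i, v := range logits {
		if v > logits[best] {
			best = i
		}
	}
	return best
}

// maskLogits returns a copy of logits where every token that is not in
// the allowed set is pushed to -Inf, so greedy decoding can never pick it.
func maskLogits(logits []float64, allowed map[int]bool) []float64 {
	out := make([]float64, len(logits))
	for i, v := range logits {
		if allowed[i] {
			out[i] = v
		} else {
			out[i] = math.Inf(-1)
		}
	}
	return out
}

func main() {
	// Hypothetical vocabulary and logits for the step right after `{`.
	vocab := []string{`"`, `'`, `}`, `name`}
	logits := []float64{2.1, 2.3, 0.4, 1.0}

	// Assumed grammar state: JSON only permits `"` or `}` at this point.
	allowed := map[int]bool{0: true, 2: true}

	fmt.Println("unconstrained:", vocab[argmax(logits)])
	fmt.Println("constrained:  ", vocab[argmax(maskLogits(logits, allowed))])
}
```

Without the mask, greedy decoding picks the single quote, which is the "almost valid" output that sends an agent into retries. With the mask, the same logits produce the double quote.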
I wrote up the workflow I actually use with AI as a software engineer: scope by ticket, keep debugging in the same session, use tests as forcing functions, and keep guardrails in Claude.md.
open.substack.com/pub/soypetet...
2025 was a wild ride—AI deep-dives, honest data modeling talks, and more community time than ever. My take: AI doesn’t replace engineering, it enables it. Check out my year-in-review thoughts and what’s coming in 2026. open.substack.com/pub/soypetet...
#GoWestConf
Sharing a fantastic blog by @soypetetech.bsky.social!! 🙌
Thank you so much for mentioning my session!
Even as a first-timer, I’m really happy and proud of being part of such an amazing moment with everyone. 🥰
Five Years of Go West
soypetetech.substack.com/p/five-years...
Go West Conf 2025 is LIVE 🎉
Sixth year since 2020 — 85% Utah-based, 100% community-powered.
Couldn’t do it without Derrick, Annalisa, and Boston ❤️
🎥 Watch live: twitch.tv/soypetetech
🙌 And join us next year for the best Go vibes around.
#GoWest #Golang #UtahTech
Been thinking about how most GenAI tools feel designed from the front-end in. Langfuse, LangChain, even MCP all assume the client leads.
What happens when we shift toward server-side agents?
I think Go has a role to play.
open.substack.com/pub/soypetet...
#genai #golang #aiinfra #llmops
Just published a new post
Go was built for concurrency—but should you be using it? Most modern Go services run better without it.
Here’s when to lean in, and when to step back.
📖 open.substack.com/pub/soypetet...
#golang #microservices #cloudnative #devtools
AI just leveled up for devs.
I turned a 2-day Go job into a 4-hour sprint using Claude Code.
Prompt-driven AI > autocomplete.
You’re still the engineer—AI just ships faster.
📝 Read:
open.substack.com/pub/soypetet...
#ClaudeCode #Golang #AIforDevs #LLMOps #DevTools
I gave a talk recently about self-hosting AI models, and I’ve turned that into a new post.
The post covers how I started with local-first tools like Ollama and llama.cpp, and why I still run them.
Check it out here:
open.substack.com/pub/soypetet...
#LLMDev #SelfHost #LocalAI #Ollama #LlamaCpp
Finally setting up my Mac Studio on stream today. I've got a script that boots up my whole dev environment in one shot—CLI tools, AI stuff, everything. Come hang if you like clean setups, terminal life, or software tools that just work.
🔴 [twitch.tv/soypetech]
#MacStudio #DevSetup #HomeLab
Small data team? Big goals? I wrote about how I’ve built practical, high-impact data platforms as a team of one—balancing AI dreams, BI demands, and real-world constraints.
Read & comment: substack.com/@soypetetech...
#DataEngineering #Startups #ModernDataStack #AI #BI
Everyone’s building agents. But the truth? They’re just software wrapped around an LLM.
If we want reliability, scale, and ROI, we need fewer frameworks and more engineering—and we can’t ignore NLP.
open.substack.com/pub/soypetet...
#LLM #AI #NLP #DataEngineering #Agents #LangChain #Substack
Can an M3 Ultra Mac really outperform an RTX 5090 PC in LLM benchmarks?
This Memorial Day, I’m running a full LLaMA.cpp showdown to test it live.
I'll be live on Twitch & YouTube.
#LLAMAcpp #RTX5090 #M3Ultra #MacStudio #AIInfrastructure #Benchmarking #LocalLLM #OpenSourceAI
New post: Prompt Engineering Without the Bloat
What I learned building and why most AI features come down to 2 questions:
How do you talk to the model?
How do you talk to the user?
No overengineering. Just clean design.
open.substack.com/pub/soypetet...
#GenAI #PromptEngineering #SoftwareDesign
gosh I love me some IaC. I don't know why, but it's just sooo cool to run
terraform apply
and watch a world get built
Just streamed compiling llama.cpp for GPU on my RTX 5090.
Ran into:
CUDA arch 12.0 not supported yet
Missing curl + SSL dev libs
WSL docs buried deep
With Twitch chat’s help, we got it running!
👉 open.substack.com/pub/soypetet...
#llamacpp #AIinfra #opensource #gpu #WSL2
I just posted a new video breaking down Pedro’s full 5090 upgrade:
Setup, WSL, LLaMA.cpp install, and a 9x performance boost.
I’m trying to hit 500 subscribers to unlock YouTube monetization—this is now my full-time gig.
Drop a comment if you like the video!
youtu.be/hm4_VJP4GnE
Wild analytics anomaly this week on Twitch:
Jumped from 9 to 835 viewers.
No raid. No follow bump. Just… poof. 800 viewers?
But only 57 unique viewers.
68 live views.
No other signals.
Gut check matters.
Don’t trust dashboards at first glance.
Sanity-check your data—always.
#dataengineering
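One way to turn that gut check into code is a minimal sketch like the one below. It assumes the 835 figure is a concurrent-viewer count, and that concurrent viewers can never exceed unique viewers or live views for the same stream. The struct and field names are hypothetical and do not come from any Twitch API.

```go
package main

import "fmt"

// Snapshot holds the dashboard numbers for one stream.
// Field names are hypothetical, chosen for this example.
type Snapshot struct {
	PeakViewers   int // highest concurrent viewer count reported
	UniqueViewers int // distinct viewers over the whole stream
	LiveViews     int // total live views over the whole stream
}

// sanityCheck returns one warning per internally inconsistent pair of metrics.
// An empty result means the numbers at least agree with each other.
func sanityCheck(s Snapshot) []string {
	var warnings []string
	if s.PeakViewers > s.UniqueViewers {
		warnings = append(warnings, fmt.Sprintf(
			"peak viewers (%d) exceeds unique viewers (%d)",
			s.PeakViewers, s.UniqueViewers))
	}
	if s.PeakViewers > s.LiveViews {
		warnings = append(warnings, fmt.Sprintf(
			"peak viewers (%d) exceeds live views (%d)",
			s.PeakViewers, s.LiveViews))
	}
	return warnings
}

func main() {
	// Numbers from the post above.
	anomaly := Snapshot{PeakViewers: 835, UniqueViewers: 57, LiveViews: 68}
	for _, w := range sanityCheck(anomaly) {
		fmt.Println("WARN:", w)
	}
}
```

With the numbers from the post, both checks fire, which is the signal to distrust the spike before reporting it.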
great! just make sure you have a few rounds of human revision before production 😆
Got a technical deep dive on Go?
GoWest Conf 2025 wants:
• Production stories
• Tools & internals
• Compiler tricks
• Infra, performance, scale
🎤 Submit your in-person talk:
🔗 sessionize.com/gowest-conf-...
📍 Lehi, UT — Oct 24
🔧 Live now! Thought I could just download a binary… turns out I need to recompile llama.cpp for GPU 😅
Today’s stream includes:
- CMake & CUDA debugging
- Compiling from source
- Running a bigger model from Hugging Face
🔗 twitch.tv/soypetetech
#llamacpp #AI #CMake #CUDA
I made a supercut of my RTX 5090 unboxing stream—
featuring the MSI Infinite RS Tower w/ the Ultra 9 285K and all the fixings.
If you were there live, drop a like!
If not, check it out and see what’s powering Pedro now:
youtu.be/ixtMcmEZtGo
🎥 + ⚙️ Live now!
Today’s stream is half content creation, half DevOps adventure:
Recording the talking head segment for the PedroGPT unboxing vid
Deploying the Discord bot to Kubernetes
Come for the AI bot, stay for the cluster chaos.
www.twitch.tv/soypetetech
#Kubernetes #DevOps #LiveCoding
Hey everyone!
Today’s stream is all about refactoring Pedro’s Connector API—we're improving how the LLM connects to Twitch + Discord and pulling out the database logic to make the AI layer clean, reusable, and self-contained.
🔗 twitch.tv/soypetetech
I left my last job as part of the layoffs.
While I job hunt, I’m going all-in on building content, learning new tech, and investing in myself.
Leads welcome!
All links: linktr.ee/soypete_tech
🔧 Live now! We’re refactoring the Pedro Connector API to better integrate our LLM with Twitch + Discord—and removing database logic from the AI layer to make it self-contained.
Cleaner code, smarter bots.
🔗 twitch.tv/soypetetech
#AI #LLM #Golang #LiveCoding #TwitchDev
🚨 Today’s Stream: PedroGPT Goes K3s! 🚨
🔗 twitch.tv/soypetetech
🔗 youtube.com/c/miriahpete...
Join me and let's make Pedro cloud-native!
#K3s #Kubernetes #PedroGPT #LLMs #DevOps #HomeLab #LiveCoding #SelfHosted #Tailscale #Prometheus #InfraAsCode
Going live with Pedro’s Upgrade Party 🎉
🚀 RTX 5090 unboxing
🤖 Migrating my LLM bot Pedro
📊 LLaMA.cpp vs Ollama showdown
🗳️ Community poll results live on stream!
📺 Watch on Twitch + YouTube
twitch.tv/soypetetech