
Zac Siegel

@zsiegel.com

Software Wizard. Writing about and building with Kotlin and Ollama AI models http://zsiegel.com http://youtube.com/@zsiegel87

62 Followers · 521 Following · 13 Posts · Joined 18.01.2025

Latest posts by Zac Siegel @zsiegel.com

The Questions AI Coding Agents Are Forcing Me to Ask It All Starts with People and Teams The Productivity Divide and Team Dynamics AI coding agents are delivering remarkable productivity gains to engineering teams, but how do we ensure these benefits...

The Questions AI Coding Agents Are Forcing Me to Ask

zsiegel.com/the-question...

05.06.2025 02:49 πŸ‘ 2 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0
Why 2025 Feels Like 2005: AI and the Rebirth of Personal Computing The release of Claude Code and other advanced AI software development tools marks a pivotal moment in personal computingβ€”one that feels remarkably similar to the Web 2.0 revolution of the mid-2000s. ...

Inspiring weekend cooking with Claude Code. Sharing some thoughts after spending $37 in tokens and building the beginning of my "personal" MCP server.

zsiegel.com/why-2025-fee...

26.05.2025 02:35 πŸ‘ 2 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0
An Interview with Tailscale Co-Founder and CEO Avery Pennarun An interview with Tailscale CEO and co-founder Avery Pennarun about Tailscale, and how he’s been learning to build a New Internet his whole life.

Really enjoyed this podcast with one of the founders of @tailscale.com

stratechery.com/2025/an-inte...

16.03.2025 01:56 πŸ‘ 2 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

This is such an awesome part of uv.

I used to dislike and not really understand Python build/package tooling, but uv has solved a lot of the problems I had with previous tools.
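As a rough illustration (a hypothetical project, not from the post), uv drives the whole workflow from a single pyproject.toml instead of a patchwork of venv, pip, and requirements files:

```toml
# Hypothetical minimal pyproject.toml for a uv-managed project.
# `uv sync` creates the virtualenv and installs dependencies,
# `uv run main.py` executes inside it, and `uv add requests`
# edits this file and the lockfile for you.
[project]
name = "home-ai"
version = "0.1.0"
requires-python = ">=3.12"
dependencies = ["requests>=2.32"]
```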

04.02.2025 23:01 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

Very exciting and running great via Ollama! Trying it out today as a replacement for qwen2.5-coder.

30.01.2025 17:57 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0
GitHub - zsiegel/mlx-gpt: Recreate GPT with the Apple MLX framework, guided by Andrej Karpathy

Spent some time recreating GPT alongside the video from @karpathy.bsky.social using the Apple MLX machine learning framework.

I found it to be a fun exercise and learned a ton!

github.com/zsiegel/mlx-...
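The Karpathy walkthrough starts from a character-level bigram model before building up to attention. As a framework-free sketch of that starting point (the linked repo uses Apple MLX; this toy counting version is just for illustration):

```python
from collections import Counter, defaultdict

# Minimal character-level bigram "model": count which character
# tends to follow each character in the training text.
def train_bigram(text):
    counts = defaultdict(Counter)
    for a, b in zip(text, text[1:]):
        counts[a][b] += 1
    return counts

def most_likely_next(counts, ch):
    # Greedy prediction: the most frequent follower of ch.
    return counts[ch].most_common(1)[0][0]

model = train_bigram("hello hello hello")
print(most_likely_next(model, "h"))  # prints "e" -- 'e' always follows 'h'
```

The real exercise replaces these counts with learned embeddings and attention layers, but the prediction framing is the same.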

29.01.2025 03:11 πŸ‘ 2 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0
The chart illustrates two sets of comparisons for large language models (LLMs) in multimodal and text-to-image benchmarks:

Left Panel:

Performance vs. Model Size
	β€’	X-axis: Number of LLM Parameters (in billions).
	β€’	Y-axis: Average performance on four multimodal understanding benchmarks.

Key Observations:
	β€’	Janus-Pro-7B: Achieves the highest average performance (~64) with 7 billion parameters.
	β€’	LLaVA-v1.5-7B: Performs slightly lower (~60), with similar parameters.
	β€’	TokenFlow-XL also shows notable performance at a higher parameter scale (>10B).
	β€’	Smaller models, such as Show-o and Janus-Pro-1B, have significantly reduced performance scores (~46–54).

Right Panel:

Instruction-Following Benchmarks (GenEval and DPG-Bench)
Accuracy (Y-axis):
	•	GenEval:
		•	Top-performing models: Janus-Pro-7B (80%), SDXL (~67%).
		•	Lowest-performing model: PixArt-Ξ± (48%).
	•	DPG-Bench:
		•	Best performance: Janus-Pro-7B (84.2%) and SDXL (~83.5%).
		•	Other models like Emu3-Gen (~71.1%) perform less consistently.

Key Takeaways:
	1.	Janus-Pro Family consistently outperforms other models across both understanding and generation tasks, emphasizing its robustness.
	2.	Model size correlates positively with performance in multimodal understanding tasks. However, some smaller models (e.g., LLaVA) deliver competitive results in specific benchmarks.


πŸ‹ Alert! DeepSeek Janus-Pro-7B

It’s multimodal and outperforms DALL-E and Stable Diffusion

Probably the biggest feature is its ability to generate text in an image that actually makes sense

They be cooking, I’m here for whatever is served

huggingface.co/deepseek-ai/...

27.01.2025 20:41 πŸ‘ 33 πŸ” 4 πŸ’¬ 2 πŸ“Œ 0

Very cool, just ordered one!

26.01.2025 20:37 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

Anyone with experience using n8n or LangChain?

Looking to run my own self-hosted system for automations and agents. Seems like n8n has all the right integrations, while LangChain has more AI sauce. Thoughts?

26.01.2025 03:37 πŸ‘ 1 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

The biggest difference I notice when using deepseek-r1 for coding tasks is that it examines and uses existing code so much better than previous LLMs.

That is a massive win for most developers who are working in existing large codebases.

25.01.2025 19:13 πŸ‘ 1 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

I love the idea of this and thinking about running my own MCP server to expose things both privately and publicly for various use cases.

25.01.2025 02:27 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

The extra money spent on 64GB of RAM in my M4 Max is very well spent.

Being able to run larger LLMs locally via Ollama for code assistance is wonderful.

Qwen coder and DeepSeek R1 are both excellent for my daily use cases.

25.01.2025 02:24 πŸ‘ 2 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

Been debugging an issue for 48 hours where our apps built from CI pipelines break at runtime, but when we push to the App Store manually from a local machine they work.

Finally realized our CI machines are running a different version of Xcode than what we all are developing on. 🫠🫠🫠
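One way to catch this class of drift (a hypothetical GitHub Actions fragment, not the team's actual pipeline) is to pin the Xcode version explicitly in CI rather than taking whatever the runner image defaults to:

```yaml
# Hypothetical CI steps -- the Xcode path is an assumption and must
# match a version installed on the runner image.
- name: Select pinned Xcode
  run: sudo xcode-select -s /Applications/Xcode_16.2.app
- name: Print toolchain version for the build log
  run: xcodebuild -version
```

Logging the version on every build also makes mismatches obvious the moment they appear, instead of two days later.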

25.01.2025 02:22 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

Been polishing up a "home AI" project that lets me use Llama 3.2 via Ollama to find anything in my personal documents.

Been a fun project trying out OCR tech, LLM tool calling, and different kinds of search and RAG techniques!
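The "search" half of a pipeline like this can be surprisingly small. A toy bag-of-words retriever (the documents and scoring here are made up for illustration; the actual project's Ollama/RAG stack is not shown):

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def score(query, doc):
    # Count query-term overlap, dampened by document length so
    # long documents don't win just by being long.
    q = Counter(tokenize(query))
    d = Counter(tokenize(doc))
    overlap = sum(min(q[t], d[t]) for t in q)
    return overlap / math.sqrt(len(tokenize(doc)) or 1)

docs = [
    "2023 tax return scanned by ocr",
    "car insurance policy renewal",
    "apartment lease agreement",
]
best = max(docs, key=lambda d: score("tax return", d))
print(best)  # prints "2023 tax return scanned by ocr"
```

A real RAG setup swaps this lexical score for embedding similarity and hands the top hits to the LLM as context, but the retrieve-then-rank shape is the same.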

19.01.2025 01:11 πŸ‘ 5 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0