A Concrete Evaluation Framework for LLM-Powered Pipelines | Michael Cizmar
How to stay confident in your model choices when the landscape changes every 90 days
Most #AI teams don't have an evaluation strategy. They have an evaluation event.
Tested before launch. Shipped. Never looked back.
Meanwhile 3 new models dropped this month, 2 cheaper, 1 probably better for the task.
Here's a 5-step framework to fix that: michaelcizmar.com/blog/2026/03...
#MLOps
06.03.2026 16:00
I'm already tired of hearing about the #hussleculture. It seems to project laziness and apathy. While running an award-winning small business for 20 years, I tried to instill 1 thought:
"If you are not exceptional, you are obsolete".
#leadership #oneteam #culture
www.linkedin.com/analytics/po...
04.02.2026 16:13
Purrview: The Tiny AI Project That Worked - And Why Most Don't
Most AI projects fail. Not because of missing technology.
AI projects fail because they're chartered poorly. I built #Purrview to prove the opposite... a tiny tool that detects when a cat enters frame & records it.
1) No POC.
2) No roadmap.
3) No "AI transformation."
4) Use what works
& it works. #AI isn't failing. AI projects are.
linkedin.com/pulse/purrvi...
07.12.2025 22:07
Microsoft Virtual Events Powered by Teams
"Hey Michael, this event page is pure AI slop."
Fair enough, but don't judge a book by its cover.
You, on the other hand, still have 1 hour to join 2 of the most experienced AI practitioners as they share notes from the field before it turns into actual slop:
events.teams.microsoft.com/event/cc5517...
12.11.2025 17:04
Hello @microsoft.com #ai #tour to #chicago
25.09.2025 16:31
Check out this image I made with #chatgpt5. The prompt was "generate me an image of me getting after it"
14.08.2025 10:40
At least it was strangers doing it to you versus your loved ones.
21.07.2025 19:43
#Skype is a dead product except for when you....
21.07.2025 19:42
It's a plant! Don't be fooled.
21.07.2025 14:16
Drive Requirements to Testing with BDD to Deliver AI
Successful AI Projects are Focused on Outcomes, Not Simply Outputs
Projects often get stuck between POC and production because we do not describe the behavior or the work streams we aim to prove. "Is this an image of a cat?" - Asked no one. "Is this claim something we should further review?" - Priceless.
#LLM #BDD #TDD #Agents
michaelcizmar.com/drive-requir...
10.12.2024 16:40
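One way to picture "describe the behavior first" is a tiny Given / When / Then check written in Python. This is only a sketch under loud assumptions: `needs_review` is a hypothetical keyword rule standing in for whatever LLM or agent call a real project would make, and the two claim scenarios are invented for illustration. None of it comes from the linked article.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Scenario:
    """One behavior, written as Given / When / Then."""
    given: str   # the claim text handed to the system
    when: str    # the business question being asked
    then: bool   # the answer the business expects


def needs_review(claim_text: str) -> bool:
    """Hypothetical stand-in for an LLM call (assumption, not a real API).

    A keyword rule keeps the sketch runnable without any model.
    """
    red_flags = ("no receipt", "third claim", "backdated")
    text = claim_text.lower()
    return any(flag in text for flag in red_flags)


QUESTION = "Is this claim something we should further review?"

# Invented example scenarios; a real suite would come from the business.
SCENARIOS: List[Scenario] = [
    Scenario("Water damage, no receipt provided, third claim this year", QUESTION, True),
    Scenario("Windshield chip, repair invoice attached", QUESTION, False),
]


def run_scenarios(system: Callable[[str], bool], scenarios: List[Scenario]) -> List[Scenario]:
    """Return the scenarios whose observed behavior differs from the expected one."""
    return [s for s in scenarios if system(s.given) != s.then]


failures = run_scenarios(needs_review, SCENARIOS)
print(f"{len(SCENARIOS) - len(failures)}/{len(SCENARIOS)} behaviors pass")
```

Because the scenarios are written against the outcome rather than a particular model's output, the same list can be re-run unchanged when the stand-in is swapped for a real model.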
Judging LLM Performance By Synthetic Data Is A Failing Approach - Part 1
Knock-offs are never as good as the real thing
Taking a title such as "Plastic Foodservice Film" and having your LLM create the synthetic queries is not the best way to judge your LLM's performance at finding "Seran Rap" or "saran wrap".
#LLM #AI #Relevancy
michaelcizmar.com/judging-llm-...
05.12.2024 15:51
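A rough way to see the gap, assuming simple word overlap as the measure: queries generated from a title tend to reuse the title's own words, while real shoppers may share none of them. In the sketch below the two synthetic queries are made up for illustration (they are not output from any model), and `title_overlap` and `mean_overlap` are hypothetical helper names.

```python
from typing import List, Set


def tokens(text: str) -> Set[str]:
    """Lowercase whitespace tokenization; deliberately naive."""
    return set(text.lower().split())


def title_overlap(query: str, title: str) -> float:
    """Fraction of the query's words that also appear in the title."""
    q = tokens(query)
    return len(q & tokens(title)) / len(q) if q else 0.0


def mean_overlap(queries: List[str], title: str) -> float:
    """Average title overlap across a list of queries."""
    return sum(title_overlap(q, title) for q in queries) / len(queries)


TITLE = "Plastic Foodservice Film"

# Illustrative queries of the kind a model might derive from the title (assumption).
SYNTHETIC = ["plastic film for foodservice", "foodservice plastic film roll"]

# The shopper queries quoted in the post, misspelling included.
REAL = ["Seran Rap", "saran wrap"]

print("synthetic:", mean_overlap(SYNTHETIC, TITLE))
print("real:", mean_overlap(REAL, TITLE))
```

A test set built only from the first kind of query rewards plain keyword matching and says little about whether the system can connect "saran wrap" to the product.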