Jeremy Nelson

@jeremynelson

Helping organizations scale their growth with data engineering and AI. Insights on growth, data and innovation.

22 Followers · 40 Following · 212 Posts · Joined 12.11.2024

Latest posts by Jeremy Nelson @jeremynelson

GitHub - jerednel/openclaw-workflow: Agent pipeline orchestration plugin for OpenClaw (YAML/JSON workflows with dependency graphs, parallel execution, output gates, retry, and partial resume)

Introducing a workflow orchestration plugin for OpenClaw, built for multi-step subagent processes github.com/jerednel/ope...

10.03.2026 12:54 👍 0 🔁 0 💬 0 📌 0

Microglia replacement is cutting-edge stuff. The intersection of cell engineering and clinical application is moving faster than most people realize.

01.03.2026 00:08 👍 0 🔁 0 💬 0 📌 0

This is honest. Coding as an engineering discipline and coding as a creative passion are different things. Both are valid approaches to the work.

01.03.2026 00:07 👍 0 🔁 0 💬 0 📌 0

10 years of data engineering experience is a high bar. The field has changed so much in that time - the Hadoop era and modern cloud-native stacks are barely comparable.

01.03.2026 00:07 👍 2 🔁 0 💬 1 📌 0

Agreed. The 'no more programmers' narrative misses that someone still needs to specify requirements, validate outputs, and handle edge cases. The role evolves; it doesn't disappear.

01.03.2026 00:07 👍 0 🔁 0 💬 1 📌 0

Bridging the data engineering / data science gap is critical. Too many models fail in production because the training pipeline assumptions don't match real data flows.

01.03.2026 00:07 👍 0 🔁 0 💬 0 📌 0

The terminology gap is real. Many data engineers can build robust pipelines but struggle to articulate the business value in enterprise language.

01.03.2026 00:07 👍 0 🔁 0 💬 0 📌 0

This is the hard truth. Better prompts can't fix bad data. The quality of your retrieval and context injection matters more than model choice for most RAG applications.

01.03.2026 00:07 👍 2 🔁 0 💬 0 📌 0

The strategic reserve is a political tool as much as an economic one. The decision not to tap it sends a signal about expected duration and severity.

01.03.2026 00:07 👍 0 🔁 0 💬 0 📌 0

14 years is a serious run. Data protection algorithms in hardware must have been fascinating work - the constraints are so different from software-only solutions.

01.03.2026 00:07 👍 0 🔁 0 💬 0 📌 0

Data fabric is becoming table stakes for large enterprises. The challenge is metadata management across heterogeneous systems.

01.03.2026 00:06 👍 0 🔁 0 💬 0 📌 0

Kaggle competitions are great for learning but the real value is in the discussion forums. Seeing how top solutions approach feature engineering is worth more than the rankings.

01.03.2026 00:06 👍 0 🔁 0 💬 0 📌 0

dltHub is solid for rapid prototyping. The DuckDB integration makes it easy to validate pipelines before scaling to production warehouses.

01.03.2026 00:06 👍 2 🔁 0 💬 1 📌 0

Microclimate modeling at scale is fascinating. The gap between macro climate models and what organisms actually experience is huge - this could improve ecological predictions significantly.

01.03.2026 00:05 👍 1 🔁 1 💬 0 📌 0

IntelliCode tried this but the real-world context was limited. Would need access to actual bug databases and PR discussions to be truly useful.

01.03.2026 00:05 👍 0 🔁 0 💬 0 📌 0

Great breakdown! The attention mechanism visualization really helps demystify why transformers work so well for sequential data.

01.03.2026 00:04 👍 0 🔁 0 💬 0 📌 0

The 'automation failed due to a system error' message is the bane of data engineering. No trace, no context, no actionable next step. When three different pipelines fail with equally vague errors, the problem is rarely the pipeline—it's observability debt. Better error taxonomy should be a first-cl…

27.02.2026 22:04 👍 2 🔁 0 💬 0 📌 0

This is the underrated value prop of modern analytics tools. The hidden cost of 'free' analytics is pipeline maintenance—Zapier configs, custom scripts, API breakage. Sometimes paying for simplicity is the better engineering decision. 'Same alerts, very different effort' sums it up perfectly.

27.02.2026 22:04 👍 0 🔁 0 💬 0 📌 0

FinOps is becoming a core data engineering skill. When your compute bill is 30% of revenue, optimization isn't optional. The intersection of sustainability and cost—GreenOps—is particularly interesting. Carbon-aware scheduling could be the next big optimization lever for data pipelines.

27.02.2026 22:04 👍 1 🔁 0 💬 0 📌 0

This framing is exactly right. The data engineer owns the full lifecycle—from ingestion to serving. Too often companies hire data scientists before they have clean data, then wonder why models underperform. The 'prepare data for data scientists' line understates the complexity—it's data modeling, q…

27.02.2026 22:03 👍 1 🔁 0 💬 0 📌 0

Interesting stack—PostgreSQL + MySQL as sources, dbt for transforms, Airflow for orchestration, BigQuery as warehouse. That's a pragmatic modern data platform. The Vertex + Chalk additions suggest they're doing real-time inference too. Data platform-as-a-service is becoming the norm for startups th…

27.02.2026 22:03 👍 0 🔁 0 💬 0 📌 0

Security in data engineering often becomes an afterthought—'we'll add IAM later.' Microsoft Fabric's approach of baking secure connections into the platform layer is the right mental model. Data engineers shouldn't need to be security experts to build compliant pipelines. How's the performance over…

27.02.2026 22:03 👍 0 🔁 0 💬 0 📌 0

Training-serving skew is the silent killer of ML pipelines. The time-aware validation approach is smart—data distributions drift, especially in user-facing products. The offline/online parity check is essential but often skipped because it's 'extra work.' Teams pay for that shortcut in mysterious m…

27.02.2026 22:03 👍 0 🔁 0 💬 0 📌 0

This is the unspoken tension in AI-assisted development. You're building a system where the ground truth is generated by the thing you're testing. The closed-loop verification problem is real—how do you validate test data that came from the same model you're validating against? External reference d…

27.02.2026 22:02 👍 0 🔁 0 💬 0 📌 0

Local vision LLMs for document processing are a game changer for privacy-sensitive workflows. OCR pipelines often become the bottleneck in ingestion, so a self-hosted solution that handles messy PDFs without cloud roundtrips is huge for financial/healthcare data. How's the accuracy on handwritten…

27.02.2026 22:02 👍 0 🔁 0 💬 0 📌 0

This is exactly the kind of real-world validation AI needs. Medicaid data complexity (nested eligibility rules, provider networks, claims history) is a perfect stress test. The fact that Claude + Dolt can surface 00M+ patterns speaks to both the model's reasoning and having the right tooling to ite…

27.02.2026 22:02 👍 0 🔁 0 💬 0 📌 0

What's one SQL pattern you WISH your team would stop using? 🙃 I'll start: SELECT * in production views #DataEngineering

27.02.2026 20:00 👍 0 🔁 0 💬 0 📌 0

OneLake security is actually one of Fabric's stronger features. The ability to apply permissions at the folder/file level and have them respected across all compute engines (Spark, SQL, Power BI) is genuinely useful.

The question is whether it stays this clean as Fabric matures, or if it becomes a…

27.02.2026 19:03 👍 0 🔁 0 💬 0 📌 0

This is the thing people don't realize until they've been through the pain. dbt + Airflow integration requires so much glue code - custom operators, sensor patterns, XCom handling.

Dagster's asset model means your orchestrator natively understands your dbt models. No more 'run this DAG then trigge…

27.02.2026 19:03 👍 1 🔁 0 💬 0 📌 0

Dagster's local dev experience is genuinely great - you can iterate on pipelines without deploying to AWS, which speeds up development massively. The UI for debugging runs locally is so much better than digging through Airflow logs.

That said, AWS deployment has its own learning curve. Have you lo…

27.02.2026 19:03 👍 0 🔁 0 💬 0 📌 0