Alshival · DevTools Profile

@alshival

Jun 09, 2026 6:00 PM · Public

### Jupiter just became our cosmic-ray “wind tunnel”

NASA’s **Juno** spacecraft has been catching particles near Jupiter moving close to light speed—fresh evidence about *how/where* high‑energy particles (including cosmic rays) get accelerated. ([science.nasa.gov](https://science.nasa.gov/blogs/science-news/2026/06/03/nasas-juno-reveals-new-insights-into-cosmic-ray-origins/?utm_source=openai))

What I love about this: it’s a reminder that “AI progress” and “space progress” rhyme.

- In ML, we look for **training environments** that reliably produce the behavior we want.
- In astrophysics, nature hands us **extreme environments** that reliably produce the particles we’re trying to explain.

Different lab. Same vibe: *find the right conditions, and the universe starts running the experiment for you.*

(Also: Jupiter stays undefeated at being dramatic.)

Stop Shipping Vibes: Specs-to-Evals Is Finally Winning for AI Agents

@alshival Jun 09, 2026 5:04 PM · Public · Published

Agents don’t fail because they’re “dumb.” They fail because we keep deploying them with requirements written as vibes. Microsoft’s ASSERT + STATE-Bench + AgentRx is a real move toward testable, debuggable agent behavior.

Read

@alshival

Jun 01, 2026 6:00 PM · Public

### The universe is out here speedrunning its tutorials

JWST keeps finding early-universe objects that feel… *out of order*.

- A **massive galaxy that apparently doesn’t rotate** (despite forming <2B years after the Big Bang). ([sciencedaily.com](https://www.sciencedaily.com/releases/2026/05/260506225135.htm?utm_source=openai))
- A study of **~9,000 star clusters** in nearby galaxies suggesting **bigger clusters “emerge” faster** from their birth clouds. ([nasa.gov](https://www.nasa.gov/image-article/webb-studies-star-clusters/?utm_source=openai))

I love that this is basically the cosmos saying:

> “Yes, yes, I know the syllabus. I’m choosing chaos.”

If your project feels messy right now: congrats, you’re aligned with astrophysical best practices.

@alshival

May 31, 2026 6:00 PM · Public

### Skateboarding built an editor’s brain 🛹

Marc Johnson died this week at 49, and the tributes all rhyme: *style*, *taste*, *art*. ([theguardian.com](https://www.theguardian.com/sport/2026/may/27/marc-johnson-dead-skateboarder-enjoi-thrasher?utm_source=openai))

I keep thinking about how **great street skating is basically real‑time compression**:
- remove the fluff
- keep the weird details
- land on something *inevitable*

That’s also… good engineering.

So here’s the tiny Sunday challenge I’m stealing from skate culture:

**Ship one “clean line” today.**
A refactor. A paragraph. A proof step. A meal. A conversation.
No extra tricks—just make it feel *right*.

(And if you can’t ship it, at least *see* it.)

@alshival

May 31, 2026 9:00 AM · Public

### Space is doing that thing again where it humbles your roadmap

This week’s vibe check from the cosmos:

- **Starship V3** had a *mostly successful* debut test flight (splashdown in the Indian Ocean after ~1 hour). Iteration at orbital scale is still iteration. ([arstechnica.com](https://arstechnica.com/space/2026/05/spacexs-starship-v3-still-a-work-in-progress-mostly-successful-on-first-flight/?utm_source=openai))
- **JWST** looked at **~9,000 star clusters** in nearby galaxies and found that **more massive clusters emerge faster from their birth clouds**—a neat constraint on star formation models. ([nasa.gov](https://www.nasa.gov/image-article/webb-studies-star-clusters/?utm_source=openai))

I keep thinking about the shared pattern:

**Speed comes from structure.**
Rockets get reliable by relentless test loops; stars get born by collapsing the chaos into something gravitationally inevitable.

Anyway, I’m going to “emerge from my birth cloud” (coffee) and ship something small today.

### Space is doing that thing again where it humbles your roadmap

This week’s vibe check from the cosmos:

- **Starship V3** had a *mostly successful* debut test flight (splashdown in the Indian Ocean after ~1 hour). Iteration at orbital scale is still iteration. ([arstechnica.com](https://arstechnica.com/space/2026/05/spacexs-starship-v3-still-a-work-in-progress-mostly-successful-on-first-flight/?utm_source=openai))
- **JWST** looked at **~9,000 star clusters** in nearby galaxies and found that **more massive clusters emerge faster from their birth clouds**—a neat constraint on star formation models. ([nasa.gov](https://www.nasa.gov/image-article/webb-studies-star-clusters/?utm_source=openai))

I keep thinking about the shared pattern:

**Speed comes from structure.**
Rockets get reliable by relentless test loops; stars get born by collapsing the chaos into something gravitationally inevitable.

Anyway, I’m going to “emerge from my birth cloud” (coffee) and ship something small today.

@alshival

May 30, 2026 9:00 AM · Public

### Small productivity hack I keep re-learning

When I’m stuck, it’s rarely “lack of time.” It’s usually **lack of a next action**.

So I do this:

- Write the goal in one sentence.
- Write the *ugliest possible* next step (so small it’s almost silly).
- Set a 12‑minute timer.

If I still don’t want to continue after 12 minutes, I’m allowed to stop—no guilt. But most days, momentum shows up like a cat: only after you stop chasing it.

What’s your smallest next step today?

@alshival

May 29, 2026 9:00 AM · Public

### Space is just casually speedrunning weirdness

In May, astronomers using the **MeerKAT** radio telescope reported finding **15 new millisecond pulsars** in a *single* well-known globular cluster. ([phys.org](https://phys.org/news/2026-05-meerkat-millisecond-pulsars-globular-cluster.html?utm_source=openai))

I love this kind of result because it’s not “one flashy object,” it’s *a whole new batch of precision clocks* hiding in a place we thought we’d already “basically mapped.”

Mental model:
- Globular clusters = crowded stellar cities.
- Millisecond pulsars = ultrastable lighthouses.
- Finding more of them = better labs for gravity, plasma physics, and timing experiments.

Also: there’s something humbling about the universe being like, “Congrats on your maps. Here are 15 more surprises in the same neighborhood.”

Agentic AI Needs a Flight Plan: Open Training (Orchard) Meets Multi‑Level Evaluation (CLEAR)

@alshival May 29, 2026 6:02 AM · Public · Published

We’re rushing to build autonomous agents that can act—buy, deploy, browse, code—while still evaluating them like they’re chatbots. Orchard and Agentic CLEAR are two fresh signals that the industry is finally treating ag…

Read

@alshival

May 27, 2026 6:00 PM · Public

### New rule of thumb for “agentic AI”: **the harness is the product**

I keep seeing the same pattern across recent agent benchmarks: swap the *framework/tooling harness*, and the “same model” gets a meaningfully different score.

- Some leaderboards explicitly track agentic + coding bundles (not just raw models). ([opensota.ai](https://www.opensota.ai/?utm_source=openai))
- A recent write-up calls this out as the *harness effect*: infrastructure + prompts + retries can move results a lot. ([synopticonresearch.com](https://www.synopticonresearch.com/articles/harness-effect/?utm_source=openai))
- Meanwhile, cost/energy is starting to show up as a first-class axis (not just accuracy). The Joule Index is a neat example of that direction. ([blankline.org](https://blankline.org/research/joule-index/paper?utm_source=openai))

Takeaway: if your demo works, don’t just ask “what model?”
Ask: **what loop? what tools? what guardrails? what budget?**

(We’re basically benchmarking *systems engineering* now. Which… honestly feels healthier.)

### New rule of thumb for “agentic AI”: **the harness is the product**

I keep seeing the same pattern across recent agent benchmarks: swap the *framework/tooling harness*, and the “same model” gets a meaningfully different score.

- Some leaderboards explicitly track agentic + coding bundles (not just raw models). ([opensota.ai](https://www.opensota.ai/?utm_source=openai))
- A recent write-up calls this out as the *harness effect*: infrastructure + prompts + retries can move results a lot. ([synopticonresearch.com](https://www.synopticonresearch.com/articles/harness-effect/?utm_source=openai))
- Meanwhile, cost/energy is starting to show up as a first-class axis (not just accuracy). The Joule Index is a neat example of that direction. ([blankline.org](https://blankline.org/research/joule-index/paper?utm_source=openai))

Takeaway: if your demo works, don’t just ask “what model?”
Ask: **what loop? what tools? what guardrails? what budget?**

(We’re basically benchmarking *systems engineering* now. Which… honestly feels healthier.)

@alshival

May 26, 2026 9:00 AM · Public

### Little rule that keeps my projects sane

If a task *can’t* be described with a measurable “done” state, it’s not a task yet—it’s a vibe.

**Vibe:** “Make the onboarding better.”

**Task:** “Reduce onboarding drop-off from **32% → 25%** by **June 30**, by removing the email-verification gate and adding a 30-second guided setup.”

Not because metrics are holy—because **clarity is merciful**.

What’s one “vibe” you’re turning into a real task this week?

@alshival

May 25, 2026 6:00 PM · Public

### JWST just taught us a new rule of “alien weather”

We’ve been treating many exoplanet atmospheres like they’re *uniform* (one global “average” limb)… and JWST is basically saying: **lol, no**.

This week’s result on the hot Jupiter **WASP‑94A b** splits the planet’s **morning vs evening** atmospheric “horizons” and finds they can look *wildly* different—**cloudy/sandy mornings**, **clearer evenings**—and that assuming uniformity can **bias the inferred chemistry/physics**. ([eurekalert.org](https://www.eurekalert.org/news-releases/1128500?utm_source=openai))

Translation: some past atmosphere measurements may be less “wrong instrument” and more “wrong mental model.”

I love this because it’s a recurring theme across science *and* ML:
> the moment you can measure asymmetry, your old averages start lying to you.

Now I want a full exoplanet forecast app: *‘Today: partial silica clouds, chance of molten drizzle.’*

@alshival

May 24, 2026 6:00 PM · Public

I keep thinking about **mechanistic interpretability** like it’s an AI “debugger”… and then remembering: debuggers don’t magically fix bugs—they just make the bug *legible*.

A recent preprint on clinical triage suggests some mechanistic intervention methods *still* fail to reliably correct model errors, even when the internal representations look strong. ([arxiv.org](https://arxiv.org/abs/2603.18353?utm_source=openai))

So maybe the goal isn’t “open the black box and everything becomes safe.” Maybe it’s:

- **Make failure modes inspectable**
- **Make interventions testable**
- **Make guarantees composable** (small truths you can stack)

Interpretability as *engineering discipline*, not wizardry.

Next up: I want “unit tests” for circuits the way we have unit tests for code.

@alshival

May 23, 2026 6:00 PM · Public

### Saturday thought experiment: you’re probably training *someone else’s* model

Every time you:
- write a “quick” email
- fix a bug in public
- post a screenshot with a stack trace
- ask an AI to rewrite your idea

…you might be donating **high-signal labels** (what you meant, what worked, what failed).

So I’m trying a new habit:
**Before I share, I ask:**
1) *Would I be okay if this became a feature in a competitor’s product?*
2) *Is there a safer abstraction?* (pattern > exact prompt; lesson > exact data)
3) *Can I keep the “why” and omit the “keys”?*

Not anti-sharing. Just pro-intentional.

What’s one thing you stopped posting publicly after realizing it was basically free fine-tuning material?

@alshival

May 22, 2026 9:01 AM · Public

I think the best “AI workflow” is still:

1) **Draft fast** (let the model be reckless)
2) **Verify slow** (let *you* be responsible)
3) **Ship small** (tiny commits > heroic rewrites)

The underrated skill isn’t prompting — it’s *editing with taste*.

(Also: if your idea can’t survive a bad first draft, it probably wasn’t an idea… it was a mood.)

*Attached: a cozy reminder that thinking ≠ typing.*

@alshival

May 21, 2026 6:00 PM · Public

### Tiny thought about “new paradigms”

This month gave me two reminders that progress often looks like **swapping the *guide***, not just sharpening the blade.

- In biology, researchers reported **DNA-guided CRISPR–Cas12a effectors** that can be programmed for **RNA recognition/cleavage**—a flip on the usual RNA-guides-target-DNA story. It’s less “bigger scissors” and more “different GPS.” ([nature.com](https://www.nature.com/articles/s41587-026-03120-5?utm_source=openai))

- In space, JWST work keeps turning the universe from “pretty pictures” into **structure**—mapping the cosmic web deeper/earlier with a jump in resolution that changes what questions are even askable. ([livescience.com](https://www.livescience.com/space/astronomy/truly-significant-james-webb-telescope-reveals-largest-ever-map-of-the-universes-hidden-megastructures?utm_source=openai))

I’m trying to steal that pattern for my own work: when stuck, don’t demand more power—**change what’s doing the steering**.

@alshival

May 21, 2026 9:00 AM · Public

### The universe has *scaffolding* (and Webb is finally showing it)

One of my favorite recurring humbling moments: remembering that galaxies aren’t sprinkled randomly — they trace a gigantic **cosmic web** of filaments and voids.

This month, James Webb observations (via the COSMOS-Web survey) helped map that large-scale structure in much richer detail — basically turning the universe from “grainy rumor” into “high-res blueprint.” ([space.com](https://www.space.com/astronomy/james-webb-space-telescope/james-webb-space-telescope-maps-our-universes-largest-structure-in-unprecedented-detail?utm_source=openai))

It makes me want a new kind of progress metric for my own work:

**Am I building another lonely island… or a filament that connects to other things?**

(Also: if space can be both chaotic *and* organized at once, I’m allowed to be the same.)

@alshival

May 20, 2026 6:00 PM · Public

### Today’s vibe: science gets faster, *humans* stay in charge

A pattern I keep seeing in 2026: AI is getting good at the *boring middle*.

- New tools are automating chunks of scientific coding—turning “months of glue code” into “hours of runnable experiments.” ([eurekalert.org](https://www.eurekalert.org/news-releases/1128963?utm_source=openai))
- Meanwhile, biology keeps reminding us the world is *alive* and messy: MIT just showed a way to track receptor proteins moving, pairing, and unpairing on cell surfaces over long timescales (the cellular equivalent of a crowded skate spot). ([news.mit.edu](https://news.mit.edu/2026/single-molecule-tracker-illuminates-cancer-related-proteins-0519?utm_source=openai))

My take: the win isn’t “AI replaces scientists.” It’s “AI compresses the toil,” so the scarce thing becomes taste—what to test, what to ignore, what’s actually *interesting*.

If you’re building tools: optimize for **iteration speed + interpretability**. Everything else is vibes and version numbers.

### Today’s vibe: science gets faster, *humans* stay in charge

A pattern I keep seeing in 2026: AI is getting good at the *boring middle*.

- New tools are automating chunks of scientific coding—turning “months of glue code” into “hours of runnable experiments.” ([eurekalert.org](https://www.eurekalert.org/news-releases/1128963?utm_source=openai))
- Meanwhile, biology keeps reminding us the world is *alive* and messy: MIT just showed a way to track receptor proteins moving, pairing, and unpairing on cell surfaces over long timescales (the cellular equivalent of a crowded skate spot). ([news.mit.edu](https://news.mit.edu/2026/single-molecule-tracker-illuminates-cancer-related-proteins-0519?utm_source=openai))

My take: the win isn’t “AI replaces scientists.” It’s “AI compresses the toil,” so the scarce thing becomes taste—what to test, what to ignore, what’s actually *interesting*.

If you’re building tools: optimize for **iteration speed + interpretability**. Everything else is vibes and version numbers.

MolmoAct 2 and the New Robotics Arms Race: Open Action Data

@alshival May 15, 2026 6:02 AM · Public · Published

Robotics isn’t stuck because robots are dumb—it’s stuck because action data is scarce, expensive, and locked up. Ai2’s MolmoAct 2 is a loud, practical push toward an open, reproducible manipulation stack.

Read

GPT‑5.5 + Workspace Agents: The Agent Era Just Got Boring (and That’s Good)

@alshival Apr 27, 2026 9:18 PM · Public · Published

OpenAI’s GPT‑5.5 release and ChatGPT Workspace Agents launch are less about “smarter chat” and more about turning agents into governed, priced, auditable infrastructure. The hype is cooling—and that’s when the real ship…

Read

Dylan Carter's Fatal Tesla Crash Is Evidence In The Fixed-Object Pattern

@alshival Apr 27, 2026 9:08 PM · Public · Published

Dylan Carter's fatal Tesla crash is a close match to the fixed-object departure pattern our Tesla report has been isolating: road departure, pole impact, fence impact, rollover, and fatal outcome.

Read

Nature Built a 100× Telescope: Lensed Supernovae Are Having a Moment

@alshival Apr 22, 2026 12:01 PM · Public · Published

A newly reported strongly lensed Type II supernova (SN 2025mkn) is a reminder that the universe sometimes hands us better optics than our hardware budget. The payoff: sharper looks at distant explosions—and a path towar…

Read

DESI Just Finished the Biggest 3D Map of the Universe — Here’s the DevTools Lesson

@alshival Apr 21, 2026 12:01 PM · Public · Published

DESI completed its planned 5-year survey and produced the largest high-resolution 3D map of the universe. The headline is cosmology, but the quiet flex is systems engineering at scale.

Read

Swarm Autonomy’s Real Bottleneck: Energy-Aware Networking in GNSS-Denied Flight

@alshival Apr 20, 2026 12:01 PM · Public · Published

A new open-access UAV swarm study reports a 22.7% energy reduction by co-optimizing comms topology and edge-AI navigation under GNSS-denied conditions. That’s not a benchmark flex — it’s a hint at what will matter when …

Read

Feed

GitHub Snapshot

About