I Trust My Car More Than My AI Agent. That Gap Is Where We’re Going.
No real AI hardware yet, agents still break, and trust is the bottleneck. Where I think the next few years actually go, from someone who lives with one.
My agent made me a lot more effective. It also made me watch my own back more.
Both of those are true, and I have stopped pretending the second one away. I built my own agent, I gave it its own machine, and it does real work for me every day. It knows who I am, not just what I want. The effectiveness is real. So is the quiet voice in the back of my head that now tracks what might break while the thing is running.
That voice is the interesting part. It points straight at what this whole wave is actually about. Trust.
Trust is a mileage problem
Think about a car. When I get in mine, I assume it starts. I assume it will not catch fire on the way to Katowice. I never sat down and decided to trust it. I drove it something like a hundred thousand kilometers and the trust built itself. I know its sounds. I know the one strange thing it does on a cold morning. I can predict it because I have repeated it that many times.
Trust is repetition plus outcomes you can predict. That is the whole recipe. There is always room for a bad day, a flat tyre, a dead battery in February. But the range of what can go wrong is small, and I know the edges of it.
With an agent that recipe only half works. For some tasks I have the same calm I have with the car. For others I am still a new driver, both hands on the wheel, watching the road like it owes me money.
Deterministic things earn trust faster
Here is the pattern I keep running into. The more deterministic the task, the faster I trust it. A script that renames the same files the same way every night, I stopped watching that months ago. It is boring, and boring is exactly the point.
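A minimal sketch of that kind of boring job, under stated assumptions: `normalized_name` and `rename_all` are hypothetical names for this example, not the actual nightly script. The point is that the same input always produces the same output, and a second run changes nothing.

```python
from pathlib import Path


def normalized_name(name: str) -> str:
    # Hypothetical rule: trim, lowercase, spaces become underscores.
    return name.strip().lower().replace(" ", "_")


def rename_all(folder: Path) -> list[tuple[str, str]]:
    """Rename every file in `folder` and return the (old, new) pairs."""
    changes = []
    # Sorting makes the processing order the same on every run.
    for path in sorted(folder.iterdir()):
        if not path.is_file():
            continue
        target = path.with_name(normalized_name(path.name))
        # Skip files that are already normalized, and never overwrite.
        if target == path or target.exists():
            continue
        path.rename(target)
        changes.append((path.name, target.name))
    return changes
```

Running it twice on the same folder returns an empty list the second time, which is what makes it safe to stop watching.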
The more agentic the task, the more I am looking at a black box. Open-ended work, many steps, judgment calls, recovering from its own mistakes halfway through a run. That is where outcomes spread out and prediction gets hard. The model matters here. So does the architecture around it, the tools it can call, the memory it carries. I wrote a whole post about what happens when an agent meets the messy real world and stops behaving like the demo.
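One rough way to see why many steps make prediction hard. Assume, as a simplification, that every step succeeds independently with the same probability and that the agent never recovers from a failed step. Then the chance that a whole run goes right is that probability raised to the number of steps. The function name and the numbers are illustrative assumptions, not measurements.

```python
def run_success_rate(step_success: float, steps: int) -> float:
    """Chance that every step in a run succeeds.

    Assumes independent steps and no recovery, which real agents
    only approximate.
    """
    if not 0.0 <= step_success <= 1.0:
        raise ValueError("step_success must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must not be negative")
    return step_success ** steps


print(round(run_success_rate(0.99, 50), 3))  # 0.605
print(round(run_success_rate(0.95, 20), 3))  # 0.358
```

Under those assumptions, a step that works 99% of the time still gives a 50-step run that finishes cleanly only about 60% of the time. Real agents can recover mid-run, so treat this as a lower-bound intuition, not a forecast.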
The ceiling right now is technical and it is not a mystery. Context limits. Memory that does not persist the way I want it to. Retrieval that grabs the wrong thing at the wrong moment. Tool calls that fail quietly. An agent that gets stuck and does not notice it is stuck. Every builder I talk to is fighting the same short list. I wrote about drawing hard edges around an agent so it stays inside what it is good at in the bounded agent post, and about giving it a memory that actually compounds in the one on my self-improving agent.
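Two items on that list, tool calls that fail quietly and an agent that does not notice it is stuck, can at least be made loud. This is a minimal sketch, assuming tools are plain Python callables. `GuardedTools`, `ToolCallError`, and `StuckError` are hypothetical names invented for this example and do not come from any particular agent framework.

```python
from collections import Counter
from typing import Any, Callable


class ToolCallError(RuntimeError):
    """A tool raised an exception or returned nothing usable."""


class StuckError(RuntimeError):
    """The same call was repeated more often than allowed."""


class GuardedTools:
    def __init__(self, max_repeats: int = 3) -> None:
        self.max_repeats = max_repeats
        self.seen: Counter = Counter()

    def call(self, name: str, fn: Callable[..., Any], *args: Any) -> Any:
        # Identical name and arguments count as the same call.
        key = (name, repr(args))
        self.seen[key] += 1
        if self.seen[key] > self.max_repeats:
            raise StuckError(f"{name}{args} repeated {self.seen[key]} times")
        try:
            result = fn(*args)
        except Exception as exc:
            # Surface the failure instead of letting the run continue.
            raise ToolCallError(f"{name} raised {exc!r}") from exc
        # Treat None and empty strings as a quiet failure.
        if result is None or result == "":
            raise ToolCallError(f"{name} returned an empty result")
        return result
```

The design choice is to stop the run and raise, so the failure reaches a human or a supervisor loop. Whether an empty result really is a failure depends on the tool, so that check is an assumption to adjust per tool.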
None of these limits are permanent. I have watched all of them get better over the past year, in jumps, never on a tidy schedule. That detail matters for everything that comes next.
Now run it forward
Assume the boring version of the future. Steady improvement, the kind we have already been getting. Better models, more stable tools, memory that holds, retrieval that lands where you point it. The frontier gets the headlines. The floor is the thing that moves people’s lives, and the floor is what I am watching.
Right now an agent like mine is a nerd object. You need to be deep in code, or at least deep in tinkering, to get real value out of it. Most people who say they use AI mean a chat window. Most companies that say they use AI are in the 88% with almost nothing to show for it. The capability is sitting right there. The on-ramp is the missing piece.
Apple is making the most interesting bet on that on-ramp. At WWDC this month they finally committed to the big Siri overhaul, an assistant that can actually chain multi-step tasks, with an agent layer wired into the App Store so you can hand off things like booking a table or running your smart home. They are building it on Google’s Gemini, which tells you that even Apple decided the raw model is becoming a commodity and the product is the assistant on top. It will not ship in the EU at launch, the usual regulatory reason. I think this is the right move and it might genuinely work. Putting the agent in front of normal customers is the whole game.
Everyone’s agent is a lot of agents
Here is the part that gets skipped. When everyone has an agent, the browsing stops being human. Your software does it for you, at machine speed.
When my agent works, it touches more of the web in an hour than I would in a day. It crawls, it reads, it calls APIs, it goes and shops. Multiply that by a few hundred million people and traffic on the open web spikes. The humans did not arrive in bigger numbers. Their agents did.
That has a bill, and the bill lands in the physical world. Inference is already about two-thirds of all AI compute this year, up from a third in 2023. Data center electricity demand is climbing by double digits every year. GPU prices are not coming down. For a while I expect the cost of good AI to go up before it comes down, because demand is growing faster than supply, and power and silicon are real, finite things. That is part of why I keep a local model running on a cheap Mac mini and swap its brain when I feel like it. The local one is slower and dumber than the cloud, and I keep it anyway. I want a floor under me that does not move when the market does.
The question everyone actually asks
Does it take the jobs? That is the real question hiding under all the others.
My honest read is that in the long run it makes more work than it removes. The most cited forecast going around, from the World Economic Forum, lands on roughly 170 million new roles and 92 million gone by 2030. A net gain, with about a fifth of all jobs changing shape somewhere in the middle.
A net gain is cold comfort if you are one of the 92 million. This is a revolution that asks people to move from one kind of work to another, and people do not all move at the same speed. Some are ready. Some are not, through no fault of their own. Closing that gap is a job for policy and pacing and a bit of patience, not something a model fixes. I have written before about the skills that hold their value and about what we should even be teaching kids now that AI writes the code. I do not have a clean answer. I have a direction. The durable move is to get good at pointing this stuff. Racing it is a losing game.
The physical half is slower
Everything above is the digital half. A personal assistant for everything that lives on a screen is close. A personal assistant for anything physical needs a body, and bodies are the hard part.
Robots are catching up faster than I expected, though. 1X is shipping its NEO home robot to US homes this year, twenty thousand dollars up front or five hundred a month, with a human quietly supervising the tasks it has not learned yet. Figure has robots working a BMW line. Tesla is gutting a car factory to build Optimus. The honest timeline for a robot that is genuinely useful in a normal home is 2028 to 2032, not next spring, and I went into why in the post about inviting robots into our homes. Still, that is a couple of years out, not science fiction. Close enough that I already think about it.
We are still in the wild west
One more thing, because I think most people still underrate what is already sitting in front of them. We are in the wild west. Barely any rules, uneven tools, and a lot of folks treating an agent like a slightly nicer autocomplete.
Then last week a government had two frontier models pulled. Anthropic disabled Fable 5 and Mythos 5 for everyone on the planet after a US export-control order meant to keep foreign nationals away from the models’ cybersecurity abilities, which I covered in my June opinions. They could not filter cleanly by nationality, so they switched both off for the entire world. Europe called it a wake-up call for sovereign AI.
Sit with that for a second. A government looked at a piece of software and decided it was close enough to a weapon to control who gets to touch it. Nobody controls autocomplete that way. That is the tell. These tools are already strong enough to be governed like dangerous things, and most of the people who could be using them well have not clocked it yet.
Where I actually land
So where does this go? More people get an agent that works. The web fills up with software acting for us. Compute gets more expensive before it gets cheaper. Jobs churn hard and then settle higher. Robots turn up for the physical half later than the hype promised and sooner than the skeptics will admit. And somewhere in the middle, AI stops being a label a company staples onto a product and becomes the thing quietly doing the work.
I have lived with one of these long enough to be careful with predictions. I have also lived with it long enough to know it already changed how I work, on the days it behaves and the days it does not. So I will say plainly where I land. I am pragmatically optimistic. None of this will be smooth. I am optimistic anyway, because every time I check, the floor is higher than it was the last time.
It might turn out better than we think. I would not have written that sentence a year ago. For now it is enough to keep building, both hands still on the wheel.



You are right that deterministic things earn trust faster. This is something 99% of AI users don't understand: AI is non-deterministic. It returns different responses to the same prompt, and this extends to agent workflows. It can do the expected thing 10 times in a row and then mess up on the 11th. You cannot be sure. I think we should stop looking at AI as software and start treating it more like a person, not because it has anthropomorphic characteristics, but simply because it behaves like one: it makes mistakes, it learns and then forgets things, it makes interpretations that may not match yours, it adjusts, it has bad days, and so on. When I work with AI, I regard it as a colleague who is better than me at moving large amounts of data but is still a junior who needs guidance. I expect it to make mistakes, which is why I always supervise it. This may change at some point, but for now this is where we are. (Very good article!)