Why a Vision Pro AR Veteran Is Going Back to Phones
Before the Vision Pro existed as hardware, the demos that would define it were built on iPhones and iPads. That detail tends to get lost in conversations about headset versus phone AR, but it matters. The person who led that work also helped create Encounter Dinosaurs, the interactive prehistoric-creatures experience that shipped preinstalled on Vision Pro. He has spent the two and a half years since leaving Apple building a mobile AR app instead. That is not a retreat. It is an argument.
Drummond led Apple's Character Intelligence Team before departing in 2023. His new company is called Pixi; its app launches in the coming weeks and is designed explicitly around what phones do better than headsets. His return to phones carries more weight than a typical founder pivot precisely because he built the thing he is now arguing against for consumer use. As The Verge reported today, his team was prototyping visionOS AR experiences on iPhones and iPads before the headset was available to them internally. Mobile was not a fallback; it was the foundation.
The commercial context for his bet is pointed. IDC estimates Apple shipped roughly 45,000 Vision Pro units during the 2025 holiday quarter, and Sensor Tower data shows the company cut its Vision Pro digital marketing spend by more than 95% in the US and UK, per VR.org. At $3,499, Vision Pro was never positioned as a mass product. Those numbers confirm it has not become one.
This is a case study, not a verdict on every corner of XR. What Drummond's decision reveals is specific: for consumer AR that needs to reach ordinary people through ordinary behavior, the phone still wins. What follows examines why, what Pixi actually does about it, and where that leaves the industry's longer wager on glasses.
The distribution problem mobile AR has never solved
The central challenge for mobile augmented reality experiences has never been the technology. It has been getting people to use them more than once.
Pixi's approach to that problem is structural. A user picks an interactive character such as a cat or a robot, attaches a personal message, and sends it through iMessage or WhatsApp. The recipient opens the message and sees the character overlaid in their phone's camera view, able to tell jokes, play tic-tac-toe, or run a whack-a-mole game on whatever surface happens to be nearby. The Verge reported today that Drummond describes this as "the AR version of the email greeting card."
That framing is doing real work. Earlier mobile AR products, such as standalone character apps and location-triggered experiences, required users to seek them out, open them unprompted, and return without any social pull. The friction to first use was high, and there was no structural reason to return. Pixi routes the experience through messaging instead, which means the recipient is already engaged before the AR layer begins. Someone chose to send this to them. The social trigger is built into the delivery mechanism, not bolted on afterward.
Whether that solves the retention problem or simply lowers the barrier to a first encounter is the genuine open question. The greeting-card format has a clear social logic, but the data on whether it produces habit rather than one-off curiosity does not yet exist. Drummond is candid about the limits of a version-one product: "If you're not slightly embarrassed by the first product, you launched way too late," he told The Verge today, signaling that characters and storylines will expand post-launch. His bet is that a recipient who has a reason to open the experience, and a sender whose identity is attached to it, changes the dynamic that undercut earlier mobile AR products. That may be true. It is also the kind of hypothesis that only proves out if the experience itself earns a second look, which is why the AI underneath it matters as much as the messaging layer around it.
AR on phones vs headsets: what Drummond's move gets right
Of all the arguments for phone-first AR, the one the industry talks about least is also the most structurally interesting: social observability.
Drummond's critique of headsets for consumer entertainment is partly about loneliness. Strapping a device over your face separates you from the people in the room. "It's kind of lonely," he told The Verge today. Phone AR is inherently visible to others: someone next to you can lean over your shoulder, watch the same character, and participate without any hardware of their own. That is not a minor convenience. It is the difference between a shared experience and a solo one, and it is the property that makes Pixi's greeting-card metaphor actually work. A headset version of Pixi would collapse the concept entirely: the moment a recipient has to strap on hardware before seeing the character, the casual warmth of a sent message disappears.
The compute argument reinforces the social one. Making a character feel genuinely present requires real-time attention: facial expression recognition, object detection, context-aware responses. "In order to make a character feel like it's present, it has to pay attention," Drummond told The Verge, and he argues that kind of attention depends on on-device AI. Current iPhones can run the complex ML models that glasses and wearables, with their limited power and thermal budgets, cannot yet support. Pixi downloads custom ML models dynamically to identify objects in the environment and incorporate them into the experience. Glasses will eventually close that gap, but right now the phone is the more capable AI runtime, which is a counterintuitive fact for an industry that tends to position wearables as the smarter, more contextual platform.
Underneath both arguments is simple ubiquity. "We have phones with us all the time," Drummond said, which sounds obvious until you consider everything headset AR requires before the experience can begin: setup, teardown, a price point that reaches a narrow audience, and a device that most people do not carry with them. The experiences that justify a headset are real but specific: productivity in a focused environment, cinematic VR, spatial design review. None of those is a path to mainstream consumer AR adoption.
Glasses are the goal. The phone is the present.
Every major player in AR with a publicly stated roadmap believes the long-term form factor is glasses. The question is what to build before that infrastructure exists.
Meta CTO Andrew Bosworth wrote in late 2024 that glasses are "by far the best form factor for a truly AI-native device" and that combining AI glasses with true augmented reality is "the next big step," per the Meta blog. Apple's leaked roadmap points toward a lighter "Vision Air" and AR glasses targeting 2027, as VR.org noted. Both companies believe glasses win eventually. Neither has shipped consumer AR glasses yet.
Drummond does not actually dispute the glasses thesis. He told The Verge that Apple Watch and spectacles from Apple will eventually "do much the same job," implying the phone may evolve into a compute hub that wearables depend on rather than a device competing against them. His argument is about sequence, not destination.
The recent track record of screenless AI wearables should temper confidence in any 2027 timeline. Humane raised $230 million for its AI Pin, launched it in early 2024 at $699 plus a monthly subscription, and had its assets acquired by HP for $116 million in early 2025, per Stephen Van Tran. In other words, the company raised more capital than its assets eventually sold for. The device could not deliver enough value in the narrow contexts where it was actually usable. The problem was not the concept. It was the gap between what the hardware promised and what it could realistically do.
For developers building today, the choice of platform is not ideological. It is arithmetic. By IDC's estimate, roughly 45,000 Vision Pro units shipped in a single holiday quarter, per VR.org; that is not a distribution platform for consumer entertainment. Developers who need scale cannot wait for the wearable infrastructure (battery life, compute density, social acceptance, price) to catch up with the roadmaps that promise it. The phone is not the form factor anyone in AR is excited about in 2026. It is simply the one that exists, in enormous numbers, in people's pockets.
The industry split is less about whether glasses are good and more about what "ready" means. Drummond's two-and-a-half-year bet on mobile is a practical answer: build for the platform that actually exists, and let the wearable transition happen when the conditions support it rather than when the slides say they will.
What comes next
The next wave of consumer AR may look less like spatial computing demos and more like AI-animated characters moving through group chats. That is a smaller canvas than the industry imagined, but it has one property that headset-based mixed reality has not yet achieved: it fits inside something people already do every day.
As iPhones become more capable AI runtimes, running facial expression recognition, object detection, and contextual modeling locally, the gap between what a phone can do and what a pair of glasses can do may widen before it narrows. Glasses are constrained by the same power and thermal limits that have hampered every small wearable. That constraint does not go away because the roadmap says 2027. Apple's leaked plans and Meta's stated ambitions are real signals, but so is the Humane collapse: roadmap ambition and market readiness are different things, and the distance between them has burned serious capital before.
Pixi is one data point, and an early one. The app Drummond is launching in the coming weeks is, by his own admission, not the finished version of what he is building. What it represents is a testable thesis: that mobile augmented reality experiences distributed through existing social behavior, powered by on-device AI, and built for the moment when two people can share a screen without any extra hardware, are better positioned for mainstream traction than anything requiring a headset today.
Whether Pixi earns a permanent place in people's messaging habits, or becomes another novelty that fades after the first forward, is the test Drummond signed up for when he walked away from the headset. The answer will say something useful about whether consumer AR's next chapter is written on phones or still waiting for glasses that are ready to wear.
