Google officially declares war
Author: Chengbei Xugong. Data support: Gougu Big Data.
The 2026 Google I/O Developer Conference can be summed up in one word: arrogance.
Google not only embedded AI agents into every core traffic entry point (search, the browser, phones, and smart glasses) but also unveiled three major launches in a row: Gemini 3.5 Flash, the video model Omni, and the brand-new AI assistant Spark.
After showing off its strength, it proudly announced that Gemini's monthly active users had passed 900 million, and it officially announced steep price cuts.
The meaning is straightforward: I am stronger than you, and I am cheaper than you.
Isn't this a declaration of war?
01
The most stunning reveal at the conference was undoubtedly the debut of Gemini 3.5 Flash.
Normally, "Pro" denotes the flagship tier, while "Flash" signals a lightweight, fast model.
In parameter count, 3.5 Flash is indeed smaller than 3.1 Pro, yet on almost every reasoning and coding benchmark it performed surprisingly better:
On the GSM8K math reasoning benchmark, 3.5 Flash scored 95.8%, ahead of 3.1 Pro's 93.2%; on the full SWE-bench coding benchmark, 3.5 Flash reached a resolution rate of 38.4%, well above 3.1 Pro's 32.1%...
Why?
According to the "Gemini 3.5 Technical Report" released by DeepMind, two core techniques matter most.
Extreme knowledge distillation: Google did not simply stack computing power to train Flash; instead, it used the unreleased "Gemini 3.5 Ultra" as a teacher model and distilled it down into Flash.
According to an analysis tweeted by DeepMind's chief scientist Jeff Dean, the proportion of 3.5 Flash's fine-tuning done on high-quality reasoning-chain datasets rose by 400% compared with the previous generation.
This means it inherits the "logical brain" of a super-large model rather than a rote "knowledge base."
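As a rough illustration of what distillation means in practice, here is a minimal sketch of the classic soft-label formulation, in which a student is penalized for diverging from the teacher's temperature-softened output distribution. This is an assumption for illustration only: the article does not describe the loss Google actually used, and the logits, function names, and temperature below are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

# Hypothetical scores over three candidate tokens.
teacher = [4.0, 1.5, 0.2]
close_student = [3.8, 1.6, 0.3]   # ranks the tokens like the teacher
far_student = [0.2, 1.5, 4.0]     # ranks them in the opposite order

print(distillation_loss(teacher, close_student))
print(distillation_loss(teacher, far_student))
```

The student that mirrors the teacher's ranking gets the smaller loss, so training on this signal pushes the small model toward the large model's full output distribution rather than only its top answer.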
A brand-new MoE (Mixture of Experts) architecture: inside 3.5 Flash, Google adopted a finer-grained expert network.
A traditional MoE may have only 8 or 16 experts and activate just 1-2 of them at a time, which is already enough to support trillion-parameter models.
According to an analysis in a16z's 2026 AI infrastructure investment memo, 3.5 Flash employs 256 micro-experts and activates at most the 4 best-suited ones on each inference pass.
This lets it cover an extremely large multimodal feature space while keeping the number of active parameters very low.
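The routing idea can be sketched in a few lines. This is a toy under stated assumptions, not Google's implementation: the 256 and 4 come from the figures quoted above, while the scalar "experts," the random router scores, and all function names are hypothetical stand-ins for learned components.

```python
import math
import random

NUM_EXPERTS = 256   # figure quoted in the article
TOP_K = 4           # maximum experts activated per pass, per the article

def route(router_scores, k=TOP_K):
    """Return the indices of the k highest-scoring experts."""
    ranked = sorted(range(len(router_scores)),
                    key=lambda i: router_scores[i], reverse=True)
    return ranked[:k]

def moe_forward(x, experts, router_scores, k=TOP_K):
    """Run only the selected experts and mix their outputs by softmax weight."""
    chosen = route(router_scores, k)
    peak = max(router_scores[i] for i in chosen)
    raw = [math.exp(router_scores[i] - peak) for i in chosen]
    total = sum(raw)
    output = sum((w / total) * experts[i](x) for w, i in zip(raw, chosen))
    return output, chosen

rng = random.Random(0)
# Toy experts: each scalar function stands in for a feed-forward block.
experts = [lambda x, scale=rng.uniform(0.5, 1.5): scale * x
           for _ in range(NUM_EXPERTS)]
# Random scores stand in for the output of a learned router.
scores = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

y, active = moe_forward(2.0, experts, scores)
print(len(active), "of", NUM_EXPERTS, "experts ran")
print(f"active fraction: {len(active) / NUM_EXPERTS:.4%}")
```

With 4 of 256 equally sized experts running, only about 1.6% of the expert parameters are touched per pass. Shared layers such as attention are ignored here, so the real active share of a full model would be higher.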
On TTFT (Time to First Token), 3.5 Flash has already dropped below 65 milliseconds.
A human blink, by comparison, takes 100-150 milliseconds.
In short, when it runs as an agent, the pause before it responds is too short for a person to notice.
For developers who call tools frequently, run multiple rounds of reflection, and need extremely low latency, this is an ideal foundation for agents.
Only with such extreme engineering optimization can one establish dominance in "edge deployment" in a fiercely competitive environment.
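A rough sketch shows why first-token latency matters so much for multi-step agents: the waits add up when each call must finish before the next one starts. The 65 ms and blink figures are the ones quoted above. The 400 ms baseline is a hypothetical slower model chosen only for contrast, and generation time and tool execution time are deliberately left out.

```python
TTFT_FLASH_MS = 65        # upper bound quoted in the article for 3.5 Flash
TTFT_BASELINE_MS = 400    # hypothetical slower model, for contrast only
BLINK_MS = (100, 150)     # blink duration range quoted in the article

def first_token_wait_ms(sequential_calls, ttft_ms):
    """Total first-token wait when each call must finish before the next starts."""
    return sequential_calls * ttft_ms

for calls in (1, 5, 20):
    fast = first_token_wait_ms(calls, TTFT_FLASH_MS)
    slow = first_token_wait_ms(calls, TTFT_BASELINE_MS)
    print(f"{calls:>2} calls: {fast} ms vs {slow} ms")
```

A single call stays under one blink, but a chain of 20 calls already waits 1.3 seconds at 65 ms, against 8 seconds at the hypothetical 400 ms.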
Beyond 3.5 Flash itself, three other launches stand out. The first is the natively multimodal Gemini Omni Flash.
Omni means "all-encompassing," echoing the "o" in the earlier GPT-4o; the name alone signals the rivalry.
At least in terms of performance, Gemini Omni Flash has a far stronger claim to that "o" than GPT-4o.
Earlier systems such as Sora or Gemini 1.5 were essentially stitched-together pipelines, converting speech to text and then text to visuals.
The Omni released this time is natively multimodal end to end. It not only understands temporal coherence and physical laws in video, but also cuts latency from the industry average of 400-600 milliseconds to 120 milliseconds.
One demo at the conference made the point: a user wearing a camera pours water, and as the cup is about to overflow, Omni says "stop stop stop!" 0.5 seconds before the water spills.
This real-time reasoning about the physical state of the real world may look simple, but it matters: AI has evolved from a chatbot on a screen into an assistive tool in the real world, even if it is still at an early stage.
The second is the intelligent assistant Spark.
According to a report from The Verge interviewing the Vice President of Android Engineering, Spark has been granted control over Android 17's native system APIs.
In short, complex workflows that used to require opening many apps can now be completed without lifting a finger: just instruct Spark, and it handles everything for you, including sending messages, organizing emails, summarizing schedules, tracking web updates, spotting hidden charges on bills, batch processing documents, and so on.
In other words, with the AI assistant we will hardly need apps anymore; any complex operation collapses into a single command.
The third is smart glasses.
Why glasses again?
At least from Google's perspective, a device with seamless access to what the wearer sees and hears is the ultimate host for multimodal large models.
These glasses skip the flashy looks and focus entirely on practical capabilities:
Micro-OLED full-color waveguide lenses weighing only 4 grams, with a light transmittance of up to 85%;
Equipped with a self-developed lightweight Gemini edge chip, local inference latency ≤12ms, capable of real-time translation, image recognition, and scene analysis without needing to connect to the internet;
Natively linked to the Spark agent, synchronizing mobile and cloud data to provide personalized services such as schedule reminders, real-time translation, and environmental alerts.
In short, it bypasses the smartphone screen, integrating the agent into the human first-person perspective through glasses.
There is simply too much to cover; Google seems to have played every trump card at once to drive one point home to the market:
An algorithm without an entry point is nothing.
The era of competing on parameter counts and benchmark scores is over; pure model providers no longer have a moat. The future is a four-dimensional battle of "edge + cloud + ecosystem + hardware."
Bundling AI into the entire product family is actually reshaping the internet's traffic distribution logic: from "users actively searching/clicking" to "AI agents actively distributing services."
For a vast number of developers and small to medium enterprises, this is excellent news, as the underlying computing power and models have become extremely cheap, allowing everyone to focus on innovation at the application layer.
Competitors, though, are probably cursing right now.
02
When Google casually announced from the stage that "Gemini's monthly active users have officially surpassed 900 million," it caused quite a stir in the audience.
900 million is more than the combined MAUs of all competitors in the U.S.
How did they achieve this?
The answer is simple and brutal: force-feeding.
Google does not need to spend advertising money to acquire users like independent AI companies; it just needs to add an icon next to the address bar in the Chrome browser, integrate a shortcut key in the bottom navigation bar of 3 billion Android phones, and push updates throughout Google Workspace...
The customer acquisition cost is essentially zero.
More critically, in the coming period those 900 million active users will generate a massive volume of high-quality, multimodal, real-world feedback data: where their gaze lands when they look at products through smart glasses, the corrections they make while Spark handles tasks, and their interactions with the Omni visual model. All of it will feed Gemini 4.
This creates an extremely solid barrier: the better the model is to use -> the more users it attracts -> the more data it generates -> the better the model becomes.
To quickly strengthen this closed loop, Google directly announced a price war against all competitors: the AI Ultra package was slashed from $249.99/month to $99.9/month.
Input pricing for 3.5 Flash dropped to $0.02 per million tokens, and output pricing to $0.08 per million tokens.
How low is that?
In comparison, the average prices for models of similar levels in the industry are around $0.15-0.2 for input and $0.6-1 for output.
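Crunching the numbers: a top client processes about 1 trillion tokens a day. Shifting 80% of that workload to Gemini 3.5 Flash for a year would, at the prices above, save roughly $40 million to $270 million depending on the input/output mix; savings above $1 billion would require several times that volume.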
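A back-of-the-envelope check of the savings, using only the prices and volumes quoted above. The split between input and output tokens is not given, so the sketch brackets the answer with two extreme assumptions: every shifted token billed as input, or every shifted token billed as output. The function and variable names are illustrative.

```python
TOKENS_PER_DAY = 1_000_000_000_000   # about 1 trillion, per the article
SHIFTED_SHARE = 0.80                 # share of workload moved to 3.5 Flash
DAYS = 365

# Prices in USD per million tokens, as quoted in the article.
FLASH = {"input": 0.02, "output": 0.08}
INDUSTRY = {"input": (0.15, 0.20), "output": (0.60, 1.00)}

def annual_saving(kind, industry_price):
    """Yearly saving in USD if every shifted token were billed as `kind`."""
    million_tokens = TOKENS_PER_DAY * SHIFTED_SHARE * DAYS / 1_000_000
    return million_tokens * (industry_price - FLASH[kind])

for kind, (low, high) in INDUSTRY.items():
    print(f"all {kind}: ${annual_saving(kind, low):,.0f} "
          f"to ${annual_saving(kind, high):,.0f}")
```

Under these assumptions the annual saving falls between roughly $38 million (all input, low-end industry price) and $269 million (all output, high-end industry price). A figure above $1 billion would imply several times more than 1 trillion tokens per day.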
Why does Google dare to sell AI so cheaply?
Its biggest advantage is vertically integrated compute infrastructure.
Even giants like OpenAI and Anthropic, glamorous as they seem, are essentially "compute tenants": they buy compute from Microsoft and Amazon, who in turn pay Jensen Huang's Nvidia.
Google has its own TPUs, and combined with the highly efficient sparse MoE activation in 3.5 Flash, it has pushed compute costs to the floor.
It can use its asset-heavy advantage to crush pure algorithm companies.
The logic is clear.
Foundation models are rapidly becoming a commodity, like water and electricity. Have you ever seen a water company making exorbitant profits?
Google is not afraid that large models themselves do not make money because they can earn it back through search ads, cloud services, and commissions from the Android ecosystem.
But for companies like OpenAI, Anthropic, Cohere, and Mistral that rely solely on selling large model APIs, that option does not exist.
Investors probably now want to pin Sam Altman down and ask: "Google's API price is only one-tenth of yours, and its performance is better than yours. How do you expect your business model to work?"
The competitive landscape across multiple industries will thus enter an accelerated reshuffling period.
AI vendors must quickly find cheaper sources of computing power or start making chips themselves.
Next comes Apple, which is still working behind closed doors.
The combination of smart glasses + Omni video large model + Spark's native system-level takeover undoubtedly threatens the iPhone.
According to Macquarie's "Consumer Electronics Trend Forecast Report": In the next three years, the proportion of screenless interactions based on vision/voice is expected to jump from the current 8% to 35%.
If users become accustomed to completing daily work and entertainment using glasses and voice, the usage time of screens will inevitably be significantly reduced.
If Apple cannot counter with a sufficiently impressive wearable device (Vision Pro is too heavy and expensive, destined to remain a toy for a minority), its monopoly on the entry points of the mobile internet era will face an unprecedented challenge.
This is not iteration; it is revolution.
Google has thrown down the gauntlet to every competitor, with technology, traffic, and price as its three weapons.
At this point, is anyone still mocking it for big-company disease?