MOUNTAIN VIEW, 19 MAY 2026 — Google opened its annual I/O developer conference on Tuesday with a clear declaration that the next phase of the AI race will be fought over agents, not chat windows, unveiling Gemini 3.5 Flash and a new always-on assistant called Gemini Spark designed to act on behalf of users across Gmail, Google Sheets, and the wider Google Workspace footprint.
Key Takeaways
- Gemini 3.5 Flash rolled out the same day across all Google products and APIs; Gemini 3.5 Pro is scheduled for June.
- Google said Flash is "four times faster than other frontier models" while delivering its strongest coding and agentic performance to date.
- Gemini Spark is a personal AI agent that runs continuously, even when the user's device is off, and can read and act inside Gmail and Google Sheets without manual prompting for each task.
- Spark is invitation-only to start — limited to trusted testers and subscribers of Google's $249.99-per-month AI Ultra tier, with a wider Workspace rollout staged later in 2026.
- The same keynote previewed Omni, a "world model" that takes video and physical-environment data as input — Google's clearest move yet toward embodied AI and robotics.
The Facts
Speaking from the Shoreline Amphitheatre stage on the morning of 19 May 2026, Google chief executive Sundar Pichai framed the day's releases as a pivot from "asking Gemini" to "delegating to Gemini." The Gemini 3.5 Flash model is the centrepiece. According to TechCrunch's reporting from the keynote, Flash is positioned as a fast, low-cost workhorse aimed squarely at agentic deployments: it can independently execute coding pipelines, manage research projects, and — in internal evaluations Google ran but did not externally benchmark — build a working operating system "entirely from scratch."
The model is priced significantly below frontier competitors. Google did not publish a full price card during the keynote, but the company said Flash 3.5 delivers comparable capability at "half, or in some cases close to one-third, the price" of frontier rivals such as GPT-5.5 and Claude Opus 4.7. Rollout was immediate: the model is now the default behind the consumer Gemini app, available through the Vertex AI API for enterprises, and replaces the previous Flash version across Google Workspace's AI features.
Gemini 3.5 Pro — a heavier model targeting frontier-tier reasoning and long-context work — was previewed but not shipped. Google said Pro will debut in June. The company did not commit to a context-window number for Pro, although it claimed continued improvements over Gemini 2.5 Pro's 1-million-token window.
The second major announcement, Gemini Spark, drew the longest applause of the keynote. Google's product blog described Spark as the "next evolution" of the Gemini app: a persistent agent that runs 24/7, can be assigned tasks once and trusted to follow up over hours or days, and reaches into a user's connected Google services to act without supervision. Demos shown on stage included Spark monitoring an inbox for a specific shipment notification, then automatically updating a Sheets-based travel itinerary when the package was rerouted; planning a multi-day research itinerary by pulling from Google Maps, Calendar, and Search; and writing first drafts of routine email replies overnight while the user slept.
CNBC's coverage of the keynote noted that the agent will land first inside the existing Gemini app, then expand into Workspace surfaces. Spark is gated behind Google AI Ultra, the $249.99-per-month tier that the company has been positioning since late 2025 as the home for its most capable consumer features.
The keynote also previewed a third release: Omni, described as a "world model" trained on video and 3D scene data, intended to give Gemini agents an internal physics-aware representation of the environment they operate in. Google did not give a release date for Omni.
Technical Deep-Dive
Gemini 3.5 Flash represents a step change in how Google has been engineering its mid-tier models. The previous Flash generation was primarily a smaller, cheaper, distilled variant of the Pro model. Flash 3.5 is presented as a first-class model in its own right, with separate post-training optimised for tool use and agentic loops rather than chat completion.
Three architectural choices stand out from the technical disclosures Google made on stage and in the accompanying developer blog. First, the model has been trained extensively on synthetic agentic trajectories — long multi-step chains of tool calls — rather than only on conversational data. That training mix is the reason Google is willing to ship the agent product (Spark) on the same model. Second, native tool-use latency was a primary optimisation target: the company says Flash 3.5 returns its first tool call in well under a second on standard benchmark workloads, which is the metric that matters when a single user request might spawn ten or twenty downstream API hits. Third, the model is fully multimodal at inference — text, image, audio, and video inputs are routed through the same forward pass rather than handed off to specialist encoders, reducing the inference cost of multimodal queries.
Spark is a separate piece of system engineering layered on top of the model. The agent runs on Google's infrastructure, not on the user's device. That is what enables the "24/7 even when your phone is off" claim. Practically, Spark is a long-lived workflow engine: when a user assigns it a task, the agent is registered as a server-side process with its own scoped permissions into the user's Google services, a memory of prior interactions, and a scheduler that lets it wake itself on inbox events, calendar triggers, or fixed intervals. Each action it takes is logged for review.
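To make the "long-lived workflow engine" description concrete, here is a minimal sketch of how such a server-side task registry could be modelled. Google has not published Spark's internals, so every name below (`AgentTask`, `WorkflowEngine`, the event fields) is hypothetical; the sketch only mirrors the pieces the keynote described: scoped permissions, event triggers, a memory of prior interactions, and a log of each action.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable


@dataclass
class AgentTask:
    """A task registered once and woken later by events (hypothetical model)."""
    name: str
    scopes: frozenset[str]          # services this task is allowed to touch
    triggers: frozenset[str]        # event types that wake the task
    handler: Callable[[dict], str]  # returns a description of the action taken
    memory: list[dict] = field(default_factory=list)


class WorkflowEngine:
    """Holds registered tasks and records every attempted action."""

    def __init__(self) -> None:
        self.tasks: list[AgentTask] = []
        self.audit_log: list[dict] = []

    def register(self, task: AgentTask) -> None:
        self.tasks.append(task)

    def _log(self, task: AgentTask, event: dict, outcome: str, detail: str) -> None:
        self.audit_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "task": task.name,
            "event": event["type"],
            "outcome": outcome,
            "detail": detail,
        })

    def dispatch(self, event: dict) -> int:
        """Wake every task subscribed to the event type; return how many ran."""
        ran = 0
        for task in self.tasks:
            if event["type"] not in task.triggers:
                continue
            if event["service"] not in task.scopes:
                # Subscribed, but the event touches a service outside its scopes.
                self._log(task, event, "denied", "service outside task scopes")
                continue
            detail = task.handler(event)
            task.memory.append(event)  # keep the event as prior context
            self._log(task, event, "ok", detail)
            ran += 1
        return ran
```

A task registered with `scopes=frozenset({"gmail"})` and `triggers=frozenset({"inbox_event"})` would run on a Gmail inbox event, be logged as denied for an inbox-type event that touches Sheets, and stay asleep on a calendar trigger.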
The integration with Gmail and Sheets is implemented through Google's existing OAuth scopes — the same permissions surface that third-party apps use — but with a privileged Google-internal grant that lets the agent read drafts, modify cells, and write replies on behalf of the user. Google said all actions are reversible from a single audit log in the Gemini app, and that users will be prompted to confirm any send-mail or external-share action before it executes.
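The confirm-before-send behaviour can be illustrated with a small gate in front of every action. This is a sketch under stated assumptions, not Google's implementation: the action names and the `confirm` callback are invented for illustration, and the only detail taken from the keynote is that send-mail and external-share actions require user confirmation while every action is logged.

```python
from typing import Callable

# Hypothetical action names. The keynote said only that send-mail and
# external-share actions need confirmation before they execute.
NEEDS_CONFIRMATION = {"send_mail", "external_share"}


def execute_action(
    action: str,
    payload: dict,
    confirm: Callable[[str, dict], bool],
    audit_log: list,
) -> bool:
    """Run an action, asking the user first when it leaves the account boundary."""
    if action in NEEDS_CONFIRMATION and not confirm(action, payload):
        audit_log.append({"action": action, "status": "blocked"})
        return False
    # A real system would perform the side effect here; the sketch only logs it.
    audit_log.append({"action": action, "status": "executed"})
    return True
```

Routing every action through one function like this keeps the audit log complete, because blocked attempts are recorded alongside executed ones.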
The competitive context matters. OpenAI shipped GPT-5.5 to enterprise in April and made GPT-5.5 Instant the ChatGPT default on 5 May 2026. Anthropic's Claude Opus 4.7 went generally available on 16 April with explicit positioning around long-running coding tasks. Google's Flash 3.5 sits between those two on capability and below both on price — the classic mid-tier squeeze play Google has historically executed well in cloud services.
ASEAN Perspective
For enterprises across Southeast Asia, the Gemini 3.5 Flash release is the most consequential of the three announcements because it reaches Google Cloud's Southeast Asian regions immediately. Singapore is one of Vertex AI's two Asia-Pacific anchor regions (the other is Tokyo), and Flash 3.5 is generally available through both from launch day, putting low-latency frontier-class AI in reach of Singapore-based banks, telcos, and government agencies that have data-residency requirements.
That matters because Singapore's Cyber Security Agency advisory on frontier AI models — issued on 15 April 2026 — explicitly noted that frontier models can reduce vulnerability discovery time "from months to hours," cutting both ways for defenders and attackers. Cheap, fast Flash-tier capability inside a domestic GCP region gives Singapore enterprises an actually usable defensive option that doesn't require shipping prompts and code to overseas regions.
Malaysia's data-sovereignty story is more uneven. There is no domestic Vertex AI region; Malaysian customers route to Singapore or Jakarta. For sectors like banking and healthcare, where Bank Negara Malaysia and the Personal Data Protection Department both require careful handling of cross-border data flows, that adds compliance friction. Expect to see Malaysian fintech firms preferring the API-only path with strict prompt-filtering middleware over the full agentic Spark experience until Google announces a domestic region.
Indonesia has had a Jakarta Vertex AI region since 2024, and Flash 3.5 should land there in the rollout window Google described as "all GA regions, same day." That puts Indonesia in unusual company — alongside Singapore — as one of only two ASEAN countries with frontier-class agentic AI available in-region. Bank Mandiri, BCA, and Telkom Indonesia have all publicly disclosed Gemini-related pilots over the past 12 months and are the most likely first-wave Spark adopters once the agent moves out of trusted-tester status.
The downside for the region is consumer access to Spark itself. The $249.99-per-month AI Ultra subscription is steep relative to local purchasing power in Vietnam, the Philippines, Thailand, and Indonesia. For most of ASEAN's 700-million-person population, Spark will remain a corporate tool surfaced through employer Workspace licences rather than a personal assistant, at least until Google introduces a regional pricing tier. Vietnamese and Filipino developers, in particular, are likely to gravitate toward the Flash 3.5 API and build their own lower-cost agent layers on top.
What Organisations Should Do
The right reaction to this announcement is neither rushing to deploy Spark in production nor dismissing the agent as a demo. There are five concrete steps a CTO or head of AI should be taking inside the next 30 days:
- Update your model evaluation harness to include Gemini 3.5 Flash. If you have an internal LLM-routing layer, add Flash as a candidate model for the cost-sensitive tier. The price-per-token differential against GPT-5.5 and Claude Sonnet 4.6 is significant enough that switching even routine workloads will move budget.
- Audit your existing Workspace agent permissions. If your organisation is on Google Workspace, Spark will eventually arrive in your tenant by default. Understand which OAuth scopes Spark will hold and whether your existing data-loss-prevention rules will catch unexpected external sends or sheet exports. Talk to your Google account team about admin controls now, not after rollout.
- Pilot Flash 3.5 on a single low-risk agentic workflow before touching Spark. Routine internal tasks (drafting expense reports, summarising meeting notes, monitoring a public-facing inbox) are good first targets. Build the observability before the surface area expands.
- Lock down credential exposure in any system that an agent will touch. If a Spark-class agent has access to your Gmail and Sheets, it has access to anything mailed or filed in them. That includes API keys casually shared in email threads, customer PII in spreadsheets, and SaaS credentials in calendar invites. Run a credential scan across the last 12 months of Workspace data and rotate anything found.
- For ASEAN-based teams, validate data-residency for any production deployment. Vertex AI region selection determines where prompts, completions, and agent state are stored. Confirm with Google that your chosen region matches your contractual data-residency obligations, and document it.
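The credential scan recommended above can start as a simple pattern match over exported message and document text. The sketch below is illustrative only: the three patterns are a small, non-exhaustive sample, the function name `scan_text` is our own, and how you export Workspace data for scanning is left out. A dedicated secret-scanning tool will cover far more formats.

```python
import re

# Illustrative patterns only; real scanners ship hundreds of rules.
PATTERNS = {
    # AWS access key IDs: "AKIA" followed by 16 uppercase letters or digits.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # Google API keys: "AIza" followed by 35 URL-safe characters.
    "google_api_key": re.compile(r"AIza[0-9A-Za-z_\-]{35}"),
    # Plain-text password assignments such as "password: hunter2".
    "password_assignment": re.compile(r"(?i)\bpassword\s*[:=]\s*\S+"),
}


def scan_text(doc_id: str, text: str) -> list[tuple[str, str, int]]:
    """Return (doc_id, pattern_label, offset) for every suspected secret.

    The matched value itself is deliberately not returned, so the findings
    report does not become a second copy of the credentials.
    """
    findings = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append((doc_id, label, match.start()))
    return findings
```

Anything the scan flags should be rotated rather than merely deleted from the thread, since an agent with mailbox access may already have read it.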
RECATOOLS Verdict
We believe the Gemini 3.5 Flash launch is the more important of the two main announcements, even though the headlines belong to Spark. Spark is a product. Flash is infrastructure, and infrastructure is what changes industry economics.
The case for Spark being overhyped is straightforward. Always-on agents that act inside personal data stores are the single largest unsolved problem in AI safety today: prompt injection, scope creep, and accidental sends are not theoretical risks but daily occurrences in every agent demo we have seen privately over the past 12 months. Google's audit log is necessary but not sufficient. Our view is that Spark spends 2026 in a "magical for individuals, dangerous for enterprises" zone, and that responsible enterprise adoption lags the consumer adoption curve by at least a year.
The case for Flash 3.5 being underhyped is the part most coverage missed. By pricing a model with credible agentic competence at as little as one-third of frontier cost, Google has effectively put a price ceiling on what OpenAI and Anthropic can charge for their high-volume API tier. Within 90 days, expect ChatGPT Team, Microsoft Copilot, and the smaller AI startups building on top of those APIs to be re-pricing aggressively. The winner of this round is the developer building any AI-augmented product where token cost matters, which is to say, every developer in our reader base.
Our advice to RECATOOLS readers, especially those building tools and SaaS in the Singapore, Kuala Lumpur, Jakarta, and Manila ecosystems: route your inference through Flash 3.5 unless you have a specific reason to use a frontier model. The capability gap closes faster than the price gap.
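In a routing layer, "Flash by default, frontier by exception" might look like the sketch below. The model labels are placeholders rather than real API model IDs, and the two escalation flags are our own assumptions about what counts as a "specific reason"; substitute whatever criteria your evaluation harness supports.

```python
from dataclasses import dataclass

# Placeholder labels, not real model identifiers.
DEFAULT_MODEL = "flash-tier"
FRONTIER_MODEL = "frontier-tier"


@dataclass(frozen=True)
class Request:
    prompt: str
    needs_long_context: bool = False        # hypothetical escalation flag
    needs_frontier_reasoning: bool = False  # hypothetical escalation flag


def choose_model(req: Request) -> tuple[str, str]:
    """Return (model, reason); escalate only when a flag demands it."""
    if req.needs_frontier_reasoning:
        return FRONTIER_MODEL, "frontier reasoning requested"
    if req.needs_long_context:
        return FRONTIER_MODEL, "long context requested"
    return DEFAULT_MODEL, "default cost-sensitive tier"
```

Returning the reason alongside the model makes it easy to log how often requests escalate, which is the number that tells you whether the default is holding.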
Frequently Asked Questions
How does Gemini 3.5 Flash compare to OpenAI GPT-5.5 Instant on price? Google has not published a complete price card, but stated at I/O 2026 that Flash 3.5 delivers comparable capability at "half, or in some cases close to one-third, the price" of frontier rivals. GPT-5.5 Instant is priced at approximately $2.50 per million input tokens and $10 per million output tokens on OpenAI's API. Industry estimates for Flash 3.5 land near $0.90 per million input and $3.50 per million output, although Google has yet to confirm the exact figures. Both models target the same workload tier.
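To see what those per-token figures mean for a budget, the arithmetic can be worked through for a sample workload. The prices are the approximate and estimated figures quoted in the answer above, not confirmed list prices, and the monthly volume of 500 million input and 100 million output tokens is a hypothetical chosen for illustration.

```python
from decimal import Decimal

# USD per million tokens: (input, output). Approximate or estimated figures
# quoted in this article; Google has not confirmed the Flash numbers.
PRICES = {
    "gpt-5.5-instant": (Decimal("2.50"), Decimal("10.00")),
    "gemini-3.5-flash-estimate": (Decimal("0.90"), Decimal("3.50")),
}


def monthly_cost(model: str, input_millions: int, output_millions: int) -> Decimal:
    """Cost in USD for a volume expressed in millions of tokens."""
    input_price, output_price = PRICES[model]
    return input_price * input_millions + output_price * output_millions


# Hypothetical workload: 500M input tokens and 100M output tokens per month.
gpt_cost = monthly_cost("gpt-5.5-instant", 500, 100)               # 2250.00
flash_cost = monthly_cost("gemini-3.5-flash-estimate", 500, 100)   # 800.00
ratio = flash_cost / gpt_cost                                      # about 0.36
```

On these assumptions the Flash bill comes to a little over one-third of the GPT-5.5 Instant bill, which is consistent with Google's "close to one-third" claim.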
Is Gemini Spark available in Singapore, Malaysia, or Indonesia? Spark is invitation-only to start, limited to trusted testers and Google AI Ultra subscribers ($249.99/month). The tier is available in all three countries through the global Google One subscription system, but actual agent enrolment is being staged. Singaporean and Indonesian users are likely to be in the first wave because Vertex AI is GA in those regions; Malaysian users will route through Singapore until a domestic region is announced.
Will Gemini 3.5 Flash work with my existing Vertex AI deployment? Yes. The model is available through the standard Vertex AI Model Garden as of 19 May 2026 and uses the same REST/gRPC interfaces as Gemini 2.5 Flash. Most production deployments require only a model-name string change. Note that prompt formatting around tool calls has changed slightly — Google's migration guide details the differences.
What stops Spark from sending an email to the wrong recipient or leaking data? Google's stated controls are: per-action audit logs, explicit user confirmation before any external send or share, and admin-level Workspace controls that can disable Spark organisation-wide. The agent operates within the user's existing OAuth scopes — it cannot access services the user has not granted. Our view is that these controls are necessary but unproven; we recommend organisations disable external-send permissions for Spark during the first 90 days of any pilot.
When will Gemini 3.5 Pro be released? Google said Pro will debut in June 2026, without committing to a specific date. Based on Google's previous Pro launch patterns, expect a multi-week rolling release starting with Vertex AI customers, expanding to AI Studio and the consumer Gemini app within two to three weeks of initial availability. Pro is expected to lead in long-context reasoning and complex tool-use chains.