TL;DR
- OpenAI launched GPT-5 — a unified model with built-in thinking that automatically routes between fast responses and deep reasoning without model switching.
- Three tiers available now: GPT-5 ($1.25/$10 per 1M tokens), GPT-5 mini ($0.25/$2), GPT-5 nano ($0.05/$0.40) — all with 400K context and vision support.
- Key upgrades: SWE-bench 74.9% (vs o3 69.1%), HealthBench 67.2%, ~45% fewer hallucinations than GPT-4o, plus ChatGPT integrations for Gmail and Google Calendar.
What Happened
OpenAI announced GPT-5 on August 7, 2025, calling it their “smartest, fastest, and most useful model yet.” The launch represents a fundamental architectural shift: instead of forcing users to choose between models (GPT-4o, o1, o3), GPT-5 uses a real-time router that decides whether a query needs quick answers or deep reasoning — and switches automatically. Users can also explicitly request “think hard about this” to force the deeper reasoning path.
The model family includes three tiers: the flagship GPT-5, a cost-efficient GPT-5 mini, and the ultra-cheap GPT-5 nano. All three share 400K context windows, 128K max output, and multimodal (text + vision) capabilities. Rollout began immediately for ChatGPT Plus, Pro, and Team users, with Free tier access and Enterprise/Edu following within a week.
Source: OpenAI GPT-5 announcement — August 7, 2025
Key Details
- Version/Release: GPT-5, GPT-5 mini, GPT-5 nano (unified model family)
- Availability: Rolling out now to Plus/Pro/Team; Free tier within days; Enterprise/Edu next week
- Pricing (API):
  - GPT-5: $1.25 input / $10.00 output per 1M tokens
  - GPT-5 mini: $0.25 input / $2.00 output per 1M tokens
  - GPT-5 nano: $0.05 input / $0.40 output per 1M tokens
- Platforms: ChatGPT (web, mobile, desktop), OpenAI API, ChatGPT Team/Enterprise
- Context Window: 400K tokens (all tiers)
- Max Output: 128K tokens (all tiers)
- Modalities: Text + Vision (all tiers)
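To make the pricing concrete, here is a minimal cost-estimator sketch in Python. The per-token prices are copied from the list above; the dictionary keys, the `Tier` class, and the `estimate_cost` helper are illustrative names rather than anything OpenAI ships, and the sample workload (2M input tokens, 500K output tokens) is an arbitrary assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    """API prices in USD per 1M tokens, as quoted in this article."""
    input_per_million: float
    output_per_million: float


# Prices copied from the Key Details list; verify against the live
# pricing page before using them for budgeting.
TIERS = {
    "gpt-5": Tier(1.25, 10.00),
    "gpt-5-mini": Tier(0.25, 2.00),
    "gpt-5-nano": Tier(0.05, 0.40),
}


def estimate_cost(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a workload on the given tier."""
    t = TIERS[tier]
    total = (input_tokens * t.input_per_million
             + output_tokens * t.output_per_million)
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens and 500K output tokens.
    for name in TIERS:
        print(f"{name}: ${estimate_cost(name, 2_000_000, 500_000):.2f}")
```

Because output tokens cost eight times as much as input tokens on every tier, output-heavy workloads dominate the bill, which is why the token-efficiency claims later in this article matter for cost.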
What Changed (Technical + User-Facing)
| Area | Before (GPT-4o / o3) | After (GPT-5) |
|---|---|---|
| Model Selection | Manual switching between GPT-4o, o1, o3 | Automatic router picks fast vs deep reasoning |
| Reasoning | Separate “thinking” models (o1, o3) | Built-in thinking, with a minimal reasoning setting |
| Coding (SWE-bench) | o3: 69.1% | GPT-5: 74.9% |
| Math (AIME 2025) | o3: 88.9% | GPT-5: 94.6% (Pro: 96.7%) |
| Health Accuracy (HealthBench) | GPT-4o: 32.0% | GPT-5: 67.2% |
| Hallucination Rate (Web Search) | GPT-4o: 22.0% | GPT-5: 4.8% with thinking (headline claim: ~45% fewer hallucinations than GPT-4o) |
| Token Efficiency | o3 baseline | 50–80% fewer output tokens at same accuracy |
| ChatGPT Features | Basic chat, plugins | Customization, Voice, Study Mode, Gmail/Calendar apps |
| Developer API | Standard completion | reasoning + verbosity parameters, long tool chains |
| Safety Approach | Refusal-based | Safe completions — helpful within boundaries |
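The table mixes absolute scores with relative claims, so it helps to separate percentage-point changes from relative changes. The sketch below does that for four rows, using only the before/after figures from the table; the helper names are illustrative. On the figures in the hallucination row, the relative drop works out to about 78%, so the ~45% figure cited in this article appears to describe a different comparison.

```python
def point_change(before: float, after: float) -> float:
    """Absolute change in percentage points."""
    return after - before


def relative_change(before: float, after: float) -> float:
    """Change relative to the baseline, expressed in percent."""
    return (after - before) / before * 100.0


# (before, after) pairs copied from the comparison table above.
ROWS = {
    "SWE-bench": (69.1, 74.9),
    "AIME 2025": (88.9, 94.6),
    "HealthBench": (32.0, 67.2),
    "Hallucination rate (web search)": (22.0, 4.8),
}

if __name__ == "__main__":
    for name, (before, after) in ROWS.items():
        print(
            f"{name}: {point_change(before, after):+.1f} pts, "
            f"{relative_change(before, after):+.1f}% relative"
        )
```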
Why It Matters
The unified architecture solves the “model paralysis” problem where users and developers had to guess which model to use for each task. By embedding routing logic into the model itself, OpenAI shifts complexity from the user to the system — you ask, it figures out the right depth of thought.
For developers, the API improvements are substantial: 74.9% on SWE-bench puts GPT-5 ahead of o3 (69.1%) on that coding benchmark, the new reasoning and verbosity parameters give fine-grained control over cost/quality trade-offs, and support for long chains of tool calls enables agentic workflows. Needing 50–80% fewer output tokens than o3 at equivalent accuracy directly lowers inference costs, especially for output-heavy workloads.
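As a rough illustration of the reasoning and verbosity controls, the sketch below builds a request payload in Python without sending it. The article names only the two parameters; the exact field shapes used here (`reasoning.effort`, `text.verbosity`), the allowed values, and the `build_request` helper are assumptions based on the public Responses API documentation at launch and should be checked against platform.openai.com/docs/models/gpt-5.

```python
from typing import Any

# Assumed value sets; confirm against the current API reference.
ALLOWED_EFFORT = {"minimal", "low", "medium", "high"}
ALLOWED_VERBOSITY = {"low", "medium", "high"}


def build_request(
    prompt: str,
    effort: str = "medium",
    verbosity: str = "medium",
    model: str = "gpt-5",
) -> dict[str, Any]:
    """Build a request payload; this helper is hypothetical, not SDK code."""
    if effort not in ALLOWED_EFFORT:
        raise ValueError(f"unknown reasoning effort: {effort!r}")
    if verbosity not in ALLOWED_VERBOSITY:
        raise ValueError(f"unknown verbosity: {verbosity!r}")
    return {
        "model": model,
        "input": prompt,
        "reasoning": {"effort": effort},    # how much thinking to spend
        "text": {"verbosity": verbosity},   # how long the answer should be
    }


if __name__ == "__main__":
    # Cheap, terse call for a simple lookup-style task.
    print(build_request("Summarize this changelog.", "minimal", "low"))
```

With the official `openai` Python package, a payload like this would typically be passed as keyword arguments to `client.responses.create(**payload)`. Lower effort and verbosity trade answer depth for latency and output-token cost.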
For businesses, the connected apps integration (Gmail, Google Calendar, Google Drive, SharePoint) with permission-aware access means ChatGPT can now operate inside actual work contexts: drafting emails from your calendar, summarizing Drive folders, and querying internal docs, all while respecting the permissions already set in those systems.
For everyday users, the removal of model switching and the addition of Study Mode, voice improvements, and app connections make ChatGPT feel more like a genuine assistant than a chatbot. The health and citation improvements (Mayo Clinic, FDA, Cleveland Clinic sources in medical responses) address a major trust gap.
Industry signal: OpenAI is betting that unified intelligence beats model menus. The router approach — trained on user switches, preferences, and correctness — creates a flywheel where the system gets better at picking the right depth over time. Competitors (Anthropic, Google, xAI) will likely follow with their own unified architectures.
Who It Affects
- Developers/Builders: New API parameters, top-tier coding benchmarks, agentic tool chains, lower token costs. Migration from o3/GPT-4o recommended for production workloads.
- ChatGPT Plus/Pro/Team Users: Immediate access to GPT-5 with higher limits; GPT-5 Pro (extended reasoning) for Pro tier. Free users get access within days.
- Enterprise/Business: Connected apps (Gmail, Calendar, Drive, SharePoint) with existing permission models; Enterprise/Edu rollout next week. Implementation guide: Inside GPT-5 for Work PDF.
- Researchers/Analysts: State-of-the-art on FrontierMath (13.5% → 32.1% with tools), Humanity’s Last Exam (42% with tools), and multimodal benchmarks. Interactive evolution timeline at progress.openai.com.
- Health/Legal/Finance Professionals: Hallucination rates dropped to 4.8% (web search) and 3.6% (HealthBench Hard) with thinking enabled — approaching usable reliability for high-stakes domains.
What to Watch Next
- API pricing pressure: At $1.25/$10 per 1M, GPT-5 halves GPT-4o’s $2.50 input price while matching its $10 output price. Expect competitive responses from Anthropic (Claude 4/Opus) and Google (Gemini 2.5) on price/performance.
- Router behavior in production — The real-time router’s decisions on “fast vs deep” will be stress-tested at scale. Watch for edge cases where it under-thinks complex prompts.
- Enterprise adoption of connected apps — Gmail/Calendar/Drive integration is the biggest workflow unlock. IT policies, data residency, and permission governance will determine rollout speed.
- GPT-5.5 / mini / nano roadmap — The April 2026 GPT-5.5 update (per OpenAI’s blog) suggests rapid iteration. Nano at $0.05/$0.40 could unlock high-volume, low-margin use cases.
- Safety paradigm shift — “Safe completions” replacing refusals is a philosophical change. Monitor whether partial answers and transparent boundary-setting hold up under adversarial testing.
Source & References
- Primary: OpenAI GPT-5 announcement — August 7, 2025
- Technical Deep Dive: Introducing GPT-5 — Benchmarks, architecture, safety
- Developer Guide: GPT-5 for Developers — API params, migration
- Business Use Cases: GPT-5 New Era of Work — Amgen, Cursor, Vibes case studies
- API Documentation: platform.openai.com/docs/models/gpt-5
- Model Evolution Timeline: progress.openai.com
[IMAGE: OpenAI GPT-5 announcement hero — unified model architecture diagram showing router, fast model, and thinking model]
Caption: GPT-5 architecture — real-time router chooses between fast responses and deep reasoning — Source: OpenAI official announcement
