DevDay 2025 Is In: Powerful Codex, Meaningless Open, and Secretive Apps

Another year, another new layer on top of ChatGPT.

Like last year's tech week, I again flew to San Francisco, this time for OpenAI DevDay 2025, held at Fort Mason on October 6, 2025.
https://openai.com/devday/

San Francisco is not my favorite place to be. It is hard to ignore the contrast. Inside the venue, it is all optimism about the future of software. Outside, you see a city, and honestly a state and a country, that are visibly struggling. That part weighs on you. Still, I felt privileged to be in the room. DevDay brought together more than 1,500 developers, and regardless of how I feel about OpenAI as a company or about increasingly closed ecosystems, being part of that conversation matters, especially since I missed the chance last time.


ChatGPT Is Becoming a Platform Again

DevDay 2025 was not just about new models. It was about OpenAI trying, once again, to solidify ChatGPT as a platform.

  • Codex, now generally available, positioned as a serious coding agent (which I already use daily)
  • AgentKit, a toolkit for building and operating agents
  • Apps inside ChatGPT, with a new Apps SDK
  • API updates around cost, latency, and capability
  • Sora 2 entering the API for video generation

The story is clear. Models lead to agents, and agents lead to distribution inside ChatGPT.

Whether that story actually works this time is still an open question.


1. Codex: The Most Practical Part of the Day

If there was one announcement that felt immediately useful, it was Codex.

OpenAI announced that Codex is now generally available, alongside several workflow-focused additions.
https://openai.com/index/introducing-codex/

These included:

  • Native Slack integration, where you can tag @Codex and delegate tasks
  • A Codex SDK for embedding the agent into internal tools and workflows
  • Administrative controls for monitoring, permissions, and governance

This was not framed as “look how smart it is.” It was framed as “this is how we actually use it internally.” OpenAI stated that Codex is now deeply integrated into their own engineering workflows, reviewing most pull requests and increasing merge velocity.

Whether you take those metrics at face value or not, the signal is important. Codex is being positioned as a collaborator, not an autocomplete engine.

What stood out to me is that Codex finally has real workflow surface area.

  • Slack integration means it fits into existing team coordination
  • The SDK means you can wire it into CI, scripts, or internal platforms
  • Admin tooling makes it possible to deploy without flying blind

There is also a GitHub Action and a CLI (codex exec) that let you run Codex inside shell-based workflows.
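As a sketch of what that shell-based workflow might look like (the prompt and file paths here are illustrative assumptions, not taken from the announcement):

```shell
# Illustrative sketch: delegating a scoped, reviewable task to Codex
# non-interactively from a script or CI job via the `codex exec` subcommand.
# The task description below is hypothetical.
codex exec "add unit tests for the parsing helpers in src/utils"
```

The same pattern applies inside a GitHub Action step: a well-scoped instruction in, a reviewable diff out.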

The mental model that works best is not “AI pair programmer,” but “very fast junior engineer.” It is excellent at well-scoped, reviewable tasks. It is dangerous when instructions are vague. Used correctly, it is leverage.


2. GPT-OSS: “Open,” but Not Really the Point

One of the more headline-grabbing announcements was GPT-OSS, OpenAI’s so-called open-weight models, gpt-oss-120b and gpt-oss-20b, released under an Apache 2.0 license.
https://openai.com/index/introducing-gpt-oss/
https://huggingface.co/openai

On paper, this checks many boxes. Open weights, reasoning-focused, tool use, structured outputs, compatibility with OpenAI-style APIs.

In practice, this does not feel like a serious attempt at competing in the open model ecosystem.

The 120B model requires datacenter-class hardware. The 20B model, even with quantization, still sits in an uncomfortable middle ground for real-world local or edge use. OpenAI themselves note that these models are intended for powerful GPUs, not consumer devices.
https://huggingface.co/openai/gpt-oss-20b
https://huggingface.co/openai/gpt-oss-120b

More importantly, GPT-OSS does not feel like a strategic focus. It feels like a fig leaf.

It exists so OpenAI can say, “Yes, we did an open model,” while the core investment, polish, and roadmap energy remain firmly on closed, hosted systems. Compared to genuinely competitive open models like Qwen3 from Alibaba, GPT-OSS is neither as portable nor as useful.
https://huggingface.co/Qwen

Qwen3 runs efficiently, scales down better, and is far more practical for teams actually trying to build local-first or edge-capable systems. GPT-OSS, by contrast, feels like it was released to quiet a conversation, not to lead one.


3. Apps in ChatGPT: Third Time Is the Charm?

The most ambitious and most questionable announcement was Apps in ChatGPT.

OpenAI introduced apps that live directly inside the chat interface, alongside a new Apps SDK built on top of the Model Context Protocol.
https://openai.com/index/introducing-chatgpt-apps/

Apps can render UI inside conversations and be invoked directly or suggested contextually. Launch partners included familiar names like Canva, Expedia, Booking.com, Spotify, Figma, and Zillow.

If this feels familiar, that is because it is. We have been here before.

  1. Plugins
  2. Custom GPTs
  3. Apps

Each time, the promise is distribution and monetization. Each time, the details lag behind.

To OpenAI’s credit, the Apps SDK itself is more interesting than previous attempts. It is open source, built on an open protocol, and designed so apps can theoretically run outside ChatGPT as well.
https://github.com/openai/model-context-protocol

That is the first time this ecosystem story has had a technical foundation that could outlive a single product iteration.

What is still missing is economic clarity. Monetization is positioned as something that comes later, with only vague references to future revenue sharing and agentic commerce. The distribution model also remains opaque: OpenAI controls access, and the rulebook is not yet clear. As a result, it is uncertain whether the time developers invest in this experimental feature will translate into official support or discoverability within the ecosystem.
https://openai.com/index/introducing-chatgpt-apps/

Having watched this play out before with Plugins and Custom GPTs, developers will likely wait. Without a clear and credible business model, some will not commit serious resources for the third time.


4. AgentKit: Admitting Agents Are Hard

AgentKit was a quieter announcement, but an important one.

OpenAI bundled together tooling for building, evaluating, and operating agents.
https://openai.com/index/introducing-agentkit/

This includes:

  • Agent Builder, a visual workflow tool with versioning
  • Connector Registry, for managing tool and data access
  • ChatKit, for embedding chat-based agent interfaces into products

This is OpenAI implicitly acknowledging that agent failures are usually system failures, not model failures. Without instrumentation, evaluation, and control, agents look impressive in demos and fall apart in production.

ChatKit and evaluation tooling are generally available, while other components are still rolling out in beta.
https://openai.com/index/introducing-agentkit/


5. Everything Else

The remaining announcements were incremental but useful.

  • GPT-5 Pro in the API, positioned as the most precise model for high-stakes tasks
  • Cheaper “mini” models for realtime voice and image generation
  • Sora 2 becoming available via API for video generation

These matter for cost curves and product viability, but they were not the emotional center of the event for me.


Where I Landed

Codex was the clear win. It felt grounded, practical, and immediately applicable. In my experience, it is a great daily companion for skilled senior engineers who would otherwise juggle many junior engineers across tasks.

GPT-OSS did not land for me. It feels like a symbolic move rather than a serious commitment to open and portable models, especially when competitors like Qwen push much harder in that direction.

Apps are the biggest bet and the biggest question mark. The SDK is promising. The ecosystem story is still unproven.