OpenAI Launches ChatGPT Desktop Version, Mirroring Microsoft’s Copilot

OpenAI enters computer vision and agent control.

ChatGPT can now work with different apps on macOS and Windows desktops, OpenAI announced on X on 15 November. This marks the company’s first direct attempt at computer vision and agent control. 

This early beta update claims to let ChatGPT examine coding apps to provide better answers for Plus and Team users. It not only works with coding apps like VS Code, Xcode, Terminal, and iTerm2 but also talks to its users (through its voice assist feature), lets them take screenshots, upload files, and search the web (through SearchGPT).

As reported earlier, Anthropic also made Claude Artifacts available to all users on iOS and Android, allowing anyone to create apps easily without writing a single line of code. 

One ChatGPT feature that becomes especially useful on the desktop is the ability to ask about any selected text. Users can select any section of any document and open ChatGPT to ask for meanings, explanations, and feedback. This is a desktop implementation of ChatGPT’s most evident function.

This development follows discussions a day earlier about OpenAI’s agent, ‘Operator,’ which is to be released in January 2025. Rowan Cheung, founder of ‘The Rundown AI,’ speculates that the next step would be to allow ChatGPT to see and control desktops as an agent.

OpenAI Follows Suit

In October this year, Microsoft released its ‘Copilot Vision’ to transform autonomous workflows with Copilot. According to Microsoft, these autonomous agents would be the new ‘apps’ for an AI-driven world, executing tasks and managing business functions on behalf of individuals, teams, and departments. 

Meanwhile, the company also introduced ten new autonomous agents in Dynamics 365 to automate processes like lead generation, customer service, and supplier communication for organisations.

Following that, Anthropic made a big announcement by releasing its new Claude 3.5 Sonnet, which can control computers through the beta feature ‘Computer Use’. The company reported that the model had made significant progress in agentic coding tasks, which involve AI autonomously generating and manipulating code.

Anthropic’s approach to Claude’s computer use feature stood out because it didn’t rely on multiple agents to perform different tasks; instead, a single agent managed multiple tasks.

As AIM noted in an earlier comparison, Microsoft integrated Copilot into MS Excel, while Claude directly operated Excel. This called into question the need for Copilot.

OpenAI wasn’t far behind, even though these moves by Anthropic and others (like Google’s Jarvis, speculated to be released this month) had given them a strong position in the AI industry. OpenAI’s focus has also shifted from expanding its features to the interface.

OpenAI entered this race by introducing the Swarm framework, an approach for creating and deploying multi-agent AI systems. It was the missing piece that simplified the process of creating and managing multiple AI agents, helping them work together to accomplish complex tasks.

Following that, the launch of ChatGPT on desktops was a major step by an AI pioneer towards transforming the way this chatbot is used, one that ‘Operator’ is expected to enhance in January.

Now, the chatbot will be able to provide answers, be a companion, and assist with daily tasks.


Sanjana Gupta

An information designer who loves to learn about and try new developments in the field of tech and AI. She likes to spend her spare time reading and exploring absurdism in literature.