Windows Copilot API on GitHub

Project Overview

A GitHub repository called Windows-Copilot-API was published recently. It reverse-engineers the consumer Copilot interface at copilot.microsoft.com and exposes it as an OpenAI-compatible REST API. The project lets users call GPT-4 and GPT-5 models through standard OpenAI client libraries without API keys or billing. It runs locally after a one-time browser login with a Microsoft account. The code is available on GitHub in the Windows-Copilot-API repository (user sumitgautam0101) and includes both a Python client and a FastAPI server.

How the Implementation Works

The repository provides two entry points. The Python library exposes a simple client object that accepts a chat method call and returns either a full response or a streaming generator. Conversation state is tracked through an optional conversation_id parameter so multi-turn exchanges stay coherent across requests.
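The call shape described above can be illustrated with a small stand-in. This is a sketch only: `StubCopilotClient`, `new_conversation`, and the echo replies are hypothetical and are not the repository's actual class or behavior. The only details taken from the description are a `chat` method, an optional `conversation_id`, and a choice between a full response and a streaming generator.

```python
import uuid
from typing import Dict, Iterator, List, Optional, Union


class StubCopilotClient:
    """Hypothetical stand-in that mimics the described call shape.

    It never contacts Copilot; it only echoes prompts so the
    conversation_id bookkeeping is visible.
    """

    def __init__(self) -> None:
        # Maps conversation_id -> prompts sent so far in that conversation.
        self._history: Dict[str, List[str]] = {}

    def new_conversation(self) -> str:
        """Create an empty conversation and return its identifier."""
        conversation_id = uuid.uuid4().hex
        self._history[conversation_id] = []
        return conversation_id

    def chat(
        self,
        prompt: str,
        conversation_id: Optional[str] = None,
        stream: bool = False,
    ) -> Union[str, Iterator[str]]:
        """Return a full reply, or a generator of chunks when stream=True."""
        if conversation_id is None:
            conversation_id = self.new_conversation()
        turns = self._history.setdefault(conversation_id, [])
        turns.append(prompt)
        reply = f"(turn {len(turns)}) echo: {prompt}"
        if stream:
            # Generator that yields the reply one word at a time.
            return (piece for piece in reply.split(" "))
        return reply
```

Passing the same `conversation_id` on each call is what keeps the turn counter, and in the real library presumably the model's context, attached to one exchange.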

The server component starts a local endpoint at http://localhost:8000/v1 that implements the chat completions path used by the official OpenAI SDK. Any application already written against the OpenAI client can switch base_url to the localhost address and continue working without further code changes. Playwright handles the browser automation that maintains the authenticated session with Microsoft.
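To make the endpoint concrete, here is a minimal standard-library sketch of the request an OpenAI-style client would send to the local server. The `/chat/completions` path and the payload fields follow the OpenAI chat completions format; the model identifier `"gpt-4"` is an assumption, since the exact names the server accepts are not stated here, and `build_chat_request` and `send` are illustrative helper names.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # local address given in the article


def build_chat_request(messages, model="gpt-4", stream=False, base_url=BASE_URL):
    """Build (but do not send) an OpenAI-style chat completions request.

    The model name is an assumed placeholder.
    """
    payload = {"model": model, "messages": messages, "stream": stream}
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and return the assistant text.

    Requires the local server to be running; assumes a non-streaming
    response in the standard OpenAI response shape.
    """
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the official OpenAI Python SDK, the equivalent change is passing `base_url="http://localhost:8000/v1"` when constructing the client. The SDK still expects some `api_key` value to be set, so a placeholder string is likely needed even though the local server does not check it.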

Session persistence is achieved by storing cookies after the initial sign-in step. The script refreshes tokens automatically when they expire. The approach works on Windows, macOS, and Linux once the Chromium browser binaries are installed through the playwright install command.
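A quick way to see whether a saved session is still usable is to inspect cookie expiry times. The sketch below assumes the session file follows Playwright's `storage_state` JSON layout (a `cookies` list where `expires` is a Unix timestamp in seconds, or `-1` for session cookies). The article does not describe the repository's actual file format, so treat that layout, and the helper name `expired_cookie_names`, as assumptions.

```python
import time
from typing import Any, Dict, List, Optional


def expired_cookie_names(state: Dict[str, Any], now: Optional[float] = None) -> List[str]:
    """Return names of cookies in a storage-state dict that have expired.

    Assumes Playwright's storage_state layout; the repository's real
    session file may differ.
    """
    now = time.time() if now is None else now
    expired = []
    for cookie in state.get("cookies", []):
        expires = cookie.get("expires", -1)
        # -1 marks a session cookie, which has no fixed expiry time.
        if expires != -1 and expires <= now:
            expired.append(cookie["name"])
    return expired
```

If the list is non-empty, the stored session probably needs a refresh or a fresh sign-in before requests will succeed.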

Practical Integration Steps

Clone the repository and create a virtual environment with Python 3.9 or newer. Install the listed dependencies from requirements.txt, then run the Playwright setup. The sign-in flow launches a browser window where the Microsoft account credentials are entered once. After that step the session file is written and reused for subsequent runs.

To start the OpenAI-compatible server, execute the provided app.py script. It binds to port 8000 and logs incoming requests in the same format as the real OpenAI service. Streaming responses are delivered as server-sent events so token-by-token output appears in tools that support it.
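For readers who want to consume the stream without the SDK, the following sketch parses server-sent event lines in the OpenAI streaming format: each event is a `data:` line carrying a JSON chunk with text under `choices[0].delta.content`, and the stream ends with `data: [DONE]`. It assumes the local server follows that format faithfully; `iter_content_chunks` is an illustrative name.

```python
import json
from typing import Iterable, Iterator


def iter_content_chunks(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and SSE comment lines
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        text = choices[0].get("delta", {}).get("content")
        if text:
            yield text
```

Feeding it the decoded lines of a streaming HTTP response and printing each fragment as it arrives reproduces the token-by-token effect.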

Developers who already use LangChain or other frameworks built on the OpenAI client can point the configuration at the local server and test prompts immediately. No rate-limit headers from Microsoft are forwarded, so client-side retry logic may need adjustment if long conversations trigger throttling.
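Since no rate-limit headers are available to guide the client, one option is a plain exponential backoff with jitter around each call. This is a generic sketch, not something the repository provides; `call_with_backoff` and its default delays are illustrative and should be tuned to observed behavior.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on failure with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Delay doubles each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter spreads out retries from concurrent clients.
            sleep(delay + random.uniform(0, delay / 2))
    raise ValueError("max_attempts must be at least 1")
```

Narrowing `retry_on` to the specific error raised on throttling (for example an HTTP error type from the client in use) avoids retrying on genuine bugs.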

Limitations and Trade-offs

The service depends on an active Microsoft account and the continued availability of the consumer Copilot web interface. Any change to the underlying page structure can break the automation until the repository is updated. Because the implementation scrapes and replays browser traffic, latency is higher than a direct API call and includes the overhead of maintaining a browser context.

Microsoft's terms of service do not explicitly authorize this usage pattern, so accounts risk temporary restrictions if unusual traffic patterns are detected. The project supplies no mechanism for handling paid Copilot Pro features or enterprise tenants. Output quality remains identical to the free web experience, including any content filters that Microsoft applies at the browser layer.

FAQs

Does this require a paid Microsoft 365 subscription? No. A standard free Microsoft account used for copilot.microsoft.com is sufficient.

Can the local server be exposed publicly? The code contains no authentication layer for the localhost endpoint, so exposing it beyond the machine is not recommended.

How does streaming compare to the official OpenAI SDK? Token chunks arrive through the same SSE format, but overall throughput is limited by the browser automation layer rather than a dedicated inference endpoint.
