Around two years ago I wrote about testing Alibaba Cloud Model Studio when it first launched. Back then, the Qwen models were good but still felt like the underdogs compared to OpenAI and Anthropic. Today, the Qwen family has quietly become one of the most competitive model families out there, especially for coding. And now, with Qwen 3.6 out, things just got really interesting.
Qwen3.6-35B-A3B dropped less than two weeks ago and I’ve been wanting to plug it into Cursor ever since. Why? Because this model uses a Mixture of Experts (MoE) architecture with 35 billion total parameters but only 3 billion active at inference time. That means you get the reasoning depth of a much larger model while spending far less compute per token. For a coding assistant that’s making hundreds of calls a day, that matters quite a lot.
The benchmarks speak for themselves: 73.4 on SWE-bench Verified (real-world GitHub issue resolution), 51.5 on Terminal-Bench 2.0 (outperforming Claude Sonnet 4.5), and a 262K native context window. This thing is built for agentic coding workflows, which is exactly what Cursor does.
What you need before starting
If you’ve never touched Alibaba Cloud Model Studio before, go read my previous article first. It covers:
- Accessing the Model Studio console
- Getting an API Key
- Activating the Model Service
Once you have your API key (something like sk-fexxxxxde91xxxxxe06cb34axxxxx8e), come back here.
The trick: Model Studio is OpenAI-compatible
Here’s the thing that makes this whole setup possible. Alibaba Cloud exposes an OpenAI-compatible endpoint for all Qwen models. That means any tool that speaks the OpenAI API format can talk to Model Studio without any custom integration. You just swap the base URL, the API key, and the model name.
The base URL for the international endpoint is https://dashscope-intl.aliyuncs.com/compatible-mode/v1.
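Since only those three values vary between providers, a request to any compatible endpoint can be assembled the same way. Here is a minimal Python sketch (the `build_chat_request` helper is my own illustration; the path and header names follow the standard OpenAI chat-completions wire format):

```python
import json

def build_chat_request(base_url: str, api_key: str, model: str, prompt: str):
    """Assemble an OpenAI-style chat-completions request for any compatible endpoint."""
    url = f"{base_url.rstrip('/')}/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })
    return url, headers, body

url, headers, body = build_chat_request(
    "https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
    "sk-...",  # placeholder: your Model Studio API key
    "qwen3.6-35b-a3b",
    "Write a Python function that reverses a linked list",
)
print(url)  # → https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions
```

Swap the base URL for OpenAI’s and the same helper works unchanged, which is exactly why tools like Cursor can target Model Studio with no custom integration.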
You can verify it works with a quick curl:
curl --location 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions' \
--header "Authorization: Bearer <your-dashscope-api-key>" \
--header 'Content-Type: application/json' \
--data '{
  "model": "qwen3.6-35b-a3b",
  "messages": [
    {
      "role": "user",
      "content": "Write a Python function that reverses a linked list"
    }
  ]
}'
If you get a proper JSON response with code, you’re good.
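If you’d rather script that check than eyeball the raw JSON, the generated code sits at `choices[0].message.content` in the standard OpenAI response shape. A quick sketch (the sample response below is abbreviated and illustrative, not actual Model Studio output):

```python
import json

# Abbreviated, illustrative response in the standard OpenAI chat-completions shape.
sample_response = """
{
  "model": "qwen3.6-35b-a3b",
  "choices": [
    {"message": {"role": "assistant", "content": "def reverse_list(head): ..."}}
  ]
}
"""

data = json.loads(sample_response)
# The assistant's reply is always the first choice's message content.
answer = data["choices"][0]["message"]["content"]
print(answer)
```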
Configuring Cursor
This is the fun part. Cursor supports custom OpenAI-compatible endpoints natively, so connecting it to Model Studio takes about 30 seconds.
Step 1: Open Cursor Settings
Go to Cursor Settings > Models. You can get there via Cmd + Shift + P and typing “Cursor Settings”, or just through the gear icon.
Step 2: Override the Base URL
At the top of the Models section, find the “Override OpenAI Base URL” field. Enter:
https://dashscope-intl.aliyuncs.com/compatible-mode/v1
Cursor appends /chat/completions to whatever you put here, so the /v1 suffix is all you need.
Step 3: Set your API Key
In the “OpenAI API Key” field (yes, it says “OpenAI”, but Cursor sends the key to whatever endpoint you configured above), paste the API key you generated in Model Studio.
Click Verify to confirm the connection works.
Step 4: Add the model
Click “+ Add Model” and type qwen3.6-35b-a3b.
This must match exactly what Alibaba Cloud expects. You can now select qwen3.6-35b-a3b from Cursor’s model picker in Chat, Inline Edit, or Agent mode.
Is it actually good for coding in Cursor?
Short answer: yes. The MoE architecture means responses come back fast because only 3B parameters are active per token, while the router can still draw on the full 35B-parameter expert pool when a step needs more complex reasoning. In practice, it handles multi-file refactors, debugging sessions, and code generation with a good level of context awareness.
The 262K context window is also a big deal for Cursor specifically. As I discussed in my post about Vector RAG and Agentic Search, the size of the context window directly impacts how well the AI assistant can reason about your codebase. More context means fewer retrieval errors and better architectural understanding.
The cost angle
Running inference through Alibaba Cloud Model Studio is pay-per-token, but because Qwen3.6-35B-A3B only activates 3B parameters per forward pass, it’s significantly cheaper than calling a dense model of comparable quality. You’re getting performance that competes with models ten times its active size at a fraction of the cost.
For comparison, running this through Cursor’s built-in models uses your monthly quota. Running it through your own Alibaba Cloud Model Studio key means you control the billing directly and there are no arbitrary request limits.
What about other Qwen models?
Once you have the Alibaba Cloud Model Studio endpoint configured in Cursor, you can add any model available on Model Studio. Just click “+ Add Model” and use the model name. Some worth trying:
- qwen3.6-plus — the hosted flagship, even more capable but pricier
- qwen3.5-plus — solid all-rounder
- qwen3.5-flash — faster and cheaper for simpler tasks
- qwen3-max — the previous generation flagship
You can switch between them in Cursor’s model picker depending on the task.
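If you script against these models outside Cursor too, a tiny lookup table keeps the IDs straight. The task categories below are my own shorthand; the model names are the Model Studio IDs listed above:

```python
# My own shorthand mapping of task type to Model Studio model ID.
MODEL_FOR_TASK = {
    "agentic": "qwen3.6-35b-a3b",   # agentic coding workflows
    "hard": "qwen3.6-plus",         # most capable, pricier
    "general": "qwen3.5-plus",      # solid all-rounder
    "quick": "qwen3.5-flash",       # fast and cheap for simple tasks
}

def pick_model(task: str) -> str:
    """Return a model ID for the task, defaulting to the MoE coding model."""
    return MODEL_FOR_TASK.get(task, "qwen3.6-35b-a3b")

print(pick_model("quick"))  # → qwen3.5-flash
```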
Wrapping up
The fact that you can point Cursor at Alibaba Cloud Model Studio and run Qwen 3.6 with three settings changes is honestly great. No custom SDKs, no wrapper scripts, no Docker containers. Just a base URL, an API key, and a model name.
What excites me most is the direction this is heading. When I first wrote about Model Studio in 2024, it was a new platform with a lot of promise. Now it’s a legitimate alternative for daily coding workflows. And as I wrote about running LLMs locally with Docker Model Runner, we’re living in a time where the monopoly on AI inference is crumbling. You can run models locally, you can run them on Alibaba Cloud, you can run them wherever makes sense for your workflow.
The MoE approach that Qwen 3.6 takes is probably the future of coding models: massive parameter spaces for deep reasoning, tiny active footprints for speed and cost. If you haven’t tried running a non-OpenAI, non-Anthropic model in Cursor yet, this is a great place to start.