Deploy your site using here.now completely for free, copy the prompt for your agent here! https://here.now/r/twt

I accidentally discovered Cursor's new Composer 2.5 model mid-project, and it was completing tasks 3–4x faster than Opus 4.7 with results that were just as good, if not better. I had no idea it had even switched. So I ran a proper head-to-head test: Composer 2.5 vs Opus 4.7 vs GPT 5.5, all given the exact same prompt inside Cursor. The results surprised me.

Want to make real money with coding? I share high-signal insights on careers, monetization, and leverage in my free newsletter. Join here and get my guide How to Make Money With Coding instantly: https://techwithtim.net/newsletter

🚀 Tools I Use
Get 10% off with code techwithtim
Openclaw setup: https://www.hostinger.com/techwithtim
VPS setup: https://www.hostinger.com/techwithtim10
Wispr Flow (Best AI Dictation): https://ref.wisprflow.ai/techwithtim

⏳ Timestamps ⏳
00:00 | My Experience with Composer
01:10 | What is Composer 2.5
02:28 | Composer 2.5 Benchmarks & Stats
04:30 | Understanding Agent Harnesses
08:36 | Here.now
10:35 | Live Demo - Composer vs Opus 4.7
16:27 | Live Demo - Composer vs GPT 5.5

Hashtags
#CursorAI #AIComposer #CodingAgent

UAE Media License Number: 3635141
Cursor is Microsoft, right? But Composer is Kimi, a Chinese product? So weird that Microsoft just sucks at AI. I did like Bing's image generation for Disney-style stuff, but that's about it; all the other AI stuff was kinda bad.
Great point about the harness, but it would also be nice to actually see Opus in Claude Code vs Opus in Cursor (or GPT in Codex vs Cursor).
I always wondered why the same model acted differently in different tools. Does the Composer harness work well with bigger codebases, or mostly with fresh projects? I have been looking at Llamapress ai since they have a free plan and I keep hearing good things. How does Cursor compare to something like Llamapress?
This is an unfair comparison. You're using the Cursor HARNESS with a model optimized for Cursor. You need to use Claude Code and Codex to compare the results. Rookie mistake.
What about Opus 4.8 in VS Code or the terminal?
How would you compare Antigravity's IDE with Cursor's IDE in terms of running Claude/Codex extensions?
Is it Composer 2.5 or Kimi?
Hey Tim, so what's the better option currently - Cursor or Windsurf?
I tried your prompt with Opus 4.8 on high effort. It took 18 min (but it asked clarification questions first to confirm it builds what I want). After 18 min, drawing shapes didn't work, same as in your demo. I told it drawing shapes doesn't work; it took 5 mins to fix, then it worked. Typing still doesn't work. So yeah, I see your point that Composer 2.5 is faster. Will give it a shot.
How much did Cursor pay for this ad video?
What about Composer vs 5.5?
Cursor just crushed Claude Code? Not bad for a Claude fork
Every decent benchmark and real-world use case puts Composer 2.5 miles behind Opus 4.8 and ChatGPT 5.5. Claude Haiku is better than Composer 2.5.
Is it just me, or have they removed the native extension/plugin for Claude Code within Cursor? I can only run it through the terminal within Cursor. Am I doing something wrong?
Be honest, is this a video sponsored by Cursor?
Composer is nothing but a wrapper around the Chinese Kimi K2.5 base model.
Harness doing the heavy lifting. Fits the pattern - diminishing returns on scale, real gains from orchestration and context management. We've been oversold on parameter count as a proxy for capability.
What gives? I asked Google "What LLM does Cursor use?" and it answers: "Cursor does not use just one LLM; it is an AI-first IDE that allows you to switch between several models depending on your needs. Anthropic: Claude 3.7 Sonnet, Claude 3.5 Sonnet, and Claude Opus; OpenAI: GPT-4o, GPT-4, and reasoning models like o1-mini and o3-mini; Google: Gemini Pro (e.g., Gemini 2.5 Pro)." Is Cursor a middle-man for other AI agents?
I think it all DEPENDS on what you're doing! Are you trying to improve a 100k C++ financial application?
Composer 2.5's been great. I mainly tried it out because the token usage with Opus 4.8 was just not manageable.