Notes on Windsurf

Observations, bugs, and feature requests

Jimmy E. Chan

Dec 11, 2024

This post is an extension of my main post:

I built FearCutter, an AI interview prep tool, in just 5 days

Introduction

Windsurf is an emerging AI code editor. It launched about a month ago and takes a more agentic approach to editing your codebase with AI.

Using Windsurf is like having the productivity of a small team, connected via mind meld. My take on the future of AI coding agents is that one developer + an AI agent will be more effective than having a team of 2-3 people working on the same project, especially for MVPs and early-stage features.

I was able to ship the first version of FearCutter in just 5 days. Defining, designing, and coding the product from idea to a live, production-grade, self-serve version would have taken a small engineering, design, and product team ~2 months. With an AI coding agent like Windsurf’s Cascade, I could do it all myself in days.

This post is for builders or developers interested in trying out Windsurf. Because the product is relatively new, there are some issues you need to be aware of to avoid some of the pain and frustration I went through. The product is evolving very quickly though, so by the time you read this, they may already be addressed.

My take on Windsurf

Windsurf is incredible. The heart of what makes it all work is Cascade, its “flow” (i.e. agentic) feature. It’s a new UX for how devs and AI agents interact, and it is the single most compelling part of the product.

Cascade’s Write mode, in particular, was effective at making multi-file code changes across my codebase. It’s especially convenient for grunt work, like updating your codebase to make room for a newly added column in your database or refactoring a React component used in multiple places.

Of course, there were limitations. Windsurf sometimes rewrote working code (one step forward, two back), got stuck in debugging loops, and lacked access to the latest documentation or SDK/API usage. But these are minor issues compared to the value it provides.

While AI agents like Windsurf’s Cascade help you get pretty far, the last 30% can indeed be frustrating. The reality is that you still need a good technical foundation. You need to at least be able to read basic code and understand how web apps and the internet generally work, how web clients and servers communicate and exchange data, and how to design systems.

Why is having a technical foundation so important? Because AI doesn’t have the vision, motivation, and taste that you have. It can’t read your mind, so its agentic actions are only as good as your prompts. It doesn’t deploy and it cannot ship on its own. Working with AI code agents isn’t a sequence where the AI delivers the first 70% and then hands off to you for the remaining 30%. Instead, it’s a constant process of prompting, guiding, and refining, with the AI doing the work faster, more diligently, and more tirelessly than if you had done it yourself.

Evidently, the future for code writing isn’t quite here yet, but it isn’t too far away either. AI doesn’t do everything and you still need to be technical enough to direct it. AI coding agents like Windsurf and foundational models fine-tuned for coding will lead us there.

Below are some notes I took about Windsurf while using it to build my side project.

Observations

Cascade is an amazing UX for AI code editing: Its ability to look through and make multi-file code changes across the entire codebase is incredibly powerful. If I change my database schema, I can just ask it to update all my types accordingly based on my new schema. This work is not fun for anyone to do and the AI can do it faster and with fewer mistakes.
Clear boundaries, better code: My experience building Dropbase’s AI features taught me that defining clear boundaries for code generation leads to more accurate results. Full-stack frameworks like Next.js likely make it easier for LLMs to generate consistent code across the app. I highly recommend using a popular full stack framework to build your app with.
Personal AI coding buddy: I was delighted to see suggestions for terminal commands to run. Cascade watches stdout and automatically suggests a command to run if the previous one failed – it felt like having a helpful coding buddy who is ready to work whenever I want to, minus the judgement for not remembering syntax for common commands.
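
To make the schema-change chore from the first observation concrete, here is a minimal TypeScript sketch of the kind of edit involved. The table and column names are hypothetical (they are not FearCutter’s real schema), and this is just one possible way to keep types in step with a schema: the row type is derived from a single table definition, so a newly added column shows up everywhere the type is used.

```typescript
// Hypothetical table definition; column names are illustrative only.
const interviewSessions = {
  id: "string",
  userId: "string",
  score: "number",
  // Newly added column: the derived type below picks it up automatically.
  completed: "boolean",
} as const;

// Map the column-kind labels used above to TypeScript types.
type KindToType = { string: string; number: number; boolean: boolean };

// Derive a row type from a table definition.
type RowOf<T extends Record<string, keyof KindToType>> = {
  -readonly [K in keyof T]: KindToType[T[K]];
};

type InterviewSession = RowOf<typeof interviewSessions>;

// Runtime check that a fetched record has every column with the expected kind.
function matchesSchema(
  schema: Record<string, keyof KindToType>,
  row: Record<string, unknown>,
): boolean {
  return Object.entries(schema).every(([col, kind]) => typeof row[col] === kind);
}

// The compiler rejects this object if `completed` is left out.
const session: InterviewSession = { id: "s1", userId: "u1", score: 82, completed: false };
const sessionIsValid = matchesSchema(interviewSessions, session);
```

With this shape, adding a column is a one-line change to the definition, and the compiler points at every place that still needs updating, which is roughly the list of files an agent would have to touch.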

Bugs

Anthropic Warnings: Sometimes when asking Cascade to bypass checks or modify system prompts for the AI interviewer, I received warnings from Anthropic. It’s like Anthropic thinks I’m trying to hack it, but I’m just trying to get things done in my own app.
Accept/Reject Issues: Occasionally, "Accept Changes" and "Reject Changes" in Cascade Write mode didn’t work, and I had to rely on commit history to recover. In a couple of instances, accepting or rejecting changes deleted all the code from my root page, which caused me major panic; at least twice, it was an explicit reject that wiped my entire /page.tsx and broke the app. In one or two cases I lost progress and had to go back through several previous commits, pick out the working code, and stitch a working version back together by hand. Committing often is essential.
Writer Flow Risks: Stopping Writer Flow midway can break the entire app, so over-committing is highly recommended. By over-committing, I mean running git commit after every Write mode iteration or after each successful code change that doesn’t break your app.
Chat Mode/Founder Mode: Cascade Chat mode sometimes operates in Founder Mode, trying to write code on my behalf, which can be both helpful and frustrating. I expected Chat mode to just chat, not code, but it occasionally wanted to take over.
Aggressive Write Mode: Write mode can be overly aggressive, so I often had to ask it to confirm changes with me before editing files. I’m conflicted on this one though, because in many cases it went ahead and did what I wanted.
Unwanted Actions: While Writer mode usually makes good judgments, it sometimes takes unexpected actions. For example, it automatically implemented a redirect from the /dashboard page to the root page /, which wasn't my intention. This was frustrating because just before this it had proposed 3 options. Those were good and I preferred option 1, but it proceeded with option 2 without giving me a chance to choose. For context, I generally dislike UX that forcefully redirects logged-in users to the app’s /dashboard instead of allowing them to go to the company’s landing page /, especially when they are trying to read about product features or pricing.
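
The over-committing habit recommended above can be sketched as a small checkpoint helper. This is a minimal sketch and not part of Windsurf: the `Runner` type, the `checkpoint` function, the commit-message format, and the file path in the fake status output are all assumptions made for illustration. The command runner is injected so the example stays independent of any particular process API and can be dry-run.

```typescript
// A runner executes a command and returns its stdout. It is injected so the
// sketch does not depend on a specific process API (hypothetical design).
type Runner = (cmd: string, args: string[]) => string;

// Commit everything in the working tree if anything changed.
// Returns true when a commit was made, false when the tree was clean.
function checkpoint(label: string, run: Runner): boolean {
  // `git status --porcelain` prints one line per changed file, nothing if clean.
  const status = run("git", ["status", "--porcelain"]);
  if (status.trim() === "") return false;
  run("git", ["add", "-A"]);
  run("git", ["commit", "-m", `checkpoint: ${label}`]);
  return true;
}

// Dry-run runner for demonstration: records commands instead of executing them.
function makeDryRunner(fakeStatus: string): { run: Runner; calls: string[] } {
  const calls: string[] = [];
  const run: Runner = (cmd, args) => {
    calls.push([cmd, ...args].join(" "));
    return args[0] === "status" ? fakeStatus : "";
  };
  return { run, calls };
}

// Simulate a working tree with one modified file (illustrative path).
const dirty = makeDryRunner(" M app/page.tsx\n");
const committed = checkpoint("after Write run", dirty.run);

// Simulate a clean working tree: no commit should be attempted.
const clean = makeDryRunner("");
const committedClean = checkpoint("nothing changed", clean.run);
```

In a Node script, a real runner could wrap `execFileSync` from `node:child_process` with `encoding: "utf8"`. Calling `checkpoint` after every Write mode iteration that leaves the app working gives you a commit to fall back on when a later run breaks something.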

Feature Requests

Feature Scoping/Code Boundaries: Asking Cascade to write new features sometimes makes it rewrite previously working features, a kind of regression induced by AI hallucination. This was painful because it broke things that were already working. I’ve heard the “one step forward, two steps back” complaint about coding tools. To be fair, human devs do this all the time. My workaround was to do an extraordinary number of write → test → commit cycles, which let me revert changes when things went wrong. Maybe some sort of “feature scoping” or “code boundary” feature would help. Or maybe I just need to write longer, more detailed prompts, or manually add comment boundaries once I have a good working set of features. Ideally the product UX would handle this for me. It’d feel more magical and delightful.
Enhanced "Flow" and Terminal Awareness: Windsurf could be even more agentic by continuously monitoring the terminal or browser for errors and feeding that information back into the code editing process. It already does this to some extent, but it could be taken further. For example, it could take periodic screenshots or snapshots of the browser’s console/network logs of my localhost:3000 or copy the text error to do another Write run. Codeium has already built a Chrome extension and could just close the loop here to create a magical debugging experience for web app or frontend development, where the extension syncs/feeds data to Cascade (with the right user permissions enabled, of course).
Database Schema Awareness: Currently Windsurf isn’t aware of my database schema, even though it knows what database I’m using and how to construct calls to fetch the data. It also keeps forgetting that I gave it schema information in previous steps, so I need to constantly re-supply it. In my experience, schema awareness significantly reduces hallucinations and improves generated type code (if you use TypeScript or Pydantic). When I built Dropbase’s AI features, I learned that if there’s one piece of information about your app that will improve code generation, it’s the database schema. Maybe using something like Anthropic’s MCP to connect to a database or documentation context could help. This would allow the AI to understand the data model and generate more accurate code. Another option is a memory feature that saves schema info and always provides it to the agent’s context.
Better Auto-Debugging: When Cascade gets stuck in a loop of writing and re-writing the same code, it’s usually a sign that it’s missing some context or information. Cascade should suggest adding logs or print statements to isolate the problem. Better auto-debugging tools would be a huge win.
Up-to-Date Knowledge: Windsurf needs to be able to access and incorporate information from web searches, recent documentation, and SDK/API updates. Trying to build a Next.js 15 app, for example, is likely to fail. Any way of keeping up-to-date documentation in context could be a huge competitive advantage because, as good as Sonnet 3.5 is, it still has a training cutoff date.

Surgical Version Control: Since Windsurf is a fork of VS Code, there’s a lot of UX flexibility. I'd love to be able to revert changes from specific Writer iterations, not just steps within a Writer iteration. For example, if you go through multiple Writer runs that progressively edit the same file, you might realize that it made some mistakes later that broke one of your other files. At that point you can’t recover checkpoints in a more surgical manner. “Revert, backtrack, surgical version control” would be killer features – maybe even tie code changes to specific Writer runs.
Reset Context: Resetting app context would be useful. I want to be able to access some of the context that Windsurf stores and edit or reset it so that it better reflects the current app’s functionality. This could help reduce the issues with “one step forward, two steps back” and better keep track of database schema. Reset could be done by scanning through the codebase and re-generating codebase context to match a more updated version of the app.
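
Until something like the schema-awareness request above exists, one workaround is to keep the schema in a single place and render it into a short text block that gets pasted at the top of each prompt. The sketch below is an assumption-heavy illustration, not a Windsurf or MCP API: the `Table` and `Column` types, the `renderSchemaContext` function, the wording of the header line, and the table names are all hypothetical.

```typescript
// Hypothetical schema description; table and column names are illustrative.
type Column = { name: string; type: string; nullable?: boolean };
type Table = { name: string; columns: Column[] };

const schema: Table[] = [
  {
    name: "users",
    columns: [
      { name: "id", type: "uuid" },
      { name: "email", type: "text" },
    ],
  },
  {
    name: "interview_sessions",
    columns: [
      { name: "id", type: "uuid" },
      { name: "user_id", type: "uuid" },
      { name: "score", type: "integer", nullable: true },
    ],
  },
];

// Render the schema as a compact text block to place at the top of a prompt.
function renderSchemaContext(tables: Table[]): string {
  const lines = tables.map((t) => {
    const cols = t.columns
      .map((c) => `${c.name} ${c.type}${c.nullable ? "" : " not null"}`)
      .join(", ");
    return `- ${t.name}(${cols})`;
  });
  return ["Database schema (source of truth, do not invent columns):", ...lines].join("\n");
}

const schemaContext = renderSchemaContext(schema);
```

Re-supplying the same generated block every time is tedious but cheap, and it keeps the agent’s picture of the data model tied to one definition instead of whatever it remembers from earlier steps.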
