AI Code Assistants for Android Engineers

I reviewed the available agentic coding options for Android engineers while pursuing my own workflow adaptation.

I've recently written about adapting to AI coding. It was great and exciting, especially because I wasn't doing Android product work all too often. When I did need to build an app to test something, I switched back and forth with Android Studio, which was annoying and never felt like an effective workflow. The Kotlin language server available at the time was also pretty bad and a huge resource hog even without any compilation running, and VS Code doesn't have a strong track record for JVM projects, so I knew I needed to be on the lookout for what was coming next. This is the journey I went on in my personal projects.

Evaluation Criteria

My workflow expectation today is a fully agentic coding experience with checkpoints, AI code review, rules, MCP, and context management, all in an ergonomic UX. I know it's not unrealistic because I have that today. Everything else in my criteria list is more of an iterative filter.

  • Working from Android Studio takes the top spot for me as an Android engineer. I've seen arguments that in 6 months the mobile-specific tooling of these IDEs isn't going to matter - I'm in the opposite camp. I think there is enough competition that we're going to get great experiences everywhere, because all the IDE maintainers want to keep or grow their developer market share. Tools that can't work from Android Studio hog resources and create unnecessary context switching.
  • Avoiding model lock-in. Yes, Claude 4 and Gemini 2.5 Pro seem like the best coding agent models today, but being able to easily swap to whatever works for your workflow lets you adapt and get more out of tools as they add support for new models.
  • Greater MCP tool support. None of the clients today implement the entire spec, and I bet this area will be a race to see who can be compatible with more tools faster while not incurring security incidents. Tool discovery, resources, LLM streaming, and a number of other features aren't implemented in any of the following clients, so any part of the MCP spec that one tool supports and another doesn't is worth measuring. I've implemented all the transports for my MCPs, and fast-agent has implemented the entire MCP spec, so it's a matter of time before the AI coding tools are expected to have full support. (A minimal wire-level example of tool discovery follows this list.)
  • UPDATE: While it wasn't in my criteria when I originally wrote this post, background agents and headless operation are quickly becoming capabilities across multiple agentic coding tools.
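
To make the tool-discovery point concrete, here's a minimal sketch of the MCP tools/list exchange as it appears on the wire (JSON-RPC 2.0). The tool name and schema below are hypothetical - real servers advertise their own tools this way.

    Client -> Server:
    {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}

    Server -> Client:
    {"jsonrpc": "2.0", "id": 1, "result": {"tools": [{
      "name": "run_gradle_sync",
      "description": "Trigger a Gradle sync for the open project",
      "inputSchema": {"type": "object", "properties": {}}
    }]}}

A client that implements discovery picks these up automatically; one that doesn't leaves you wiring tools up by hand, which is the gap worth measuring.
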
I have limited time and there are a lot of tools. If a tool isn't listed here, it's probably because it's not being discussed in the Android community. Don't send me review requests; just write your own.

GitHub Copilot

To be honest I haven't used this in a really long time; mid-2024 was probably the last time. I had a paid subscription on GitHub that apparently ended in December when they introduced a free tier, and I didn't notice. Autocomplete just wasn't compelling at the time, and I don't really want to try it for a proper review because I keep hearing from peers that it's just not great. Glad they stopped billing me for something I wasn't using.

CodeGPT / ProxyAI

I'd just ended 2024 leaving an AI startup where I spent a lot of my time iterating on a local LLM product and workflow, so I was excited to see an IntelliJ plugin named CodeGPT that could integrate with all the typical AI provider APIs (OpenAI/Anthropic) while also providing Ollama support out of the box. Ollama still doesn't have MLX (though I'm absolutely tracking that development).

I tried this plugin for about a week while I was learning Cursor and establishing my workflow there. It was buggy and strange, and I moved on. To be fair in this evaluation I installed the latest version, now named ProxyAI. It's vastly improved since then.

Open source doesn't mean free, but their pricing is super modest compared to the rest.

Pros:

  • All the models
  • Works in Android Studio
  • Interesting UX
  • Open source! Neat as heck to be able to see what the engine is doing, and I bet with the release of Koog we could see OSS get better faster.

Cons:

  • No MCP support (but they say it's coming soon)
  • The apply model was pretty bad when I tried it months ago, but now those kinks seem mostly worked out. Still not as good as the other solutions, but it's impressive that it's getting closer.

Cursor

Wow. What a trip it was to write Dockerfile optimizations that I could instantly and easily verify. Writing quick scripts, asking it for analysis while crawling web resources for documentation, rewriting some JVM memory analysis scripts, it did everything I threw at it. The code review UX they authored was something that took getting used to - but also wasn't hard to grok or get up to speed on. Suddenly I wasn't writing code some days, even though I was still creating highly valuable software.

Pros:

  • All the models, released about as fast as I heard about them. I've read claims that they're doing weird stuff with context windows on certain models, but in practice it hasn't felt that significant.
  • The extensible rule system caught my eye but I didn't know what to do with it at the time. Lately I've shared what I've been able to do with Firebender with Cursor users and they're getting similar benefits, which shows that both rule systems are flexible enough to allow for sophisticated AI-assisted workflows.
  • Table stakes MCP tooling support.

Cons:

  • The Kotlin language server performs terribly and didn't even seem to help the coding agent - it just gave me syntax highlighting. That was without the Java and Gradle extensions running, each of which was also a resource hog. I found it more efficient to switch back and forth with Android Studio.
  • While all the models are available, there is some extra add-on pricing around certain models and reduced context windows. In practice I didn't notice it being that big of an issue.
  • The "Auto" model selection by default really threw me for a loop. There were definitely times I felt like I was talking to Claude 3.5, and now that I've gone back and retested the same prompts with MCPs it's clear that Cursor has some internal heuristics that swap out models.
  • MCP tooling limited to ~40 tools. It doesn't quite make sense to me to have an arbitrary, unconfigurable limit in a dev tool. I stopped using it before they rolled out the ability to enable/disable individual tools to focus on what you want; glad it's iterating in the right direction.

Firebender

It's come a long way since I first heard about it in March. Today it's the best agentic coding experience available in Android Studio + IntelliJ, and as an individual this is what I use day-to-day for my personal work. I haven't had a chance to try the new Composer features (and I'm not doing anything with Figma right now), so hopefully someone else can explore that part of it. I wrote my Android MCP SDK and the AWS Bedrock implementation for koog with it; it's definitely helped me get going on open source contributions in a way I didn't have time for before.

Pros:

  • All the models. Yes, Claude 4 and Gemini 2.5 Pro seem like the best coding agent models, but being able to easily swap to the latest and greatest matters.
  • Fast shipping. They're loosely coupled with the IDE because it's a plugin, so features and bug fixes come often. While I was writing this post they released support for the latest OpenAI o3-pro model on 6/11, before I'd even heard it was a thing.
  • Extremely open to bug reports, feature requests, and building a community on their Discord. Lots of tools have support communities, but extremely few give the level of responsiveness Firebender is dedicating. None of the other tools come close in this respect.
  • It's performant. They keep making their apply model, code completion, and inline editing features faster and faster. I started working on an MCP in both Cursor and Firebender, and Firebender got to a working MVP twice as fast. While it does take some resources to run, it's nothing compared to the Gradle builds I run on my machines.
  • Rules support directly referencing markdown via Full File Rules, which lets you leverage existing docs as part of agent rules. While I've wondered whether I'd miss the Cursor-style ability to use rules to guide the invocation of other rules, in practice I haven't found that to be a serious gap.
  • MCP support for tool calls is included, along with the ability to disable specific individual tools - great for when I only want parts of an MCP server. I discovered this before I knew Cursor had it; not sure who came up with it first.
  • It's great at Kotlin - the way it resolves issues by observing IDE lint feedback is neat. The lint-warning fixing doesn't work as well in other languages like TypeScript, but I just use rules to tell it to run npx eslint . --fix (see the sketch after this list).
  • The Gradle sync feature they recently released is excellent, especially in small projects. Until recently I depended on the JetBrains MCP tool call for Gradle sync, which was a big, unnecessary use of AI interpretation when a deterministic tool does the job 20x faster.
  • One of the few that outlines in their code policy that they do not train on code data.
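
As a concrete example of the rules points above, here's the kind of markdown rule doc I mean. The file path and wording are my own invention; a Full File Rule simply points the agent at a markdown doc like this and treats its contents as instructions.

    <!-- docs/agent-rules/typescript.md (hypothetical path; any existing doc can be referenced) -->
    When editing TypeScript files in this repo:
    - After applying changes, run `npx eslint . --fix`.
    - Re-read the remaining eslint output and fix any warnings the autofix couldn't handle.

Because the rule is plain markdown, the same file can double as human-facing documentation, which is the whole appeal of referencing existing docs directly.
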
Note on code policy and model providers

There is a line in the code policy that says:

"If you use Firebender Chat, the code you upload will be sent to a third-party model provider such as OpenAI or Anthropic based on the model you select. The handling of your data by model providers is subject to their own policies, which may change at any time. We are not responsible for anything a third-party model provider may do with your data."

So I asked about it, and Firebender's response was:

"The code sent to us is 0 retention on our servers. For the model providers, we have 0 retention agreements with Anthropic and Open AI. The other model providers (Gemini, Grok) have a 30 day retention policy but a no train policy on that code (https://cloud.google.com/vertex-ai/generative-ai/docs/data-governance, https://x.ai/legal/terms-of-service-enterprise) does that clear up your confusions? happy to answer more questions! we wouldn't use a model provider / send code to a third party if they didn't have a DPA preventing training on data"

Cons:

  • Bugs! Like any good startup they're shipping often, so that's going to happen, but across their releases I only encountered one show-stopper, which was fixed immediately, and the auth and network issues were resolved in the recent 0.11.2 release. Most bugs I've encountered are small usability things, like Markdown formatting being slightly off on bold tags directly above list items.
  • The checkpointing experience still has a lot of room for improvement - specifically in providing full undo/redo on agent coding suggestions. If you're like me and constantly doing multiple things at once in the IDE while the agent is working for you, this isn't quite there yet.
  • Local LLM support is restricted to Read mode only, which is pretty limited compared to other tools. I don't see a realistic local LLM use for coding assistants today (unless you have a $10k Mac Studio lying around), so it's a bit moot.
  • MCP support works for what I need, but there are a number of things, like SSE/streaming and tool progress notifications, that are becoming table stakes in other agentic tools (example below).
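
For context on that last point, tool progress in MCP is just a notification the server streams while a long call runs. Here's a minimal sketch of what one looks like on the wire - the token and numbers are made up:

    {"jsonrpc": "2.0", "method": "notifications/progress",
     "params": {"progressToken": "gradle-build-1", "progress": 40, "total": 100}}

A client that surfaces these can show a long Gradle task ticking along instead of a frozen spinner, which is why it matters for Android-sized tool calls.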

Windsurf

Pros:

  • All the models.
  • Agent mode was competent, especially when using Claude Sonnet.

Cons:

  • When I tried it the only way to get it working at all was enabling a pre-release version of the plugin and then restarting the IDE multiple times.
  • No rules or MCP support.
  • Show stopper broken code review UX.
  • WebViews on their own aren't inherently bad, but this implementation is subpar.
  • Instantly hallucinated in code completion mode every time.

IntelliJ AI Assistant

Pros:

  • Got it working in Android Studio.
  • All the models - Anthropic, Google, DeepSeek, etc. It's unclear what pricing looks like for accessing them.
  • Ollama + LM Studio fully integrated. It's pretty huge to be able to use those, since it enables a completely offline local workflow - and probably a big cost saver for power users.
  • Good at Kotlin. It generally passes my tests on how it understands and applies code.
  • MCP support for tool calling is included, though nothing special relative to the others.
  • Since it's from the folks who created IntelliJ, it's implicit that they will maintain this as a first-party feature.

Cons:

  • It is available in Android Studio - with caveats. If you are on Meerkat you have to scroll through a couple pages of plugin versions, then download and manually install the right one. For some reason my Meerkat couldn't find the plugin on the JetBrains Plugin Marketplace. I didn't have any issue using it from IntelliJ.
  • JetBrains deleted low-scoring reviews of their plugin. You just don't do that; you ship a better product, or rebrand. They did respond and apologize; hopefully they handle this better in the future.
  • I think they're mixing the branding of some IDE helper bits with the AI Assistant icon, and that's confusing - it makes it seem like you can't completely uninstall it.

Gemini in Android Studio

I tried it recently in Narwhal Canary 4, and while it's competent as an agentic experience, it's missing a few things:

Pros:

  • It's in Android Studio, no caveats, and it's integrated into far more areas of the IDE than any other tool.
  • Good at Kotlin. It generally passes my tests on how it understands and applies code.
  • I really like the context management UX here, specifically the information hierarchy and discoverability.
  • Since it's from Google, it's implicit that they will maintain this as a first-party feature of Android Studio. They have marketed it pretty heavily, and judging from Google I/O they are investing a lot in it.

Cons:

  • No Anthropic models, no MCP support.
    • On Anthropic, I presume this is precluded by Google. Gemini 2.5 Pro is a great model and I'm sure Google will develop more good models... but right now I use Claude, and it's better for tool calling. As benchmarks like tau-bench come out we should be evaluating model performance against them.
    • I'm an MCP developer, so I've got to have MCP support.
  • The review UX is similar to what Firebender's was several months ago: multi-step, hard to see the diff in the agent pane, and you can't make inline code edits to the agent's suggestions before accepting changes. It can show the inline diff via a button in the top right corner of the code suggestion pane, but it renders this in a read-only diff window instead of the actual file.
  • I noticed a bug where the agent loaded forever or just stopped iterating without any indicator.
  • The apply model is okay but slower than I'm used to.

Claude Desktop

Pros:

  • Decent MCP support. Since it's built by Anthropic, I'm pretty sure whatever they add here is something they're internally convinced the spec and partners will support, so other clients should fast-follow. The permissions system is probably good for people using MCP for the first time.
  • There are a lot of other pros to the system; I use it all the time for knowledge work. However, since I'm evaluating for Android, I need to jump pretty quickly to the cons...

Cons:

  • The main issue is that this is not usable for Android development. While I can control Android Studio via the JetBrains MCP, it is not a good workflow for an Android engineer. It took me 3 tries just to build the app without running over the context window. This partially shows how important it is to build MCPs that are responsible about the amount of output they generate and that minimize tool call counts (see the sketch after this list).
  • Model lock-in to Anthropic's models doesn't feel so bad to me since I use Claude 4 Sonnet 99% of the time, but it is what it is.
  • The only issue I have with MCP support in Claude Desktop is that I really wish permissions were part of the MCP spec instead of baked into the client. Instead of having the user approve each tool call (either just once or forever), there should be a more adaptable permission system: approving for time periods (days/weeks/months), letting MCP tools specify that approval should always be asked for, and so on. I filed a discussion request for the concept.
  • Observed UX jank several times when moving through the app.
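
On the tool-output point above, the mitigation mostly lives on the server side: cap what a tool returns before the client ever sees it. Here's a minimal Kotlin sketch - the function name and limits are my own, not from any particular MCP SDK:

    // Keep only the tail of a long process log (e.g. Gradle output) so a single
    // tool result doesn't consume the model's context window.
    fun truncateForContext(output: String, maxLines: Int = 200, maxChars: Int = 8_000): String {
        val tail = output.lineSequence().toList().takeLast(maxLines).joinToString("\n")
        return if (tail.length <= maxChars) tail
        else "...(truncated)...\n" + tail.takeLast(maxChars)
    }

Pair that with fewer, coarser tools (one "assemble and report failures" call instead of several chained ones) and a chat client like Claude Desktop becomes a lot more workable for longer tasks.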

Pick what works best for you

It's still early days, IMO, for all these tools. Most of the peers I talk to tried GitHub Copilot back in 2023 and eventually decided they were better off without it. Hopefully this is a helpful guide to the ever-evolving landscape, and the evaluation criteria can be remeasured over time. Today I use Firebender personally and am experimenting with Claude Code, koog, and other agentic frameworks. Now that I've got my IDE tooling where I want it, I'm looking forward to headless automation.