As part of my AI coding evaluations, I run a standardized series of four programming tests against each AI. These tests are designed to determine how well a given AI can help you program. This is kind of useful, especially if you’re counting on the AI to help you produce code. The last thing you want is for an AI helper to introduce more bugs into your work output, right?
Test 1: Write a simple WordPress plugin
Wow. Well, this is certainly a far cry from how Bard failed twice and how Gemini Advanced failed back in February 2024. Quite simply, Gemini Pro 2.5 aced this test right out of the gate. The challenge is to write a WordPress plugin that provides a simple user interface: it randomizes the input lines and distributes (rather than removes) duplicates so they don't end up next to each other.
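The plugin itself is PHP wrapped in a WordPress admin page, so I won't reproduce it here, but the core requirement is easy to sketch. What follows is a minimal Python illustration of the line-shuffling idea, my own rough take rather than Gemini's actual output: shuffle the lines, then greedily lay them out so identical lines are kept apart whenever that's possible.

    import random
    from collections import Counter

    def shuffle_lines_spread_duplicates(lines):
        """Shuffle lines, then order them so duplicate lines are spread apart.

        Greedy approach: repeatedly place the line with the most remaining
        copies that differs from the previously placed line. If every
        remaining line matches the previous one, adjacency can't be avoided,
        so place it anyway.
        """
        lines = list(lines)
        random.shuffle(lines)            # randomize the overall order first
        counts = Counter(lines)
        result = []
        prev = None
        while counts:
            # Candidates that won't repeat the previous line.
            candidates = [line for line in counts if line != prev]
            pool = candidates if candidates else list(counts)
            choice = max(pool, key=lambda line: counts[line])
            result.append(choice)
            counts[choice] -= 1
            if counts[choice] == 0:
                del counts[choice]
            prev = choice
        return result

    if __name__ == "__main__":
        sample = ["alpha", "beta", "alpha", "gamma", "alpha", "beta"]
        print("\n".join(shuffle_lines_spread_duplicates(sample)))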
Test 2: Rewrite a string function
In the second test, I asked Gemini Pro 2.5 to rewrite some string-processing code that handles currency input. My initial test code only allowed integers (so, dollars only), but the goal was to accept both dollars and cents. This is a test that ChatGPT got right. Bard initially failed, but eventually succeeded.
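I'm not reproducing my original function or Gemini's rewrite here, but the shape of the problem is simple to show. Here's a rough Python equivalent, assuming the goal is to accept entries like "5", "5.5", or "$5.50" and normalize them to cents; the function name and regex are illustrative, not the code from the actual test.

    import re

    # Accepts "5", "5.5", or "5.50" style amounts, with an optional leading "$".
    _AMOUNT_RE = re.compile(r"^\$?(\d+)(?:\.(\d{1,2}))?$")

    def parse_dollars_to_cents(text):
        """Return the amount in whole cents, or None if the input is invalid."""
        match = _AMOUNT_RE.match(text.strip())
        if match is None:
            return None
        dollars, cents = match.groups()
        cents = (cents or "0").ljust(2, "0")   # "5" -> "50", missing -> "00"
        return int(dollars) * 100 + int(cents)

    if __name__ == "__main__":
        for value in ["5", "$5.5", "5.50", "5.505", "five"]:
            print(value, "->", parse_dollars_to_cents(value))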
Test 3: Find a bug
At some point during my coding journey, I was struggling with a bug. My code should have worked, but it did not. The issue was far from immediately obvious, but when I asked ChatGPT, it pointed out that I was looking in the wrong place.
Test 4: Writing a script
This last test isn’t all that difficult in terms of programming skill. What it tests is the AI’s ability to jump between three different environments, and just how obscure those environments can be. The test requires understanding Chrome’s internal object model, how to write AppleScript (itself far more obscure than, say, Python), and then how to write code for Keyboard Maestro, a macro-building tool written by one guy in Australia.
Conclusion
It was really just a matter of when. Google is filled with many very, very smart people. In fact, it was Google that kicked off the generative AI boom in 2017 with its "Attention is all you need" research paper. So, while Bard, Gemini, and even Gemini Advanced failed miserably at my basic AI programming tests in the past, it was only a matter of time before Google’s flagship AI tool caught up with OpenAI’s offerings.
FAQs
Q: Have you tried Gemini Pro 2.5 yet?
A: Yes, I have tried it, and it has performed well on my coding tasks.
Q: How did it perform on your own coding tasks?
A: Gemini Pro 2.5 was able to assist me in writing a simple WordPress plugin, rewriting a string function, and finding a bug in my code.
Q: Do you think it has finally caught up to, or even surpassed, ChatGPT when it comes to programming help?
A: Yes, I believe Gemini Pro 2.5 has caught up to ChatGPT in terms of programming help.
Q: How important is speed versus accuracy when you’re relying on an AI assistant for development work?
A: Accuracy is more important than speed when relying on an AI assistant for development work.
Q: Did Gemini Pro 2.5 surprise you the way it did here?
A: Yes, Gemini Pro 2.5 surprised me with how well it was able to help me write code and track down bugs in my projects.

