The Thing I Find Most Baffling About the Programming Tests I’ve Been Running
The thing I find most baffling about the programming tests I’ve been running is that tools based on the same large language model tend to perform quite differently.
For example, ChatGPT, Perplexity, and GitHub Copilot are all based on the GPT-4 model from OpenAI. But, as I’ll show you below, while ChatGPT and Perplexity’s pro plans performed excellently, GitHub Copilot failed as often as it succeeded.
Testing GitHub Copilot
I tested GitHub Copilot embedded inside a VS Code instance. I’ll explain how to set that up and use GitHub Copilot in an upcoming step-by-step article. But first, let’s run through the tests.
Test 1: Writing a WordPress Plugin
GitHub Copilot failed this test miserably. This was my first test, so I can’t yet tell whether GitHub Copilot is terrible at writing code or whether the context in which one interacts with it is so limiting that it can’t meet this requirement.
Test 2: Rewriting a String Function
This test is fairly simple. I wrote a function that was supposed to validate dollars and cents but wound up accepting only integers (whole dollars). The test asks the AI to fix the code.
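To make the kind of bug concrete, here is a minimal sketch in Python. It is not the code from the test: the function names, the regular expressions, and the exact rules (digits only, with an optional decimal point followed by one or two digits) are all assumptions chosen for illustration.

```python
import re

# Hypothetical patterns for illustration; the article's original code is not shown.
# The "buggy" pattern accepts whole dollars only.
INTEGER_ONLY = re.compile(r"[0-9]+")
# The "fixed" pattern also accepts an optional cents part of one or two digits.
DOLLARS_AND_CENTS = re.compile(r"[0-9]+(\.[0-9]{1,2})?")


def is_valid_amount_buggy(text: str) -> bool:
    """Return True only for whole-dollar amounts such as '12'."""
    return INTEGER_ONLY.fullmatch(text.strip()) is not None


def is_valid_amount(text: str) -> bool:
    """Return True for amounts such as '12', '12.5', or '12.50'."""
    return DOLLARS_AND_CENTS.fullmatch(text.strip()) is not None


if __name__ == "__main__":
    for sample in ["12", "12.50", "12.505", "abc"]:
        print(sample, is_valid_amount_buggy(sample), is_valid_amount(sample))
```

Under these assumed rules, the buggy version rejects an input like "12.50" while the corrected version accepts it. Whether inputs such as "12." or ".50" should count as valid is a design choice this sketch settles arbitrarily by rejecting both.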
Test 3: Finding an Annoying Bug
GitHub Copilot got this right. This is another test pulled from my real-life coding escapades. What made this bug so annoying (and difficult to figure out) is that the error message isn’t directly related to the actual problem.
Test 4: Writing a Script
Here, too, GitHub Copilot succeeded where Microsoft Copilot failed. The challenge is that the script requires knowledge of three things at once: AppleScript, the Chrome object model, and a little Mac-only third-party utility called Keyboard Maestro.
Final Thoughts
Given that GitHub Copilot uses GPT-4, I find the fact that it failed half of the tests discouraging. GitHub is just about the most popular source management environment on the planet, and one would hope that its AI coding support would be reasonably reliable.
Conclusion
As with all things AI, I’m sure performance will get better. Let’s stay tuned and check back in a few months to see whether the AI is more effective by then.
Frequently Asked Questions
Q: Do you use an AI to help with coding? What AI do you prefer? Have you tried GitHub Copilot?
A: We’d love to hear your thoughts! Share your experiences with AI-powered coding tools in the comments below.
Q: How do you stay up-to-date with the latest developments in AI-powered coding?
A: Follow our AI and coding coverage, and stay tuned for more updates on the latest AI-powered coding tools and services.