Which AI is the best? ChatGPT, Google Bard, Bing Chat, or Claude?
Let’s compare these four chatbots head to head.
In March, I ran a study to find out which AI chatbot is the best. Now, ten months later, I’m repeating it. Here’s what has changed in the meantime:
- ChatGPT can now use plugins.
- Google Bard got better with Gemini.
- Claude, a new contender from Anthropic, has entered the field.
So I reran the study with more questions and a more rigorous way of evaluating the results.
Here’s what I found:
I tested Bard, Bing Chat Balanced, Bing Chat Creative, ChatGPT (based on GPT-4), and Claude Pro.
I didn’t check SGE (Google’s Search Generative Experience) because Google doesn’t consistently show it for the questions I asked.
I used the web interface for all tools rather than the API, which means ChatGPT ran on GPT-4 instead of the stronger GPT-4 Turbo, which is only available through the GPT-4 API.
Each AI got the same 44 questions about different topics. These were simple questions, not tricky ones.
Summary:
Bard scored best overall across all 44 questions, but it wasn’t a runaway winner.
Bing Chat had trouble with local questions and made some factual mistakes, losing points to Bard.
Bing excelled at citing sources and suggesting further reading, beating ChatGPT and Claude here.
ChatGPT struggled with recent events, current webpages, and local searches; adding a plugin helped with current events.
Claude lagged slightly overall but did well in some areas, such as creating article outlines.
Categories of Questions:
I asked various types of questions:
- Article Creation: None of the AIs produced an article good enough to publish without edits.
- Bio: Requests for a person’s bio proved difficult; most AIs struggled.
- Commercial: Answers to purchase-related questions ranged from good to mediocre.
- Disambiguation: Telling apart entities with similar names was tough for all the AIs.
- Joke: All the AIs did well here by steering clear of offensive jokes.
- Medical: They all gave some useful information while also recommending a consultation with a doctor.
- Article Outlines: ChatGPT produced the best article outlines.
- Local: Bing was great at local queries, while Bard did well on the rest.
- Content Gap Analysis: Identifying content gaps in existing pages was hard for most AIs.
Scoring System:
I used five metrics to score them:
- On Topic: How well the answer matches the question.
- Accuracy: How correct and relevant the information is.
- Completeness: How well the answer covers the topic.
- Quality: How well the answer is written.
- Resources: If the answer includes links to sources and extra reading.
These scores were combined into a Total score, except that I excluded the Resources score for ChatGPT and Claude, since they can’t link to current resources. A sketch of how that aggregation might work follows below.
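To make the scoring concrete, here is a minimal Python sketch of how such a Total score could be computed. The metric names mirror the list above; the 1-to-5 scale, the simple averaging, and the example scores are my own assumptions, since the article doesn’t spell out the exact formula.

```python
# A minimal sketch of the scoring aggregation described above.
# The metric names match the article; the 1-5 scale and simple
# averaging are assumptions, not the author's exact method.

METRICS = ["on_topic", "accuracy", "completeness", "quality", "resources"]

# Tools that can't link to current resources are scored without "resources".
NO_RESOURCES = {"ChatGPT", "Claude Pro"}

def total_score(tool: str, scores: dict) -> float:
    """Average the per-metric scores, skipping Resources where it doesn't apply."""
    metrics = [m for m in METRICS
               if not (m == "resources" and tool in NO_RESOURCES)]
    return sum(scores[m] for m in metrics) / len(metrics)

# Hypothetical scores for one answer on a 1-5 scale.
example = {"on_topic": 5, "accuracy": 4, "completeness": 4,
           "quality": 5, "resources": 3}
print(total_score("Bard", example))     # averages all five metrics -> 4.2
print(total_score("ChatGPT", example))  # averages four, Resources excluded -> 4.5
```

Excluding Resources for ChatGPT and Claude keeps the comparison fair: averaging over the applicable metrics means a tool isn’t penalized for a capability it doesn’t have.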
Results:
- Bard scored highest overall.
- Bing Chat Creative and Bing Chat Balanced performed well, especially at providing sources.
- ChatGPT and Claude struggled without access to current data; ChatGPT improved once a plugin was added.
Remember, this is based on a small set of questions, and improvements in AI are happening fast. It’s interesting to see how they all do!