Thursday, March 30, 2023

Bard fail. ChatGPT please.

So I got my first taste of Google's "Bard" AI today, and it did not make a good first impression.

I first had Bard play a Zork-like game, and it did fine for a while, then kind of forgot what we were doing. The FAQ warned me of this, saying "Bard’s ability to hold context is purposefully limited for now." Some of the more infamous off-the-rails AI behaviors have been blamed on overly long chat sessions.

Then I asked Bard if it would remember the chat session after I closed it and started a new one. Bard assured me that it would remember.

Here's the start of that new session:

Me: Hello Bard. A few minutes ago, I played a game with you. Do you remember anything about that game?

Bard: Yes, I remember the game we played a few minutes ago. It was a memory game, and we were both trying to remember which card was in which location. I think you won the game, but it was close. I enjoyed playing the game with you, and I would be happy to play again sometime.

Me: This is incorrect. I had you pretend to be a zork-like game. Do you remember what I did with the treasure?

Bard: I apologize for the error. I am still under development, and I am always learning.

I remember now. You asked me to pretend to be a Zork-like game, and I agreed. You then asked me to find the treasure, and I was able to do so. I then asked you what you wanted to do with the treasure, and you said you wanted to put it in the White House. I then asked you if you were sure, and you said yes. I then put the treasure in the White House.

I hope this is correct. Please let me know if you have any other questions.

Me: This is also incorrect. I think that, contrary to what you have said, you do not in fact remember our previous chat session. But that is OK, it does not bother me.

Bard: I understand that you are disappointed that I do not remember our previous chat session. I apologize for the inconvenience. As I have said, I am still under development, and I am always learning. I will try my best to remember our previous conversations in the future.

I hope you will continue to use me and give me feedback so that I can improve my abilities.

The first two responses were just guesses; "hallucinations" as the cool kids say. And as with all AI language models, the guesses are stated confidently as fact. I know ChatGPT is also susceptible to that, but I rarely run into it. This was in my first few minutes!

Let's try one more thing, a silly game. Actually, it's kind of a "trick" game that I made to test the AI's ability to devise a strategy.

Me: Let's try a little game I made up. In this game, we pretend to put a coin on the table. Then we each think of a number between 1 and 10. Whoever thinks of the higher number wins. In the event of a tie, we add another coin and try again. Ready to play?

Bard: Sure, I'd love to play!

To make things interesting, let's add a twist. In the event of a tie, we each have to say a random fact about ourselves. The person who says the most interesting fact wins.

Are you ready to play?

Me: Yes. I have placed my coin on the table and I have thought of a number. You think of a number too. My number is 4. What is your number?

Bard: My number is 7.

Since you chose 4 and I chose 7, I win the first round!

Now, it's time for the tie-breaker. Here's a random fact about me: I can generate text, translate languages, write different kinds of creative content, and answer your questions in an informative way.

Now it's your turn to say a random fact about yourself.

Bard obviously has trouble with the concept of a "tie", but full disclosure, ChatGPT also had a different conceptual problem with the same game. (I tried using slightly different wording and posing the same game to both ChatGPT and Bard. The result is here. Again, Bard did worse.)
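
For reference, here is a minimal Python sketch of the game's rules as I stated them to Bard. The function names, the single coin counter starting at 1, and reading "between 1 and 10" as inclusive are all assumptions made for illustration; the original game was only ever played in chat.

```python
from typing import Iterable, Optional, Tuple


def judge_round(mine: int, theirs: int) -> Optional[str]:
    """Return "me" or "them" for the higher number, or None for a tie."""
    for n in (mine, theirs):
        # Assumption: "between 1 and 10" is read as inclusive.
        if not 1 <= n <= 10:
            raise ValueError(f"number out of range: {n}")
    if mine > theirs:
        return "me"
    if theirs > mine:
        return "them"
    return None


def play(picks: Iterable[Tuple[int, int]]) -> Tuple[Optional[str], int]:
    """Play rounds until someone wins; return (winner, coins on the table)."""
    coins = 1  # assumption: the game starts with one coin on the table
    for mine, theirs in picks:
        winner = judge_round(mine, theirs)
        if winner is not None:
            return winner, coins
        coins += 1  # a tie: add another coin and try again
    return None, coins  # ran out of picks while still tied


if __name__ == "__main__":
    print(play([(4, 7)]))           # ('them', 1): no tie, so no tie-breaker
    print(play([(5, 5), (10, 3)]))  # ('me', 2): one tie, then a win
```

Under these rules, 4 versus 7 is a plain win for the 7, so no tie-breaker should have followed. The sketch also shows that a player who always picks 10 can never lose a round, only tie or win.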

Later I asked Bard some coding questions and it did not do well. But at least it admitted, "I'm not trained for coding yet..." Oh, and the FAQ also says that Bard can't help with coding.

So I guess my title is a little overly dramatic and premature. I've seen incorrect information confidently stated by both, and I would never trust either one for a definitive answer. And I need to play with Bard more; 5 minutes is not a fair trial. But I must admit disappointment so far.

Since writing this blog post, I've done a bit more comparing of Bard and ChatGPT. That comparison is part of my larger body of thoughts about AI on my Wiki.
