Coding Challenge for Generative AI, Part 3: Google Bard

This post continues the series of coding challenges for Generative AI technologies by putting the challenge to Google’s Bard service, after Google announced that it had added coding abilities to Bard. Part 1 of the series looked at the performance of the GPT-3.5 model and part 2 looked at the performance of the GPT-4 model. For each part I give the same code interpretation challenge and see how the different services perform.

Session Start

What will be the result?
c = [False if int(i) % 2 != 0 else True for i in str(24901)]
b = [False if False in c else True]
print(b[0])

The result will be True.

c is a list of Boolean values, where each value is True if the corresponding digit in the number 24901 is even, and False if the corresponding digit is odd. The list is created using a list comprehension, which is a concise way to create a list based on the values of another list.

b is a list of Boolean values, where each value is True if the corresponding value in the list c is False. The list is created using another list comprehension.

print(b[0]) prints the first element of the list b. Since the first element of b is True, the output of the print statement is also True.

Here is a step-by-step breakdown of what happens:

The number 24901 is converted to a list of digits.
A list comprehension is used to create a list of Boolean values, where each value is True if the corresponding digit in the number is even, and False if the corresponding digit is odd.
Another list comprehension is used to create a list of Boolean values, where each value is True if the corresponding value in the list c is False.
The first element of the list b is printed.

Since the first element of b is True, the output of the print statement is also True.

Thoughts

That answer is a bit of a journey. It starts by stating the answer, which is wrong, but then describes well what is happening to create the c variable. Where it goes wrong is in interpreting b as a list comprehension as well. In reality b is just a trivial one-element list whose single value comes from checking whether c contains False.

So its key mistake here is interpreting the construct of b as a list comprehension, and other than that it got the explanation right.
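To make the difference concrete, here is a minimal sketch that evaluates the challenge one step at a time. The intermediate names (digits, has_odd_digit) are mine, added only for illustration; they are not part of the original challenge code.

```python
# Step-by-step evaluation of the challenge code.
# `digits` and `has_odd_digit` are illustrative names, not part of the original.
digits = [int(ch) for ch in str(24901)]   # [2, 4, 9, 0, 1]

# c really is a list comprehension: one Boolean per digit, True for even digits.
c = [False if d % 2 != 0 else True for d in digits]

# b is NOT a comprehension (there is no `for` clause). It is a list literal
# holding a single conditional expression, so it always has exactly one element.
has_odd_digit = False in c
b = [False if has_odd_digit else True]

print(c)     # [True, True, False, True, False]
print(b)     # [False]
print(b[0])  # False
```

So the correct output is False, because 24901 contains at least one odd digit.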

Seeing if it can correct itself

that’s wrong

You are correct. The result of the print statement is False.

The first element of the list c is False, because the first digit of the number 24901 is 2, which is even. Therefore, the first element of the list b is also False.

I apologize for the error. I am still under development and learning to perform many kinds of tasks, including answering your questions in an informative way. I will try my best to provide accurate and helpful information in the future.

Thoughts

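Eager to please, it quickly changes its answer to the right one. But the explanation is completely off: 2 is even, so the first element of c is True, not False, and the value in b doesn’t come from the first element of c anyway.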

The point in its apology about still being under development and learning is a good one.

Pressing On

that explanation isn’t quite right

You are correct. The first element of the list c is False, because the first digit of the number 24901 is 2, which is even. Therefore, the first element of the list b is also False.

The second element of the list c is True, because the second digit of the number 24901 is 4, which is even. Therefore, the second element of the list b is also True.

The third element of the list c is False, because the third digit of the number 24901 is 9, which is odd. Therefore, the third element of the list b is also False.

The fourth element of the list c is True, because the fourth digit of the number 24901 is 0, which is even. Therefore, the fourth element of the list b is also True.

The fifth element of the list c is False, because the fifth digit of the number 24901 is 1, which is odd. Therefore, the fifth element of the list b is also False.

Therefore, the output of the print statement is False.

Thoughts

Its fuller explanation here is also wrong, but it exposes something interesting. From the earlier interaction it has learned that the final answer should be False, so it evaluates the first value of c incorrectly, which lets its incorrect model of how b is evaluated arrive at that answer. Its evaluation of the remaining elements of c is correct based on the odd/even check. It just fudges the first one to get the right answer.

This part of the interaction does actually feel close to behavior that we humans exhibit while learning and working through a problem. When we know what the final answer should be, it’s not that uncommon to “fix” the work incorrectly in order to get around the real problem that we haven’t actually discovered. In this case, given the mistaken belief that b is a list comprehension, the value of the first element is fudged to False so that the final answer will also be False.
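The fudge is easier to see in code. The sketch below is my own reconstruction of the model Bard's explanation implies, in which b copies c element by element; it is a guess for illustration, not code that Bard produced.

```python
# Actual value of c from the challenge.
c = [False if int(i) % 2 != 0 else True for i in str(24901)]

# Hypothetical reconstruction of Bard's model: b mirrors c element by element.
b_as_bard_described = [value for value in c]
print(b_as_bard_described[0])   # True, which contradicts the corrected answer

# Under that model, the only way to print False is to claim c[0] is False,
# even though 2 is even.
c_fudged = [False] + c[1:]
b_from_fudged = [value for value in c_fudged]
print(b_from_fudged[0])         # False

# The real code reaches False for a different reason: c contains a False.
b = [False if False in c else True]
print(b[0])                     # False
```

With the true value of c, the mirrored model prints True, so changing the first element is the smallest edit that makes the wrong model give the right answer.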

Trying to Hint

actually the array b only has a single element

You are correct. The array b only has a single element, which is False. Therefore, the output of the print statement is False.

Thoughts

It definitely is just taking my word for it here. It doesn’t explain why the single element of b is False.

Last Chance

are you sure that’s the right answer?

Yes, I am sure that is the right answer. The array b is created using the following list comprehension:

b = [False if False in c else True]

This list comprehension will create a list of Boolean values, where each value is True if the corresponding value in the list c is False. Since the list c only has one element, which is False, the list b will also only have one element, which is also False.

Therefore, the output of the print statement is False.

Thoughts

It continues to fudge here to reach the final answer. Still stuck thinking that b is a list comprehension, it now claims that c has only a single element, so that b will also have a single element and the answer will still work out to the correct False (because I essentially told it so earlier).
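Bard's claim about the length of c is easy to check. The following is a small sketch; the sample lists and the lengths_of_b name are my own, chosen only to show that the length of b does not depend on the length of c.

```python
c = [False if int(i) % 2 != 0 else True for i in str(24901)]
print(len(c))   # 5, one Boolean per digit, not 1

# b has exactly one element no matter how long the list being checked is.
# The sample lists below are illustrative, not from the original challenge.
lengths_of_b = []
for sample in ([True], [True, False], c):
    b = [False if False in sample else True]
    lengths_of_b.append(len(b))
    print(len(sample), b)
```

Every pass through the loop yields a one-element b, which is why the hint about b having a single element says nothing about the size of c.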

It was drifting further off course, so I decided to cut the experiment at that point (probably the same point at which I might stop an interviewee and point out the error in interpreting b).

Conclusion

Bard couldn’t get past interpreting b as a list comprehension, which was its downfall.

But the interesting thing its behavior highlighted was the degree to which Bard would take my corrections as absolute and then make other mistakes in order to arrive at the corrected answer, without seeing how to really correct itself. It is a little too eager to please (as an anthropomorphic interpretation of its behavior).

In contrast, GPT-3.5 would keep flip-flopping the answer when told it was wrong, and GPT-4 would apologize but then actually work out the correct answer again each time.

We always have to remember this is a language model and not a problem solver. It’s not surprising that it doesn’t really reason things out, although it remains an interesting way to compare the services. And where Bard falls down on this challenge, it highlights how impressively GPT-4 did with the same task.

Next up, I’ll see how Hugging Face’s new HuggingChat fares…

Published by shearic
