“AI” Chatbots as Language Conversation Partners

Large Language Models (LLMs) such as ChatGPT or other “AI” chatbots can produce coherent and grammatical text, but does this make them good proxy conversational partners for students in language courses? I recently led a roundtable discussion on some of the problems and possibilities my elementary French students and I experienced when using ChatGPT to practice written conversational French. Below are some (anonymized and paraphrased) chat logs from the chatbot conversations, as well as student commentary.

My elementary French students had the option to complete certain “partner chat” activities using ChatGPT. They prompted ChatGPT to “write like a first-year French student”. Hilarity ensued:

| Prompt | ChatGPT Reply |
| --- | --- |
| (Discussing food, prompted to write like a first-year French student) | “I do not often go out to eat, since I am a first-year student and try to maintain my budget.” |
| | “Eh, being a first-year student, I don’t think about calories when I go out to eat.” |
| | “For my part, as a first-year student, I don’t have a lot of experience preparing my own food.” |
| “Do you like to play music?” | “No. I am learning French and I am here to help you with the language. Do you have any questions about French?” |
| “Do you dance well, or poorly?” | “As an AI language model, I do not have the physical capacity to dance.” |
| “Do you like sports?” | “As an AI language model, I do not have preferences or the capacity to like sports.” |

Student comments:

  • The AI kept writing multiple-paragraph responses even when prompted not to;
  • It gave lengthy and extended responses which may be slightly unrealistic, especially for first-year students;
  • It would not ask questions unless I prompted it to, after which the conversation flowed more naturally;
  • It was boring knowing I was not interacting with or getting to know a real human.

My own experiences:

  • I incorrectly named a painting, and ChatGPT asserted that it was real and made up an intention by the artist;
  • When I prompted ChatGPT to write briefer replies, it sometimes reverted to lengthier responses after a few messages;
  • I asked ChatGPT to rewrite its response “without using the conditional mood”. It sent the exact same response (containing verbs in the conditional mood), and then asserted that it had removed all of the offending verbs.

Take-aways:

LLMs can:
* Produce grammatical, coherent text almost instantaneously

LLMs cannot:
* Apply metalinguistic constraints from prompts (e.g., “use the past tense”);
* Apply “competence level” constraints (e.g., “write at the novice level”);
* Maintain a role-play role completely divorced from their protocol to be a helpful chatbot; or
* Maintain informational consistency in the conversation (i.e., they lie and don’t “remember” what they say).

In short: the generalized “AI chatbots” that use predictive text generation, as trained on a broad swath of language data, do not make for good conversation partners. I have no doubt that an LLM could be trained to recognize certain metalinguistic parameters, and companies like Duolingo may be doing just that.

It should be noted, however, that this sort of LLM training has so far involved the exploitation of underpaid human workers in the Global South.

And what would be the potential benefits of the hypothetical end goal: a robot language tutor that always uses i+1 language (that is, input pitched just beyond the learner’s current level)? Language is a social tool, so students using the conversation-tutor bot would not really be using language socially. The best bot could still only provide decontextualized conversational practice, with no linguistic goal for the learner other than the practice itself. Perhaps useful to an extent, but socially isolating, as its operation presupposes that language development happens entirely internally.
