A couple of weeks ago at the public library in Maplewood, tech consultants May Yang-Her and her husband Dao Her were teaching digital marketing to Hmong business owners when May asked how many had used AI to translate something.
Nearly everyone had, and the class came alive with the business owners describing moments when ChatGPT and other AI applications accurately handled the Hmong language.
“It has come a long way,” May said. “The word choices they use are fascinating.”
“Sometimes I think ChatGPT knows more Hmong than I do,” Dao said.
The absorption and manipulation of language is a key element of artificial intelligence. Until recently, the large language models, or LLMs, at the vanguard of AI systems focused on English and largely ignored less widely used languages.
That’s particularly challenging for languages like Hmong, which was purely an oral language until the 1960s and which many native speakers still know only in its spoken form.
“We want these systems to be able to know our language, for our elders to use, and for future generations to still be able to learn our language,” Mai Lee Chang, an AI and robotics researcher, said during a panel on AI at the Hmong National Development conference in Minneapolis in April.

The imbalance of language and cultural representation is a well-known problem in technology. English is still the dominant language on more than half of all websites, even though only 5% of the world’s nearly 8 billion people speak it as their first language. While the imbalance is more pronounced in AI at the moment, it is likely to shrink.