Helping computers learn human language
Researchers in Washington have developed a new method for helping computers learn human language naturally.
For more than 50 years, linguists and computer scientists have tried to get computers to understand human language by programming semantics as software, researchers said. Now, a University of Texas at Austin linguistics researcher, Katrin Erk, is using supercomputers to develop a new method for helping computers learn natural language.
Instead of hard-coding human logic or deciphering dictionaries to try to teach computers language, Erk decided to try a different tactic: feed computers a vast body of texts (which are a reflection of human knowledge) and use the implicit connections between the words to create a map of relationships. "An intuition for me was that you could visualise the different meanings of a word as points in space," said Erk, a professor of linguistics who is conducting her research at the Texas Advanced Computing Center. To create a model that can accurately recreate the intuitive ability to distinguish word meaning requires a lot of text and a lot of analytical horsepower, researchers said.
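The idea of mapping words into a space of relationships can be sketched with a toy distributional model. The following is a minimal illustration, not Erk's actual pipeline: it counts, for each word in a small hypothetical corpus, which other words appear within a fixed window around it, producing a sparse context vector per word.

```python
from collections import Counter, defaultdict

def cooccurrence_vectors(sentences, window=2):
    """Map each word to a sparse vector of context-word counts."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for i, word in enumerate(words):
            lo, hi = max(0, i - window), min(len(words), i + window + 1)
            for j in range(lo, hi):
                if j != i:  # skip the word itself
                    vectors[word][words[j]] += 1
    return vectors

# Hypothetical three-sentence corpus; real models ingest 100 million+ words.
corpus = [
    "the bank approved the loan",
    "the bank raised interest rates",
    "the river bank was muddy",
]
vecs = cooccurrence_vectors(corpus)
```

In a large corpus the context counts for "bank" would split into clusters (finance words versus river words), which is what lets the different senses of a word show up as different regions of the space.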
"The lower end for this kind of research is a text collection of 100 million words." Erk initially conducted her research on desktop computers, but then began using parallel computing systems. Access to a special Hadoop-optimised subsystem allowed Erk and her collaborators to expand the scope of their research. Hadoop is a software architecture well suited to text analysis and the data mining of unstructured data, and one that can take advantage of large computer clusters, researchers said.

"We use a gigantic 10,000-dimensional space with all these different points for each word to predict paraphrases," Erk said.
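Predicting paraphrases from such a space amounts to finding the words whose vectors lie closest to a target word's vector. A common measure for this is cosine similarity; the sketch below uses made-up three-dimensional context vectors in place of the 10,000-dimensional ones described above.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v.get(k, 0) for k in u)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0

def nearest(word, vectors, k=1):
    """Return the k words whose vectors are most similar to `word`'s."""
    target = vectors[word]
    scored = [(cosine(target, v), w) for w, v in vectors.items() if w != word]
    return [w for _, w in sorted(scored, reverse=True)[:k]]

# Hypothetical context-count vectors; real ones have ~10,000 dimensions.
vectors = {
    "coast":  {"water": 5, "sand": 3, "ship": 2},
    "shore":  {"water": 4, "sand": 4, "ship": 1},
    "forest": {"tree": 6, "leaf": 3, "water": 1},
}
print(nearest("coast", vectors))
```

Here "shore" ranks as the closest neighbour of "coast" because the two share most of their context words, while "forest" does not, which is the intuition behind paraphrase prediction in a word-vector space.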