Researchers ‘Text Mine’ The New York Times, Demonstrating Ease Of New Technology

“Performing what a team of dedicated and bleary-eyed newspaper librarians would need months to do, scientists at UC Irvine have used an up-and-coming technology to complete in hours a complex topic analysis of 330,000 stories published primarily by The New York Times.

The demonstration is significant because it is one of the earliest showing that an extremely efficient, yet very complicated, technology called text mining is on the brink of becoming a tool useful to more than highly trained computer programmers and homeland security experts.

‘We have shown in a very practical way how a new text mining technique makes understanding huge volumes of text quicker and easier,’ said David Newman, a computer scientist in the Donald Bren School of Information and Computer Sciences at UCI. ‘To put it simply, text mining has made an evolutionary jump. In just a few short years, it could become a common and useful tool for everyone from medical doctors to advertisers; publishers to politicians.’

Text mining allows a computer to extract useful information from unstructured text. Until recently, text mining required a great deal of preparation before documents could be analyzed in a meaningful way.” (ScienceDaily)