“Searching for sudden “bursts” in the usage of particular words could be used to rapidly identify new trends and sort information more efficiently, says a US computer scientist.
Jon Kleinberg, at Cornell University in New York, has developed computer algorithms that identify bursts of word use in documents.
While other popular search techniques simply count the number of words or phrases in documents, Kleinberg’s approach also takes into account the rate at which the word usage increases.
Kleinberg suggests that the method could be applied to weblogs to track new social trends. For example, identifying word bursts in the hundreds of thousands of personal diaries now on the web could help advertisers quickly spot an emerging craze.” New Scientist [via bOing bOing]
The scientists applied the algorithm to State of the Union addresses and, lo and behold, saw evidence of the emergence of the Depression, the atomic age, the Communist ‘menace’, etc. At first blush, my response was, “How different is this from the word frequency analysis of Dubya’s 2003 SotU I did in my weblog last month?” Next question: “How different in import is this analysis from, for example, Wired magazine’s ‘wired, tired, expired’ feature?” The authors would respond that the significant measure is not so much the frequencies themselves as their first derivative, the rate of growth in a word’s frequency. But is it really a boon? Is this going to identify any trends before we already notice them? I mean, really?
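For what it's worth, the distinction between a word's frequency and its rate of growth is easy to illustrate. The Python sketch below is only a naive first-difference over a sequence of documents, not Kleinberg's algorithm; the function names and the three-line toy "speeches" are hypothetical, made up purely for illustration.

```python
import re
from collections import Counter


def relative_frequency(text, word):
    """Share of tokens in `text` that equal `word` (case-insensitive)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return Counter(tokens)[word] / len(tokens)


def frequency_changes(documents, word):
    """Return per-document frequencies and the change between neighbours.

    The change list is a crude stand-in for a "first derivative":
    a plain difference between consecutive documents, nothing more.
    """
    freqs = [relative_frequency(doc, word) for doc in documents]
    changes = [later - earlier for earlier, later in zip(freqs, freqs[1:])]
    return freqs, changes


# Hypothetical toy corpus, in chronological order (not real speeches).
speeches = [
    "the economy is strong and the nation is at peace",
    "the economy is strong and the atomic age begins",
    "atomic power and atomic weapons define the atomic age",
]

freqs, changes = frequency_changes(speeches, "atomic")
print("frequencies:", freqs)
print("changes:    ", changes)
```

A plain frequency count ranks a word by the first list; a growth-based view ranks it by the second, which would flag a word that is still rare but climbing quickly. Whether that flags anything before a human reader would have noticed it anyway is exactly the open question.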
