Advances in computer storage have created collections of data so huge that researchers often have trouble uncovering critical patterns in the connections among individual items, making it difficult for them to fully realize the power of computing as a research tool. Now, computer scientists at Princeton University have developed a technique that offers a solution to this data overload. Using a mathematical method that calculates the likelihood of a pattern repeating throughout a subset of data, the researchers have been able to dramatically cut the time needed to find patterns in large collections of information such as social networks.
Image: The researchers identified groups of patents related to an initial patent for “porous products.” The size of a dot in the illustration reflects the impact of the patent on multiple product groups. (Illustration by Prem Gopalan, Department of Computer Science)