Open Ranking of the World Wide Web Is Available

Discussion in 'BlackHat Lounge' started by JustUs, Feb 12, 2014.

  JustUs

    JustUs Power Member

    May 6, 2012
    The Data and Web Science Group of the University of Mannheim and the Laboratory for Web Algorithmics of the Università degli Studi di Milano have put together the first entirely open ranking of more than 100 million sites of the Web. The ranking is based on easily explainable centrality measures applied to a host graph. All of the data and software used are publicly available. The software is available in C++ and Java. There are also packages available in Matlab and Jython.

    The ranking is built from the Common Crawl data. Hosts are ranked using harmonic centrality, with raw indegree centrality, Katz's index, and PageRank provided for comparison. More information about the web graph is available in a paper (PDF) that will be presented at the World Wide Web Conference in April.
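
    For anyone wondering what "harmonic centrality" actually computes, here is a minimal sketch in Python. It is not the groups' software, and the function names and the four-host toy graph are made up for illustration. The assumption is the usual definition: a host's score is the sum of 1/d over every other host that can reach it, where d is the shortest link distance to it. Raw indegree is included for comparison.

```python
from collections import deque

def harmonic_centrality(graph):
    """graph: dict mapping each host to the list of hosts it links to."""
    # Collect every host, including ones that only appear as link targets.
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)

    # Reverse the edges so a BFS from x walks links backwards,
    # which yields the distance from every other host *to* x.
    reverse = {n: [] for n in nodes}
    for src, targets in graph.items():
        for dst in targets:
            reverse[dst].append(src)

    scores = {}
    for x in nodes:
        dist = {x: 0}
        queue = deque([x])
        total = 0.0
        while queue:
            u = queue.popleft()
            for v in reverse[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    total += 1.0 / dist[v]  # unreachable hosts add nothing
                    queue.append(v)
        scores[x] = total
    return scores

def indegree(graph):
    """Number of distinct hosts linking to each host."""
    counts = {n: 0 for n in graph}
    for src, targets in graph.items():
        for dst in set(targets):
            counts[dst] = counts.get(dst, 0) + 1
    return counts

# Hypothetical toy host graph: a links to b and c, and so on.
toy = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}

if __name__ == "__main__":
    for host, score in sorted(harmonic_centrality(toy).items()):
        print(host, round(score, 3), indegree(toy)[host])
```

    Note that this does one breadth-first search per host, which is fine for a toy graph but would not scale to 100 million hosts; a graph of that size would need an approximation method or a lot of hardware.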

    It is believed that this is the largest source of hyperlink data outside of Microsoft, Google, and Yahoo. In total, the information of the Web Data Commons encompasses 3.5 billion webpages and 128 billion hyperlinks. The entire data set is larger than 330 gigabytes and would cost about $45 to download from the Amazon S3 servers where it is hosted.

    If your mathematics is not up to snuff, or you are of the school that Computer Science and Computer Programming are not applied mathematics, then this paper, program set, and data are probably not for you. If you have the ability and the determination, on the other hand, then this information is a gold mine for strong link building and improving SERP position.

  Pornguy

    Pornguy Regular Member

    Nov 29, 2012
    I would like to see where this goes.