Can Anybody Understand This?

Discussion in 'Black Hat SEO' started by Absurtuk, Jun 25, 2014.

  1. Absurtuk

    Absurtuk Regular Member

    Joined:
    Mar 1, 2013
    Messages:
    239
    Likes Received:
    31
    Hey guys,
    These days I'm reading the original Google paper by Sergey Brin and Larry Page.
    I don't get the Ranking part, which is:


    "Google maintains much more information about web documents than typical search engines. Every hitlist includes position, font, and capitalization information. Additionally, we factor in hits from anchor text and the PageRank of the document. Combining all of this information into a rank is difficult. We designed our ranking function so that no particular factor can have too much influence. First, consider the simplest case -- a single word query. In order to rank a document with a single word query, Google looks at that document's hit list for that word. Google considers each hit to be one of several different types (title, anchor, URL, plain text large font, plain text small font, ...), each of which has its own type-weight. The type-weights make up a vector indexed by type. Google counts the number of hits of each type in the hit list. Then every count is converted into a count-weight. Count-weights increase linearly with counts at first but quickly taper off so that more than a certain count will not help. We take the dot product of the vector of count-weights with the vector of type-weights to compute an IR score for the document. Finally, the IR score is combined with PageRank to give a final rank to the document.
    For a multi-word search, the situation is more complicated. Now multiple hit lists must be scanned through at once so that hits occurring close together in a document are weighted higher than hits occurring far apart. The hits from the multiple hit lists are matched up so that nearby hits are matched together. For every matched set of hits, a proximity is computed. The proximity is based on how far apart the hits are in the document (or anchor) but is classified into 10 different value "bins" ranging from a phrase match to "not even close". Counts are computed not only for every type of hit but for every type and proximity. Every type and proximity pair has a type-prox-weight. The counts are converted into count-weights and we take the dot product of the count-weights and the type-prox-weights to compute an IR score. All of these numbers and matrices can all be displayed with the search results using a special debug mode. These displays have been very helpful in developing the ranking system.
    "
     
  2. Techxan

    Techxan Elite Member

    Joined:
    Dec 7, 2011
    Messages:
    3,093
    Likes Received:
    3,585
    Occupation:
    Local SEOist
    Location:
    TEXAS (you have to yell, it's the law.)
    When you can't impress them with brilliance, baffle them with bullshit.
     
  3. Trepanated

    Trepanated Supreme Member

    Joined:
    Sep 18, 2010
    Messages:
    1,395
    Likes Received:
    5,324
    They use multiple factors to determine ranking.
    Each factor gets its own score.
    The score for each factor is calculated differently.
    Each factor can also influence the score of other factors.

    I only read the text once though, so I may have missed something.
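
    If it helps, here is a rough Python sketch of what the quoted passage describes. Only the overall shape comes from the paper: count hits per type, turn the counts into capped count-weights, take a dot product with the type-weights, and bin proximity into 10 values for multi-word queries. The weight values, the cap of 8, the distance-to-bin mapping, the way a type-prox-weight is built, and the blend with PageRank are all made up for illustration, since the paper does not publish them.

```python
from collections import Counter

# Hypothetical type-weights; the real values are not given in the paper.
TYPE_WEIGHTS = {
    "title": 10.0,
    "anchor": 8.0,
    "url": 6.0,
    "plain_large": 3.0,
    "plain_small": 1.0,
}
COUNT_CAP = 8   # assumed cutoff: hits beyond this count do not help
NUM_BINS = 10   # the paper says proximity falls into 10 bins


def count_weight(count, cap=COUNT_CAP):
    """Grows linearly with the count, then stays flat once the cap is hit."""
    return min(count, cap)


def ir_score(hit_types):
    """Single-word query: dot product of count-weights and type-weights."""
    counts = Counter(hit_types)
    return sum(w * count_weight(counts[t]) for t, w in TYPE_WEIGHTS.items())


def proximity_bin(distance):
    """Assumed mapping: bin 0 = phrase match (adjacent words), bin 9 = 'not even close'."""
    return min(max(distance, 1) - 1, NUM_BINS - 1)


def type_prox_weight(hit_type, prox_bin):
    """Assumed form: the type-weight scaled down as the hits get farther apart."""
    return TYPE_WEIGHTS.get(hit_type, 0.0) * (NUM_BINS - prox_bin) / NUM_BINS


def multi_word_ir_score(matched_hits):
    """Multi-word query: matched_hits is a list of (hit_type, word_distance) pairs."""
    counts = Counter((t, proximity_bin(d)) for t, d in matched_hits)
    return sum(type_prox_weight(t, b) * count_weight(c) for (t, b), c in counts.items())


def final_rank(ir, pagerank, alpha=0.5):
    """Assumed blend; the paper only says the IR score is 'combined with PageRank'."""
    return alpha * ir + (1 - alpha) * pagerank


if __name__ == "__main__":
    hits = ["title"] + ["anchor"] * 2 + ["plain_small"] * 20
    print(ir_score(hits))                    # 34.0: the 20 body hits count as 8
    print(ir_score(["plain_small"] * 1000))  # 8.0: stuffing the word does not help
```

    The capped count-weight is the bit that matters for us: after a certain number of hits of one type, repeating the keyword adds nothing, which is how they keep any one factor from having too much influence.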
     
  4. keywordspot

    keywordspot Jr. VIP Premium Member

    Joined:
    Dec 17, 2013
    Messages:
    4,284
    Likes Received:
    1,475
    Gender:
    Female
    Occupation:
    visual content
    Location:
    At Hill Station