
How to find backlinks that are blocking spiders?

Discussion in 'Black Hat SEO' started by darkshadow1, Sep 16, 2014.

  1. darkshadow1

    darkshadow1 Registered Member

    Joined:
    Sep 29, 2012
    Messages:
    83
    Likes Received:
    29
    Hi,

    I have two quick questions. I'll also take a PM if it's a touchy subject.

    Essentially, I believe that a lot of my competitors are blocking Ahrefs, Majestic, etc.
    From my understanding, there are two ways of doing this:

    1) in the robots.txt file
    2) in .htaccess

    I don't see anything in their robots.txt file that would indicate this, so I believe it is done via .htaccess.


    My questions are as follows.

    1) How can I verify if they are blocking spiders?
    I mean, ranking in 1st or 2nd position without any deep links is a pretty good indicator, but I would still like to confirm it technically.

    2) What approach to take to find some of those backlinks?


    Thanks a lot in advance!
     
  2. SEO Power

    SEO Power Elite Member

    Joined:
    Jul 14, 2014
    Messages:
    2,641
    Likes Received:
    681
    Occupation:
    Self employed
    Location:
    Houston, TX
    You didn't see anything in their robots.txt files because they do not place the code in the robots.txt files of their money sites. They do it in the robots.txt files of the sites providing the backlinks to their money sites, usually PBN sites. So, unfortunately, you can't technically verify that.

    The only way to find out what links they are hiding is using different link research tools with the hope that they failed to block at least one link research tool. You can also use the link:domain.com operator to uncover some of their links since they can't hide their backlinks from Google.
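
    For a site you already suspect of linking to a competitor, one rough check is to read its robots.txt and see which link crawlers it turns away. Below is a minimal sketch using Python's standard urllib.robotparser. The crawler tokens (AhrefsBot for Ahrefs, MJ12bot for Majestic) and the sample robots.txt are assumptions for illustration; check each tool's documentation for its current user-agent token.

```python
from urllib.robotparser import RobotFileParser

# Illustrative tokens only (assumption): AhrefsBot for Ahrefs, MJ12bot for Majestic.
LINK_CRAWLERS = ("AhrefsBot", "MJ12bot")


def blocked_crawlers(robots_txt, agents=LINK_CRAWLERS, path="/"):
    """Return the agents that this robots.txt text disallows from `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, path)]


# Made-up robots.txt of the kind a link-hiding site might serve.
SAMPLE = """\
User-agent: AhrefsBot
Disallow: /

User-agent: MJ12bot
Disallow: /

User-agent: *
Disallow:
"""

print(blocked_crawlers(SAMPLE))
```

    To check a live site, fetch its /robots.txt first and pass the text in. This only detects robots.txt rules; it says nothing about blocks done in .htaccess.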
     
  3. ThopHayt

    ThopHayt Jr. VIP Jr. VIP Premium Member

    Joined:
    Jul 25, 2011
    Messages:
    5,841
    Likes Received:
    1,902
    Blocking robots via robots.txt is voluntary: it only asks robots not to crawl, and it does not prevent anything if a robot ignores it. You could just send your own robots and ignore the directives.
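
    A block done in .htaccess is different, because the server enforces it (typically by refusing requests whose User-Agent matches a crawler). One rough way to tell the two apart is to request the same page with a browser-style User-Agent and with crawler-style ones, then compare the status codes. This is only a sketch under assumptions: the User-Agent strings are illustrative, the helper names are made up, and a site can also block by IP address, which this check would not reveal.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen

# Illustrative User-Agent strings (assumption); real crawler strings change over time.
USER_AGENTS = {
    "browser": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "ahrefs": "Mozilla/5.0 (compatible; AhrefsBot/7.0; +http://ahrefs.com/robot/)",
    "majestic": "Mozilla/5.0 (compatible; MJ12bot/v1.4.8; http://mj12bot.com/)",
}


def http_status(url, user_agent):
    """Return the HTTP status code the server gives this User-Agent."""
    request = Request(url, headers={"User-Agent": user_agent})
    try:
        with urlopen(request, timeout=10) as response:
            return response.status
    except HTTPError as err:
        # 4xx/5xx responses arrive as exceptions; the code is what we want.
        return err.code


def compare_user_agents(url, user_agents=USER_AGENTS, fetch=http_status):
    """Map each label to the status code returned for its User-Agent."""
    return {name: fetch(url, ua) for name, ua in user_agents.items()}


def looks_blocked(statuses, baseline="browser"):
    """List labels that get an error while the baseline request succeeds."""
    if statuses[baseline] >= 400:
        return []  # the page fails for everyone, so nothing can be concluded
    return sorted(name for name, code in statuses.items()
                  if name != baseline and code >= 400)
```

    For example, looks_blocked(compare_user_agents("http://example.com/")) would list the crawler labels that were refused; example.com is a placeholder. The fetch parameter is there so the comparison can be tried against a stub before touching a real site.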
     
  4. tony_d

    tony_d Elite Member

    Joined:
    Jun 22, 2013
    Messages:
    2,580
    Likes Received:
    3,167
    Location:
    1600 Amphitheatre Parkway, Mountain View CA
    If you want to find all the links, even those on sites blocking bots, you'll need to crawl the entire interwebs yourself, and make sure to disregard robots.txt.
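
    Whatever does the crawling, the core step is the same: check each fetched page for links pointing at the target domain. Here is a minimal sketch of that step using only the Python standard library. The class and function names are made up for illustration, and fetching, queueing, and politeness are left out entirely.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class BacklinkFinder(HTMLParser):
    """Collect (href, anchor text) for links pointing at a target domain."""

    def __init__(self, target_domain):
        super().__init__()
        self.target = target_domain.lower()
        self.links = []
        self._href = None   # href of the matching <a> currently open, if any
        self._text = []     # anchor text pieces seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        host = (urlparse(href).hostname or "").lower()
        # Match the domain itself or any subdomain such as www.
        if host == self.target or host.endswith("." + self.target):
            self._href = href
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def find_backlinks(html, target_domain):
    """Return all links in `html` that point at `target_domain`."""
    finder = BacklinkFinder(target_domain)
    finder.feed(html)
    finder.close()
    return finder.links
```

    Calling find_backlinks(page_html, "domain.com") on each crawled page gives the linking URL's hrefs and their anchor text. Relative links are skipped, since they cannot point at another site.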
     
  5. darkshadow1

    darkshadow1 Registered Member

    Joined:
    Sep 29, 2012
    Messages:
    83
    Likes Received:
    29
    Ok, I guess I misunderstood the source of where the links are being blocked, but it makes perfect sense now, thanks for the correction.

    Yeah, I tried that, but I'm not having much success with it.
     
  6. darkshadow1

    darkshadow1 Registered Member

    Joined:
    Sep 29, 2012
    Messages:
    83
    Likes Received:
    29
    Aha, ok will look into that. Sounds promising, Thanks!
     
  7. darkshadow1

    darkshadow1 Registered Member

    Joined:
    Sep 29, 2012
    Messages:
    83
    Likes Received:
    29
    Starting to get the picture, will put some real thought into it. Thanks!
     
  8. darkshadow1

    darkshadow1 Registered Member

    Joined:
    Sep 29, 2012
    Messages:
    83
    Likes Received:
    29
    Thanks for the replies. I have a basic understanding of how to approach this. If any of you have gone down this route, I would like to ask whether you are running it in your own environment or a hosted environment.
    Also, using any proxies for this?
    Any issues with ISP due to all the traffic?
     
  9. SEO Power

    SEO Power Elite Member

    Joined:
    Jul 14, 2014
    Messages:
    2,641
    Likes Received:
    681
    Occupation:
    Self employed
    Location:
    Houston, TX
    The truth is that there's little you can do to uncover the backlink sources of sites blocking link crawlers. The only way that would really work is hacking the money site, adding it to GWT (Google Webmaster Tools), and verifying it. Then you'd see all its backlinks through the GWT backlinks report. However, that's unethical.
     
  10. boombaby7

    boombaby7 Junior Member

    Joined:
    Aug 26, 2014
    Messages:
    144
    Likes Received:
    28
    That's a lie. From No Hat Digital:

    Sneaky way to uncover the PBN of someone spider blocking: search "domain.com" -site:domain.com. This will help you find any instances of a domain name being used as the anchor text (quite a common way to link) on PBN sites. For example, to find mentions of our site, search this:

    "nohatdigital.com" -site:nohatdigital.com
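
    The search pattern above (the domain in double quotes, followed by -site: for the same domain) is easy to generate for a list of competitors. A small sketch, assuming a Google-style search URL as the base; the function names are made up for illustration.

```python
from urllib.parse import quote_plus


def mention_query(domain):
    """Build the query: exact-match the domain, exclude the domain's own pages."""
    return f'"{domain}" -site:{domain}'


def search_url(domain, base="https://www.google.com/search?q="):
    """URL-encode the query so it can be pasted into a browser."""
    return base + quote_plus(mention_query(domain))


print(mention_query("nohatdigital.com"))
print(search_url("nohatdigital.com"))
```

    This only prepares the queries; the results still have to be reviewed by hand to see which pages actually link.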
     
  11. boombaby7

    boombaby7 Junior Member

    Joined:
    Aug 26, 2014
    Messages:
    144
    Likes Received:
    28
    Obviously it isn't perfect, but it uncovers some URL anchor-text links that Majestic or Ahrefs may not have picked up.