
I just caught 2 Google IPs sniffing around my sites

Discussion in 'Cloaking and Content Generators' started by anty, Dec 18, 2008.

  1. anty

    anty Newbie

    Joined:
    Dec 8, 2008
    Messages:
    22
    Likes Received:
    0
    Hi,

    I just caught two IPs calling random pages on my cloaked domains. (Always a random page on a domain, then another domain.)

    Both IPs came from the Google IP range (74.125.0.0 - 74.125.255.255) but have no reverse-DNS entry and are therefore, afaik, not crawlers. (Specifically, both IPs came from this range: 74.125.75.*)
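For anyone going through their own logs, here is a minimal Python sketch of the range check described above. It assumes the quoted range 74.125.0.0 - 74.125.255.255 can be written as the CIDR block 74.125.0.0/16; the function name `in_quoted_range` is made up for illustration.

```python
import ipaddress

# Assumption: the range quoted in the post, expressed as one CIDR block.
QUOTED_RANGE = ipaddress.ip_network("74.125.0.0/16")


def in_quoted_range(ip: str) -> bool:
    """Return True if the address falls inside 74.125.0.0/16."""
    return ipaddress.ip_address(ip) in QUOTED_RANGE


if __name__ == "__main__":
    # Addresses taken from the whois links quoted in this thread.
    for candidate in ("74.125.75.123", "216.239.32.0"):
        print(candidate, in_quoted_range(candidate))
```

A range match only says who the block is registered to, not what kind of client sent the request.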

    Their referrers always followed the same schema: "http://www.google.com/search?hl=en&q=example.com" where example.com matched the request URL and their UserAgent was on both IPs: "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.7) Gecko/20060909 Firefox/1.5.0.7"
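The referrer schema above can be recognised in a log file with a few lines of Python. This is only a sketch: the helper name `referrer_matches_schema` is hypothetical, and it assumes the schema is exactly "host www.google.com, path /search, and a single q parameter equal to the requested host".

```python
from urllib.parse import parse_qs, urlparse


def referrer_matches_schema(referrer: str, request_host: str) -> bool:
    """Return True if the referrer is a Google search whose query is the requested host."""
    parsed = urlparse(referrer)
    if parsed.netloc != "www.google.com" or parsed.path != "/search":
        return False
    # parse_qs maps each parameter name to a list of its values.
    terms = parse_qs(parsed.query).get("q", [])
    return len(terms) == 1 and terms[0].lower() == request_host.lower()


if __name__ == "__main__":
    ref = "http://www.google.com/search?hl=en&q=example.com"
    print(referrer_matches_schema(ref, "example.com"))
```

An ordinary keyword search such as q=cheap+flights would not match, which is what sets this pattern apart from normal search traffic.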

    I banned both IPs, because I think those are manual reviewers checking out my sites.

    My question now is: Did you ever experience this and if yes, what did you do? I'm thinking about denying the whole Google-IP-range, except for the crawlers. In my opinion, a timeout for human reviewers is better than a redirect to an affiliate offer.

    Another idea would be to deny all requests with a referrer with the given schema. This would be effective, but not future proof.

    Ideally I would display them a wh site, but this is not gonna happen until I find a way to create wh sites on the fly :)

    What would you do?
     
    Last edited: Dec 18, 2008
  2. a0rta

    a0rta Regular Member

    Joined:
    May 3, 2008
    Messages:
    214
    Likes Received:
    99
    Occupation:
    Travel Agency owner - Recently outsourced my clien
    Location:
    Brazil/Sweden
    *LOL*


    http:// www. google. com/search?hl=en&q=example.com

    Isn't that a search term? I guess you added example.com so as not to show either your domain name or your keyword.

    Don't worry about that.
     
    Last edited: Dec 18, 2008
  3. drkenneth

    drkenneth Executive VIP

    Joined:
    Nov 13, 2008
    Messages:
    285
    Likes Received:
    176
    Occupation:
    Developer/Entrepreneur
    Location:
    USA
    I am sure Google has bots that don't "play by the rules", i.e. they try to impersonate normal web browsers, don't resolve as bots, and give referrers that you would "expect to see" from a person hitting that page. They're not dumb: they know people display different things to their bots, and they try to get around that. I highly doubt people at Google are manually reviewing your links. (Unless maybe you have a HUGE blackhat operation going on.)

    Personally I would bounce all addresses in this IP range to the same page the crawlers go to, as they're most likely crawlers. Perhaps it means the crawlers found your pages to be "potentially suspicious" and so checked them out with unlisted crawlers to determine if it was shown substantially different content. (And hence if you were filtering bots.) In this case showing them both the bot pages would be ideal.
     
  4. justone

    justone Elite Member

    Joined:
    Oct 12, 2008
    Messages:
    1,516
    Likes Received:
    1,037
    Occupation:
    -
    Location:
    Europe
    I just posted the same in another forum here.
    I also had Googlers on my site and wondered why, as I did not use black hat SEO for it.

    Maybe they build link lists from sources like bhw and similar, and review them regularly?
    I guess they can catch a lot of people using forbidden techniques if they look at the sites that are listed in bh seo forums.
     
  5. aмillionaírе

    aмillionaírе Jr. VIP Jr. VIP Premium Member

    Joined:
    Apr 20, 2008
    Messages:
    532
    Likes Received:
    358
    This is def. what's going on... Google has its smarts... my cloaked pages are always delisted when this type of thing occurs... def. something going on manual or automatic.... I'm working on a solution that doesn't hinder the effectiveness of having cloaks.
     
  6. anty

    anty Newbie

    Joined:
    Dec 8, 2008
    Messages:
    22
    Likes Received:
    0
    I don't think showing all Google IPs the real pages is the right decision. This would reveal my content and links to competitors since they could use the translation tool to check my content.
    But a manually created list could do the trick. I would just have to make sure to include only the bots.

    Thanks for the response drkenneth. It seems obvious now, but I didn't think about disguised bots.
     
  7. jake3340

    jake3340 Jr. VIP Jr. VIP Premium Member

    Joined:
    Nov 20, 2008
    Messages:
    1,368
    Likes Received:
    414
    Location:
    Pluto
    There is a way to deny google entering your website. I will let you know how if I remember.
     
  8. drkenneth

    drkenneth Executive VIP

    Joined:
    Nov 13, 2008
    Messages:
    285
    Likes Received:
    176
    Occupation:
    Developer/Entrepreneur
    Location:
    USA
    The problem is that he still wants to get indexed as relevant by Google WHILE directing real people to affiliate programs. Denying Google the ability to look at the site entirely means he'd get deindexed completely, which is definitely not what he wants. So "cloaking", i.e. showing search engines something different from what people see, is what he needs to do, not denying Google entry. (Which can be done w/ robots.txt)
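To illustrate the robots.txt remark, here is a small sketch using Python's standard `urllib.robotparser`. The robots.txt content and the `allowed` helper are made-up examples, not anyone's real setup. Note that robots.txt is advisory: it only keeps out crawlers that choose to honor it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that asks Googlebot to stay out of the whole site.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /
"""


def allowed(user_agent: str, url: str) -> bool:
    """Return True if the example robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(allowed("Googlebot", "http://example.com/page"))
    print(allowed("SomeOtherBot", "http://example.com/page"))
```

With these rules a compliant Googlebot fetches nothing, which is exactly the "deindexed completely" outcome described above.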
     
  9. justone

    justone Elite Member

    Joined:
    Oct 12, 2008
    Messages:
    1,516
    Likes Received:
    1,037
    Occupation:
    -
    Location:
    Europe
    Just keep in mind: there is no way to guarantee you can ever keep something like google out.
    They for sure have got subnets you never heard about all over the world.
     
  10. a0rta

    a0rta Regular Member

    Joined:
    May 3, 2008
    Messages:
    214
    Likes Received:
    99
    Occupation:
    Travel Agency owner - Recently outsourced my clien
    Location:
    Brazil/Sweden
    Don't bother. If you want traffic, let it in no matter what. Google bot or human, who cares?

    If you are doing anything too shady you will most likely get deindexed, and not letting Google in will give the same result.

    Getting banned is not the end of the world. There are tons of ways to drive traffic even if you're not indexed in the Google SERPs, and a lot of sites aren't indexed by Google and are still profitable. Also remember that a lot of sites aren't ranked among the top 10, which is pretty much the same as not being indexed at all, and many of them still do well too.
     
  11. Sweetfunny

    Sweetfunny Jr. VIP Jr. VIP Premium Member

    Joined:
    Jul 13, 2008
    Messages:
    1,747
    Likes Received:
    5,039
    Location:
    ScrapeBox v2.0
    The range is listed to Google:

    Code:
    http://ws.arin.net/whois/?queryinput=74.125.75.123
    I've experienced the same thing, no Googlebot useragent but the IP listed as Google's and the referrer coming in had the same schema as google.com/search?hl=en&q=mypageurl.com

    From what I've gathered, yes, it's a cloaking detection bot, but it's automated and not manual. The reason is, firstly, that the IP is registered to Google and they won't "brand" manual reviewers with a Google IP. Also, I have a lot of evidence to suggest reviewers are now using Chrome, no doubt adapted with special plugins/tools.

    Many people cloak right off the SERPs (SSEC, anyone?) and cloak by useragent, so with the bot coming from a search referral using Firefox it can nail a good percentage of cloaked pages.

    How did your setup handle the requests? Did it deliver the human or spider version?
     
  12. anty

    anty Newbie

    Joined:
    Dec 8, 2008
    Messages:
    22
    Likes Received:
    0
    Just woke up and saw another 2 ips sniffing around. This time without referrer, but one IP was resolving to 123-123-123-123.google.com and one didn't have a reverse dns entry like the ones from yesterday.
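The difference between the two IPs can be checked with a forward-confirmed reverse DNS lookup: resolve the IP to a hostname, check the domain, then resolve that hostname back and see whether the original IP is among the results. Below is a hedged Python sketch. The function names and the suffix list are assumptions for illustration, and the resolvers are passed in as parameters so the logic can be tried without network access.

```python
import socket

# Assumption: hostnames under these domains count as Google-operated.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def looks_like_google_host(hostname: str) -> bool:
    """Check the reverse-DNS name against the assumed suffix list."""
    return hostname.rstrip(".").lower().endswith(GOOGLE_SUFFIXES)


def verify_ip(ip, reverse=socket.gethostbyaddr, forward=socket.gethostbyname_ex) -> bool:
    """Forward-confirmed reverse DNS: IP -> hostname -> IPs must include the original IP."""
    try:
        hostname = reverse(ip)[0]
    except OSError:
        return False  # no reverse DNS entry, like the second IP in the post
    if not looks_like_google_host(hostname):
        return False
    try:
        addresses = forward(hostname)[2]
    except OSError:
        return False
    return ip in addresses
```

An IP with no reverse entry simply fails the first step, so this tells you nothing about who is behind it beyond what whois already says.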

    @SweetFunny: I deliver these IPs my bot-content now. You are right, it doesn't make sense to block them or show them the human version (=redirect). If they are human, they will ban me anyways.

    The IP range you mentioned isn't the only one. They have at least one other namely:
    Code:
    http://ws.arin.net/whois/?queryinput=216.239.32.0
     
    Last edited: Dec 18, 2008
  13. jake3340

    jake3340 Jr. VIP Jr. VIP Premium Member

    Joined:
    Nov 20, 2008
    Messages:
    1,368
    Likes Received:
    414
    Location:
    Pluto
    Ah, I thought he didn't want to get indexed. Hmm, dunno then.
     
  14. Jagged55

    Jagged55 Power Member

    Joined:
    Mar 27, 2008
    Messages:
    747
    Likes Received:
    325
    Occupation:
    Internet Marketing
    Location:
    Canada
    If you are cloaking stuff, I would get used to seeing Google snooping around in there. You can block IPs all you want, but I am sure they are sending in other crawlers that you don't know about (there is no rule that says they have to identify themselves as Googlebot or stay on a specific IP range). And once they are in your site snooping around, the damage is already done.

    Just keep your bot list up to date...that's the best you can do. Cloaked sites don't last forever so make all that you can and move onto the next one.
     
  15. the_demon

    the_demon Jr. Executive VIP

    Joined:
    Nov 23, 2008
    Messages:
    3,177
    Likes Received:
    1,563
    Occupation:
    Search Engine Marketing
    Location:
    The Internet
    Here's a few suggestions:

    1. Encrypt Javascript: If bots want to figure out what you're up to, give 'em hell. Most places don't have the time or resources to be decrypting Javascript.
    2. Using PHP & .htaccess rewrite you can prevent the direct viewing of .css, .js, etc. files. This is good because it makes it much harder for humans and bots to disassemble your website and clone / analyze it.
    3. Use robots.txt and ban search engines from crawling styling files and images. This way they can't do checks to see if you're overlaying keywords and such. For example white on white or whatever color schemes you have going.
     
  16. billyjean65

    billyjean65 Junior Member

    Joined:
    Jul 24, 2008
    Messages:
    197
    Likes Received:
    6
    Google is always looking to bust our chops and it's always a tight race
     
  17. Jagged55

    Jagged55 Power Member

    Joined:
    Mar 27, 2008
    Messages:
    747
    Likes Received:
    325
    Occupation:
    Internet Marketing
    Location:
    Canada
    When you are cloaking, you gotta expect that Google and the other big ones aren't following robots.txt and other things. As one of the other members here mentioned in another cloaking thread, if you are using redirection on your cloaking, there is a theory that Google can detect the quick change in PR values in their toolbar, raising a flag. So as hard as you are working to fool them, they are working to fight you.

    The only thing you can do is keep humans and bots separated the best you can....simple as that. If you get a bot viewing your pages intended for human eyes only, get the IP into the bot file asap.
     
  18. anty

    anty Newbie

    Joined:
    Dec 8, 2008
    Messages:
    22
    Likes Received:
    0
    Just to let you guys know: I experienced an IP ban today. So these IPs definitely were some sort of cloaking/spam detection bots.
     
  19. kandor

    kandor Regular Member

    Joined:
    May 26, 2008
    Messages:
    274
    Likes Received:
    97
    No, these are actual IPs of manual reviewers.

    And if you display to them what you display to the spiders, your site will most probably be banned if you have anything that looks bad to humans.

    I get this kind of checkup a couple of times a day. They click and go through different pages of my cloaked sites, and still after 2 months they have not banned a single page of mine that redirects them to the human visitors' place.


    Kandor

     
  20. sqlbyte

    sqlbyte Junior Member

    Joined:
    Sep 22, 2008
    Messages:
    147
    Likes Received:
    35
    Location:
    Behind You
    Uhm.. an IP ban?
    You mean your site got deleted from the Google index or what?
    Btw, I must say something important for this topic.
    I read somewhere on DP that some random company was looking for people to do some online job. It turns out that the guy who got signed up had to review random sites given by some algorithm and determine whether they were good, bad, spam.. And he was doing it for.. guess what company?
    So I don't think banning Google IPs would help, and I think people are still reviewing sites manually, coz Google hasn't yet developed something that will keep the spam out.