
Robots.txt Manipulation

Discussion in 'Black Hat SEO' started by Steezy, Mar 24, 2012.

  1. Steezy

    Steezy Newbie

    Joined:
    Mar 9, 2012
    Messages:
    16
    Likes Received:
    1
    Not sure if this was the best place to post this, but I couldn't find a more fitting section since it's a blackhat technique that I'm looking for.

    First some background:
    I found an online store offering an attractive affiliate program with good margins, in a space I was familiar with. Best of all, they offered to set up a store for us so that we could launch immediately. This meant I didn't have to spend weeks configuring and designing a Magento store, or getting the product feed to work with our site, so it was very attractive to me. We (my business partner and I) decided to go with this option even though it meant a smaller margin. We weren't going to have to host it, design it, or manage inventory. Great deal, right?

    The problem:
    Two days after the site was set up and we had started doing some off-page SEO for it, I set up Webmaster Tools and noticed an error at the top saying my pages aren't being indexed because of robots.txt. I looked into it, and it turns out they have configured robots.txt to disallow all robots, to prevent duplicate content issues with the search engines that might hurt their own rankings. (They confirmed this when I contacted them, saying they don't want duplicate content issues.)


    So here I am, having spent a couple weeks dealing with these guys, and now I find out that the single most important thing that was going to determine our success is not available to me? As you can imagine, I'm a little ticked off. Instead of launching 2 days ago, the last 2 weeks have been wasted (time I could have spent setting up a Magento store and dropshipping their product).

    Here's the question: does anyone know a way I can make Google ignore the robots.txt file? Is there some DNS magic I can use to cloak url/robots.txt? I don't have FTP access, of course, and cannot edit the .htaccess file. Any ideas?



    Also, to note: I'm not trying to take search engine business/sales away from these guys. I just feel a little cheated at the moment and would like to have something up while I work on getting my own cart/store configured to dropship their product.
     
  2. Union

    Union Power Member

    Joined:
    Sep 24, 2011
    Messages:
    531
    Likes Received:
    210
    Location:
    USA
    They disallow robots because of Google. If Google sees affiliate links/pages, say goodbye to the first page...
    But nothing to worry about: search Google for some code (usually JavaScript, but you can find others too) that will hide your content. Then you can allow robots to look wherever they want, because they won't be able to see your affiliate links.
     
  3. Steezy

    Steezy Newbie

    Joined:
    Mar 9, 2012
    Messages:
    16
    Likes Received:
    1
    Thanks for the tip, but the issue isn't that I'm trying to hide the fact that I'm working with an affiliate to provide the products/website, but that the affiliate is forcing me to use a specific robots.txt (they are hosting the online store).

    I also don't have the ability to edit much page content. I can edit text on maybe 6 pages (home, about, contact, etc.), but the text is stripped of JavaScript/HTML.


    I just need a way for Google to stop looking at url/robots.txt. Is there something I can do with DNS to make that happen? Or something from Webmaster Tools?
     
  4. Fathom

    Fathom Power Member

    Joined:
    Jul 1, 2011
    Messages:
    518
    Likes Received:
    282
    Location:
    Hertfordshire
    Anyone that sells/gives you an online store and blocks robots in robots.txt is an idiot.

    Duplicate content does not penalize the site with the original content. If I were you I would move on, just my opinion.

    All the best.
     
  5. grav6

    grav6 Junior Member

    Joined:
    Jan 30, 2012
    Messages:
    169
    Likes Received:
    54
    Location:
    England
    If you can't get to the robots.txt then there's nothing you can do, unless you clone the site elsewhere and for purchases link back to the site they're making you use (or something along those lines).
     
  6. Steezy

    Steezy Newbie

    Joined:
    Mar 9, 2012
    Messages:
    16
    Likes Received:
    1

    Agreed, Fathom, they're being ignorant about it. It's a shame too, because I could have made us both a good chunk of cash : /

    I'd move on, but I still think it's a good niche and they're a solid supplier.

    grav6:
    Guess I'll just have to hit the grindstone and get my own store sooner.
     
  7. Diplomat

    Diplomat Jr. VIP Premium Member

    Joined:
    Oct 25, 2011
    Messages:
    872
    Likes Received:
    409
    Occupation:
    CEO
    You can use .htaccess, so if somebody requests http://yoururl.com/robots.txt they are actually served robots2.txt or something like that.

    Code:
    RewriteEngine On
    RewriteRule ^robots\.txt$ /robots2.txt [L]
    I haven't tested it, but it should work.
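
A quick sketch of why it matters whether the dot in a pattern like the one above is escaped. It uses Python's `re` module as a stand-in for mod_rewrite's regex engine (an assumption: the two treat this simple pattern the same way), and the sample request paths are made up for illustration.

```python
import re

# Unescaped form: the "." matches any single character.
loose = re.compile(r"^robots.txt$")
# Escaped form: the "\." matches only a literal dot.
strict = re.compile(r"^robots\.txt$")

# Hypothetical request paths (per-directory context, so no leading slash).
paths = ["robots.txt", "robotsXtxt", "robots.txt.bak"]


def matches(pattern, path):
    """Return True if the anchored pattern matches the path."""
    return pattern.search(path) is not None


loose_hits = [p for p in paths if matches(loose, p)]
strict_hits = [p for p in paths if matches(strict, p)]

print("unescaped pattern matches:", loose_hits)
print("escaped pattern matches:  ", strict_hits)
```

The unescaped pattern would also rewrite an unrelated path such as `robotsXtxt`, which is harmless here but easy to avoid.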
     
  8. Steezy

    Steezy Newbie

    Joined:
    Mar 9, 2012
    Messages:
    16
    Likes Received:
    1
    (Had to edit your quote since it won't let me post URLs yet.)


    Thanks, but I don't have access to the files on the server (they host it and don't allow FTP access, or any access at all, to the files), so I can't edit the .htaccess file.
     
  9. caspka

    caspka Registered Member

    Joined:
    Oct 13, 2011
    Messages:
    59
    Likes Received:
    19
    Use the HTML meta robots tag. Not sure, but it may override the robots.txt file.
     
  10. biteablegravy

    biteablegravy Registered Member

    Joined:
    Oct 13, 2010
    Messages:
    86
    Likes Received:
    45
    The default meta is index,follow. Typing that out on the page won't change or override robots.txt, since a crawler blocked by robots.txt never fetches the page to read the tag.
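
A minimal sketch of the point above, using Python's standard `urllib.robotparser`. The robots.txt contents and the example.com URL are assumptions for illustration, since the store's actual file isn't shown in the thread. A well-behaved crawler checks robots.txt before requesting a page, so a disallowed page is never fetched and any meta tag on it goes unread.

```python
from urllib.robotparser import RobotFileParser

# Assumed contents of the host's robots.txt: block every crawler everywhere.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# For comparison: an empty Disallow rule blocks nothing.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical product page on the hosted store.
page = "http://example.com/products/widget"

blocked_result = is_allowed(BLOCK_ALL, "Googlebot", page)
open_result = is_allowed(ALLOW_ALL, "Googlebot", page)

print("block-all robots.txt allows Googlebot:", blocked_result)
print("allow-all robots.txt allows Googlebot:", open_result)
```

With the block-all file the check fails before any page request is made, which is why nothing placed in the page itself can undo it.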
     
  11. Steezy

    Steezy Newbie

    Joined:
    Mar 9, 2012
    Messages:
    16
    Likes Received:
    1

    : / I was giving it a shot too, but halfway through I remembered that meta tags are only interpreted in the page head anyway. With some clever data entry I was able to get the meta tag to show up on every page, just not in the head. Guess it was to no avail, considering biteablegravy's point.
     
  12. biteablegravy

    biteablegravy Registered Member

    Joined:
    Oct 13, 2010
    Messages:
    86
    Likes Received:
    45
    Sorry buddy, live and learn =)

    Those speedbumps make success that much nicer.
     