
[METHOD] Harvesting TENS OF THOUSANDS of HIGH PR Auto-Approve Links!

Discussion in 'Black Hat SEO' started by youssef93, Mar 7, 2011.

  1. youssef93

    youssef93 Senior Member

    Joined:
    Sep 14, 2008
    Messages:
    828
    Likes Received:
    1,148
    Occupation:
    Student, Part-time Online Marketer
    Location:
    Egypt
    Alright folks, these are my twists on existing methods that should yield pretty cool results with ScrapeBox. Let's get started.

    Requirements:
    - ScrapeBox, Hrefer or any other harvesting tool that can adapt (this tutorial is geared towards ScrapeBox though)
    - VPS or dedicated server (HIGHLY recommended because we'll be using tons of bandwidth)

    1. We're gonna start by doing a few searches here on the forum and on Google:
    ".edu blog link"
    "auto approve list"
    "PR7 blog link"
    "free PR6 blog link"
    ...etc...etc whatever you can come up with.

    2. Grab the ones that seem most popular. Focus on very rare links that are shared in public, like high PR auto-approve .edus. We want to make sure they're very heavily spammed.

    3. Compile what you find into one list, remove duplicates and low PRs. I keep PR6 and above, in addition to the .govs, .edus, etc. if there are any. (I know there won't be many URLs then)

    4. [OPTIONAL] Now here's a cool twist. Visit a couple of the very best ones you can find, PR7 and up. You'll notice they're spammed to death, of course. Many of them will have thousands of comments spread across 80+ pages, with each page holding just a hundred or so. Use my PHP script (or anything else you like) to generate all those comment page URLs. Instructions inside:
    Code:
    http://www.mediafire.com/?o5fjd2cy4deadhe
    5. Now repeat step 4 for the best blogs you can find that may have many comment pages.
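    The comment-page expansion in steps 4 and 5 can be sketched in a few lines. The OP's actual PHP script is behind the Mediafire link above; this is just an equivalent illustration in Python, and it assumes the common WordPress pagination scheme (".../comment-page-N/#comments"), which not every blog platform uses.

```python
# Sketch of steps 4-5: given a heavily commented post and its page count,
# generate one URL per comment page so the harvester can visit them all.
# Assumes WordPress-style "comment-page-N" pagination.

def comment_page_urls(post_url, num_pages):
    """Return a list of comment-page URLs for a single blog post."""
    base = post_url.rstrip("/")
    return [f"{base}/comment-page-{n}/#comments" for n in range(1, num_pages + 1)]

# Example: a PR7 post with 80 comment pages (URL is a placeholder)
pages = comment_page_urls("http://example.edu/blog/some-post", 80)
for url in pages:
    print(url)
```

Feed the printed list straight into your harvester alongside the original post URLs.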

    6. Compile a file of URLs gathered in step 3, 4 and 5.

    7. Open up the ScrapeBox link extractor plugin and import the list from step 6.

    8. Choose "outbound links" and start scraping the links on each page.

    9. You should end up with tens of thousands of spammer URLs, all gathered from high PR blogs.
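    To make steps 7-9 concrete, here's a rough stdlib-only sketch of what the link extractor's "outbound links" mode does: keep every absolute href that points off the host page. ScrapeBox does this across thousands of pages at once; the HTML and domains below are made-up examples.

```python
# Minimal outbound-link extraction: collect <a href="..."> targets whose
# domain differs from the page's own domain (i.e. the commenters' links).
from html.parser import HTMLParser
from urllib.parse import urlparse

class OutboundLinkParser(HTMLParser):
    def __init__(self, page_domain):
        super().__init__()
        self.page_domain = page_domain
        self.outbound = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href", "")
        domain = urlparse(href).netloc
        # keep only absolute links pointing away from the host blog
        if domain and domain != self.page_domain:
            self.outbound.append(href)

# Sample comment-page HTML (placeholder domains)
html = """<a href="/about">About</a>
<a href="http://spammer-site1.com/page">commenter 1</a>
<a href="http://highpr-blog.edu/other-post">internal link</a>"""

parser = OutboundLinkParser("highpr-blog.edu")
parser.feed(html)
print(parser.outbound)
```

Only the commenter's external URL survives; relative and same-domain links are dropped, which is exactly the list of spammer URLs step 9 describes.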

    10. These are your keywords: add them to the ScrapeBox text editor and put the "link:" operator before each one.
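    Step 10 is simple enough to show as code: deduplicate the harvested URLs and prefix each with the "link:" search operator so they can be loaded as custom footprints. The sample URLs are placeholders.

```python
# Turn harvested spammer URLs into "link:" queries for step 11's
# custom footprint harvester, dropping exact duplicates first.
def to_link_queries(urls):
    seen, queries = set(), []
    for url in urls:
        url = url.strip()
        if url and url not in seen:
            seen.add(url)
            queries.append("link:" + url)
    return queries

harvested = [
    "http://spammer-site1.com/page",
    "http://spammer-site1.com/page",  # duplicate, dropped
    "http://spammer-site2.net/",
]
print(to_link_queries(harvested))
```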

    11. Use custom footprint harvester.

    12. [OPTIONAL] Use the Blog Analyzer addon to filter out incompatible blogs.

    13. [OPTIONAL] Use CrazyFlx's tutorial to deal with massive text files:
    Code:
    http://crazyflx.com/scrapebox-tips/remove-duplicate-domains-urls-from-huge-txt-files-with-ease/
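    The kind of cleanup that tutorial covers can be sketched as a streaming filter: read the huge URL list one line at a time and keep only the first URL seen per domain, so the file never has to fit in a text editor. This is my own illustration, not CrazyFlx's method; the file names in the comment are placeholders.

```python
# Keep one URL per domain from an arbitrarily large list, streaming.
# Only the set of seen domains is held in memory, not the file itself.
from urllib.parse import urlparse

def dedupe_domains(lines):
    """Yield only the first URL encountered for each domain."""
    seen = set()
    for line in lines:
        url = line.strip()
        domain = urlparse(url).netloc.lower()
        if domain and domain not in seen:
            seen.add(domain)
            yield url

# Usage on a massive file (paths are placeholders):
# with open("urls.txt") as src, open("unique_domains.txt", "w") as dst:
#     for url in dedupe_domains(src):
#         dst.write(url + "\n")
```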
    14. [OPTIONAL] Repeat the whole process with the new list, if you find it worthwhile, until you scrape the whole web :D ;)

    This method has great potential and allows you to scrape billions of URLs. Yes, even if we say 70% or more are duplicates, you would still end up with millions of unique URLs. Spammers occasionally compile and use their own lists, so the chances of finding unique URLs are good. This is my primary scraping method; it HAS been mentioned time and time again, but here it comes with a few additional tweaks. The "comment page" tweak also helps you reach spammer URLs that others don't, thus helping you steal their backlinks while others can't :D

    Enjoy folks!:D
     
    • Thanks Thanks x 16
  2. wahidpolin

    wahidpolin Regular Member

    Joined:
    Dec 25, 2009
    Messages:
    441
    Likes Received:
    259
    This method is excellent, good job.
     
  3. docdizzle

    docdizzle Newbie

    Joined:
    Jul 23, 2010
    Messages:
    42
    Likes Received:
    20
    You can also get complete video tutorials on this method from this guy.... http://scrapeboxtuts.com/
     
  4. AquaClean

    AquaClean Regular Member

    Joined:
    Oct 4, 2010
    Messages:
    260
    Likes Received:
    401
    Great method, been using this for a long time. Also, not to hijack, but is anyone else having a problem with the SB proxy scraper? It won't filter out the high-latency proxies. I select what I want as usual and it doesn't filter; it started doing this in the last couple of days, and it's pretty annoying.

    I use private proxies for posting and public ones for scraping, but now I can't even filter my public ones.
     
  5. youssef93

    youssef93 Senior Member

    Joined:
    Sep 14, 2008
    Messages:
    828
    Likes Received:
    1,148
    Occupation:
    Student, Part-time Online Marketer
    Location:
    Egypt
    Thank you all for the great feedback. I just want to stress that the core of this method isn't mine. I just added a few twists, wrote it up in a step-by-step manner, and recommended tools that would help along the way. :)

    @AquaClean

    Maybe you can open a new thread about your proxy harvester problem? :)
     
  6. docdizzle

    docdizzle Newbie

    Joined:
    Jul 23, 2010
    Messages:
    42
    Likes Received:
    20
    Thanks for pointing out CrazyFlx's site... he's got some good info there.
     
  7. ampped101

    ampped101 Newbie

    Joined:
    Oct 2, 2010
    Messages:
    24
    Likes Received:
    0
    Harvesting will eventually die, even with spinners!
     
  8. Florist88

    Florist88 Newbie

    Joined:
    Jul 24, 2009
    Messages:
    43
    Likes Received:
    7
    Great method, thanks, am busy testing it out.
    In step 3, do you check and filter the PR of the base domain, or of the URL itself?
     
  9. jvinci

    jvinci Newbie

    Joined:
    Jul 27, 2011
    Messages:
    35
    Likes Received:
    17
    I was wondering the same thing.
     
  10. david wright

    david wright Regular Member

    Joined:
    Aug 3, 2010
    Messages:
    229
    Likes Received:
    86
    Good info here, thanks brother.