
Merge List For Scrapebox: Giveaway / Help

Discussion in 'Black Hat SEO' started by fr33sh, Feb 11, 2012.

  1. fr33sh

    fr33sh Registered Member

    Joined:
    Jan 29, 2012
    Messages:
    98
    Likes Received:
    4
    Hey everyone. I wanted to create a post to share an idea with you all, since I haven't seen much on the topic. I also need some help from the more experienced users out there. The idea is to take the footprint lists that members have compiled and turn them into a merge list using the %KW% placeholder in Scrapebox.

    Bear with me if this is completely noobish, but I haven't had the program long. I have been doing a lot of research and haven't seen anything about the merge feature being used in the way I will discuss here.

    I searched the forums before posting and didn't find anything. Lastly, I hope this is formatted right, since this is the first thread I have started.

    Ok so here's the Merge List:

    HTML:
    "Powered by wordpress" "leave a comment" %KW%
    site:.edu "Powered by wordpress" %KW%
    site:.gov "Powered by wordpress" %KW%
    site:.net "Powered by wordpress" %KW%
    site:.info "Powered by wordpress" %KW%
    site:.org "Powered by wordpress" %KW%
    site:.biz "Powered by wordpress" %KW%
    site:.com "Powered by wordpress" %KW%
    site:.ac.uk "Powered by wordpress" %KW%
    site:.co.uk "Powered by wordpress" %KW%
    site:.gov.uk "Powered by wordpress" %KW%
    site:.ca "Powered by wordpress" %KW%
    site:.edu.ca "Powered by wordpress" %KW%
    site:.gc.ca "Powered by wordpress" %KW%
    site:.au "Powered by wordpress" %KW%
    site:.edu.au "Powered by wordpress" %KW%
    site:.gov.au "Powered by wordpress" %KW%
    "Powered by BlogEngine" %KW%
    site:.edu "Powered by BlogEngine.NET" %KW%
    site:.gov "Powered by BlogEngine.NET" %KW%
    site:.net "Powered by BlogEngine.NET" %KW%
    site:.info "Powered by BlogEngine.NET" %KW%
    site:.org "Powered by BlogEngine.NET" %KW%
    site:.biz "Powered by BlogEngine.NET" %KW%
    site:.com "Powered by BlogEngine.NET" %KW%
    site:.ac.uk "Powered by BlogEngine.NET" %KW%
    site:.co.uk "Powered by BlogEngine.NET" %KW%
    site:.gov.uk "Powered by BlogEngine.NET" %KW%
    site:.edu.ca "Powered by BlogEngine.NET" %KW%
    site:.gc.ca "Powered by BlogEngine.NET" %KW%
    site:.ca "Powered by BlogEngine.NET" %KW%
    site:.edu.au "Powered by BlogEngine.NET" %KW%
    site:.gov.au "Powered by BlogEngine.NET" %KW%
    site:.com.au "Powered by BlogEngine.NET" %KW%
    site:.au "Powered by BlogEngine.NET" %KW%
    "powered by Movable Type" %KW%
    site:.edu "powered by Movable Type" %KW%
    site:.gov "powered by Movable Type" %KW%
    site:.net "powered by Movable Type" %KW%
    site:.info "powered by Movable Type" %KW%
    site:.org "powered by Movable Type" %KW%
    site:.biz "powered by Movable Type" %KW%
    site:.com "powered by Movable Type" %KW%
    site:.ac.uk "powered by Movable Type" %KW%
    site:.co.uk "powered by Movable Type" %KW%
    site:.gov.uk "Powered by Movable Type" %KW%
    site:.edu.ca "powered by Movable Type" %KW%
    site:.gc.ca "powered by Movable Type" %KW%
    site:.ca "powered by Movable Type" %KW%
    site:.edu.au "powered by Movable Type" %KW%
    site:.gov.au "powered by Movable Type" %KW%
    site:.com.au "powered by Movable Type" %KW%
    site:.au "powered by Movable Type" %KW%
    "powered by b2evolution" %KW%
    site:.edu "powered by b2evolution" %KW%
    site:.gov "powered by b2evolution" %KW%
    site:.net "powered by b2evolution" %KW%
    site:.info "powered by b2evolution" %KW%
    site:.org "powered by b2evolution" %KW%
    site:.biz "powered by b2evolution" %KW%
    site:.com "powered by b2evolution" %KW%
    site:.ac.uk "powered by b2evolution" %KW%
    site:.co.uk "powered by b2evolution" %KW%
    site:.gov.uk "powered by b2evolution" %KW%
    site:.edu.ca "powered by b2evolution" %KW%
    site:.gc.ca "powered by b2evolution" %KW%
    site:.ca "powered by b2evolution" %KW%
    site:.edu.au "powered by b2evolution" %KW%
    site:.gov.au "powered by b2evolution" %KW%
    site:.com.au "powered by b2evolution" %KW%
    site:.au "powered by b2evolution" %KW%
    "powered by ExpressionEngine" %KW%
    site:.edu "powered by ExpressionEngine" %KW%
    site:.gov "powered by ExpressionEngine" %KW%
    site:.net "powered by ExpressionEngine" %KW%
    site:.info "powered by ExpressionEngine" %KW%
    site:.org "powered by ExpressionEngine" %KW%
    site:.biz "powered by ExpressionEngine" %KW%
    site:.com "powered by ExpressionEngine" %KW%
    site:.ac.uk "powered by ExpressionEngine" %KW%
    site:.co.uk "powered by ExpressionEngine" %KW%
    site:.gov.uk "powered by ExpressionEngine" %KW%
    site:.edu.ca "powered by ExpressionEngine" %KW%
    site:.gc.ca "powered by ExpressionEngine" %KW%
    site:.ca "powered by ExpressionEngine" %KW%
    site:.edu.au "powered by ExpressionEngine" %KW%
    site:.gov.au "powered by ExpressionEngine" %KW%
    site:.com.au "powered by ExpressionEngine" %KW%
    site:.au "powered by ExpressionEngine" %KW%
    "powered by Drupal" %KW%
    site:.edu "powered by Drupal" %KW%
    site:.gov "powered by Drupal" %KW%
    site:.net "powered by Drupal" %KW%
    site:.info "powered by Drupal" %KW%
    site:.org "powered by Drupal" %KW%
    site:.biz "powered by Drupal" %KW%
    site:.com "powered by Drupal" %KW%
    site:.ac.uk "powered by Drupal" %KW%
    site:.co.uk "powered by Drupal" %KW%
    site:.gov.uk "powered by Drupal" %KW%
    site:.edu.ca "powered by Drupal" %KW%
    site:.gc.ca "powered by Drupal" %KW%
    site:.ca "powered by Drupal" %KW%
    site:.edu.au "powered by Drupal" %KW%
    site:.gov.au "powered by Drupal" %KW%
    site:.com.au "powered by Drupal" %KW%
    site:.au "powered by Drupal" %KW%
    "Drupal is a registered trademark of Dries Buytaert." %KW%
    site:.edu "Drupal is a registered trademark of Dries Buytaert." %KW%
    site:.gov "Drupal is a registered trademark of Dries Buytaert." %KW%
    site:.net "Drupal is a registered trademark of Dries Buytaert." %KW%
    site:.info "Drupal is a registered trademark of Dries Buytaert." %KW%
    site:.org "Drupal is a registered trademark of Dries Buytaert." %KW%
    site:.biz "Drupal is a registered trademark of Dries Buytaert." %KW%
    site:.com "Drupal is a registered trademark of Dries Buytaert." %KW%
    site:.ac.uk "Drupal is a registered trademark of Dries Buytaert." %KW%
    site:.co.uk "Drupal is a registered trademark of Dries Buytaert." %KW%
    site:.gov.uk "Drupal is a registered trademark of Dries Buytaert." %KW%
    site:.edu.ca "Drupal is a registered trademark of Dries Buytaert." %KW%
    site:.gc.ca "Drupal is a registered trademark of Dries Buytaert." %KW%
    site:.ca "Drupal is a registered trademark of Dries Buytaert." %KW%
    site:.edu.au "Drupal is a registered trademark of Dries Buytaert." %KW%
    site:.gov.au "Drupal is a registered trademark of Dries Buytaert." %KW%
    site:.com.au "Drupal is a registered trademark of Dries Buytaert." %KW%
    site:.au "Drupal is a registered trademark of Dries Buytaert." %KW%
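The list above follows a regular pattern: each footprint appears once on its own and once per site: filter. As a rough sketch (not a Scrapebox feature), a list of that shape could be generated rather than typed by hand. The function name `build_merge_list` and its parameters are hypothetical, made up for this illustration.

```python
def build_merge_list(footprints, site_filters, token="%KW%"):
    """Return merge-list lines: one bare line per footprint, then one per site: filter."""
    lines = []
    for footprint in footprints:
        # Bare footprint with the keyword placeholder appended.
        lines.append(f'"{footprint}" {token}')
        for tld in site_filters:
            # Same footprint restricted to one domain suffix.
            lines.append(f'site:{tld} "{footprint}" {token}')
    return lines


if __name__ == "__main__":
    sample = build_merge_list(
        ["powered by Drupal", "powered by b2evolution"],
        [".edu", ".gov", ".co.uk"],
    )
    print("\n".join(sample))
```

Generating the list this way also avoids copy-and-paste slips such as a repeated or mistyped site: suffix.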
    
    
    
    The whole idea was to have Scrapebox search all the blog platforms in all the major English-speaking countries for a specific keyword at one time. I wanted to do this instead of running multiple searches.
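To make the merge idea concrete, here is a minimal sketch of what the expansion amounts to, assuming the merge is a plain text substitution of %KW% with each keyword. This is an illustration only, not Scrapebox's actual implementation; the function name `merge_keywords` is hypothetical.

```python
def merge_keywords(merge_lines, keywords, token="%KW%"):
    """Expand every merge-list line with every keyword, skipping blanks and duplicates."""
    queries = []
    seen = set()
    for line in merge_lines:
        template = line.strip()
        if not template:
            continue  # ignore empty lines in the merge file
        for keyword in keywords:
            query = template.replace(token, keyword)
            if query not in seen:  # a repeated merge line yields the same query
                seen.add(query)
                queries.append(query)
    return queries


if __name__ == "__main__":
    merge_list = [
        '"powered by Drupal" %KW%',
        'site:.edu "powered by Drupal" %KW%',
    ]
    for q in merge_keywords(merge_list, ["lil wayne"]):
        print(q)
```

Under this assumption the number of queries is at most the number of merge lines times the number of keywords, so a single keyword produces about one query per line of the list.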

    For my test I used Lil Wayne as my keyword, since I knew he would bring back a lot of results.

    So I put my keyword in and loaded the merge list. Then I ran my scrape. The problem I ran into was that I only got a little over 1,000 results.

    Does anyone know what went wrong? Thanks.
     
  2. ksy213

    ksy213 Power Member

    Joined:
    Apr 24, 2011
    Messages:
    583
    Likes Received:
    152
    Are you using public proxies?
     
  3. fr33sh

    fr33sh Registered Member

    Joined:
    Jan 29, 2012
    Messages:
    98
    Likes Received:
    4
    Yes I am.
     
  4. ksy213

    ksy213 Power Member

    Joined:
    Apr 24, 2011
    Messages:
    583
    Likes Received:
    152
    Sometimes public proxies go dead and become useless. When that happens, you usually won't get the maximum results.
     
  5. jubu211

    jubu211 Newbie

    Joined:
    Feb 20, 2011
    Messages:
    34
    Likes Received:
    16
    Using site:, inurl: and similar operators gets proxies banned much, much faster than regular searches. That is your issue.
     
  6. fr33sh

    fr33sh Registered Member

    Joined:
    Jan 29, 2012
    Messages:
    98
    Likes Received:
    4
    True! I was thinking the same thing but I tested them afterwards and they were still good.
     
  7. fr33sh

    fr33sh Registered Member

    Joined:
    Jan 29, 2012
    Messages:
    98
    Likes Received:
    4
    This is true. But I tested them after running the merge list and they were still good.
     
  8. fr33sh

    fr33sh Registered Member

    Joined:
    Jan 29, 2012
    Messages:
    98
    Likes Received:
    4
    Also, by the way, I scrape for my own public proxies; I don't use the built-in sources. Another thing I do is use only Yahoo, since it seems to be much nicer on proxies. I only use Google when I need a Google-specific search operator, like .. or when I need to search within a time period that only Google offers, such as the last 24 hours.
     
  9. Gyuman82

    Gyuman82 Elite Member

    Joined:
    Nov 15, 2011
    Messages:
    1,832
    Likes Received:
    1,122
    Occupation:
    SEO Specialist
    Location:
    Los Angeles
    Sounds like a proxy issue.

    I just plugged in your footprints above for "lil wayne" and got over 20,000 results in a couple of minutes. I stopped it because it looked like it was gonna run forever lol

    I have 250 private proxies I use to scrape though.

    But yeah sounds like a proxy error.
     
  10. fr33sh

    fr33sh Registered Member

    Joined:
    Jan 29, 2012
    Messages:
    98
    Likes Received:
    4
    You know, I've been looking into proxies and VPS services. It's just that right now I have not made anything off IM. The services I was looking at were buyproxies for proxies and bermanhosting for VPS.

    What are you using for your proxies?
    PS. I'm glad you got some results from the merge list. I hope you found it helpful. :D
     
    Last edited: Feb 11, 2012
  11. Gyuman82

    Gyuman82 Elite Member

    Joined:
    Nov 15, 2011
    Messages:
    1,832
    Likes Received:
    1,122
    Occupation:
    SEO Specialist
    Location:
    Los Angeles
    For proxies I use Squid Proxies. Not the best by far, but if you need a lot (which I do), they are probably the best bang for the buck. I have 250, so if I got "good" proxies it would cost a TON.

    So for Scrapebox I just use Squid Proxies as it is good enough for what I need.

    For VPS I use FDCservers.net. Probably a little more expensive than what others use, but I don't mind paying more to avoid the whole "server down" BS a lot of people have with using cheaper versions.

    I have never had a problem with them, and I can run 6 instances with no issue for about $75 a month.

    I list my whole Scrapebox set-up here:

    Gyuman82's Scrapebox Set-Up



    And yes, thanks for the 1,000s of Lil Wayne URLs :D
     
  12. fr33sh

    fr33sh Registered Member

    Joined:
    Jan 29, 2012
    Messages:
    98
    Likes Received:
    4
    Just out of curiosity, are a lot of people doing this? I really haven't seen it discussed too much.