
[Tutorial] Scrape proxies using Scrapebox

Discussion in 'Proxies' started by Kitsune, Feb 20, 2013.

  1. Kitsune

    Kitsune Junior Member

    Joined:
    Mar 2, 2008
    Messages:
    136
    Likes Received:
    30
    This is a method I used to scrape proxies back when I couldn't afford to buy proxies for Google harvesting.
    Google has become a lot stricter about banning public proxies, but this method should still get you enough proxies to do some harvesting.

    1. First of all, you need some proxies to start with. Head over to the BHW proxy list subforum and grab all the proxy lists that were added today and yesterday. Load them into the Scrapebox proxy manager and start testing them. I use 100 connections and a 30-second timeout for this test, but you may want to adjust these depending on your internet connection.

    2. Once that is done, save all the Google-passed proxies and put them in the harvester. After that, create a txt file which contains
    Code:
    "%KW%"
    and then use it as a custom footprint. Also set the harvester to scrape only URLs which were created or updated in the last 24 hours. You can scrape without proxies this time, because there won't be enough searches to get your IP banned.

    3. Now save all of the harvested URLs to a txt file. Then open the proxy harvester and press "Add source" → "By importing a list of urls". Now harvest proxies from all of the URLs.

    4. Now you have a bunch of proxies and you need to test them. To make this process quicker, tick "No google test" in the proxy manager configuration. I used 200 connections and a 30s timeout. You may want to use a lower connection count if your internet connection isn't fast. Since there are a lot of proxies in your list, checking will take quite a while. I checked 40k proxies and it took me a bit less than 1 hour.

    5. Now keep only the anonymous proxies and save them to a file. If you don't want to use these proxies for search engine scraping, you are done. If you do, go into the proxy manager configuration, enable testing against Google, set the connection count to 50 or less, and set the timeout to 45s. Then check the proxies again.

    6. Now you have a list of proxies which you can use for scraping. However, remember to keep the anonymous proxy list. After you have scraped for a while, some of the proxies will get banned. When that happens, take the anonymous proxy list and recheck it: some proxies which were previously banned may have been unbanned by then.
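
As a rough illustration of what steps 2 and 3 automate, here is a minimal Python sketch. It is not Scrapebox's own code. It assumes the Google-passed proxies are used as the harvester keywords, so each one is substituted for %KW% in the footprint; the function names `build_queries` and `extract_proxies` and the regex are made up for this example.

```python
import re

# Matches IPv4:port pairs such as 203.0.113.7:8080 (illustrative pattern).
PROXY_RE = re.compile(r"\b(\d{1,3}(?:\.\d{1,3}){3}):(\d{1,5})\b")


def build_queries(proxies, footprint='"%KW%"'):
    """Step 2: substitute each seed proxy for the %KW% token in the footprint."""
    return [footprint.replace("%KW%", proxy) for proxy in proxies]


def extract_proxies(text):
    """Step 3: pull unique ip:port strings out of page text, in order of first appearance."""
    seen = set()
    result = []
    for ip, port in PROXY_RE.findall(text):
        # Skip matches that cannot be real addresses or ports.
        if any(int(octet) > 255 for octet in ip.split(".")):
            continue
        if not 0 < int(port) <= 65535:
            continue
        proxy = f"{ip}:{port}"
        if proxy not in seen:
            seen.add(proxy)
            result.append(proxy)
    return result


if __name__ == "__main__":
    # Addresses from the reserved documentation ranges, not real proxies.
    print(build_queries(["203.0.113.7:8080"]))
    page = "203.0.113.7:8080 foo 203.0.113.7:8080\n198.51.100.2:3128 bad 999.1.1.1:80"
    print(extract_proxies(page))
```

The quotes in the footprint make the search engine look for the exact ip:port string, which is why pages listing a known proxy tend to list other fresh ones too. The extracted list still has to be tested as described in steps 4 and 5.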

    BONUS
    If the proxies from BlackHatWorld aren't enough, here is a list of proxy sources which get updated every day. You can use these in the Scrapebox proxy harvester.

    Code:
    http://checkerproxy.net/all_proxy
    http://www.ultraproxies.com/anonymous.html?datatype=Medium
    http://www.ultraproxies.com/high-anonymous.html?datatype=High
    http://www.ultraproxies.com/https.html?datatype=HTTPS=1
    http://www.ultraproxies.com/http.html?datatype=HTTP=1
    http://proxyserverlist.blogspot.com/
    http://www.scrapeboxproxies.net/
    http://www.pr0xies.org/
    http://new-fresh-proxies.blogspot.com/
    http://ssl-proxy-server.blogspot.com/
    http://proxies.my-proxy.com/proxy-list-s2.html
    http://proxies.my-proxy.com/proxy-list-s1.html
    http://www.freeproxy.ch/proxylight.txt
    http://proxy-level.blogspot.com/
    I hope that you find this useful. Let me know if you have any questions. :)
     
    • Thanks Thanks x 14
  2. Neo240

    Neo240 Power Member

    Joined:
    Sep 13, 2011
    Messages:
    543
    Likes Received:
    329
    Occupation:
    Hitman
    Location:
    Vacation
  3. proxygo

    proxygo Jr. VIP Jr. VIP Premium Member

    Joined:
    Nov 2, 2008
    Messages:
    12,242
    Likes Received:
    9,030
    Occupation:
    PROVIDING PROXIES FOR SEO URL SCRAPERS
    Location:
    BHW
    Home Page:
    Nice tutorial, just one comment.
    What you're showing people to do is what they should already be doing
    if they scrape or use public proxies.
    The method you show isn't so much a how-to as how it's already done, aimed at
    people new to Scrapebox, since an experienced user would know this.

    Public URL lists for scraping proxies are also already posted here
    regularly, but one more never hurts.
    http://www.blackhatworld.com/blackhat-seo/proxies/464079-proxie-scraping-links.html
    sample

    Code:
    http://proxyserverlist.blogspot.com/
    http://proxyserverlist.blogspot.com/2013/02/l1l2l3ssl-http-proxy-server-list_5380.html
    http://proxyserverlist.blogspot.com/2013/02/l1l2l3ssl-http-proxy-server-list_7041.html
    http://proxyserverlist.blogspot.com/2013/02/l1l2l3ssl-http-proxy-server-list_750.html
    http://proxyserverlist.blogspot.com/2013/02/l1l2l3ssl-http-proxy-server-list_8442.html
    http://shakzone.blogspot.com/2013/02/20-02-13-anonymous-proxy-servers-650.html
    http://shakzone.blogspot.com/2013/02/20-02-13-anonymous-proxy-servers-list.html
    http://shakzone.blogspot.com/2013/02/20-02-13-anonymous-proxy-servers-list_19.html
    http://shakzone.blogspot.com/2013/02/20-02-13-high-anonymous-elite-proxy.html
    http://shakzone.blogspot.com/2013/02/20-02-13-high-anonymous-elite-proxy_19.html
    http://shakzone.blogspot.com/2013/02/20-02-13-http-google-approved-proxy.html
    http://shakzone.blogspot.com/2013/02/20-02-13-l1l2l3-http-proxies-list-3298.html
    http://shakzone.blogspot.com/2013/02/20-02-13-l1l2l3-http-proxies-list-3317.html
    http://shakzone.blogspot.com/2013/02/20-02-13-l1l2l3-http-proxies-list-3370.html
    http://shakzone.blogspot.com/2013/02/20-02-13-l1l2l3-http-proxies-list-4640.html
    http://shakzone.blogspot.com/2013/02/20-02-13-l1l2l3-http-proxies-list-5090.html
    http://shakzone.blogspot.com/2013/02/20-02-13-l1l2l3-http-proxies-mixed-3212.html
    http://shakzone.blogspot.com/2013/02/20-02-13-l1l2l3-http-proxies-mixed-3318.html
    http://shakzone.blogspot.com/2013/02/20-02-13-scrapebox-proxy-servers-3100.html
    http://shakzone.blogspot.com/2013/02/20-02-13-scrapebox-proxy-servers-3700.html
    http://shakzone.blogspot.com/2013/02/20-02-13-scrapebox-proxy-servers-list.html
    http://shakzone.blogspot.com/2013/02/20-02-13-scrapebox-proxy-servers-list_20.html
    http://shakzone.blogspot.com/2013/02/20-02-13-scrapebox-proxy-servers-list_5405.html
    http://shakzone.blogspot.com/2013/02/20-02-13-scrapebox-proxy-servers-list_66.html
    http://shakzone.blogspot.com/2013/02/20-02-13-scrapebox-proxy-servers-list_9426.html
    http://shakzone.blogspot.com/2013/02/20-02-13-speed-l1-elite-proxies-170.html
    http://shakzone.blogspot.com/2013/02/20-02-13-speed-l1-elite-proxies-226.html
    http://shakzone.blogspot.com/2013/02/20-02-13-speed-l1-elite-proxies-262.html
    http://shakzone.blogspot.com/2013/02/20-02-13-speed-l1-elite-proxies-list-191.html
    http://shakzone.blogspot.com/2013/02/20-02-13-speed-l1-elite-proxies-list-236.html
    http://shakzone.blogspot.com/2013/02/20-02-13-speed-l1-elite-proxies-list-332.html
    http://www.alohatube.com/top/anal+fisting
    http://www.************.com/f120/200-l1-l2-proxies-20-feb-2013-a-90001.html
    http://www.listsock.net/
    http://www.listsock.net/2013/02/1922013-list-sock-proxy-1181.html
    http://www.papatyafrm.com/guncel-proxy-334/20022013-guncel-proxy-listesi-hergun-guncel-proxy-listesi-245220/sayfa-6
    http://www.pr0xies.org/
    http://www.pr0xies.org/2013/02/19-02-13-high-anonymous-elite-l1-proxy_3773.html
    http://www.pr0xies.org/2013/02/20-02-13-anonymous-proxy-servers-780.html
    http://www.pr0xies.org/2013/02/20-02-13-high-anonymous-elite-l1-proxy_20.html
    http://www.proxyfire.net/forum/showthread.php?goto=newpost&t=63672
    http://www.proxyfire.net/forum/showthread.php?goto=newpost&t=63674
    http://www.proxyfire.net/forum/showthread.php?p=135497
    http://www.proxyfire.net/forum/showthread.php?t=63663
    http://www.proxyfire.net/forum/showthread.php?t=63672
    http://www.proxyfire.net/forum/showthread.php?t=63673
    http://www.proxyfire.net/forum/showthread.php?t=63674
    http://www.proxyfire.net/forum/showthread.php?t=63674&goto=newpost
    http://www.proxz.com/proxy_list_anonymous_us_0.html
    http://www.robotproxy.com/2013/02/200-scrapebox-passed-http-proxies_20.html
    http://www.scrapeboxproxies.net/
    http://www.seodesert.com/f19/2013-02-19-217-scrapebox-passed-http-proxies-freshly-verified-w-screenshot-44375.html
    http://www.websitevalue.us/www/tam-tam.fr
    http://www.websitevalue.us/www/westinrome.com
    
     
    Last edited: Feb 20, 2013
  4. Kitsune

    Kitsune Junior Member

    Joined:
    Mar 2, 2008
    Messages:
    136
    Likes Received:
    30
    @proxygo Well yeah, I wasn't really sure what level of detail I should write this tutorial at. I'll keep this in mind when writing more tutorials in the future.
     
  5. mrankin

    mrankin Jr. VIP Jr. VIP

    Joined:
    Oct 17, 2008
    Messages:
    1,247
    Likes Received:
    575
    Location:
    Australia
    Home Page:
    Just a word of warning to all: as of yesterday (21st Feb 2013), Scrapebox's proxy checker reports the anonymous/transparent status of proxies incorrectly. This means that if you're using public proxies and think they're anonymous, they may not be. Also, if you're using private proxies (like mine), they can show up as not being anonymous, which is incorrect.

    I strongly recommend you double check your proxies using a 3rd party tool. Personally I use Lagado's Proxy Checker which I have found to be the most accurate.

    I suspect this issue may be fixed in future versions, but better safe than sorry, especially if the source of the proxies is not known.
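
For anyone who wants to sanity-check results by hand, here is a minimal Python sketch of the usual transparent/anonymous/elite convention. It is an assumption about how such checks are commonly done, not a description of how Scrapebox or Lagado's Proxy Checker work internally. It assumes you requested a "proxy judge" page through the proxy and it echoed back the request headers it received; `classify_anonymity` and the list of marker headers are hypothetical choices for this example.

```python
import re

# Headers that commonly reveal a proxy is in use (illustrative, not exhaustive).
PROXY_MARKERS = ("via", "x-forwarded-for", "forwarded", "proxy-connection")


def classify_anonymity(echoed_headers, real_ip):
    """Classify a proxy from the request headers a judge page echoed back.

    transparent: your real IP appears in some header value
    anonymous:   proxy marker headers are present, but your IP is not leaked
    elite:       neither of the above
    """
    headers = {name.lower(): value for name, value in echoed_headers.items()}
    for value in headers.values():
        # Compare whole tokens so 198.51.100.9 does not match 198.51.100.99.
        if real_ip in re.split(r'[,\s;="]+', value):
            return "transparent"
    if any(name in headers for name in PROXY_MARKERS):
        return "anonymous"
    return "elite"


if __name__ == "__main__":
    me = "198.51.100.9"  # documentation-range address standing in for your IP
    print(classify_anonymity({"X-Forwarded-For": me}, me))
    print(classify_anonymity({"Via": "1.1 squid"}, me))
    print(classify_anonymity({"User-Agent": "Mozilla/5.0"}, me))
```

A check like this only sees what the judge page reports, so it is worth comparing against a second tool before trusting a proxy whose source is unknown.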
     
  6. Seeker85

    Seeker85 Newbie

    Joined:
    Apr 30, 2013
    Messages:
    38
    Likes Received:
    2
    I don't get #2... you just put %KW% anywhere in the text file, just once, and not once at the beginning of the proxy list and once at the end? I assume I leave out the quotation marks? Sorry for the basic questions, but this is my first time using Scrapebox.
     
  7. HighTop123

    HighTop123 Junior Member

    Joined:
    Nov 13, 2012
    Messages:
    198
    Likes Received:
    49
    Is there any way to make sure one scrapes only UK proxies?
     
  8. proxygo

    proxygo Jr. VIP Jr. VIP Premium Member

    Joined:
    Nov 2, 2008
    Messages:
    12,242
    Likes Received:
    9,030
    Occupation:
    PROVIDING PROXIES FOR SEO URL SCRAPERS
    Location:
    BHW
    Home Page:
    Don't waste your time. If you get 10 you'd be lucky.
     
    • Thanks Thanks x 1