
Simple Script to Get Your Proxy List Automatically From Proxy URLs

Discussion in 'Proxies' started by spiritfly, Jan 25, 2015.

  1. spiritfly

    spiritfly Regular Member

    Intro

    I'm in the process of setting up my own list scraping system with Scrapebox. I decided to go for a premium (port scanned) proxy list, so I subscribed to Wylde's Premium Proxy list. I'm not affiliated with his service, just stating what I went with. I haven't tested it properly enough to have an opinion about it yet, but from a few runs it seemed decent for the 10 bucks monthly price, so I will stick with it for a while and see.

    He provides the proxies through a URL which is updated every 2-3 hours with a new set of 8-9k proxies. For a while I was wondering how to keep this list updated in Scrapebox, but it seems that Scrapebox does not support live proxy updates from a URL out of the box. This was a shock to me, as most software (GSA, for example) can do this very easily. The only source Scrapebox can update your proxy list from while the URL harvester is running is a locally saved txt file.

    It seems that Scrapebox can still update from URLs as well, but it's a bit complicated to set up. To do this you have to own the Automator addon (which costs $20), then create a proxy harvesting job with it and run it in a separate Scrapebox instance from the one you use for scraping URLs. I won't go into the details of the actual process because it is well described here (yeah, I apologized for my bad attitude there; it was late, I was hungry and all the bad stars were aligned in a weird way in the galaxy.. I do have a lot of respect for the Scrapebox devs).

    Anyway, I decided that this is not the route I wanted to take. Running a second Scrapebox instance with Automator just to fetch the proxy list once every 2-3 hours would be overkill (and eat unnecessary resources). So I decided to create a small VBScript which saves the proxy list to a txt file, and feed Scrapebox from that. The script is scheduled to run with the Windows Task Scheduler every 2 hours. It's the simplest way to do this on Windows without wasting precious resources, especially if you're on a tight-budget VPS.

    Script

    So it seems that Scrapebox 2.0 can only be set to update the proxies on the fly (while harvesting URLs) from a .txt file, NOT from a URL directly. Here is my solution for saving the contents of your proxy URL to a txt file.

    Actually it's very simple to create this script. All you have to do is:

    1. Open a new .txt file.
    2. Paste the code below inside.
    3. Save that file with a .vbs extension!! (In Notepad choose Save As.., select "All Files" as the file type and type in yourfilename.vbs.)

    Code:
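    Option Explicit
    Dim Url, xHttp, bStrm

    ' To pass the URL as a command-line argument instead, use:
    ' Url = WScript.Arguments.Item(0)
    Url = "http://YOUR.PROXY.URL.HERE"

    ' ServerXMLHTTP does not go through the Internet Explorer cache,
    ' so every run fetches a fresh copy of the list
    Set xHttp = CreateObject("MSXML2.ServerXMLHTTP.6.0")
    xHttp.Open "GET", Url, False ' False = synchronous request
    xHttp.Send

    ' Only overwrite the saved list when the download succeeded
    If xHttp.Status = 200 Then
        Set bStrm = CreateObject("ADODB.Stream")
        With bStrm
            .Type = 1 ' 1 = binary
            .Open
            .Write xHttp.responseBody
            .SaveToFile "C:\SELECT_YOUR_FOLDER\proxies.txt", 2 ' 2 = overwrite
            .Close
        End With
    End If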
    Replace:

    YOUR.PROXY.URL.HERE --- with your proxy URL
    C:\SELECT_YOUR_FOLDER\proxies.txt --- with the full path of the proxies.txt file you wish to save your proxies into (the folder must already exist).

    Whenever you run this VBS file, it will download the whole proxy list from the specified URL and save it to the txt file you designated in the script, overwriting the previous copy. Go ahead and test it by double-clicking it to see if it works.
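
For anyone who would rather not use VBScript, here is a rough Python sketch of the same download-and-save step. It is not part of the original setup and assumes Python 3 is installed on the machine. The function name `save_proxy_list` is made up for illustration. Unlike the VBScript, it refuses to save an empty download and writes through a temporary file, so proxies.txt is only replaced once the new list is complete.

```python
import os
import tempfile
import urllib.request


def save_proxy_list(url, dest, timeout=30):
    """Download url and replace dest with its contents; return bytes written."""
    # urlopen raises an exception on HTTP errors, so a failed
    # download never reaches the code that touches the saved list
    with urllib.request.urlopen(url, timeout=timeout) as response:
        data = response.read()
    if not data.strip():
        # Keep the previous list rather than saving an empty one
        raise ValueError("downloaded proxy list is empty")
    folder = os.path.dirname(os.path.abspath(dest))
    # Write to a temporary file in the same folder, then swap it in,
    # so proxies.txt is replaced as a whole or not at all
    fd, tmp_path = tempfile.mkstemp(dir=folder, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        os.replace(tmp_path, dest)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return len(data)


# Example call, using the same placeholders as the VBScript:
# save_proxy_list("http://YOUR.PROXY.URL.HERE",
#                 r"C:\SELECT_YOUR_FOLDER\proxies.txt")
```

One caveat: on Windows the final swap can fail if another program has proxies.txt open at that exact moment. In that case the old list is simply left in place until the next run.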

    Now, to make it update the proxies automatically, you can schedule the VBScript you just created with the Windows Task Scheduler. Setting up a Task Scheduler job in Windows is quite easy. I won't go into much detail because there are numerous guides about it if you use Google, but basically you want to set it to run your .vbs file every time the proxies behind your URL are updated. Your proxy provider should be able to tell you how often that is. In my case it is every 2 hours.

    After you set the scheduler to update your proxies, you need to set up Scrapebox 2.0 as well. When running the URL harvester in Scrapebox, you can tell it to get proxies from the txt file where your proxies are being saved and set it to reload every X minutes, which again in my case would be 120 minutes.
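
Since Scrapebox will happily keep reloading an old file if the scheduled task silently stops running, a small check on the file's age can help. The following is only a sketch in Python (not something Scrapebox or the scheduler provides). The name `is_stale` and the 120-minute default are assumptions taken from my 2-hour update interval.

```python
import os
import time


def is_stale(path, max_age_minutes=120, now=None):
    """Return True if path is missing or was last modified too long ago."""
    current = time.time() if now is None else now
    try:
        modified = os.path.getmtime(path)
    except OSError:
        # A missing or unreadable file counts as stale
        return True
    return (current - modified) > max_age_minutes * 60


# Example call with the placeholder path from the script:
# is_stale(r"C:\SELECT_YOUR_FOLDER\proxies.txt")
```

If this returns True well after the task should have run, the scheduled job (or the proxy URL itself) is probably the thing to look at.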

    So there you have it: a quick and easy way to update your proxies without wasting precious resources. If you know a bit more VBScript you can go ahead and add more sources, filter the proxies and so on. But I will leave that up to you, because my proxy list is already cleaned up and ready for prime time! :)
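
As a starting point for the "more sources and filtering" idea, here is a minimal Python sketch (Python rather than VBScript, purely as an illustration). It assumes each source is plain text with one IPv4 `ip:port` entry per line, which may not match every provider's format. The name `clean_proxies` is hypothetical. It merges any number of lists, drops malformed lines and removes duplicates while keeping the original order.

```python
import ipaddress


def clean_proxies(*lists):
    """Merge raw proxy lists into unique, well-formed ip:port entries."""
    seen = set()
    cleaned = []
    for text in lists:
        for line in text.splitlines():
            entry = line.strip()
            host, sep, port = entry.rpartition(":")
            # Require a colon and a numeric port in the valid range
            if not sep or not port.isdecimal() or not 1 <= int(port) <= 65535:
                continue
            try:
                ipaddress.IPv4Address(host)
            except ValueError:
                # Host part is not a valid IPv4 address
                continue
            if entry not in seen:
                seen.add(entry)
                cleaned.append(entry)
    return cleaned
```

The result can be written out with `"\n".join(...)` to the same proxies.txt that Scrapebox reads.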
     
  2. DjProgressive

    DjProgressive Newbie

    Thanks for the info. I'll try it.
     
  3. ilim72

    ilim72 Regular Member

    Besides our own software, which can run 100s of threads and is very fast, we have been using Proxy Multiplier, which is reliable as well.
     
  4. proxygo

    proxygo Jr. VIP Jr. VIP

    I don't follow, the guy is talking about a script for proxy scraping
    you're talking about a product you sell, how do they match up?
     
  5. ilim72

    ilim72 Regular Member

    Can you read well? If you read carefully, the thread is about getting updated proxies from a URL, and I mentioned that the product Proxy Multiplier does that well too. I have nothing to do with them, I just recommended it as I use it myself; there are no affiliate links or anything. So relax and don't try to get into the thread just to show your bitterness.