PHP Script for scraping proxies from HideMyAss

Discussion in 'PHP & Perl' started by hardc0der, Apr 23, 2014.

  1. hardc0der

    hardc0der BANNED

    Joined:
    Dec 13, 2013
    Messages:
    130
    Likes Received:
    27
    Hi guys,

    For those of you who want public proxies, I've made a PHP script that scrapes all the public proxies from hidemyass.com/proxy-list/
    It parses every page up to the last one and saves all the proxies to a .txt file.

    Enjoy

    Download link:
    http://s000.tinyupload.com/index.php?file_id=66384768044350271654
     
    • Thanks x 2
  2. seeplusplus

    seeplusplus Power Member

    Joined:
    Aug 18, 2008
    Messages:
    511
    Likes Received:
    163
    Shame.
     
  3. salmanseo982

    salmanseo982 Regular Member

    Joined:
    Jan 28, 2014
    Messages:
    465
    Likes Received:
    40
    The file was deleted from the server. Please upload it again somewhere else.
     
  4. MrBlue

    MrBlue Senior Member

    Joined:
    Dec 18, 2009
    Messages:
    950
    Likes Received:
    662
    Occupation:
    Web/Bot Developer
    Here it is:
    Code:
    <?php
    
    
    /**
     * Free PHP script for scraping proxies from HideMyAss
     * Works from console too with php get_proxies.php
     * It creates a file named proxies.txt and saves the proxies there
     * It parses all pages with proxies from HideMyAss
     */
    
    
    $baselink 	= "http://hidemyass.com/proxy-list";
    $result 	= xg($baselink);
    
    
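    // If no pagination block is found, fall back to scraping a single page.
    $pages = array(array());
    if (preg_match('#<div class="pagination">(.*?)</div#msi', $result['result'], $pagination)) {
    	preg_match_all('#<li>.*?</li>#msi', $pagination[1], $pages);
    }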
    
    
    $proxies = array();
    $ff = fopen("proxies.txt", "a+");
    for($i=1;$i<=count($pages[0])+1;$i++) {
    	
    	$link = $baselink . "/" . $i;
    	$result = xg($link);
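    	// Skip this page if the proxy table is missing (blocked request or changed layout).
    	if (!preg_match('#<table id="listtable".*?</thead.*?>(.*?)</table#msi', $result['result'], $content)) {
    		continue;
    	}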
    
    
    	$rows = explode("<tr", $content[1]);
    	array_shift($rows);
    
    
    	foreach($rows as $key => $row) {
    		$cols = explode("<td", $row);
    		array_shift($cols);
    		preg_match('#<style>(.*?)</style>#msi', $cols[1], $styles);
    		preg_match_all('#\.(.*?){display:.*?(.*?)}#msi', $styles[1], $style);
    
    
    		$classes = array();
    		for($j=0;$j<count($style[1]);$j++) {
    			$classes[$style[1][$j]] = $style[2][$j];
    		}
    
    
    		$cols[1] = preg_replace('#>.*?<style>.*?</style>#msi', '', $cols[1]);
    		$cols[1] = preg_replace('#<(div|span) style="display:none">.*?</(div|span)>#', '', $cols[1]);
    		foreach($classes as $class => $value) {
    			if($value == "none") {
    				$cols[1] = preg_replace('#<(div|span) class="'.preg_quote($class, '#').'">.*?</(div|span)>#', '', $cols[1]);
    			}
    		}
    		$proxy_ip   = strip_tags(preg_replace('#\s+#msi', '', $cols[1]));
    		$proxy_port = strip_tags(preg_replace('#>|\s+#msi', '', $cols[2]));
    
    
    		$proxies[] = $proxy_ip.':'.$proxy_port;
    		// Write one proxy per line; a bare "\r" does not start a new line on most systems.
    		fwrite($ff, $proxy_ip.':'.$proxy_port.PHP_EOL);
    	}
    }
    fclose($ff);
    
    
    function xg($url, $show_header = 0)
    {
    
    
    	$ch = curl_init();
    	curl_setopt($ch, CURLOPT_URL, $url);
    	curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.107 Safari/537.36");
    	curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    	curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    	curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
    	curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
    	curl_setopt($ch, CURLOPT_HEADER, $show_header);
    	curl_setopt($ch, CURLINFO_HEADER_OUT, true);
    	curl_setopt($ch, CURLOPT_COOKIEFILE, "cookie.txt");
    	curl_setopt($ch, CURLOPT_COOKIEJAR, "cookie.txt");
    	$result['result'] = curl_exec($ch);
    	$result['info']   = curl_getinfo($ch);
    	return $result;
    }
    
    
    var_dump($proxies);
    
    
    ?>
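
The core trick in the script above is how it recovers the visible IP: the cell mixes real fragments with decoy fragments that are hidden either by an inline style or by a class declared display:none in a style block. Below is a minimal, self-contained sketch of just that step. The function name decode_ip_cell and the sample markup are made up for illustration, and the real page layout may differ.

```php
<?php
// Hypothetical helper (not part of the posted script): strips the decoy
// fragments from one obfuscated IP cell and returns the visible text.
function decode_ip_cell($cell)
{
    // Collect the classes that the inline <style> block declares as display:none.
    $hidden = array();
    if (preg_match('#<style>(.*?)</style>#si', $cell, $m)) {
        preg_match_all('#\.([\w-]+)\s*\{\s*display\s*:\s*([a-z-]+)\s*\}#i', $m[1], $rules, PREG_SET_ORDER);
        foreach ($rules as $rule) {
            if (strtolower($rule[2]) === 'none') {
                $hidden[] = $rule[1];
            }
        }
    }

    // Drop the <style> block itself.
    $cell = preg_replace('#<style>.*?</style>#si', '', $cell);
    // Drop elements hidden with an inline style.
    $cell = preg_replace('#<(div|span) style="display:\s*none">.*?</\1>#si', '', $cell);
    // Drop elements whose class is declared display:none.
    foreach ($hidden as $class) {
        $cell = preg_replace('#<(div|span) class="' . preg_quote($class, '#') . '">.*?</\1>#si', '', $cell);
    }

    // What remains is the visible text; remove tags and whitespace.
    return preg_replace('#\s+#', '', strip_tags($cell));
}

// Made-up sample markup in the style the script expects.
$sample = '<span><style>.a1{display:none}.b2{display:inline}</style>'
        . '<span class="a1">99</span><span class="b2">192</span>'
        . '<span style="display:none">7</span>.168'
        . '<div style="display:none">.55</div>.<span class="b2">0</span>.<span class="a1">8</span>1</span>';

echo decode_ip_cell($sample), PHP_EOL; // prints 192.168.0.1
```

The sketch uses a backreference (\1) so that an opening div is only closed by a div, and preg_quote so that an unusual class name cannot break the pattern. Both are small hardening choices, not something the original script does.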
    
     
    • Thanks x 3
  5. jazzc

    jazzc Moderator Staff Member Moderator Jr. VIP

    Joined:
    Jan 27, 2009
    Messages:
    2,468
    Likes Received:
    10,143
    You may want to change this
    to this
    :)
     
  6. vebxperts

    vebxperts Jr. VIP Premium Member

    Joined:
    Nov 15, 2008
    Messages:
    1,657
    Likes Received:
    339
    Awesome share OP :)

    Here is a little input for further decorating the output (line by line):

    After following line:
    Add this line:

    Replace this line:
    With:
     
    Last edited: Jun 8, 2014
  7. Mr Daniel

    Mr Daniel Newbie

    Joined:
    Jun 23, 2014
    Messages:
    43
    Likes Received:
    5
    Awesome Share OP. Great. Solved a lot of problems for me
     
  8. member8200

    member8200 Regular Member

    Joined:
    Aug 9, 2014
    Messages:
    469
    Likes Received:
    33
    Thank you for this share. I made some adjustments to the script and it works fine.
     
  9. zoomsixx

    zoomsixx Senior Member Premium Member

    Joined:
    Apr 29, 2010
    Messages:
    882
    Likes Received:
    460
    Occupation:
    SEO, Social Marketing
    Location:
    BHW!
    I just get a message saying array(0) { } when running this. Anyone have some advice for a PHP amateur?

    Code:
    array(0) { }
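
An empty array from var_dump means no rows were parsed, which can happen when the request fails or when the page no longer contains the table the regex looks for. The thread does not say which case applies here. One way to narrow it down is to test the fetched HTML step by step, as in the sketch below. The function name diagnose_page and the sample inputs are hypothetical; in practice the argument would be the 'result' value returned by xg().

```php
<?php
// Hypothetical diagnostic helper, not part of the posted script.
// It reports which parsing step fails for a given page body.
function diagnose_page($html)
{
    // curl_exec returns false on failure; an empty body is just as useless.
    if ($html === false || $html === '') {
        return 'empty response: the request itself failed';
    }
    // Same table pattern the posted script uses.
    if (!preg_match('#<table id="listtable".*?</thead.*?>(.*?)</table#si', $html, $content)) {
        return 'no listtable found: the page layout differs from what the script expects';
    }
    // Count the data rows the same way the script splits them.
    $rows = explode('<tr', $content[1]);
    array_shift($rows);
    return count($rows) . ' row(s) found';
}

// Made-up inputs covering the three outcomes.
echo diagnose_page(''), PHP_EOL;
echo diagnose_page('<html><body>blocked</body></html>'), PHP_EOL;
echo diagnose_page('<table id="listtable"><thead><tr><th>IP</th></tr></thead><tr><td>x</td></tr></table>'), PHP_EOL;
```

If the second message appears for the live page, the site markup has probably changed (or the request was served a different page), and the regular expressions in the script would need to be updated to match.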