$15 for whoever can fix this simple Perl scraping script

Discussion in 'PHP & Perl' started by XoRaK, Aug 9, 2011.

  1. XoRaK

    XoRaK Regular Member

    Joined:
    Oct 28, 2009
    Messages:
    303
    Likes Received:
    250
    Occupation:
    social worker
    Location:
    Belgium
    I have a Perl script that needs some minor changes to scrape .us domains from
    http://newlydomains.com/domain-2011-08-03-us-1.html

    It currently doesn't work because something is wrong with it.
    $15 for whoever can fix it.

    PHP:
    #!/usr/bin/perl
    use strict;
    use warnings;
    use LWP::Simple;

    # Change the name of the output file
    my $filename = "domainlist.txt";
    # Change the name of the URL to fetch
    my @urls = ( "http://newlydomains.com/domain-2011-08-03-us-1.html" );
    # Change this to 0 if you want files to be overwritten
    my $overwrite = 1;
    print "Content-type: text/html\n\n";

    # With $overwrite set to 1, an existing file is left alone and the
    # results go to a new file with a timestamp suffix instead
    if($overwrite == 1)
      {
      if(-e $filename)
        {
        $filename = $filename.'.'.time;
        }
      }

    open(OUTFILE, ">", $filename) or die("Cannot open file ".$filename."\n");

    foreach my $url (@urls)
      {
      # get() returns undef when the fetch fails, so fall back to an empty page
      my $page = get($url);
      $page = '' unless defined $page;
      # Capture the name in front of ".us" in every matching table cell
      my(@results) = $page =~ m|"top">(.+?)\.us</td>|gi;
      if(scalar(@results) > 0)
        {
        foreach my $result (@results)
          {
          print OUTFILE "$result.us\n";
          }
        }
      else
        {
        print "No \".us\" domains found at $url\n";
        }
      }

    close OUTFILE;
    print "Results saved to <a href=\"$filename\">$filename</a>\n";

     
  2. XoRaK

    Someone on DP fixed it for me, thanks anyway!

    I know, DP...

    Greetz

    XoRaK