
suppression list script

Discussion in 'PHP & Perl' started by jbrewski, Feb 13, 2012.

  1. jbrewski

    jbrewski Junior Member

    Joined:
    May 12, 2011
    Messages:
    116
    Likes Received:
    32
    Location:
    USA
    I'm looking for a suppression list script that compares two files and echoes out every line that was not matched.

    I have a script that currently works, but the list I need to scrub has 2.6 million lines/records (a 30MB txt file) and the suppression list has 5k lines/records. When I tried to run it I got a memory error on the page, so I cut the file down to 100k records (1MB) and then hit a 30-second timeout error.

    Does anyone have any suggestions? Here is the code that works with smaller lists:

    Code:
    Reads file: good.txt<br />
    compares them to bad.txt<br />
    Removes bad emails from good emails txt<br />
    output...<br /><br />
    
    <?php
    
    //grab lists as string
    $goodr =  file_get_contents("good.txt");
    $badr = file_get_contents("bad.txt");
    
    
    //split list into array
    $good = explode("\n",$goodr);
    $bad = explode("\n",$badr);
    
    $final = array();
    
    //loop through and find any bad emails in the good list
    foreach($good as $g){
        if(in_array($g,$bad)){
            //remove the bad email from the list
            //unset($b);
        }else{
            $final[] = $g;
        }
    }
    
    echo '<textarea style="width:100%;height:100%;">';
    foreach($final as $g){
    echo $g . "\n";
    }
    echo '</textarea>';
    ?>
     
  2. jazzc

    jazzc Moderator Staff Member Moderator Jr. VIP

    Joined:
    Jan 27, 2009
    Messages:
    2,468
    Likes Received:
    10,155
    Reading a huge file with file_get_contents() and then splitting it with explode() builds a huge string plus a huge array, which is what hits the memory limit.

    Instead, create a loop that reads, say, 10,000 lines on each pass until the end of the file.

    An example of how to read a specific line without having to load the whole file:
    Code:
    $file = new SplFileObject('myfile.txt');
    $file->seek(9999);     // Seek to line no. 10,000
    echo $file->current(); // Print contents of that line
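
    Building on that, here is a minimal sketch of how the whole scrub could be streamed one line at a time. It assumes the 5k-line suppression list fits in memory and that entries should match exactly after trimming whitespace. The function name scrubList() and the output file are hypothetical, not part of the original script. It also swaps in_array() for an isset() lookup on array keys, since scanning the bad list once per line is likely a large part of the timeout.

```php
<?php
// Sketch only: scrubList() and its parameter names are hypothetical.
function scrubList(string $goodPath, string $badPath, string $outPath): int
{
    // The suppression list is small, so load it fully and flip it into
    // array keys; isset() on a key avoids in_array()'s linear scan.
    $bad = array_flip(array_map('trim', file($badPath, FILE_IGNORE_NEW_LINES)));

    // Iterate the big list line by line instead of loading it all at once.
    $in = new SplFileObject($goodPath);
    $in->setFlags(SplFileObject::DROP_NEW_LINE);
    $out = new SplFileObject($outPath, 'w');

    $kept = 0;
    foreach ($in as $line) {
        $line = trim((string) $line); // also strips a stray "\r" from Windows files
        if ($line === '' || isset($bad[$line])) {
            continue; // skip blank lines and suppressed entries
        }
        $out->fwrite($line . "\n");
        $kept++;
    }
    return $kept; // number of lines written to the output file
}
```

    Calling scrubList('good.txt', 'bad.txt', 'clean.txt') would write the surviving lines to clean.txt rather than echoing 2.6 million lines into a textarea, which keeps memory use roughly flat regardless of the size of good.txt.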
    
     
    Last edited: Feb 13, 2012
  3. jbrewski

    jbrewski Junior Member

    Thank you, I will test this out.
     
  4. dtang4

    dtang4 Regular Member

    Joined:
    Apr 7, 2010
    Messages:
    291
    Likes Received:
    43
    You can add
    Code:
    set_time_limit(0);
    at the top of the script to remove the 30-second execution time limit.