[GET] Bulk Link Checker - (great to check the effectiveness of link bundles you buy)

SpamHat

( Disclaimer: This is a quick script written in 5 mins, not a work of art. It's a quick and dirty link checker - no more, no less. :) )

What It Does:

A few days ago I wrote a quick script in PHP to check a list of URLs for a string (my URL).
I wanted to check some of the link services here and on other forums, but didn't feel like manually checking 300 URLs to see if my links were there.
The script visits each URL in turn, checks the page for a unique string, and reports back on what it found: which URLs contain the string, which don't, and which returned nothing (timed out), with a count of each.

How To Use It:


  1. Change the $checkFor variable to the string to check for. This should be your URL, or any unique word/phrase you're looking for.
  2. Paste all the URLs you want to check into the $urlsToCheck variable, one per line. Extra whitespace at the beginning and end doesn't matter.
  3. Upload to your server and run the script in your web browser. You can also run it locally on LAMP/XAMPP if you like.
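
The check made on each page boils down to a three-way classification. A minimal sketch of that step, assuming a hypothetical helper called classifyPage (it is not in the script) and assuming, as the script does, that an empty or failed fetch is counted as a timeout:

```php
<?php
// Hypothetical helper (not part of the original script): classify one fetched page.
// Returns "timeout" for an empty or failed fetch, "good" if the string is present,
// and "bad" if the page loaded but the string is missing.
function classifyPage($html, string $checkFor): string
{
    if ($html === false || $html === null || $html === '') {
        return 'timeout';
    }
    return strpos($html, $checkFor) !== false ? 'good' : 'bad';
}

// Made-up page content, just to show the call.
echo classifyPage('<a href="http://example.com">my link</a>', 'example.com'), "\n";
```

Note that the match is case-sensitive, so the string has to appear on the page exactly as typed.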

The Script:

It's here for a quick sniff, copy and paste. I thought about uploading it as a file as well, but if I make any changes it will be easier to update like this.

It just uses straight cURL to grab the URLs one at a time, not curl_multi. Might do one with curl_multi (much faster, since it fetches pages in parallel) if anyone likes this one.
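
For anyone curious what the curl_multi approach might look like, here is a rough sketch. It is not the author's later version; fetchAll is a made-up name, and it fetches every URL at once with no cap on concurrency, which is an assumption that may not suit very long lists.

```php
<?php
// Sketch: fetch several URLs in parallel with curl_multi.
// fetchAll() is a hypothetical name. Returns [url => body, or false on failure].
// Duplicate URLs in the input collapse into a single entry.
function fetchAll(array $urls, int $timeout = 10): array
{
    if ($urls === []) {
        return [];
    }

    $mh = curl_multi_init();
    $handles = [];
    foreach ($urls as $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
        curl_multi_add_handle($mh, $ch);
        $handles[$url] = $ch;
    }

    // Drive all transfers until none are still running.
    do {
        $status = curl_multi_exec($mh, $running);
        if ($running > 0) {
            curl_multi_select($mh, 1.0); // wait for activity instead of busy-looping
        }
    } while ($running > 0 && $status === CURLM_OK);

    $results = [];
    foreach ($handles as $url => $ch) {
        $body = curl_multi_getcontent($ch);
        // An empty body is treated as a failed fetch, like the original script does.
        $results[$url] = ($body === null || $body === '') ? false : $body;
        curl_multi_remove_handle($mh, $ch);
    }
    curl_multi_close($mh);

    return $results;
}
```

The returned array can then be looped over and each body checked for the string in the same way the single-URL version does.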


PHP:
<?php
set_time_limit(0);

// Quick link checker script
// http://www.blackhatworld.com/blackhat-seo/black-hat-seo-tools/133787-get-bulk-link-checker-great-check-effectivness-link-bundles-you-buy.html
// by SpamHat

// How to use:
// 1. Change the $checkFor variable to the string to check for. This should be your url, or just any unique word/phrase you're looking for.
// 2. Paste in all the urls you want to check into the variable $urlsToCheck. It doesn't matter about extra space at the beginning and end. They should be 1 per line, like the example below.
// 3. Run the script in your web browser.


$checkFor = "google";

$urlsToCheck = "

http://www.google.com
http://www.yahoo.com
http://www.dmoz.org

";








/////////////////////////////////////////////////////////////////////////////////
// No need to edit below here unless you feel like messing around
/////////////////////////////////////////////////////////////////////////////////



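// Split into lines, trim each one, and drop blank lines
$urls = array_values(array_filter(array_map('trim', explode("\n", str_replace("\r", "", $urlsToCheck))), 'strlen'));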

$goodNum = 0;
$goodUrls = array();
$badNum = 0;
$badUrls = array();
$timeoutNum = 0;
$timeoutUrls = array();
$total = count($urls);

echo "<style>* {font-family:Verdana;} p,span {font-size:10px;} .r,.r a {color:red;} .b,.b a {color:blue;} .g,.g a {color:green;}</style>";
echo "<hr><h1>Checked URLs</h1>";

$i=0;
foreach($urls as $url) {
    
    $html = get($url);
    
    if(empty($html)) {
        // Empty or failed response: counted as a timeout
        $col = "b";
        ++$timeoutNum;
        $timeoutUrls[] = $url;
    } elseif(strpos($html, $checkFor) !== false) {
        $col = "g";
        ++$goodNum;
        $goodUrls[] = $url;
    } else {
        $col = "r";
        ++$badNum;
        $badUrls[] = $url;
    }
    
    $safeUrl = htmlspecialchars($url, ENT_QUOTES); // escape before printing into HTML
    echo "<span class='$col'><a href='$safeUrl'>$safeUrl</a></span><br/>";
    
    ++$i;
    // if($i >= 10) break;
    
}

echo "<hr><h1>Results</h1>";
echo "<p class='g' style='font-size:20px;'>Good: <b>$goodNum</b> / $total</p>";
echo "<p class='r' style='font-size:20px;'>Bad: <b>$badNum</b> / $total</p>";
echo "<p class='b' style='font-size:20px;'>TimedOut: <b>$timeoutNum</b> / $total</p>";


echo "<hr><h1>Good Urls ($goodNum)</h1><span>";
foreach($goodUrls as $url) { $safeUrl = htmlspecialchars($url, ENT_QUOTES); echo "<a class='g' href='$safeUrl'>$safeUrl</a><br/>"; }
echo "</span>";

echo "<hr><h1>Bad Urls ($badNum)</h1><span>";
foreach($badUrls as $url) { $safeUrl = htmlspecialchars($url, ENT_QUOTES); echo "<a class='r' href='$safeUrl'>$safeUrl</a><br/>"; }
echo "</span>";

echo "<hr><h1>Timed-Out Urls ($timeoutNum)</h1><span>";
foreach($timeoutUrls as $url) { $safeUrl = htmlspecialchars($url, ENT_QUOTES); echo "<a class='b' href='$safeUrl'>$safeUrl</a><br/>"; }
echo "</span>";



function get($url) {

    $curl = curl_init();
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)");
    curl_setopt($curl, CURLOPT_HTTPHEADER, array("Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5", "Cache-Control: max-age=0", "Connection: keep-alive", "Keep-Alive: 300", "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7", "Accept-Language: en-us,en;q=0.5", "Pragma: "));
    curl_setopt($curl, CURLOPT_ENCODING, "gzip,deflate");
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($curl, CURLOPT_TIMEOUT, 10);
    curl_setopt($curl, CURLOPT_AUTOREFERER, 1);
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1);
    $html = curl_exec($curl);
    curl_close($curl);
    return $html;

}

?>
So there you go.

May help you get an overview of the effectiveness of some of these link batches you buy.
I would advise checking once your link work's done, then again after a few days, and comparing results.

If you have any problems with it post here and I'll probably be able to help.

:)
 
Thanks. Can be used to do a variety of things.

Just started running the script on WAMP, let's see.
 
Yeah I thought it might be useful for other stuff too.
Might make a quick web interface for ease of use later.

Lemmie know how you get on.
 
^^ And why would that be?

Gotta tell me what happens or I can't help.

It does work, btw :D
 
Wow, thanks a lot, this is what I was looking for. I was actually checking my Angela's links one by one when I found your post. Great help. Can't wait to try it.
 
Works great... the links are shown in different colours, RGB as per the norm :D
Thanks

Features to add would be:
1. A report at the end in TXT
2. Multiple keywords

Another way to use this is to track your software or digital products,
to find out whether they're being shared on another forum.
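
The TXT report could be sketched along these lines. This is only an illustration: buildTextReport is a hypothetical helper, the layout of the report is an assumption, and the three arrays stand in for the $goodUrls, $badUrls and $timeoutUrls the script already builds.

```php
<?php
// Hypothetical helper: build a plain-text report from the three URL lists.
function buildTextReport(array $goodUrls, array $badUrls, array $timeoutUrls): string
{
    $sections = [
        'Good'      => $goodUrls,
        'Bad'       => $badUrls,
        'Timed-Out' => $timeoutUrls,
    ];

    $lines = [];
    foreach ($sections as $title => $urls) {
        $lines[] = $title . ' (' . count($urls) . ')';
        foreach ($urls as $url) {
            $lines[] = $url;
        }
        $lines[] = ''; // blank line between sections
    }

    return implode("\n", $lines);
}

// Made-up data, just to show the call.
$report = buildTextReport(['http://example.com/a'], [], ['http://example.org/b']);
echo $report;
```

Saving it is then one call to file_put_contents with whatever file name you like, provided the web server user can write to that folder.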
 
Will make a nicer version with curl_multi, web interface and reports when I've got a spare hour then :)

@ST - when you say multiple keywords what exactly are you looking for? Would it not just be easier to run the script twice?

btw, thanks or rep appreciated :)
 
^^ If PHP's throwing a set_time_limit error at you then you're most likely on some crappy shared hosting with limited script execution time.

That's your problem and not the script's, sorry.

Try it on LAMP/XAMPP locally and it'll work fine.

Or alternatively remove that line, but then the script will be limited to whatever your host allows (probably something like 30 seconds), so you won't get the most out of it. Run it locally.
 
^^ Yes.... when they timeout.

Change this line
Code:
curl_setopt($curl, CURLOPT_TIMEOUT, 10);
To this
Code:
curl_setopt($curl, CURLOPT_TIMEOUT, 60);
and it may take longer, but slow pages should no longer time out.
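
Worth knowing: the script counts any empty response as a timeout, so a dead domain ends up in the same pile as a slow page. A rough sketch of telling them apart, assuming two hypothetical helpers (describeCurlError and getWithReason) and a separate connect timeout, neither of which is in the script:

```php
<?php
// Hypothetical helper: turn a cURL error number into a short reason.
function describeCurlError(int $errno): string
{
    if ($errno === 0) {
        return 'ok';
    }
    if ($errno === 28) { // 28 is libcurl's "operation timed out" error code
        return 'timeout';
    }
    return 'error'; // DNS failure, refused connection, and so on
}

// Hypothetical variant of get(): returns the body plus the reason it failed, if it did.
// The default timeout values are assumptions; tune them to taste.
function getWithReason(string $url, int $connectTimeout = 10, int $totalTimeout = 60): array
{
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, $connectTimeout); // time allowed to connect
    curl_setopt($curl, CURLOPT_TIMEOUT, $totalTimeout);          // time allowed for the whole transfer
    $html = curl_exec($curl);
    $reason = describeCurlError(curl_errno($curl));
    return ['html' => $html, 'reason' => $reason];
}
```

With a short connect timeout and a longer total timeout, dead hosts fail fast while slow pages still get a chance to finish.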
 
Hey SpamHat. Thanks for the script.

It works perfectly fine for me!

One request!

Make the script so that a user can enter the values from the link rather than by editing the script!
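
One way that request might be handled is a small form that posts the values back to the script. This is a sketch only, not the GUI version mentioned in the thread; parseUrlList and the field names checkFor and urls are made up for the example.

```php
<?php
// Hypothetical helper: turn pasted text into a clean list of URLs, one per line.
function parseUrlList(string $raw): array
{
    $lines = preg_split('/\r\n|\r|\n/', $raw);
    $lines = array_map('trim', $lines);
    return array_values(array_filter($lines, 'strlen')); // drop blank lines
}

// Field names "checkFor" and "urls" are assumptions for this sketch.
$checkFor = (isset($_POST['checkFor']) && is_string($_POST['checkFor'])) ? trim($_POST['checkFor']) : '';
$urls     = (isset($_POST['urls']) && is_string($_POST['urls'])) ? parseUrlList($_POST['urls']) : [];

if ($checkFor === '' || $urls === []) {
    // Nothing usable submitted yet: show the form.
    echo '<form method="post">'
       . 'Check for: <input type="text" name="checkFor"><br>'
       . 'URLs (one per line):<br>'
       . '<textarea name="urls" rows="10" cols="80"></textarea><br>'
       . '<input type="submit" value="Check">'
       . '</form>';
}
// Otherwise $checkFor and $urls are ready to feed into the checking loop.
```

If you put something like this on a public server, anyone who finds the page can make your server fetch URLs for them, so it is probably best kept local or behind a password.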
 
^^ I already updated it with curl_multi and a simple GUI.

Will post it on the blog once it's cleaned up.
 
dude, this is freakin cool. thank you tons..

to all of you that can't figure this out... STOP... RELAX... READ HIS DIRECTIONS.

I've rarely done anything like this. I googled a few things I didn't understand and came up with the answer in 5 minutes.. had it working in 10, and I'm almost done CHECKING the work I've outsourced..


KILLER script. kudos, thanks, rep, everything.
 
works really well for me, will be using this quite a lot so thank you very much.
 
I mean I am using it to find my ads! The ads are rotated, with different URLs and keywords in them...

So it would be nice if the script checked every URL for all the keywords I want. The ads are placed randomly, so I have no clue which ad is posted on a particular URL, and the script would need to go through all the keywords I used for the adverts.
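
Checking each page for a whole list of keywords could look something like this. A sketch under assumptions: findKeywords is a hypothetical helper, and the match is case-insensitive here, unlike the original script's single-string check.

```php
<?php
// Hypothetical helper: return the keywords from $keywords that appear in $html.
function findKeywords(string $html, array $keywords): array
{
    $found = [];
    foreach ($keywords as $keyword) {
        // Skip empty keywords; stripos() is a case-insensitive substring search.
        if ($keyword !== '' && stripos($html, $keyword) !== false) {
            $found[] = $keyword;
        }
    }
    return $found;
}

// Made-up page content and keywords, just to show the call.
$html = '<p>Cheap Widgets on sale, visit http://example.com/widgets</p>';
print_r(findKeywords($html, ['cheap widgets', 'blue gadgets', 'example.com']));
```

A page would then count as good if the returned list is not empty, and the list itself tells you which ad landed on that URL.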
 
This works great. I have used it to check a few links I have paid for and the script worked perfectly.
 
Worked perfectly for me on the 1st shot. Thanks a lot for this, it saves me a bunch of time. Now if you want to take it to the next level and build something I would love to pay for, here's a description of my dream link checking software.

LIST OF WHAT MY DREAM LINK CHECKING SOFTWARE WOULD DO
1.) Allows me to enter the keyword to search for, just like yours does.
2.) Allows me to enter a list of URLs, just like yours does.
3.) Produces a report that shows me the PR of each of the pages where the link was found.
4.) Lists the # of outgoing links on each of those pages.

That would be hella sweet. Let me know if you can script something like this and if I can help you fund the project :) Thanks again
 
Nice script. It would be useful to make it search for links in a page.
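
Looking only at the links on a page, and counting the outgoing ones as asked a couple of posts up, might be sketched like this. All three function names are made up, and the regex is a rough assumption that only handles quoted href values; PR is not covered.

```php
<?php
// Hypothetical helper: pull href values out of anchor tags with a regex.
// A regex is a rough tool for HTML and will miss unusual markup.
function extractHrefs(string $html): array
{
    preg_match_all('/<a\s[^>]*href\s*=\s*(["\'])(.*?)\1/is', $html, $matches);
    return $matches[2];
}

// Count links that point at a host other than $pageHost.
// Relative links have no host and are not counted.
function countOutgoingLinks(string $html, string $pageHost): int
{
    $count = 0;
    foreach (extractHrefs($html) as $href) {
        $host = parse_url($href, PHP_URL_HOST);
        if (is_string($host) && strcasecmp($host, $pageHost) !== 0) {
            ++$count;
        }
    }
    return $count;
}

// True if any link on the page points at a URL containing $checkFor.
function pageLinksTo(string $html, string $checkFor): bool
{
    foreach (extractHrefs($html) as $href) {
        if (stripos($href, $checkFor) !== false) {
            return true;
        }
    }
    return false;
}
```

Using pageLinksTo instead of a plain text search means a page that merely mentions your URL without linking to it would no longer count as good.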
 