How to scrape Title & Description from Yahoo Answers

 

  1. #1
    issem10 is offline Junior Member
    Join Date
    Aug 2010
    Posts
    146
    Thanks
    6
    Thanked 9 Times in 9 Posts

    Hello. How can I scrape the title and description from Yahoo Answers and export them to a txt or similar format? Does anyone know of software or a script that does this?




  2. #2
    shubhamm is offline Registered Member
    Join Date
    Jan 2010
    Location
    BHW
    Posts
    97
    Thanks
    44
    Thanked 16 Times in 15 Posts

    You will have to build it yourself; I don't think there is a ready-made script for it.

    Here is the process:

    Take the URL as input, from a textbox or as a string, in PHP or Perl

    Request that URL using cURL

    Read the response

    Use a regex to extract the title and description

    It's simple. Also set a user agent in the cURL request
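The steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not a tested scraper: the function names `fetch_page` and `extract_title_and_description` and the User-Agent string are made up for illustration, and the `<div class="content"` marker is borrowed from the code posted further down this thread, so it may not match the live page markup.

```php
<?php
// Hypothetical helper: fetch a page with cURL, sending a User-Agent header.
function fetch_page(string $url): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // follow redirects
    // Assumed User-Agent string; replace it with whatever suits your case.
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (compatible; ExampleScraper/1.0)');
    $html = curl_exec($ch);
    return $html === false ? '' : $html;
}

// Hypothetical helper: pull the title and description out of the HTML with regexes.
function extract_title_and_description(string $html): array
{
    $title = '';
    $description = '';

    // Everything between <title> and </title>.
    if (preg_match('~<title>(.*?)</title>~is', $html, $m)) {
        $title = trim(html_entity_decode($m[1], ENT_QUOTES, 'UTF-8'));
    }

    // Everything between the first <div class="content"...> and the next </div>.
    // Assumed marker; nested <div> tags inside the description would cut it short.
    if (preg_match('~<div class="content"[^>]*>(.*?)</div>~is', $html, $m)) {
        $description = trim(html_entity_decode(strip_tags($m[1]), ENT_QUOTES, 'UTF-8'));
    }

    return ['title' => $title, 'description' => $description];
}
```

You would call it as `extract_title_and_description(fetch_page($url))`. Keeping the fetch and the extraction separate means the regex part can be tried against a saved copy of the page without hitting the site each time.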

  3. #3
    Lyscer is offline Jr. VIP
    Join Date
    Jun 2012
    Posts
    109
    Thanks
    10
    Thanked 40 Times in 23 Posts

    This is exactly how I would do it too. If you are looking to do a mass amount of these, you will more than likely find a pattern or some unique thing once you have done a couple that will allow you to easily progress from one Q/A to another.
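For the "mass amount" case and the txt export asked about in the first post, the output side could look something like this. It is only a sketch: `format_record` and `export_records` are hypothetical names, and each record is assumed to be an array with `title` and `description` keys that you have already scraped.

```php
<?php
// Hypothetical helper: turn one scraped record into a plain-text block.
function format_record(array $record): string
{
    return "Title: {$record['title']}\nDescription: {$record['description']}\n\n";
}

// Hypothetical helper: append every record to a text file.
// Returns the number of records that were written successfully.
function export_records(array $records, string $path): int
{
    $written = 0;
    foreach ($records as $record) {
        // FILE_APPEND adds to the end of the file; LOCK_EX guards against concurrent writers.
        if (file_put_contents($path, format_record($record), FILE_APPEND | LOCK_EX) !== false) {
            $written++;
        }
    }
    return $written;
}
```

In practice you would loop over your list of question URLs, scrape each one into a record, and pass the collected records to `export_records($records, 'output.txt')`.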

  4. #4
    wkirk is offline Junior Member
    Join Date
    Apr 2011
    Posts
    138
    Thanks
    85
    Thanked 60 Times in 37 Posts

    I think there is a programming/scripting section here somewhere, but here you go. I made it as detailed and simple as I could.

    Code:
    <?php
    // Target Yahoo Answers URL, passed as the first command-line argument.
    if (!isset($argv[1])) {
        exit("Usage: php " . $argv[0] . " <question-url>\n");
    }
    $url = $argv[1];

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $output = curl_exec($ch);
    curl_close($ch);

    if ($output === false) {
        exit("Request failed\n");
    }

    // The page is now in $output; extract the title.
    $title_start = stripos($output, "<title>");
    $title_end = stripos($output, "</title>");
    if ($title_start !== false && $title_end !== false) {
        $title_start += 7; // skip past "<title>"
        $title_text = substr($output, $title_start, $title_end - $title_start);
        $title = str_replace(" - Yahoo! Answers", "", $title_text);
        echo "\nTitle: $title\n";
    }

    // Same approach for the description: find the opening div, then the next </div>.
    $div_start = stripos($output, '<div class="content"');
    if ($div_start !== false) {
        $tag_end = strpos($output, ">", $div_start); // end of the opening tag
        $description_end = ($tag_end === false) ? false : stripos($output, "</div>", $tag_end);
        if ($description_end !== false) {
            $description = substr($output, $tag_end + 1, $description_end - $tag_end - 1);
            echo "Description: " . trim($description) . "\n";
        }
    }
    ?>
    How to use: run the script from the command line with the question URL as its first argument. (The original post attached a screenshot here: yanswers.png.)

  5. #5
    CodingAndStuff is offline Junior Member
    Join Date
    May 2012
    Location
    You can't have my bots. Sorry :'(
    Posts
    183
    Thanks
    23
    Thanked 63 Times in 41 Posts

    I'd use the "Simple HTML Dom Parser" library found here: http://simplehtmldom.sourceforge.net/

    Then you'd do something like this:

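    Code:
    <?php
    require_once("simple_html_dom.php");
    $question_id = "Put the question ID here";
    $html = file_get_html("http://answers.yahoo.com/question/" . $question_id);
    if (!$html) {
        exit("Could not load the page\n");
    }

    // find() with an index returns a single element (or null) instead of an array.
    $title = $html->find('h1.subject', 0);
    $description = $html->find('div.content', 0);

    // plaintext gives the element's text without the surrounding tags.
    echo "Title is: " . ($title ? trim($title->plaintext) : "")
        . " and Description is: " . ($description ? trim($description->plaintext) : "");
    ?>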
    I didn't test it, but it should work fine.
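If you would rather not add a third-party library, roughly the same lookup can be sketched with PHP's bundled DOM extension. This is an alternative to the Simple HTML DOM code above, not a drop-in copy of it: `extract_with_dom` is a hypothetical name, and the `h1.subject` and `div.content` selectors are simply the ones used in this thread, so they may not match the current page markup.

```php
<?php
// Hypothetical helper: extract title and description using DOMDocument and XPath.
function extract_with_dom(string $html): array
{
    if ($html === '') {
        return ['title' => '', 'description' => ''];
    }

    $doc = new DOMDocument();
    // Real pages are rarely valid HTML; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // Return the trimmed text of the first node matching the query, or '' if none match.
    $first = function (string $query) use ($xpath): string {
        $nodes = $xpath->query($query);
        return ($nodes !== false && $nodes->length > 0)
            ? trim($nodes->item(0)->textContent)
            : '';
    };

    // XPath equivalents of the CSS selectors h1.subject and div.content.
    return [
        'title' => $first('//h1[contains(concat(" ", normalize-space(@class), " "), " subject ")]'),
        'description' => $first('//div[contains(concat(" ", normalize-space(@class), " "), " content ")]'),
    ];
}
```

Unlike the regex and `substr` approaches, this copes with extra attributes on the tags and with nested markup inside the description.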

