Help scraping from JSON (novice scraper running into issues)

Discussion in 'General Scripting Chat' started by ad9051, May 15, 2013.

  1. ad9051

    ad9051 Registered Member

    Joined:
    Nov 7, 2012
    Messages:
    57
    Likes Received:
    3
    Hello, I am trying to scrape the contents of a website, but the data I need from the page is hidden in JSON. I have tried loading the JSON, but it returns 10 random users, whereas the website contains 1400 users in total. Is it still possible to scrape data from JSON?

    Note: this scraping doesn't involve personal details (no emails, no names or anything like that), just in case anyone questions the morality of the scraping.

    Any help or comments would be greatly appreciated.

    Kind regards

    Alan
     
  2. MrBlue

    MrBlue Senior Member

    Joined:
    Dec 18, 2009
    Messages:
    950
    Likes Received:
    662
    Occupation:
    Web/Bot Developer
    If you are not afraid of a little server-side JavaScript, take a look at Substack's "json-scrape" package. I use it a lot and it works very well.

    Code:
    https://github.com/substack/json-scrape
     
  3. ad9051

    Thanks for the quick reply, MrBlue. Is this the only way to scrape from JSON, or are there other ways? I am hoping to create a script that scrapes the website every x minutes on repeat, so using a program to scrape the JSON manually is far from ideal.

    Many thanks

    Alan
     
  4. MrBlue

    There are many ways to skin a cat :) Do you have a platform and language preference? Automating your script to run on a schedule can easily be achieved using cron jobs on Linux. Feel free to PM me the URL of the site in question. I'll gladly take a look and give you my opinion.

    Cheers!

     
  5. ad9051

    Thanks MrBlue, I'll send you a DM :)

     
  6. ad9051

    Ahh, apparently I need to have 15 posts on the forum before I can DM people. Not sure if you can DM me? If not, I can Skype chat or email you, if that is agreeable with you. Cheers, Alan
     
  7. Zapdos

    Zapdos Power Member

    Joined:
    Oct 22, 2011
    Messages:
    597
    Likes Received:
    708
    Location:
    Eastern North Carolina
    http://php.net/manual/en/function.json-decode.php
    Include that with a cron job and voila.

    edit:
    Some pseudo-code:

    Code:
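    // Placeholder URL; replace it with the real JSON endpoint.
    $url = "http://www.website.com/json";
    // Pass true so json_decode returns associative arrays instead of objects.
    $contents = json_decode(file_get_contents($url), true);
    $username = $contents['user']['username'];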
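
    A slightly fuller sketch of the same idea, in case it helps. It assumes the JSON has a top-level "users" list whose entries each carry a "username" field; those key names, the extract_usernames helper and the URL are all made up for illustration, so adjust them to whatever the real response looks like. It only shows decoding with basic error checking, and it does not address why the endpoint returns 10 users rather than all 1400.

```php
<?php
// Sketch only: the "users"/"username" keys and the URL are assumptions,
// not the real site's structure.

/**
 * Decode a JSON string and return the usernames it contains.
 * Throws RuntimeException if the payload is not valid JSON.
 */
function extract_usernames(string $json): array
{
    // Passing true makes json_decode return associative arrays.
    $data = json_decode($json, true);
    if (json_last_error() !== JSON_ERROR_NONE) {
        throw new RuntimeException('Invalid JSON: ' . json_last_error_msg());
    }

    // Fall back to an empty list if the assumed "users" key is absent.
    $users = is_array($data) ? ($data['users'] ?? []) : [];

    // array_column skips entries that have no "username" key.
    return is_array($users) ? array_column($users, 'username') : [];
}

// Inline sample so the sketch runs without network access.
$sample = '{"users": [{"username": "alice"}, {"username": "bob"}]}';
print_r(extract_usernames($sample));

// Against a live endpoint (hypothetical URL):
// $json = file_get_contents('http://www.website.com/json');
// if ($json !== false) {
//     print_r(extract_usernames($json));
// }
```

    Keeping the decoding in its own function, separate from the fetch, means you can test it against a saved copy of the response before pointing the cron job at the live site.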
    
     
  8. MrBlue

    PM sent. Add me to Skype.

    Cheers!