
php curl

Discussion in 'Other Languages' started by blackbob1, Aug 1, 2013.

  1. blackbob1

    blackbob1 Newbie

    Joined:
    Jul 30, 2013
    Messages:
    5
    Likes Received:
    0
    I'm using PHP curl to do some webpage scraping (can't post the URL). It works on the root domain name, but I seem to be having a problem with the page I actually want to scrape, and I can only think it's related to the URL structure. I can't, and don't really want to, post the exact URL, but it looks like this: [domain].co.uk/word-word-123-c.asp. Would this cause a problem with curl, and how do I get around it?

    Note:
    I don't use PHP.
     
  2. TheeAriGrande

    TheeAriGrande Regular Member

    Joined:
    Jul 14, 2013
    Messages:
    270
    Likes Received:
    151
    Location:
    Candlestick Park
    I'm a little lost. What exactly is the problem? PM me if you don't want to post the URL and I'll help you out. I have plenty of experience scraping sites.
     
  3. innozemec

    innozemec Jr. VIP

    Joined:
    Aug 19, 2011
    Messages:
    5,288
    Likes Received:
    1,799
    Location:
    www.Indexification.com
  4. LukeX

    LukeX Newbie

    Joined:
    Aug 28, 2013
    Messages:
    19
    Likes Received:
    2
    Location:
    West Coast
    Apologies if this is incorrect, but shouldn't this be in the scripting section? You may get more help from people who know PHP there.

    Back on topic:
    There are a ton of scraping scripts out there that can probably do what you want. You may have some luck finding one that is close to what you need and then pulling the code out of it. If you don't know PHP, a great way to learn is to look at working code and figure out how to make it do what you want.

    Best of luck!
     
  5. somethingclever

    somethingclever Newbie

    Joined:
    Nov 26, 2008
    Messages:
    12
    Likes Received:
    3
    Occupation:
    Anything that puts money in my pocket
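    An easy approach could be:
    PHP:
    // The scheme is required; without it PHP looks for a local file. http:// is assumed here.
    $html = file_get_contents("http://[domain].co.uk/word-word-123-c.asp");
    But curl gives you more control (timeouts, headers, redirects, error reporting).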
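
    For comparison, here is a minimal sketch of the same fetch done with curl. The helper names (build_fetch_options, fetch_page), the timeout, the redirect limit and the user-agent string are all assumptions made for illustration; nobody in the thread posted them.

```php
<?php
// Hypothetical helper: builds the cURL options for a plain GET request.
function build_fetch_options(string $url, int $timeoutSeconds = 15): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follow HTTP redirects
        CURLOPT_MAXREDIRS      => 5,     // assumed limit, adjust as needed
        CURLOPT_TIMEOUT        => $timeoutSeconds,
        // Assumed user-agent string; some sites reject requests that send none.
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (compatible; ExampleScraper/1.0)',
    ];
}

// Hypothetical helper: fetches a page and returns its body, or null on failure.
function fetch_page(string $url): ?string
{
    $ch = curl_init();
    curl_setopt_array($ch, build_fetch_options($url));
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}
```

    You would call it as fetch_page('http://[domain].co.uk/word-word-123-c.asp'). Hyphens and digits are ordinary URL path characters and need no escaping, so a path like word-word-123-c.asp should not by itself trip up curl.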
     
  6. reapV

    reapV Registered Member

    Joined:
    Jan 27, 2014
    Messages:
    56
    Likes Received:
    10
    Try your code on a simple site first before you move on to the site you really want to scrape. Something else on that site may be causing the trouble.
     
  7. solventnine

    solventnine Junior Member

    Joined:
    Dec 4, 2009
    Messages:
    113
    Likes Received:
    16
    Try executing curl or wget from the command line. If something on the site is blocking you, you'll see the errors immediately without having to dig through wherever the web server is logging them. Alternatively, start reading through your error logs (if you're using cPanel, they should be easily accessible either directly through cPanel or over FTP).
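
    If the command line is not an option, much the same information can be pulled out of PHP itself. This is only a rough sketch: describe_result and diagnose are hypothetical names, and the wording of the messages is made up. It relies on curl_errno, curl_error and curl_getinfo to tell a transport failure (DNS, timeout, TLS) apart from an HTTP-level refusal such as a 403.

```php
<?php
// Hypothetical helper: turns cURL's raw results into a one-line diagnosis.
function describe_result(int $errno, string $error, int $status): string
{
    if ($errno !== 0) {
        return "cURL error $errno: $error"; // transport-level failure
    }
    if ($status >= 400) {
        return "HTTP $status: the server refused or could not serve the page";
    }
    if ($status >= 300) {
        return "HTTP $status: redirect, consider CURLOPT_FOLLOWLOCATION";
    }
    return "HTTP $status: OK";
}

// Hypothetical helper: requests $url and reports what happened.
function diagnose(string $url): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // keep the body out of the output
    curl_setopt($ch, CURLOPT_TIMEOUT, 15);          // assumed timeout in seconds
    curl_exec($ch);
    $status = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    return describe_result(curl_errno($ch), curl_error($ch), $status);
}
```

    Running echo diagnose('http://[domain].co.uk/word-word-123-c.asp'); against both the root domain and the problem page should show whether the two requests fail differently.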
     
  8. netbull2007

    netbull2007 Newbie

    Joined:
    Mar 7, 2014
    Messages:
    8
    Likes Received:
    6
    It's a good idea to tell us what your problem is. But if the URL you want to scrape is secure (HTTPS), add the following options:
    Code:
    // Warning: these two options turn off TLS certificate checks; use them for debugging only.
    curl_setopt($cn, CURLOPT_SSL_VERIFYHOST, 0);
    curl_setopt($cn, CURLOPT_SSL_VERIFYPEER, false);
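
    An alternative to switching the checks off, sketched here under assumptions, is to leave verification on and point curl at a CA bundle when the system default is missing. The function name build_https_options and the bundle path in the usage note are hypothetical; this swaps in a different technique from the one posted above.

```php
<?php
// Hypothetical helper: HTTPS options that keep certificate checks enabled.
function build_https_options(string $url, ?string $caBundlePath = null): array
{
    $options = [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_SSL_VERIFYPEER => true, // verify the certificate chain
        CURLOPT_SSL_VERIFYHOST => 2,    // verify the certificate matches the host name
    ];
    if ($caBundlePath !== null) {
        // Path to a CA bundle file; the location is an assumption of the caller.
        $options[CURLOPT_CAINFO] = $caBundlePath;
    }
    return $options;
}
```

    It would be applied with something like curl_setopt_array($cn, build_https_options($url, '/path/to/cacert.pem')); where the bundle path is a placeholder for wherever your host keeps its certificates.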