php curl

Discussion in 'Other Languages' started by blackbob1, Aug 1, 2013.

  1. blackbob1

    blackbob1 Newbie

    Joined:
    Jul 30, 2013
    Messages:
    5
    Likes Received:
    0
    I'm using PHP curl to do some webpage scraping (can't post the URL). It works on the root domain name, but I seem to be having a problem with the page I want to scrape, and I can only think it's related to the URL structure. I can't, and don't really want to, post the exact URL, but it looks like this: [domain].co.uk/word-word-123-c.asp. Would this cause a problem with curl, and how do I get around it?

    note:
    I don't use PHP.
     
  2. TheeAriGrande

    TheeAriGrande Regular Member

    Joined:
    Jul 14, 2013
    Messages:
    270
    Likes Received:
    153
    Location:
    Candlestick Park
    I'm a little lost. What exactly is the problem? PM me if you don't want to post the URL, and I'll help you out. I have plenty of experience scraping sites.
     
  3. innozemec

    innozemec Jr. VIP

    Joined:
    Aug 19, 2011
    Messages:
    5,347
    Likes Received:
    1,802
    Location:
    www.Indexification.com
  4. LukeX

    LukeX Newbie

    Joined:
    Aug 28, 2013
    Messages:
    22
    Likes Received:
    2
    Location:
    West Coast
    Apologies if this is incorrect, but shouldn't this be in the scripting section? You may get more help there from people who know PHP.

    Back on topic:
    There are a ton of scraping scripts out there that can probably do what you want. You may have some luck finding one similar to what you need and then pulling the code out of it. If you don't know PHP, a great way to learn is to look at working code and figure out how to make it do what you want.

    Best of luck!
     
  5. somethingclever

    somethingclever Newbie

    Joined:
    Nov 26, 2008
    Messages:
    24
    Likes Received:
    4
    Gender:
    Male
    Occupation:
    Anything that puts money in my pocket
    Location:
    in the Ether
    An easy approach could be
    PHP:
    file_get_contents("[domain].co.uk/word-word-123-c.asp")
    but curl does have more benefits.
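One thing worth checking with this approach: as far as I know, file_get_contents() treats a string with no scheme as a local file path, so the address needs http:// or https:// in front before it is fetched as a URL. Here is a minimal sketch of that idea. The helper names ensure_scheme and fetch_page are made up for illustration, and it assumes allow_url_fopen is enabled on the server.

```php
<?php
// Hypothetical helper: prepend a scheme when the URL has none, because
// file_get_contents() treats a scheme-less string as a local file path.
function ensure_scheme(string $url, string $default = 'http'): string
{
    // Matches "http://", "https://", "file://", etc. at the start.
    if (preg_match('#^[a-z][a-z0-9+.\-]*://#i', $url) === 1) {
        return $url;
    }
    return $default . '://' . ltrim($url, '/');
}

// Hypothetical wrapper: returns the body as a string, or null on failure.
function fetch_page(string $url): ?string
{
    // @ suppresses the warning; the false return value is checked instead.
    $body = @file_get_contents(ensure_scheme($url));
    return $body === false ? null : $body;
}
```

With the placeholder from the original post, the call would look like fetch_page('[domain].co.uk/word-word-123-c.asp'). A null result only says the fetch failed, not why, which is one reason curl can be the better tool for debugging.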
     
  6. reapV

    reapV Registered Member

    Joined:
    Jan 27, 2014
    Messages:
    56
    Likes Received:
    10
    Try your code on a simple site first before you move to the site you really want to scrape. Maybe there is some other stuff causing trouble on that site you want to scrape.
     
  7. solventnine

    solventnine Junior Member

    Joined:
    Dec 4, 2009
    Messages:
    113
    Likes Received:
    16
    Try executing curl or wget from the command line. If something on the site is blocking you, you'll be able to see your errors immediately without having to go through wherever the web server is logging them. Alternately, start reading through your error logs (if you're using cPanel, they should be easily accessible to you either directly through cPanel or in FTP).
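If you would rather stay inside PHP, a rough equivalent is to ask curl itself what went wrong rather than only looking at the returned body. The sketch below is one way to do that, not the only one. The function names, the timeout, and the user agent string are all assumptions for illustration.

```php
<?php
// Hypothetical helper: fetch a URL with curl and keep the diagnostics
// instead of silently returning false.
function fetch_with_diagnostics(string $url): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body as a string
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow 3xx redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, 15);          // give up after 15 seconds
    // Assumed user agent string; some sites reject requests without one.
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (compatible; test-fetch)');

    $body = curl_exec($ch);
    return [
        'body'   => $body === false ? null : $body,
        'errno'  => curl_errno($ch), // 0 means no transport-level error
        'error'  => curl_error($ch), // human-readable curl message
        'status' => (int) curl_getinfo($ch, CURLINFO_HTTP_CODE),
    ];
}

// Turn the diagnostics into a one-line summary for logging.
function summarize_fetch(array $r): string
{
    if ($r['errno'] !== 0) {
        return "curl error {$r['errno']}: {$r['error']}";
    }
    if ($r['status'] >= 400) {
        return "HTTP {$r['status']} (request reached the server but was refused or failed)";
    }
    return "OK, HTTP {$r['status']}, " . strlen((string) $r['body']) . ' bytes';
}
```

Calling echo summarize_fetch(fetch_with_diagnostics($url)); separates "curl could not connect at all" from "the server answered with an error status", which is usually the first thing to pin down when a site seems to be blocking you.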
     
  8. netbull2007

    netbull2007 Newbie

    Joined:
    Mar 7, 2014
    Messages:
    8
    Likes Received:
    6
    It's a good idea to tell us what your problem is. But if the URL you want to scrape is secure (HTTPS), add the following options:
    Code:
    curl_setopt($cn, CURLOPT_SSL_VERIFYHOST, 0);
    curl_setopt($cn, CURLOPT_SSL_VERIFYPEER, 0);
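Worth noting that these two options switch certificate checking off. That gets past certificate errors, but it also removes the protection against someone sitting between you and the site. If the underlying problem is a missing CA bundle, an alternative is to leave verification on and point curl at a bundle. A minimal sketch, assuming a made-up helper name; any bundle path you pass in is your own choice, not something from this thread.

```php
<?php
// Hypothetical helper: keep TLS verification on and, optionally, tell curl
// where to find a CA bundle. The path passed in is an assumption; use
// wherever your system or host keeps its bundle.
function enable_tls_verification($ch, ?string $caBundle = null): bool
{
    $ok = curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, true);     // check the certificate chain
    $ok = curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2) && $ok; // check the host name matches
    if ($caBundle !== null) {
        $ok = curl_setopt($ch, CURLOPT_CAINFO, $caBundle) && $ok;
    }
    return $ok;
}
```

Usage would be something like $cn = curl_init($url); enable_tls_verification($cn, '/path/to/cacert.pem'); where the path is a placeholder.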
    
     
  9. ComputerHelp808

    ComputerHelp808 Newbie

    Joined:
    Nov 8, 2017
    Messages:
    15
    Likes Received:
    2
    Gender:
    Male
    Occupation:
    In home computer tutor and website developer
    Location:
    Honolulu
    You could use the get_file_contents just to see if you get an error. I have a curl function but don't get an error from it if the website is protected.

    Rick
     
  10. bigot

    bigot Registered Member

    Joined:
    May 9, 2017
    Messages:
    54
    Likes Received:
    16
    Gender:
    Male
    Occupation:
    Programmer
    Location:
    Canada

    Did you mean file_get_contents?
     
  11. itz_styx

    itz_styx Jr. VIP

    Joined:
    May 8, 2012
    Messages:
    633
    Likes Received:
    291
    Occupation:
    CEO / Admin / Developer
    Location:
    /dev/mem
    It would help if you told us what the actual error is; otherwise nobody can really help you :)

    edit: should have checked the date of the OP :p
     
  12. bigot

    bigot Registered Member

    Joined:
    May 9, 2017
    Messages:
    54
    Likes Received:
    16
    Gender:
    Male
    Occupation:
    Programmer
    Location:
    Canada
    Lmao I completely missed it. I blame ComputerHelp808