
Some simple PHP/RegEx Questions

Discussion in 'PHP & Perl' started by altschule, Nov 19, 2010.

  1. altschule

    altschule Regular Member

    Joined:
    Sep 1, 2010
    Messages:
    282
    Likes Received:
    185
    Location:
    Sector 9
    I have a few years of PHP experience under my belt, but not much experience with this exact type of work. I don't need hand-holding, merely suggestions on the best way to go about this. I know I'm going to need some regexes, which I'm horrible at. If anyone's got any example source code, it'd be awesome.

    Basically, for personal use, I want to import a page using cURL (or whatever) and find all the links. I'll analyze them to find the ones I want, then import those pages one by one, scrape the text inside a certain <div> on each, and save it to the database.

    The analyzing, saving to the database, and whatnot I can handle no problem, but can you fellas help me out with how I should import the pages using cURL and how I could find the blocks of text I need using regexes? Thanks in advance.
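For readers following along, here is a minimal sketch of the fetch-then-regex approach the question describes. It is not taken from any reply in the thread. The function names (`fetch_page`, `extract_links`, `extract_div_text`), the user-agent string, and the timeout values are all illustrative assumptions. The regexes only cope with quoted attributes, and the div pattern stops at the first closing `</div>`, so nested divs will trip it up.

```php
<?php
// Fetch a page body with cURL. Returns null on failure.
// (Hypothetical helper; the option values are assumptions to tune per site.)
function fetch_page(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follow HTTP redirects
        CURLOPT_CONNECTTIMEOUT => 10,    // seconds to wait for the connection
        CURLOPT_TIMEOUT        => 30,    // seconds allowed for the whole request
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (compatible; example-scraper)',
    ]);
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}

// Collect the unique href values of all <a> tags with quoted attributes.
function extract_links(string $html): array
{
    preg_match_all('/<a\s[^>]*?href\s*=\s*(["\'])(.*?)\1/is', $html, $m);
    return array_values(array_unique($m[2]));
}

// Return the plain text inside <div id="$id">, or null if it is not found.
// Limitation: matching ends at the first </div>, so nested divs are cut short.
function extract_div_text(string $html, string $id): ?string
{
    $pattern = '/<div\s[^>]*?id\s*=\s*(["\'])' . preg_quote($id, '/')
             . '\1[^>]*>(.*?)<\/div>/is';
    if (!preg_match($pattern, $html, $m)) {
        return null;
    }
    return trim(html_entity_decode(strip_tags($m[2]), ENT_QUOTES, 'UTF-8'));
}
```

Typical use would be something like `$html = fetch_page($url);` followed by `extract_links($html)` to pick the pages to visit, then `extract_div_text($pageHtml, 'content')` on each one, where `'content'` is a made-up id standing in for whatever the target site actually uses.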
     
  2. heiska

    heiska Junior Member

    Joined:
    Dec 5, 2008
    Messages:
    138
    Likes Received:
    169
  3. cnick79

    cnick79 Jr. VIP Jr. VIP

    Joined:
    Jun 10, 2010
    Messages:
    653
    Likes Received:
    341
    Location:
    Google's SandBox
  4. altschule

  5. jazzc

    jazzc Moderator Staff Member Moderator Jr. VIP

    Joined:
    Jan 27, 2009
    Messages:
    2,468
    Likes Received:
    10,148
    I used simplehtmldom in the past; it can get VERY UGLY with memory.

    Use Zend Framework's DOM implementation instead. Much better. Not perfect, but very nice.
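As an illustration of the DOM-based approach recommended here, below is a rough sketch using PHP's built-in `DOMDocument` and `DOMXPath` classes. Note the swap: this is not Zend's DOM component, just the standard DOM extension, chosen so the example runs without extra libraries. The helper names (`load_html`, `dom_links`, `dom_div_text`) are hypothetical, and the sketch assumes the div id contains no double quotes and that the HTML string is non-empty.

```php
<?php
// Parse an HTML string and return an XPath handle for querying it.
// Real-world pages are rarely valid, so libxml warnings are silenced.
// Assumption: $html is non-empty (loadHTML rejects an empty string).
function load_html(string $html): DOMXPath
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return new DOMXPath($doc);
}

// Collect the unique href values of every <a> element that has one.
function dom_links(DOMXPath $xpath): array
{
    $links = [];
    foreach ($xpath->query('//a[@href]') as $a) {
        $links[] = $a->getAttribute('href');
    }
    return array_values(array_unique($links));
}

// Return the text of the first <div> with the given id, or null if absent.
// Assumption: $id contains no double quotes (it is pasted into the XPath).
function dom_div_text(DOMXPath $xpath, string $id): ?string
{
    $node = $xpath->query('//div[@id="' . $id . '"]')->item(0);
    return $node === null ? null : trim($node->textContent);
}
```

Unlike a regex that stops at the first `</div>`, `textContent` includes the text of nested elements, which is the main practical reason to prefer a parser for this kind of scrape. One caveat: `loadHTML` guesses the encoding when the page declares none, so non-ASCII pages may need their charset handled separately.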
     
  6. altschule

    Do you have any preferred resources I could check out on this? I'd appreciate it.

    This will be a one-time scrape once I get everything fine-tuned and worked out, and I'm running it from a server set up on a local box, so I'm not too concerned about it being a memory hog.
     
  7. jazzc
