
Crawling a website with occasional redirects

Discussion in 'PHP & Perl' started by Mutikasa, Sep 14, 2015.

  1. Mutikasa

    Mutikasa Power Member

    Joined:
    May 23, 2011
    Messages:
    581
    Likes Received:
    207
    I am trying to scrape a website that has popups or something similar. Sometimes my parser, "simple_html_dom" for PHP, gets the HTML of a popup instead of the page I need.
    How can I get around this?
     
  2. bartosimpsonio

    bartosimpsonio Jr. VIP Jr. VIP Premium Member

    Joined:
    Mar 21, 2013
    Messages:
    12,020
    Likes Received:
    10,814
    Occupation:
    WHEREZ MA
    Location:
    BITCOINS AT?
    Filter out pages outside $target_domain:

    if (!preg_match('/mydomain\.com/', $that_shit)) redirect("matt cutts home");
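
A slightly stricter take on the same idea, as a rough sketch only: compare the host of the URL you actually ended up on against the target domain, instead of pattern-matching the whole string. The function name `is_on_target_domain` and the example URLs are made up for illustration; only `parse_url` is standard PHP.

```php
<?php
// Hypothetical helper (not from the thread): returns true when the host of
// $url is $targetDomain itself or one of its subdomains.
function is_on_target_domain(string $url, string $targetDomain): bool
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return false; // malformed or relative URL, so treat it as off-domain
    }

    $host = strtolower($host);
    $targetDomain = strtolower($targetDomain);

    if ($host === $targetDomain) {
        return true;
    }

    // Accept subdomains such as www.mydomain.com, but not notmydomain.com.
    $suffix = '.' . $targetDomain;
    return substr($host, -strlen($suffix)) === $suffix;
}
```

Checking the host rather than the full URL means a popup address that merely carries the target domain in its query string is still rejected.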
     
    • Thanks Thanks x 1
  3. Mutikasa

    Cool, so I just keep reloading until I get the page I want.
     
  4. Mutikasa

    It doesn't work. It keeps sending me back to the popup until the maximum execution time is exceeded.
     
  5. Mutikasa

    I had to add a cookie to the crawler, because the popup is shown only once per session.
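
For anyone landing here later, this is roughly what the cookie fix could look like with PHP's cURL extension. It is a minimal sketch, not the exact code used in this thread: the function names and the cookie file path are placeholders, and it assumes the site marks the session with an ordinary HTTP cookie.

```php
<?php
// Hypothetical helper: cURL options that keep cookies between requests.
// $cookieFile is a writable path chosen by the caller (an assumption).
function cookie_curl_options(string $cookieFile): array
{
    return [
        CURLOPT_RETURNTRANSFER => true,        // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,        // follow HTTP redirects
        CURLOPT_MAXREDIRS      => 5,           // but not forever
        CURLOPT_COOKIEFILE     => $cookieFile, // send cookies saved by earlier requests
        CURLOPT_COOKIEJAR      => $cookieFile, // save cookies received in this request
        CURLOPT_TIMEOUT        => 30,
    ];
}

// Hypothetical helper: fetch $url and return the HTML, or null on failure.
function fetch_with_cookies(string $url, string $cookieFile): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, cookie_curl_options($cookieFile));
    $html = curl_exec($ch);
    // The jar file is written when the handle is released, which happens
    // here once $ch goes out of scope at the end of the function.
    return $html === false ? null : $html;
}
```

The first call would get the popup and store the session cookie. A second call with the same cookie file should then get the real page, which can be handed to simple_html_dom's `str_get_html()`. Two requests at most, so no endless reload loop.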
     
  6. bartosimpsonio

    You may be doing something wrong. If in doubt, hire a freelancer to do this for you.