
Regex to extract URLs from Google search?

Discussion in 'White Hat SEO' started by symss, Jul 20, 2014.

  1. symss

    symss Regular Member

    Joined:
    Feb 14, 2009
    Messages:
    216
    Likes Received:
    206
    Hi everyone, does anyone know an up-to-date regex to extract the Google result URLs from the source code of the results page?

    This is what I have so far, but it's not matching the right thing:

    Code:
    (?<=><cite>).*?(?=</cite)
     
  2. IMCapitalist

    IMCapitalist Jr. VIP Premium Member

    Joined:
    Jan 21, 2014
    Messages:
    1,090
    Likes Received:
    60
    Location:
    Google Universe
    This might help you:
    Code:
    http://www.dsquared-media.com/regular-expressions-google-search/
     
  3. symss

    symss Regular Member

    Joined:
    Feb 14, 2009
    Messages:
    216
    Likes Received:
    206
    I'm actually looking for a regex to extract Google links from search results; I am not looking to use regular expressions inside searches.


    Thanks for the help, by the way. I've modified the title to make it clearer.
     
    Last edited: Jul 20, 2014
  4. divok

    divok Senior Member

    Joined:
    Jul 21, 2010
    Messages:
    1,015
    Likes Received:
    634
    Location:
    http://twitter.com/divok
    Which scraper are you using? The regex depends on the language you are using. For example, your regex above is invalid in JS, as it does not support lookbehind.
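
    A lookbehind can usually be swapped for a capture group, which nearly every regex flavour supports. Below is a minimal sketch in Python, assuming a made-up HTML sample (the markup of a real results page is different and changes over time). The names `SAMPLE_HTML`, `LOOKBEHIND`, `CAPTURE`, and `extract_cites` are illustrative only and are not from the original post.

```python
import re

# Hypothetical sample markup (an assumption for illustration only).
SAMPLE_HTML = (
    '<div><cite>www.example.com/page</cite></div>'
    '<div><cite>https://example.org/<b>regex</b></cite></div>'
)

# Pattern from the first post: relies on a fixed-width lookbehind.
LOOKBEHIND = re.compile(r'(?<=><cite>).*?(?=</cite)')

# Portable alternative: match the delimiters, capture only the middle part.
CAPTURE = re.compile(r'<cite>(.*?)</cite')


def extract_cites(html):
    """Return the text inside each <cite> element, with inner tags removed."""
    return [re.sub(r'<[^>]+>', '', raw) for raw in CAPTURE.findall(html)]


if __name__ == '__main__':
    for text in extract_cites(SAMPLE_HTML):
        print(text)
```

    On this sample both patterns return the same raw matches. Note that the text inside `<cite>` is only what the page displays, which may contain inner tags and is not necessarily the full link target.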

    If you are scraping URLs from Google SERPs, try the free version of GScraper. If you are coding your own scraper, learn more about XPath.
    In this case you are actually burning resources by using regex.

    Try playing with the Scraper plugin for Chrome for a start. :cool:
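
    To illustrate the parser-based approach, here is a rough sketch that pulls link targets out of HTML without any regex. It is not XPath proper: it uses `HTMLParser` from Python's standard library so that it runs without extra packages. The sample markup and the names `LinkCollector` and `extract_links` are assumptions made for this example; a real results page is structured differently.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect absolute http(s) href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            return
        # attrs is a list of (name, value) pairs; value may be None.
        href = dict(attrs).get('href')
        if href and href.startswith(('http://', 'https://')):
            self.links.append(href)


def extract_links(html):
    """Return absolute link targets in document order."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == '__main__':
    # Hypothetical sample markup (an assumption for illustration only).
    sample = (
        '<div><a href="https://example.com/a"><cite>example.com/a</cite></a>'
        '<a href="/search?q=x">next</a></div>'
    )
    for link in extract_links(sample):
        print(link)
```

    Reading the `href` attribute gives the actual link target rather than the displayed text, and relative links such as pagination are skipped by the prefix check.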
     
  5. Nitros

    Nitros Power Member

    Joined:
    Jan 30, 2009
    Messages:
    573
    Likes Received:
    295
    Get the source code of the Google search page, paste it into http://regex101.com, and try your regular expressions there ;)