I've been working straight since yesterday trying to get this to work. I'm a noob to RegEx

Discussion in 'General Programming Chat' started by simpleonline1234, Nov 17, 2011.

  1. simpleonline1234

    simpleonline1234 Junior Member

    Joined:
    Jan 26, 2010
    Messages:
    169
    Likes Received:
    13
    I've been working on this straight since yesterday and can't get it to work. I'm a noob to RegEx, and I've tested about 5 different RegEx "builders", but each of them requires you to navigate through options to build the regex, and each of them has failed when I tried to use it.

    Is there an application out there, free or paid, where you select the line you want to grab and the RegEx is auto-generated from that highlight, rather than having to build the pattern by hand?

    What I want to grab is this line of code:

    Code:
    <script type="text/javascript" src="http://www.google.com/recaptcha/api/challenge?k=6LemUQQAAAAAAC6mNwmiXb8ZwmUU0R9Z5v_yZ5xl">
    Of course this line changes since it's a captcha, so I only need to grab the link up to the ?k= part.

    I can do this using the web browser control but I'm attempting to speed things up by using a webrequest.

    Any ideas on the RegEx software or how to grab this exact line?

    Thanks

    EDIT: It might help if you see my code as well, so here it goes:

    Code:
    ' Download the page source with a plain web request instead of the WebBrowser control.
    Dim request As System.Net.HttpWebRequest = DirectCast(System.Net.WebRequest.Create("http://hubpages.com/user/new/"), System.Net.HttpWebRequest)
    Dim response As System.Net.HttpWebResponse = DirectCast(request.GetResponse(), System.Net.HttpWebResponse)
    Dim sr As New System.IO.StreamReader(response.GetResponseStream())
    Dim rssourcecode As String = sr.ReadToEnd()
    sr.Close()

    ' Patterns tried so far:
    '("src=([""'])(.*?)\1")
    '<script type=""text/javascript"" src=""http://www.google.com/recaptcha/api/challenge?k=.*/"">
    ' Current attempt: grabs every <script>...</script> block, not just the reCAPTCHA one.
    Dim r As New System.Text.RegularExpressions.Regex("<script[\s\S]*?/script>")
    Dim matches As System.Text.RegularExpressions.MatchCollection = r.Matches(rssourcecode)
    For Each itemcode As System.Text.RegularExpressions.Match In matches
        TextBox1.Text = TextBox1.Text & itemcode.Value & vbNewLine
    Next
     
  2. Rushdie

    Rushdie BANNED BANNED

    Joined:
    Feb 2, 2009
    Messages:
    1,378
    Likes Received:
    1,720
    here you are:
    http://rubular.com/
     
  3. ijof9

    ijof9 Power Member

    Joined:
    Mar 27, 2010
    Messages:
    536
    Likes Received:
    594
    Occupation:
    CTO
    Location:
    Western Europe
    Code:
    Dim txt
    txt ="<script type=""text/javascript"" src=""http://www.google.com/recaptcha/api/challenge?k=6LemUQQAAAAAAC6mNwmiXb8ZwmUU0R9Z5v_yZ5xl""><script type=""text/javascript"" src=""http://www.google.com/recaptcha/api/challenge?k=6LemUQQAAAAAAC6mNwmiXb8ZwmUU0R9Z5v_yZ5xl"">"
    
    Dim reg
    reg = "<script type=""text\/javascript"" src=""http:\/\/www\.google\.com\/recaptcha\/api\/challenge\?k=(.*?)"">"
    
    Dim r
    Set r = New RegExp
    r.Pattern = reg
    r.IgnoreCase = True
    Dim m
    Set m = r.Execute(txt)
    	
    
    MsgBox(m.Item(0).subMatches.Item(0))
    
    Save it as file.vbs and run it; it works.
     
  4. menzow

    menzow Junior Member

    Joined:
    Apr 20, 2010
    Messages:
    141
    Likes Received:
    101
    Here is the regex:
    http://rubular.com/r/t1Rq1TorbP
    Code:
    /<script type=\"text\/javascript\" src=\"hxxp:\/\/www.google.com\/recaptcha\/api\/challenge\?k\=[A-Z0-9].*\">/
    Put your line, copied from BHW, somewhere inside <head></head>.
    It picked it up exactly. :)
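
    For the VB.NET code in the first post, the same idea could look roughly like the sketch below. It is not taken from any of the replies: the module name RecaptchaLinkSketch and the function ExtractChallengeBase are made up for illustration, and it assumes the script tag looks like the line quoted in the first post (a real page serves http, not the hxxp form the forum displays). The named group "base" holds the link up to and including ?k=, and "key" holds the part that changes.

    ```vbnet
    Imports System
    Imports System.Text.RegularExpressions

    Module RecaptchaLinkSketch
        ' Hypothetical helper: returns the challenge URL up to and including "?k=",
        ' or Nothing when the HTML contains no matching script tag.
        Function ExtractChallengeBase(html As String) As String
            ' "base" is the fixed part of the URL; "key" is the changing captcha key.
            ' Allowing https as well as http is an assumption, not something from the thread.
            Dim pattern As String = "<script[^>]*\ssrc=[""'](?<base>https?://www\.google\.com/recaptcha/api/challenge\?k=)(?<key>[^""'&]+)"
            Dim m As Match = Regex.Match(html, pattern, RegexOptions.IgnoreCase)
            If m.Success Then Return m.Groups("base").Value
            Return Nothing
        End Function

        Sub Main()
            ' Sample input built from the line quoted in the first post.
            Dim sample As String = "<head><script type=""text/javascript"" src=""http://www.google.com/recaptcha/api/challenge?k=6LemUQQAAAAAAC6mNwmiXb8ZwmUU0R9Z5v_yZ5xl""></script></head>"
            Console.WriteLine(ExtractChallengeBase(sample))
        End Sub
    End Module
    ```

    In the original code you would pass rssourcecode instead of sample. If you need the key itself rather than the link in front of it, read m.Groups("key").Value instead.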