Need Regular Expressions help with my expression

Discussion in 'General Programming Chat' started by simpleonline1234, Jun 19, 2012.

  1. simpleonline1234

    simpleonline1234 Junior Member

    Joined:
    Jan 26, 2010
    Messages:
    169
    Likes Received:
    13
    I have a string of text that I am trying to match in my application.

    This is the series that I am matching

    //images.slashdot.org/hc/45/7ee10c139d5a.jpg

    The only constant is the number of characters between the first two forward slashes; the actual .jpg file name can vary, for example:

    /hc/45/b6ffeb799a4e.jpg

    I've tried this expression, but it doesn't match anything:

    \/\/images\.slashdot\.org\/[a-zA-Z0-9]\/[a-zA-Z0-9]\/[a-zA-Z0-9]\.jpg

    Any ideas on where my expression is going wrong?
     
    Last edited: Jun 19, 2012
  2. ugjunk

    ugjunk Jr. VIP Jr. VIP Premium Member

    Joined:
    Jan 1, 2011
    Messages:
    2,340
    Likes Received:
    721
    Location:
    Los Angeles
    Try this one:

    Code:
    /\bimages\.slashdot\.org\/[a-zA-Z0-9]*\/[a-zA-Z0-9]*\/[a-zA-Z0-9]*\.jpg\b/g
    It matches your current link. I can tweak it further for you if you need.
     
  3. b0xys

    b0xys Registered Member

    Joined:
    Apr 18, 2010
    Messages:
    52
    Likes Received:
    39
    try this
    (?<=\/\/images\.slashdot\.org\/hc\/45\/).*(?=\.jpg)

    or this
    \/\/images\.slashdot\.org\/hc\/45\/.*\.jpg
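    To see how the two patterns above differ, here is a minimal Python sketch. The `html` string is a made-up sample, not taken from the real page, and Python does not need the slashes escaped. The lookaround version returns only the file name without its extension, while the plain version returns the whole URL.

```python
import re

# Hypothetical sample markup; the real page source may look different.
html = '<img src="//images.slashdot.org/hc/45/7ee10c139d5a.jpg" width=127 height=70>'

# Lookbehind/lookahead: the prefix and ".jpg" must be present,
# but only the text between them is part of the match.
name_only = re.search(r"(?<=//images\.slashdot\.org/hc/45/).*(?=\.jpg)", html)

# Plain pattern: the whole URL is the match.
whole_url = re.search(r"//images\.slashdot\.org/hc/45/.*\.jpg", html)

print(name_only.group(0))  # 7ee10c139d5a
print(whole_url.group(0))  # //images.slashdot.org/hc/45/7ee10c139d5a.jpg
```

    Both patterns hard-code /hc/45/, so they only help if that part of the path never changes.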
     
  4. amazonian raider

    amazonian raider Junior Member

    Joined:
    Mar 6, 2012
    Messages:
    114
    Likes Received:
    23
    Your regex is only matching a single character between each of the slashes and then a single character for the name of the jpg. Use the asterisks like ugjunk and that element of the regex will repeat as many times as it needs to before the next element is found.

    Since you said the number of characters between the slashes is constant, you could also put:

    [a-zA-Z0-9][a-zA-Z0-9]

    Between each of the slashes rather than:

    [a-zA-Z0-9]*

    The first option will match any two-character letter-number combination. With the asterisk it will match a combination of any length, including zero (meaning if there were 3, 4, 5, 72, etc. characters between the slashes, the asterisk would still find the match). This may or may not be an issue depending on the exact URLs you need it to match or not match.
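    The difference between the three variants can be checked with a short Python sketch. The variable names and the second URL are made up for illustration; `{2}` is shorthand for writing the character class twice, and `+` on the file name is my own choice so that an empty name is rejected.

```python
import re

url = "//images.slashdot.org/hc/45/7ee10c139d5a.jpg"
longer = "//images.slashdot.org/abc/45/7ee10c139d5a.jpg"  # hypothetical variation

host = r"//images\.slashdot\.org/"
c = "[a-zA-Z0-9]"

# Original attempt: exactly one character per path segment.
one_char = re.compile(host + c + "/" + c + "/" + c + r"\.jpg")
# With asterisks: any number of characters per segment.
any_len = re.compile(host + c + "*/" + c + "*/" + c + r"*\.jpg")
# Fixed width: exactly two characters in each folder, one or more in the name.
two_chars = re.compile(host + c + "{2}/" + c + "{2}/" + c + r"+\.jpg")

print(one_char.search(url))              # None
print(bool(any_len.search(url)))         # True
print(bool(two_chars.search(url)))       # True
print(bool(any_len.search(longer)))      # True
print(bool(two_chars.search(longer)))    # False
```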
     
  5. simpleonline1234

    I'm getting the error "Object reference not set to an instance of an object."

    This may be easier for me another way. What would be the regex to grab anything and everything between two forward slashes, like this: /everythinghere/

    I know there is some sort of wildcard like * or a way to say that I want to grab whatever is between the slashes, regardless of what it is.

    Thanks again guys.
     
  6. ugjunk

    Use this expression; it will grab any run of letters and digits, such as the text between those two slashes.

    Code:
    /\b[a-zA-Z0-9]*\b/g
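    An alternative that ties the match to the slashes themselves is a negated character class, `[^/]`, meaning "any character that is not a slash". This is a Python sketch of that idea and not the pattern posted above; the sample strings are taken from the question.

```python
import re

# Everything between one pair of slashes, whatever the characters are.
m = re.search(r"/([^/]*)/", "/everythinghere/")
print(m.group(1))  # everythinghere

# Every folder in a path. The lookahead (?=/) checks for the closing slash
# without consuming it, so it can also open the next segment.
path = "/hc/45/7ee10c139d5a.jpg"
segments = re.findall(r"/([^/]+)(?=/)", path)
print(segments)  # ['hc', '45']
```

    The file name is not returned by the second call because no slash follows it.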
     
    Last edited: Jun 20, 2012
  7. simpleonline1234

    Okay, so I've tried all the advice given and I still don't get the result. Here is the actual code attached to my button.

    Code:
    Private Sub Button8_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button8.Click

        Dim request As HttpWebRequest = HttpWebRequest.Create(URLofSite.Text)
        Dim response As HttpWebResponse
        response = CType(request.GetResponse, HttpWebResponse)

        Dim webstream As Stream = response.GetResponseStream

        Dim streamreader As New StreamReader(webstream, Encoding.UTF8)

        Dim html As String = streamreader.ReadToEnd
        sourcecode = html.ToString
        SourceViewer.Text = sourcecode.ToString

        Dim sitekey As String
        Dim value As String = RawCaptcha.Text
        Dim m As Match = Regex.Match(SourceViewer.Text, CaptchaFinal.Text, RegexOptions.IgnoreCase)
        If (m.Success) Then
            Dim key As String = m.Groups(1).Value
            sitekey = (key.ToString)

        End If

        Dim mymove As String = "http:" & sitekey.ToString

        MessageBox.Show(mymove.ToString)

    End Sub

    Here is my app in action.

    [​IMG]
     

    Last edited: Jun 20, 2012
  8. andee

    andee Regular Member

    Joined:
    Jul 24, 2010
    Messages:
    218
    Likes Received:
    83
    Why don't you try widening it a little? It might be easier to match:

    <img src.*? width=127 height=70

    Then just remove the extra matched text, like:
    string = string.replace("<img src", "")
    string = string.replace("width=127 height=70", "")
    string = string.trim
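    The widen-then-strip idea can be sketched in Python as follows. The `html` sample is invented, and the width and height attributes are assumed to appear exactly as written in the pattern. Note that after the two replacements an equals sign and the surrounding quotes are still left over, so the sketch strips those too.

```python
import re

# Hypothetical markup; attribute order and spacing on the real page may differ.
html = '<p><img src="//images.slashdot.org/hc/45/7ee10c139d5a.jpg" width=127 height=70></p>'

# Match a wider, easier-to-anchor span of the tag.
m = re.search(r"<img src.*? width=127 height=70", html)
text = m.group(0)

# Remove the anchor text on both sides, then the leftover ="..." wrapping.
text = text.replace("<img src", "").replace("width=127 height=70", "")
text = text.strip().lstrip("=").strip('"')

print(text)  # //images.slashdot.org/hc/45/7ee10c139d5a.jpg
```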
     
  9. ugjunk

    I used this site while I was building the expression for you, and it worked great.
    http://gskinner.com/RegExr/

    Try the expressions I suggested there and you will see it picks them up perfectly.
     
  10. theMagicNumber

    theMagicNumber Regular Member

    Joined:
    May 13, 2010
    Messages:
    345
    Likes Received:
    195
    The working regex in VB.NET

    Code:
     Dim Captcha As String = System.Text.RegularExpressions.Regex.Match(htm, "images\.slashdot\.org/hc/\d+/(.+?\.jpg)").Groups(1).Value
    
    . (dot) matches any character except newline, unless you specify Text.RegularExpressions.RegexOptions.Singleline.
    The other option is [\w\W], which matches any character, including newline.
    You have to combine either one with the non-greedy quantifier +? (.+? or [\w\W]+?), or it will match as much of the string as it can, up to the last ".jpg" it can reach.
    Hope this helps.
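    The greedy versus non-greedy behaviour is easy to see in a small Python sketch. Python's `re.DOTALL` flag plays the role of .NET's RegexOptions.Singleline here; the two-line `html` sample is made up for the demonstration.

```python
import re

# Hypothetical page source with two images on separate lines.
html = ('<img src="//images.slashdot.org/hc/45/7ee10c139d5a.jpg">\n'
        '<img src="/other/logo.jpg">')

prefix = r"images\.slashdot\.org/hc/\d+/"

# Non-greedy: stops at the first ".jpg".
lazy = re.search(prefix + r"(.+?\.jpg)", html).group(1)
# Greedy, dot does not cross newlines: limited to the first line.
greedy_line = re.search(prefix + r"(.+\.jpg)", html).group(1)
# Greedy with DOTALL: runs on to the last ".jpg" in the whole string.
greedy = re.search(prefix + r"(.+\.jpg)", html, re.DOTALL).group(1)
# [\w\W] crosses newlines without any flag, so it behaves the same way.
any_char = re.search(prefix + r"([\w\W]+\.jpg)", html).group(1)

print(lazy)                   # 7ee10c139d5a.jpg
print(greedy_line == lazy)    # True
print(greedy.endswith("logo.jpg"))  # True
print(any_char == greedy)     # True
```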
     
    Last edited: Jun 20, 2012
  11. pro?po:coil

    pro?po:coil Newbie

    Joined:
    Jul 2, 2012
    Messages:
    12
    Likes Received:
    0
    I solved it in JS and Java with a working example, but the BHW moderation system blocked the message, so I can't put code/links here :/
    working example url: jsfiddle POINT net SLASH 69qMG
    change the POINT to a . and SLASH to a /
     
    Last edited: Jul 2, 2012