[C#] Why does this regex not work?

Discussion in 'C, C++, C#' started by Coron, Jan 14, 2012.

  1. Coron

    Coron Newbie

    I'm pretty tired and I've probably missed something, but why does the regex below not work?
    Code:
                Regex captchaRegex = new Regex("src=\"example/Captcha?ctoken=(.*?)\"");
                Match captchaMatch = captchaRegex.Match("<img src=\"example/Captcha?ctoken=Get this\" width=\"200\" height=\"70\" alt=\"Visual verification\">");
    In case you're wondering, I'm trying to get the captcha image on Google. It doesn't match, and I don't get it!

    Also, I had to replace a URL in the code with 'example' because I'm new.
     
  2. a_z_0_9

    a_z_0_9 Junior Member

    Try this regex:
    <img.+?src=[\"']example/Captcha\?ctoken=(.+)?[\"']>
     
  3. a_z_0_9

    a_z_0_9 Junior Member

    Try this updated version:
    <img.+?src=[\"']example/Captcha\?ctoken=(?<val>.+)?[\"'] width=["](.*)?["] height=["](.*)?["] alt=["](.*)?["]>

    Then you can get the value using captchaMatch.Groups["val"].Value
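
    To illustrate the named-group lookup described above, here is a minimal sketch. The pattern is a simplified stand-in (an assumption, not the exact regex from this post), and it is written as top-level statements, which assumes C# 9 or later. The sample input is the string from the first post.

```csharp
using System;
using System.Text.RegularExpressions;

// Sample input taken from the first post in this thread.
string html = "<img src=\"example/Captcha?ctoken=Get this\" width=\"200\" height=\"70\" alt=\"Visual verification\">";

// Simplified pattern (assumption): capture everything after "ctoken="
// up to the closing quote into a group named "val".
Match captchaMatch = Regex.Match(html, "ctoken=(?<val>[^\"']+)[\"']");

// Always check Success first; Groups["val"].Value is an empty string on a failed match.
string token = captchaMatch.Success ? captchaMatch.Groups["val"].Value : "";

Console.WriteLine(token); // Get this
```

    A named group keeps working if other capture groups are added to the pattern later, whereas a numeric index such as Groups[1] can shift.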
     
  4. Coron

    Coron Newbie

    Thanks for the reply. After changing 'example' to my URL, it didn't work. I'm very unfamiliar with regex, and I'm not sure whether I need to add brackets to the URL as well.

    Here is the full regex:
    src="hxxxx://accounts.google.xxx/Captcha?ctoken=(.*?)"

    edit: I'll try your new reply
     
  5. kaidoristm

    kaidoristm Power Member

    Escape the ? in your regex.
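
    A short sketch of why that matters, using the pattern and input from the first post. Unescaped, the ? makes the preceding "a" optional instead of matching a literal question mark. The example is written as top-level statements, which assumes C# 9 or later.

```csharp
using System;
using System.Text.RegularExpressions;

// Sample input taken from the first post in this thread.
string html = "<img src=\"example/Captcha?ctoken=Get this\" width=\"200\" height=\"70\" alt=\"Visual verification\">";

// Unescaped: "a?" means "an optional 'a'", so the regex expects "Captch" or
// "Captcha" followed directly by "ctoken=" and never consumes the literal '?'.
Match broken = Regex.Match(html, "src=\"example/Captcha?ctoken=(.*?)\"");

// Escaped: "\\?" in a regular C# string becomes \? in the regex,
// which matches a literal question mark.
Match escaped = Regex.Match(html, "src=\"example/Captcha\\?ctoken=(.*?)\"");

Console.WriteLine(broken.Success);          // False
Console.WriteLine(escaped.Success);         // True
Console.WriteLine(escaped.Groups[1].Value); // Get this
```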
     
  6. a_z_0_9

    a_z_0_9 Junior Member

    Try this updated one for the actual source:
    <img.+?src=[\"'].*?\?ctoken=(?<val>.+)?[\"'] width=["](.*)?["] height=["](.*)?["] alt=["](.*)?["]>

    captchaMatch.Groups["val"].Value
     
  7. jazzc

    jazzc Moderator Staff Member Moderator Jr. VIP

    Code:
    ResultString = Regex.Match(SubjectString, "ctoken=(.*?)\\\"").Groups[1].Value;
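
    The stacked backslashes in the one-liner above are easy to misread, so here is a hedged sketch comparing three ways of writing that pattern. A double quote is not a regex metacharacter, so it only needs escaping at the C# string level. The variable names are made up for illustration, and the top-level statements assume C# 9 or later.

```csharp
using System;
using System.Text.RegularExpressions;

// Sample input taken from the first post in this thread.
string subject = "<img src=\"example/Captcha?ctoken=Get this\" width=\"200\" height=\"70\" alt=\"Visual verification\">";

string fromPost = "ctoken=(.*?)\\\"";  // regex text: ctoken=(.*?)\"  (quote escaped in the regex too)
string regular  = "ctoken=(.*?)\"";    // regex text: ctoken=(.*?)"
string verbatim = @"ctoken=(.*?)""";   // same regex text as 'regular'; "" is a quote in a verbatim string

Console.WriteLine(regular == verbatim); // True

// All three patterns capture the same value from the sample input.
foreach (string pattern in new[] { fromPost, regular, verbatim })
{
    Match m = Regex.Match(subject, pattern);
    Console.WriteLine(m.Success ? m.Groups[1].Value : "(no match)"); // Get this
}
```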
    
     
  8. osmose01

    osmose01 Registered Member

    Is the captcha value really showing up in the image URL? If so, that would be really dumb of Google, and I guess that's where you have a problem with your regex. The value might appear to show up in your browser, but that may not really be the case.
     
  9. tynman

    tynman Newbie

    Look into RegexBuddy. One of the best $30 I ever spent. The time it saves you in learning and debugging regex is amazing!
     
  10. theMagicNumber

    theMagicNumber Regular Member

    Code:
    Regex captchaRegex = new Regex("Captcha[?]ctoken=(.+?)\"");
    Match captchaMatch = captchaRegex.Match("<img src=\"example/Captcha?ctoken=Get this\" width=\"200\" height=\"70\" alt=\"Visual verification\">");
    string getThis = captchaMatch.Groups[1].Value;
    
    ? is a regex operator, so it has to be escaped to match a literal question mark. Wrapping it in a character class, [?], is one way to do that.
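
    Escaping by hand gets tedious once the pattern contains a full URL, because the dots are regex operators too. As a sketch of an alternative, Regex.Escape can escape the literal part for you. The URL below is a hypothetical placeholder (the real one is obfuscated in this thread), and the top-level statements assume C# 9 or later.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical URL prefix, used only for illustration.
string prefix = "https://accounts.example.com/Captcha?ctoken=";

// Regex.Escape backslash-escapes metacharacters such as '.' and '?',
// so the prefix is matched literally.
string pattern = "src=\"" + Regex.Escape(prefix) + "(.*?)\"";

// Made-up input that uses the hypothetical URL.
string html = "<img src=\"https://accounts.example.com/Captcha?ctoken=abc123\" width=\"200\">";

Match m = Regex.Match(html, pattern);
Console.WriteLine(m.Success ? m.Groups[1].Value : "(no match)"); // abc123
```

    Only the literal text goes through Regex.Escape; the capture group is concatenated afterwards so its parentheses keep their regex meaning.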
     
  11. Satan Claus

    Satan Claus Regular Member Premium Member

    Try something like this (the second capture group holds the result; I haven't tested it against your string):

    [\<]img[\ ]src[\=][\"]([^\=]*)[\=]([^\"]*)[\"][\ ]width[\=][\"]([^\"]*)[\"][\ ]height[\=][\"]([^\"]*)[\"][\ ]alt[\=][\"]([^\"]*)[\"][\>]