Creating a simple regex for Imacros...help this noob out!

Discussion in 'General Programming Chat' started by Frankie4Fingers, Dec 25, 2011.

  1. Frankie4Fingers

    Frankie4Fingers Power Member

    Joined:
    Jan 8, 2009
    Messages:
    676
    Likes Received:
    214
    Hey guys,

    I'm going crazy on this...

    Basically I want iMacros to extract the auth_token value from the page source:

    [...]"auth_token": "B5T7puQiUTuHw4LMzIBx7xs0h-V8MTMyNDg3Mzk5OEAxMzI0ODUyMzk4", "video_id" [...]


    So, what regex should I put after the iMacros command SEARCH SOURCE=REGEXP: ?

    Thanks!
     
    Last edited: Dec 25, 2011
  2. sirgold

    sirgold Supreme Member

    Joined:
    Jun 25, 2010
    Messages:
    1,260
    Likes Received:
    645
    Occupation:
    Busy proving the Pareto principle right
    Location:
    A hot one
    Let's assume you're using JavaScript as a helper to the free edition of iMacros. This is what you'd need:

    Code:
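    <script>
    // Sample of the page source that contains the token
    var str='"auth_token": "B5T7puQiUTuHw4LMzIBx7xs0h-V8MTMyNDg3Mzk5OEAxMzI0ODUyMzk4", "video_id"';
    // Capture everything between the quotes that follow "auth_token":
    var patt1=/"auth_token": "(.[^"]*)"/i;
    
    // [0] is the whole match, [1] is the captured token
    var url = str.match(patt1)[1];
    alert(url);
    </script>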
    
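    Explanation, so it's clearer:

    In the pattern '(.[^"]*)', the class [^"] matches any character except a double quote, so the match stops at the first " (the same effect as a non-greedy search), while the two parentheses tell the regex to capture what you're looking for. The leading . matches any single character, so ([^"]+) would be a slightly tighter way to write it.

    str.match(patt1)[1] <- match() returns an array where [0] is the whole match and [1] is the text captured by the first group.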

    Likewise with vbs, php, sed, grep, basically whatever supports standard regex. HTH!
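
    A small sketch of the difference between the negated class, a true non-greedy quantifier and a greedy one, run against the sample string from the question. The extractToken helper and the three pattern names are made up for illustration and aren't part of iMacros; the null check is an addition so a page without the token doesn't throw.

```javascript
// Sample input copied from the question.
var str = '"auth_token": "B5T7puQiUTuHw4LMzIBx7xs0h-V8MTMyNDg3Mzk5OEAxMzI0ODUyMzk4", "video_id"';

// Hypothetical helper: returns capture group 1, or null when nothing matches.
function extractToken(source, pattern) {
  var m = source.match(pattern);
  return m ? m[1] : null;
}

// Negated class: [^"] cannot match a quote, so the capture ends at the first ".
var negated = /"auth_token": "([^"]+)"/;
// Non-greedy quantifier: .+? takes as few characters as possible.
var lazy = /"auth_token": "(.+?)"/;
// Greedy quantifier: .+ runs on to the LAST quote in the string.
var greedy = /"auth_token": "(.+)"/;

var tokenNegated = extractToken(str, negated);
var tokenLazy = extractToken(str, lazy);
var tokenGreedy = extractToken(str, greedy);
var tokenMissing = extractToken('no token here', negated);
```

    On this input the first two patterns capture only the token, while the greedy one also swallows the following ", "video_id part.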
     
    Last edited: Dec 26, 2011
  3. Frankie4Fingers

    Hey man,

    thanks for the answer. Unfortunately I'm getting an error while trying to adapt your code to iMacros syntax.


    I tried this:

    SEARCH SOURCE=REGEXP:/"auth_token": "(.[^"]*)"/i

    and also this:

    SEARCH SOURCE=REGEXP:"/"auth_token": "(.[^"]*)"/i"

    but it keeps returning a wrong format error.


    Here's an example of a correct format for the SEARCH command:

    SEARCH SOURCE=REGEXP:"_get[Tt]racker\\(([^)]+)\\)"



    Can you point me in the right direction?

    Thanks again.
     
  4. sirgold

    The first thing that comes to mind is to skip the native SEARCH SOURCE command entirely and use the iMacros EVAL statement to run the js code above, so you can basically cut and paste what I wrote.

    URL: wiki.imacros.net/EVAL

    If you wanna use straight iMacros:

    Code:
    SEARCH SOURCE=REGEXP:"\"auth_token\": \"(.[^\"]*)\"" EXTRACT="Regex is: $1"
    PROMPT {{!EXTRACT}}
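
    Compared with the attempts that gave the wrong format error, this version drops the /.../i delimiters, wraps the pattern in double quotes and puts a backslash in front of every inner quote. Here's a rough js sketch of that conversion. buildSearchCommand is a made-up helper, not an iMacros function, and the backslash doubling is an assumption based on the _get[Tt]racker example quoted earlier.

```javascript
// Hypothetical helper: turns a plain regex source into an iMacros SEARCH line.
function buildSearchCommand(regexSource, extractTemplate) {
  // Double the backslashes first, then escape the double quotes,
  // so the backslashes added for the quotes are not doubled again.
  var escaped = regexSource.replace(/\\/g, '\\\\').replace(/"/g, '\\"');
  var command = 'SEARCH SOURCE=REGEXP:"' + escaped + '"';
  if (extractTemplate) {
    // Assumption: the template contains no double quotes of its own.
    command += ' EXTRACT="' + extractTemplate + '"';
  }
  return command;
}

// The pattern from this thread, written without the /.../i delimiters.
var tokenCommand = buildSearchCommand('"auth_token": "(.[^"]*)"', '$1');

// The pattern behind the known-good example quoted earlier in the thread.
var trackerCommand = buildSearchCommand('_get[Tt]racker\\(([^)]+)\\)');
```

    The resulting strings can be pasted into an .iim macro or passed to iimPlay after a "CODE:" prefix.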
    
    Alternatively, if you need to retrieve that value from the DOM (it seems that you need to...) you could do something like this:

    iMacros code:
    Code:
    TAG POS=1 TYPE=HTML ATTR=* EXTRACT=HTM
    
    or to be consistent and write it all in javascript:
    Code:
    iimPlay("CODE:\n TAG POS=1 TYPE=HTML ATTR=* EXTRACT=HTM\n");
    

    Javascript code:
    Code:
    if ( iimGetLastExtract().search("auth_token") >= 0 ) {
    
    var str=iimGetLastExtract();
    var patt1=/"auth_token": "(.[^"]*)"/i;
    var url = str.match(patt1)[1];
    alert(url); // do something with url
    
    }
    
    I'd recommend writing your entire macros in js instead of iim, though. But this should help.
     
    Last edited: Dec 26, 2011
  5. Frankie4Fingers

    Tried this one:

    Code:
    TAG POS=1 TYPE=HTML ATTR=* EXTRACT=HTM
    SET !VAR1 EVAL("<script>
    var str='"auth_token": "B5T7puQiUTuHw4LMzIBx7xs0h-V8MTMyNDg3Mzk5OEAxMzI0ODUyMzk4", "video_id"';
    var patt1=/"auth_token": "(.[^"]*)"/i;
    
    var url = str.match(patt1)[1];
    alert(url);
    </script>")
    PROMPT {{!VAR1}}
    and this one

    Code:
    TAG POS=1 TYPE=HTML ATTR=* EXTRACT=HTM
    SET !VAR1 EVAL("if ( iimGetLastExtract().search("auth_token") >= 0 ) {
    
    var str=iimGetLastExtract();
    var patt1=/"auth_token": "(.[^"]*)"/i;
    var url = str.match(patt1)[1];
    alert(url); // do something with url
    
    }")
    PROMPT {{!VAR1}}
    but it still doesn't work... :confused:
     
  6. Frankie4Fingers

    Oh wait, I noticed now that you edited your post...using this:


    Code:
    SEARCH SOURCE=REGEXP:"\"auth_token\": \"(.[^\"]*)\"" EXTRACT="Regex is: $1"
    PROMPT {{!EXTRACT}}
    it works exactly like I wanted. Thanks very much for your help, I'm going to give you rep :)
     
  7. sirgold

    Yep I figured you'd love a native solution. :p This last one I added is written in straight iMacros language.

    The others need to be run as js macros. That's what I generally do because, at the end of the day, if the macro is complex enough, it's gonna be way more readable, and you can use functions and all the nifty little things of a real programming language. :D

    Glad it helped, merry xmas! :)
     
  8. Frankie4Fingers

    Merry Xmas to you, too!

    Yeah, I'm all for native solutions as I am a lil' coding goat :D

    Speaking of which, if I can bother you with one last thing, I'm having trouble with the !LOOP variable of iMacros...

    Basically I wrote this:

    Code:
    VERSION BUILD=7401110 RECORDER=FX
    SET !ERRORIGNORE YES
    SET !DATASOURCE C:\notes.txt
    SET !LOOP 1
    SET !DATASOURCE_COLUMNS 1
    SET !DATASOURCE_LINE {{line}}
    TAB OPEN
    TAB T={{!LOOP}}
    URL GOTO=http://myurl.com
    TAG POS=1 TYPE=INPUT:TEXT ATTR=NAME:id CONTENT={{!COL1}}
    TAG POS=2 TYPE=INPUT:TEXT ATTR=NAME:id CONTENT={{!COL1}}
    TAG POS=3 TYPE=INPUT:TEXT ATTR=NAME:id CONTENT={{!COL1}}
    TAG POS=4 TYPE=INPUT:TEXT ATTR=NAME:id CONTENT={{!COL1}}
    ....
    
    I wanted to fill the forms on the page with the lines from the notes.txt file, one after another. The problem is that the loop isn't working: it keeps filling the forms with just the 1st line of the file. I also tried swapping !COL1 for !LOOP, but the result is the same. Any suggestions? :confused:
     
  9. sirgold

    Hey bro, the error seems to lie with the SET !DATASOURCE_LINE statement, which should look something like:

    Code:
    SET !DATASOURCE_LINE {{!LOOP}}
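
    The point is that the line number has to change on every pass. If the macro is driven from js instead, the same idea could be sketched roughly like this. buildMacroForLine and totalLines are made up for the example, the macro lines are trimmed from the one posted above, and the datasource path is passed through unchanged (whether it needs extra escaping inside a "CODE:" string is not something this sketch checks).

```javascript
// Hypothetical helper: builds the macro text for one datasource line.
function buildMacroForLine(datasourcePath, lineNumber) {
  var lines = [
    'SET !DATASOURCE ' + datasourcePath,
    'SET !DATASOURCE_COLUMNS 1',
    // The line number comes from the js loop counter instead of {{!LOOP}}.
    'SET !DATASOURCE_LINE ' + lineNumber,
    'URL GOTO=http://myurl.com',
    'TAG POS=1 TYPE=INPUT:TEXT ATTR=NAME:id CONTENT={{!COL1}}'
  ];
  return 'CODE:' + lines.join('\n');
}

var totalLines = 3; // assumption: notes.txt has three lines
var macros = [];
for (var i = 1; i <= totalLines; i++) {
  macros.push(buildMacroForLine('C:\\notes.txt', i));
  // Inside iMacros you would run each one here, e.g. iimPlay(macros[i - 1]);
}
```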
    
    Also, you're opening a new tab for every line in the loop, which will be a hefty price to pay if you have a lot of lines in that datasource.. ;)

    I'd suggest getting rid of the TAB T={{!LOOP}} directive and splitting the datasource file into smaller chunks instead. To work in parallel, you can fire up multiple instances of your browser/player running this macro, each asking with a PROMPT statement which datasource file to load.
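
    Splitting the datasource can be done by hand or with a few lines of js. A minimal sketch, assuming the file has already been read into an array of lines; chunkLines and the sample data are made up for the example.

```javascript
// Hypothetical helper: splits an array of datasource lines into chunks
// of at most chunkSize lines each, keeping the original order.
function chunkLines(lines, chunkSize) {
  if (chunkSize < 1) {
    throw new Error('chunkSize must be at least 1');
  }
  var chunks = [];
  for (var i = 0; i < lines.length; i += chunkSize) {
    chunks.push(lines.slice(i, i + chunkSize));
  }
  return chunks;
}

// Sample data standing in for the contents of notes.txt.
var sampleLines = ['note1', 'note2', 'note3', 'note4', 'note5'];
var chunks = chunkLines(sampleLines, 2);
```

    Each chunk can then be saved as its own file and handed to a separate browser instance.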