
Best way to code text randomization {one|two|three|etc}

Discussion in 'Black Hat SEO' started by andreybl, Mar 30, 2010.

  1. andreybl

    andreybl Registered Member

    There isn't really coding forum here but maybe this is the most suitable subforum.

    For those who have done this: how do you separate the { } part from the other text? Exploding on | is a piece of cake, even recursively, but you need to isolate the {} group first. Or have you all just written custom code that tracks the nesting depth and goes through the text character by character?

    Code example in your preferred language would be nice :bored:
     
  2. TheLoser

    TheLoser Registered Member

    Joined:
    Feb 5, 2009
    Messages:
    72
    Likes Received:
    13
    Huh?

    Do you mean how to get rid of the encapsulating {} characters? If that's what you mean, simply replacing the { and } characters with an empty string should suffice.
     
  4. andreybl

    andreybl Registered Member

    No, of course not. Usually every SEO program allows you to type your desired text like this:

    {Get|Obtain|Find|Discover} the {new|perfect|amazing} {diet|weight loss {|program}|way to lose your weight}

    And the program randomizes your text among those words, for example giving sentences like:
    Get the perfect weight loss
    Discover the amazing weight loss program
    Find the new diet
    and so on..

    I need to implement this same thing.
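
    A minimal sketch of why the nesting matters: a plain split on | would cut the nested {|program} group in half, so the split has to ignore any | that sits inside inner braces. The helper name split_top_level is hypothetical, and the input is assumed to be the contents of one group with its outer braces already removed.

```python
def split_top_level(body):
    """Split `body` on '|' characters that are not inside nested braces."""
    parts, current, depth = [], [], 0
    for char in body:
        if char == '{':
            depth += 1
        elif char == '}':
            depth -= 1
        if char == '|' and depth == 0:
            # A top-level separator: close the current alternative.
            parts.append(''.join(current))
            current = []
        else:
            current.append(char)
    parts.append(''.join(current))
    return parts


print(split_top_level('diet|weight loss {|program}|way to lose your weight'))
# prints ['diet', 'weight loss {|program}', 'way to lose your weight']
```

    Each returned alternative can still contain nested groups, which then have to be spun in turn.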
     
  5. Classytallpaul

    Classytallpaul Newbie

    Joined:
    Oct 6, 2009
    Messages:
    20
    Likes Received:
    27
    Occupation:
    Owner of Winclub.ca - Roulette Coaching
    Location:
    Toronto, Canada
    Use Regex
     
  6. kaidoristm

    kaidoristm Power Member

    Joined:
    Feb 13, 2009
    Messages:
    561
    Likes Received:
    726
    Occupation:
    Freelancer
    Location:
    Estonia
    Home Page:
    I'm pretty sure that this is being converted to JSON
     
  7. HotNasty

    HotNasty Newbie

    Joined:
    Jan 9, 2010
    Messages:
    8
    Likes Received:
    3
    Here's a basic implementation in Python. It goes through the text character by character, counting braces, and finds the most deeply nested brace group. It then spins that group (i.e. randomly selects one of the choices separated by |), substitutes the result into the original text, and starts over until there are no more braces.

    Hope this helps. Hit the thanks if it does ;)

    Code:
    import random

    class SpyntaxError(Exception):
        """Raised when the braces in the text are unbalanced."""

    def spin(text):
        while True:
            start = 0            # index of the '{' that opens the deepest group
            highest = 0          # deepest nesting level seen so far
            level = 0            # current nesting level
            highestchunk = None  # (start, end) indices of the deepest group

            for index, char in enumerate(text):
                if char == '{':
                    level += 1
                    if level >= highest:
                        highest = level
                        start = index
                elif char == '}':
                    if level == highest:
                        highestchunk = (start, index)

                    level -= 1

                    if level < 0:
                        raise SpyntaxError('too many closing braces')

            if level != 0:
                raise SpyntaxError('too many opening braces')

            if highestchunk:
                # Replace the deepest group, braces included, with one random choice.
                highesttext = text[highestchunk[0]+1:highestchunk[1]]

                text = text[:highestchunk[0]] + random.choice(highesttext.split('|')) + text[highestchunk[1]+1:]
            else:
                return text
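
    The loop above rescans the whole text once per brace group. As an alternative, here is a hedged sketch of a single-pass recursive version. It is not the code from this post, and the names spin_once and _parse, the optional rng parameter and the use of ValueError are all assumptions made for illustration.

```python
import random


def spin_once(text, rng=random):
    """Spin `text` in one left-to-right pass, recursing into nested groups."""
    result, _ = _parse(text, 0, rng, top_level=True)
    return result


def _parse(text, pos, rng, top_level):
    """Parse from `pos`; return (chosen_text, position after the parsed part)."""
    options, current = [], []
    while pos < len(text):
        char = text[pos]
        if char == '{':
            # Spin the nested group first and splice in its result.
            chosen, pos = _parse(text, pos + 1, rng, top_level=False)
            current.append(chosen)
            continue
        if char == '}':
            if top_level:
                raise ValueError('too many closing braces')
            options.append(''.join(current))
            return rng.choice(options), pos + 1
        if char == '|' and not top_level:
            # Separator between alternatives of the current group.
            options.append(''.join(current))
            current = []
        else:
            # Ordinary text (a '|' outside any group is kept literally).
            current.append(char)
        pos += 1
    if not top_level:
        raise ValueError('too many opening braces')
    return ''.join(current), pos


print(spin_once('{Get|Obtain|Find|Discover} the {new|perfect|amazing} '
                '{diet|weight loss {|program}|way to lose your weight}'))
```

    Passing a seeded random.Random instance as rng makes the output reproducible, which helps when testing.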
    
    
     
    • Thanks Thanks x 2
  8. Kid Shaleen

    Kid Shaleen Regular Member

    Joined:
    Oct 29, 2009
    Messages:
    250
    Likes Received:
    63
    Thanks for the code. I've not used Python but given the simplicity of your code I may well learn enough to run things.

    But can your code handle null strings? In other words, if I have a nested "{}" or "{ }", will it substitute that, i.e. delete the material?

    The reason I ask is that I've been looking over Google patents on detecting duplicate content, and I've been wanting something that can spin paragraphs and sentences as well as words. As part of this I want to be able to randomly drop both as well, in order to vary things like the paragraphs per article, words per paragraph, etc. that the search engine's duplicate content checking looks at.
     
  9. andreybl

    andreybl Registered Member

    Thank you very much HotNasty!
     
  10. HotNasty

    HotNasty Newbie

    Joined:
    Jan 9, 2010
    Messages:
    8
    Likes Received:
    3
    Here's a better one using regex. And yes, it does handle null strings, if I understand what you mean. It will get rid of "{}" by just deleting it. Is that what you want?

    Code:
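    import random, re

    def spin(text):
        while True:
            # Find the innermost groups: a '{', then no further '{', up to the first '}'.
            # The raw string keeps \{ and \} from being read as string escapes.
            matches = re.findall(r'(\{([^\{]*?)\})', text)
            for match in matches:
                # match[0] is the whole group including braces, match[1] its contents.
                text = text.replace(match[0], random.choice(match[1].split('|')), 1)
            if not matches:
                break

        return text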
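
    To make the null-string behaviour concrete, here is a small sketch of the same innermost-first idea, written with re.subn and a callback instead of findall plus replace. The name spin_sub and the rng parameter are assumptions. Note that "{}" disappears entirely, while "{ }" leaves a single space behind, because whitespace inside a group is preserved.

```python
import random
import re

# An innermost group: braces with no other brace between them.
INNERMOST = re.compile(r'\{([^{}]*)\}')


def spin_sub(text, rng=random):
    """Repeatedly replace innermost {...} groups until none remain."""
    while True:
        text, count = INNERMOST.subn(
            lambda m: rng.choice(m.group(1).split('|')), text)
        if count == 0:
            return text


print(spin_sub('a{}b'))   # prints "ab": the empty group is removed
print(spin_sub('a{ }b'))  # prints "a b": the space inside is kept
# An empty alternative drops the sentence about half the time.
print(spin_sub('Keep this.{ Drop this sometimes.|}'))
```

    The same trick works at any level, so an empty alternative can be used to randomly remove whole sentences or paragraphs.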
    
     
    • Thanks Thanks x 1
  11. andreybl

    andreybl Registered Member

    That was perfect :)

    Here's HotNasty's code converted to a PHP function too:

    Code:
    function spin($string) {
       while(true) {
          if(!preg_match_all('/({([^\{]*?)\})/', $string, $matches))
             break;
          foreach($matches[2] as $i => $match) {
             $parts = explode('|', $match);
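             // Replace only the first occurrence; str_replace would give every identical
             // group the same choice, unlike the Python version's replace(..., 1).
             $pos = strpos($string, $matches[0][$i]);
             $string = substr_replace($string, $parts[array_rand($parts)], $pos, strlen($matches[0][$i]));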
          }
       }
       return $string;
    }
     
  12. Talsin

    Talsin Regular Member

    Joined:
    Mar 18, 2010
    Messages:
    385
    Likes Received:
    107
    I would be very interested in a spyntax compiler that could handle very large, deeply nested spun articles (I'm talking 100+ pages of text to crawl through for the various spins) and then output them to text files or Excel files. Does anyone know of a program like that?