Best way to code text randomization {one|two|three|etc}

andreybl

Registered Member
Joined
Oct 12, 2009
Messages
92
Reaction score
11
There isn't really coding forum here but maybe this is the most suitable subforum.

Those who have done such, how do you separate the { } part from other text? Exploding the | is a piece of cake even if recursive, but you need to separate that {} first. Or have you all just wrote custom code that calculates the amount of recursives and goes through the text character by character?

Code example in your preferred language would be nice :bored:
 
Huh?

Do you mean how to get rid of the encapsulating {} characters? If that's what you mean, a simple replace of { and } characters with an emptry string should suffice.
 
** DELETE THIS POST MODS, thanks **
 
Last edited:
No, of course not. Usually every seo program allows your to type your desired text like this:

{Get|Obtain|Find|Discover} the {new|perfect|amazing} {diet|weight loss {|program}|way to lose your weight}

And the program randomizes your text among those words, for example giving sentences like:
Get the perfect weight loss
Discover the amazing weight loss program
Find the new diet
and so on..

I need to implement this same thing.
 
Here's a basic implementation in python. It goes through the text character by character counting the braces, and finds the string in braces that is most deeply nested within braces. Then it spins the string (ie. randomly selects a substring from the choices separated by |'s), replaces the spun text in the original text and then starts over until there are no more braces.

Hope this helps. Hit the thanks if it does ;)

Code:
import random
import re

class SpyntaxError(Exception):
    pass

def spin(text):
    while True:
        start = 0
        highest = 0
        level = 0
        highestchunk = None

        for index, char in enumerate(text):
            if char == '{':
                level += 1
                if level >= highest:
                    highest = level
                    start = index
                lastmark = index
            elif char == '}':
                if level == highest:
                    highestchunk = (start, index)

                level -= 1

                if level < 0:
                    raise SpyntaxError('too many closing braces')

                lastmark = index

        if not level == 0:
            raise SpyntaxError('too many opening braces')

        if highestchunk:
            random.choice(text[highestchunk[0]+1:highestchunk[1]].split('|')), text[highestchunk[1]+1:]

            highesttext = text[highestchunk[0]+1:highestchunk[1]]

            text = text[:highestchunk[0]] + random.choice(highesttext.split('|')) + text[highestchunk[1]+1:]
        else:
            return text
 
Here's a basic implementation in python. It goes through the text character by character counting the braces, and finds the string in braces that is most deeply nested within braces. Then it spins the string (ie. randomly selects a substring from the choices separated by |'s), replaces the spun text in the original text and then starts over until there are no more braces.

Hope this helps. Hit the thanks if it does

Thanks for the code. I've not used Python but given the simplicity of your code I may well learn enough to run things.

But can your code handle null strings? In other words if I have a nested "{}" or "{ }" will it substitute that, i.e. delete the material?

The reason I ask is that I've been looking over google patents on detecting duplicate content and I've been wanting something that can spin paragraphs and sentences as well as words. As part of this I want to be able to randomly kill both as well in order to alter things like the paragraph per article, words per paragraph, etc. in the search engine's dup content checking.
 
Here's a better one using regex. And yes it does handle null strings, if I understand what you mean. It will get rid of "{}" by just deleting them. Is that what you want.

Code:
import random, re

def spin(text):
    while True:
        matches = re.findall('(\{([^\{]*?)\})', text)
        for match in matches:
            text = text.replace(match[0], random.choice(match[1].split('|')), 1)
        if not matches:
            break

    return text
 
That was perfect :)

Here's HotNasty's code converted to PHP function too

Code:
function spin($string) {
   while(true) {
      if(!preg_match_all('/({([^\{]*?)\})/', $string, $matches))
         break;
      foreach($matches[2] as $i => $match) {
         $parts = explode('|', $match);
         $string = str_replace($matches[0][$i], $parts[rand(0, count($parts)-1)], $string);
      }
   }
   return $string;
}
 
I would be very interested in a spyntax compiler than could hande running very serious large deep nested spun articles (im talking like 100+ pages of text to crawl through for various spins) and then outputs them into text files or excel files does anyone know of a program like that ?
 
Back
Top