A long, long time ago Google leaked that uniqueness is determined by whether 13 consecutive words are the same. Since we don't know the offset at which each 13-word window starts, and we have to worry about boundary conditions, changing every 6th word is a safe way to create uniqueness. This has always worked for me, and it has passed Copyscape in every test I've run.
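Here is a rough sketch of that idea in Python, assuming the duplicate check really is a comparison of 13-word runs (that is the leak as I remember it, not something I can confirm). The names `shingles` and `change_every_nth` are made up for illustration, and the `<alt…>` placeholders stand in for whatever replacement words you would actually pick.

```python
def shingles(words, size=13):
    """Return the set of all runs of `size` consecutive words."""
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def change_every_nth(words, n=6, offset=0):
    """Swap every n-th word (starting at `offset`) for a placeholder token."""
    return [f"<alt{i}>" if i % n == offset else w for i, w in enumerate(words)]


# Stand-in for a 200-word article; every word is distinct to keep the check simple.
original = [f"w{i}" for i in range(200)]

# Try every possible starting offset for the changed words.
for offset in range(6):
    rewritten = change_every_nth(original, n=6, offset=offset)
    shared = shingles(original) & shingles(rewritten)
    print(offset, len(shared))  # number of 13-word runs still shared
```

With this toy input no 13-word run survives at any offset. It says nothing about how a real search engine or Copyscape behaves; it only shows the window logic.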
The 13-word leak always made sense to me because English has about 250,000 words, of which about 25,000 are commonly used on the web. So we are conservatively looking at a 13-digit base-25,000 number, roughly 1.5 × 10^57 possible values, which is astronomical. Changing every sixth word guarantees that every 13-word digest contains at least one changed digit (in fact at least two), no matter what offset is used.
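The arithmetic is easy to check. This is a small Python sketch using my ballpark figures (25,000 common words, a 13-word window, a change every 6 words); `min_changed_per_window` is just a helper name I made up.

```python
VOCABULARY = 25_000  # rough count of commonly used English words (my estimate)
WINDOW = 13          # consecutive words per digest, per the leak
STEP = 6             # change every 6th word

# Size of the space of possible 13-word digests.
combinations = VOCABULARY ** WINDOW
print(f"{combinations:.2e}")  # 1.49e+57


def min_changed_per_window(step=STEP, window=WINDOW):
    """Fewest changed words any window can contain, over all starting offsets."""
    return min(
        sum(1 for i in range(start, start + window) if i % step == 0)
        for start in range(step)
    )


print(min_changed_per_window())  # 2
```

So even in the worst alignment a 13-word window picks up two changed words, which is the safety margin for boundary conditions.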
Of course, a human quality rater could see through this, but if your concern is algorithmic uniqueness, this will probably do the trick.