Scrapebox - How To Check Site's Language?

No, not necessarily. But you could kind of do it.

You could use the Page Scanner and make definition files with common words. For English, for example, put:

The
and
of
etc...

Just a few common words, the top 5 or 10 words used in the English language. If they are found, the page is English, since almost any English page would have the word "the" or "and".


Do the same for other languages that you want to check for.
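
If you want to try the idea outside Scrapebox, here is a rough Python sketch of the same common-word check. It is only an illustration, not how the page scanner works internally; the word lists and the function name `guess_languages` are made up for the example.

```python
import re

# Hypothetical "definition files": a handful of very common words per language.
DEFINITIONS = {
    "english": {"the", "and", "of", "to", "in"},
    "french": {"le", "la", "les", "et", "des"},
    "german": {"der", "und", "das", "nicht", "ist"},
}

def guess_languages(text):
    """Return every language with at least one common word in the text."""
    # Split into lowercase runs of letters (Unicode-aware, no digits or underscores).
    words = set(re.findall(r"[^\W\d_]+", text.lower()))
    return sorted(lang for lang, common in DEFINITIONS.items() if words & common)

print(guess_languages("The cat sat on the mat and slept"))
print(guess_languages("Le chat et la souris"))
```

Like the definition files, this flags a language as soon as a single common word shows up, so short lists with words shared between languages will give overlapping matches.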

This can work if the content is controlled only by the site owner, for example a company site. But if the site allows user input, such as comments or feedback, that will skew the whole thing. You might have a Chinese site where some user leaves a comment in French; that gets picked up, and the site shows as French.

Otherwise it could work OK. Just be aware of potential false positives.
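
One way to cut down on those false positives, if you are scripting this yourself, is to count hits per language and take the one with the most, instead of accepting any single match. The Python below is a sketch under that assumption; the word lists and the name `dominant_language` are hypothetical, and it only suits languages written with spaces between words.

```python
import re
from collections import Counter

# Hypothetical common-word lists, one set per language.
COMMON_WORDS = {
    "english": {"the", "and", "of", "to", "in"},
    "french": {"le", "la", "les", "et", "des"},
    "german": {"der", "und", "das", "nicht", "ist"},
}

def dominant_language(text):
    """Return the language with the most common-word hits, or None if there are none."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    hits = Counter()
    for token in tokens:
        for lang, common in COMMON_WORDS.items():
            if token in common:
                hits[lang] += 1
    if not hits:
        return None
    # The language with the highest hit count wins.
    return hits.most_common(1)[0][0]

page = ("The history of the town and the story of the people in the valley. "
        "Le pain et la confiture.")
print(dominant_language(page))
```

Here a single French comment on a mostly English page is outvoted by the English hits, so the page is still reported as English.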

 
That was smart... Thanks!
 