Quick Scrapebox Question about inurl:
Sorry if this has been posted somewhere already, but I've just searched for 15 minutes and cannot for the life of me find the answer I'm looking for.
OK, so we know that with Scrapebox, if we want a list of websites with "dogs" in the URL, we use: inurl:dogs
Which might bring up:
www.dogsarefun.com
However, it's also bringing up results from web 2.0 sites, article directories, etc., which are useless to me:
http://www.articlespam.com/dogs-are-...st-friend.html
What is the Google search for scraping only websites that have the keyword in the domain name itself?
So it would only return domains with the keyword in them, instead of including inner-page results?
I.e.
www.*****dogs**.com
www.**dogs************.nl
www.***dogs****.co
etc etc
At the moment I have to scrape the list, then run it through Excel to grab what I want, and keep repeating that, but I have a funny feeling that if I searched correctly I could get a lot more tailored results.
Hope someone can enlighten me quickly
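
For anyone doing the Excel step by hand, here is a minimal sketch of the same filter in Python. It assumes the harvested URLs have been exported as plain strings; the function name `keyword_in_domain` and the example.com URL are made up for illustration. It keeps a URL only when the keyword sits in the host name rather than in the page path.

```python
from urllib.parse import urlparse


def keyword_in_domain(url, keyword):
    """Return True if the keyword appears in the host name, not just the path."""
    # urlparse only fills in the host when a scheme is present, so add one if missing.
    if "://" not in url:
        url = "http://" + url
    host = urlparse(url).hostname or ""
    # Ignore a leading "www." so it never counts towards a match.
    if host.startswith("www."):
        host = host[4:]
    return keyword.lower() in host


# Hypothetical harvest list: one real domain hit and one inner-page hit.
harvested = [
    "www.dogsarefun.com",
    "http://www.example.com/dogs-are-fun.html",
]

kept = [u for u in harvested if keyword_in_domain(u, "dogs")]
print(kept)
```

Note that this also keeps subdomain hits such as dogs.example.com, so a further check would be needed if those are unwanted.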
-
-
-
Re: Quick Scrapebox Question about inurl:
I just had a quick try with
which brings up all .com's. You could add other TLDs in the same way to make sure you only get the actual domains. You may as well add a "www" in front to exclude subdomains containing "dogs".
Hope this helps.
-
Re: Quick Scrapebox Question about inurl:
Thanks, that semi-worked. It's still bringing up some inner pages, but it's certainly better than the results I was getting.
I didn't know you could use wildcards in Google.
Also, how would you do it for two keywords?
inurl:. *dog*kennels*.com ??? (also, is the . required after the : )
-
-
Re: Quick Scrapebox Question about inurl:
I was looking for the same thing only yesterday and came across this thread. I didn't have the patience to keep testing it, but it seems it does work.
-
Re: Quick Scrapebox Question about inurl:

Originally Posted by
Thesiege84
Also how would you do it for 2 keywords?
inurl:. *dog*kennels*.com ??? (also, is the . required after the : )
If you only have two different keywords, I would just go for both possibilities, i.e.
Code:
inurl:*dog*kennels*.com
and
Code:
inurl:*kennels*dog*.com
as different footprints.
If you have a lot more keywords to check, a little Excel worksheet may be what you want.
There you should be able to generate all relevant combinations using a randomize function.
The . is not required; you should probably even leave it out, to be certain you also get domains without a subdomain, e.g. dogs.c0m.
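
As an alternative to the Excel worksheet, here is a rough sketch that builds every keyword ordering as a footprint. It swaps the randomize function for an exhaustive listing with `itertools.permutations`; the function name `build_footprints` and the choice of TLDs are assumptions for illustration only. How Google treats several wildcards inside inurl: is not guaranteed, so treat the output as candidates to test.

```python
from itertools import permutations


def build_footprints(keywords, tlds):
    """Build one inurl: footprint per keyword ordering and TLD."""
    footprints = []
    for order in permutations(keywords):
        # Wrap each keyword in wildcards, e.g. *dog*kennels*
        pattern = "*" + "*".join(order) + "*"
        for tld in tlds:
            footprints.append(f"inurl:{pattern}.{tld}")
    return footprints


footprints = build_footprints(["dog", "kennels"], ["com"])
for fp in footprints:
    print(fp)
# prints:
# inurl:*dog*kennels*.com
# inurl:*kennels*dog*.com
```

The number of footprints grows factorially with the number of keywords, so this only stays practical for a handful of them.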
-
Re: Quick Scrapebox Question about inurl:
What if I wanted to use YouTube to do the same thing?
-
-
Re: Quick Scrapebox Question about inurl:

Originally Posted by
Rua999
I was only looking for the same thing yesterday and came across
this thread. I didn't have the patience to continue on testing with it but it seems it does work

Thanks, I took a look, and it looks like a very long-winded way of doing it, with zone files etc.

Originally Posted by
Stufferizer
If you only have two different keywords, I would just go for both possibilities, i.e.
Code:
inurl:*dog*kennels*.com
and
Code:
inurl:*kennels*dog*.com
as different footprints.
If you have a lot more to check for, maybe a little excel worksheet is what you would like to use.
There you should have the possibility to generate all relevant combinations using a randomize function.
The . is not required, you should probably even leave it out to be certain to get all domains without subdomain, i.e. dogs.c0m.
I just tried: inurl:*kennels*dog*.com
However, if you try it you will see the second result is: www.retailmenot.com › Pets › Dog Supplies › Dog Crates
I'm trying to isolate only the domains, and neither "dogs" nor "kennels" is in that URL. Maybe you could have another look for me?
I'd REALLY appreciate it!
-
-
Re: Quick Scrapebox Question about inurl:
Dear Thesiege84,
the following weird situation occurs for me and I can now fully understand your problem:
If I am searching for
Code:
inurl:*dogs*kennels.com
, I get very decent results:
dogs-kennels-g-test.gif
However, if I am searching for
Code:
inurl:*dogs*kennels*.com
with one additional *, I basically just get URLs for the keywords kennels and dogs. I really don't get why, but what works in all cases is using just one of the keywords, i.e.
Code:
inurl:*kennels*.com
kennels-g-test.gif
So there seems to be a problem with too many *. To solve this, I suggest you search for each keyword alone and sort out duplicates using Excel (or via Scrapebox's duplicate filter). I can understand that Google puts a limit on too many wildcards etc. But what I absolutely do not understand is the following: if I put exactly the same footprint into Scrapebox and try to harvest (without proxies, to be comparable), this is what I get:
kennels-scrapeb0x-test.png
The result has absolutely nothing to do with the exact same search above. I think you are experiencing the same problem.
Maybe there is a problem with Scrapebox or with how I am using it. Hopefully one of the more experienced Scrapebox users can give us a hint here?!
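
If you go the one-keyword-at-a-time route, the merge and duplicate removal can also be sketched outside Excel. This is only an illustration: the function name `unique_domains` and the .example URLs are invented, and it assumes each harvest is available as a list of URL strings.

```python
from urllib.parse import urlparse


def unique_domains(*result_lists):
    """Merge several harvest lists into one list of host names, first seen first."""
    seen = set()
    ordered = []
    for results in result_lists:
        for url in results:
            # urlparse needs a scheme to recognise the host part.
            if "://" not in url:
                url = "http://" + url
            host = urlparse(url).hostname or ""
            # Treat www.site and site as the same domain.
            if host.startswith("www."):
                host = host[4:]
            if host and host not in seen:
                seen.add(host)
                ordered.append(host)
    return ordered


# Hypothetical results from two single-keyword harvests.
dogs_results = ["http://www.dogkennels.example/", "http://bigdogs.example/page.html"]
kennels_results = ["http://dogkennels.example/about", "http://kennelsplus.example/"]

merged = unique_domains(dogs_results, kennels_results)
print(merged)
```

Here dogkennels.example shows up in both harvests but is kept only once.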
-
-
Re: Quick Scrapebox Question about inurl:

Originally Posted by
BlackHatMack
What if i wanted to use YouTube to do the same thing
Hey BHMack, what do you mean by doing the same thing? Do you want to scrape YouTube video URLs?
Here you could just insert the footprint for YouTube (see image, cannot post links):
youtube-scrape.png
and you are ready to go. But be sure to disable Options -> Automatically Remove Duplicate Domains, since otherwise you will only ever get one link.
-
-
Re: Quick Scrapebox Question about inurl:
Thanks for putting so much effort into helping me.
I've gone down the route of searching for each single word on its own instead of the combined keywords. After combining all the results and processing them in Excel, it actually doesn't look that bad. I'm going to use these for guest blogging, so someone will have to go through the list manually anyway.
Thanks for the help!
EDIT: Shame Google don't have intld:
Last edited by Thesiege84; 01-25-2013 at 07:31 PM.