Scrapebox & Site:

Discussion in 'Black Hat SEO' started by msoon77, Jun 10, 2017.

  1. msoon77

    msoon77 Registered Member

    Joined:
    Jan 16, 2010
    Messages:
    71
    Likes Received:
    2
    Gender:
    Male
    Occupation:
    SEO Consultant
    Location:
    Australia
    Hi guys,

    Just wondering. Have any of you managed to scrape Google for more than 600 URLs using just "site:" with Scrapebox?

    I've tried several domains and even changed proxy providers, and the results always seem to come back at around 570 URLs.
     
  2. loopline

    loopline Jr. VIP Jr. VIP

    Joined:
    Jan 25, 2009
    Messages:
    3,810
    Likes Received:
    2,030
    Gender:
    Male
    Google soft-caps advanced queries, typically in the 300 to 600 range. I've seen results lower than 300 for some queries. They figure if you haven't found it by X hundred results, you're not going to find what you're looking for. Bear in mind Google is geared toward a real person using a browser.

    All you have to do is tack on keywords, like

    site:domain.com
    site:domain.com a
    site:domain.com b
    site:domain.com c
    site:domain.com 1
    site:domain.com 2
    site:domain.com 3
    site:domain.com green
    site:domain.com cloud
    etc...

    That nets you different sets of results from the Google database, and then you just remove duplicate URLs when you are done.
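
The query list and the final de-duplication step can be sketched in Python. This is only an illustration of the idea, not how Scrapebox does it internally; the function names `build_site_queries` and `dedupe_urls` are made up for this example, and the code only builds query strings and cleans a list of URLs you already have.

```python
def build_site_queries(domain, modifiers):
    """Return the bare site: query followed by one query per modifier."""
    base = f"site:{domain}"
    return [base] + [f"{base} {m}" for m in modifiers]


def dedupe_urls(urls):
    """Drop duplicate URLs (exact match after trimming), keeping first-seen order."""
    seen = set()
    unique = []
    for url in urls:
        key = url.strip()
        if key and key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


if __name__ == "__main__":
    # Hypothetical modifiers, mirroring the letters, digits, and words above.
    for query in build_site_queries("domain.com", ["a", "b", "1", "green"]):
        print(query)
```

Paste the generated queries in as keywords, harvest, then run the combined results through the duplicate removal. Note that the sketch treats URLs as duplicates only when they match exactly; http/https or trailing-slash variants would need extra normalization.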
     
    • Thanks x 3
  3. jamie3000

    jamie3000 Supreme Member

    Joined:
    Jun 30, 2014
    Messages:
    1,377
    Likes Received:
    626
    Occupation:
    Finance coder looking for semi-retirement
    Location:
    uk
    Advanced search operators like site: need massive delays between HTTP requests. As in 90 seconds plus, the last time I did it, if I remember correctly.
     
    • Thanks x 1
  4. msoon77

    msoon77 Registered Member

    Thanks Loopline. Much appreciated.

    Just curious though, am I going about this the right way, if I just want to find out how many and which URLs are in Google's index? Or is there a better method besides what we've discussed here?
     
  5. msoon77

    msoon77 Registered Member

    Thanks Jamie. What happens when you go less than 90 seconds? Banned Proxies?
     
  6. jamie3000

    jamie3000 Supreme Member

    Depends on how many queries you want to send, but it's normally a captcha challenge. Best of luck sir :)
     
  7. Nargil

    Nargil Jr. VIP Jr. VIP

    Joined:
    May 10, 2012
    Messages:
    4,661
    Likes Received:
    2,997
    Location:
    Europe
    A tip - Do not scrape Google. It's pointless honestly. Add Yahoo and Bing and you are good to go.
     
  8. msoon77

    msoon77 Registered Member

    Thanks guys for chiming in :)
     
  9. loopline

    loopline Jr. VIP Jr. VIP

    You can use the Google Competition Finder to find out how many results show up for a given query like site:domain.com, but as for finding which ones show up, using the site: operator is probably the best way.

    Alternatively, you could crawl the site with Scrapebox and run all the resulting URLs through the index checker, but it's probably a LOT easier to do the site: query.

    The reason, of course, is that with site: you can get up to 100 results per query, while the index checker is 1 query per URL, so you're going to get blocked up to 100 times faster or it's going to take up to 100 times longer to complete.
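
To put rough numbers on that comparison, here is a small back-of-the-envelope calculation. It assumes the best case of a full 100 results per site: query and ignores the soft cap discussed earlier in the thread; `queries_needed` is a hypothetical helper written for this example, not part of Scrapebox.

```python
import math


def queries_needed(url_count, results_per_query=100):
    """Compare request counts: paged site: queries vs. one index check per URL."""
    site_queries = math.ceil(url_count / results_per_query)  # best case, full pages
    index_checks = url_count                                 # 1 query per URL
    return site_queries, index_checks


if __name__ == "__main__":
    site_queries, index_checks = queries_needed(570)
    print(f"site: queries: {site_queries}, index checks: {index_checks}")
```

For the roughly 570 URLs mentioned in the first post, that works out to 6 site: queries against 570 index-checker queries, which is where the "up to 100 times" figure comes from.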
     
  10. msoon77

    msoon77 Registered Member

    Thanks Loopline. Never tried Google competition finder before - just realised it was part of Scrapebox. I'll give it a shot!
     
  11. loopline

    loopline Jr. VIP Jr. VIP

    You're welcome. Cheers!