
Scrapebox Check Indexed - How Accurate Is it?

Discussion in 'Black Hat SEO Tools' started by msoon77, Jun 1, 2017.

  1. msoon77

    msoon77 Registered Member

    Joined:
    Jan 16, 2010
    Messages:
    71
    Likes Received:
    2
    Gender:
    Male
    Occupation:
    SEO Consultant
    Location:
    Australia
    Hi guys,

    I've been trying to check a bulk list of URLs for indexation, to see exactly which pages of a client's website are being indexed and which are not.

    Scrapebox's Check Indexed feature returned a list of Yes and No results, but it doesn't seem accurate. Manually checking with the "inurl:" search operator returns different results about half the time.

    I contacted Scrapebox support and they said that their tool uses "info:".

    Any opinions as to how accurate this is?
     
  2. ebiz101

    ebiz101 Jr. VIP Jr. VIP

    Joined:
    Feb 16, 2010
    Messages:
    579
    Likes Received:
    135
    I'd say it's pretty accurate, maybe around 90%, but there are some web 2.0 sites (foursquare, zotero, vimeo, etc.) that Scrapebox for some reason still labels as not indexed even though they are indexed.
     
  3. msoon77

    msoon77 Registered Member

    I'm seeing quite inconsistent results on my side.

    For instance, I run the info: operator like this: info:domain.com/au/page1.html (Scrapebox says Yes, indexed).

    Google returns:

    http://domain.com/US/page1.html instead. This is obviously a country variant page.

    But I want to know specifically whether the /AU/ page is indexed. In this case, with an "inurl:" operator, Google returns nothing (not indexed).

    Shouldn't Scrapebox be using "inurl:" instead of "info:" ?
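
    To illustrate the mismatch described in this post, here is a minimal Python sketch that only counts a URL as indexed when the result that comes back is the same URL that was asked for. It does not query Google; `normalize` and `is_exact_hit` are hypothetical helper names, and comparing paths case-insensitively is an assumption made because both /au/ and /AU/ appear above.

```python
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Reduce a URL to host + path (+ query) so two spellings can be compared.

    Assumptions: scheme and a leading 'www.' are ignored, a trailing slash
    is ignored, and the comparison is case-insensitive.
    """
    if "://" not in url:
        url = "http://" + url  # urlsplit needs a scheme to find the host
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/").lower()
    query = "?" + parts.query.lower() if parts.query else ""
    return host + path + query


def is_exact_hit(requested: str, returned_urls: list[str]) -> bool:
    """True only if one of the returned URLs is the requested URL itself."""
    target = normalize(requested)
    return any(normalize(u) == target for u in returned_urls)


# The case from this post: the /au/ page was requested, the /US/ page came back.
print(is_exact_hit("domain.com/au/page1.html",
                   ["http://domain.com/US/page1.html"]))
```

    With a check like this, a "Yes" that is really a country variant of the page would be reported as not indexed for the URL that was actually requested.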
     
  4. msoon77

    msoon77 Registered Member

  5. living2xl

    living2xl Jr. VIP Jr. VIP

    Joined:
    Dec 9, 2011
    Messages:
    1,634
    Likes Received:
    352
    Occupation:
    Sippin dat juice - Shout it louder!
    Location:
    Not sleeping!
    Home Page:
    Scrapebox is quite poor for checking whether a domain is indexed, because it checks the exact URL even when only the root domain is entered.
    e.g. if you enter domain.com as just the root domain, it will check whether domain.com itself is present. But if
    www.domain.com/*subpage
    bla.domain.com/*subpage
    are still in the index, it will still say it's not indexed.

    What I do right now: run site:domain.com through the harvester for all the domains I want to check, then use the "remove all urls containing from file:" feature to remove them from the original list, or trim.
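
    The domain-level idea above can be sketched outside Scrapebox in a few lines of Python. This is only an illustration under assumptions: the harvested URLs are taken as given (nothing is scraped here), and `host_of` and `unindexed_domains` are hypothetical names. A domain counts as indexed if any harvested URL sits on it or on one of its subdomains.

```python
from urllib.parse import urlsplit


def host_of(url: str) -> str:
    """Return the lower-cased host of a URL, tolerating a missing scheme."""
    if "://" not in url:
        url = "http://" + url
    return (urlsplit(url).hostname or "").lower()


def unindexed_domains(domains: list[str], harvested_urls: list[str]) -> list[str]:
    """Return the domains for which no harvested URL was found.

    A harvested URL on www.domain.com or bla.domain.com counts as a hit
    for domain.com, which is the point of the site: approach.
    """
    hosts = {host_of(u) for u in harvested_urls}
    missing = []
    for domain in domains:
        root = domain.strip().lower()
        if root.startswith("www."):
            root = root[4:]
        hit = any(h == root or h.endswith("." + root) for h in hosts)
        if not hit:
            missing.append(domain)
    return missing


harvested = ["http://www.domain.com/subpage", "https://bla.domain.com/subpage"]
print(unindexed_domains(["domain.com", "other.net"], harvested))
```

    Matching on the host rather than on the full URL is what makes any indexed subpage count for the root domain.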
     
  6. msoon77

    msoon77 Registered Member

    Hey! That's a pretty good idea. Better than using info:!

    A site: query won't return the full index either, though. But yeah, great idea cross-checking with "remove all urls containing from file".

    Ideally, do you think Scrapebox should be using inurl: instead? Is there a way to configure Scrapebox to check a list of URLs using "inurl:"?
     
  7. living2xl

    living2xl Jr. VIP Jr. VIP

    Doesn't matter, Sherlock.
    As I said, I was checking whether the DOMAIN was indexed,
    and if it shows ANY page using my method, then the DOMAIN is INDEXED.

    But for your issue you may need to twist it a bit and use your head :)

    INURL is less accurate for this, as it will return many permutations other than your URL, even when you use the exact URL.
     
  8. msoon77

    msoon77 Registered Member

    Hi Living2XL,

    I tried your site:domain.com method and there is a problem. Here are my steps:

    1. Screaming frog crawl, stored all main URLs in txt file.

    2. Scrapebox - ran site:domain.com with Harvester

    3. "Remove URLs containing entries from ... " - loaded in the .txt (source URL list).

    4. It looks like the command trimmed the Harvester list instead, which doesn't help me identify which URLs are not indexed. It should be trimming the SOURCE list.

    Can you please advise?
     
  9. living2xl

    living2xl Jr. VIP Jr. VIP

    A root domain is not a URL.

    As I said, I am referring to whether the root domain is indexed, not individual URLs. It clearly won't work for URLs, because the harvest still returns results when a particular URL is not indexed, whereas site: returns 0 results if the root domain is not indexed.
     
  10. SEO

    SEO Jr. VIP Jr. VIP

    Joined:
    Jan 6, 2017
    Messages:
    822
    Likes Received:
    612
    If I understand what you're looking for, then this should fix it:
    1. Export your harvested list to text.
    2. Import/Replace the harvested list with your screaming frog source list.
    3. Run "Remove URLs containing entries from ... " using the harvester list you exported in step 1 of this list.
    What's left in the harvester should be URLs not found during your scrape.
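
    As a rough sketch of what the reversed lists compute, here is the same idea in Python. The substring behaviour of `remove_urls_containing` is an assumption based on the feature's name, not confirmed Scrapebox behaviour, and both function names are hypothetical. One thing to watch with substring matching: if the harvested list contains the bare homepage URL, every source URL that starts with it gets removed too, so an exact-match variant is shown alongside.

```python
def remove_urls_containing(urls: list[str], entries: list[str]) -> list[str]:
    """Keep only the URLs that contain none of the entries as a substring.

    Assumed to mirror "Remove URLs containing entries from ..."; the real
    feature may behave differently.
    """
    needles = [e.strip() for e in entries if e.strip()]
    return [u for u in urls if not any(n in u for n in needles)]


def not_found_exact(source_urls: list[str], harvested_urls: list[str]) -> list[str]:
    """Keep only the source URLs that do not appear verbatim in the harvest."""
    found = {u.strip() for u in harvested_urls}
    return [u for u in source_urls if u.strip() not in found]


source = [
    "http://domain.com/",
    "http://domain.com/a.html",
    "http://domain.com/b.html",
]

# Harvest found only a.html: both variants leave the homepage and b.html.
print(remove_urls_containing(source, ["http://domain.com/a.html"]))

# Harvest also found the homepage: substring matching now removes everything,
# while exact matching still reports b.html as not found.
harvested = ["http://domain.com/", "http://domain.com/a.html"]
print(remove_urls_containing(source, harvested))
print(not_found_exact(source, harvested))
```

    Either way, the list that gets trimmed is the source list, and whatever survives is the set of URLs the scrape did not find.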
     
  11. Shaunm

    Shaunm Jr. VIP Jr. VIP

    Joined:
    Mar 27, 2014
    Messages:
    1,965
    Likes Received:
    957
    Home Page:
    There was a thread about this over on the GSA forum a few months back. Scrapebox is not perfect, but in my experience it is more accurate than the way SER checks whether a link is indexed, so I stuck with it.
     
  12. msoon77

    msoon77 Registered Member

    Hey Shaunm - any chance you have a link to that thread?
     
  13. msoon77

    msoon77 Registered Member

    Thank goodness. Someone who understands :D Thanks bud. This is definitely what I was looking for. Should have thought of reversing the lists earlier. Nice.
     
  14. Shaunm

    Shaunm Jr. VIP Jr. VIP

    It starts about halfway through this thread.
     
  15. msoon77

    msoon77 Registered Member

    Shaunm - thank you kind sir. Appreciate all the help.
     
  16. SEO

    SEO Jr. VIP Jr. VIP

    You're very welcome. Glad I could help ;)