
How does Google know when to stop indexing?

Discussion in 'White Hat SEO' started by jovica888, Mar 13, 2016.

  1. jovica888

    jovica888 Regular Member

    Joined:
    Dec 15, 2011
    Messages:
    360
    Likes Received:
    79
    I have a website with about 6-7 billion pages. There is no spam or anything like that. My website is an IP address lookup service...

    Webmaster Tools shows that about 42 million pages are indexed:

    http://i.imgur.com/swkOnfG.png

    That is not even 1% of all my pages. What should I do with my website to get more pages indexed?
     
  2. Zwielicht

    Zwielicht Moderator Staff Member Moderator Jr. VIP

    Joined:
    Aug 31, 2013
    Messages:
    7,506
    Likes Received:
    13,270
    Gender:
    Male
    Occupation:
    Death
    Location:
    Riverside, California
    • Make sure your site doesn't have too much downtime.
    • Increase your site speed.
    • Ensure that you don't have canonicalisation issues.
    • Check your robots.txt.
    • Consider that Google, for one reason or another, doesn't find the other pages worthy of being indexed.
    • https://support.google.com/webmasters/answer/34441?hl=en

    I mean, this is just a shot in the dark, since I don't have much to work from and have never worked on an IP lookup site, but perhaps it's just the nature of your site's niche.
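    As a rough way to act on the robots.txt bullet above, the sketch below uses Python's standard `urllib.robotparser` to test whether a given URL is blocked for a crawler. The robots.txt content and the example.com URLs are made up for illustration, so substitute your own. This parser does simple prefix matching and does not implement Google's wildcard extensions, so treat it as a first sanity check only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; replace it with your site's real file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /ip/private/
"""


def is_crawlable(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt rules allow `agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/ip/8.8.8.8",
        "https://example.com/ip/private/10.0.0.1",
    ):
        status = "allowed" if is_crawlable(ROBOTS_TXT, url) else "blocked"
        print(f"{url}: {status}")
```

    If a whole section of lookup pages comes back as blocked, that alone could explain a low indexed count.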
     
    Last edited: Mar 13, 2016
  3. jovica888

    jovica888 Regular Member

    Well, I think I just need to wait, but I don't know what to do in the meantime.
    I made a sitemap:
    http://i.imgur.com/j7yeAv5.png
    Only 33 million of 2 billion pages are indexed...
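    For scale, the sitemaps.org protocol caps each sitemap file at 50,000 URLs and each sitemap index file at 50,000 sitemaps, so a site this size needs a lot of files. A small sketch of the arithmetic, using the figures from this post (the function name is only illustrative):

```python
from typing import Tuple

MAX_URLS_PER_SITEMAP = 50_000    # per-file limit in the sitemaps.org protocol
MAX_SITEMAPS_PER_INDEX = 50_000  # per-index limit in the sitemaps.org protocol


def sitemap_plan(total_urls: int) -> Tuple[int, int]:
    """Return (sitemap files, sitemap index files) needed for total_urls."""
    # Integer ceiling division avoids floating-point rounding on big counts.
    files = -(-total_urls // MAX_URLS_PER_SITEMAP)
    indexes = -(-files // MAX_SITEMAPS_PER_INDEX)
    return files, indexes


if __name__ == "__main__":
    submitted = 2_000_000_000  # pages in the sitemap, per the post
    indexed = 33_000_000       # pages reported as indexed, per the post
    files, indexes = sitemap_plan(submitted)
    print(f"sitemap files: {files:,}, index files: {indexes}")
    print(f"indexed share: {100 * indexed / submitted:.2f}%")
```

    With these numbers, 2 billion URLs works out to 40,000 sitemap files under one index, and 33 million indexed pages is 1.65% of what was submitted.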
     
  4. archon10

    archon10 BANNED BANNED

    Joined:
    Oct 10, 2011
    Messages:
    1,181
    Likes Received:
    11,553
    lol ain't no way you have billions of quality pages. You're either going to get slapped with a pure spam penalty, or Google won't index all of your pages.
     
    • Thanks x 5
  5. tb303

    tb303 Senior Member

    Joined:
    Dec 18, 2011
    Messages:
    804
    Likes Received:
    480
    I agree. If you have billions of auto-generated pages (going from OP's IP lookup example) then eventually you will trip the duplicate content filters.
     
  6. ChanzGrande

    ChanzGrande Elite Member

    Joined:
    Feb 16, 2008
    Messages:
    2,487
    Likes Received:
    1,178
    Occupation:
    Accountant
    Location:
    Northern Woods Counting Money
    This is billions upon billions of pages that are essentially exactly the same. They're never going to be indexed. There is nothing you can do, as this is a result of "thin" content. You want more pages indexed? Create a site with content worth indexing. It's just silly to expect these pages to rank. You should be thankful that any of that content got indexed, and not surprised when, within a couple of years, it gets removed from the index due to its blatant uselessness.

    Congrats on your preliminary success here, though. You must get some decent traffic from the millions of useless pages you have indexed, although ... really, how valuable is this visitor? I think most of them don't pay the fees for the more advanced lookup features, and they are unlikely to click ads. So you have a site with millions or billions of pages that almost nobody would ever find even if they were indexed, because the IP lookup niche is both saturated and generally relevant to only one person at a time. There's really nothing to that at all.

    Convert your website to something of value, and G will index its contents.

    One thing you could try is to add a couple of dynamic content sections to your site's template layout, and add blurb extracts based on spintax content. If you added valuable article/text/image content that rotated regularly and automatically in TWO spots, then when G came to any given page it would see DIFFERENT content on all the pages. While some of this content might repeat for other IP lookup queries, for the most part it might add enough deviation from one page to the next while G is crawling them to create the impression of slightly less "thin" content.
     
    • Thanks x 1
    Last edited: Mar 14, 2016
  7. jovica888

    jovica888 Regular Member

    Thank you guys for all the advice. The website I run is doing well, and I don't think it will ever be penalized, because some BIG websites like "http://whatismyipaddress.com/ip/8.8.8.8" have the same structure, an Alexa rank of 3000, and PR 7.

    My website already makes about $500 per month and has an Alexa rank of about 80,000. Google has never deleted my links from its index. Check this trend:
    http://i.imgur.com/jCOcp3Q.png

    What if I buy a website with PageRank 4-5 in the same niche and 301 redirect it to my website? Would that help?
    I think I just need to wait...
     
  8. archon10

    archon10 BANNED BANNED

    You realize that PR and Alexa have nothing to do with whether or not you'll get a penalty, right?

    You're never going to get all of those pages indexed.
     
    • Thanks x 5
  9. tb303

    tb303 Senior Member

    This is good advice. Increasing the content-to-repetition ratio of your site will help you more than links.
     
  10. jovica888

    jovica888 Regular Member

    Maybe I should make something like "Last Checked IPs" and then dynamically generate a list of 50-100 RANDOM IP addresses. This is a very fast process, and the list will always be different on each page...

    EDIT

    http://i.imgur.com/Ygt9ucA.png
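    A minimal sketch of what such a "Last Checked IPs" generator could look like, assuming the list really is random rather than drawn from actual lookups. The function name and the choice to skip private and reserved ranges are assumptions, not details from the post.

```python
import ipaddress
import random


def random_ipv4_list(count, rng=None):
    """Return `count` distinct, publicly routable IPv4 addresses as strings."""
    rng = rng or random.Random()
    seen = set()
    result = []
    while len(result) < count:
        candidate = ipaddress.IPv4Address(rng.getrandbits(32))
        # Skip private, loopback, multicast and other non-global ranges.
        if candidate.is_global and candidate not in seen:
            seen.add(candidate)
            result.append(str(candidate))
    return result


if __name__ == "__main__":
    how_many = random.randint(50, 100)  # list length varies per page view
    for ip in random_ipv4_list(how_many):
        print(ip)
```

    Passing a seeded `random.Random` makes the list reproducible, which can help when testing the page template.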
     
    • Thanks x 1
    Last edited: Mar 14, 2016
  11. ChanzGrande

    ChanzGrande Elite Member

    This is a start, but do consider two dynamic content sections. Even if one of them is this IP table ... yes, it will be different on all pages, but it will also just be a table of IP addresses. Over time G will easily see this as a continuation of your previous attempt at thin content. By adding a second, text-based dynamic table/widget/section to the template, you will be able to add actual keyword-rich contextual content that helps G differentiate any given page from another.

    This doesn't have to be complicated. It's not even about the volume of words. It's simply about deviation/delineation of page content. For example, you could write 5 small blurbs of about 150-200 words, spin them with TBS or another quality spinner, and generate a table of relevant text content with around 500-1000 entries. Put this in a .csv file or SQL database and let your site pull one entry for each page, or let the entries rotate completely dynamically. With a completely dynamic approach, each time the page is crawled it would generate completely different text content for that widget/section of your page. However, I think it would even be fine to just lock a particular entry from the .csv file or database as the "permanent" text for the box on any given page.

    The choice of implementation is yours: either a dynamic, ever-changing text box, or one that locks down per page, basically. Either way, by adding this text box you will delineate yourself from the hundreds of other sites just like yours out there. I learned this strategy from large porn sites, so I'm confident it does have value.
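    The two options described above, a fully dynamic text box versus one locked to each page, can be sketched roughly as follows. This is not how TBS or any particular spinner works internally. The template text, the function names and the idea of seeding from a hash of the page's IP are all assumptions made for illustration.

```python
import hashlib
import random
import re

# Matches a spintax group that contains no nested braces, e.g. "{a|b|c}".
INNERMOST = re.compile(r"\{([^{}]*)\}")


def spin(template: str, rng: random.Random) -> str:
    """Expand spintax by repeatedly resolving the innermost groups."""
    while True:
        template, replaced = INNERMOST.subn(
            lambda m: rng.choice(m.group(1).split("|")), template
        )
        if replaced == 0:
            return template


def locked_blurb(template: str, page_key: str) -> str:
    """'Locked' variant: the same page key always produces the same text."""
    digest = hashlib.sha256(page_key.encode("utf-8")).digest()
    seed = int.from_bytes(digest[:8], "big")
    return spin(template, random.Random(seed))


# Hypothetical template; real blurbs would be 150-200 words each.
TEMPLATE = (
    "{An|Every} IP address {identifies|labels} a {device|host} "
    "on a {network|{public|private} network}."
)

if __name__ == "__main__":
    print("dynamic:", spin(TEMPLATE, random.Random()))  # changes on every call
    print("locked: ", locked_blurb(TEMPLATE, "8.8.8.8"))  # stable per page
```

    Seeding from a hash of the page key avoids storing a per-page choice in the .csv file or database, at the cost of the text changing whenever the template itself is edited.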
     
  12. Bane Bentley

    Bane Bentley Jr. VIP Jr. VIP

    Joined:
    Jun 13, 2013
    Messages:
    181
    Likes Received:
    36
    Try using an indexation service, Inozemec from this forum runs a few nice services.
     
  13. jovica888

    jovica888 Regular Member

    OK, I found 10-15 articles about IP addresses, domain names, Whois, name registrants, private IPs, dynamic IPs...
    I copied a few sentences from each original article and added a nofollow link to the original...
    Then I set up rotation. Now PHP randomly chooses 1 article and echoes it in the footer...
    Is this good?
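    For readers who want to picture the rotation, here is a rough equivalent, sketched in Python rather than the PHP mentioned in the post. The snippet text, the example.com URLs and the function name are placeholders, not the actual articles or code used on the site.

```python
import html
import random

# Placeholder snippets; the real site reportedly uses 10-15 article excerpts.
SNIPPETS = [
    {
        "text": "A private IP address is used inside a local network.",
        "source": "https://example.com/articles/private-ip",
    },
    {
        "text": "A dynamic IP address can change each time a device connects.",
        "source": "https://example.com/articles/dynamic-ip",
    },
]


def footer_snippet(snippets, rng=None):
    """Pick one snippet at random and render it with a nofollow source link."""
    rng = rng or random.Random()
    item = rng.choice(snippets)
    return '<p>{text} <a href="{url}" rel="nofollow">Source</a></p>'.format(
        text=html.escape(item["text"]),
        url=html.escape(item["source"], quote=True),
    )


if __name__ == "__main__":
    print(footer_snippet(SNIPPETS))
```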
     
  14. tb303

    tb303 Senior Member

    It's a start, but to me 15 article snippets spread over 6 billion (was it?) pages doesn't sound like it will go far.

    Do you have geolocation data displayed on your IP pages? Maybe you could extend that somehow by scraping some additional info about the location and displaying it as well when you generate the page.

    Also, is the external link necessary? That's a lot of new OBLs.
     
  15. jovica888

    jovica888 Regular Member

    Yes, I do. I use the Google Maps API and have location information for every IP.

    The link is nofollow. I wanted something like "fair use": I took a few sentences of text and linked to the original page. Is this a problem?
     
  16. seowolve

    seowolve Regular Member

    Joined:
    Feb 6, 2016
    Messages:
    222
    Likes Received:
    19
    Google knows everything.
     
    • Thanks x 1
  17. Entr0py

    Entr0py Newbie

    Joined:
    Mar 15, 2016
    Messages:
    9
    Likes Received:
    1
    Occupation:
    something?
    Location:
    Somewhere
    check your downtime + robots.txt
    best of luck :D
    ~Entr0py
     
  18. jovica888

    jovica888 Regular Member

    Downtime is fine. I have a VPS and check the logs daily. Robots.txt is OK too. I don't know what I should do with robots.txt...

    I have a small drop in earnings ATM. I think that is because of the weekend. I will check the situation on Tuesday or Wednesday...
     
  19. lokiys

    lokiys Regular Member

    Joined:
    Jan 21, 2011
    Messages:
    214
    Likes Received:
    37
    Location:
    127.0.0.1
    As far as I know, it simply means that your other pages have no value, or less value than the others.
    What is your Webmaster Tools telling you? Don't you have any errors there?