
Pages get indexed but not cached, what does it mean?

Discussion in 'Black Hat SEO' started by nobita, Aug 15, 2009.

  1. nobita

    nobita Registered Member

    Joined:
    Jan 16, 2009
    Messages:
    52
    Likes Received:
    2
    I built several free blogs to put links to my main sites. I found that many of them get indexed by GG (found when searching with inurl:) but have no cache. Why? Does GG know that the blog posts are low quality? If yes, which factor does it use to determine this?
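    For reference, here's roughly how I check each URL by hand. Just a quick sketch in Python that prints the queries I run (site: / cache: / inurl: are Google's search operators; abc.example.com is a made-up blog, not a real one):
    Code:
    # Print the Google queries I run by hand for each URL.
    # abc.example.com is a made-up example blog, not a real one.
    urls = [
        "abc.example.com",
        "abc.example.com/post1.html",
    ]

    for u in urls:
        # site: shows whether the page is in the index at all
        print("indexed? -> http://www.google.com/search?q=site:" + u)
        # cache: shows whether G keeps a cached copy of it
        print("cached?  -> http://www.google.com/search?q=cache:" + u)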
     
  2. keinehabe

    keinehabe Supreme Member

    Joined:
    Nov 4, 2008
    Messages:
    1,207
    Likes Received:
    472
    Gender:
    Male
    Occupation:
    -= CEO =-
    Location:
    Heaven
    Home Page:
    well no ... actually I love the
    Code:
    <meta name="robots" content="noarchive"/>
    
    it's far better :) ... from my experience and tests over time, it seems noarchive forces pages to be crawled faster (that may be a false impression, but it's what I think) ... let me explain some ideas in short:
    How do pages (sites) get indexed? Easy (a toy sketch follows the list):
    1. the spider comes to the URL
    2. the spider crawls (reads) it
    3. the data is sent to the datacenter
    4. the data is added to the index
    5. the data is displayed to the users
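    If it's easier in code, here's that little pipeline as a toy sketch (just the idea, NOTHING like G's real internals of course):
    Code:
    # Toy sketch of the crawl -> index -> serve pipeline above.
    # Just the idea -- nothing like G's real internals.
    index = {}  # the '' datacenter '': word -> set of URLs

    def crawl(url, text):
        # steps 1-2: the spider fetches and '' reads '' the page
        for word in text.lower().split():
            # steps 3-4: the data is sent off and added to the index
            index.setdefault(word, set()).add(url)

    def query(word):
        # step 5: results are served from the index, not the live page
        return index.get(word.lower(), set())

    crawl("http://abc.example.com/", "cheap blue widgets")
    print(query("widgets"))  # {'http://abc.example.com/'}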

    I hope everyone gets this small picture, right? Well, what does the cache actually mean? It means the named page has a '' replica '' on the datacenter, an exact copy of the page. That noarchive tag saves you and anyone else from loads of things. For example, I don't want my layout / page '' saved '' by them! Easy as that, even for pages whose code has some dirty stuff I don't want to '' share '' with everyone ;) ... yes, this happens, guys and gals. Everyone here loves to share nice things with others, and I like it too, but you know what? Some stuff I prefer to keep personal ;) why? Call me selfish and that's it, if you like; I don't care ... And one more thing, it's a good way to hide cloaked content of course :why: ... Everyone knows the strongest backlinks come from '' long cached URLs '', at least when we're speaking about G'. Well ... that can be true ... but guess what, I don't like to give anyone links :) if I could, I'd take all the links from everywhere and point them at my own site lol - yeah, I know, one more time selfish ... guess what, I DON'T CARE :) ...

    In short, noarchive is cool. It's much harder to rank with it, I know it involves loads of work and can look '' silly '', but from what I've tested over 2 years it seems far better ;)
     
  3. nobita

    nobita Registered Member

    Joined:
    Jan 16, 2009
    Messages:
    52
    Likes Received:
    2
    I think you're speaking about your main site, which you don't wish to be cached. But what I'm speaking about is the free blogs that I create just to build links to my main site. So I don't care whether anyone copies them or not. I just want G to crawl the links. And, as G doesn't cache them, it may indicate G hasn't actually crawled them yet.
     
  4. keinehabe

    keinehabe Supreme Member

    Joined:
    Nov 4, 2008
    Messages:
    1,207
    Likes Received:
    472
    Gender:
    Male
    Occupation:
    -= CEO =-
    Location:
    Heaven
    Home Page:
    I'm back here since one of the forum users just PMed me a question about this ... People ... here are some things to make clear. First of all, cache control and noarchive control over sites. It's true that for link wheels / splogs / everything involved in the '' link building process '', no one cares if the pages get copied / duplicated and so on ...

    On Google, when you search for any keyword, you will see something similar to this picture
    Code:
    http://img142.imageshack.us/img142/8782/examplest.jpg
    It's called '' cache '' by them just because the page is '' saved '' on their servers during the indexing process (an actual copy of the page).

    One more point: I don't know who told you guys that once a page is crawled it will actually be added to the index :) THAT'S WRONG! Crawling and indexing are different processes! Actually, most of the G' folks who speak and explain their dirty business to a large audience always try to distinguish the 3 steps of the process (see the sketch below for telling the first two apart):
    A => crawling / B => indexing / C => serving the queries from their index!
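    One way to actually see the difference: crawling you can verify in YOUR own server logs, indexing you can only see in the index itself. A rough sketch (it assumes a combined-format access log at a typical path, adjust for your own server):
    Code:
    # Rough sketch: crawling is visible in YOUR logs, indexing is not.
    # /var/log/apache2/access.log is a typical combined-format log path;
    # adjust it for your own server.
    googlebot_paths = set()

    with open("/var/log/apache2/access.log") as log:
        for line in log:
            # Googlebot identifies itself in the user-agent field
            if "Googlebot" in line:
                googlebot_paths.add(line.split()[6])  # the requested path

    # These pages were CRAWLED -- whether they got INDEXED is another
    # matter, and only a site: / inurl: query against G can tell you that.
    for path in sorted(googlebot_paths):
        print(path)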

    How these processes work from a technical point of view isn't important to them, or I think to anyone, but you guys have to understand these processes, since your main goal is to manipulate the results of the queries ;) ... from there will come your traffic, and maybe the sales or clicks you dream about ...

    Now ... the second thing ... the '' cache-control '' for the browsers:
    Code:
    <META HTTP-EQUIV="CACHE-CONTROL" CONTENT="NO-CACHE"> 
    This piece of code in the files delivered by the server controls how the data will be used after the request. Here are some things that need saying to understand how and why these are '' a must '':
    HTTP 1.1 allowed values = PUBLIC | PRIVATE | NO-CACHE | NO-STORE.
    public - may be cached in public shared caches
    private - may only be cached in a private (browser) cache
    no-cache - a cached copy may not be reused without revalidating with the origin server
    no-store - may not be stored in any cache at all

    The directive CACHE-CONTROL: NO-CACHE indicates that cached information should not be used and that requests should instead be forwarded to the origin server. This directive has the same semantics as PRAGMA: NO-CACHE.
    Clients SHOULD include both PRAGMA: NO-CACHE and CACHE-CONTROL: NO-CACHE when a no-cache request is sent to a server not known to be HTTP/1.1 compliant.
    It may be better to specify cache commands in HTTP headers than in meta statements, since headers can influence more than the browser: proxies and other intermediaries that may cache information also obey them (see the little demo below).
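    If you want to play with that, here's a tiny demo of sending those same directives as real HTTP headers instead of meta tags, using Python's standard library (just a throwaway local server, not production code):
    Code:
    # Tiny demo: serve a page with the cache directives discussed above
    # sent as real HTTP headers, so proxies see them too (meta tags only
    # reach clients that actually parse the HTML).
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class NoCacheHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            # HTTP/1.1 caches: do not reuse without revalidating
            self.send_header("Cache-Control", "no-cache")
            # older HTTP/1.0 caches understand Pragma instead
            self.send_header("Pragma", "no-cache")
            self.end_headers()
            self.wfile.write(b"<html><body>hello</body></html>")

    if __name__ == "__main__":
        HTTPServer(("localhost", 8080), NoCacheHandler).serve_forever()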


    So, via these specific headers served by the server, spiders will act in accordance with them. You can control how often the content delivered by the server is '' set to expire '' / '' set to be cached '' and so on ... but this is only about how the documents are delivered by the server and '' read '' by the agent who made the request ...

    Technically speaking, such directives can be mixed, and sometimes they must be mixed, BUT at the end of the day the whole process must be understood backwards, simply because you need the results ;) ... that's the point of understanding the full process lol.

    To answer your question, nobita: the truth is you will never actually know, with any command (query on Google / Yahoo or MSN), how many links have already been '' seen '' and '' counted '' for your main site. You know why? Well, first of all because the '' index '' is permanently updated; second, because even if a page is crawled it doesn't mean it will actually be included in the index; and third, even after a page is included in the index it doesn't mean it will stay there forever ;)
     
  5. hunar

    hunar Regular Member

    Joined:
    Jan 1, 2009
    Messages:
    479
    Likes Received:
    178
    Location:
    Minnesota
    This has been happening to me as well, with about 8 of my sites. The problem, for me at least, is that I use the Google XML Sitemaps plugin for WordPress, and it has been changing my robots.txt to disallow all robots from crawling my site. So my robots.txt file was set to disallow all; if you know how the Google XML Sitemaps plugin works, it generates the robots.txt file for you. I'm thinking the plugin isn't compatible with WordPress 2.8.4 or whatever.
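    If you want to check whether your own robots.txt is locking everything out, here's a quick way with Python's standard library (yourblog.example.com is a placeholder, put your own domain in):
    Code:
    # Quick check: does your robots.txt block Googlebot from everything?
    # yourblog.example.com is a placeholder -- use your own domain.
    import urllib.robotparser

    rp = urllib.robotparser.RobotFileParser()
    rp.set_url("http://yourblog.example.com/robots.txt")
    rp.read()

    # If the plugin wrote "User-agent: *" / "Disallow: /", this prints
    # False for every page on the site.
    print(rp.can_fetch("Googlebot", "http://yourblog.example.com/"))
    print(rp.can_fetch("Googlebot", "http://yourblog.example.com/some-post/"))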
     
  6. nobita

    nobita Registered Member

    Joined:
    Jan 16, 2009
    Messages:
    52
    Likes Received:
    2
    But in that case, we can be quite sure that the links on that page are not counted.

    And in the case where the page is indexed but not cached, G must have some reason not to cache it. And it's likely that, for the same reason, the links on that page are not counted either. Do you think so?

    I'm talking about free blogs on Blogsome, LiveJournal, Multiply, etc. Some of them are cached, but most are not. I think it has nothing to do with the meta tag or the robots.txt.

    To give more details, say I created a blog abc.livejournal.com and added 2 posts, so I have at least 3 URLs:

    abc.livejournal.com
    abc.livejournal.com/post1.html
    abc.livejournal.com/post2.html

    The situation is that only some of the pages are indexed, say abc.livejournal.com/post1.html.
    The root page and post2.html are not even indexed. And even though post1 is indexed, it's not cached.
     
  7. keinehabe

    keinehabe Supreme Member

    Joined:
    Nov 4, 2008
    Messages:
    1,207
    Likes Received:
    472
    Gender:
    Male
    Occupation:
    -= CEO =-
    Location:
    Heaven
    Home Page:
    well no ... the way you put the problem right now, it's not like that, man. A link is a link; it doesn't matter if the page is indexed, or if the page is blue or red :) what's IMPORTANT is the link ... also, it's not important whether the link is dofollow or nofollow; what matters is that it's still a link.

    ALL links are counted. Some of them carry more weight than others; some of them can count for PR passing, other links can count just for the anchor, and other links don't count for the anchor ... ALL LINKS you can bring to your site are counted as '' backlinks ''.

    Those are just false rumors and bullshit spread by folks who need something to talk about. Spiders from the search engines '' read '' the pages, then transmit the information they got to the datacenters; at the datacenters, the information the spiders bring is '' cataloged '', and then the complex algorithms add it to the index for future use by the queries. The spiders know when they see a link from page A to page B; that's what's important, the connections between the pages, and you must understand that simple fact: ALL LINKS COUNT. Also very important are backlinks from the same '' niche '' / '' on topic '' ... but that's another story ...

    What I can tell you for sure is ALL LINKS COUNT! It doesn't matter where they come from, where they're hosted, what color the page they sit on is, or the gender of that page's owner lol ...