
Java Tip: Cache content with my class...

Discussion in 'Black Hat SEO' started by ipopbb, Feb 25, 2008.

  1. ipopbb

    ipopbb Power Member

    Joined:
    Feb 24, 2008
    Messages:
    626
    Likes Received:
    844
    Occupation:
    SEO & Innovative Programming
    Location:
    Seattle
    Home Page:
    Nerd Level: 5 (you need to be a savvy programmer to benefit from this post)

    Sorry to all you PHP fiends out there... I work in Java for enterprise scalability. Plus my PHP chops are weak because it has been years since I last used it. Perhaps one of the kind PHP gurus can port my class if popularity demands?

    Why cache?

    If you scrape content from other people's websites, you will want to cache it, both to speed up your own site and to avoid pissing off your content sources. Failure to cache usually gets you in trouble in the fullness of time if you like to work at a very large scale like I do (websites with millions of auto-generated pages). You usually only need to cache a few hundred entries, because the popular stuff is a small fraction of the overall content.

    ConcurrentCacheMap

    It implements ConcurrentMap (which extends Map), supports generics, evicts entries in FIFO order once the max size is reached, and optionally expires cached entries after a set time. It has thousands of hours of uptime in my applications and is used in websites that see a million+ visitors a day. Also... no one else has this class. I'm a HUGE fan of concurrent thread pooling and wrote this class myself to cut out the 500 pound gorilla classes that do caching as an application layer.
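    To make the FIFO behavior concrete, here is a minimal single-threaded sketch. It is not how ConcurrentCacheMap works internally: it swaps in the standard LinkedHashMap with removeEldestEntry, and the class name FifoMapSketch is made up for illustration. It is also not thread-safe, so treat it only as a picture of the eviction order.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration, not part of ConcurrentCacheMap:
// a single-threaded map that drops its oldest entry once it is full.
class FifoMapSketch<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    FifoMapSketch(int maxEntries) {
        // accessOrder = false keeps insertion order, so eviction is FIFO, not LRU
        super(16, 0.75f, false);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by LinkedHashMap after every insertion
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        FifoMapSketch<String, String> cache = new FifoMapSketch<>(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");      // reading does not protect "a" from eviction
        cache.put("c", "3"); // over the limit: the oldest insertion ("a") goes
        System.out.println(cache.keySet()); // prints [b, c]
    }
}
```

    Note that a read does not refresh an entry's position. That is the difference between FIFO and LRU, and it matches the queue-based eviction described above.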

    Example 1: A cache that holds 500 entries and evicts the oldest (FIFO) when the limit is hit.

    Code:
    
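    public class MyClass {
        private final ConcurrentCacheMap<String,Object> cache;
        public MyClass(){ cache = new ConcurrentCacheMap<String,Object>(500); }
        public Object doSomeShit(String input){
            // one get() instead of containsKey() + get(): the entry cannot vanish in between
            Object result = cache.get(input);
            if(result == null){
                // do the heavy lifting here; "input" is only a placeholder for the real result,
                // which must be non-null because the backing ConcurrentHashMap rejects null values
                result = input;
                cache.put(input,result);
            }
            return result;
        }
    }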
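    If you want to try the get-or-compute pattern from Example 1 without my class on the classpath, here is a rough stand-alone sketch. It assumes a plain ConcurrentHashMap as a stand-in for the cache (so no size limit and no expiry), and GetOrComputeSketch, heavyLifting, and the counter are names invented for the example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-alone version of Example 1; a plain ConcurrentHashMap
// stands in for ConcurrentCacheMap, so nothing is ever evicted here.
class GetOrComputeSketch {
    private final ConcurrentMap<String, Object> cache = new ConcurrentHashMap<>();
    final AtomicInteger computations = new AtomicInteger(); // counts cache misses

    // Stand-in for the heavy lifting (a scrape, a database query, ...)
    private Object heavyLifting(String input) {
        computations.incrementAndGet();
        return input.toUpperCase();
    }

    Object lookup(String input) {
        Object result = cache.get(input);
        if (result == null) {
            result = heavyLifting(input);
            // If another thread stored a value first, keep theirs
            Object earlier = cache.putIfAbsent(input, result);
            if (earlier != null) {
                result = earlier;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        GetOrComputeSketch s = new GetOrComputeSketch();
        s.lookup("seattle");
        s.lookup("seattle");
        System.out.println(s.computations.get()); // prints 1: the second call hit the cache
    }
}
```

    Two threads that miss at the same moment can still both do the heavy lifting once each; putIfAbsent only guarantees that they end up sharing a single cached value.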
    
    Example 2: A cache that holds 500 entries, evicts the oldest (FIFO) when the limit is hit, and expires entries older than 15 minutes, but checks for expired entries at most once every 2 minutes.

    Code:
    
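    public class MyClass {
        private final ConcurrentCacheMap<String,Object> cache;
        // arguments: max entries, time to live (15 minutes in ms), cleaning interval (2 minutes in ms)
        public MyClass(){ cache = new ConcurrentCacheMap<String,Object>(500,1000L * 60L * 15L,1000L * 120L); }
        public Object doSomeShit(String input){
            // one get() instead of containsKey() + get(): the entry cannot vanish in between
            Object result = cache.get(input);
            if(result == null){
                // do the heavy lifting here; "input" is only a placeholder for the real result,
                // which must be non-null because the backing ConcurrentHashMap rejects null values
                result = input;
                cache.put(input,result);
            }
            return result;
        }
    }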
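    The millisecond arithmetic in Example 2 is easy to get wrong by a factor of 60, so here is a small sketch of the same two durations written with java.util.concurrent.TimeUnit. The class and constant names are made up for the example; the values are the ones used above.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical holder for the two durations used in Example 2
class CacheTimings {
    // 15 minutes: same value as 1000L * 60L * 15L
    static final long TIME_TO_LIVE_MS = TimeUnit.MINUTES.toMillis(15);
    // 2 minutes: same value as 1000L * 120L
    static final long SPRING_CLEANING_INTERVAL_MS = TimeUnit.MINUTES.toMillis(2);

    public static void main(String[] args) {
        System.out.println(TIME_TO_LIVE_MS);             // prints 900000
        System.out.println(SPRING_CLEANING_INTERVAL_MS); // prints 120000
    }
}
```

    Either constant can be passed straight to the three-argument constructor in place of the hand-written products.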
    
    And of course... here is the source code for the class.

    Code:
    
    import java.util.Collection;
    import java.util.Map;
    import java.util.Set;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentLinkedQueue;
    import java.util.concurrent.ConcurrentMap;
    public class ConcurrentCacheMap<K,V> extends Thread implements ConcurrentMap<K,V> {
        private ConcurrentLinkedQueue<K> queue;
        private ConcurrentHashMap<K,V> cache;
        private ConcurrentHashMap<K,Long> cacheDates;
        private Long lastSpringCleaning;
        private Long timeToLive;
        private Long springCleaningInterval;
        private Integer size;
        // Defaults: 5000 entries, 15 minute time to live, 1 minute cleaning interval
        public ConcurrentCacheMap(){ this(5000); }
        public ConcurrentCacheMap(int size){ this(size,1000L * 60L * 15L); }
        public ConcurrentCacheMap(int size,long timeToLive){ this(size,timeToLive,1000L * 60L); }
        public ConcurrentCacheMap(int size,long timeToLive,long springCleaningInterval){
            queue = new ConcurrentLinkedQueue<K>();
            cache = new ConcurrentHashMap<K,V>();
            this.size = size;
            lastSpringCleaning = System.currentTimeMillis();
            cacheDates = new ConcurrentHashMap<K,Long>();
            this.timeToLive = timeToLive;
            this.springCleaningInterval = springCleaningInterval;
        }
        public V get(Object key){ V result = cache.get(key); housekeeping(); return result; }
        public boolean isCached(K key){ housekeeping(); return cache.containsKey(key);}
        public void clear(){ queue.clear(); cache.clear(); cacheDates.clear();}
        public Set<K> keySet(){ housekeeping(); return cache.keySet(); }
        public Collection<V> values(){ housekeeping(); return cache.values();}
        public Set<Map.Entry<K,V>> entrySet(){ housekeeping(); return cache.entrySet(); }
        public V replace(K key, V value){ housekeeping(); return cache.replace(key,value); }
        public boolean replace(K key, V oldValue,V newValue){ housekeeping(); return cache.replace(key,oldValue,newValue); }
        public void setMaxSize(Integer size){ this.size = size; }
        public void setTimeToLive(Long timeToLive){ this.timeToLive = timeToLive; }
        public void setSpringCleaningInterval(Long springCleaningInterval){ this.springCleaningInterval = springCleaningInterval; }
        public Integer getMaxSize(){ return size; }
        public Long getTimeToLive(){ return timeToLive; }
        public boolean containsKey(Object key){return cache.containsKey(key);}
        public boolean containsValue(Object value){return cache.containsValue(value);}
        public boolean isEmpty(){return cache.isEmpty();}
        public int size(){ return cache.size(); }
        public Long getSpringCleaningInterval(){ return springCleaningInterval; }
        public V remove(Object key){
            V result = cache.remove(key);
            queue.remove(key);
            cacheDates.remove(key);
            housekeeping();
            return result;
        }
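        public boolean remove(Object key,Object value){
            // cache.remove(key,value) is atomic and returns false for a missing key,
            // which avoids the NullPointerException from cache.get(key).equals(value)
            boolean result = cache.remove(key,value);
            if(result){
                queue.remove(key);
                cacheDates.remove(key);
            }
            housekeeping();
            return result;
        }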
        public V putIfAbsent(K key,V value){
            // delegate to the atomic ConcurrentHashMap call instead of check-then-put
            V result = cache.putIfAbsent(key,value);
            if(result == null){
                queue.add(key);
                cacheDates.put(key,System.currentTimeMillis());
            }
            housekeeping();
            return result;
        }
        public V put(K key,V object){
            // ConcurrentHashMap.put returns the previous value, or null if the key is new
            V result = cache.put(key,object);
            if(result == null){ queue.add(key); } // queue each key once so FIFO order holds
            cacheDates.put(key,System.currentTimeMillis()); // a re-put refreshes the timestamp
            housekeeping();
            return result;
        }
        public void putAll(Map<? extends K,? extends V> map){
            housekeeping();
            // typed entries make the raw Map.Entry and the unchecked casts unnecessary
            for(Map.Entry<? extends K,? extends V> me : map.entrySet()){
                put(me.getKey(),me.getValue());
            }
        }
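        public void housekeeping(){
            // new Thread() with no target has an empty run(); pass this map
            // (a Thread, hence a Runnable) so that run() below actually executes
            Thread t = new Thread(this);
            t.start();
        }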
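        public void run(){
            // evict the oldest keys until the queue is back under the size limit
            while(queue.size() > size){
                K key = queue.poll(); // poll() already removes the head of the queue
                if(key != null){
                    cache.remove(key);
                    cacheDates.remove(key);
                }
            }
            long now = System.currentTimeMillis();
            if(lastSpringCleaning + springCleaningInterval < now){
                lastSpringCleaning = now; // otherwise every later call would sweep again
                springCleaning(now);
            }
        }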
        public void springCleaning(Long now){
            Object key;
            for(Map.Entry<? extends K,Long> me : cacheDates.entrySet()){
                if(me.getValue() + timeToLive < now){
                    key = me.getKey();
                    cache.remove(key);
                    queue.remove(key);
                    cacheDates.remove(key);
                }
            }
        }
    }
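    The class above triggers housekeeping() from almost every map call. One alternative, sketched here under my own assumptions and not what the class does, is to run the expiry sweep on a fixed schedule with a ScheduledExecutorService. ScheduledSweepSketch and its members are invented names, and the two maps only mimic the roles of cache and cacheDates.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: time-based expiry driven by one scheduler thread
class ScheduledSweepSketch {
    final ConcurrentHashMap<String, Object> cache = new ConcurrentHashMap<>();
    final ConcurrentHashMap<String, Long> dates = new ConcurrentHashMap<>(); // key -> insert time (ms)
    final long timeToLiveMs;

    ScheduledSweepSketch(long timeToLiveMs) { this.timeToLiveMs = timeToLiveMs; }

    void put(String key, Object value, long now) {
        cache.put(key, value);
        dates.put(key, now);
    }

    // Removes entries older than timeToLiveMs and returns how many were removed
    int sweep(long now) {
        int removed = 0;
        for (Map.Entry<String, Long> e : dates.entrySet()) {
            // remove(key, value) only succeeds if the timestamp was not refreshed meanwhile
            if (e.getValue() + timeToLiveMs < now && dates.remove(e.getKey(), e.getValue())) {
                cache.remove(e.getKey());
                removed++;
            }
        }
        return removed;
    }

    // Starts a periodic sweep; the caller must shut the returned executor down
    ScheduledExecutorService start(long intervalMs) {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor();
        ses.scheduleAtFixedRate(() -> sweep(System.currentTimeMillis()),
                intervalMs, intervalMs, TimeUnit.MILLISECONDS);
        return ses;
    }

    public static void main(String[] args) {
        ScheduledSweepSketch s = new ScheduledSweepSketch(1000L);
        s.put("old", "x", 0L);
        s.put("new", "y", 5000L);
        System.out.println(s.sweep(5500L)); // prints 1: only "old" is past its time to live
        ScheduledExecutorService ses = s.start(120000L);
        ses.shutdownNow(); // stop the scheduler thread so the JVM can exit
    }
}
```

    The trade-off is one long-lived thread that must be shut down explicitly, in exchange for not doing cleanup work on the request path.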
    
    
     
  2. cooooookies

    cooooookies Senior Member

    Joined:
    Oct 6, 2008
    Messages:
    1,008
    Likes Received:
    216
    Nice share, man, I just discovered this old thread. What a pity that there are only a few Java programmers around here.

    BTW: do you know Apache Nutch? I recently started with it and am really amazed. I threw away my own spider/se/content cache since Nutch is just amazingly quick and reliable.
     
  3. Gunner Steele

    Gunner Steele Newbie

    Joined:
    Jan 30, 2009
    Messages:
    31
    Likes Received:
    19
    I'm not a java programmer, but if I understand what you're doing correctly, I have an idea for a super-fast, albeit application-layer, cache map. Check out redis:

    code (dot) google (dot) com/p/redis/wiki/README

    It's pretty lightweight and is supported by almost every language under the sun.

    GS