Caching with EhCache - Part I

The need for caching is quite obvious and I'll not insist on it in this post. Reading from a cache usually gives a much lower response time than querying a database, and every request that doesn't hit the database leaves its resources free to handle other requests. The exact benefit, however, depends entirely on the particular case. It is not wise to cache objects that change all the time in the back-end database: you'll have more trouble staying in sync than you gain in performance, and a badly used cache can end up slower than going to the database directly.

We'll start by using EhCache to implement a second-level cache for Hibernate to help us with data retrieval from a MySQL database. We'll also be using Spring. Any database will do, since we're actually interested in the caching part and we're working with entities after all.

The entity that we'll be using in our tests is Player, which looks like this:

package com.balamaci.domain.entity;

import org.hibernate.annotations.Cache;  
import org.hibernate.annotations.CacheConcurrencyStrategy;  
import org.hibernate.annotations.GenericGenerator;

import javax.persistence.*;  
import java.io.Serializable;

@Entity
@Table(name = "PLAYERS")
@Cache(usage = CacheConcurrencyStrategy.NONSTRICT_READ_WRITE)
public class Player implements Serializable {

    @Id
    @GeneratedValue(generator = "INCREMENT")
    @GenericGenerator(name = "INCREMENT", strategy = "INCREMENT")
    @Column(name = "ID")
    private Long id;

    @Column(name = "NAME")
    private String name;

    @Column(name = "AGE")
    private Integer age;

    @Column(name = "NICKNAME")
    private String nickName;

.....
/* setters and getters for properties */
....
}

File applicationContext.xml :

<?xml version="1.0" encoding="UTF-8"?>  
<beans ...>

    <!-- Resolves ${...} placeholders from app.properties-->
    <bean id="propertyConfigurer"
        class="org.springframework.beans.factory.config.PropertyPlaceholderConfigurer">
        <property name="location">
            <value>classpath:/app.properties</value>
        </property>
    </bean>

    <!-- Import contexts -->
    <import resource="classpath:serviceContext.xml" />
    <import resource="classpath:persistenceContext.xml" />
</beans>  

File persistenceContext.xml :

<beans ....>

    <!-- Database -->
    <bean id="dataSource" class="org.apache.commons.dbcp.BasicDataSource" destroy-method="close">
        <property name="driverClassName">
            <value>com.mysql.jdbc.Driver</value>
        </property>

        <property name="url">
            <value>${jdbc.url}</value>
        </property>

        <property name="username">
            <value>${jdbc.username}</value>
        </property>

        <property name="password">
            <value>${jdbc.password}</value>
        </property>
    </bean>

    <bean id="sessionFactory" class="org.springframework.orm.hibernate3.annotation.AnnotationSessionFactoryBean">
        <property name="annotatedClasses">
            <list>
                <value>com.balamaci.domain.entity.Player</value>
            </list>
        </property>

        <property name="dataSource">
            <ref local="dataSource"/>
        </property>

        <property name="hibernateProperties">
            <props>
                <prop key="hibernate.dialect">org.hibernate.dialect.MySQL5Dialect</prop>
                <prop key="hibernate.show_sql">true</prop>
                <prop key="hibernate.cache.use_second_level_cache">true</prop> <!-- This is very important -->
                <prop key="hibernate.cache.use_query_cache">true</prop> <!-- This is very important also for using query caches -->
                <prop key="hibernate.cache.provider_class">net.sf.ehcache.hibernate.SingletonEhCacheProvider</prop>
                <prop key="hibernate.generate_statistics">true</prop>
            </props>
        </property>

    </bean>

    <bean id="hibernatePersistenceDao" class="com.balamaci.domain.dao.HibernatePersistenceDao">
        <property name="sessionFactory">
            <ref bean="sessionFactory"/>
        </property>
    </bean>
</beans>  

The important property to look for is hibernate.cache.provider_class, which has been set to SingletonEhCacheProvider. This means that we can get a reference to this "global" cache manager by calling the static method CacheManager.getInstance(), and if more than one Hibernate configuration is used, only one cache provider will be used for all of them.
Also, use_second_level_cache is the property that actually determines whether the second-level cache is used. If you need to disable caching, just set this property to false.

File serviceContext.xml :

<beans ...>

    <bean id="transactionManager"
        class="org.springframework.orm.hibernate3.HibernateTransactionManager">
        <property name="sessionFactory" ref="sessionFactory" />
    </bean>

    <bean id="abstractService"
        class="org.springframework.transaction.interceptor.TransactionProxyFactoryBean"
        abstract="true">
        <property name="transactionManager" ref="transactionManager" />
        <property name="transactionAttributes">
            <props>
                <prop key="get*">PROPAGATION_SUPPORTS,readOnly</prop> <!-- Specifying readOnly here helps by not creating a transaction, we'll discuss later why this is an improvement -->
                <prop key="add*">PROPAGATION_REQUIRED</prop>
            </props>
        </property>
    </bean>

    <bean id="persistenceService" parent="abstractService">
        <property name="target" ref="persistenceServiceTarget" />
    </bean>

    <bean id="persistenceServiceTarget" class="com.balamaci.service.DefaultPersistenceService">
        <property name="hibernatePersistenceDao" ref="hibernatePersistenceDao" />
    </bean>
</beans>  

Now let's look at the code that retrieves all players from the database:

// --- The service interface PersistenceService --
public interface PersistenceService {  
    public List<Player> getAllPlayers();
}

// --- The implementation of the service class DefaultPersistenceService ---
public class DefaultPersistenceService implements PersistenceService {

    // Injected by Spring through the setter below
    private HibernatePersistenceDao hibernatePersistenceDao;

    public List<Player> getAllPlayers() {
        return hibernatePersistenceDao.getAllPlayers();
    }

    public void setHibernatePersistenceDao(HibernatePersistenceDao hibernatePersistenceDao) {
        this.hibernatePersistenceDao = hibernatePersistenceDao;
    }
}

// --- The DAO class ---
public class HibernatePersistenceDao extends HibernateDaoSupport {  
    public List<Player> getAllPlayers() {
        Criteria crit = getSession().createCriteria(Player.class);
        return crit.list();
    }
}

// --- The main function ---
public static void main(String[] args) {  
        ClassPathXmlApplicationContext appContext =
                new ClassPathXmlApplicationContext(new String[] {
                        "applicationContext.xml"
                });

        PersistenceService persistenceService = (PersistenceService) appContext.getBean("persistenceService");
        List<Player> lstPlayers = persistenceService.getAllPlayers();
}

EhCache can be configured by creating an ehcache.xml file. Caches are characterized by a set of different properties:

  • eternal, set to true or false, controls whether the elements in the cache ever expire. An expired element is invalidated and has to be reloaded from the database. For example, there are caches for which you know the data is never updated in the database (or you do not care if it is), and those you would declare eternal. Others are likely to change, and you would like to re-query the database for such changes from time to time; for those you would set eternal to false and set timeToLiveSeconds to the number of seconds after which an element expires. Caches with eternal=true ignore the timeToLiveSeconds property.

  • maxElementsInMemory should be self-explanatory. Caches can sometimes grow to become huge beasts that take up a whole lot of memory, so you can limit the number of elements to a maximum. When that maximum is reached, older entries are evicted. The default eviction strategy is LRU (least recently used), but it can be changed to LFU (least frequently used) or FIFO (first in, first out). You can even choose to "spill" the overflow of elements to a disk store by setting overflowToDisk=true. Take care, as the disk can be quite slow, and you might introduce a big performance bottleneck that makes the cache perform even worse than a call to the database. If cache performance is bad, check the cache statistics for caches that have overflowed to the disk store, as retrieving elements from disk is many times slower than retrieving them from memory.
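To make these properties a bit more concrete, here is a minimal plain-Java sketch of a cache with a size limit, LRU eviction and a time to live. It is only a toy model and not how EhCache is implemented; the BoundedTtlCache class and its methods are names I made up for illustration, and the current time is passed in explicitly so the behaviour is easy to follow.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of eternal / timeToLiveSeconds / maxElementsInMemory; NOT EhCache's implementation. */
final class BoundedTtlCache<K, V> {

    /** A cached value together with the moment it was put in the cache. */
    private static final class Element<V> {
        final V value;
        final long createdAtMillis;

        Element(V value, long createdAtMillis) {
            this.value = value;
            this.createdAtMillis = createdAtMillis;
        }
    }

    private final int maxElementsInMemory;
    private final boolean eternal;
    private final long timeToLiveMillis;
    private final LinkedHashMap<K, Element<V>> store;

    BoundedTtlCache(int maxElements, boolean eternal, long timeToLiveSeconds) {
        this.maxElementsInMemory = maxElements;
        this.eternal = eternal;
        this.timeToLiveMillis = timeToLiveSeconds * 1000L;
        // accessOrder = true keeps the least recently used entry at the head of the map
        this.store = new LinkedHashMap<K, Element<V>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, Element<V>> eldest) {
                // Evict the least recently used entry once the limit is exceeded
                return size() > BoundedTtlCache.this.maxElementsInMemory;
            }
        };
    }

    synchronized void put(K key, V value, long nowMillis) {
        store.put(key, new Element<V>(value, nowMillis));
    }

    /** Returns the cached value, or null on a miss or when the element has expired. */
    synchronized V get(K key, long nowMillis) {
        Element<V> element = store.get(key);
        if (element == null) {
            return null;
        }
        // eternal caches ignore the time to live, just like eternal=true ignores timeToLiveSeconds
        if (!eternal && nowMillis - element.createdAtMillis >= timeToLiveMillis) {
            store.remove(key);
            return null;
        }
        return element.value;
    }
}
```

With a limit of 2, putting a third element evicts whichever of the first two was used least recently, and with eternal=false a get() that comes after the time to live has passed behaves like a miss, which is the point where the caller would go back to the database.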


A useful model to keep in mind: a cache is like a map, a collection of key/value pairs with generic methods put(key, value) and get(key) to retrieve the value associated with the key. I should point out that the value referenced by a key can be a list of objects and not only a single object.

The Entity Cache and Query Cache

When using Hibernate with EhCache one can distinguish between two types of caches.

  • The entity cache, where a particular entity is referenced by its primary key and that key is used to retrieve the particular instance of the entity. For example, by issuing getSession().get(Player.class, id), Hibernate will look in its second-level cache for the entry with that key and, if it finds it, will not go to the database to retrieve that particular entity.
  • The query cache. In many cases, querying for data (issuing a select statement in the database) can be very costly, both in time and CPU. The select may take a long time to complete even though it returns only one item. We can improve the situation by taking advantage of the fact that the same slow query we are about to execute was already run a second ago: since it has the same parameters, we can reuse the result and not actually run the query. This is the query cache. It keeps the results of queries, and if the same query with the same parameters is issued again, the results from the cache are used instead of executing the query.
    More specifically, the key of a cache entry is the select statement together with its parameters, and the value is the result of the query. The thing to be aware of is that the value is not a list of entity instances but a list of the ids of the entities. So when it gets the list of ids, Hibernate has to do an intermediate step to build the list of entities. If the entities with the ids returned from the query cache are not in the entity cache (or have expired), Hibernate must retrieve them one by one from the database, and this could take more time than not using the cache at all, so keep this in mind if you experience slow queries.
    Query caches are not enabled by default. You must explicitly call query.setCacheable(true), and also remember to set the hibernate.cache.use_query_cache property to true in the Spring config file.
    Let's think of an example for the query cache: we are going to retrieve players that are less than 10 years old.
    public List<Player> getLittleLeaguePlayers() {  
            Query query = getSession().createQuery("Select p from Player p where p.age < ?");
            query.setLong(0, 10);
            query.setCacheable(true);
            return query.list();
    }
    
    This creates a cache entry with the select string (and parameter values) as the key and a list of ids, say id=2 and id=4, as the value. When this method is executed again, the query does not go through to the database; instead, the ids of the entities are retrieved from the query cache, Hibernate looks up the entities by id in the entity cache, and then builds a list with them to be returned to the caller. The query cache looks like:
    Key ---->Value
    { { query }, [parameters]} ---->[ids of cached entities]
    { {"Select p from Player p where p.age < ?"}, [10]} ---->[2, 4]
    In the event that the select is supposed to return scalar values, and not mapped entities, those values are kept as they are in the query cache. Query caches also apply to query results obtained through the Criteria API:
            Criteria crit = getSession().createCriteria(Player.class);
            crit.add(Restrictions.lt("age", 10));
            crit.setCacheRegion("query.LittleLeague");
            crit.setCacheable(true);
    
    Since the query cache is just a cache, the same properties can be set as for any other cache; for example, maxElementsInMemory would limit the number of queries for which the results are cached.
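
The two-step lookup (query cache gives ids, entity cache gives entities) can be sketched in plain Java. This is only a simplified model of the idea and not Hibernate's code; QueryCacheSketch, its nested Player class and the selectCount counter are hypothetical names used for illustration, and an in-memory map stands in for the database.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a query cache that stores ids and resolves them through an entity cache. */
final class QueryCacheSketch {

    /** Stand-in for the Player entity, reduced to the fields the example needs. */
    static final class Player {
        final long id;
        final int age;

        Player(long id, int age) {
            this.id = id;
            this.age = age;
        }
    }

    private final Map<Long, Player> playersTable;                    // stands in for the database
    final Map<String, List<Long>> queryCache = new HashMap<>();      // query + parameters -> ids
    final Map<Long, Player> entityCache = new HashMap<>();           // id -> entity
    int selectCount = 0;                                             // simulated database round trips

    QueryCacheSketch(Map<Long, Player> playersTable) {
        this.playersTable = playersTable;
    }

    List<Player> playersYoungerThan(int maxAge) {
        String key = "Select p from Player p where p.age < ? | " + maxAge;
        List<Long> ids = queryCache.get(key);
        if (ids == null) {
            selectCount++; // the query itself runs against the "database" only once
            ids = new ArrayList<>();
            for (Player p : playersTable.values()) {
                if (p.age < maxAge) {
                    ids.add(p.id);
                    entityCache.put(p.id, p); // rows loaded by the query also land in the entity cache
                }
            }
            queryCache.put(key, ids);
        }
        // Intermediate step: turn the cached ids back into entities
        List<Player> result = new ArrayList<>();
        for (Long id : ids) {
            Player p = entityCache.get(id);
            if (p == null) {
                selectCount++; // one extra select for every entity missing from the entity cache
                p = playersTable.get(id);
                entityCache.put(id, p);
            }
            result.add(p);
        }
        return result;
    }
}
```

Calling playersYoungerThan(10) twice costs one select in total, but if the entity cache is emptied between the calls, the second call costs one select per id, which is exactly the situation where a query cache can hurt more than it helps.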

Hibernate creates entity caches named after the fully qualified name of the entity class; in our example there will be a cache named com.balamaci.domain.entity.Player. You can use this name to configure the properties of this cache in ehcache.xml, for example:

<cache name="com.balamaci.domain.entity.Player"  
           maxElementsInMemory="30"
           eternal="false"
           timeToLiveSeconds="200"
           overflowToDisk="false"/>

You can also retrieve the cache from the cache manager by using this name, for example to show statistics about the cache, or to clear the cache from the interface and thus force Hibernate to go to the database to retrieve new values.

Cache playerCache = (Cache) CacheManager.getInstance().getCache("com.balamaci.domain.entity.Player");

/** -- Obtaining different statistics of the cache -- **/
Statistics stats = playerCache.getStatistics();  
stats.getDiskStoreObjectCount();  
stats.getCacheHits();  
stats.getCacheMisses();

/** -- Clearing the cache for forcing a database reload -- **/
playerCache.removeAll();  
playerCache.clearStatistics();  
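
The raw hit and miss counters are easier to judge as a ratio. A minimal sketch, assuming you feed it the values of stats.getCacheHits() and stats.getCacheMisses() from above; the CacheHitRatio helper is my own illustration and not part of EhCache:

```java
/** Hypothetical helper, not part of EhCache: turns hit/miss counters into a ratio. */
final class CacheHitRatio {

    private CacheHitRatio() {
    }

    /** Fraction of lookups served from the cache; 0.0 when nothing has been looked up yet. */
    static double hitRatio(long hits, long misses) {
        long lookups = hits + misses;
        return lookups == 0 ? 0.0 : (double) hits / lookups;
    }
}
```

A ratio that stays low for a cache is a hint that the entries expire or get evicted before they are reused, so the cache may not be worth its memory.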

If no entry is configured specifically for a cache in ehcache.xml, the settings for <defaultCache> are used.

Hibernate creates a cache named org.hibernate.cache.StandardQueryCache, and this cache is the query cache. We can configure this cache's properties through the ehcache.xml file. You can use a different cache for a particular query by calling query.setCacheRegion(), and you can set different properties for that particular cache as well.

Query q = getSession().createQuery("Select p from Player p where p.age < ?");  
q.setLong(0, 12);  
q.setCacheRegion("query.LittleLeague");  
q.setCacheable(true);  

When the query cache is enabled, Hibernate creates another cache named org.hibernate.cache.UpdateTimestampsCache besides the StandardQueryCache. This cache is used to determine whether the results in a query cache are still valid. This is the nice thing about Hibernate: if you update the database entries through Hibernate, the query cache entries related to that entity are invalidated, queries go to the database, and the cache is refreshed. Of course, if the database entries change in some other way, say by calling a stored procedure, Hibernate cannot know that, and the cache is not invalidated.
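
The idea behind the timestamps cache can be modelled in a few lines of plain Java. This is a simplified sketch under my own assumptions, not Hibernate's actual implementation: TimestampInvalidationSketch and its methods are hypothetical names, and a logical clock value is passed in instead of reading the system time.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model: a cached query result is valid only if its table was not updated afterwards. */
final class TimestampInvalidationSketch {

    /** Cached ids plus the moment the result was stored. */
    private static final class CachedResult {
        final List<Long> ids;
        final long cachedAt;

        CachedResult(List<Long> ids, long cachedAt) {
            this.ids = ids;
            this.cachedAt = cachedAt;
        }
    }

    private final Map<String, CachedResult> queryCache = new HashMap<>();
    // Table name -> time of the last update that went through the ORM
    private final Map<String, Long> updateTimestamps = new HashMap<>();

    void cacheResult(String queryKey, List<Long> ids, long now) {
        queryCache.put(queryKey, new CachedResult(ids, now));
    }

    /** Called for every update made through the ORM; changes made elsewhere never reach it. */
    void tableUpdated(String table, long now) {
        updateTimestamps.put(table, now);
    }

    /** Returns the cached ids, or null when there is no entry or the table changed since caching. */
    List<Long> lookup(String queryKey, String table) {
        CachedResult cached = queryCache.get(queryKey);
        if (cached == null) {
            return null;
        }
        Long lastUpdate = updateTimestamps.get(table);
        if (lastUpdate != null && lastUpdate >= cached.cachedAt) {
            queryCache.remove(queryKey); // stale: the caller has to run the query again
            return null;
        }
        return cached.ids;
    }
}
```

Since only tableUpdated() moves the timestamp, a change that bypasses it (the stored procedure case) leaves the cached result looking valid, which is the limitation described above.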

Finally, let's discuss the cache concurrency setting (the CacheConcurrencyStrategy.NONSTRICT_READ_WRITE annotation in the Player entity definition). This answers the question of what happens when one thread wants to update a player's name while another thread is reading the player's name from the cache at the same time. Should the reader be blocked until the data is updated, or is it OK for the reader to receive the old version from the cache and not be blocked by the writing thread?

  • READ_ONLY - An error is raised if an entity with this strategy is updated.
  • NONSTRICT_READ_WRITE - A reader may receive the old version from the cache while a writer is updating it.
  • READ_WRITE - An effort is made to block the reader until the writer finishes.

When working with caches, you pretty much expect to sometimes receive stale data, so in my opinion you'll mostly be using the first two.

Wow, this was a long article and I guess I better leave something for part II, where we'll try to use EhCache as a generic cache, not only for Hibernate entities.
