I had a great opportunity in the past week to work with a couple of great minds on a proof of concept for distributed caching. I had first heard of distributed caching just this month while listening to a Deep Fried Bytes podcast. They were discussing challenges and solutions to scaling extremely high-traffic websites. They implemented a distributed caching product called Memcached. I found it interesting and put distributed caching on my to-learn stack.
So this week I got to dig in a little bit and find out how this stuff worked. I was primarily researching Velocity, which is Microsoft's implementation of distributed caching. Here are my notes:
There is no single best strategy for caching; how you implement it depends on your specific need. Generally speaking, the pattern for accessing cached objects is to look up your object by key and check for null. If it is null, pull the object from your secondary source (DB, file, etc.); if it is not null, use that object.
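The lookup pattern above can be sketched in a few lines. This is a minimal illustration in Python rather than Velocity's .NET API; `SimpleCache`, `get_or_load`, and `load_from_source` are hypothetical names, and storing the loaded object back into the cache after a miss is my assumption about how the pattern is usually completed.

```python
from typing import Callable, Dict, Optional


class SimpleCache:
    """In-process stand-in for a distributed cache client (hypothetical)."""

    def __init__(self) -> None:
        self._items: Dict[str, object] = {}

    def get(self, key: str) -> Optional[object]:
        # Returns None on a miss, mirroring the "check for null" step.
        return self._items.get(key)

    def put(self, key: str, value: object) -> None:
        self._items[key] = value


def get_or_load(cache: SimpleCache, key: str,
                load_from_source: Callable[[str], Optional[object]]) -> Optional[object]:
    """Look the object up by key; on a miss, fall back to the secondary source."""
    value = cache.get(key)
    if value is None:
        value = load_from_source(key)  # e.g. a database or file read
        if value is not None:
            cache.put(key, value)      # assumption: populate the cache for next time
    return value
```

The second request for the same key is served from the cache, so the secondary source is only hit once per object until the entry is removed.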
Cache Patterns
One cool thing about Velocity is flexibility to choose between 3 types of cache patterns.
1) Partitioned - great for scalability. Add a machine to the cache cluster (group of cache servers) and you have more memory for the objects to be distributed over.
2) Replicated - great for throughput. This pattern will synchronize your cache to each server so each machine within the cluster has an identical set of cached objects. This approach tends to be quicker because your objects are always living on the server that you request them from - but it limits you in terms of how much memory you have available to you.
3) Local - best performance. This cache lives within your application process just like a static dictionary would. It is faster because the objects are not serialized when they are put into the cache.
The uncool part about these 3 options is that only one is available in the current CTP release. You'll have to wait for options 2 and 3.
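To make the difference between the partitioned and replicated patterns concrete, here is a toy sketch in Python. It is not how Velocity actually places objects; the class names are hypothetical, each "server" is just a dictionary, and picking a server by hashing the key modulo the server count is an assumption made for illustration.

```python
import hashlib
from typing import Dict, List, Optional


class PartitionedCache:
    """Each key lives on exactly one node, so adding nodes adds capacity."""

    def __init__(self, node_count: int) -> None:
        self.nodes: List[Dict[str, object]] = [{} for _ in range(node_count)]

    def _node_for(self, key: str) -> Dict[str, object]:
        # Stable hash of the key, reduced to a node index (illustrative only).
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self.nodes[int(digest, 16) % len(self.nodes)]

    def put(self, key: str, value: object) -> None:
        self._node_for(key)[key] = value

    def get(self, key: str) -> Optional[object]:
        return self._node_for(key).get(key)


class ReplicatedCache:
    """Every node holds every key, so any node can answer any read."""

    def __init__(self, node_count: int) -> None:
        self.nodes: List[Dict[str, object]] = [{} for _ in range(node_count)]

    def put(self, key: str, value: object) -> None:
        for node in self.nodes:  # write to all copies
            node[key] = value

    def get(self, key: str, node_index: int = 0) -> Optional[object]:
        return self.nodes[node_index].get(key)
```

With three nodes and 100 keys, the partitioned version stores 100 entries in total across the cluster, while the replicated version stores 300. That is the memory trade-off described above.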
Locking
Velocity supports pessimistic locking and optimistic versioning.
Optimistic versioning stores a version with each cached item to track changes. When an object is retrieved from the cache, the version is also returned. An update of the cached item succeeds only if the version of the passed-in cache item is the same as the one stored in the cache.
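A rough sketch of that version check, in Python rather than Velocity's API. `VersionedCache` and its method names are hypothetical, the version is a simple integer counter (an assumption), and the sketch is single-threaded, so it shows the rule rather than a production implementation.

```python
from typing import Dict, Optional, Tuple


class VersionedCache:
    """Toy cache that rejects updates made against a stale version."""

    def __init__(self) -> None:
        self._items: Dict[str, Tuple[object, int]] = {}

    def get(self, key: str) -> Optional[Tuple[object, int]]:
        # Returns (value, version), or None if the key is not cached.
        return self._items.get(key)

    def put(self, key: str, value: object,
            expected_version: Optional[int] = None) -> bool:
        current = self._items.get(key)
        if current is None:
            self._items[key] = (value, 1)  # first write starts at version 1
            return True
        _, stored_version = current
        if expected_version != stored_version:
            return False                   # someone else updated it first
        self._items[key] = (value, stored_version + 1)
        return True
```

If two clients read the same item at version 1 and both try to write, the first write succeeds and bumps the version to 2; the second write still carries version 1 and is rejected, so that client has to re-read and retry.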
Pessimistic locking allows you to lock an object in the cache and keep others from updating the object from underneath you.
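For comparison, pessimistic locking might look something like the following Python sketch. The class and method names are illustrative and not taken from the Velocity API, the lock handle is just a random UUID string, and there are no lock timeouts, which a real cache would need.

```python
import uuid
from typing import Dict, Optional, Tuple


class LockingCache:
    """Toy cache where a locked item can only be updated by the lock holder."""

    def __init__(self) -> None:
        self._items: Dict[str, object] = {}
        self._locks: Dict[str, str] = {}  # key -> lock handle

    def put(self, key: str, value: object) -> bool:
        if key in self._locks:
            return False                  # locked by someone else
        self._items[key] = value
        return True

    def get_and_lock(self, key: str) -> Optional[Tuple[object, str]]:
        # Returns (value, handle), or None if the key is missing or already locked.
        if key not in self._items or key in self._locks:
            return None
        handle = uuid.uuid4().hex
        self._locks[key] = handle
        return self._items[key], handle

    def put_and_unlock(self, key: str, value: object, handle: str) -> bool:
        if self._locks.get(key) != handle:
            return False                  # caller does not hold the lock
        self._items[key] = value
        del self._locks[key]
        return True
```

While one client holds the handle, plain `put` calls and further `get_and_lock` calls on the same key fail; once `put_and_unlock` succeeds, the item is open for updates again.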
Querying
One challenge I found: say you have a set of objects (people, for instance, in a List<People>) and you want to find a person in that list by email address. That is difficult because I stored each person object in the cache with a key based on the object's ID. So to find a person by email address, I have to pull back ALL people and then iterate through the list in local memory. What I need is a way to query that list on the cache side, before anything is pulled back. I've read that this is a feature that will be implemented in later CTPs of Velocity.
For now, I created a simple app that shows one possible solution to querying for objects using Tag objects from the API. Check out the source code here.
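To give a feel for the tag idea without the Velocity API, here is a small Python sketch. It is not the sample app mentioned above; `TaggedCache`, `get_by_tag`, and the `email:` tag format are all hypothetical. The cache keeps a side index from tag to keys, so a lookup by email does not need to scan every person.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, Iterable, List, Optional, Set


class TaggedCache:
    """Toy cache that indexes items by tag as well as by key."""

    def __init__(self) -> None:
        self._items: Dict[str, object] = {}
        self._keys_by_tag: DefaultDict[str, Set[str]] = defaultdict(set)

    def put(self, key: str, value: object, tags: Iterable[str] = ()) -> None:
        # Note: this sketch does not clean up old tags when a key is re-put.
        self._items[key] = value
        for tag in tags:
            self._keys_by_tag[tag].add(key)

    def get(self, key: str) -> Optional[object]:
        return self._items.get(key)

    def get_by_tag(self, tag: str) -> List[object]:
        # Only the matching keys are touched; no full scan of the cache.
        keys = self._keys_by_tag.get(tag, ())
        return [self._items[k] for k in sorted(keys)]


cache = TaggedCache()
cache.put("person:1", {"id": 1, "email": "ann@example.com"},
          tags=["email:ann@example.com"])
cache.put("person:2", {"id": 2, "email": "bob@example.com"},
          tags=["email:bob@example.com"])
matches = cache.get_by_tag("email:bob@example.com")
```

Each person is still stored under an ID-based key, and the email tag acts as a secondary index over those keys.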
Learning More
http://msdn.microsoft.com/en-us/library/cc645013.aspx
This post has a really good visual of the caching architecture:
http://www.25hoursaday.com/weblog/2008/06/06/VelocityADistributedInMemoryCacheFromMicrosoft.aspx
3 comments:
So you must have read our minds. Keith and I are coming out with a new episode that has some discussions around Velocity with a Microsoft team. Keep your eyes open next week, hopefully, for the episode.
Thanks for listening and for mentioning Deep Fried Bytes.
-- Woody
Hi Steve,
The local cache option is available in CTP1. The consistency policy in CTP1 for the local cache is limited to expiration-based. We are looking towards adding notification-based expiry in the upcoming CTP.
Thanks
Murali K
Hi Steve,
Thanks for a great discussion and welcome to the world of distributed caching.
Microsoft Velocity only covers a small portion of what caching solutions providers like NCache are offering. Their recent discussions about their CTP release clearly indicate that they have a long way to go before they can actually consider themselves to be in the same league as the incumbents.
NCache has been the caching solution of choice for various mission critical applications throughout the world since 2005. With its wide range of features NCache delivers today what velocity promises tomorrow.
Download a totally free version of NCache (NCache Express) from www.alachisoft.com/download.html