Delayed, yet again, at the airport. Time to get this article written once and for all.
This is a rough draft; I still need to go through and proofread this article. However, several friends were anxious to read it, so here it is, rough for now.
When your website or project grows, demands on your architecture and infrastructure can increase dramatically. You can then run into “bottlenecks”: parts of your project that hit the limit of their abilities and cause the rest of your application to slow down. One of the more common parts of your architecture to reach its limits is your database. There is a reason for this: it’s called ACID. While I won’t get into the details, basically databases are awesome because of their “ACID compliance.” You can store information and get information easily. However, the requirements of being a good database can also mean a lot of legwork for your server. So when you have hundreds, thousands, or even millions of queries executing on your server, it can take a lot of CPU, memory, and I/O to do all the work.
This is where memcached comes in. I’ve implemented it with great success in the past, and you can too. This article is not a step-by-step guide to setting up memcached. There are plenty of articles that show you how here and here (to name a few). You also have the PHP documentation for memcache + PHP. Instead, we’re going to discuss the theory behind creating an effective cache. Our memcached servers at Dating DNA run at about 99.9% efficiency (meaning 99.9% of all requests to our cache find a valid entry and don’t hit our database). We’ll cover a few basic concepts, and then talk about the two types of caching methods.
General Concept
What is a cache? To quote Wikipedia:
In computer science, a cache (pronounced /kæʃ/) is a collection of data duplicating original values stored elsewhere or computed earlier, where the original data is expensive to fetch (owing to longer access time) or to compute, compared to the cost of reading the cache. In other words, a cache is a temporary storage area where frequently accessed data can be stored for rapid access. Once the data is stored in the cache, it can be used in the future by accessing the cached copy rather than re-fetching or recomputing the original data.
So basically it is a collection of data that sits between your code and your database server. You typically check the cache first to see if it has a valid entry. If so, you use the information in the cache. If not, you generate the information the expensive way (for example, by querying the database) and then put the result in the cache. Ideally you want your cache to be full of good information so your database doesn’t have to do the same work again.
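To make that check-then-fill flow concrete, here is a minimal sketch in Python. It is not how our setup works and it does not talk to memcached: SimpleCache is a hypothetical in-memory stand-in for a cache client, and load_profile_from_db, get_profile, and the "profile:" key format are made-up names for illustration. The hit and miss counters show one way the "efficiency" figure mentioned earlier could be measured.

```python
import time


class SimpleCache:
    """Hypothetical in-memory stand-in for a cache client (not memcached itself)."""

    def __init__(self):
        self._store = {}  # key -> (value, expiry time)
        self.hits = 0
        self.misses = 0

    def get(self, key):
        entry = self._store.get(key)
        # An entry is valid only if it exists and has not expired.
        if entry is not None and entry[1] > time.monotonic():
            self.hits += 1
            return entry[0]
        self.misses += 1
        return None

    def set(self, key, value, ttl_seconds=300):
        self._store[key] = (value, time.monotonic() + ttl_seconds)

    def hit_ratio(self):
        # Fraction of lookups answered by the cache.
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def load_profile_from_db(user_id):
    # Placeholder for an expensive database query (assumption for this sketch).
    return {"id": user_id, "name": f"user{user_id}"}


def get_profile(cache, user_id):
    key = f"profile:{user_id}"
    profile = cache.get(key)          # 1. check the cache first
    if profile is None:               # 2. miss: generate the data the slow way
        profile = load_profile_from_db(user_id)
        cache.set(key, profile)       # 3. store it for the next request
    return profile


cache = SimpleCache()
get_profile(cache, 7)  # miss: goes to the "database"
get_profile(cache, 7)  # hit: served from the cache
```

One limitation of this sketch: because get returns None on a miss, a cached value of None cannot be told apart from a missing entry, so it only suits data that is never None.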
Why cache? In a perfect world where servers have no limitations, you wouldn’t need a cache. Your database would be able to handle trillions of queries without ever running into locking issues, slow responses, expensive joins, etc. However, we live in the real world, where our databases have limitations. So we implement caches to help alleviate those limitations.


