django-newcache
===============

Newcache is an improved memcached cache backend for Django. It provides four
major advantages over Django's built-in cache backend:

 * It supports pylibmc.
 * It allows for a function to be run on each key before it's sent to memcached.
 * It supports setting cache keys with infinite timeouts.
 * It mitigates the thundering herd problem.

It also has some pretty nice defaults. By default, the function that's run on
each key is one that hashes, versions, and flavors the key.  More on that 
later.


How to Install
--------------

The simplest way is to just set it as your cache backend in your settings.py, 
like so::

    CACHE_BACKEND = 'newcache://127.0.0.1:11211/?binary=true'

Note that we've passed an additional argument, binary, to the backend.  This
is because pylibmc supports using binary mode to talk to memcached. This is a
completely optional parameter, and can be omitted safely to use the old text 
mode. It is ignored when using python-memcached.


Default Behavior
----------------

Earlier we said that by default it hashes, versions, and flavors each key. What
does this mean?  Let's go through each item in detail.

Keys in memcached come with many restrictions, both on their length and on 
their contents.  Practically speaking, this means that you can't put spaces
in your keys, and they can't be very long.  One simple solution to this is to
create an sha1 hash of whatever key you want, and use the hash as your key
instead.  That is what we do in newcache.  It not only allows for long keys, 
but it also lets us put spaces or other characters in our key as well.
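
As a rough illustration of the hashing step (a sketch, not newcache's own
code), the helper below turns an arbitrary key into a fixed-length hex string.
The name hash_key is hypothetical, and UTF-8 encoding of the key is an
assumption made here.

```python
import hashlib


def hash_key(key):
    """Return a fixed-length, memcached-safe stand-in for an arbitrary text key."""
    # A SHA-1 hex digest is always 40 characters drawn from 0-9 and a-f,
    # so spaces and overly long keys are no longer a concern.
    return hashlib.sha1(key.encode('utf-8')).hexdigest()


long_key = 'user profile for ' + 'x' * 500
hashed = hash_key(long_key)
print(len(long_key), len(hashed))
```

The trade-off is that hashed keys are no longer human-readable when you inspect
memcached directly.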

Sometimes it's necessary to clear the entire cache. We can do this using 
memcached's flushing mechanisms, but sometimes a cache is shared by many things
instead of just one web app.  It's a shame to have everything lose its
fresh cache just because one web app needed to clear its cache. For this, we
introduce a simple technique called versioning. A version number is added to
each cache key, and when this version is incremented, all the old cache keys
will become invalid because they have an incorrect version.

This is exposed as a new setting, CACHE_VERSION, and it defaults to 1.
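
To see why bumping the version invalidates old entries, here is a toy sketch
that uses a plain dict as a stand-in for memcached. The versioned helper and
its key layout are hypothetical and only illustrate the idea.

```python
def versioned(key, version):
    """Prefix a key with a version number (hypothetical layout)."""
    return '%s-%s' % (version, key)


store = {}  # stand-in for a shared memcached instance

# Written while CACHE_VERSION is 1.
store[versioned('games', 1)] = ['chess', 'go']

# Read after CACHE_VERSION has been bumped to 2: the lookup uses a new key,
# so it misses, while entries written by other applications are untouched.
before_bump = store.get(versioned('games', 1))
after_bump = store.get(versioned('games', 2))
print(before_bump, after_bump)
```

Nothing is deleted by the bump. The old entries simply stop being read and are
left for memcached to expire or evict.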

Finally, we found that as we split our site out into development, staging, and
production, we didn't want them to share the same cache.  But we also didn't
want to spin up a new memcached instance for each one.  So we came up with the
idea of flavoring the cache.  The concept is simple--add a FLAVOR setting and
make it something like 'dev', 'prod', or 'test'.  With newcache, this flavor
string will be added to each key, ensuring that there are no collisions.

Concretely, this is what happens::

    # CACHE_VERSION = 2
    # FLAVOR = 'staging'
    cache.get('games')
    # ... would actually call ...
    cache.get('staging-2-9cfa7aefcc61936b70aaec6729329eda')


Changing the Default
--------------------

All of the above is simply the default. You may provide your own callable
to be run on each key by supplying the CACHE_KEY_MODULE setting. The module it
names must provide a get_key function which takes any instance of basestring
and outputs a str.
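
A minimal sketch of such a module follows. The module path
myproject/cachekeys.py and the PREFIX constant are hypothetical, and it is
an assumption here that CACHE_KEY_MODULE takes a dotted import path such as
'myproject.cachekeys'. The sketch targets Python 3, where str and bytes take
the place of basestring.

```python
# myproject/cachekeys.py  (hypothetical module)
import hashlib

PREFIX = 'myapp'  # hypothetical namespace for this application


def get_key(key):
    """Map any text or bytes key to a short, memcached-safe str."""
    raw = key if isinstance(key, bytes) else key.encode('utf-8')
    return '%s-%s' % (PREFIX, hashlib.sha1(raw).hexdigest())


# In settings.py (assumed form of the setting):
# CACHE_KEY_MODULE = 'myproject.cachekeys'
print(get_key('top scores for level 3'))
```

A custom function like this takes over hashing, versioning, and flavoring
entirely, so anything you still want from the default has to be reproduced in
it.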


Thundering Herd Mitigation
--------------------------

The thundering herd problem manifests itself when a cache key expires, and many
threads or processes rush to get or generate the data stored for that key all
at once.  This causes a lot of unnecessary work and can lead to service outages
if the database cannot handle the load.  To solve this problem, we really only
want one thread or process to fetch this data.

Our method of solving this problem is to shove the old (expired) value back 
into the cache for a short time while the first process/thread goes and updates
the key.  This is done in a completely transparent way--no changes should need
to be made in the application code.

With this cache backend, we have provided an extra 'herd' keyword argument to 
the set, add, and set_many methods--which is set to True by default. What this 
does is transform your cache value into a tuple before saving it to the cache. 
Each value is structured like this::

    (A herd marker, your original value, the expiration timestamp)

Then when it actually sets the cache, it sets the real timeout to a little bit
longer than the expiration timestamp. Actually, this "little bit" is 
configurable using the CACHE_HERD_TIMEOUT setting, but it defaults to 60 
seconds.
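
The arithmetic can be sketched as follows. This is an illustration of the
description above, not the backend's code: the pack helper and the MARKER
value are hypothetical, and only the 60 second default comes from the text.

```python
import time

MARKER = '__herd__'        # hypothetical marker value
CACHE_HERD_TIMEOUT = 60    # seconds, the documented default


def pack(value, timeout, now=None):
    """Return the tuple to store and the timeout actually given to memcached."""
    now = time.time() if now is None else now
    packed = (MARKER, value, now + timeout)       # logical expiry timestamp
    real_timeout = timeout + CACHE_HERD_TIMEOUT   # entry outlives it slightly
    return packed, real_timeout


packed, real_timeout = pack('v', timeout=300, now=1000)
print(packed, real_timeout)  # ('__herd__', 'v', 1300) 360
```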

Now every time we read a value from the cache, we automatically unpack it and 
check whether it's expired.  If it has expired, we put it back in the cache for 
CACHE_HERD_TIMEOUT seconds, but (and this is the key) we act as if it were a 
cache miss (so we return None, or whatever the default was for the call.)
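
The read path can be tried out with a toy in-memory cache. This is a sketch
under assumptions, not the backend itself: ToyHerdCache, MARKER, and the
injectable clock are all hypothetical, a dict stands in for memcached, and the
stale value is re-stored here as a fresh tuple, which is one possible way to
"put it back".

```python
import time

MARKER = '__herd__'        # hypothetical marker value
CACHE_HERD_TIMEOUT = 60    # seconds, the documented default


class ToyHerdCache:
    """Dict-backed stand-in that mimics the herd handling described above."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._store = {}  # key -> (packed tuple, real expiry time)

    def _raw_set(self, key, packed, timeout):
        self._store[key] = (packed, self._clock() + timeout)

    def _raw_get(self, key):
        entry = self._store.get(key)
        if entry is None or entry[1] <= self._clock():
            self._store.pop(key, None)  # really gone, as in memcached
            return None
        return entry[0]

    def set(self, key, value, timeout):
        packed = (MARKER, value, self._clock() + timeout)
        self._raw_set(key, packed, timeout + CACHE_HERD_TIMEOUT)

    def get(self, key, default=None):
        packed = self._raw_get(key)
        if packed is None:
            return default
        _marker, value, expires_at = packed
        if expires_at > self._clock():
            return value  # still fresh
        # Logically expired: keep the stale value around for the herd window
        # so other readers are served, and report a miss to this caller only.
        stale = (MARKER, value, self._clock() + CACHE_HERD_TIMEOUT)
        self._raw_set(key, stale, CACHE_HERD_TIMEOUT)
        return default


now = [1000.0]  # fake clock so the example does not need to sleep
cache = ToyHerdCache(clock=lambda: now[0])
cache.set('games', ['chess'], timeout=30)
fresh = cache.get('games')         # ['chess']
now[0] += 31
first_reader = cache.get('games')  # None: this caller regenerates the data
other_reader = cache.get('games')  # ['chess']: stale value served meanwhile
print(fresh, first_reader, other_reader)
```

Only the first reader after expiry sees a miss, so only it goes to the
database. Everyone else keeps getting the old value until the new one is set
or the herd window runs out.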

*Note*: If you want to set a value to be used as a counter (with incr and
decr) then you'll want to bypass the herd mechanism by passing herd=False when
setting it, so that the bare number is stored rather than the tuple.
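
A hedged sketch of the counter case follows. The helper names start_counter
and bump are hypothetical, and cache is assumed to be a newcache-backed
Django cache object that accepts the herd keyword described above.

```python
def start_counter(cache, key, initial=0):
    """Store a bare integer so that incr and decr can operate on it."""
    # herd=False skips the (marker, value, expiry) tuple wrapping.
    cache.set(key, initial, herd=False)


def bump(cache, key):
    """Increment the counter and return its new value."""
    return cache.incr(key)
```

In a Django project this would be called with the object imported from
django.core.cache, for example start_counter(cache, 'page-hits').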
