Simon Engledew's Blog

Eldritch nomenclature, conjured for arcane contraptions to execute unerringly.

July 10, 2009 at 2:55am

Flyweighting in Python

If you are dealing with large static datasets in Python, it can be useful to flyweight your objects. With flyweighting, every time you construct an object you first check whether an equivalent one already exists. If it does, the existing object is returned instead of a duplicate being constructed, so identical values share a single instance.
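Before generalising, it may help to see the idea hand-written for a single class. This is only a minimal sketch: Point, make_point and _points are made-up names for illustration, and the arguments are assumed to be hashable.

```python
# Hypothetical example: Point, make_point and _points are illustrative names.
class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y


_points = {}  # maps constructor arguments to the one shared instance


def make_point(x, y):
    key = (x, y)
    point = _points.get(key)
    if point is None:
        # First request for these arguments: build the object and remember it.
        point = Point(x, y)
        _points[key] = point
    return point


a = make_point(1, 2)
b = make_point(1, 2)
c = make_point(3, 4)

print(a is b)  # True
print(a is c)  # False
```

Note that a plain dict like this keeps every object alive for the life of the program, and callers have to remember to go through the factory function rather than the class.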

Recently I wrote a little bit of code to achieve this in the general case (this is Python 2 code; it relies on the im_func, func_code and func_defaults attributes):

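import weakref

class FlyweightedObject(object):
    # Shared pool; weak values let unused instances be garbage collected.
    _pool = weakref.WeakValueDictionary()

    def __new__(klass, *args):
        # Append any omitted default arguments, so that calls which leave
        # out defaults produce the same key as calls which spell them out.
        if hasattr(klass.__init__, 'im_func'):
            constructor = klass.__init__.im_func
            arguments_missing = constructor.func_code.co_argcount - len(args) - 1
            if arguments_missing > 0:
                args += constructor.func_defaults[-arguments_missing:]

        # The class is part of the key, so subclasses never share instances.
        key = (klass,) + args
        instance = klass._pool.get(key, None)

        if instance is None:
            instance = object.__new__(klass)
            klass._pool[key] = instance

        return instance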

Now when you inherit from FlyweightedObject, you get the flyweighting thrown in for free:

class Person(FlyweightedObject):
    def __init__(self, age, name = 'Simon'):
        self.age = age
        self.name = name

f = Person(1)
g = Person(1, 'Simon')

print id(f) == id(g)
# => True
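The base class above relies on im_func, func_code and func_defaults, which only exist in Python 2. Under Python 3 the hasattr check quietly fails, the defaults are never filled in, and Person(1) and Person(1, 'Simon') end up as two different objects. A rough Python 3 sketch of the same idea follows. It swaps the manual default handling for inspect.signature, which is not what the original does, and it assumes __init__ takes only ordinary named parameters with hashable values (no *args or **kwargs).

```python
import inspect
import weakref


class FlyweightedObject(object):
    # Shared pool; weak values let unused instances be garbage collected.
    _pool = weakref.WeakValueDictionary()

    def __new__(cls, *args, **kwargs):
        if cls.__init__ is not object.__init__:
            # Bind the call to __init__'s signature so that omitted defaults
            # and keyword arguments all normalise to the same positional tuple.
            signature = inspect.signature(cls.__init__)
            bound = signature.bind(None, *args, **kwargs)  # None stands in for self
            bound.apply_defaults()
            args = tuple(bound.arguments.values())[1:]  # drop the self slot

        key = (cls,) + args
        instance = cls._pool.get(key)

        if instance is None:
            instance = super().__new__(cls)
            cls._pool[key] = instance

        return instance


class Person(FlyweightedObject):
    def __init__(self, age, name='Simon'):
        self.age = age
        self.name = name


f = Person(1)
g = Person(1, 'Simon')
h = Person(age=1, name='Simon')

print(f is g is h)  # True
```

As a side effect of binding against the signature, keyword arguments work too, which the *args-only version does not accept.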

One thing to watch out for is changing attributes after the object has been constructed. The pool key is built from the constructor arguments only, so a later change will lead to the flyweight pool keys and the objects themselves going out of sync:

f = Person(1, 'Dave')
f.name = 'Simon'  

g = Person(1, 'Simon')

print id(f) == id(g)
# => False
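To make the mismatch visible, the pool itself can be inspected. The sketch below is Python 3 and uses a cut-down stand-in called Flyweight (a hypothetical name, with no default-argument handling) so that it runs on its own. It also shows a related effect: Python calls __init__ again whenever __new__ returns an instance of the class, so asking for the original arguments a second time resets the attribute.

```python
import weakref


class Flyweight(object):
    # Simplified stand-in for FlyweightedObject: no default-argument handling.
    _pool = weakref.WeakValueDictionary()

    def __new__(cls, *args):
        key = (cls,) + args
        instance = cls._pool.get(key)
        if instance is None:
            instance = super().__new__(cls)
            cls._pool[key] = instance
        return instance


class Person(Flyweight):
    def __init__(self, age, name):
        self.age = age
        self.name = name


f = Person(1, 'Dave')
f.name = 'Simon'

# The pool still files f under the arguments it was constructed with.
print((Person, 1, 'Dave') in Person._pool)   # True
print((Person, 1, 'Simon') in Person._pool)  # False

# Requesting the original arguments returns f, and __init__ runs again,
# overwriting the manual change.
h = Person(1, 'Dave')
print(h is f)   # True
print(f.name)   # Dave
```

In other words, flyweighted objects are best treated as immutable once constructed.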

One last thing. To get pickle working with flyweighted objects you’ll have to create a __getnewargs__ method which returns the tuple that will be passed to __new__ on unpickling:

class Person(FlyweightedObject):
    def __init__(self, age, name = 'Simon'):
        self.name = name
        self.age = age

    def __getnewargs__(self):
        return self.age, self.name

This can also be automated, as long as each instance attribute has the same name as the corresponding __init__ parameter:

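def __getnewargs__(self):
    if hasattr(self.__class__.__init__, 'im_func'):
        code = self.__class__.__init__.im_func.func_code
        # co_varnames lists the arguments first, then any local variables,
        # so stop at co_argcount; index 0 is self.
        names = code.co_varnames[1:code.co_argcount]
        return tuple(getattr(self, attr) for attr in names)
    return tuple()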
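The automated __getnewargs__ above also reads Python 2 function attributes. For Python 3, a sketch along the same lines might take the parameter names from inspect.signature instead. That substitution is an assumption of this example, as is the simplified base class (no default-argument handling) used to keep it short. The round trip at the end shows that unpickling goes through __new__ and therefore hands back the pooled object while it is still alive.

```python
import inspect
import pickle
import weakref


class FlyweightedObject(object):
    # Simplified base class for this sketch: no default-argument handling.
    _pool = weakref.WeakValueDictionary()

    def __new__(cls, *args):
        key = (cls,) + args
        instance = cls._pool.get(key)
        if instance is None:
            instance = super().__new__(cls)
            cls._pool[key] = instance
        return instance

    def __getnewargs__(self):
        # Read one attribute per __init__ parameter, skipping self.
        # Assumes every parameter is stored under an attribute of the same name.
        names = list(inspect.signature(type(self).__init__).parameters)[1:]
        return tuple(getattr(self, name) for name in names)


class Person(FlyweightedObject):
    def __init__(self, age, name):
        self.age = age
        self.name = name


f = Person(1, 'Simon')
print(f.__getnewargs__())  # (1, 'Simon')

# pickle passes the tuple from __getnewargs__ to __new__, which finds f
# in the pool and returns it instead of building a copy.
restored = pickle.loads(pickle.dumps(f))
print(restored is f)  # True
```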