Flyweighting in Python
If you are dealing with large, static datasets in Python, it can be useful to flyweight your objects. With flyweighting, every time you construct an object you first check whether an equivalent one already exists. If it does, the existing object is returned instead of a duplicate being constructed.
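For a single class, the idea can be sketched by hand. The `Colour` class below is a hypothetical example and not part of the code in this post. It assumes the constructor arguments are hashable, keeps a pool keyed on them, and hands back the pooled instance when one exists.

```python
import weakref


# Hypothetical example class; the names here are illustrative only.
class Colour(object):
    # Maps constructor arguments to live instances. The weak references let
    # an instance be garbage collected once nothing else refers to it.
    _pool = weakref.WeakValueDictionary()

    def __new__(cls, red, green, blue):
        key = (red, green, blue)
        instance = cls._pool.get(key)
        if instance is None:
            # No equivalent object yet: build one and remember it.
            instance = super(Colour, cls).__new__(cls)
            instance.red, instance.green, instance.blue = red, green, blue
            cls._pool[key] = instance
        return instance


a = Colour(255, 0, 0)
b = Colour(255, 0, 0)
c = Colour(0, 0, 255)
print(a is b)  # True: the second call returned the pooled object
print(a is c)  # False: different arguments, different object
```

The drawback is that the pooling logic has to be repeated in every class that wants it.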
Recently I wrote a little bit of code to achieve this in the general case:

    import weakref

    class FlyweightedObject(object):
        # Pool of live instances, keyed on (class, constructor arguments).
        _pool = weakref.WeakValueDictionary()

        def __new__(klass, *args):
            if hasattr(klass.__init__, 'im_func'):
                constructor = klass.__init__.im_func
                # Append any defaults the caller left out, so that omitted and
                # explicit default arguments produce the same key.
                arguments_missing = constructor.func_code.co_argcount - len(args) - 1
                if arguments_missing > 0:
                    args += constructor.func_defaults[-arguments_missing:]
            key = (klass,) + args
            instance = klass._pool.get(key, None)
            if instance is None:
                instance = object.__new__(klass)
                klass._pool[key] = instance
            return instance

Now when you inherit from FlyweightedObject you get the flyweighting thrown in for free:

    class Person(FlyweightedObject):
        def __init__(self, age, name='Simon'):
            self.age = age
            self.name = name

    f = Person(1)
    g = Person(1, 'Simon')
    print id(f) == id(g)
    # => True

One thing to watch out for is changing attributes after the object has been constructed. The pool key is computed once, from the constructor arguments, so the key and the object's attributes go out of sync:

    f = Person(1, 'Dave')
    f.name = 'Simon'
    g = Person(1, 'Simon')
    print id(f) == id(g)
    # => False

One last thing. To get pickle working with flyweighted objects you’ll have to define a __getnewargs__ method, which returns the tuple that will be passed to __new__ on unpickling (pickle consults it for protocol 2 and later):

    class Person(FlyweightedObject):
        def __init__(self, age, name='Simon'):
            self.name = name
            self.age = age

        def __getnewargs__(self):
            return self.age, self.name

This can also be automated, as long as each instance attribute has the same name as the corresponding __init__ argument:

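    def __getnewargs__(self):
        if hasattr(self.__class__.__init__, 'im_func'):
            constructor = self.__class__.__init__.im_func
            # co_varnames lists the arguments first and local variables after
            # them, so stop at co_argcount; index 0 is self.
            argument_names = constructor.func_code.co_varnames[1:constructor.func_code.co_argcount]
            return tuple(getattr(self, attr) for attr in argument_names)
        return tuple()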
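
The listings above rely on Python 2 attributes (`im_func`, `func_code`, `func_defaults`) that no longer exist in Python 3. A rough Python 3 equivalent might look like the sketch below. It is not the original code: it swaps in `inspect.signature` to normalise the arguments, and it assumes that `__init__` takes only ordinary named parameters with hashable values and stores each one under an attribute of the same name.

```python
import inspect
import pickle
import weakref


class FlyweightedObject:
    """Python 3 sketch of the pooling base class (not the original code)."""

    _pool = weakref.WeakValueDictionary()

    def __new__(cls, *args, **kwargs):
        # Bind the call to __init__'s signature and fill in the defaults, so
        # positional, keyword and omitted-default calls all share one key.
        signature = inspect.signature(cls.__init__)
        bound = signature.bind(None, *args, **kwargs)  # None stands in for self
        bound.apply_defaults()
        key = (cls,) + tuple(bound.arguments.values())[1:]
        instance = cls._pool.get(key)
        if instance is None:
            instance = super().__new__(cls)
            cls._pool[key] = instance
        return instance

    def __getnewargs__(self):
        # Assumption: every __init__ parameter is stored on the instance
        # under the same name.
        names = list(inspect.signature(type(self).__init__).parameters)[1:]
        return tuple(getattr(self, name) for name in names)


class Person(FlyweightedObject):
    def __init__(self, age, name='Simon'):
        self.age = age
        self.name = name


f = Person(1)
g = Person(age=1, name='Simon')
print(f is g)  # True

# Unpickling calls __new__ with the tuple from __getnewargs__, so it finds
# the pooled instance for as long as that instance is still alive.
restored = pickle.loads(pickle.dumps(f))
print(restored is f)  # True
```

As in the Python 2 version, `__init__` still runs on every construction call, and changing an attribute afterwards leaves the pool key stale.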