Exploring currently live objects

The snapshot function captures the current state of the Python interpreter and returns a graph of all tracked objects and the references between them. This is useful for finding out what’s keeping an object alive.

Note

Simple objects like integers and strings are not tracked by the cyclic garbage collector, so they won’t show up in the graph returned by snapshot.
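
The standard library’s gc.is_tracked function shows this distinction directly. The following is a minimal sketch, assuming CPython; the Tracked class is a hypothetical stand-in for any user-defined class:

```python
import gc


class Tracked:
    """Hypothetical example class; its instances can refer to other objects."""


# Atomic objects cannot take part in reference cycles, so the
# cyclic garbage collector does not track them.
int_tracked = gc.is_tracked(42)
str_tracked = gc.is_tracked("hello")

# Containers and instances of user-defined classes are tracked.
list_tracked = gc.is_tracked([])
instance_tracked = gc.is_tracked(Tracked())

print(int_tracked, str_tracked, list_tracked, instance_tracked)
```

Only objects for which gc.is_tracked returns True can be expected to appear in a snapshot.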

Here’s a worked example. We start with a simple asynchronous worker:

def worker(jobs_queue, results_queue):
    while True:
        job = jobs_queue.get()
        result = job()
        results_queue.put(result)

It listens for incoming jobs on one queue, performs the computation represented by each job, and puts the result on a second queue. We’ll start it running on a separate thread like this:

import threading
from queue import Queue

jobs_queue, results_queue = Queue(), Queue()
t = threading.Thread(target=worker, args=(jobs_queue, results_queue))
t.daemon = True
t.start()

Now we create some computations, feed them to the worker, and wait for and print the results:

class SomeComputation(object):
    def __init__(self, value):
        self.value = value

    def compute(self):
        return self.value**2

def do_some_computations(jobs_queue, results_queue):
    computations = [SomeComputation(n) for n in [11, 15, 17]]
    for computation in computations:
        jobs_queue.put(computation.compute)
    for computation in computations:
        print(results_queue.get())

do_some_computations(jobs_queue, results_queue)

So far, so good. But now we notice that, for some reason, an instance of SomeComputation is still alive after the do_some_computations call returns. In this simple case that’s not really an issue, but imagine replacing SomeComputation with something more complicated that’s holding onto a system resource of some kind. We want to find out what’s keeping it alive, and how we can fix the problem. We run the above code under python -i, and take a snapshot:

>>> import refcycle
>>> snapshot = refcycle.snapshot()
>>> snapshot
<refcycle.object_graph.ObjectGraph object of size 5797 at 0x1004ca110>

An ObjectGraph acts as a container, so we can search through it for the SomeComputation instance:

>>> computations = [obj for obj in snapshot
...                 if isinstance(obj, SomeComputation)]
>>> computations
[<__main__.SomeComputation object at 0x1004ca050>]
>>> c = computations[0]

Now we can use the ancestors method to find out what’s keeping references to c.

>>> snapshot.ancestors(c)
<refcycle.object_graph.ObjectGraph object of size 5 at 0x10242db50>
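
The first generation of such an ancestor graph can be approximated with the standard library alone: gc.get_referrers lists the tracked objects that refer directly to a given object. The following is a rough sketch, independent of refcycle and assuming CPython; the names c and job are chosen here for illustration, and the result will also contain unrelated referrers such as the enclosing namespace:

```python
import gc


class SomeComputation:
    def __init__(self, value):
        self.value = value

    def compute(self):
        return self.value ** 2


c = SomeComputation(11)
job = c.compute  # a bound method; it holds c through its __self__ attribute

# Every gc-tracked object that refers directly to c.
referrers = gc.get_referrers(c)
held_by_job = any(referrer is job for referrer in referrers)

print(job.__self__ is c, held_by_job)
```

Unlike ancestors, this gives only direct referrers and no graph structure, so following a longer chain means repeating the call by hand.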

In this particular case the graph of all ancestors is very small. More typically, that graph is much larger, so it’s often convenient to limit the search to a given number of generations, for example with:

>>> snapshot.ancestors(c, generations=5)
<refcycle.object_graph.ObjectGraph object of size 5 at 0x10242db10>

Either way, we can now export this graph as an image:

>>> snapshot.ancestors(c).export_image('computations.svg')

This gives the following rather simple graph:

[Figure: computations.svg, the graph of c and its ancestors]

So it’s the compute bound method keeping c alive (through its __self__ reference). What’s keeping that alive is a frame object: the execution frame for the long-running worker function. Its local variable job is still referring to our bound method. Looking back at the original code, the cause is clear: the job local variable retains its reference to the job until the get call on the job queue returns the next job. But after the last job has been submitted, that get call waits forever, so the reference to the last job never disappears. And in this case the fix is easy: add a del job to the end of the while loop:

def worker(jobs_queue, results_queue):
    while True:
        job = jobs_queue.get()
        result = job()
        results_queue.put(result)
        del job
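
The effect of the fix can be checked without refcycle by holding only a weak reference to the computation. The following is a minimal sketch, assuming CPython’s reference counting; leaky_worker, fixed_worker and outlives_job are hypothetical names introduced for this illustration:

```python
import threading
import time
import weakref
from queue import Queue


class SomeComputation:
    def __init__(self, value):
        self.value = value

    def compute(self):
        return self.value ** 2


def leaky_worker(jobs_queue, results_queue):
    while True:
        job = jobs_queue.get()  # blocks here while still holding the old job
        result = job()
        results_queue.put(result)


def fixed_worker(jobs_queue, results_queue):
    while True:
        job = jobs_queue.get()
        result = job()
        results_queue.put(result)
        del job  # drop the reference before blocking in get() again


def outlives_job(worker, timeout=0.5):
    """Return True if the computation is still alive after its job is done."""
    jobs_queue, results_queue = Queue(), Queue()
    thread = threading.Thread(
        target=worker, args=(jobs_queue, results_queue), daemon=True
    )
    thread.start()

    computation = SomeComputation(11)
    ref = weakref.ref(computation)
    jobs_queue.put(computation.compute)
    results_queue.get()
    del computation  # the main thread gives up its own reference

    # The worker may not have reached its `del job` yet, so poll briefly.
    deadline = time.monotonic() + timeout
    while ref() is not None and time.monotonic() < deadline:
        time.sleep(0.01)
    return ref() is not None


leaky_alive = outlives_job(leaky_worker)
fixed_alive = outlives_job(fixed_worker)
print(leaky_alive, fixed_alive)
```

With the leaky worker the weak reference stays live for the whole timeout, because the worker’s job local still holds the bound method; with the fixed worker it dies as soon as the worker reaches del job.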

What about the other two frame objects in the graph? The worker thread spends almost all its time waiting, and the top frame in the graph is the worker thread’s current frame: it is executing the wait method of the threading.Condition object used by the jobs queue. Its f_back edge points to the calling frame, in this case the queue.Queue.get method call, whose f_back points in turn to the frame of our worker function.
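
The same chain of frames can be inspected directly. The following is a minimal sketch that relies on sys._current_frames, a CPython-specific function, and on the internal structure of queue.Queue described above; frame_names is a hypothetical helper introduced here:

```python
import sys
import threading
import time
from queue import Queue


def worker(jobs_queue, results_queue):
    while True:
        job = jobs_queue.get()
        result = job()
        results_queue.put(result)


def frame_names(thread):
    """Return function names from the thread's current frame outwards."""
    frame = sys._current_frames().get(thread.ident)
    names = []
    while frame is not None:
        names.append(frame.f_code.co_name)
        frame = frame.f_back  # step to the calling frame
    return names


t = threading.Thread(target=worker, args=(Queue(), Queue()), daemon=True)
t.start()

# Give the worker a moment to block inside jobs_queue.get().
deadline = time.monotonic() + 2.0
names = frame_names(t)
while "get" not in names and time.monotonic() < deadline:
    time.sleep(0.01)
    names = frame_names(t)

print(names)
```

The printed list starts at the innermost frame, so the Queue.get frame appears before the worker frame, followed by the threading module’s own bootstrap frames.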