Python Internals – Integer object pool (PyIntObject)

Starting from this post, I will try to make a series of blogs on Python Internals, where Python object mechanism, Python bytecode (pyc) and Python VM are gonna be discussed. We will also talk about the limited resource online about Python internals. However, this series of posts owe its inspiring thoughts to the book “Python Source Code Analysis” by Robert Chen. Happy hacking!

0. Quiz

Let us start our journey from an ‘everyone-loved’ quiz. Relax. It is simple to read but tricky to think. There are 2 code snippets down there. Which one would you think runs faster?

# Small Integers
for j in range(0, 100000):
	for i in range(0, 256):
		pass
# Big Integers
for j in range(0, 100000):
	for i in range(1000, 1256):
		pass

Actually, I can make it even more ‘fun’ by adding one more code snippet down there. Now which one is the fastest? And the slowest?

# Bigger Integers
for j in range(0, 100000):
	for i in range(10000, 10256):
		pass

1. Answer

Theoretically, the speed comparison of these 3 snippets running follows this way: smaller > big == bigger. If this is a little bit confusing and surprising, please read on:)

2. PyIntObject

Like everyone knows, everything in Python is an object (PyObject) and Integer is not an exception. When we write down something like ‘a = 5’, Python tries to construct an Integer object (PyIntObject, one can think of it as a container) whose value is 5 internally. This construction usually involves ‘new’ in C++ and ‘malloc’ in C. As Integer in Python is immutable, almost all the integers have to have their own memory allocation even holding the same value (we will talk about why it is ‘almost’ here soon). Apparently, it is not a good design to call ‘malloc’ each time when an integer comes up and Python has been complained about its performance for a while. So how would we (Python) handle this case internally?

Like everyone knows (again?), the answer is integer object pool. This pool holds numbers of PyIntObjects which has been allocated already but not initialized (no valid value or type). Whenever a new integer shows up, a free and existing PyIntObject from the pool will be filled up and initialized without bothering the ‘malloc’. SoThe block_list is the pool and the free_list is the list of available PyIntObjects within the pool.

/* daveti: Python-2.7.8/Objects/intobject.c */
#define BLOCK_SIZE      1000    /* 1K less typical malloc overhead */
#define BHEAD_SIZE      8       /* Enough for a 64-bit pointer */
#define N_INTOBJECTS    ((BLOCK_SIZE - BHEAD_SIZE) / sizeof(PyIntObject))

struct _intblock {
    struct _intblock *next;
    PyIntObject objects[N_INTOBJECTS];
};

typedef struct _intblock PyIntBlock;

static PyIntBlock *block_list = NULL;
static PyIntObject *free_list = NULL;

But still, this does not explain the answer to the quiz. Let us move forward. Even though there is only one integer object pool. This pool is divided into 2 parts, small integers and big integers. Big integers are exactly the case mentioned above, where PyIntObject is used as a container for the integer and even integers having the same value have their own PyIntObject. However, small integers have a different story. Though in the same integer pool, small integers have been initialized with the small-integer values already. Following small integers will not trigger a new PyIntObject in the pool but refer to the same PyIntObject with the same value. This is a little bit confusing. Let us have an example here: for ‘a = 5’, there exists a PyIntObject with the value of 5 in the pool. So ‘a’ will ‘point’ to that PyIntObject. When ‘b = 5’ shows up, ‘b’ also ‘point’s to the same PyIntObject like ‘a’ does. However, for ‘a = 10000’, it is a big integer. Python will try to find an available PyIntObject from the pool and initialize it with the value of 10000. When ‘b = 10000’ comes up, a same procedure has to repeat again.

The difference between the small integers and the big ones also rises 2 questions – what is changed within the PyIntObject for small integers with the same value and what are small integers? As no new PyIntObject is requested from the pool, small integers with the same value, ‘point’ing to the same PyIntObject, would only change the reference count (refcount) within the PyIntObject. Unlike the big integers, small integers stay in the memory during the life of Python VM/interpreter, with the refcount never being 0. The definition of small integers are defined as below:

/* daveti: Python-2.7.8/Objects/intobject.c */
#ifndef NSMALLPOSINTS
#define NSMALLPOSINTS           257
#endif
#ifndef NSMALLNEGINTS
#define NSMALLNEGINTS           5
#endif
#if NSMALLNEGINTS + NSMALLPOSINTS > 0
/* References to small integers are saved in this array so that they
   can be shared.
   The integers that are saved are those in the range
   -NSMALLNEGINTS (inclusive) to NSMALLPOSINTS (not inclusive).
*/
static PyIntObject *small_ints[NSMALLNEGINTS + NSMALLPOSINTS];

During the initialization of Python, small integers, which are also built up the integer object pool, are initialized with the range [-5, 257) and ready to use.

/* daveti: Python-2.7.8/Objects/intobject.c */
int
_PyInt_Init(void)
{
    PyIntObject *v;
    int ival;
#if NSMALLNEGINTS + NSMALLPOSINTS > 0
    //daveti: init small/big int pool!
    for (ival = -NSMALLNEGINTS; ival < NSMALLPOSINTS; ival++) {
          if (!free_list && (free_list = fill_free_list()) == NULL)
                    return 0;
        /* PyObject_New is inlined */
        v = free_list;
        free_list = (PyIntObject *)Py_TYPE(v);
        PyObject_INIT(v, &PyInt_Type);
        v->ob_ival = ival;
        //daveti: small int pool is based on block_list (big int pool) as well!
        small_ints[ival + NSMALLNEGINTS] = v;
    }
#endif
    return 1;
}

When an integer from your program comes to play, Python would try to determine if this is a small integer, which means the corresponding PyIntObject is already there for use. Otherwise, it should be a big integer, which means we have to find a free PyIntObject from the pool and initialize it with the corresponding value.

/* daveti: Python-2.7.8/Objects/intobject.c */
PyObject *
PyInt_FromLong(long ival)
{
    register PyIntObject *v;
    //daveti: try to use small int pool
#if NSMALLNEGINTS + NSMALLPOSINTS > 0
    if (-NSMALLNEGINTS <= ival && ival < NSMALLPOSINTS) {
        v = small_ints[ival + NSMALLNEGINTS];
        Py_INCREF(v);
#ifdef COUNT_ALLOCS
        if (ival >= 0)
            quick_int_allocs++;
        else
            quick_neg_int_allocs++;
#endif
        return (PyObject *) v;
    }
#endif
    if (free_list == NULL) {
        if ((free_list = fill_free_list()) == NULL)
            return NULL;
    }
    //daveti: use big int pool/block!
    /* Inline PyObject_New */
    v = free_list;
    //daveti: free_list gives the first available PyIntObject
    //then we use Py_TYPE(v) to update the free_list - to the next available position
    //Isn't this smart:)
    free_list = (PyIntObject *)Py_TYPE(v);
    //daveti: have to init it anyway because of missing invalid ob_type
    PyObject_INIT(v, &PyInt_Type);
    v->ob_ival = ival;
    return (PyObject *) v;
}

Now let us go back to the quiz. The outside loop among those code snippets is the same. The only difference is the inner loop, which has different ranges. For the small case, the range is [0, 256), which would be treated as small integers by Python, which means Python would do nothing but update the reference count within the existing PyIntObject during each loop. For the big case, the range is [1000, 1256), which would be treated as big integers by Python, which means Python would try to find a free PyIntObject from the pool and initialize it with the corresponding value during each loop. The bigger case is the same as the big one.

3. Tests

To prove our analysis, we need to make sure there is no difference between the small case and the big case, from the Python bytecode level:

Daves-MacBook-Pro-2:blog daveti$ python
Python 2.7.6 (default, Sep 9 2014, 15:04:36)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.39)] on darwin
Type “help”, “copyright”, “credits” or “license” for more information.
>>> def foos():
… for j in range(0, 100000):
… for i in range(0, 256):
… pass

>>> def foob():
… for j in range(0, 100000):
… for i in range(1000, 1256):
… pass

>>> import dis
>>> dis.dis(foos)
2 0 SETUP_LOOP 49 (to 52)
3 LOAD_GLOBAL 0 (range)
6 LOAD_CONST 1 (0)
9 LOAD_CONST 2 (100000)
12 CALL_FUNCTION 2
15 GET_ITER
>> 16 FOR_ITER 32 (to 51)
19 STORE_FAST 0 (j)

3 22 SETUP_LOOP 23 (to 48)
25 LOAD_GLOBAL 0 (range)
28 LOAD_CONST 1 (0)
31 LOAD_CONST 3 (256)
34 CALL_FUNCTION 2
37 GET_ITER
>> 38 FOR_ITER 6 (to 47)
41 STORE_FAST 1 (i)

4 44 JUMP_ABSOLUTE 38
>> 47 POP_BLOCK
>> 48 JUMP_ABSOLUTE 16
>> 51 POP_BLOCK
>> 52 LOAD_CONST 0 (None)
55 RETURN_VALUE
>>> dis.dis(foob)
2 0 SETUP_LOOP 49 (to 52)
3 LOAD_GLOBAL 0 (range)
6 LOAD_CONST 1 (0)
9 LOAD_CONST 2 (100000)
12 CALL_FUNCTION 2
15 GET_ITER
>> 16 FOR_ITER 32 (to 51)
19 STORE_FAST 0 (j)

3 22 SETUP_LOOP 23 (to 48)
25 LOAD_GLOBAL 0 (range)
28 LOAD_CONST 3 (1000)
31 LOAD_CONST 4 (1256)
34 CALL_FUNCTION 2
37 GET_ITER
>> 38 FOR_ITER 6 (to 47)
41 STORE_FAST 1 (i)

4 44 JUMP_ABSOLUTE 38
>> 47 POP_BLOCK
>> 48 JUMP_ABSOLUTE 16
>> 51 POP_BLOCK
>> 52 LOAD_CONST 0 (None)
55 RETURN_VALUE
>>>

Then let us go and profile them:

Daves-MacBook-Pro-2:python daveti$ cat small_int.py
# Small Integers
for j in range(0, 100000):
for i in range(0, 256):
pass
Daves-MacBook-Pro-2:python daveti$ cat big_int.py
# Big Integers
for j in range(0, 100000):
for i in range(1000, 1256):
pass
Daves-MacBook-Pro-2:python daveti$ python -m cProfile small_int.py
100003 function calls in 1.347 seconds

Ordered by: standard name

ncalls tottime percall cumtime percall filename:lineno(function)
1 1.209 1.209 1.347 1.347 small_int.py:2()
1 0.000 0.000 0.000 0.000 {method ‘disable’ of ‘_lsprof.Profiler’ objects}
100001 0.137 0.000 0.137 0.000 {range}

Daves-MacBook-Pro-2:python daveti$ python -m cProfile big_int.py
100003 function calls in 1.513 seconds

Ordered by: standard name

ncalls tottime percall cumtime percall filename:lineno(function)
1 1.327 1.327 1.513 1.513 big_int.py:2()
1 0.000 0.000 0.000 0.000 {method ‘disable’ of ‘_lsprof.Profiler’ objects}
100001 0.186 0.000 0.186 0.000 {range}

4. Thoughts

Till now, I hope you had enjoyed this post and been looking forward to my next writing:) But there is something more before we end this post. We now are clear with the idea of Integer object pool, its mechanism and implementation in Python and we now know answer to the quiz. But the key point of the quiz is not just that simple. It is telling us the performance of your Python program, if using a lot of integers, from either the view of time or space, depends on the value of these integers! Isn’t this surprising! Moving forward, to gain the best performance, we can change the range of small integers of Python. Similarly, programs, all using small integers, would have the best performance. So, either hack the Python compiler and your program, your choice:)

About daveti

Interested in kernel hacking, compilers, machine learning and guitars.
This entry was posted in Programming, Stuff about Compiler and tagged , . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s