
Module asizeof


This module exposes 10 functions and 2 classes to obtain lengths and sizes of Python objects (for Python 2.2 or later [1]).

The main changes in this version are the new function calcsize(), the use of gc.get_objects() to get all objects, and improvements to this documentation.

Public Functions [2]

Function asizeof calculates the combined (approximate) size in bytes of one or several Python objects.

Function asizesof returns a tuple containing the (approximate) size in bytes for each given Python object separately.

Function asized returns for each object an instance of class Asized containing all the size information of the object and a tuple with the referents.

Functions basicsize and itemsize return the basic size and the item size of the given object, respectively.

Function flatsize returns the flat size of a Python object in bytes defined as the basic size plus the item size times the length of the given object.

Function leng returns the length of an object, like standard len but extended for several types, e.g. the leng of a multi-precision int (or long) is the number of digits [3]. The length of most mutable sequence objects includes an estimate of the over-allocation; therefore, the leng value may differ from the standard len result.
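
As a rough illustration of the digit-based length described above, the following sketch counts the internal digits of an int from its bit length. It assumes CPython and a Python version that provides sys.int_info; the helper name int_digits is hypothetical, and this is not necessarily how leng (or _len_int) arrives at its value.

```python
import sys

def int_digits(value):
    """Estimate the number of internal digits CPython uses for an int."""
    bits = abs(value).bit_length()
    per_digit = sys.int_info.bits_per_digit  # typically 15 or 30
    # Ceiling division: a partially filled digit still counts as one.
    return -(-bits // per_digit)

one_digit = (1 << sys.int_info.bits_per_digit) - 1
two_digits = 1 << sys.int_info.bits_per_digit
print(int_digits(0), int_digits(one_digit), int_digits(two_digits))
```

Multiplying the digit count by the __itemsize__ of int gives an estimate of the variable part of the object's flat size.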

Function refs returns (a generator for) the referents of the given object, i.e. the objects referenced by the given object.

Function calcsize is equivalent to standard struct.calcsize but handles format characters 'z' for signed C type Py_ssize_t and 'Z' for unsigned C type size_t.
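
The idea of extending struct.calcsize with extra format characters can be sketched as a simple translation step. This is only an illustration under stated assumptions: it relies on the native-mode format characters 'n' (ssize_t) and 'N' (size_t) that the standard struct module offers in Python 3.3 and later, and the name calcsize_zZ is hypothetical. The module's own calcsize may map the characters differently.

```python
import struct

def calcsize_zZ(fmt):
    """struct.calcsize() that also accepts 'z' (ssize_t) and 'Z' (size_t).

    Assumes a native-mode format string (no byte-order prefix or '@'),
    because 'n' and 'N' are only valid in native mode.
    """
    translated = fmt.replace('z', 'n').replace('Z', 'N')
    return struct.calcsize(translated)

print(calcsize_zZ('z'), calcsize_zZ('Z'), calcsize_zZ('zZ'))
```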

Certain classes are known to be sub-classes of or to behave as dict objects. Function adict can be used to install other class objects to be treated like dict.

Public Classes [2]

An instance of class Asized is returned for each object sized with the asized function or method.

Class Asizer can be used to accumulate the results of several asizeof or asizesof calls. After creating an Asizer instance, use methods asizeof and asizesof to size additional objects.

Call methods exclude_refs and/or exclude_types to exclude references to certain objects, or instances or types of certain objects.

Use one of the print_... methods to report the statistics.

Duplicate Objects

Any given objects that are duplicates are sized only once, and their size is included in the combined total only once. However, functions asizesof and asized still return a size value or an Asized instance, respectively, for each given object, including duplicates.
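
The duplicate rule can be sketched with a set of object ids. This is a simplified illustration that only uses flat sizes from sys.getsizeof (so it assumes a Python version that provides it); the function names combined_flat_size and each_flat_size are hypothetical and do not belong to this module.

```python
import sys

def combined_flat_size(*objs):
    """Sum the flat sizes of the given objects, counting duplicates once."""
    seen = set()
    total = 0
    for obj in objs:
        if id(obj) not in seen:  # identity, not equality, defines a duplicate
            seen.add(id(obj))
            total += sys.getsizeof(obj)
    return total

def each_flat_size(*objs):
    """Return one flat size per given object, duplicates included."""
    return tuple(sys.getsizeof(obj) for obj in objs)

data = [1, 2, 3]
print(combined_flat_size(data, data), each_flat_size(data, data))
```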

Definitions [4]

The size of an object is defined as the sum of the flat size of the object plus the sizes of any referents. Referents are visited recursively up to a given limit. However, the size of objects referenced multiple times is included only once.
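
The definition above can be sketched as a short recursive function. This is a minimal illustration, not the module's implementation: it assumes sys.getsizeof and gc.get_referents are available, it does not ignore any types, and the name total_size is hypothetical. Because nothing is ignored, passing class instances makes it follow references into the class and beyond.

```python
import gc
import sys

def total_size(obj, limit=100, _seen=None):
    """Flat size of obj plus the sizes of its referents, each counted once."""
    seen = set() if _seen is None else _seen
    if id(obj) in seen:
        return 0  # already counted through another reference
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if limit > 0:  # recurse only while the limit permits
        for ref in gc.get_referents(obj):
            size += total_size(ref, limit - 1, seen)
    return size

shared = 'x' * 100
pair = [shared, shared]
print(total_size(pair), total_size(pair, limit=0))
```

With limit=0 only the flat size of the list is reported; with the default limit the shared string is added exactly once even though the list references it twice.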

The flat size of an object is defined as the basic size of the object plus the item size times the number of allocated items. The flat size does include the size for the items (references to the referents), but not the referents themselves.

The flat size returned by function flatsize equals the result of the asizeof function with options code=True, ignored=False, limit=0 and option align set to the same value.

The accurate flat size for an object is obtained from function sys.getsizeof() where available. Otherwise, the length and size of container objects such as dicts, lists, sets, etc. are based on an estimate of the number of allocated items. As a result, the reported length and size may differ substantially from the actual length and size.

The basic and item sizes are obtained from the __basicsize__ and __itemsize__ attributes, respectively, of the (type of the) object. Where necessary (e.g. sequence objects), a zero __itemsize__ is replaced by the size of a corresponding C type.
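
The following sketch shows how a flat size estimate can be assembled from these two attributes. It is an illustration only: the name flat_estimate is hypothetical, it uses the standard len instead of an estimate of allocated items, and it does not replace a zero __itemsize__ (as for list) with a C type size. On CPython, sys.getsizeof may report a larger value because it can include the GC header.

```python
import sys

def flat_estimate(obj):
    """Basic size plus item size times the number of items."""
    typ = type(obj)
    try:
        count = len(obj)
    except TypeError:  # objects without a length have no items
        count = 0
    return typ.__basicsize__ + typ.__itemsize__ * count

point = (1, 2, 3)
print(flat_estimate(point), sys.getsizeof(point))
```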

The basic size of GC-managed objects includes the overhead for Python's garbage collector (GC) as well as the space needed for refcounts (only in certain Python builds).

Optionally, sizes can be aligned to a multiple of any power of 2.

Size of (byte)code

The (byte)code size of objects such as classes, functions, methods, modules, etc. can be included by setting option code.

Iterators are handled similarly to sequences: iterated object(s) are sized like referents if the recursion limit permits. Also, function gc.get_referents() must return the referent object of iterators.

Generators are sized as (byte)code only; generated objects are never sized.

Old- and New-style Classes

All old- and new-style class, instance and type objects are handled uniformly such that (a) instance and class objects can be distinguished and (b) instances of different old-style classes can be dealt with separately.

Class objects are represented as <class ....* def> and type objects as <type ... def>, where an '*' indicates an old-style class and the def suffix marks the definition object. Instances of old-style classes are shown as new-style ones but with an '*' at the end of the name, like <class module.name*>.

Ignored Objects

To avoid excessive sizes, several object types are ignored [4] by default, e.g. built-in functions, built-in types and classes [5], function globals and module referents. However, any instances thereof are sized and module objects will be sized when passed as given objects. Ignored object types are included if option ignored is set accordingly.

In addition, many __...__ attributes of callable objects are ignored, except crucial ones, e.g. class attributes __dict__, __doc__, __name__ and __slots__. For more details, see the type-specific _..._refs() and _len_...() functions below.

Option all can be used to size all Python objects and/or get the referents from gc.get_referents() and override the type-specific _..._refs() functions.

Notes

[1] Tested with Python 2.2.3, 2.3.7, 2.4.5, 2.5.1, 2.5.2, 2.6.2, 3.0.1 or 3.1a2 on CentOS 4.6, SuSE 9.3, MacOS X 10.4.11 Tiger (Intel) and 10.3.9 Panther (PPC), Solaris 10 (Opteron) and Windows XP, all with 32-bit Python, and on RHEL 3u7 and Solaris 10 (Opteron), both with 64-bit Python.

[2] The functions and classes in this module are not thread-safe.

[3] See Python source file .../Include/longintrepr.h for the C typedef of digit used in multi-precision int (or long) objects. The size of digit in bytes can be obtained in Python from the int (or long) __itemsize__ attribute. Function leng (or rather _len_int) below determines the number of digits from the int (or long) value.

[4] These definitions and other assumptions are rather arbitrary and may need corrections or adjustments.

[5] Types and classes are considered built-in if the module of the type or class is listed in _builtin_modules below.

Version: 5.12 (Apr 27, 2009)

Classes
  _Claskey
Wrapper for class objects.
  _Instkey
Wrapper for old-style class (instances).
  _NamedRef
Store referred object along with the name of the referent.
  _Slots
Wrapper class for __slots__ attribute at class instances to account for the size of the __slots__ tuple/list containing references to the attribute values.
  _Typedef
Type definition class.
  _Prof
Internal type profile class.
  Asized
Store the results of an asized object in these 4 attributes:
  Asizer
Sizer state and options.
Functions
 
calcsize(fmt)
struct.calcsize() handling 'z' for signed Py_ssize_t and 'Z' for unsigned size_t.
 
_items(obj)
Return iter-/generator, preferably.
 
_keys(obj)
Return iter-/generator, preferably.
 
_values(obj)
Use iter-/generator, preferably.
 
_kwds(**kwds)
Return name=value pairs as keywords dict.
 
_basicsize(t, base=0, heap=False, obj=None)
Get non-zero basicsize of type, including the header sizes.
 
_derive_typedef(typ)
Return single, existing super type typedef or None.
 
_dir2(obj, pref='', excl=(), slots=None, itor='')
Return an attribute name, object 2-tuple for certain attributes or for the '__slots__' attributes of the given object, but not both.
 
_infer_dict(obj)
Return True for likely dict object.
 
_isdictclass(obj)
Return True for known dict objects.
 
_issubclass(sub, sup)
Safe issubclass().
 
_itemsize(t, item=0)
Get non-zero itemsize of type.
 
_kwdstr(**kwds)
Keyword arguments as a string.
 
_lengstr(obj)
Object length as a string.
 
_nameof(obj, dflt='')
Return the name of an object.
 
_objs(objs, all=None, **unused)
Return the given or 'all' objects.
 
_p100(part, total, prec=1)
Return percentage as string.
 
_plural(num)
Return 's' if plural.
 
_power2(n)
Find the next power of 2.
 
_prepr(obj, clip=0)
Prettify and clip long repr() string.
 
_printf(fmt, *args, **print3opts)
Formatted print.
 
_refs(obj, named, *ats, **kwds)
Return specific attribute objects of an object.
 
_repr(obj, clip=80)
Clip long repr() string.
 
_SI(size, K=1024, i='i')
Return size as SI string.
 
_SI2(size, **kwds)
Return size as regular plus SI string.
 
_class_refs(obj, named)
Return specific referents of a class object.
 
_co_refs(obj, named)
Return specific referents of a code object.
 
_dict_refs(obj, named)
Return key and value objects of a dict/proxy.
 
_enum_refs(obj, named)
Return specific referents of an enumerate object.
 
_exc_refs(obj, named)
Return specific referents of an Exception object.
 
_file_refs(obj, named)
Return specific referents of a file object.
 
_frame_refs(obj, named)
Return specific referents of a frame object.
 
_func_refs(obj, named)
Return specific referents of a function or lambda object.
 
_gen_refs(obj, named)
Return the referent(s) of a generator object.
 
_im_refs(obj, named)
Return specific referents of a method object.
 
_inst_refs(obj, named)
Return specific referents of a class instance.
 
_iter_refs(obj, named)
Return the referent(s) of an iterator object.
 
_module_refs(obj, named)
Return specific referents of a module object.
 
_prop_refs(obj, named)
Return specific referents of a property object.
 
_seq_refs(obj, unused)
Return specific referents of a frozen/set, list, tuple and xrange object.
 
_stat_refs(obj, named)
Return referents of a os.stat object.
 
_statvfs_refs(obj, named)
Return referents of a os.statvfs object.
 
_tb_refs(obj, named)
Return specific referents of a traceback object.
 
_type_refs(obj, named)
Return specific referents of a type object.
 
_weak_refs(obj, unused)
Return weakly referent object.
 
_len(obj)
Safe len().
 
_len_array(obj)
Array length in bytes.
 
_len_bytearray(obj)
Bytearray size.
 
_len_code(obj)
Length of code object (stack and variables only).
 
_len_dict(obj)
Dict length in items (estimate).
 
_len_frame(obj)
Length of a frame object.
 
_len_int(obj)
Length of multi-precision int (aka long) in digits.
 
_len_iter(obj)
Length (hint) of an iterator.
 
_len_list(obj)
Length of list (estimate).
 
_len_module(obj)
Module length.
 
_len_set(obj)
Length of frozen/set (estimate).
 
_len_slice(obj)
Slice length.
 
_len_slots(obj)
Slots length.
 
_len_struct(obj)
Struct length in bytes.
 
_len_unicode(obj)
Unicode size.
 
_claskey(obj, style)
Wrap an old- or new-style class object.
 
_instkey(obj)
Wrap an old-style class (instance).
 
_keytuple(obj)
Return class and instance keys for a class.
 
_objkey(obj)
Return the key for any object.
 
_typedef_both(t, base=0, item=0, leng=None, refs=None, kind='static', heap=False)
Add new typedef for both data and code.
 
_typedef_code(t, base=0, refs=None, kind='static', heap=False)
Add new typedef for code only.
 
_typedef(obj, derive=False, infer=False)
Create a new typedef for an object.
 
adict(*classes)
Install one or more classes to be handled as dict.
 
asized(*objs, **opts)
Return a tuple containing an Asized instance for each object passed as positional argument using the following options.
 
asizeof(*objs, **opts)
Return the combined size in bytes of all objects passed as positional arguments.
 
asizesof(*objs, **opts)
Return a tuple containing the size in bytes of all objects passed as positional arguments using the following options.
 
_typedefof(obj, save=False, **opts)
Get the typedef for an object.
 
basicsize(obj, **opts)
Return the basic size of an object (in bytes).
 
flatsize(obj, align=0, **opts)
Return the flat size of an object (in bytes), optionally aligned to a given power of 2.
 
itemsize(obj, **opts)
Return the item size of an object (in bytes).
 
leng(obj, **opts)
Return the length of an object (in items).
 
refs(obj, all=False, **opts)
Return (a generator for) specific referents of an object.
Variables
  _builtin_modules = ('__builtin__', 'types', 'exceptions', 'top...
  _sizeof_Cbyte = 1
  _sizeof_Clong = 4
  _sizeof_Cvoidp = 4
  _Zz = 'Ll'
  _sizeof_CPyCodeObject = 68
  _sizeof_CPyFrameObject = 316
  _sizeof_CPyModuleObject = 12
  _sizeof_CPyDictEntry = 12
  _sizeof_Csetentry = 8
  _sizeof_Cdigit = 2
  _sizeof_Cunicode = 2
  _sizeof_CPyGC_Head = 12
  _sizeof_Crefcounts = 0
  _Py_TPFLAGS_HEAPTYPE = 512
  _Py_TPFLAGS_HAVE_GC = 16384
  _all_refs = None, _class_refs, _co_refs, _dict_refs, _enum_ref...
  _digit2p2 = 65536
  _digitmax = 65535
  _digitlog = 0.0901684400556
  _all_lengs = None, _len, _len_array, _len_bytearray, _len_code...
  _old_style = '*'
  _new_style = ''
  _claskeys = {6956320: <type 'Struct' def>, 14939424: <type 'ar...
  _instkeys = {}
  _all_kinds = ('static', 'dynamic', 'derived', 'ignored', 'infe...
  _typedefs = {<type 'listiterator' def>: (436, 0, None, <functi...
  _dict_typedef = (136, 12, <function _len_dict at 0xbe1f684>, <...
  _dict_classes = {'UserDict': ('IterableUserDict', 'UserDict'),...
  _asizer = Asizer()
  __package__ = 'topo.misc'
  _kind_derived = 'derived'
  _kind_dynamic = 'dynamic'
  _kind_ignored = 'ignored'
  _kind_inferred = 'inferred'
  _kind_static = 'static'
Function Details

_dir2(obj, pref='', excl=(), slots=None, itor='')

Return an attribute name, object 2-tuple for certain attributes or for the '__slots__' attributes of the given object, but not both. Any iterator referent objects are returned with the given name if the latter is non-empty.

asized(*objs, **opts)


Return a tuple containing an Asized instance for each object passed as positional argument using the following options.

  align=8 -- size alignment
  all=False -- all current GC objects and referents
  clip=80 -- clip repr() strings
  code=False -- incl. (byte)code size
  derive=False -- derive from super type
  detail=0 -- Asized refs level
  ignored=True -- ignore certain types
  infer=False -- try to infer types
  limit=100 -- recursion limit
  stats=0.0 -- print statistics and cutoff percentage

If only one object is given, the return value is the Asized instance for that object.

Set detail to the desired referents level (recursion depth).

See function asizeof for descriptions of the other options.

The length of the returned tuple matches the number of given objects, if more than one object is given.

asizeof(*objs, **opts)


Return the combined size in bytes of all objects passed as positional arguments.

The available options and defaults are the following.

  align=8 -- size alignment
  all=False -- all current GC objects and referents
  clip=80 -- clip repr() strings
  code=False -- incl. (byte)code size
  derive=False -- derive from super type
  ignored=True -- ignore certain types
  infer=False -- try to infer types
  limit=100 -- recursion limit
  stats=0.0 -- print statistics and cutoff percentage

Set align to a power of 2 to align sizes. Any value less than 2 avoids size alignment.
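
Rounding a size up to a power-of-2 boundary is commonly done with a bit mask. The sketch below illustrates that technique under stated assumptions: the name align_size is hypothetical, and rejecting a value that is not a power of 2 with ValueError is a choice made for this example, not documented behavior of this module.

```python
def align_size(size, align=8):
    """Round size up to the next multiple of align (a power of 2)."""
    if align < 2:
        return size  # values below 2 disable alignment
    if align & (align - 1):
        raise ValueError('align must be a power of 2: %r' % (align,))
    mask = align - 1
    return (size + mask) & ~mask

print(align_size(13), align_size(16), align_size(13, align=0))
```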

All current GC objects are sized if all is True and if no positional arguments are supplied. Also, if all is True the GC referents are used instead of the limited ones.

A positive clip value truncates all repr() strings to at most clip characters.

The (byte)code size of callable objects like functions, methods, classes, etc. is included only if code is True.

If derive is True, new types are handled like an existing (super) type provided there is one and only one of those.

By default, certain base types like object, super, etc. are ignored for sizing. Set ignored to False to include all ignored types in the size of objects.

If infer is True, new types are inferred from attributes (only implemented for dict types, based on callable attributes such as get, has_key, items, keys and values).

Set limit to a positive value to accumulate the sizes of the referents of each object, recursively up to the limit. Using limit zero returns the sum of the flat [1] sizes of the given objects. High limit values may cause runtime errors and miss objects for sizing.

A positive value for stats prints up to 8 statistics: (1) a summary of the number of objects sized and seen, (2) a simple profile of the sized objects by type and (3+) up to 6 tables showing the static, dynamic, derived, ignored, inferred and dict types used, found or installed, respectively.

The fractional part of the stats value (x 100) is the cutoff percentage for simple profiles. Objects below the cutoff value are not reported.
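
Splitting a stats value into its whole and fractional parts can be sketched as follows. This is only an illustration: the name split_stats is hypothetical, and treating the whole part as the level of detail is an assumption made for this example.

```python
import math

def split_stats(stats):
    """Return (whole part, cutoff percentage) of a stats value."""
    fraction, whole = math.modf(stats)
    # Round to hide binary floating-point noise, e.g. for 1.05.
    return int(whole), round(fraction * 100, 6)

print(split_stats(2.25), split_stats(1.0))
```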

[1] See the documentation of this module for the definition of flat size.

asizesof(*objs, **opts)


Return a tuple containing the size in bytes of all objects passed as positional arguments using the following options.

  align=8 -- size alignment
  all=False -- use GC objects and referents
  clip=80 -- clip repr() strings
  code=False -- incl. (byte)code size
  derive=False -- derive from super type
  ignored=True -- ignore certain types
  infer=False -- try to infer types
  limit=100 -- recursion limit
  stats=0.0 -- print statistics and cutoff percentage

See function asizeof for a description of the options.

The length of the returned tuple equals the number of given objects.

basicsize(obj, **opts)


Return the basic size of an object (in bytes).

Valid options and defaults are:

  derive=False -- derive type from super type
  infer=False -- try to infer types
  save=False -- save typedef if new

flatsize(obj, align=0, **opts)


Return the flat size of an object (in bytes), optionally aligned to a given power of 2.

See function basicsize for a description of the other options. See the documentation of this module for the definition of flat size.

itemsize(obj, **opts)


Return the item size of an object (in bytes).

See function basicsize for a description of the options.

leng(obj, **opts)


Return the length of an object (in items).

See function basicsize for a description of the options.

refs(obj, all=False, **opts)


Return (a generator for) specific referents of an object.

If all is True return the GC referents.

See function basicsize for a description of the options.


Variables Details

_builtin_modules

Value:
('__builtin__', 'types', 'exceptions', 'topo.misc.asizeof')

_all_refs

Value:
None, _class_refs, _co_refs, _dict_refs, _enum_refs, _exc_refs, _file_refs, _frame_refs, _func_refs, _gen_refs, _im_refs, _inst_refs, _iter_refs, _module_refs, _prop_refs, _seq_refs, _stat_refs, _statvfs_refs, _tb_refs, _type_refs, _weak_refs

_all_lengs

Value:
None, _len, _len_array, _len_bytearray, _len_code, _len_dict, _len_frame, _len_int, _len_iter, _len_list, _len_module, _len_set, _len_slice, _len_slots, _len_struct, _len_unicode

_claskeys

Value:
{6956320: <type 'Struct' def>,
 14939424: <type 'array.array' def>,
 135673664: <type 'instancemethod' def>,
 135675168: <type 'exceptions.Exception' def>,
 135686688: <type 'file' def>,
 135691296: <type 'float' def>,
 135694112: <type 'int' def>,
 135695232: <type 'iterator' def>,
...

_all_kinds

Value:
('static', 'dynamic', 'derived', 'ignored', 'inferred')

_typedefs

Value:
{<type 'listiterator' def>: (436, 0, None, <function _type_refs at 0xbe1f534>, False, 'ignored', <type 'listiterator'>),
 <type 'tupleiterator' def>: (436, 0, None, <function _type_refs at 0xbe1f534>, False, 'ignored', <type 'tupleiterator'>),
 <type 'bytearray_iterator' def>: (436, 0, None, <function _type_refs at 0xbe1f534>, False, 'ignored', <type 'bytearray_iterator'>),
 <type 'listreverseiterator' def>: (436, 0, None, <function _type_refs at 0xbe1f534>, False, 'ignored', <type 'listreverseiterator'>),
...

_dict_typedef

Value:
(136, 12, <function _len_dict at 0xbe1f684>, <function _dict_refs at 0xbe1f1b4>, True, 'static', <type 'dict'>)

_dict_classes

Value:
{'UserDict': ('IterableUserDict', 'UserDict'),
 'weakref': ('WeakKeyDictionary', 'WeakValueDictionary')}