|
|
- Metadata-Version: 2.1
- Name: cymem
- Version: 2.0.2
- Summary: Manage calls to calloc/free through Cython
- Home-page: https://github.com/explosion/cymem
- Author: Matthew Honnibal
- Author-email: matt@explosion.ai
- License: MIT
- Platform: UNKNOWN
- Classifier: Environment :: Console
- Classifier: Intended Audience :: Developers
- Classifier: Intended Audience :: Science/Research
- Classifier: License :: OSI Approved :: MIT License
- Classifier: Operating System :: POSIX :: Linux
- Classifier: Operating System :: MacOS :: MacOS X
- Classifier: Operating System :: Microsoft :: Windows
- Classifier: Programming Language :: Cython
- Classifier: Programming Language :: Python :: 2.6
- Classifier: Programming Language :: Python :: 2.7
- Classifier: Programming Language :: Python :: 3.3
- Classifier: Programming Language :: Python :: 3.4
- Classifier: Programming Language :: Python :: 3.5
- Classifier: Programming Language :: Python :: 3.6
- Classifier: Programming Language :: Python :: 3.7
- Classifier: Topic :: Scientific/Engineering
-
- cymem: A Cython Memory Helper
- *****************************
-
- cymem provides two small memory-management helpers for Cython. They make it
- easy to tie memory to a Python object's life-cycle, so that the memory is freed
- when the object is garbage collected.
-
- .. image:: https://img.shields.io/travis/explosion/cymem/master.svg?style=flat-square&logo=travis
-     :target: https://travis-ci.org/explosion/cymem
-
- .. image:: https://img.shields.io/appveyor/ci/explosion/cymem/master.svg?style=flat-square&logo=appveyor
-     :target: https://ci.appveyor.com/project/explosion/cymem
-     :alt: Appveyor Build Status
-
- .. image:: https://img.shields.io/pypi/v/cymem.svg?style=flat-square
-     :target: https://pypi.python.org/pypi/cymem
-     :alt: pypi Version
-
- .. image:: https://img.shields.io/conda/vn/conda-forge/cymem.svg?style=flat-square
-     :target: https://anaconda.org/conda-forge/cymem
-     :alt: conda Version
-
- .. image:: https://img.shields.io/badge/wheels-%E2%9C%93-4c1.svg?longCache=true&style=flat-square&logo=python&logoColor=white
-     :target: https://github.com/explosion/wheelwright/releases
-     :alt: Python wheels
-
- Overview
- ========
-
- The more useful of the two helpers is ``cymem.Pool``, a thin wrapper around the C
- ``calloc`` function:
-
- .. code:: cython
-
-     from cymem.cymem cimport Pool
-     cdef Pool mem = Pool()
-     data1 = <int*>mem.alloc(10, sizeof(int))
-     data2 = <float*>mem.alloc(12, sizeof(float))
-
- The ``Pool`` object stores the addresses it hands out and frees them when the
- object is garbage collected. Typically you'll attach the ``Pool`` to some cdef'd
- class. This is particularly handy for deeply nested structs, which have
- complicated initialization functions. Just pass the ``Pool`` object into the
- initializer and you don't have to worry about freeing your struct at all: every
- block obtained through ``Pool.alloc`` is freed automatically when the ``Pool``
- expires.
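-
- The bookkeeping described above can be modelled in plain Python. The sketch below
- is only a conceptual stand-in, not cymem's implementation: ``ToyPool`` and its
- attributes are hypothetical names, and ``ctypes`` buffers take the place of
- ``calloc``-ed memory.
-

```python
import ctypes


class ToyPool:
    """Illustrative stand-in for a memory pool; not cymem's implementation."""

    def __init__(self):
        # Maps each address handed out to the buffer that owns the memory.
        self.addresses = {}
        # Total number of bytes currently owned by the pool.
        self.size = 0

    def alloc(self, number, elem_size):
        # create_string_buffer returns zero-filled memory, much like calloc.
        buf = ctypes.create_string_buffer(number * elem_size)
        addr = ctypes.addressof(buf)
        # Keeping the buffer in the dict keeps the memory alive; once the
        # pool itself is collected, every buffer is released with it.
        self.addresses[addr] = buf
        self.size += number * elem_size
        return addr


pool = ToyPool()
data1 = pool.alloc(10, ctypes.sizeof(ctypes.c_int))
data2 = pool.alloc(12, ctypes.sizeof(ctypes.c_float))

# View the first block as an array of 10 C ints and write to it.
ints = (ctypes.c_int * 10).from_address(data1)
ints[3] = 7
```

-
- The caller never frees ``data1`` or ``data2`` individually; dropping the last
- reference to ``pool`` releases both blocks.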
-
- Installation
- ============
-
- Installation is via `pip <https://pypi.python.org/pypi/pip>`_, and requires `Cython <http://cython.org/>`_.
-
- .. code:: bash
-
-     pip install cymem
-
- Example Use Case: An array of structs
- =====================================
-
- Let's say we want a sequence of sparse matrices. We need fast access, and
- a Python list isn't performing well enough. So, we want a C-array or C++
- vector, which means we need the sparse matrix to be a C-level struct — it
- can't be a Python class. We can write this easily enough in Cython:
-
- .. code:: cython
-
-     """Example without Cymem
-
-     To use an array of structs, we must carefully walk the data structure when
-     we deallocate it.
-     """
-
-     from libc.stdlib cimport calloc, free
-
-     cdef struct SparseRow:
-         size_t length
-         size_t* indices
-         double* values
-
-     cdef struct SparseMatrix:
-         size_t length
-         SparseRow* rows
-
-     cdef class MatrixArray:
-         cdef size_t length
-         cdef SparseMatrix** matrices
-
-         def __cinit__(self, list py_matrices):
-             # Start from a known-empty state before any allocation happens.
-             self.length = 0
-             self.matrices = NULL
-
-         def __init__(self, list py_matrices):
-             self.length = len(py_matrices)
-             self.matrices = <SparseMatrix**>calloc(len(py_matrices), sizeof(SparseMatrix*))
-
-             for i, py_matrix in enumerate(py_matrices):
-                 self.matrices[i] = sparse_matrix_init(py_matrix)
-
-         def __dealloc__(self):
-             # Every matrix must be walked and freed before the outer array.
-             for i in range(self.length):
-                 sparse_matrix_free(self.matrices[i])
-             free(self.matrices)
-
-
-     cdef SparseMatrix* sparse_matrix_init(list py_matrix) except NULL:
-         sm = <SparseMatrix*>calloc(1, sizeof(SparseMatrix))
-         sm.length = len(py_matrix)
-         sm.rows = <SparseRow*>calloc(sm.length, sizeof(SparseRow))
-         cdef size_t i, j
-         cdef dict py_row
-         cdef size_t idx
-         cdef double value
-         for i, py_row in enumerate(py_matrix):
-             sm.rows[i].length = len(py_row)
-             sm.rows[i].indices = <size_t*>calloc(sm.rows[i].length, sizeof(size_t))
-             sm.rows[i].values = <double*>calloc(sm.rows[i].length, sizeof(double))
-             for j, (idx, value) in enumerate(py_row.items()):
-                 sm.rows[i].indices[j] = idx
-                 sm.rows[i].values[j] = value
-         return sm
-
-
-     cdef void sparse_matrix_free(SparseMatrix* sm) except *:
-         # Free in the reverse order of allocation: rows' arrays, rows, matrix.
-         cdef size_t i
-         for i in range(sm.length):
-             free(sm.rows[i].indices)
-             free(sm.rows[i].values)
-         free(sm.rows)
-         free(sm)
-
-
- We wrap the data structure in a Python ref-counted class at as low a level as
- we can, given our performance constraints. This lets us tie the memory to the
- object's life-cycle through Cython's special methods: ``__cinit__`` zeroes the
- fields, ``__init__`` allocates, and ``__dealloc__`` frees.
-
- However, it's very easy to make mistakes when writing the ``__dealloc__`` and
- ``sparse_matrix_free`` functions, leading to memory leaks. cymem removes the need
- to write these deallocators at all. Instead, you write as follows:
-
- .. code:: cython
-
-     """Example with Cymem.
-
-     Memory allocation is hidden behind the Pool class, which remembers the
-     addresses it gives out. When the Pool object is garbage collected, all of
-     its addresses are freed.
-
-     We don't need to write MatrixArray.__dealloc__ or sparse_matrix_free,
-     eliminating a common class of bugs.
-     """
-     from cymem.cymem cimport Pool
-
-     cdef struct SparseRow:
-         size_t length
-         size_t* indices
-         double* values
-
-     cdef struct SparseMatrix:
-         size_t length
-         SparseRow* rows
-
-
-     cdef class MatrixArray:
-         cdef size_t length
-         cdef SparseMatrix** matrices
-         cdef Pool mem
-
-         def __cinit__(self, list py_matrices):
-             self.mem = None
-             self.length = 0
-             self.matrices = NULL
-
-         def __init__(self, list py_matrices):
-             # The Pool owns every block allocated below.
-             self.mem = Pool()
-             self.length = len(py_matrices)
-             self.matrices = <SparseMatrix**>self.mem.alloc(self.length, sizeof(SparseMatrix*))
-             for i, py_matrix in enumerate(py_matrices):
-                 self.matrices[i] = sparse_matrix_init_cymem(self.mem, py_matrix)
-
-     cdef SparseMatrix* sparse_matrix_init_cymem(Pool mem, list py_matrix) except NULL:
-         sm = <SparseMatrix*>mem.alloc(1, sizeof(SparseMatrix))
-         sm.length = len(py_matrix)
-         sm.rows = <SparseRow*>mem.alloc(sm.length, sizeof(SparseRow))
-         cdef size_t i, j
-         cdef dict py_row
-         cdef size_t idx
-         cdef double value
-         for i, py_row in enumerate(py_matrix):
-             sm.rows[i].length = len(py_row)
-             sm.rows[i].indices = <size_t*>mem.alloc(sm.rows[i].length, sizeof(size_t))
-             sm.rows[i].values = <double*>mem.alloc(sm.rows[i].length, sizeof(double))
-             for j, (idx, value) in enumerate(py_row.items()):
-                 sm.rows[i].indices[j] = idx
-                 sm.rows[i].values[j] = value
-         return sm
-
-
- All that the ``Pool`` class does is remember the addresses it gives out. When the
- ``MatrixArray`` object is garbage-collected, the ``Pool`` object will also be garbage
- collected, which triggers a call to ``Pool.__dealloc__``. The ``Pool`` then frees all of
- its addresses. This saves you from walking back over your nested data structures
- to free them, eliminating a common class of errors.
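-
- The chain of events (owner collected, then pool collected, then everything
- freed) can be observed in a small pure-Python model. This is only a sketch under
- stated assumptions: ``ToyPool``, the ``freed`` log, and this Python ``MatrixArray``
- are hypothetical, ``__del__`` stands in for ``Pool.__dealloc__``, and ``bytearray``
- objects stand in for C memory.
-

```python
import gc

freed = []  # sizes of the blocks released so far, for illustration only


class ToyPool:
    """Illustrative stand-in for a memory pool; not cymem's implementation."""

    def __init__(self):
        self.blocks = []

    def alloc(self, number, elem_size):
        block = bytearray(number * elem_size)  # zero-filled, like calloc
        self.blocks.append(block)
        return block

    def __del__(self):
        # Stand-in for Pool.__dealloc__: release every block in one pass.
        for block in self.blocks:
            freed.append(len(block))
        self.blocks.clear()


class MatrixArray:
    def __init__(self, py_matrices):
        # The owner holds the only reference to the pool.
        self.mem = ToyPool()
        self.rows = [self.mem.alloc(len(m), 8) for m in py_matrices]


arr = MatrixArray([[1, 2], [3, 4, 5]])
nothing_freed_while_alive = (freed == [])

del arr       # the last reference to the owner goes away
gc.collect()  # make collection explicit on interpreters without refcounting
```

-
- Note that ``MatrixArray`` defines no cleanup method of its own; releasing the
- owner is enough to trigger the pool's cleanup.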
-
- Custom Allocators
- =================
-
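- Sometimes external C libraries use private functions to allocate and free objects,
- but we'd still like the convenience of the ``Pool``. In that case, wrap the
- library's functions (``priv_malloc`` and ``priv_free`` below) in ``WrapMalloc`` and
- ``WrapFree`` and pass them to the ``Pool`` constructor: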
-
- .. code:: cython
-
-     from cymem.cymem cimport Pool, WrapMalloc, WrapFree
-     cdef Pool mem = Pool(WrapMalloc(priv_malloc), WrapFree(priv_free))
-
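- The idea of injecting an allocator pair can be sketched in plain Python. This
- is a conceptual model rather than cymem's code: ``ToyPool``, its ``close`` method,
- and the logging ``priv_malloc`` and ``priv_free`` functions are all hypothetical
- stand-ins, and an explicit ``close`` replaces deallocation so the order of calls
- is easy to see.
-

```python
calls = []  # log of allocator activity, for illustration only


def priv_malloc(nbytes):
    # Hypothetical stand-in for a library's private allocator.
    calls.append(("malloc", nbytes))
    return bytearray(nbytes)


def priv_free(block):
    # Hypothetical stand-in for the matching private deallocator.
    calls.append(("free", len(block)))


class ToyPool:
    """Illustrative pool that delegates to caller-supplied functions."""

    def __init__(self, malloc, free):
        self._malloc = malloc
        self._free = free
        self._blocks = []

    def alloc(self, number, elem_size):
        # Allocate through the injected function and remember the block.
        block = self._malloc(number * elem_size)
        self._blocks.append(block)
        return block

    def close(self):
        # Release every remembered block through the injected free function.
        for block in self._blocks:
            self._free(block)
        self._blocks.clear()


pool = ToyPool(priv_malloc, priv_free)
pool.alloc(4, 8)
pool.alloc(2, 4)
pool.close()
```

-
- Every block is requested from and returned to the same allocator pair, which
- matters when a library's memory must not be released with the standard ``free``.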
-
|