Metadata-Version: 2.1
Name: cymem
Version: 2.0.2
Summary: Manage calls to calloc/free through Cython
Home-page: https://github.com/explosion/cymem
Author: Matthew Honnibal
Author-email: matt@explosion.ai
License: MIT
Platform: UNKNOWN
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: POSIX :: Linux
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Operating System :: Microsoft :: Windows
Classifier: Programming Language :: Cython
Classifier: Programming Language :: Python :: 2.6
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3.3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Topic :: Scientific/Engineering

cymem: A Cython Memory Helper
*****************************

cymem provides two small memory-management helpers for Cython. They make it
easy to tie memory to a Python object's life-cycle, so that the memory is freed
when the object is garbage collected.

.. image:: https://img.shields.io/travis/explosion/cymem/master.svg?style=flat-square&logo=travis
    :target: https://travis-ci.org/explosion/cymem

.. image:: https://img.shields.io/appveyor/ci/explosion/cymem/master.svg?style=flat-square&logo=appveyor
    :target: https://ci.appveyor.com/project/explosion/cymem
    :alt: Appveyor Build Status

.. image:: https://img.shields.io/pypi/v/cymem.svg?style=flat-square
    :target: https://pypi.python.org/pypi/cymem
    :alt: pypi Version

.. image:: https://img.shields.io/conda/vn/conda-forge/cymem.svg?style=flat-square
    :target: https://anaconda.org/conda-forge/cymem
    :alt: conda Version

.. image:: https://img.shields.io/badge/wheels-%E2%9C%93-4c1.svg?longCache=true&style=flat-square&logo=python&logoColor=white
    :target: https://github.com/explosion/wheelwright/releases
    :alt: Python wheels

Overview
========

The most useful is ``cymem.Pool``, which acts as a thin wrapper around the calloc
function:

.. code:: python

    from cymem.cymem cimport Pool

    cdef Pool mem = Pool()
    data1 = <int*>mem.alloc(10, sizeof(int))
    data2 = <float*>mem.alloc(12, sizeof(float))

The ``Pool`` object saves the memory addresses internally, and frees them when the
object is garbage collected. Typically you'll attach the ``Pool`` to some cdef'd
class. This is particularly handy for deeply nested structs, which have
complicated initialization functions. Just pass the ``Pool`` object into the
initializer, and you don't have to worry about freeing your struct at all:
all of the memory returned by ``Pool.alloc`` is freed automatically when the
``Pool`` expires.

Installation
============

Installation is via `pip <https://pypi.python.org/pypi/pip>`_, and requires `Cython <http://cython.org/>`_.

.. code:: bash

    pip install cymem

Example Use Case: An array of structs
=====================================

Let's say we want a sequence of sparse matrices. We need fast access, and
a Python list isn't performing well enough. So, we want a C-array or C++
vector, which means we need the sparse matrix to be a C-level struct; it
can't be a Python class. We can write this easily enough in Cython:

.. code:: cython

    """Example without Cymem

    To use an array of structs, we must carefully walk the data structure when
    we deallocate it.
    """

    from libc.stdlib cimport calloc, free

    cdef struct SparseRow:
        size_t length
        size_t* indices
        double* values

    cdef struct SparseMatrix:
        size_t length
        SparseRow* rows

    cdef class MatrixArray:
        cdef size_t length
        cdef SparseMatrix** matrices

        def __cinit__(self, list py_matrices):
            self.length = 0
            self.matrices = NULL

        def __init__(self, list py_matrices):
            self.length = len(py_matrices)
            self.matrices = <SparseMatrix**>calloc(len(py_matrices), sizeof(SparseMatrix*))

            for i, py_matrix in enumerate(py_matrices):
                self.matrices[i] = sparse_matrix_init(py_matrix)

        def __dealloc__(self):
            # Free each matrix first, then the array of pointers itself.
            for i in range(self.length):
                sparse_matrix_free(self.matrices[i])
            free(self.matrices)


    cdef SparseMatrix* sparse_matrix_init(list py_matrix) except NULL:
        sm = <SparseMatrix*>calloc(1, sizeof(SparseMatrix))
        sm.length = len(py_matrix)
        sm.rows = <SparseRow*>calloc(sm.length, sizeof(SparseRow))
        cdef size_t i, j
        cdef dict py_row
        cdef size_t idx
        cdef double value
        for i, py_row in enumerate(py_matrix):
            # Each row is a dict mapping column index to value.
            sm.rows[i].length = len(py_row)
            sm.rows[i].indices = <size_t*>calloc(sm.rows[i].length, sizeof(size_t))
            sm.rows[i].values = <double*>calloc(sm.rows[i].length, sizeof(double))
            for j, (idx, value) in enumerate(py_row.items()):
                sm.rows[i].indices[j] = idx
                sm.rows[i].values[j] = value
        return sm


    # Declared as returning void: the function has no return value.
    cdef void sparse_matrix_free(SparseMatrix* sm) except *:
        cdef size_t i
        # Free the innermost arrays before the structs that point to them.
        for i in range(sm.length):
            free(sm.rows[i].indices)
            free(sm.rows[i].values)
        free(sm.rows)
        free(sm)

We wrap the data structure in a Python ref-counted class at as low a level as
we can, given our performance constraints. This allows us to allocate the
memory in ``__init__`` and free it in the ``__dealloc__`` Cython special method.

However, it's very easy to make mistakes when writing the ``__dealloc__`` and
``sparse_matrix_free`` functions, leading to memory leaks. cymem lets you avoid
writing these deallocators at all. Instead, you write as follows:

.. code:: cython

    """Example with Cymem.

    Memory allocation is hidden behind the Pool class, which remembers the
    addresses it gives out. When the Pool object is garbage collected, all of
    its addresses are freed.

    We don't need to write MatrixArray.__dealloc__ or sparse_matrix_free,
    eliminating a common class of bugs.
    """

    from cymem.cymem cimport Pool

    cdef struct SparseRow:
        size_t length
        size_t* indices
        double* values

    cdef struct SparseMatrix:
        size_t length
        SparseRow* rows

    cdef class MatrixArray:
        cdef size_t length
        cdef SparseMatrix** matrices
        cdef Pool mem

        def __cinit__(self, list py_matrices):
            self.mem = None
            self.length = 0
            self.matrices = NULL

        def __init__(self, list py_matrices):
            self.mem = Pool()
            self.length = len(py_matrices)
            self.matrices = <SparseMatrix**>self.mem.alloc(self.length, sizeof(SparseMatrix*))
            for i, py_matrix in enumerate(py_matrices):
                # Every nested allocation goes through the same Pool.
                self.matrices[i] = sparse_matrix_init_cymem(self.mem, py_matrix)


    cdef SparseMatrix* sparse_matrix_init_cymem(Pool mem, list py_matrix) except NULL:
        sm = <SparseMatrix*>mem.alloc(1, sizeof(SparseMatrix))
        sm.length = len(py_matrix)
        sm.rows = <SparseRow*>mem.alloc(sm.length, sizeof(SparseRow))
        cdef size_t i, j
        cdef dict py_row
        cdef size_t idx
        cdef double value
        for i, py_row in enumerate(py_matrix):
            sm.rows[i].length = len(py_row)
            sm.rows[i].indices = <size_t*>mem.alloc(sm.rows[i].length, sizeof(size_t))
            sm.rows[i].values = <double*>mem.alloc(sm.rows[i].length, sizeof(double))
            for j, (idx, value) in enumerate(py_row.items()):
                sm.rows[i].indices[j] = idx
                sm.rows[i].values[j] = value
        return sm

All that the ``Pool`` class does is remember the addresses it gives out. When the
``MatrixArray`` object is garbage collected, the ``Pool`` object will also be garbage
collected, which triggers a call to ``Pool.__dealloc__``. The ``Pool`` then frees all of
its addresses. This saves you from walking back over your nested data structures
to free them, eliminating a common class of errors.

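The ownership chain described above can be modelled in plain Python. This is
only a toy sketch, not cymem's implementation: ``ModelPool``,
``ModelMatrixArray``, the ``freed_log`` argument, and the use of ``bytearray``
in place of C memory are all assumptions made for illustration. It shows the
owner holding the pool, the pool holding every buffer, and everything being
released together once the owner goes away.

```python
class ModelPool:
    """Toy stand-in for a pool: remembers every buffer it hands out."""

    def __init__(self, freed_log):
        self.addresses = {}         # id(buffer) -> buffer, mimicking an address table
        self.freed_log = freed_log  # hypothetical hook so the release can be observed

    def alloc(self, number, elem_size):
        # Like calloc: a zero-filled block of number * elem_size bytes.
        buf = bytearray(number * elem_size)
        self.addresses[id(buf)] = buf
        return buf

    def __del__(self):
        # Plays the role the text gives to Pool.__dealloc__: release everything.
        self.freed_log.extend(self.addresses)
        self.addresses.clear()


class ModelMatrixArray:
    """Owner object: every nested buffer comes from the one pool it holds."""

    def __init__(self, py_matrices, freed_log):
        self.mem = ModelPool(freed_log)
        # One 8-byte slot per entry, standing in for a row's values array.
        self.rows = [
            [self.mem.alloc(len(py_row), 8) for py_row in py_matrix]
            for py_matrix in py_matrices
        ]
```

Note that ``ModelMatrixArray`` defines no cleanup method of its own. In CPython
the pool's finalizer runs as soon as the last reference to the owner
disappears; other interpreters may delay it.
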
Custom Allocators
=================

Sometimes external C libraries use private functions to allocate and free objects,
but we'd still like the laziness of the ``Pool``.

.. code:: python

    from cymem.cymem cimport Pool, WrapMalloc, WrapFree

    # priv_malloc and priv_free stand for the external library's own
    # allocation and free functions.
    cdef Pool mem = Pool(WrapMalloc(priv_malloc), WrapFree(priv_free))

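The idea of pairing a private allocator with its matching free function can
also be modelled in plain Python. Again this is a sketch under stated
assumptions rather than cymem's code: ``HookedPool`` is hypothetical, and
``priv_malloc`` and ``priv_free`` are written here as ordinary Python
callables, whereas the wrappers above operate on C functions.

```python
class HookedPool:
    """Toy pool that delegates to a caller-supplied allocate/free pair."""

    def __init__(self, malloc, free):
        self._malloc = malloc
        self._free = free
        self._blocks = []  # every block handed out so far

    def alloc(self, number, elem_size):
        block = self._malloc(number * elem_size)
        self._blocks.append(block)
        return block

    def __del__(self):
        # Each block goes back through the free function that matches
        # the allocator it came from.
        for block in self._blocks:
            self._free(block)


# Hypothetical "private" allocator belonging to some external library.
outstanding = set()  # ids of blocks the library considers live


def priv_malloc(nbytes):
    block = bytearray(nbytes)
    outstanding.add(id(block))
    return block


def priv_free(block):
    outstanding.discard(id(block))
```

The pool never needs to know how the library allocates; it only guarantees
that whatever ``malloc`` produced is later handed to the paired ``free``.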