Metadata-Version: 2.1
Name: thinc
Version: 6.12.1
Summary: Practical Machine Learning for NLP
Home-page: https://github.com/explosion/thinc
Author: Matthew Honnibal
Author-email: matt@explosion.ai
License: MIT
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: POSIX :: Linux
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Operating System :: Microsoft :: Windows
Classifier: Programming Language :: Cython
Classifier: Programming Language :: Python :: 2.6
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3.3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Topic :: Scientific/Engineering
Requires-Dist: numpy (>=1.7.0)
Requires-Dist: msgpack (<0.6.0,>=0.5.6)
Requires-Dist: msgpack-numpy (<0.4.4)
Requires-Dist: murmurhash (<1.1.0,>=0.28.0)
Requires-Dist: cymem (<3.0.0,>=2.0.2)
Requires-Dist: preshed (<3.0.0,>=2.0.1)
Requires-Dist: cytoolz (<0.10,>=0.9.0)
Requires-Dist: wrapt (<1.11.0,>=1.10.0)
Requires-Dist: plac (<1.0.0,>=0.9.6)
Requires-Dist: tqdm (<5.0.0,>=4.10.0)
Requires-Dist: six (<2.0.0,>=1.10.0)
Requires-Dist: dill (<0.3.0,>=0.2.7)
Requires-Dist: pathlib (==1.0.1) ; python_version < "3.4"
Provides-Extra: cuda
Requires-Dist: thinc-gpu-ops (<0.1.0,>=0.0.3) ; extra == 'cuda'
Requires-Dist: cupy (>=5.0.0b4) ; extra == 'cuda'
Provides-Extra: cuda100
Requires-Dist: thinc-gpu-ops (<0.1.0,>=0.0.3) ; extra == 'cuda100'
Requires-Dist: cupy-cuda100 (>=5.0.0b4) ; extra == 'cuda100'
Provides-Extra: cuda80
Requires-Dist: thinc-gpu-ops (<0.1.0,>=0.0.3) ; extra == 'cuda80'
Requires-Dist: cupy-cuda80 (>=5.0.0b4) ; extra == 'cuda80'
Provides-Extra: cuda90
Requires-Dist: thinc-gpu-ops (<0.1.0,>=0.0.3) ; extra == 'cuda90'
Requires-Dist: cupy-cuda90 (>=5.0.0b4) ; extra == 'cuda90'
Provides-Extra: cuda91
Requires-Dist: thinc-gpu-ops (<0.1.0,>=0.0.3) ; extra == 'cuda91'
Requires-Dist: cupy-cuda91 (>=5.0.0b4) ; extra == 'cuda91'
Provides-Extra: cuda92
Requires-Dist: thinc-gpu-ops (<0.1.0,>=0.0.3) ; extra == 'cuda92'
Requires-Dist: cupy-cuda92 (>=5.0.0b4) ; extra == 'cuda92'

Thinc: Practical Machine Learning for NLP in Python
*****************************************************

**Thinc** is the machine learning library powering `spaCy <https://spacy.io>`_.
It features a battle-tested linear model designed for large sparse learning
problems, and a flexible neural network model under development for
`spaCy v2.0 <https://alpha.spacy.io/usage/v2>`_.

Thinc is a practical toolkit for implementing models that follow the
`"Embed, encode, attend, predict" <https://explosion.ai/blog/deep-learning-formula-nlp>`_
architecture. It's designed to be easy to install, efficient for CPU usage and
optimised for NLP and deep learning with text – in particular, hierarchically
structured input and variable-length sequences.

🔮 **Version 6.10 out now!** `Read the release notes here. <https://github.com/explosion/thinc/releases/>`_

.. image:: https://img.shields.io/travis/explosion/thinc/master.svg?style=flat-square
    :target: https://travis-ci.org/explosion/thinc
    :alt: Build Status

.. image:: https://img.shields.io/appveyor/ci/explosion/thinc/master.svg?style=flat-square
    :target: https://ci.appveyor.com/project/explosion/thinc
    :alt: Appveyor Build Status

.. image:: https://img.shields.io/coveralls/explosion/thinc.svg?style=flat-square
    :target: https://coveralls.io/github/explosion/thinc
    :alt: Test Coverage

.. image:: https://img.shields.io/github/release/explosion/thinc.svg?style=flat-square
    :target: https://github.com/explosion/thinc/releases
    :alt: Current Release Version

.. image:: https://img.shields.io/pypi/v/thinc.svg?style=flat-square
    :target: https://pypi.python.org/pypi/thinc
    :alt: pypi Version

.. image:: https://anaconda.org/conda-forge/thinc/badges/version.svg
    :target: https://anaconda.org/conda-forge/thinc
    :alt: conda Version

.. image:: https://img.shields.io/badge/gitter-join%20chat%20%E2%86%92-7676d1.svg?style=flat-square
    :target: https://gitter.im/explosion/thinc
    :alt: Thinc on Gitter

.. image:: https://img.shields.io/twitter/follow/explosion_ai.svg?style=social&label=Follow
    :target: https://twitter.com/explosion_ai
    :alt: Follow us on Twitter

What's where (as of v6.9.0)
===========================

======================== ===
``thinc.v2v.Model``      Base class.
``thinc.v2v``            Layers transforming vectors to vectors.
``thinc.i2v``            Layers embedding IDs to vectors.
``thinc.t2v``            Layers pooling tensors to vectors.
``thinc.t2t``            Layers transforming tensors to tensors (e.g. CNN, LSTM).
``thinc.api``            Higher-order functions, for building networks. Will be renamed.
``thinc.extra``          Datasets and utilities.
``thinc.neural.ops``     Container classes for mathematical operations. Will be reorganized.
``thinc.linear.avgtron`` Legacy efficient Averaged Perceptron implementation.
======================== ===

Development status
==================

Thinc's deep learning functionality is still under active development: APIs are
unstable, and we're not yet ready to provide usage support. However, if you're
already quite familiar with neural networks, there's a lot here you might find
interesting. Thinc's conceptual model is quite different from TensorFlow's.
Thinc also implements some novel features, such as a small DSL for concisely
wiring up models, embedding tables that support pre-computation and the
hashing trick, dynamic batch sizes, a concatenation-based approach to
variable-length sequences, and support for model averaging for the
Adam solver (which performs very well).

No computational graph – just higher order functions
======================================================

The central problem for a neural network implementation is this: during the
forward pass, you compute results that will later be useful during the backward
pass. How do you keep track of this arbitrary state, while making sure that
layers can be cleanly composed?

Most libraries solve this problem by having you declare the forward
computations, which are then compiled into a graph somewhere behind the scenes.
Thinc doesn't have a "computational graph". Instead, we just use the stack,
because we put the state from the forward pass into callbacks.

All nodes in the network have a simple signature:

.. code:: none

    f(inputs) -> {outputs, f(d_outputs)->d_inputs}

To make this less abstract, here's a ReLu activation, following this signature:

.. code:: python

    def relu(inputs):
        mask = inputs > 0
        def backprop_relu(d_outputs, optimizer):
            return d_outputs * mask
        return inputs * mask, backprop_relu

When you call the ``relu`` function, you get back an output variable, and a
callback. This lets you calculate a gradient using the output, and then pass it
into the callback to perform the backward pass.
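
To make the round trip concrete, here's a minimal sketch of calling ``relu`` and
its callback directly. The input array and the dummy gradient below are invented
for illustration, and the ``optimizer`` argument is unused here because ``relu``
has no weights to update:

.. code:: python

    import numpy

    inputs = numpy.array([[-1.0, 2.0], [3.0, -4.0]])
    outputs, backprop_relu = relu(inputs)        # forward pass
    d_outputs = numpy.ones_like(outputs)         # pretend gradient from the next layer
    d_inputs = backprop_relu(d_outputs, None)    # backward pass; gradient is zeroed where inputs <= 0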

This signature makes it easy to build a complex network out of smaller pieces,
using arbitrary higher-order functions you can write yourself. To make this
clearer, we need a function for a weights layer. Usually this will be
implemented as a class — but let's continue using closures, to keep things
concise, and to keep the simplicity of the interface explicit:

.. code:: python

    import numpy

    def create_linear_layer(n_out, n_in):
        W = numpy.zeros((n_out, n_in))
        b = numpy.zeros((n_out, 1))

        def forward(X):
            Y = W @ X + b
            def backward(dY, optimizer):
                dX = W.T @ dY
                dW = numpy.einsum('ik,jk->ij', dY, X)
                # Sum over the batch axis, so db has the same (n_out, 1) shape as b.
                db = dY.sum(axis=1, keepdims=True)
                optimizer(W, dW)
                optimizer(b, db)
                return dX
            return Y, backward
        return forward

If we call ``Wb = create_linear_layer(5, 4)``, the variable ``Wb`` will be the
``forward()`` function, implemented inside the body of ``create_linear_layer()``.
The ``Wb`` instance will have access to the ``W`` and ``b`` variables defined in its
outer scope. If we invoke ``create_linear_layer()`` again, we get a new instance,
with its own internal state.

The ``Wb`` instance and the ``relu`` function have exactly the same signature. This
makes it easy to write higher order functions to compose them. The most obvious
thing to do is chain them together:

.. code:: python

    def chain(*layers):
        def forward(X):
            backprops = []
            Y = X
            for layer in layers:
                Y, backprop = layer(Y)
                backprops.append(backprop)
            def backward(dY, optimizer):
                for backprop in reversed(backprops):
                    dY = backprop(dY, optimizer)
                return dY
            return Y, backward
        return forward

We could now chain our linear layer together with the ``relu`` activation, to
create a simple feed-forward network:

.. code:: python

    Wb1 = create_linear_layer(10, 5)
    Wb2 = create_linear_layer(3, 10)

    model = chain(Wb1, relu, Wb2)

    X = numpy.random.uniform(size=(5, 4))

    y, bp_y = model(X)

    # `truth` and `optimizer` stand in for your target values and update rule.
    dY = y - truth
    dX = bp_y(dY, optimizer)
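
To make the ``optimizer`` argument concrete, here's a hedged sketch of running a
few updates with the closures above using a plain in-place SGD rule. The targets,
the learning rate and the number of steps are invented for illustration (and since
the toy layers above start from zero weights, the loop mainly demonstrates the
mechanics rather than learning anything useful):

.. code:: python

    def sgd(weights, gradient, lr=0.001):
        # Update the parameters in place, so the closures see the new values.
        weights -= lr * gradient

    truth = numpy.random.uniform(size=(3, 4))   # made-up targets matching the (3, batch) output

    for step in range(10):
        y, bp_y = model(X)      # forward pass through the whole chain
        dY = y - truth          # gradient of a squared-error-style loss
        bp_y(dY, sgd)           # backward pass; each layer hands its gradients to sgd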

This conceptual model makes Thinc very flexible. The trade-off is that Thinc is
less convenient and efficient for workloads that fit exactly into what
`TensorFlow <https://www.tensorflow.org/>`_ etc. are designed for. If your graph
really is static, and your inputs are homogeneous in size and shape,
`Keras <https://keras.io/>`_ will likely be faster and simpler. But if you want
to pass normal Python objects through your network, or handle sequences and recursions
of arbitrary length or complexity, you might find Thinc's design a better fit for
your problem.

Quickstart
==========

Thinc should install cleanly with both `pip <http://pypi.python.org/pypi/thinc>`_ and
`conda <https://anaconda.org/conda-forge/thinc>`_, for **Python 2.7+ and 3.5+**, on
**Linux**, **macOS / OSX** and **Windows**. Its only system dependencies are a compiler
tool-chain (e.g. ``build-essential``) and the Python development headers (e.g.
``python-dev``).

.. code:: bash

    pip install thinc

For GPU support, we're grateful to use the work of Chainer's ``cupy`` module, which
provides a numpy-compatible interface for GPU arrays. However, installing Chainer
when no GPU is available currently causes an error. We therefore don't list Chainer
as an explicit dependency, so building Thinc for GPU requires some extra steps:

.. code:: bash

    export CUDA_HOME=/usr/local/cuda-8.0 # Or wherever your CUDA is
    export PATH=$PATH:$CUDA_HOME/bin
    pip install chainer
    python -c "import cupy; assert cupy" # Check it installed
    pip install thinc
    python -c "import thinc.neural.gpu_ops" # Check the GPU ops were built

The rest of this section describes how to build Thinc from source. If you have
`Fabric <http://www.fabfile.org>`_ installed, you can use the shortcut:

.. code:: bash

    git clone https://github.com/explosion/thinc
    cd thinc
    fab clean env make test

You can then run the examples as follows:

.. code:: bash

    fab eg.mnist
    fab eg.basic_tagger
    fab eg.cnn_tagger

Otherwise, you can build and test explicitly with:

.. code:: bash

    git clone https://github.com/explosion/thinc
    cd thinc

    virtualenv .env
    source .env/bin/activate

    pip install -r requirements.txt
    python setup.py build_ext --inplace
    py.test thinc/

And then run the examples as follows:

.. code:: bash

    python examples/mnist.py
    python examples/basic_tagger.py
    python examples/cnn_tagger.py

Usage
=====

The Neural Network API is still subject to change, even within minor versions.
You can get a feel for the current API by checking out the examples. Here are
a few quick highlights.

1. Shape inference
------------------

Models can be created with some dimensions unspecified. Missing dimensions are
inferred when pre-trained weights are loaded or when training begins. This
eliminates a common source of programmer error:

.. code:: python

    # Invalid network — shape mismatch
    model = chain(ReLu(512, 748), ReLu(512, 784), Softmax(10))

    # Leave the dimensions unspecified, and you can't be wrong.
    model = chain(ReLu(512), ReLu(512), Softmax())

2. Operator overloading
-----------------------

The ``Model.define_operators()`` classmethod allows you to bind arbitrary
binary functions to Python operators, for use in any ``Model`` instance. The
method can (and should) be used as a context manager, so that the overloading
is limited to the immediate block. This allows concise and expressive model
definition:

.. code:: python

    with Model.define_operators({'>>': chain}):
        model = ReLu(512) >> ReLu(512) >> Softmax()

The overloading is cleaned up at the end of the block. A fairly arbitrary zoo
of functions is currently implemented. Some of the most useful:

* ``chain(model1, model2)``: Compose two models ``f(x)`` and ``g(x)`` into a single model computing ``g(f(x))``.
* ``clone(model1, n)``: Create ``n`` copies of a model, each with distinct weights, and chain them together.
* ``concatenate(model1, model2)``: Given two models with output dimensions ``(n,)`` and ``(m,)``, construct a model with output dimensions ``(m+n,)`` (sketched below).
* ``add(model1, model2)``: ``add(f(x), g(x)) = f(x)+g(x)``
* ``make_tuple(model1, model2)``: Construct tuples of the outputs of two models, at the batch level. The backward pass expects to receive a tuple of gradients, which are routed through the appropriate model, and summed.
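
To show how such combinators look in the closure style used earlier, here's a
hedged sketch of ``concatenate`` for the toy numpy layers above (this is not
Thinc's implementation, just the same idea expressed with plain functions;
outputs are assumed to be laid out as ``(features, batch)`` arrays, as in the
examples above):

.. code:: python

    import numpy

    def concatenate(layer1, layer2):
        def forward(X):
            Y1, bp_Y1 = layer1(X)
            Y2, bp_Y2 = layer2(X)
            split = Y1.shape[0]
            def backward(dY, optimizer):
                # Route each layer's slice of the gradient back through it,
                # then sum the two input gradients.
                dX1 = bp_Y1(dY[:split], optimizer)
                dX2 = bp_Y2(dY[split:], optimizer)
                return dX1 + dX2
            return numpy.concatenate((Y1, Y2), axis=0), backward
        return forward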

Putting these things together, here's the sort of tagging model that Thinc is
designed to make easy.

.. code:: python

    with Model.define_operators({'>>': chain, '**': clone, '|': concatenate}):
        model = (
            add_eol_markers('EOL')
            >> flatten
            >> memoize(
                CharLSTM(char_width)
                | (normalize >> str2int >> Embed(word_width)))
            >> ExtractWindow(nW=2)
            >> BatchNorm(ReLu(hidden_width)) ** 3
            >> Softmax()
        )

Not all of these pieces are implemented yet, but hopefully this shows where
we're going. The ``memoize`` function will be particularly important: in any
batch of text, the common words will be very common. It's therefore important
to evaluate models such as the ``CharLSTM`` once per word type per minibatch,
rather than once per token.

3. Callback-based backpropagation
---------------------------------

Most neural network libraries use a computational graph abstraction. This takes
the execution away from you, so that gradients can be computed automatically.
Thinc follows a style more like the ``autograd`` library, but with larger
operations. Usage is as follows:

.. code:: python

    def explicit_sgd_update(X, y):
        sgd = lambda weights, gradient: weights - gradient * 0.001
        yh, finish_update = model.begin_update(X, drop=0.2)
        finish_update(y-yh, sgd)

Separating the backpropagation into three parts like this has many advantages.
The interface to all models is completely uniform — there is no distinction
between the top-level model you use as a predictor and the internal models for
the layers. We also make concurrency simple, by making the ``begin_update()``
step a pure function, and separating the accumulation of the gradient from the
action of the optimizer.
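
A training loop in this style just repeats the pattern from
``explicit_sgd_update`` over minibatches. The sketch below assumes ``model``,
``train_X`` and ``train_y`` already exist and reuses the same toy update rule
as above; it's meant to show the shape of the loop, not Thinc's actual
optimizer classes:

.. code:: python

    def train_epoch(model, train_X, train_y, batch_size=128):
        sgd = lambda weights, gradient: weights - gradient * 0.001
        for i in range(0, len(train_X), batch_size):
            X = train_X[i : i + batch_size]
            y = train_y[i : i + batch_size]
            yh, finish_update = model.begin_update(X, drop=0.2)  # pure forward pass
            finish_update(y - yh, sgd)  # backward pass; gradients are handed to the callback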

4. Class annotations
--------------------

To keep the class hierarchy shallow, Thinc uses class decorators to reuse code
for layer definitions. Specifically, the following decorators are available:

* ``describe.attributes()``: Allows attributes to be specified by keyword argument. Used especially for dimensions and parameters.
* ``describe.on_init()``: Allows callbacks to be specified, which will be called at the end of the ``__init__`` method.
* ``describe.on_data()``: Allows callbacks to be specified, which will be called on ``Model.begin_training()``.

🛠 Changelog
============

=========== ============== ===========
Version     Date           Description
=========== ============== ===========
`v6.10.1`_  ``2017-11-15`` Fix GPU install and minor memory leak
`v6.10.0`_  ``2017-10-28`` CPU efficiency improvements, refactoring
`v6.9.0`_   ``2017-10-03`` Reorganize layers, bug fix to Layer Normalization
`v6.8.2`_   ``2017-09-26`` Fix packaging of ``gpu_ops``
`v6.8.1`_   ``2017-08-23`` Fix Windows support
`v6.8.0`_   ``2017-07-25`` SELU layer, attention, improved GPU/CPU compatibility
`v6.7.3`_   ``2017-06-05`` Fix convolution on GPU
`v6.7.2`_   ``2017-06-02`` Bug fixes to serialization
`v6.7.1`_   ``2017-06-02`` Improve serialization
`v6.7.0`_   ``2017-06-01`` Fixes to serialization, hash embeddings and flatten ops
`v6.6.0`_   ``2017-05-14`` Improved GPU usage and examples
v6.5.2      ``2017-03-20`` *n/a*
`v6.5.1`_   ``2017-03-20`` Improved linear class and Windows fix
`v6.5.0`_   ``2017-03-11`` Supervised similarity, fancier embedding and improvements to linear model
v6.4.0      ``2017-02-15`` *n/a*
`v6.3.0`_   ``2017-01-25`` Efficiency improvements, argument checking and error messaging
`v6.2.0`_   ``2017-01-15`` Improve API and introduce overloaded operators
`v6.1.3`_   ``2017-01-10`` More neural network functions and training continuation
v6.1.3      ``2017-01-09`` *n/a*
v6.1.2      ``2017-01-09`` *n/a*
v6.1.1      ``2017-01-09`` *n/a*
v6.1.0      ``2017-01-09`` *n/a*
`v6.0.0`_   ``2016-12-31`` Add ``thinc.neural`` for NLP-oriented deep learning
=========== ============== ===========

.. _v6.10.1: https://github.com/explosion/thinc/releases/tag/v6.10.1
.. _v6.10.0: https://github.com/explosion/thinc/releases/tag/v6.10.0
.. _v6.9.0: https://github.com/explosion/thinc/releases/tag/v6.9.0
.. _v6.8.2: https://github.com/explosion/thinc/releases/tag/v6.8.2
.. _v6.8.1: https://github.com/explosion/thinc/releases/tag/v6.8.1
.. _v6.8.0: https://github.com/explosion/thinc/releases/tag/v6.8.0
.. _v6.7.3: https://github.com/explosion/thinc/releases/tag/v6.7.3
.. _v6.7.2: https://github.com/explosion/thinc/releases/tag/v6.7.2
.. _v6.7.1: https://github.com/explosion/thinc/releases/tag/v6.7.1
.. _v6.7.0: https://github.com/explosion/thinc/releases/tag/v6.7.0
.. _v6.6.0: https://github.com/explosion/thinc/releases/tag/v6.6.0
.. _v6.5.1: https://github.com/explosion/thinc/releases/tag/v6.5.1
.. _v6.5.0: https://github.com/explosion/thinc/releases/tag/v6.5.0
.. _v6.3.0: https://github.com/explosion/thinc/releases/tag/v6.3.0
.. _v6.2.0: https://github.com/explosion/thinc/releases/tag/v6.2.0
.. _v6.1.3: https://github.com/explosion/thinc/releases/tag/v6.1.3
.. _v6.0.0: https://github.com/explosion/thinc/releases/tag/v6.0.0