467 lines
19 KiB
Text
467 lines
19 KiB
Text
Metadata-Version: 2.1
|
||
Name: thinc
|
||
Version: 6.12.1
|
||
Summary: Practical Machine Learning for NLP
|
||
Home-page: https://github.com/explosion/thinc
|
||
Author: Matthew Honnibal
|
||
Author-email: matt@explosion.ai
|
||
License: MIT
|
||
Platform: UNKNOWN
|
||
Classifier: Development Status :: 5 - Production/Stable
|
||
Classifier: Environment :: Console
|
||
Classifier: Intended Audience :: Developers
|
||
Classifier: Intended Audience :: Science/Research
|
||
Classifier: License :: OSI Approved :: MIT License
|
||
Classifier: Operating System :: POSIX :: Linux
|
||
Classifier: Operating System :: MacOS :: MacOS X
|
||
Classifier: Operating System :: Microsoft :: Windows
|
||
Classifier: Programming Language :: Cython
|
||
Classifier: Programming Language :: Python :: 2.6
|
||
Classifier: Programming Language :: Python :: 2.7
|
||
Classifier: Programming Language :: Python :: 3.3
|
||
Classifier: Programming Language :: Python :: 3.4
|
||
Classifier: Programming Language :: Python :: 3.5
|
||
Classifier: Programming Language :: Python :: 3.6
|
||
Classifier: Programming Language :: Python :: 3.7
|
||
Classifier: Topic :: Scientific/Engineering
|
||
Requires-Dist: numpy (>=1.7.0)
|
||
Requires-Dist: msgpack (<0.6.0,>=0.5.6)
|
||
Requires-Dist: msgpack-numpy (<0.4.4)
|
||
Requires-Dist: murmurhash (<1.1.0,>=0.28.0)
|
||
Requires-Dist: cymem (<3.0.0,>=2.0.2)
|
||
Requires-Dist: preshed (<3.0.0,>=2.0.1)
|
||
Requires-Dist: cytoolz (<0.10,>=0.9.0)
|
||
Requires-Dist: wrapt (<1.11.0,>=1.10.0)
|
||
Requires-Dist: plac (<1.0.0,>=0.9.6)
|
||
Requires-Dist: tqdm (<5.0.0,>=4.10.0)
|
||
Requires-Dist: six (<2.0.0,>=1.10.0)
|
||
Requires-Dist: dill (<0.3.0,>=0.2.7)
|
||
Requires-Dist: pathlib (==1.0.1) ; python_version < "3.4"
|
||
Provides-Extra: cuda
|
||
Requires-Dist: thinc-gpu-ops (<0.1.0,>=0.0.3) ; extra == 'cuda'
|
||
Requires-Dist: cupy (>=5.0.0b4) ; extra == 'cuda'
|
||
Provides-Extra: cuda100
|
||
Requires-Dist: thinc-gpu-ops (<0.1.0,>=0.0.3) ; extra == 'cuda100'
|
||
Requires-Dist: cupy-cuda100 (>=5.0.0b4) ; extra == 'cuda100'
|
||
Provides-Extra: cuda80
|
||
Requires-Dist: thinc-gpu-ops (<0.1.0,>=0.0.3) ; extra == 'cuda80'
|
||
Requires-Dist: cupy-cuda80 (>=5.0.0b4) ; extra == 'cuda80'
|
||
Provides-Extra: cuda90
|
||
Requires-Dist: thinc-gpu-ops (<0.1.0,>=0.0.3) ; extra == 'cuda90'
|
||
Requires-Dist: cupy-cuda90 (>=5.0.0b4) ; extra == 'cuda90'
|
||
Provides-Extra: cuda91
|
||
Requires-Dist: thinc-gpu-ops (<0.1.0,>=0.0.3) ; extra == 'cuda91'
|
||
Requires-Dist: cupy-cuda91 (>=5.0.0b4) ; extra == 'cuda91'
|
||
Provides-Extra: cuda92
|
||
Requires-Dist: thinc-gpu-ops (<0.1.0,>=0.0.3) ; extra == 'cuda92'
|
||
Requires-Dist: cupy-cuda92 (>=5.0.0b4) ; extra == 'cuda92'
|
||
|
||
Thinc: Practical Machine Learning for NLP in Python
|
||
***************************************************
|
||
|
||
**Thinc** is the machine learning library powering `spaCy <https://spacy.io>`_.
|
||
It features a battle-tested linear model designed for large sparse learning
|
||
problems, and a flexible neural network model under development for
|
||
`spaCy v2.0 <https://alpha.spacy.io/usage/v2>`_.
|
||
|
||
Thinc is a practical toolkit for implementing models that follow the
|
||
`"Embed, encode, attend, predict" <https://explosion.ai/blog/deep-learning-formula-nlp>`_
|
||
architecture. It's designed to be easy to install, efficient for CPU usage and
|
||
optimised for NLP and deep learning with text – in particular, hierarchically
|
||
structured input and variable-length sequences.
|
||
|
||
🔮 **Version 6.10 out now!** `Read the release notes here. <https://github.com/explosion/thinc/releases/>`_
|
||
|
||
.. image:: https://img.shields.io/travis/explosion/thinc/master.svg?style=flat-square
|
||
:target: https://travis-ci.org/explosion/thinc
|
||
:alt: Build Status
|
||
|
||
.. image:: https://img.shields.io/appveyor/ci/explosion/thinc/master.svg?style=flat-square
|
||
:target: https://ci.appveyor.com/project/explosion/thinc
|
||
:alt: Appveyor Build Status
|
||
|
||
.. image:: https://img.shields.io/coveralls/explosion/thinc.svg?style=flat-square
|
||
:target: https://coveralls.io/github/explosion/thinc
|
||
:alt: Test Coverage
|
||
|
||
.. image:: https://img.shields.io/github/release/explosion/thinc.svg?style=flat-square
|
||
:target: https://github.com/explosion/thinc/releases
|
||
:alt: Current Release Version
|
||
|
||
.. image:: https://img.shields.io/pypi/v/thinc.svg?style=flat-square
|
||
:target: https://pypi.python.org/pypi/thinc
|
||
:alt: pypi Version
|
||
|
||
.. image:: https://anaconda.org/conda-forge/thinc/badges/version.svg
|
||
:target: https://anaconda.org/conda-forge/thinc
|
||
:alt: conda Version
|
||
|
||
.. image:: https://img.shields.io/badge/gitter-join%20chat%20%E2%86%92-7676d1.svg?style=flat-square
|
||
:target: https://gitter.im/explosion/thinc
|
||
:alt: Thinc on Gitter
|
||
|
||
.. image:: https://img.shields.io/twitter/follow/explosion_ai.svg?style=social&label=Follow
|
||
:target: https://twitter.com/explosion_ai
|
||
:alt: Follow us on Twitter
|
||
|
||
What's where (as of v6.9.0)
|
||
===========================
|
||
|
||
======================== ===
|
||
``thinc.v2v.Model`` Base class.
|
||
``thinc.v2v`` Layers transforming vectors to vectors.
|
||
``thinc.i2v`` Layers embedding IDs to vectors.
|
||
``thinc.t2v`` Layers pooling tensors to vectors.
|
||
``thinc.t2t`` Layers transforming tensors to tensors (e.g. CNN, LSTM).
|
||
``thinc.api`` Higher-order functions, for building networks. Will be renamed.
|
||
``thinc.extra`` Datasets and utilities.
|
||
``thinc.neural.ops`` Container classes for mathematical operations. Will be reorganized.
|
||
``thinc.linear.avgtron`` Legacy efficient Averaged Perceptron implementation.
|
||
======================== ===
|
||
|
||
Development status
|
||
==================
|
||
|
||
Thinc's deep learning functionality is still under active development: APIs are
|
||
unstable, and we're not yet ready to provide usage support. However, if you're
|
||
already quite familiar with neural networks, there's a lot here you might find
|
||
interesting. Thinc's conceptual model is quite different from TensorFlow's.
|
||
Thinc also implements some novel features, such as a small DSL for concisely
|
||
wiring up models, embedding tables that support pre-computation and the
|
||
hashing trick, dynamic batch sizes, a concatenation-based approach to
|
||
variable-length sequences, and support for model averaging for the
|
||
Adam solver (which performs very well).
|
||
|
||
No computational graph – just higher order functions
|
||
======================================================
|
||
|
||
The central problem for a neural network implementation is this: during the
|
||
forward pass, you compute results that will later be useful during the backward
|
||
pass. How do you keep track of this arbitrary state, while making sure that
|
||
layers can be cleanly composed?
|
||
|
||
Most libraries solve this problem by having you declare the forward
|
||
computations, which are then compiled into a graph somewhere behind the scenes.
|
||
Thinc doesn't have a "computational graph". Instead, we just use the stack,
|
||
because we put the state from the forward pass into callbacks.
|
||
|
||
All nodes in the network have a simple signature:
|
||
|
||
.. code:: none
|
||
|
||
f(inputs) -> {outputs, f(d_outputs)->d_inputs}
|
||
|
||
To make this less abstract, here's a ReLu activation, following this signature:
|
||
|
||
.. code:: python
|
||
|
||
def relu(inputs):
|
||
mask = inputs > 0
|
||
def backprop_relu(d_outputs, optimizer):
|
||
return d_outputs * mask
|
||
return inputs * mask, backprop_relu
|
||
|
||
When you call the ``relu`` function, you get back an output variable, and a
|
||
callback. This lets you calculate a gradient using the output, and then pass it
|
||
into the callback to perform the backward pass.
|
||
|
||
This signature makes it easy to build a complex network out of smaller pieces,
|
||
using arbitrary higher-order functions you can write yourself. To make this
|
||
clearer, we need a function for a weights layer. Usually this will be
|
||
implemented as a class — but let's continue using closures, to keep things
|
||
concise, and to keep the simplicity of the interface explicit:
|
||
|
||
.. code:: python
|
||
|
||
import numpy
|
||
|
||
def create_linear_layer(n_out, n_in):
|
||
W = numpy.zeros((n_out, n_in))
|
||
b = numpy.zeros((n_out, 1))
|
||
|
||
def forward(X):
|
||
Y = W @ X + b
|
||
def backward(dY, optimizer):
|
||
dX = W.T @ dY
|
||
dW = numpy.einsum('ik,jk->ij', dY, X)
|
||
db = dY.sum(axis=0)
|
||
|
||
optimizer(W, dW)
|
||
optimizer(b, db)
|
||
|
||
return dX
|
||
return Y, backward
|
||
return forward
|
||
|
||
If we call ``Wb = create_linear_layer(5, 4)``, the variable ``Wb`` will be the
|
||
``forward()`` function, implemented inside the body of ``create_linear_layer()``.
|
||
The `Wb` instance will have access to the ``W`` and ``b`` variable defined in its
|
||
outer scope. If we invoke ``create_linear_layer()`` again, we get a new instance,
|
||
with its own internal state.
|
||
|
||
The ``Wb`` instance and the ``relu`` function have exactly the same signature. This
|
||
makes it easy to write higher order functions to compose them. The most obvious
|
||
thing to do is chain them together:
|
||
|
||
.. code:: python
|
||
|
||
def chain(*layers):
|
||
def forward(X):
|
||
backprops = []
|
||
Y = X
|
||
for layer in layers:
|
||
Y, backprop = layer(Y)
|
||
backprops.append(backprop)
|
||
def backward(dY, optimizer):
|
||
for backprop in reversed(backprops):
|
||
dY = backprop(dY, optimizer)
|
||
return dY
|
||
return Y, backward
|
||
return forward
|
||
|
||
We could now chain our linear layer together with the ``relu`` activation, to
|
||
create a simple feed-forward network:
|
||
|
||
.. code:: python
|
||
|
||
Wb1 = create_linear_layer(10, 5)
|
||
Wb2 = create_linear_layer(3, 10)
|
||
|
||
model = chain(Wb1, relu, Wb2)
|
||
|
||
X = numpy.random.uniform(size=(5, 4))
|
||
|
||
y, bp_y = model(X)
|
||
|
||
dY = y - truth
|
||
dX = bp_y(dY, optimizer)
|
||
|
||
This conceptual model makes Thinc very flexible. The trade-off is that Thinc is
|
||
less convenient and efficient at workloads that fit exactly into what
|
||
`Tensorflow <https://www.tensorflow.org/>`_ etc. are designed for. If your graph
|
||
really is static, and your inputs are homogenous in size and shape,
|
||
`Keras <https://keras.io/>`_ will likely be faster and simpler. But if you want
|
||
to pass normal Python objects through your network, or handle sequences and recursions
|
||
of arbitrary length or complexity, you might find Thinc's design a better fit for
|
||
your problem.
|
||
|
||
Quickstart
|
||
==========
|
||
|
||
Thinc should install cleanly with both `pip <http://pypi.python.org/pypi/thinc>`_ and
|
||
`conda <https://anaconda.org/conda-forge/thinc>`_, for **Pythons 2.7+ and 3.5+**, on
|
||
**Linux**, **macOS / OSX** and **Windows**. Its only system dependency is a compiler
|
||
tool-chain (e.g. ``build-essential``) and the Python development headers (e.g.
|
||
``python-dev``).
|
||
|
||
.. code:: bash
|
||
|
||
pip install thinc
|
||
|
||
For GPU support, we're grateful to use the work of Chainer's cupy module, which provides a numpy-compatible interface for GPU arrays. However, installing Chainer when no GPU is available currently causes an error. We therefore do not list Chainer as an explicit dependency --- so building ``Thinc`` for GPU requires some extra steps:
|
||
|
||
.. code:: bash
|
||
|
||
export CUDA_HOME=/usr/local/cuda-8.0 # Or wherever your CUDA is
|
||
export PATH=$PATH:$CUDA_HOME/bin
|
||
pip install chainer
|
||
python -c "import cupy; assert cupy" # Check it installed
|
||
pip install thinc
|
||
python -c "import thinc.neural.gpu_ops" # Check the GPU ops were built
|
||
|
||
The rest of this section describes how to build Thinc from source. If you have
|
||
`Fabric <http://www.fabfile.org>`_ installed, you can use the shortcut:
|
||
|
||
.. code:: bash
|
||
|
||
git clone https://github.com/explosion/thinc
|
||
cd thinc
|
||
fab clean env make test
|
||
|
||
You can then run the examples as follows:
|
||
|
||
.. code:: bash
|
||
|
||
fab eg.mnist
|
||
fab eg.basic_tagger
|
||
fab eg.cnn_tagger
|
||
|
||
Otherwise, you can build and test explicitly with:
|
||
|
||
.. code:: bash
|
||
|
||
git clone https://github.com/explosion/thinc
|
||
cd thinc
|
||
|
||
virtualenv .env
|
||
source .env/bin/activate
|
||
|
||
pip install -r requirements.txt
|
||
python setup.py build_ext --inplace
|
||
py.test thinc/
|
||
|
||
And then run the examples as follows:
|
||
|
||
.. code:: bash
|
||
|
||
python examples/mnist.py
|
||
python examples/basic_tagger.py
|
||
python examples/cnn_tagger.py
|
||
|
||
|
||
Usage
|
||
=====
|
||
|
||
The Neural Network API is still subject to change, even within minor versions.
|
||
You can get a feel for the current API by checking out the examples. Here are
|
||
a few quick highlights.
|
||
|
||
1. Shape inference
|
||
------------------
|
||
|
||
Models can be created with some dimensions unspecified. Missing dimensions are
|
||
inferred when pre-trained weights are loaded or when training begins. This
|
||
eliminates a common source of programmer error:
|
||
|
||
.. code:: python
|
||
|
||
# Invalid network — shape mismatch
|
||
model = chain(ReLu(512, 748), ReLu(512, 784), Softmax(10))
|
||
|
||
# Leave the dimensions unspecified, and you can't be wrong.
|
||
model = chain(ReLu(512), ReLu(512), Softmax())
|
||
|
||
2. Operator overloading
|
||
-----------------------
|
||
|
||
The ``Model.define_operators()`` classmethod allows you to bind arbitrary
|
||
binary functions to Python operators, for use in any ``Model`` instance. The
|
||
method can (and should) be used as a context-manager, so that the overloading
|
||
is limited to the immediate block. This allows concise and expressive model
|
||
definition:
|
||
|
||
.. code:: python
|
||
|
||
with Model.define_operators({'>>': chain}):
|
||
model = ReLu(512) >> ReLu(512) >> Softmax()
|
||
|
||
The overloading is cleaned up at the end of the block. A fairly arbitrary zoo
|
||
of functions are currently implemented. Some of the most useful:
|
||
|
||
* ``chain(model1, model2)``: Compose two models ``f(x)`` and ``g(x)`` into a single model computing ``g(f(x))``.
|
||
|
||
* ``clone(model1, int)``: Create ``n`` copies of a model, each with distinct weights, and chain them together.
|
||
|
||
* ``concatenate(model1, model2)``: Given two models with output dimensions ``(n,)`` and ``(m,)``, construct a model with output dimensions ``(m+n,)``.
|
||
|
||
* ``add(model1, model2)``: ``add(f(x), g(x)) = f(x)+g(x)``
|
||
|
||
* ``make_tuple(model1, model2)``: Construct tuples of the outputs of two models, at the batch level. The backward pass expects to receive a tuple of gradients, which are routed through the appropriate model, and summed.
|
||
|
||
Putting these things together, here's the sort of tagging model that Thinc is
|
||
designed to make easy.
|
||
|
||
.. code:: python
|
||
|
||
with Model.define_operators({'>>': chain, '**': clone, '|': concatenate}):
|
||
model = (
|
||
add_eol_markers('EOL')
|
||
>> flatten
|
||
>> memoize(
|
||
CharLSTM(char_width)
|
||
| (normalize >> str2int >> Embed(word_width)))
|
||
>> ExtractWindow(nW=2)
|
||
>> BatchNorm(ReLu(hidden_width)) ** 3
|
||
>> Softmax()
|
||
)
|
||
|
||
Not all of these pieces are implemented yet, but hopefully this shows where
|
||
we're going. The ``memoize`` function will be particularly important: in any
|
||
batch of text, the common words will be very common. It's therefore important
|
||
to evaluate models such as the ``CharLSTM`` once per word type per minibatch,
|
||
rather than once per token.
|
||
|
||
3. Callback-based backpropagation
|
||
---------------------------------
|
||
|
||
Most neural network libraries use a computational graph abstraction. This takes
|
||
the execution away from you, so that gradients can be computed automatically.
|
||
Thinc follows a style more like the ``autograd`` library, but with larger
|
||
operations. Usage is as follows:
|
||
|
||
.. code:: python
|
||
|
||
def explicit_sgd_update(X, y):
|
||
sgd = lambda weights, gradient: weights - gradient * 0.001
|
||
yh, finish_update = model.begin_update(X, drop=0.2)
|
||
finish_update(y-yh, sgd)
|
||
|
||
Separating the backpropagation into three parts like this has many advantages.
|
||
The interface to all models is completely uniform — there is no distinction
|
||
between the top-level model you use as a predictor and the internal models for
|
||
the layers. We also make concurrency simple, by making the ``begin_update()``
|
||
step a pure function, and separating the accumulation of the gradient from the
|
||
action of the optimizer.
|
||
|
||
4. Class annotations
|
||
--------------------
|
||
|
||
To keep the class hierarchy shallow, Thinc uses class decorators to reuse code
|
||
for layer definitions. Specifically, the following decorators are available:
|
||
|
||
* ``describe.attributes()``: Allows attributes to be specified by keyword argument. Used especially for dimensions and parameters.
|
||
|
||
* ``describe.on_init()``: Allows callbacks to be specified, which will be called at the end of the ``__init__.py``.
|
||
|
||
* ``describe.on_data()``: Allows callbacks to be specified, which will be called on ``Model.begin_training()``.
|
||
|
||
🛠 Changelog
|
||
============
|
||
|
||
=========== ============== ===========
|
||
Version Date Description
|
||
=========== ============== ===========
|
||
`v6.10.1`_ ``2017-11-15`` Fix GPU install and minor memory leak
|
||
`v6.10.0`_ ``2017-10-28`` CPU efficiency improvements, refactoring
|
||
`v6.9.0`_ ``2017-10-03`` Reorganize layers, bug fix to Layer Normalization
|
||
`v6.8.2`_ ``2017-09-26`` Fix packaging of `gpu_ops`
|
||
`v6.8.1`_ ``2017-08-23`` Fix Windows support
|
||
`v6.8.0`_ ``2017-07-25`` SELU layer, attention, improved GPU/CPU compatibility
|
||
`v6.7.3`_ ``2017-06-05`` Fix convolution on GPU
|
||
`v6.7.2`_ ``2017-06-02`` Bug fixes to serialization
|
||
`v6.7.1`_ ``2017-06-02`` Improve serialization
|
||
`v6.7.0`_ ``2017-06-01`` Fixes to serialization, hash embeddings and flatten ops
|
||
`v6.6.0`_ ``2017-05-14`` Improved GPU usage and examples
|
||
v6.5.2 ``2017-03-20`` *n/a*
|
||
`v6.5.1`_ ``2017-03-20`` Improved linear class and Windows fix
|
||
`v6.5.0`_ ``2017-03-11`` Supervised similarity, fancier embedding and improvements to linear model
|
||
v6.4.0 ``2017-02-15`` *n/a*
|
||
`v6.3.0`_ ``2017-01-25`` Efficiency improvements, argument checking and error messaging
|
||
`v6.2.0`_ ``2017-01-15`` Improve API and introduce overloaded operators
|
||
`v6.1.3`_ ``2017-01-10`` More neural network functions and training continuation
|
||
v6.1.3 ``2017-01-09`` *n/a*
|
||
v6.1.2 ``2017-01-09`` *n/a*
|
||
v6.1.1 ``2017-01-09`` *n/a*
|
||
v6.1.0 ``2017-01-09`` *n/a*
|
||
`v6.0.0`_ ``2016-12-31`` Add ``thinc.neural`` for NLP-oriented deep learning
|
||
=========== ============== ===========
|
||
|
||
.. _v6.10.1: https://github.com/explosion/thinc/releases/tag/v6.10.1
|
||
.. _v6.10.0: https://github.com/explosion/thinc/releases/tag/v6.10.0
|
||
.. _v6.9.0: https://github.com/explosion/thinc/releases/tag/v6.9.0
|
||
.. _v6.8.2: https://github.com/explosion/thinc/releases/tag/v6.8.2
|
||
.. _v6.8.1: https://github.com/explosion/thinc/releases/tag/v6.8.1
|
||
.. _v6.8.0: https://github.com/explosion/thinc/releases/tag/v6.8.0
|
||
.. _v6.7.3: https://github.com/explosion/thinc/releases/tag/v6.7.3
|
||
.. _v6.7.2: https://github.com/explosion/thinc/releases/tag/v6.7.2
|
||
.. _v6.7.1: https://github.com/explosion/thinc/releases/tag/v6.7.1
|
||
.. _v6.7.0: https://github.com/explosion/thinc/releases/tag/v6.7.0
|
||
.. _v6.6.0: https://github.com/explosion/thinc/releases/tag/v6.6.0
|
||
.. _v6.5.1: https://github.com/explosion/thinc/releases/tag/v6.5.1
|
||
.. _v6.5.0: https://github.com/explosion/thinc/releases/tag/v6.5.0
|
||
.. _v6.3.0: https://github.com/explosion/thinc/releases/tag/v6.3.0
|
||
.. _v6.2.0: https://github.com/explosion/thinc/releases/tag/v6.2.0
|
||
.. _v6.1.3: https://github.com/explosion/thinc/releases/tag/v6.1.3
|
||
.. _v6.0.0: https://github.com/explosion/thinc/releases/tag/v6.0.0
|
||
|
||
|