Metadata-Version: 2.1
Name: thinc
Version: 6.12.1
Summary: Practical Machine Learning for NLP
Home-page: https://github.com/explosion/thinc
Author: Matthew Honnibal
Author-email: matt@explosion.ai
License: MIT
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: POSIX :: Linux
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Operating System :: Microsoft :: Windows
Classifier: Programming Language :: Cython
Classifier: Programming Language :: Python :: 2.6
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3.3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Topic :: Scientific/Engineering
Requires-Dist: numpy (>=1.7.0)
Requires-Dist: msgpack (<0.6.0,>=0.5.6)
Requires-Dist: msgpack-numpy (<0.4.4)
Requires-Dist: murmurhash (<1.1.0,>=0.28.0)
Requires-Dist: cymem (<3.0.0,>=2.0.2)
Requires-Dist: preshed (<3.0.0,>=2.0.1)
Requires-Dist: cytoolz (<0.10,>=0.9.0)
Requires-Dist: wrapt (<1.11.0,>=1.10.0)
Requires-Dist: plac (<1.0.0,>=0.9.6)
Requires-Dist: tqdm (<5.0.0,>=4.10.0)
Requires-Dist: six (<2.0.0,>=1.10.0)
Requires-Dist: dill (<0.3.0,>=0.2.7)
Requires-Dist: pathlib (==1.0.1) ; python_version < "3.4"
Provides-Extra: cuda
Requires-Dist: thinc-gpu-ops (<0.1.0,>=0.0.3) ; extra == 'cuda'
Requires-Dist: cupy (>=5.0.0b4) ; extra == 'cuda'
Provides-Extra: cuda100
Requires-Dist: thinc-gpu-ops (<0.1.0,>=0.0.3) ; extra == 'cuda100'
Requires-Dist: cupy-cuda100 (>=5.0.0b4) ; extra == 'cuda100'
Provides-Extra: cuda80
Requires-Dist: thinc-gpu-ops (<0.1.0,>=0.0.3) ; extra == 'cuda80'
Requires-Dist: cupy-cuda80 (>=5.0.0b4) ; extra == 'cuda80'
Provides-Extra: cuda90
Requires-Dist: thinc-gpu-ops (<0.1.0,>=0.0.3) ; extra == 'cuda90'
Requires-Dist: cupy-cuda90 (>=5.0.0b4) ; extra == 'cuda90'
Provides-Extra: cuda91
Requires-Dist: thinc-gpu-ops (<0.1.0,>=0.0.3) ; extra == 'cuda91'
Requires-Dist: cupy-cuda91 (>=5.0.0b4) ; extra == 'cuda91'
Provides-Extra: cuda92
Requires-Dist: thinc-gpu-ops (<0.1.0,>=0.0.3) ; extra == 'cuda92'
Requires-Dist: cupy-cuda92 (>=5.0.0b4) ; extra == 'cuda92'

Thinc: Practical Machine Learning for NLP in Python
***************************************************

**Thinc** is the machine learning library powering `spaCy <https://spacy.io>`_.
It features a battle-tested linear model designed for large sparse learning
problems, and a flexible neural network model under development for
`spaCy v2.0 <https://alpha.spacy.io/usage/v2>`_.

Thinc is a practical toolkit for implementing models that follow the
`"Embed, encode, attend, predict" <https://explosion.ai/blog/deep-learning-formula-nlp>`_
architecture. It's designed to be easy to install, efficient for CPU usage and
optimised for NLP and deep learning with text – in particular, hierarchically
structured input and variable-length sequences.

🔮 **Version 6.12 out now!** `Read the release notes here. <https://github.com/explosion/thinc/releases/>`_

.. image:: https://img.shields.io/travis/explosion/thinc/master.svg?style=flat-square
    :target: https://travis-ci.org/explosion/thinc
    :alt: Build Status

.. image:: https://img.shields.io/appveyor/ci/explosion/thinc/master.svg?style=flat-square
    :target: https://ci.appveyor.com/project/explosion/thinc
    :alt: Appveyor Build Status

.. image:: https://img.shields.io/coveralls/explosion/thinc.svg?style=flat-square
    :target: https://coveralls.io/github/explosion/thinc
    :alt: Test Coverage

.. image:: https://img.shields.io/github/release/explosion/thinc.svg?style=flat-square
    :target: https://github.com/explosion/thinc/releases
    :alt: Current Release Version

.. image:: https://img.shields.io/pypi/v/thinc.svg?style=flat-square
    :target: https://pypi.python.org/pypi/thinc
    :alt: pypi Version

.. image:: https://anaconda.org/conda-forge/thinc/badges/version.svg
    :target: https://anaconda.org/conda-forge/thinc
    :alt: conda Version

.. image:: https://img.shields.io/badge/gitter-join%20chat%20%E2%86%92-7676d1.svg?style=flat-square
    :target: https://gitter.im/explosion/thinc
    :alt: Thinc on Gitter

.. image:: https://img.shields.io/twitter/follow/explosion_ai.svg?style=social&label=Follow
    :target: https://twitter.com/explosion_ai
    :alt: Follow us on Twitter

What's where (as of v6.9.0)
===========================

======================== ====================================================================
``thinc.v2v.Model``      Base class.
``thinc.v2v``            Layers transforming vectors to vectors.
``thinc.i2v``            Layers embedding IDs to vectors.
``thinc.t2v``            Layers pooling tensors to vectors.
``thinc.t2t``            Layers transforming tensors to tensors (e.g. CNN, LSTM).
``thinc.api``            Higher-order functions, for building networks. Will be renamed.
``thinc.extra``          Datasets and utilities.
``thinc.neural.ops``     Container classes for mathematical operations. Will be reorganized.
``thinc.linear.avgtron`` Legacy efficient Averaged Perceptron implementation.
======================== ====================================================================
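
For example, the names used in the examples later in this README come from
these modules. This is a sketch as of v6.9+; as the table notes, locations may
shift between minor versions:

.. code:: python

    from thinc.v2v import Model, ReLu, Softmax            # layer classes
    from thinc.api import chain, clone, concatenate, add  # higher-order functions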

Development status
==================

Thinc's deep learning functionality is still under active development: APIs are
unstable, and we're not yet ready to provide usage support. However, if you're
already quite familiar with neural networks, there's a lot here you might find
interesting. Thinc's conceptual model is quite different from TensorFlow's.
Thinc also implements some novel features, such as a small DSL for concisely
wiring up models, embedding tables that support pre-computation and the
hashing trick, dynamic batch sizes, a concatenation-based approach to
variable-length sequences, and support for model averaging for the
Adam solver (which performs very well).

No computational graph – just higher order functions
=====================================================

The central problem for a neural network implementation is this: during the
forward pass, you compute results that will later be useful during the backward
pass. How do you keep track of this arbitrary state, while making sure that
layers can be cleanly composed?

Most libraries solve this problem by having you declare the forward
computations, which are then compiled into a graph somewhere behind the scenes.
Thinc doesn't have a "computational graph". Instead, we just use the stack,
because we put the state from the forward pass into callbacks.

All nodes in the network have a simple signature:

.. code:: none

    f(inputs) -> {outputs, f(d_outputs)->d_inputs}

To make this less abstract, here's a ReLU activation, following this signature:

.. code:: python

    def relu(inputs):
        mask = inputs > 0
        def backprop_relu(d_outputs, optimizer):
            return d_outputs * mask
        return inputs * mask, backprop_relu

When you call the ``relu`` function, you get back an output variable, and a
callback. This lets you calculate a gradient using the output, and then pass it
into the callback to perform the backward pass.
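
For instance, here's a minimal sketch of one round trip through ``relu``. The
input array and the upstream gradient are invented for illustration, and the
``optimizer`` argument is unused by this stateless layer, so we pass ``None``:

.. code:: python

    import numpy

    X = numpy.array([[-1., 2.], [3., -4.]])
    Y, backprop = relu(X)                  # forward pass: Y is [[0, 2], [3, 0]]
    d_outputs = numpy.ones_like(Y)         # pretend gradient from the next layer
    d_inputs = backprop(d_outputs, None)   # [[0, 1], [1, 0]]: same mask as the forward pass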

This signature makes it easy to build a complex network out of smaller pieces,
using arbitrary higher-order functions you can write yourself. To make this
clearer, we need a function for a weights layer. Usually this will be
implemented as a class — but let's continue using closures, to keep things
concise, and to keep the simplicity of the interface explicit:

.. code:: python

    import numpy

    def create_linear_layer(n_out, n_in):
        W = numpy.zeros((n_out, n_in))
        b = numpy.zeros((n_out, 1))

        def forward(X):
            Y = W @ X + b
            def backward(dY, optimizer):
                dX = W.T @ dY
                dW = numpy.einsum('ik,jk->ij', dY, X)
                db = dY.sum(axis=1, keepdims=True)  # sum over the batch, matching b's shape

                optimizer(W, dW)
                optimizer(b, db)

                return dX
            return Y, backward
        return forward

If we call ``Wb = create_linear_layer(5, 4)``, the variable ``Wb`` will be the
``forward()`` function, implemented inside the body of ``create_linear_layer()``.
The ``Wb`` instance will have access to the ``W`` and ``b`` variables defined in its
outer scope. If we invoke ``create_linear_layer()`` again, we get a new instance,
with its own internal state.
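
As a quick check of the closure behaviour described above (shapes invented for
illustration; with zero-initialised weights the outputs are all zero):

.. code:: python

    layer_a = create_linear_layer(5, 4)
    layer_b = create_linear_layer(5, 4)

    X = numpy.ones((4, 2))        # (n_in, batch_size)
    Ya, backprop_a = layer_a(X)   # Ya has shape (5, 2)
    Yb, backprop_b = layer_b(X)   # layer_b closes over its own W and b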

The ``Wb`` instance and the ``relu`` function have exactly the same signature. This
makes it easy to write higher order functions to compose them. The most obvious
thing to do is chain them together:

.. code:: python

    def chain(*layers):
        def forward(X):
            backprops = []
            Y = X
            for layer in layers:
                Y, backprop = layer(Y)
                backprops.append(backprop)
            def backward(dY, optimizer):
                for backprop in reversed(backprops):
                    dY = backprop(dY, optimizer)
                return dY
            return Y, backward
        return forward

We could now chain our linear layer together with the ``relu`` activation, to
create a simple feed-forward network:

.. code:: python

    Wb1 = create_linear_layer(10, 5)
    Wb2 = create_linear_layer(3, 10)

    model = chain(Wb1, relu, Wb2)

    X = numpy.random.uniform(size=(5, 4))  # (n_in, batch_size)

    y, bp_y = model(X)

    dY = y - truth            # 'truth' is your gold-standard output array
    dX = bp_y(dY, optimizer)  # 'optimizer' is a callable, as in backward() above
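
To run this snippet end to end, you have to supply the two names it leaves
undefined. Here's one minimal, hypothetical way to fill them in, reusing the
callable-optimizer convention from ``backward()`` above (plain SGD with an
invented learning rate, and random values standing in for the gold labels):

.. code:: python

    def sgd_optimizer(weights, gradient):
        # In-place SGD step, matching the optimizer(W, dW) calls in backward()
        weights -= 0.001 * gradient

    truth = numpy.random.uniform(size=(3, 4))  # same shape as the model output

    y, bp_y = model(X)
    dX = bp_y(y - truth, sgd_optimizer)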

This conceptual model makes Thinc very flexible. The trade-off is that Thinc is
less convenient and efficient at workloads that fit exactly into what
`TensorFlow <https://www.tensorflow.org/>`_ etc. are designed for. If your graph
really is static, and your inputs are homogeneous in size and shape,
`Keras <https://keras.io/>`_ will likely be faster and simpler. But if you want
to pass normal Python objects through your network, or handle sequences and
recursions of arbitrary length or complexity, you might find Thinc's design a
better fit for your problem.

Quickstart
==========

Thinc should install cleanly with both `pip <http://pypi.python.org/pypi/thinc>`_ and
`conda <https://anaconda.org/conda-forge/thinc>`_, for **Python 2.7+ and 3.5+**, on
**Linux**, **macOS / OSX** and **Windows**. Its only system dependencies are a
compiler tool-chain (e.g. ``build-essential``) and the Python development headers
(e.g. ``python-dev``).

.. code:: bash

    pip install thinc

For GPU support, we're grateful to use the work of Chainer's ``cupy`` module,
which provides a numpy-compatible interface for GPU arrays. However, installing
Chainer when no GPU is available currently causes an error. We therefore do not
list Chainer as an explicit dependency, so building Thinc for GPU requires some
extra steps:

.. code:: bash

    export CUDA_HOME=/usr/local/cuda-8.0  # Or wherever your CUDA is
    export PATH=$PATH:$CUDA_HOME/bin
    pip install chainer
    python -c "import cupy; assert cupy"  # Check it installed
    pip install thinc
    python -c "import thinc.neural.gpu_ops"  # Check the GPU ops were built

The rest of this section describes how to build Thinc from source. If you have
`Fabric <http://www.fabfile.org>`_ installed, you can use the shortcut:

.. code:: bash

    git clone https://github.com/explosion/thinc
    cd thinc
    fab clean env make test

You can then run the examples as follows:

.. code:: bash

    fab eg.mnist
    fab eg.basic_tagger
    fab eg.cnn_tagger

Otherwise, you can build and test explicitly with:

.. code:: bash

    git clone https://github.com/explosion/thinc
    cd thinc

    virtualenv .env
    source .env/bin/activate

    pip install -r requirements.txt
    python setup.py build_ext --inplace
    py.test thinc/

And then run the examples as follows:

.. code:: bash

    python examples/mnist.py
    python examples/basic_tagger.py
    python examples/cnn_tagger.py

Usage
=====

The Neural Network API is still subject to change, even within minor versions.
You can get a feel for the current API by checking out the examples. Here are
a few quick highlights.

1. Shape inference
------------------

Models can be created with some dimensions unspecified. Missing dimensions are
inferred when pre-trained weights are loaded or when training begins. This
eliminates a common source of programmer error:

.. code:: python

    # Invalid network — shape mismatch
    model = chain(ReLu(512, 748), ReLu(512, 784), Softmax(10))

    # Leave the dimensions unspecified, and you can't be wrong.
    model = chain(ReLu(512), ReLu(512), Softmax())
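
Dimension inference typically kicks in when training begins. Here's a rough
sketch, modelled on the bundled examples (the ``begin_training`` trainer API and
these variable names are taken from the examples, not a stable contract):

.. code:: python

    import numpy

    train_X = numpy.random.uniform(size=(1000, 784))
    train_y = numpy.zeros((1000, 10))

    model = chain(ReLu(512), ReLu(512), Softmax())
    with model.begin_training(train_X, train_y) as (trainer, optimizer):
        # The missing input/output dimensions are inferred from the data here.
        for X, y in trainer.iterate(train_X, train_y):
            yh, backprop = model.begin_update(X, drop=trainer.dropout)
            backprop(yh - y, optimizer)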

2. Operator overloading
-----------------------

The ``Model.define_operators()`` classmethod allows you to bind arbitrary
binary functions to Python operators, for use in any ``Model`` instance. The
method can (and should) be used as a context manager, so that the overloading
is limited to the immediate block. This allows concise and expressive model
definition:

.. code:: python

    with Model.define_operators({'>>': chain}):
        model = ReLu(512) >> ReLu(512) >> Softmax()

The overloading is cleaned up at the end of the block. A fairly arbitrary zoo
of functions is currently implemented. Some of the most useful, with a short
sketch after the list:

* ``chain(model1, model2)``: Compose two models ``f(x)`` and ``g(x)`` into a single model computing ``g(f(x))``.

* ``clone(model1, n)``: Create ``n`` copies of a model, each with distinct weights, and chain them together.

* ``concatenate(model1, model2)``: Given two models with output dimensions ``(n,)`` and ``(m,)``, construct a model with output dimensions ``(m+n,)``.

* ``add(model1, model2)``: ``add(f(x), g(x)) = f(x)+g(x)``

* ``make_tuple(model1, model2)``: Construct tuples of the outputs of two models, at the batch level. The backward pass expects to receive a tuple of gradients, which are routed through the appropriate model, and summed.
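
Here's a small sketch of how these combinators compose under
``Model.define_operators()``. The layer sizes are invented, and the operator
bindings simply mirror the tagging example below:

.. code:: python

    with Model.define_operators({'>>': chain, '|': concatenate, '**': clone}):
        # Two parallel columns concatenated into a 256-dimensional vector,
        # then three distinct-weights copies of a ReLu layer, chained.
        model = (ReLu(128) | ReLu(128)) >> ReLu(256) ** 3 >> Softmax()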

Putting these things together, here's the sort of tagging model that Thinc is
designed to make easy.

.. code:: python

    with Model.define_operators({'>>': chain, '**': clone, '|': concatenate}):
        model = (
            add_eol_markers('EOL')
            >> flatten
            >> memoize(
                CharLSTM(char_width)
                | (normalize >> str2int >> Embed(word_width)))
            >> ExtractWindow(nW=2)
            >> BatchNorm(ReLu(hidden_width)) ** 3
            >> Softmax()
        )

Not all of these pieces are implemented yet, but hopefully this shows where
we're going. The ``memoize`` function will be particularly important: in any
batch of text, the common words will be very common. It's therefore important
to evaluate models such as the ``CharLSTM`` once per word type per minibatch,
rather than once per token.

3. Callback-based backpropagation
---------------------------------

Most neural network libraries use a computational graph abstraction. This takes
the execution away from you, so that gradients can be computed automatically.
Thinc follows a style more like the ``autograd`` library, but with larger
operations. Usage is as follows:

.. code:: python

    def explicit_sgd_update(X, y):
        def sgd(weights, gradient):
            weights -= gradient * 0.001  # update in place; the return value is ignored
        yh, finish_update = model.begin_update(X, drop=0.2)
        finish_update(y - yh, sgd)

Separating the backpropagation into three parts like this has many advantages.
The interface to all models is completely uniform — there is no distinction
between the top-level model you use as a predictor and the internal models for
the layers. We also make concurrency simple, by making the ``begin_update()``
step a pure function, and separating the accumulation of the gradient from the
action of the optimizer.
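
For instance, because the optimizer is just a callable, you can pass in one
that merely accumulates, and apply the update later. A hypothetical sketch,
assuming two minibatches ``(X1, y1)`` and ``(X2, y2)`` defined elsewhere:

.. code:: python

    import numpy

    grads = {}

    def accumulate(weights, gradient):
        # Collect gradients, keyed by the parameter array they belong to.
        key = id(weights)
        if key not in grads:
            grads[key] = [weights, numpy.zeros_like(weights)]
        grads[key][1] += gradient

    for X, y in ((X1, y1), (X2, y2)):
        yh, finish_update = model.begin_update(X, drop=0.0)
        finish_update(yh - y, accumulate)

    # Only now do the weights change: one SGD step per parameter.
    for weights, total in grads.values():
        weights -= 0.001 * total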

4. Class annotations
--------------------

To keep the class hierarchy shallow, Thinc uses class decorators to reuse code
for layer definitions. Specifically, the following decorators are available
(see the sketch after this list):

* ``describe.attributes()``: Allows attributes to be specified by keyword argument. Used especially for dimensions and parameters.

* ``describe.on_init()``: Allows callbacks to be specified, which will be called at the end of ``__init__()``.

* ``describe.on_data()``: Allows callbacks to be specified, which will be called on ``Model.begin_training()``.
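
Here's a rough sketch of the decorator style. The descriptor classes and
attribute names are illustrative, based on ``thinc.describe`` and the base
class from ``thinc.v2v``; check the source for the exact API:

.. code:: python

    from thinc import describe
    from thinc.describe import Dimension, Synapses
    from thinc.v2v import Model

    def set_dims_on_data(model, X, y):
        # Runs from Model.begin_training(): infer the input width from the data.
        if model.nI is None:
            model.nI = X.shape[1]

    @describe.on_data(set_dims_on_data)
    @describe.attributes(
        nI=Dimension("Size of input vectors"),
        nO=Dimension("Size of output vectors"),
        W=Synapses("The weights matrix", lambda obj: (obj.nO, obj.nI)),
    )
    class MyAffine(Model):
        ...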

🛠 Changelog
============

=========== ============== ===========
Version     Date           Description
=========== ============== ===========
`v6.10.1`_  ``2017-11-15`` Fix GPU install and minor memory leak
`v6.10.0`_  ``2017-10-28`` CPU efficiency improvements, refactoring
`v6.9.0`_   ``2017-10-03`` Reorganize layers, bug fix to Layer Normalization
`v6.8.2`_   ``2017-09-26`` Fix packaging of ``gpu_ops``
`v6.8.1`_   ``2017-08-23`` Fix Windows support
`v6.8.0`_   ``2017-07-25`` SELU layer, attention, improved GPU/CPU compatibility
`v6.7.3`_   ``2017-06-05`` Fix convolution on GPU
`v6.7.2`_   ``2017-06-02`` Bug fixes to serialization
`v6.7.1`_   ``2017-06-02`` Improve serialization
`v6.7.0`_   ``2017-06-01`` Fixes to serialization, hash embeddings and flatten ops
`v6.6.0`_   ``2017-05-14`` Improved GPU usage and examples
v6.5.2      ``2017-03-20`` *n/a*
`v6.5.1`_   ``2017-03-20`` Improved linear class and Windows fix
`v6.5.0`_   ``2017-03-11`` Supervised similarity, fancier embedding and improvements to linear model
v6.4.0      ``2017-02-15`` *n/a*
`v6.3.0`_   ``2017-01-25`` Efficiency improvements, argument checking and error messaging
`v6.2.0`_   ``2017-01-15`` Improve API and introduce overloaded operators
`v6.1.3`_   ``2017-01-10`` More neural network functions and training continuation
v6.1.3      ``2017-01-09`` *n/a*
v6.1.2      ``2017-01-09`` *n/a*
v6.1.1      ``2017-01-09`` *n/a*
v6.1.0      ``2017-01-09`` *n/a*
`v6.0.0`_   ``2016-12-31`` Add ``thinc.neural`` for NLP-oriented deep learning
=========== ============== ===========

.. _v6.10.1: https://github.com/explosion/thinc/releases/tag/v6.10.1
.. _v6.10.0: https://github.com/explosion/thinc/releases/tag/v6.10.0
.. _v6.9.0: https://github.com/explosion/thinc/releases/tag/v6.9.0
.. _v6.8.2: https://github.com/explosion/thinc/releases/tag/v6.8.2
.. _v6.8.1: https://github.com/explosion/thinc/releases/tag/v6.8.1
.. _v6.8.0: https://github.com/explosion/thinc/releases/tag/v6.8.0
.. _v6.7.3: https://github.com/explosion/thinc/releases/tag/v6.7.3
.. _v6.7.2: https://github.com/explosion/thinc/releases/tag/v6.7.2
.. _v6.7.1: https://github.com/explosion/thinc/releases/tag/v6.7.1
.. _v6.7.0: https://github.com/explosion/thinc/releases/tag/v6.7.0
.. _v6.6.0: https://github.com/explosion/thinc/releases/tag/v6.6.0
.. _v6.5.1: https://github.com/explosion/thinc/releases/tag/v6.5.1
.. _v6.5.0: https://github.com/explosion/thinc/releases/tag/v6.5.0
.. _v6.3.0: https://github.com/explosion/thinc/releases/tag/v6.3.0
.. _v6.2.0: https://github.com/explosion/thinc/releases/tag/v6.2.0
.. _v6.1.3: https://github.com/explosion/thinc/releases/tag/v6.1.3
.. _v6.0.0: https://github.com/explosion/thinc/releases/tag/v6.0.0