385 lines
14 KiB
Text
385 lines
14 KiB
Text
|
Metadata-Version: 2.1
|
|||
|
Name: spacy
|
|||
|
Version: 2.0.18
|
|||
|
Summary: Industrial-strength Natural Language Processing (NLP) with Python and Cython
|
|||
|
Home-page: https://spacy.io
|
|||
|
Author: Explosion AI
|
|||
|
Author-email: contact@explosion.ai
|
|||
|
License: MIT
|
|||
|
Platform: UNKNOWN
|
|||
|
Classifier: Development Status :: 5 - Production/Stable
|
|||
|
Classifier: Environment :: Console
|
|||
|
Classifier: Intended Audience :: Developers
|
|||
|
Classifier: Intended Audience :: Science/Research
|
|||
|
Classifier: License :: OSI Approved :: MIT License
|
|||
|
Classifier: Operating System :: POSIX :: Linux
|
|||
|
Classifier: Operating System :: MacOS :: MacOS X
|
|||
|
Classifier: Operating System :: Microsoft :: Windows
|
|||
|
Classifier: Programming Language :: Cython
|
|||
|
Classifier: Programming Language :: Python :: 2
|
|||
|
Classifier: Programming Language :: Python :: 2.7
|
|||
|
Classifier: Programming Language :: Python :: 3
|
|||
|
Classifier: Programming Language :: Python :: 3.4
|
|||
|
Classifier: Programming Language :: Python :: 3.5
|
|||
|
Classifier: Programming Language :: Python :: 3.6
|
|||
|
Classifier: Topic :: Scientific/Engineering
|
|||
|
Requires-Dist: numpy (>=1.15.0)
|
|||
|
Requires-Dist: murmurhash (<1.1.0,>=0.28.0)
|
|||
|
Requires-Dist: cymem (<2.1.0,>=2.0.2)
|
|||
|
Requires-Dist: preshed (<2.1.0,>=2.0.1)
|
|||
|
Requires-Dist: thinc (<6.13.0,>=6.12.1)
|
|||
|
Requires-Dist: plac (<1.0.0,>=0.9.6)
|
|||
|
Requires-Dist: ujson (>=1.35)
|
|||
|
Requires-Dist: dill (<0.3,>=0.2)
|
|||
|
Requires-Dist: regex (==2018.01.10)
|
|||
|
Requires-Dist: requests (<3.0.0,>=2.13.0)
|
|||
|
Requires-Dist: pathlib (==1.0.1) ; python_version < "3.4"
|
|||
|
Provides-Extra: cuda
|
|||
|
Requires-Dist: cupy (>=4.0) ; extra == 'cuda'
|
|||
|
Provides-Extra: cuda100
|
|||
|
Requires-Dist: cupy-cuda100 (>=4.0) ; extra == 'cuda100'
|
|||
|
Requires-Dist: thinc-gpu-ops (<0.1.0,>=0.0.3) ; extra == 'cuda100'
|
|||
|
Provides-Extra: cuda80
|
|||
|
Requires-Dist: cupy-cuda80 (>=4.0) ; extra == 'cuda80'
|
|||
|
Requires-Dist: thinc-gpu-ops (<0.1.0,>=0.0.3) ; extra == 'cuda80'
|
|||
|
Provides-Extra: cuda90
|
|||
|
Requires-Dist: cupy-cuda90 (>=4.0) ; extra == 'cuda90'
|
|||
|
Requires-Dist: thinc-gpu-ops (<0.1.0,>=0.0.3) ; extra == 'cuda90'
|
|||
|
Provides-Extra: cuda91
|
|||
|
Requires-Dist: cupy-cuda91 (>=4.0) ; extra == 'cuda91'
|
|||
|
Requires-Dist: thinc-gpu-ops (<0.1.0,>=0.0.3) ; extra == 'cuda91'
|
|||
|
Provides-Extra: cuda92
|
|||
|
Requires-Dist: cupy-cuda92 (>=4.0) ; extra == 'cuda92'
|
|||
|
Requires-Dist: thinc-gpu-ops (<0.1.0,>=0.0.3) ; extra == 'cuda92'
|
|||
|
|
|||
|
spaCy: Industrial-strength NLP
|
|||
|
******************************
|
|||
|
|
|||
|
spaCy is a library for advanced Natural Language Processing in Python and Cython.
|
|||
|
It's built on the very latest research, and was designed from day one to be
|
|||
|
used in real products. spaCy comes with
|
|||
|
`pre-trained statistical models <https://spacy.io/models>`_ and word
|
|||
|
vectors, and currently supports tokenization for **30+ languages**. It features
|
|||
|
the **fastest syntactic parser** in the world, convolutional **neural network models**
|
|||
|
for tagging, parsing and **named entity recognition** and easy **deep learning**
|
|||
|
integration. It's commercial open-source software, released under the MIT license.
|
|||
|
|
|||
|
💫 **Version 2.0 out now!** `Check out the release notes here. <https://github.com/explosion/spaCy/releases>`_
|
|||
|
|
|||
|
.. image:: https://img.shields.io/travis/explosion/spaCy/master.svg?style=flat-square&logo=travis
|
|||
|
:target: https://travis-ci.org/explosion/spaCy
|
|||
|
:alt: Build Status
|
|||
|
|
|||
|
.. image:: https://img.shields.io/appveyor/ci/explosion/spaCy/master.svg?style=flat-square&logo=appveyor
|
|||
|
:target: https://ci.appveyor.com/project/explosion/spaCy
|
|||
|
:alt: Appveyor Build Status
|
|||
|
|
|||
|
.. image:: https://img.shields.io/github/release/explosion/spacy.svg?style=flat-square
|
|||
|
:target: https://github.com/explosion/spaCy/releases
|
|||
|
:alt: Current Release Version
|
|||
|
|
|||
|
.. image:: https://img.shields.io/pypi/v/spacy.svg?style=flat-square
|
|||
|
:target: https://pypi.python.org/pypi/spacy
|
|||
|
:alt: pypi Version
|
|||
|
|
|||
|
.. image:: https://img.shields.io/conda/vn/conda-forge/spacy.svg?style=flat-square
|
|||
|
:target: https://anaconda.org/conda-forge/spacy
|
|||
|
:alt: conda Version
|
|||
|
|
|||
|
.. image:: https://img.shields.io/badge/wheels-%E2%9C%93-4c1.svg?longCache=true&style=flat-square&logo=python&logoColor=white
|
|||
|
:target: https://github.com/explosion/wheelwright/releases
|
|||
|
:alt: Python wheels
|
|||
|
|
|||
|
.. image:: https://img.shields.io/twitter/follow/spacy_io.svg?style=social&label=Follow
|
|||
|
:target: https://twitter.com/spacy_io
|
|||
|
:alt: spaCy on Twitter
|
|||
|
|
|||
|
📖 Documentation
|
|||
|
================
|
|||
|
|
|||
|
=================== ===
|
|||
|
`spaCy 101`_ New to spaCy? Here's everything you need to know!
|
|||
|
`Usage Guides`_ How to use spaCy and its features.
|
|||
|
`New in v2.0`_ New features, backwards incompatibilities and migration guide.
|
|||
|
`API Reference`_ The detailed reference for spaCy's API.
|
|||
|
`Models`_ Download statistical language models for spaCy.
|
|||
|
`Universe`_ Libraries, extensions, demos, books and courses.
|
|||
|
`Changelog`_ Changes and version history.
|
|||
|
`Contribute`_ How to contribute to the spaCy project and code base.
|
|||
|
=================== ===
|
|||
|
|
|||
|
.. _spaCy 101: https://spacy.io/usage/spacy-101
|
|||
|
.. _New in v2.0: https://spacy.io/usage/v2#migrating
|
|||
|
.. _Usage Guides: https://spacy.io/usage/
|
|||
|
.. _API Reference: https://spacy.io/api/
|
|||
|
.. _Models: https://spacy.io/models
|
|||
|
.. _Universe: https://spacy.io/universe
|
|||
|
.. _Changelog: https://spacy.io/usage/#changelog
|
|||
|
.. _Contribute: https://github.com/explosion/spaCy/blob/master/CONTRIBUTING.md
|
|||
|
|
|||
|
💬 Where to ask questions
|
|||
|
==========================
|
|||
|
|
|||
|
The spaCy project is maintained by `@honnibal <https://github.com/honnibal>`_
|
|||
|
and `@ines <https://github.com/ines>`_. Please understand that we won't be able
|
|||
|
to provide individual support via email. We also believe that help is much more
|
|||
|
valuable if it's shared publicly, so that more people can benefit from it.
|
|||
|
|
|||
|
====================== ===
|
|||
|
**Bug Reports** `GitHub Issue Tracker`_
|
|||
|
**Usage Questions** `Stack Overflow`_, `Gitter Chat`_, `Reddit User Group`_
|
|||
|
**General Discussion** `Gitter Chat`_, `Reddit User Group`_
|
|||
|
====================== ===
|
|||
|
|
|||
|
.. _GitHub Issue Tracker: https://github.com/explosion/spaCy/issues
|
|||
|
.. _Stack Overflow: http://stackoverflow.com/questions/tagged/spacy
|
|||
|
.. _Gitter Chat: https://gitter.im/explosion/spaCy
|
|||
|
.. _Reddit User Group: https://www.reddit.com/r/spacynlp
|
|||
|
|
|||
|
Features
|
|||
|
========
|
|||
|
|
|||
|
* **Fastest syntactic parser** in the world
|
|||
|
* **Named entity** recognition
|
|||
|
* Non-destructive **tokenization**
|
|||
|
* Support for **30+ languages**
|
|||
|
* Pre-trained `statistical models <https://spacy.io/models>`_ and word vectors
|
|||
|
* Easy **deep learning** integration
|
|||
|
* Part-of-speech tagging
|
|||
|
* Labelled dependency parsing
|
|||
|
* Syntax-driven sentence segmentation
|
|||
|
* Built in **visualizers** for syntax and NER
|
|||
|
* Convenient string-to-hash mapping
|
|||
|
* Export to numpy data arrays
|
|||
|
* Efficient binary serialization
|
|||
|
* Easy **model packaging** and deployment
|
|||
|
* State-of-the-art speed
|
|||
|
* Robust, rigorously evaluated accuracy
|
|||
|
|
|||
|
📖 **For more details, see the** `facts, figures and benchmarks <https://spacy.io/usage/facts-figures>`_.
|
|||
|
|
|||
|
Install spaCy
|
|||
|
=============
|
|||
|
|
|||
|
For detailed installation instructions, see
|
|||
|
the `documentation <https://spacy.io/usage>`_.
|
|||
|
|
|||
|
==================== ===
|
|||
|
**Operating system** macOS / OS X, Linux, Windows (Cygwin, MinGW, Visual Studio)
|
|||
|
**Python version** CPython 2.7, 3.4+. Only 64 bit.
|
|||
|
**Package managers** `pip`_, `conda`_ (via ``conda-forge``)
|
|||
|
==================== ===
|
|||
|
|
|||
|
.. _pip: https://pypi.python.org/pypi/spacy
|
|||
|
.. _conda: https://anaconda.org/conda-forge/spacy
|
|||
|
|
|||
|
pip
|
|||
|
---
|
|||
|
|
|||
|
Using pip, spaCy releases are available as source packages and binary wheels
|
|||
|
(as of ``v2.0.13``).
|
|||
|
|
|||
|
.. code:: bash
|
|||
|
|
|||
|
pip install spacy
|
|||
|
|
|||
|
When using pip it is generally recommended to install packages in a virtual
|
|||
|
environment to avoid modifying system state:
|
|||
|
|
|||
|
.. code:: bash
|
|||
|
|
|||
|
python -m venv .env
|
|||
|
source .env/bin/activate
|
|||
|
pip install spacy
|
|||
|
|
|||
|
conda
|
|||
|
-----
|
|||
|
|
|||
|
Thanks to our great community, we've finally re-added conda support. You can now
|
|||
|
install spaCy via ``conda-forge``:
|
|||
|
|
|||
|
.. code:: bash
|
|||
|
|
|||
|
conda config --add channels conda-forge
|
|||
|
conda install spacy
|
|||
|
|
|||
|
For the feedstock including the build recipe and configuration,
|
|||
|
check out `this repository <https://github.com/conda-forge/spacy-feedstock>`_.
|
|||
|
Improvements and pull requests to the recipe and setup are always appreciated.
|
|||
|
|
|||
|
Updating spaCy
|
|||
|
--------------
|
|||
|
|
|||
|
Some updates to spaCy may require downloading new statistical models. If you're
|
|||
|
running spaCy v2.0 or higher, you can use the ``validate`` command to check if
|
|||
|
your installed models are compatible and if not, print details on how to update
|
|||
|
them:
|
|||
|
|
|||
|
.. code:: bash
|
|||
|
|
|||
|
pip install -U spacy
|
|||
|
python -m spacy validate
|
|||
|
|
|||
|
If you've trained your own models, keep in mind that your training and runtime
|
|||
|
inputs must match. After updating spaCy, we recommend **retraining your models**
|
|||
|
with the new version.
|
|||
|
|
|||
|
📖 **For details on upgrading from spaCy 1.x to spaCy 2.x, see the**
|
|||
|
`migration guide <https://spacy.io/usage/v2#migrating>`_.
|
|||
|
|
|||
|
Download models
|
|||
|
===============
|
|||
|
|
|||
|
As of v1.7.0, models for spaCy can be installed as **Python packages**.
|
|||
|
This means that they're a component of your application, just like any
|
|||
|
other module. Models can be installed using spaCy's ``download`` command,
|
|||
|
or manually by pointing pip to a path or URL.
|
|||
|
|
|||
|
======================= ===
|
|||
|
`Available Models`_ Detailed model descriptions, accuracy figures and benchmarks.
|
|||
|
`Models Documentation`_ Detailed usage instructions.
|
|||
|
======================= ===
|
|||
|
|
|||
|
.. _Available Models: https://spacy.io/models
|
|||
|
.. _Models Documentation: https://spacy.io/docs/usage/models
|
|||
|
|
|||
|
.. code:: bash
|
|||
|
|
|||
|
# out-of-the-box: download best-matching default model
|
|||
|
python -m spacy download en
|
|||
|
|
|||
|
# download best-matching version of specific model for your spaCy installation
|
|||
|
python -m spacy download en_core_web_lg
|
|||
|
|
|||
|
# pip install .tar.gz archive from path or URL
|
|||
|
pip install /Users/you/en_core_web_sm-2.0.0.tar.gz
|
|||
|
|
|||
|
Loading and using models
|
|||
|
------------------------
|
|||
|
|
|||
|
To load a model, use ``spacy.load()`` with the model's shortcut link:
|
|||
|
|
|||
|
.. code:: python
|
|||
|
|
|||
|
import spacy
|
|||
|
nlp = spacy.load('en')
|
|||
|
doc = nlp(u'This is a sentence.')
|
|||
|
|
|||
|
If you've installed a model via pip, you can also ``import`` it directly and
|
|||
|
then call its ``load()`` method:
|
|||
|
|
|||
|
.. code:: python
|
|||
|
|
|||
|
import spacy
|
|||
|
import en_core_web_sm
|
|||
|
|
|||
|
nlp = en_core_web_sm.load()
|
|||
|
doc = nlp(u'This is a sentence.')
|
|||
|
|
|||
|
📖 **For more info and examples, check out the**
|
|||
|
`models documentation <https://spacy.io/docs/usage/models>`_.
|
|||
|
|
|||
|
Support for older versions
|
|||
|
--------------------------
|
|||
|
|
|||
|
If you're using an older version (``v1.6.0`` or below), you can still download
|
|||
|
and install the old models from within spaCy using ``python -m spacy.en.download all``
|
|||
|
or ``python -m spacy.de.download all``. The ``.tar.gz`` archives are also
|
|||
|
`attached to the v1.6.0 release <https://github.com/explosion/spaCy/tree/v1.6.0>`_.
|
|||
|
To download and install the models manually, unpack the archive, drop the
|
|||
|
contained directory into ``spacy/data`` and load the model via ``spacy.load('en')``
|
|||
|
or ``spacy.load('de')``.
|
|||
|
|
|||
|
Compile from source
|
|||
|
===================
|
|||
|
|
|||
|
The other way to install spaCy is to clone its
|
|||
|
`GitHub repository <https://github.com/explosion/spaCy>`_ and build it from
|
|||
|
source. That is the common way if you want to make changes to the code base.
|
|||
|
You'll need to make sure that you have a development environment consisting of a
|
|||
|
Python distribution including header files, a compiler,
|
|||
|
`pip <https://pip.pypa.io/en/latest/installing/>`__, `virtualenv <https://virtualenv.pypa.io/>`_
|
|||
|
and `git <https://git-scm.com>`_ installed. The compiler part is the trickiest.
|
|||
|
How to do that depends on your system. See notes on Ubuntu, OS X and Windows for
|
|||
|
details.
|
|||
|
|
|||
|
.. code:: bash
|
|||
|
|
|||
|
# make sure you are using the latest pip
|
|||
|
python -m pip install -U pip
|
|||
|
git clone https://github.com/explosion/spaCy
|
|||
|
cd spaCy
|
|||
|
|
|||
|
python -m venv .env
|
|||
|
source .env/bin/activate
|
|||
|
export PYTHONPATH=`pwd`
|
|||
|
pip install -r requirements.txt
|
|||
|
python setup.py build_ext --inplace
|
|||
|
|
|||
|
Compared to regular install via pip, `requirements.txt <requirements.txt>`_
|
|||
|
additionally installs developer dependencies such as Cython. For more details
|
|||
|
and instructions, see the documentation on
|
|||
|
`compiling spaCy from source <https://spacy.io/usage/#source>`_ and the
|
|||
|
`quickstart widget <https://spacy.io/usage/#section-quickstart>`_ to get
|
|||
|
the right commands for your platform and Python version.
|
|||
|
|
|||
|
Instead of the above verbose commands, you can also use the following
|
|||
|
`Fabric <http://www.fabfile.org/>`_ commands. All commands assume that your
|
|||
|
virtual environment is located in a directory ``.env``. If you're using a
|
|||
|
different directory, you can change it via the environment variable ``VENV_DIR``,
|
|||
|
for example ``VENV_DIR=".custom-env" fab clean make``.
|
|||
|
|
|||
|
============= ===
|
|||
|
``fab env`` Create virtual environment and delete previous one, if it exists.
|
|||
|
``fab make`` Compile the source.
|
|||
|
``fab clean`` Remove compiled objects, including the generated C++.
|
|||
|
``fab test`` Run basic tests, aborting after first failure.
|
|||
|
============= ===
|
|||
|
|
|||
|
Ubuntu
|
|||
|
------
|
|||
|
|
|||
|
Install system-level dependencies via ``apt-get``:
|
|||
|
|
|||
|
.. code:: bash
|
|||
|
|
|||
|
sudo apt-get install build-essential python-dev git
|
|||
|
|
|||
|
macOS / OS X
|
|||
|
------------
|
|||
|
|
|||
|
Install a recent version of `XCode <https://developer.apple.com/xcode/>`_,
|
|||
|
including the so-called "Command Line Tools". macOS and OS X ship with Python
|
|||
|
and git preinstalled.
|
|||
|
|
|||
|
Windows
|
|||
|
-------
|
|||
|
|
|||
|
Install a version of `Visual Studio Express <https://www.visualstudio.com/vs/visual-studio-express/>`_
|
|||
|
or higher that matches the version that was used to compile your Python
|
|||
|
interpreter. For official distributions these are VS 2008 (Python 2.7),
|
|||
|
VS 2010 (Python 3.4) and VS 2015 (Python 3.5).
|
|||
|
|
|||
|
Run tests
|
|||
|
=========
|
|||
|
|
|||
|
spaCy comes with an `extensive test suite <spacy/tests>`_. In order to run the
|
|||
|
tests, you'll usually want to clone the repository and build spaCy from source.
|
|||
|
This will also install the required development dependencies and test utilities
|
|||
|
defined in the ``requirements.txt``.
|
|||
|
|
|||
|
Alternatively, you can find out where spaCy is installed and run ``pytest`` on
|
|||
|
that directory. Don't forget to also install the test utilities via spaCy's
|
|||
|
``requirements.txt``:
|
|||
|
|
|||
|
.. code:: bash
|
|||
|
|
|||
|
python -c "import os; import spacy; print(os.path.dirname(spacy.__file__))"
|
|||
|
pip install -r path/to/requirements.txt
|
|||
|
python -m pytest <spacy-directory>
|
|||
|
|
|||
|
See `the documentation <https://spacy.io/usage/#tests>`_ for more details and
|
|||
|
examples.
|
|||
|
|
|||
|
|