Metadata-Version: 2.1
Name: PyStemmer
Version: 1.3.0
Summary: Snowball stemming algorithms, for information retrieval
Home-page: http://snowball.tartarus.org/
Author: Richard Boulton
Author-email: richard@tartarus.org
Maintainer: Richard Boulton
Maintainer-email: richard@tartarus.org
License: ['MIT', 'BSD']
Download-URL: http://snowball.tartarus.org/wrappers/PyStemmer-1.3.0.tar.gz
Keywords: python,information retrieval,language processing,morphological analysis,stemming algorithms,stemmers
Platform: any
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: License :: OSI Approved :: BSD License
Classifier: Natural Language :: Danish
Classifier: Natural Language :: Dutch
Classifier: Natural Language :: English
Classifier: Natural Language :: Finnish
Classifier: Natural Language :: French
Classifier: Natural Language :: German
Classifier: Natural Language :: Italian
Classifier: Natural Language :: Norwegian
Classifier: Natural Language :: Portuguese
Classifier: Natural Language :: Russian
Classifier: Natural Language :: Spanish
Classifier: Natural Language :: Swedish
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: C
Classifier: Programming Language :: Other
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 2
Classifier: Programming Language :: Python :: 2.6
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.2
Classifier: Programming Language :: Python :: 3.3
Classifier: Topic :: Database
Classifier: Topic :: Internet :: WWW/HTTP :: Indexing/Search
Classifier: Topic :: Text Processing :: Indexing
Classifier: Topic :: Text Processing :: Linguistic

Stemming algorithms

PyStemmer provides access to efficient algorithms for calculating a
"stemmed" form of a word. This is a form with most of the common
morphological endings removed, ideally leaving a common linguistic
base form. Stemming is most useful in building search engines and
information retrieval software. For example, a search with stemming
enabled should be able to find a document containing "cycling" given the
query "cycles".

PyStemmer provides algorithms for several (mainly European) languages,
by wrapping the libstemmer library from the Snowball project in a Python
module.

It also provides access to the classic Porter stemming algorithm for
English. Although this has been superseded by an improved algorithm, the
original algorithm may be of interest to information retrieval
researchers wishing to reproduce results of earlier experiments.