|
|
- Metadata-Version: 2.1
- Name: PyStemmer
- Version: 1.3.0
- Summary: Snowball stemming algorithms, for information retrieval
- Home-page: http://snowball.tartarus.org/
- Author: Richard Boulton
- Author-email: richard@tartarus.org
- Maintainer: Richard Boulton
- Maintainer-email: richard@tartarus.org
- License: ['MIT', 'BSD']
- Download-URL: http://snowball.tartarus.org/wrappers/PyStemmer-1.3.0.tar.gz
- Keywords: python,information retrieval,language processing,morphological analysis,stemming algorithms,stemmers
- Platform: any
- Classifier: Development Status :: 5 - Production/Stable
- Classifier: Intended Audience :: Developers
- Classifier: License :: OSI Approved :: MIT License
- Classifier: License :: OSI Approved :: BSD License
- Classifier: Natural Language :: Danish
- Classifier: Natural Language :: Dutch
- Classifier: Natural Language :: English
- Classifier: Natural Language :: Finnish
- Classifier: Natural Language :: French
- Classifier: Natural Language :: German
- Classifier: Natural Language :: Italian
- Classifier: Natural Language :: Norwegian
- Classifier: Natural Language :: Portuguese
- Classifier: Natural Language :: Russian
- Classifier: Natural Language :: Spanish
- Classifier: Natural Language :: Swedish
- Classifier: Operating System :: OS Independent
- Classifier: Programming Language :: C
- Classifier: Programming Language :: Other
- Classifier: Programming Language :: Python
- Classifier: Programming Language :: Python :: 2
- Classifier: Programming Language :: Python :: 2.6
- Classifier: Programming Language :: Python :: 2.7
- Classifier: Programming Language :: Python :: 3
- Classifier: Programming Language :: Python :: 3.2
- Classifier: Programming Language :: Python :: 3.3
- Classifier: Topic :: Database
- Classifier: Topic :: Internet :: WWW/HTTP :: Indexing/Search
- Classifier: Topic :: Text Processing :: Indexing
- Classifier: Topic :: Text Processing :: Linguistic
-
- Stemming algorithms
-
- PyStemmer provides access to efficient algorithms for calculating a
- "stemmed" form of a word: a form with most of the common
- morphological endings removed, which ideally represents a common
- linguistic base form. This is most useful in building search engines
- and information retrieval software; for example, a search with stemming
- enabled should be able to find a document containing "cycling" given the
- query "cycles".
-
- PyStemmer provides algorithms for several (mainly European) languages,
- by wrapping the libstemmer library from the Snowball project in a Python
- module.
-
- It also provides access to the classic Porter stemming algorithm for
- English: although this has been superseded by an improved algorithm, the
- original algorithm may be of interest to information retrieval
- researchers wishing to reproduce results of earlier experiments.
-
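The search use case described above can be sketched roughly as follows. This is a minimal illustration, not part of the package: the helper `matches` is a hypothetical name introduced here, and the demo assumes PyStemmer is installed and importable as `Stemmer`, using its `Stemmer.algorithms()`, `Stemmer.Stemmer(name)` and `stemWord()` calls. If the module is not available, the demo is skipped.

```python
# Hypothetical helper (not part of PyStemmer): find the words in a document
# that share a stem with the query word.
def matches(query, document_words, stem):
    """Return the document words whose stem equals the stem of the query."""
    target = stem(query)
    return [word for word in document_words if stem(word) == target]


try:
    import Stemmer  # the module installed by PyStemmer
except ImportError:
    Stemmer = None  # assumption: fall back to skipping the demo

if Stemmer is not None:
    # List the algorithm names that this build of libstemmer supports.
    print(Stemmer.algorithms())

    # Create a stemmer for English and compare stems of query and document.
    stemmer = Stemmer.Stemmer("english")
    print(matches("cycles", ["cycling", "recycled", "bicycle"], stemmer.stemWord))
```

Passing the stemming function as an argument keeps the matching logic independent of any one algorithm, so the same helper could be tried with the classic "porter" stemmer when reproducing older experiments.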
|