You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
 
 
 
 
 
 

63 lines
2.7 KiB

Metadata-Version: 2.1
Name: PyStemmer
Version: 1.3.0
Summary: Snowball stemming algorithms, for information retrieval
Home-page: http://snowball.tartarus.org/
Author: Richard Boulton
Author-email: richard@tartarus.org
Maintainer: Richard Boulton
Maintainer-email: richard@tartarus.org
License: ['MIT', 'BSD']
Download-URL: http://snowball.tartarus.org/wrappers/PyStemmer-1.3.0.tar.gz
Keywords: python,information retrieval,language processing,morphological analysis,stemming algorithms,stemmers
Platform: any
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: License :: OSI Approved :: BSD License
Classifier: Natural Language :: Danish
Classifier: Natural Language :: Dutch
Classifier: Natural Language :: English
Classifier: Natural Language :: Finnish
Classifier: Natural Language :: French
Classifier: Natural Language :: German
Classifier: Natural Language :: Italian
Classifier: Natural Language :: Norwegian
Classifier: Natural Language :: Portuguese
Classifier: Natural Language :: Russian
Classifier: Natural Language :: Spanish
Classifier: Natural Language :: Swedish
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: C
Classifier: Programming Language :: Other
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 2
Classifier: Programming Language :: Python :: 2.6
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.2
Classifier: Programming Language :: Python :: 3.3
Classifier: Topic :: Database
Classifier: Topic :: Internet :: WWW/HTTP :: Indexing/Search
Classifier: Topic :: Text Processing :: Indexing
Classifier: Topic :: Text Processing :: Linguistic
Stemming algorithms
PyStemmer provides access to efficient algorithms for calculating a
"stemmed" form of a word. This is a form with most of the common
morphological endings removed; hopefully representing a common
linguistic base form. This is most useful in building search engines
and information retrieval software; for example, a search with stemming
enabled should be able to find a document containing "cycling" given the
query "cycles".
PyStemmer provides algorithms for several (mainly european) languages,
by wrapping the libstemmer library from the Snowball project in a Python
module.
It also provides access to the classic Porter stemming algorithm for
english: although this has been superceded by an improved algorithm, the
original algorithm may be of interest to information retrieval
researchers wishing to reproduce results of earlier experiments.