Chardet: The Universal Character Encoding Detector
--------------------------------------------------

.. image:: https://img.shields.io/travis/chardet/chardet/stable.svg
   :alt: Build status
   :target: https://travis-ci.org/chardet/chardet

.. image:: https://img.shields.io/coveralls/chardet/chardet/stable.svg
   :target: https://coveralls.io/r/chardet/chardet

.. image:: https://img.shields.io/pypi/v/chardet.svg
   :target: https://warehouse.python.org/project/chardet/
   :alt: Latest version on PyPI

.. image:: https://img.shields.io/pypi/l/chardet.svg
   :alt: License

Detects

- ASCII, UTF-8, UTF-16 (2 variants), UTF-32 (4 variants)
- Big5, GB2312, EUC-TW, HZ-GB-2312, ISO-2022-CN (Traditional and Simplified Chinese)
- EUC-JP, SHIFT_JIS, CP932, ISO-2022-JP (Japanese)
- EUC-KR, ISO-2022-KR (Korean)
- KOI8-R, MacCyrillic, IBM855, IBM866, ISO-8859-5, windows-1251 (Cyrillic)
- ISO-8859-5, windows-1251 (Bulgarian)
- ISO-8859-1, windows-1252 (Western European languages)
- ISO-8859-7, windows-1253 (Greek)
- ISO-8859-8, windows-1255 (Visual and Logical Hebrew)
- TIS-620 (Thai)

.. note::
   Our ISO-8859-2 and windows-1250 (Hungarian) probers have been temporarily
   disabled until we can retrain the models.

Requires Python 2.6, 2.7, or 3.3+.

Installation
------------

Install from `PyPI <https://pypi.python.org/pypi/chardet>`_::

    pip install chardet

Documentation
-------------

User documentation is available at https://chardet.readthedocs.io/.

Command-line Tool
-----------------

chardet comes with a command-line script which reports on the encodings of one
or more files::

    % chardetect somefile someotherfile
    somefile: windows-1252 with confidence 0.5
    someotherfile: ascii with confidence 1.0

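
The same detection is available from Python through ``chardet.detect()``, which
takes a byte string and returns a dictionary with ``encoding`` and
``confidence`` keys. The following is a minimal sketch, not part of chardet
itself: the helper names ``format_report`` and ``detect_file`` are hypothetical,
and the report format simply mirrors the sample output above.

```python
import sys


def format_report(name, result):
    """Format a chardet.detect()-style dict like the sample chardetect output.

    Hypothetical helper: `result` is assumed to have "encoding" and
    "confidence" keys, as returned by chardet.detect().
    """
    return "{}: {} with confidence {}".format(
        name, result["encoding"], result["confidence"]
    )


def detect_file(path):
    """Read a file as raw bytes and let chardet guess its encoding.

    Hypothetical helper; assumes chardet is installed (pip install chardet).
    """
    import chardet  # imported here so format_report works without chardet

    with open(path, "rb") as handle:  # chardet works on bytes, not text
        return chardet.detect(handle.read())


if __name__ == "__main__":
    # Report on every file named on the command line.
    for path in sys.argv[1:]:
        print(format_report(path, detect_file(path)))
```

Reading the whole file at once keeps the sketch short. For large inputs, the
chardet documentation describes an incremental interface that may be a better
fit.
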
About
-----

This is a continuation of Mark Pilgrim's excellent chardet. Previously, two
versions needed to be maintained: one that supported Python 2.x and one that
supported Python 3.x. We've recently merged with `Ian Cordasco <https://github.com/sigmavirus24>`_'s
`charade <https://github.com/sigmavirus24/charade>`_ fork, so now we have one
coherent version that works for Python 2.6+.

:maintainer: Dan Blanchard