Metadata-Version: 2.1
Name: idna
Version: 2.7
Summary: Internationalized Domain Names in Applications (IDNA)
Home-page: https://github.com/kjd/idna
Author: Kim Davies
Author-email: kim@cynosure.com.au
License: BSD-like
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: System Administrators
Classifier: License :: OSI Approved :: BSD License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 2.6
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Topic :: Internet :: Name Service (DNS)
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Utilities

Internationalized Domain Names in Applications (IDNA)
=====================================================

Support for the Internationalized Domain Names in Applications
(IDNA) protocol as specified in `RFC 5891 <http://tools.ietf.org/html/rfc5891>`_.
This is the latest version of the protocol and is sometimes referred to as
“IDNA 2008”.

This library also provides support for Unicode Technical Standard 46,
`Unicode IDNA Compatibility Processing <http://unicode.org/reports/tr46/>`_.

This library is a suitable replacement for the “encodings.idna” module that
comes with the Python standard library, which only supports the
old, deprecated IDNA specification (`RFC 3490 <http://tools.ietf.org/html/rfc3490>`_).

Basic functions are simply executed:

.. code-block:: pycon

    # Python 3
    >>> import idna
    >>> idna.encode('ドメイン.テスト')
    b'xn--eckwd4c7c.xn--zckzah'
    >>> print(idna.decode('xn--eckwd4c7c.xn--zckzah'))
    ドメイン.テスト

    # Python 2
    >>> import idna
    >>> idna.encode(u'ドメイン.テスト')
    'xn--eckwd4c7c.xn--zckzah'
    >>> print idna.decode('xn--eckwd4c7c.xn--zckzah')
    ドメイン.テスト

Packages
--------

The latest tagged release version is published in the PyPI repository:

.. image:: https://badge.fury.io/py/idna.svg
   :target: http://badge.fury.io/py/idna

Installation
------------

To install this library, you can use pip:

.. code-block:: bash

    $ pip install idna

Alternatively, you can install the package using the bundled setup script:

.. code-block:: bash

    $ python setup.py install

This library works with Python 2.6 or later, and Python 3.3 or later.

Usage
-----

For typical usage, the ``encode`` and ``decode`` functions will take a domain
name argument and perform a conversion to A-labels or U-labels respectively.

.. code-block:: pycon

    # Python 3
    >>> import idna
    >>> idna.encode('ドメイン.テスト')
    b'xn--eckwd4c7c.xn--zckzah'
    >>> print(idna.decode('xn--eckwd4c7c.xn--zckzah'))
    ドメイン.テスト

You can also use the codec encoding and decoding methods by importing the
``idna.codec`` module:

.. code-block:: pycon

    # Python 2
    >>> import idna.codec
    >>> print u'домена.испытание'.encode('idna')
    xn--80ahd1agd.xn--80akhbyknj4f
    >>> print 'xn--80ahd1agd.xn--80akhbyknj4f'.decode('idna')
    домена.испытание

Conversions can be applied on a per-label basis using the ``ulabel`` or ``alabel``
functions if necessary:

.. code-block:: pycon

    # Python 2
    >>> idna.alabel(u'测试')
    'xn--0zwm56d'

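To make the A-label form concrete, the sketch below builds one by hand with the
standard library's ``punycode`` codec: an A-label is the ``xn--`` prefix followed
by the Punycode encoding of the label. The helper names ``alabel_sketch`` and
``encode_per_label`` are hypothetical and not part of this library. The sketch
deliberately skips the IDNA 2008 validity checks that ``idna.alabel`` performs, so
treat it as an illustration of the output format rather than a replacement.

```python
# Python 3, standard library only.

def alabel_sketch(label):
    """Return a hand-built A-label for one label (no IDNA validation)."""
    try:
        # Pure-ASCII labels are passed through unchanged.
        return label.encode("ascii")
    except UnicodeEncodeError:
        # Non-ASCII labels become the ACE prefix plus their Punycode form.
        return b"xn--" + label.encode("punycode")


def encode_per_label(domain):
    """Split on dots, convert each label, and join the results again."""
    return b".".join(alabel_sketch(label) for label in domain.split("."))


print(alabel_sketch("\u6d4b\u8bd5"))  # the label '测试' -> b'xn--0zwm56d'
print(encode_per_label("ドメイン.テスト"))  # b'xn--eckwd4c7c.xn--zckzah'
```

Working label by label like this is mainly useful when only part of a name needs
converting, or when a failure should be attributed to one specific label.
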
Compatibility Mapping (UTS #46)
+++++++++++++++++++++++++++++++

As described in `RFC 5895 <http://tools.ietf.org/html/rfc5895>`_, the IDNA
specification no longer normalizes input from different potential ways a user
may input a domain name. This functionality, known as a “mapping”, is now
considered by the specification to be a local user-interface issue distinct
from IDNA conversion functionality.

This library provides one such mapping, which was developed by the Unicode
Consortium. Known as `Unicode IDNA Compatibility Processing <http://unicode.org/reports/tr46/>`_,
it provides both a regular mapping for typical applications and
a transitional mapping to help migrate from older IDNA 2003 applications.

For example, “Königsgäßchen” is not a permissible label as *LATIN CAPITAL
LETTER K* is not allowed (nor are capital letters in general). UTS 46 will
convert this into lower case prior to applying the IDNA conversion.

.. code-block:: pycon

    # Python 3
    >>> import idna
    >>> idna.encode(u'Königsgäßchen')
    ...
    idna.core.InvalidCodepoint: Codepoint U+004B at position 1 of 'Königsgäßchen' not allowed
    >>> idna.encode('Königsgäßchen', uts46=True)
    b'xn--knigsgchen-b4a3dun'
    >>> print(idna.decode('xn--knigsgchen-b4a3dun'))
    königsgäßchen

Transitional processing provides conversions to help transition from the older
2003 standard to the current standard. For example, in the original IDNA
specification, the *LATIN SMALL LETTER SHARP S* (ß) was converted into two
*LATIN SMALL LETTER S* (ss), whereas in the current IDNA specification this
conversion is not performed.

.. code-block:: pycon

    # Python 2
    >>> idna.encode(u'Königsgäßchen', uts46=True, transitional=True)
    'xn--knigsgsschen-lcb0w'

Implementors should use transitional processing with caution, only in rare
cases where conversion from legacy labels to current labels must be performed
(i.e. IDNA implementations that pre-date 2008). For typical applications
that just need to convert labels, transitional processing is unlikely to be
beneficial and could produce unexpected incompatible results.

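The difference between the two standards can be reproduced with the standard
library alone. The sketch below uses Python's built-in ``idna`` codec, which
implements the older RFC 3490 rules, as a stand-in for legacy behaviour, and
compares it with a hand-built label that keeps the ß. It is an illustration under
those assumptions, not this library's transitional implementation, and the
variable names are arbitrary.

```python
# Python 3, standard library only (the idna package is NOT imported here,
# so the "idna" codec below is the built-in IDNA 2003 one).

name = "K\u00f6nigsg\u00e4\u00dfchen"  # 'Königsgäßchen'

# IDNA 2003: nameprep case-folds the label, which turns 'ß' into 'ss'.
legacy = name.encode("idna")

# Lowercasing with str.lower() keeps 'ß'; adding the ACE prefix to the
# Punycode form gives the label that IDNA 2008 would produce.
current = b"xn--" + name.lower().encode("punycode")

print(legacy)   # b'xn--knigsgsschen-lcb0w'
print(current)  # b'xn--knigsgchen-b4a3dun'
```

The two results are different domain names as far as the DNS is concerned, which
is why mixing the two conventions can produce the incompatible results mentioned
above.
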
``encodings.idna`` Compatibility
++++++++++++++++++++++++++++++++

Function calls from the Python built-in ``encodings.idna`` module are
mapped to their IDNA 2008 equivalents using the ``idna.compat`` module.
Simply change the ``import`` statement in your code to refer to the
new module name.

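A minimal sketch of that substitution follows. It assumes the calling code uses
the ``ToASCII`` and ``ToUnicode`` functions from ``encodings.idna``. The fallback
to the standard library when this package is not installed is an addition made
for illustration and is not something the library prescribes.

```python
# Python 3.
try:
    # IDNA 2008 equivalents provided by this library.
    from idna.compat import ToASCII, ToUnicode
except ImportError:
    # Assumed fallback: the built-in IDNA 2003 functions of the same names.
    from encodings.idna import ToASCII, ToUnicode

# Both functions operate on a single label, as in encodings.idna.
print(ToASCII("example"))
print(ToUnicode("xn--0zwm56d"))
```

A silent fallback like this changes which specification is applied, so code that
depends on IDNA 2008 behaviour may prefer to let the ``ImportError`` propagate.
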
Exceptions
----------

All errors encountered during conversion should raise an exception derived
from the ``idna.IDNAError`` base class.

More specific exceptions may be raised: ``idna.IDNABidiError``
when the error reflects an illegal combination of left-to-right and right-to-left
characters in a label; ``idna.InvalidCodepoint`` when a specific codepoint is
an illegal character in an IDN label (i.e. INVALID); and ``idna.InvalidCodepointContext``
when the codepoint is illegal based on its positional context (i.e. it is CONTEXTO
or CONTEXTJ but the contextual requirements are not satisfied).

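One way to use this hierarchy is to catch the specific exceptions first and the
base class last. The sketch below assumes the four exception names listed above
are importable from the top-level ``idna`` package. The function name
``classify_failure`` and the strings it returns are hypothetical, and the guarded
import only exists so the snippet still loads where the package is absent.

```python
# Python 3.
try:
    import idna
except ImportError:
    idna = None  # keeps the sketch importable without the package


def classify_failure(domain):
    """Return None if the domain encodes cleanly, else a short reason."""
    if idna is None:
        return "idna not installed"
    try:
        idna.encode(domain)
    except idna.IDNABidiError:
        return "bidi"       # illegal mix of LTR and RTL characters
    except idna.InvalidCodepointContext:
        return "context"    # CONTEXTJ/CONTEXTO rule not satisfied
    except idna.InvalidCodepoint:
        return "codepoint"  # character is not allowed in a label
    except idna.IDNAError:
        return "other"      # any other conversion error
    return None


# Capital letters are not permitted, so this input is expected to fail.
print(classify_failure("K\u00f6nigsg\u00e4\u00dfchen"))
```

Catching ``idna.IDNAError`` alone is enough when the caller only needs to know
whether conversion succeeded.
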
Building and Diagnostics
------------------------

The IDNA and UTS 46 functionality relies upon pre-calculated lookup tables for
performance. These tables are derived by evaluating codepoints against the
eligibility criteria in the respective standards, and are generated using the
command-line script ``tools/idna-data``.

This tool will fetch relevant tables from the Unicode Consortium and perform the
required calculations to identify eligibility. It has three main modes:

* ``idna-data make-libdata``. Generates ``idnadata.py`` and ``uts46data.py``,
  the pre-calculated lookup tables used for IDNA and UTS 46 conversions. Implementors
  who wish to track this library against a different Unicode version may use this tool
  to manually generate a different version of the ``idnadata.py`` and ``uts46data.py``
  files.

* ``idna-data make-table``. Generates a table of the IDNA disposition
  (e.g. PVALID, CONTEXTJ, CONTEXTO) in the format found in Appendix B.1 of RFC
  5892 and the pre-computed tables published by `IANA <http://iana.org/>`_.

* ``idna-data U+0061``. Prints debugging output on the various properties
  associated with an individual Unicode codepoint (in this case, U+0061) that are
  used to assess the IDNA and UTS 46 status of a codepoint. This is helpful in debugging
  or analysis.

The tool accepts a number of arguments, described using ``idna-data -h``. Most notably,
the ``--version`` argument allows the specification of the version of Unicode to use
in computing the table data. For example, ``idna-data --version 9.0.0 make-libdata``
will generate library data against Unicode 9.0.0.

Note that this script requires Python 3, but all generated library data will work
in Python 2.6+.

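For a rough feel of the kind of per-codepoint properties involved in the
single-codepoint mode described above, the following sketch looks up a few of
them with the standard ``unicodedata`` module. It is not the tool's
implementation and does not reproduce its output. The function name
``codepoint_properties`` and the choice of properties are assumptions made for
illustration, and the values reflect the Unicode version bundled with the running
Python rather than one selected with ``--version``.

```python
# Python 3, standard library only.
import unicodedata


def codepoint_properties(notation):
    """Parse 'U+0061'-style notation and return a few Unicode properties."""
    if not notation.upper().startswith("U+"):
        raise ValueError("expected notation such as U+0061")
    char = chr(int(notation[2:], 16))
    return {
        "char": char,
        "name": unicodedata.name(char, "<unnamed>"),
        "category": unicodedata.category(char),         # e.g. 'Ll'
        "bidirectional": unicodedata.bidirectional(char),
        "combining": unicodedata.combining(char),
    }


props = codepoint_properties("U+0061")
print(props["name"])      # LATIN SMALL LETTER A
print(props["category"])  # Ll
```
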
Testing
-------

The library has a test suite based on each rule of the IDNA specification, as
well as tests that are provided as part of the Unicode Technical Standard 46,
`Unicode IDNA Compatibility Processing <http://unicode.org/reports/tr46/>`_.

The tests are run automatically on each commit at Travis CI:

.. image:: https://travis-ci.org/kjd/idna.svg?branch=master
   :target: https://travis-ci.org/kjd/idna