Metadata-Version: 2.1
Name: beautifulsoup4
Version: 4.6.3
Summary: Screen-scraping library
Home-page: http://www.crummy.com/software/BeautifulSoup/bs4/
Author: Leonard Richardson
Author-email: leonardr@segfault.org
License: MIT
Download-URL: http://www.crummy.com/software/BeautifulSoup/bs4/download/
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Text Processing :: Markup :: HTML
Classifier: Topic :: Text Processing :: Markup :: XML
Classifier: Topic :: Text Processing :: Markup :: SGML
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Description-Content-Type: text/markdown
Provides-Extra: lxml
Provides-Extra: html5lib
Provides-Extra: html5lib
Requires-Dist: html5lib; extra == 'html5lib'
Provides-Extra: lxml
Requires-Dist: lxml; extra == 'lxml'

Beautiful Soup is a library that makes it easy to scrape information
from web pages. It sits atop an HTML or XML parser, providing Pythonic
idioms for iterating, searching, and modifying the parse tree.

# Quick start

```
>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup("<p>Some<b>bad<i>HTML")
>>> print(soup.prettify())
<html>
 <body>
  <p>
   Some
   <b>
    bad
    <i>
     HTML
    </i>
   </b>
  </p>
 </body>
</html>
>>> soup.find(text="bad")
'bad'
>>> soup.i
<i>HTML</i>
>>> soup = BeautifulSoup("<tag1>Some<tag2/>bad<tag3>XML", "xml")
>>> print(soup.prettify())
<?xml version="1.0" encoding="utf-8"?>
<tag1>
 Some
 <tag2/>
 bad
 <tag3>
  XML
 </tag3>
</tag1>
```
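
The `<html>` and `<body>` wrappers in the first output above are consistent with the lxml parser; when no parser is named, Beautiful Soup uses whichever one it finds installed, so results can differ between machines. A minimal sketch of choosing a parser explicitly follows. It assumes a hypothetical helper, `make_soup`, which is not part of bs4 and which falls back to the standard library's `html.parser` when lxml is missing:

```python
from bs4 import BeautifulSoup, FeatureNotFound


def make_soup(markup, parsers=("lxml", "html.parser")):
    """Hypothetical helper: parse with the first parser that is installed."""
    for name in parsers:
        try:
            return BeautifulSoup(markup, name)
        except FeatureNotFound:
            # bs4 raises FeatureNotFound when the named parser is unavailable;
            # move on to the next candidate.
            continue
    raise FeatureNotFound(
        "none of the requested parsers is available: %r" % (parsers,)
    )


soup = make_soup("<p>Some<b>bad<i>HTML")
print(soup.i)  # <i>HTML</i>
```

With `html.parser` the tree contains no added `<html>` or `<body>` wrapper, so code that depends on those elements should name the parser it expects. Installing with `pip install beautifulsoup4[lxml]` pulls in lxml through the `lxml` extra declared in the package metadata.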

To go beyond the basics, [comprehensive documentation is available](http://www.crummy.com/software/BeautifulSoup/bs4/doc/).
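
As one small step past the quick start, the sketch below touches each of the three idioms named in the introduction: searching, iterating, and modifying the parse tree. The markup and variable names are made up for illustration, and `html.parser` is named explicitly so that the example does not depend on an optional parser being installed:

```python
from bs4 import BeautifulSoup

# Illustrative markup, not taken from any real page.
html = """
<ul id="links">
  <li><a href="/a">Alpha</a></li>
  <li><a href="/b">Beta</a></li>
</ul>
"""
soup = BeautifulSoup(html, "html.parser")

# Searching: collect the href of every <a> tag in the document.
hrefs = [a["href"] for a in soup.find_all("a")]

# Iterating: walk the <li> items under the first <ul> and read their text.
texts = [li.get_text() for li in soup.ul.find_all("li")]

# Modifying: rewrite an attribute and the text of one tag in place.
first = soup.find("a", href="/a")
first["href"] = "/alpha"
first.string = "ALPHA"

print(hrefs)  # ['/a', '/b']
print(texts)  # ['Alpha', 'Beta']
print(first)  # <a href="/alpha">ALPHA</a>
```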

# Links

* [Homepage](http://www.crummy.com/software/BeautifulSoup/bs4/)
* [Documentation](http://www.crummy.com/software/BeautifulSoup/bs4/doc/)
* [Discussion group](http://groups.google.com/group/beautifulsoup/)
* [Development](https://code.launchpad.net/beautifulsoup/)
* [Bug tracker](https://bugs.launchpad.net/beautifulsoup/)
* [Complete changelog](https://bazaar.launchpad.net/~leonardr/beautifulsoup/bs4/view/head:/NEWS.txt)

# Building the documentation

The bs4/doc/ directory contains full documentation in Sphinx
format. Run `make html` in that directory to create HTML
documentation.

# Running the unit tests

Beautiful Soup supports unit test discovery from the project root directory:

```
$ nosetests
```

```
$ python -m unittest discover -s bs4 # Python 2.7 and up
```

If you checked out the source tree, you should see a script in the
top-level directory called test-all-versions. This script will run the
unit tests under Python 2.7, then create a temporary Python 3 conversion
of the source and run the unit tests again under Python 3.