| File | Last commit message | Date |
| --- | --- | --- |
| spiders | added functions for uniform and not uniform entry end points - non uniform endpoints are generally parsed as text from any paragraph xml element p | 2023-11-20 15:28:04 +00:00 |
| .gitignore | first function works, actuall xml parser has still problems with certain xml types | 2023-11-06 19:19:31 +00:00 |
| main.py | added functions for uniform and not uniform entry end points - non uniform endpoints are generally parsed as text from any paragraph xml element p | 2023-11-20 15:28:04 +00:00 |
| README.md | Update 'README.md' | 2023-11-20 16:37:27 +01:00 |
| requirements.txt | specifying the links, new exception clause if soupparser does not work | 2023-11-07 14:55:05 +00:00 |

  __     _ _                     _     _
 / _| __| | |__        ___ _ __ (_) __| | ___ _ __
| |_ / _` | '_ \ _____/ __| '_ \| |/ _` |/ _ | '__|
|  _| (_| | |_) |_____\__ | |_) | | (_| |  __| |
|_|  \__,_|_.__/      |___| .__/|_|\__,_|\___|_|
                          |_|

Configure fdb-spider in a YAML file. Spider multi-page databases of links. Filter the content and serialize it to JSON.

Filter either with XPath expressions or with the help of artificial neural networks.
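
To illustrate the XPath route, here is a minimal sketch of filtering a page with an XPath expression and serializing the result to JSON. It is not fdb-spider's own code: the sample page, the `filter_entries` helper, and the output keys (`name`, `link`, `text`) are hypothetical. It uses only the Python standard library, whose `xml.etree.ElementTree` supports a limited XPath subset and needs well-formed markup. The requirements history mentions soupparser, so the project itself may rely on a different parser.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample page; a real spider would download this markup.
SAMPLE_PAGE = """
<html>
  <body>
    <div class="entry">
      <a href="/entry/1">First entry</a>
      <p>Short description of the first entry.</p>
    </div>
    <div class="entry">
      <a href="/entry/2">Second entry</a>
      <p>Short description of the second entry.</p>
    </div>
    <div class="footer">
      <a href="/imprint">Imprint</a>
    </div>
  </body>
</html>
"""


def filter_entries(page, entry_xpath):
    """Return one dict per element matched by entry_xpath (hypothetical helper)."""
    root = ET.fromstring(page)
    entries = []
    for node in root.findall(entry_xpath):
        link = node.find("a")  # first direct <a> child, if any
        text = node.find("p")  # first direct <p> child, if any
        entries.append({
            "name": link.text if link is not None else None,
            "link": link.get("href") if link is not None else None,
            "text": text.text if text is not None else None,
        })
    return entries


# Keep only the <div class="entry"> blocks; the footer link is filtered out.
entries = filter_entries(SAMPLE_PAGE, ".//div[@class='entry']")

# Serialize the filtered content to JSON.
serialized = json.dumps(entries, indent=2)
print(serialized)
```

In a YAML-configured setup, an expression such as `.//div[@class='entry']` would presumably come from the configuration file rather than being hard-coded.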