alpcentaur
|
a0075e429d
|
added a further database in config.yaml, added new exception for downloading JS-generated HTML pages
|
2023-11-27 15:10:11 +00:00 |
|
alpcentaur
|
df4a8289b8
|
added PDF parser for entry links that point directly to a PDF
|
2023-11-22 17:03:15 +00:00 |
|
alpcentaur
|
14ece9bceb
|
added functions for uniform and non-uniform entry endpoints; non-uniform endpoints are generally parsed as text from any XML paragraph element (p)
|
2023-11-20 15:28:04 +00:00 |
|
alpcentaur
|
42841ee650
|
added some exceptions for bad encoding and get errors
|
2023-11-14 14:38:45 +00:00 |
|
alpcentaur
|
ff23c22e3c
|
added working bund.de-bekanntmachungen config with new example of xpath contains
|
2023-11-13 16:44:11 +00:00 |
|
alpcentaur
|
c078ee4b1b
|
first function works, actual XML parser still has problems with certain XML types
|
2023-11-06 19:17:45 +00:00 |
|
alpcentaur
|
59838bb8e1
|
added main.py importing and using the spider functions
|
2023-11-02 10:54:16 +00:00 |
|