alpcentaur | 42841ee650 | added some exceptions for bad encoding and get errors | 2023-11-14 14:38:45 +00:00
alpcentaur | 317ef99720 | changed code in entrylist data2dictionary to handle empty or missing xml elements | 2023-11-14 10:22:26 +00:00
alpcentaur | ff23c22e3c | added working bund.de-bekanntmachungen config with new example of xpath contains | 2023-11-13 16:44:11 +00:00
alpcentaur | 06fa81e549 | added function find config parameter and changed core spider | 2023-11-10 01:12:49 +00:00
alpcentaur | a846ce04cc | specifying the links, new exception clause if soupparser does not work | 2023-11-07 14:55:05 +00:00
alpcentaur | a99881796a | first function works, actual xml parser still has problems with certain xml types | 2023-11-06 19:19:31 +00:00
alpcentaur | c078ee4b1b | first function works, actual xml parser still has problems with certain xml types | 2023-11-06 19:17:45 +00:00
alpcentaur | 8b20bc178f | added multi pages configuration and code | 2023-11-06 18:17:32 +00:00
alpcentaur | 7aa903883b | update to config.yaml | 2023-11-03 12:23:04 +00:00
alpcentaur | 59838bb8e1 | added main.py importing and using the spider functions | 2023-11-02 10:54:16 +00:00
alpcentaur | 5ac07d151a | added first config.yaml template and started creating folder structure | 2023-10-31 17:41:44 +00:00
alpcentaur | b3011efc73 | small change of naming in error message added | 2023-10-30 16:43:32 +00:00
alpcentaur | 687d40f156 | first change of naming, first commit for the actual spider based on importPEP | 2023-10-30 16:41:14 +00:00
alpcentaur | 8783251133 | first commit | 2023-10-30 14:35:32 +00:00