alpcentaur
|
a56569712e
|
another small change to config.yaml before pushing
|
9 months ago |
alpcentaur
|
a0dd469f25
|
added new database ted.europe.eu, created new case of slow downloading, integrated scrolling into entrylistpagesdownload
|
9 months ago |
alpcentaur
|
094f092291
|
deleted fdb entry that was a ghost for syntax reasons, but same syntax should be in other fdb anyway
|
9 months ago |
alpcentaur
|
0500f5853d
|
full working example from localhost
|
9 months ago |
alpcentaur
|
cf3bb52684
|
corrected link glueing for pdf links for loop
|
9 months ago |
alpcentaur
|
af8374f715
|
added another exception for unitrue var text not being found; previously, saving index 0 to the variable produced an error that stopped the whole execution
|
10 months ago |
alpcentaur
|
20db0028e1
|
added first changes to fix js related bug for giz db
|
10 months ago |
alpcentaur
|
dec60f9bf5
|
added changed logic for link addition regarding entry links
|
10 months ago |
alpcentaur
|
ece5cf1301
|
added better logic for getting the right link of entry
|
10 months ago |
alpcentaur
|
16199256e3
|
javascript on highest level done better
|
11 months ago |
alpcentaur
|
14b8db7941
|
started adding javascript handling on highest spider level
|
11 months ago |
alpcentaur
|
953f85ee5b
|
added new lines to chromedriver, to make it work on other systems
|
11 months ago |
alpcentaur
|
d2324d265a
|
added pdf child text downloading and parse to json exceptions/cases for javascript entry data and normal data
|
11 months ago |
alpcentaur
|
ec180bed0a
|
added flow for selenium grabbing popup instead of links for entries
|
11 months ago |
alpcentaur
|
b4fd385c5d
|
did some changes to main.py for using sys.argv
|
11 months ago |
alpcentaur
|
89dcca2031
|
added further handling for javascript links not being urls, made config for giz work
|
11 months ago |
alpcentaur
|
a0075e429d
|
added further database in config.yaml, added new exception for downloading js generated html pages
|
11 months ago |
alpcentaur
|
b2cf4b67ce
|
added first config parameters for search on not uniform entries
|
11 months ago |
alpcentaur
|
ff23c22e3c
|
added working bund.de-bekanntmachungen config with new example of xpath contains
|
1 year ago |
alpcentaur
|
06fa81e549
|
added function find config parameter and changed core spider
|
1 year ago |
alpcentaur
|
a846ce04cc
|
specifying the links, new exception clause if soupparser does not work
|
1 year ago |
alpcentaur
|
c078ee4b1b
|
first function works, actual xml parser still has problems with certain xml types
|
1 year ago |
alpcentaur
|
8b20bc178f
|
added multi pages configuration and code
|
1 year ago |
alpcentaur
|
7aa903883b
|
update to config.yaml
|
1 year ago |
alpcentaur
|
5ac07d151a
|
added first config.yaml template and started creating folder structure
|
1 year ago |