72 Commits (master)
 

Author SHA1 Message Date
  alpcentaur 2aa1134b48 updated gitignore 6 months ago
  alpcentaur a9c2346c04 first change to also click the English Accept button if it appears, for js spidering functionality 6 months ago
  alpcentaur 0808e5a42d main.py and config.yaml are left out from updates, only examples are provided. Change in Readme too 6 months ago
  alpcentaur 4ec9f76080 added xorg-server-xephyr as dep to install 6 months ago
  alpcentaur 10cdab6f60 updated README with new and working install order 6 months ago
  alpcentaur ccfe20044f added another tip to README.md, header for display and another tip added too 6 months ago
  alpcentaur 0fa420d74c added explanation of display variable in the spiders code 6 months ago
  alpcentaur 0d7728240e update var javascriptlink in README.md 6 months ago
  alpcentaur c52ea0cf0a added example1 for js configuration in README.md 6 months ago
  alpcentaur 5000dca314 Update README.md with better explanation how to js spider 6 months ago
  alpcentaur 0908ccf6e5 clarifications for javascript link and js link plus js iteration 6 months ago
  alpcentaur ff0fe5193d fixed the links for the clickable content summary 6 months ago
  alpcentaur 49d5c2ffa9 third try ordering 6 months ago
  alpcentaur f489106ea0 second try ordering 6 months ago
  alpcentaur 32fceffd01 searchable headers for step by step guide started 6 months ago
  alpcentaur eca77f9b63 Step by Step Guide continuation of describing the variables 6 months ago
  alpcentaur 483eaec26e changed domain for new configuration dtvp 6 months ago
  alpcentaur c33dbc37e6 Merge remote-tracking branch 'refs/remotes/origin/master' 6 months ago
  alpcentaur a07d2e93f6 changes for new database dtvp, new exceptions trying to click away cookie pop ups 6 months ago
  alpcentaur d284fef015 changes for new database dtvp, new exceptions trying to click away cookie pop ups 6 months ago
  alpcentaur 5fd6b7f781 Part 2 of Step by Step Guide 6 months ago
  alpcentaur e4fa13d29d Start of Step by Step Guide 6 months ago
  alpcentaur 7ba196b0c2 changed size of virtual window, added some scrolling and shortened the time for js lazy loading enforced slow downloading 6 months ago
  alpcentaur a56569712e another small change to config.yaml before pushing 6 months ago
  alpcentaur a0dd469f25 added new database ted.europe.eu, created new case of slow downloading, integrated scrolling into entrylistpagesdownload 7 months ago
  alpcentaur 094f092291 deleted fdb entry that was a ghost for syntax reasons, but same syntax should be in other fdb anyway 7 months ago
  alpcentaur d7d157bf42 added further documentation to README.md 7 months ago
  alpcentaur 0500f5853d full working example from localhost 7 months ago
  alpcentaur 0411d74936 deleted config.yaml.save 7 months ago
  alpcentaur cf3bb52684 corrected link glueing for pdf links for loop 7 months ago
  alpcentaur af8374f715 added other exception for unitrue var text not being found, before saving index 0 to variable produced error to whole execution 8 months ago
  alpcentaur 20db0028e1 added first changes to fix js related bug for giz db 8 months ago
  alpcentaur dec60f9bf5 added changed logic for link addition regarding entry links 8 months ago
  alpcentaur 5d17f4e421 corrected error which arose in logic of wget backup get 8 months ago
  alpcentaur 92c238a2ed added instruction for downloading chromium driver for python selenium to README.md 8 months ago
  alpcentaur ece5cf1301 added better logic for getting the right link of entry 8 months ago
  alpcentaur 0e58756600 added last resort exception for entry page downloading with wget, also implemented some further logic regarding getting the right links 8 months ago
  alpcentaur 16199256e3 javascript on highest level done better 8 months ago
  alpcentaur 5627c80177 merged onlinkgen with master, and added more universal chrome driver initialization to the beginning of the javascript entries gothrough function in download_entry_list_pages_of_funding_databases() 8 months ago
  alpcentaur 14b8db7941 started adding javascript handling on highest spider level 8 months ago
  alpcentaur fbee5d6229 last commit in detached head 8 months ago
  alpcentaur 953f85ee5b added new lines to chromedriver, to make it work on other systems 8 months ago
  alpcentaur d2324d265a added pdf child text downloading and parse to json exceptions/cases for javascript entry data and normal data 9 months ago
  alpcentaur 885c210971 added selenium for pop up entry links 9 months ago
  alpcentaur ec180bed0a added flow for selenium grabbing popup instead of links for entries 9 months ago
  alpcentaur b4fd385c5d did some changes to main.py for using sys.argv 9 months ago
  alpcentaur 99c74dcbad updated requirements.txt 9 months ago
  alpcentaur 54daad8dfa started sys arguments for main.py, to be able to control spider from interface 9 months ago
  alpcentaur 89dcca2031 added further handling for javascript links not being urls, made config for giz work 9 months ago
  alpcentaur a0075e429d added further database in config.yaml, added new exception for downloading js generated html pages 9 months ago