Automated pipeline for parsing profiles of politically exposed persons (PEP) into Wikidata

# Settings for the PEP crawler per country to crawl
# Follow the syntax
nicaragua:
  memberList:
    link: http://legislacion.asamblea.gob.ni/Tablas%20Generales.nsf/Main.xsp
    parent: [html, body, form, table, tbody]
    child-name: [html, body, form, table, tbody, tr, td, table, tbody, tr, td.null, a.text]
    child-link: [html, body, form, table, tbody, tr, td, table, tbody, tr, td.null, a.href]
  member:
    info-1:
      parent: [html, body, form, table, tbody]
      child-name: [html, body, form, table, tbody, tr.0, td.1, span]
      child-image: [html, body, form, table, tbody, tr.1, td.0, span, img]
      child-role: [html, body, form, table, tbody, tr.1, td.2, span + label.1]
      child-politicalParty: [html, body, form, table, tbody, tr.4, td, span]
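
Each path entry in the settings above appears to be an element name with an optional dot suffix (for example tr.0, td.null, a.href). This file does not say how the crawler interprets those suffixes, so the following Python sketch only splits each entry at its first dot and makes no claim about what the suffix means. The names Step, parse_step and parse_path are hypothetical and are not taken from the crawler's code.

```python
from typing import List, NamedTuple, Optional


class Step(NamedTuple):
    tag: str                  # text before the first dot, e.g. "tr"
    qualifier: Optional[str]  # text after the first dot, e.g. "0", "null", "href"


def parse_step(raw: str) -> Step:
    """Split one path entry such as 'tr.0' or 'a.href' at its first dot."""
    tag, dot, qualifier = raw.strip().partition(".")
    # Entries without a dot (e.g. "html") carry no qualifier.
    return Step(tag, qualifier if dot else None)


def parse_path(entries: List[str]) -> List[Step]:
    """Parse a whole path list as it appears in the settings."""
    return [parse_step(entry) for entry in entries]


# The child-link path of memberList, copied from the settings above.
child_link = ["html", "body", "form", "table", "tbody", "tr", "td",
              "table", "tbody", "tr", "td.null", "a.href"]

for step in parse_path(child_link):
    print(step.tag, step.qualifier)
```

Keeping the qualifier as an uninterpreted string leaves the choice between an index, an attribute name, or a placeholder such as null to whatever code actually walks the HTML tree.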