Automated pipeline for parsing profiles of politically exposed persons (PEP) into Wikidata

# Per-country settings for the PEP crawler.
# Each country entry follows the syntax used below.
nicaragua:
  memberList:
    link: http://legislacion.asamblea.gob.ni/Tablas%20Generales.nsf/Main.xsp
    parent: [html, body, form, table, tbody]
    child-name: [html, body, form, table, tbody, tr, td, table, tbody, tr, td.null, a.text]
    child-link: [html, body, form, table, tbody, tr, td, table, tbody, tr, td.null, a.href]
  member:
    info-1:
      parent: [html, body, form, table, tbody]
      child-name: [html, body, form, table, tbody, tr.0, td.1, span]
      child-image: [html, body, form, table, tbody, tr.1, td.0, span, img]
      child-role: [html, body, form, table, tbody, tr.1, td.2, span + label.1]
      child-politicalParty: [html, body, form, table, tbody, tr.4, td, span]
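
The selector lists above could be read as paths of HTML tags, where each step may carry a dot-separated qualifier (`tr.0`, `a.href`, `td.null`) and may combine several tags with `+`. The Python sketch below only splits a step into its parts. The helper name `parse_step` is hypothetical and not part of the crawler, and what each qualifier means (an element index, an attribute name, and so on) is an assumption that this config does not confirm.

```python
from typing import List, Optional, Tuple, Union

# A qualifier is absent (None), numeric (int), or a plain name (str).
Qualifier = Optional[Union[int, str]]


def parse_step(step: str) -> List[Tuple[str, Qualifier]]:
    """Split one selector step into (tag, qualifier) pairs.

    Hypothetical helper: it only tokenizes the step. How the crawler
    actually interprets each qualifier is an assumption, not confirmed.
    """
    parts: List[Tuple[str, Qualifier]] = []
    for piece in step.split("+"):
        # Everything after the first dot is treated as the qualifier.
        tag, _, qual = piece.strip().partition(".")
        if qual == "":
            parts.append((tag, None))
        elif qual.isdigit():
            parts.append((tag, int(qual)))
        else:
            parts.append((tag, qual))
    return parts


# The child-role path from the nicaragua entry, tokenized step by step.
child_role = ["html", "body", "form", "table", "tbody", "tr.1", "td.2", "span + label.1"]
parsed = [parse_step(step) for step in child_role]
```

Keeping tokenization separate from lookup means the same parsed steps could be handed to any HTML library; values such as `null` are passed through as plain strings rather than guessed at.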