Start of Step by Step Guide

Oi
2024-02-28 17:17:27 +01:00 · e4fa13d29d
commit e4fa13d29d
parent 7ba196b0c2
1 changed files with 41 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -15,6 +15,7 @@
 3. [Usage](#usage)
  *  [Configuration File Syntax](#configuration-file-syntax)
  *  [Efficient Xpath Copying](#efficient-xpath-copying)
+  *  [Step By Step Guide](#step-by-step-guide)

 # Introduction

@@ -111,3 +112,43 @@ slashes. That will make the spider more stable, in case the websites
 html/xml gets changed for maintenance or other reasons.


+## Step By Step Guide
+
+Start with an existing configuration that is similar to what you need.
+
+There are three types of configurations:
+
+* Type 1 is purely path based. An example is greenjobs.de.
+* Type 2 is a mixture of paths and JavaScript functions. An example is giz.
+* Type 3 is purely JavaScript based. An example is ted.europe.eu.
+
+Type 1:
+
+Start by collecting every variable,
+working from top to bottom.
+
+### var domain
+
+`domain` is the variable for the root of the website.
+If links have to be glued together, they are glued onto this root.
+
+### var entry-list
+
+Next come all the variables for the entry list pages.
+
+#### var link1, link2 and iteration-var-list
+
+In pseudocode, this is what happens with these three variables:
+
+```
+for n in iteration-var-list:
+    get(link1 + n + link2)
+```
+
+If the website works without JavaScript, you are lucky: that is all you need to get the collection of links.
+
+We can move straight on to the next variable.
+
+#### var parent
+
+Oi
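
As a reading aid for the guide added in this commit, the `link1` / `link2` / `iteration-var-list` loop and the gluing of links onto `domain` could be sketched in Python roughly as follows. This is only a sketch under assumptions: the function names `build_entry_list_urls` and `glue_link` and the example URLs are hypothetical, and the diff does not show how the spider itself fetches pages or glues links.

```python
from urllib.parse import urljoin


def build_entry_list_urls(link1, link2, iteration_var_list):
    """Build one entry-list URL per iteration value: link1 + n + link2."""
    # The f-string converts n to text, so the list may hold numbers or strings.
    return [f"{link1}{n}{link2}" for n in iteration_var_list]


def glue_link(domain, href):
    """Resolve a link against the site root (assumed gluing behaviour)."""
    # urljoin leaves absolute links untouched and resolves relative ones.
    return urljoin(domain, href)


# Hypothetical values, not taken from any real configuration file.
urls = build_entry_list_urls(
    "https://example.org/jobs?page=", "&sort=date", [1, 2, 3]
)
for url in urls:
    print(url)

print(glue_link("https://example.org/", "/jobs/42"))
```

Using `urljoin` here is a design choice of the sketch, not something the README states: it handles both relative and already absolute links, which plain string concatenation would not.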