|
|
- {
- "cells": [
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Hier wird die Bibliothek ConjunctSolve und deren Funktionen importiert. Anschließend wird die Klasse initialisiert."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 1,
- "metadata": {},
- "outputs": [],
- "source": [
- "from ConjunctSolve import *\n",
- "\n",
- "\n",
- "\n",
- "cs = ConjunctSolve(None,None)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 2,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "1000\n",
- "2000\n",
- "3000\n",
- "4000\n",
- "5000\n",
- "6000\n",
- "7000\n",
- "8000\n",
- "9000\n",
- "10000\n",
- "11000\n",
- "12000\n",
- "13000\n",
- "14000\n",
- "15000\n",
- "16000\n",
- "17000\n",
- "18000\n",
- "19000\n",
- "20000\n",
- "21000\n",
- "22000\n",
- "23000\n",
- "24000\n",
- "25000\n",
- "26000\n",
- "27000\n",
- "28000\n",
- "29000\n",
- "30000\n",
- "31000\n",
- "32000\n",
- "33000\n",
- "34000\n",
- "35000\n",
- "36000\n",
- "37000\n",
- "38000\n",
- "39000\n",
- "40000\n",
- "41000\n",
- "42000\n",
- "43000\n",
- "44000\n",
- "45000\n",
- "46000\n",
- "47000\n",
- "48000\n",
- "49000\n",
- "50000\n",
- "51000\n",
- "52000\n",
- "53000\n",
- "54000\n",
- "55000\n",
- "56000\n",
- "57000\n",
- "58000\n",
- "59000\n",
- "60000\n",
- "61000\n",
- "62000\n",
- "63000\n",
- "64000\n",
- "65000\n",
- "66000\n",
- "67000\n",
- "68000\n",
- "69000\n",
- "70000\n",
- "71000\n",
- "72000\n",
- "73000\n",
- "74000\n",
- "75000\n",
- "76000\n",
- "77000\n",
- "78000\n",
- "79000\n",
- "80000\n",
- "81000\n",
- "82000\n",
- "83000\n",
- "84000\n",
- "85000\n",
- "86000\n",
- "87000\n",
- "88000\n",
- "89000\n",
- "90000\n",
- "91000\n",
- "92000\n",
- "93000\n",
- "94000\n",
- "95000\n",
- "96000\n",
- "97000\n",
- "98000\n",
- "99000\n",
- "100000\n",
- "101000\n",
- "102000\n",
- "103000\n",
- "104000\n",
- "105000\n",
- "106000\n",
- "107000\n",
- "108000\n",
- "109000\n",
- "110000\n",
- "111000\n",
- "112000\n",
- "113000\n",
- "114000\n",
- "115000\n",
- "116000\n",
- "117000\n",
- "118000\n",
- "119000\n",
- "120000\n",
- "121000\n",
- "122000\n",
- "123000\n",
- "124000\n",
- "125000\n",
- "126000\n",
- "127000\n",
- "128000\n",
- "creating the hkl dump of Indi_ConjuDBAll\n",
- "done..\n",
- "Creating the hkl dump of Indi_ConjuDB 1\n",
- "Creating the hkl dump of Indi_ConjuDB 2\n"
- ]
- },
- {
- "data": {
- "text/plain": [
- "'done'"
- ]
- },
- "execution_count": 2,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "cs.create_hklDB_from_csv('Indikativ_Conjunktiv.txt', 'None')"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Nun werden die Datenbanken in den Arbeitsspeicher geladen"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 3,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "Creating the bag of words...\n",
- "\n",
- "dumping the data to hkl format..\n",
- "done\n",
- "Creating the bag of words...\n",
- "\n",
- "dumping the data to hkl format..\n",
- "done\n"
- ]
- }
- ],
- "source": [
- "cs.load_DB_into_FASTsearch()"
- ]
- },
- {
- "cell_type": "raw",
- "metadata": {},
- "source": [
- "oi ist eine Klasse mit nur zwei Funktionen. Das Einlesen und schreiben von Textdateien. Die Funktion ReadDoc2Sent liest ein Textdokument ein. Der Output sind die Sätze in Listen geschrieben \n",
- "( eine Liste in python hat die Form [ 'Das', 'ist', 'ein', 'Satz.' ] )."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 4,
- "metadata": {},
- "outputs": [],
- "source": [
- "from oi import *\n",
- "oi = oi()\n",
- "\n",
- "sentences, punctuations = oi.ReadDoc2Sent('atest1')\n",
- "print(sentences, punctuations)"
- ]
- },
- {
- "cell_type": "raw",
- "metadata": {},
- "source": [
- "Die Funktion replaceConjunctives wird nun auf die Liste aus Satzlisten angewendet. Die Variable outsentences ist auch wieder eine Liste."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 7,
- "metadata": {},
- "outputs": [
- {
- "ename": "TypeError",
- "evalue": "replaceConjunctives() takes 2 positional arguments but 3 were given",
- "output_type": "error",
- "traceback": [
- "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
- "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)",
- "\u001b[0;32m<ipython-input-7-154bf4b4e943>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0moutsentences\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mcs\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mreplaceConjunctives\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0msentences\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mpunctuations\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
- "\u001b[0;31mTypeError\u001b[0m: replaceConjunctives() takes 2 positional arguments but 3 were given"
- ]
- }
- ],
- "source": [
- "outsentences = cs.replaceConjunctives(sentences, punctuations)"
- ]
- },
- {
- "cell_type": "raw",
- "metadata": {},
- "source": [
- "Abschließend wird nun die Satzliste mit den ausgetauschten Konjunktiven in die Datei 'atest1out' geschrieben."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "oi.PrintSplitSentencesToTextFile(punctuations, outsentences, 'atest1out')"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": []
- }
- ],
- "metadata": {
- "kernelspec": {
- "display_name": "Python 3",
- "language": "python",
- "name": "python3"
- },
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.5.3"
- }
- },
- "nbformat": 4,
- "nbformat_minor": 2
- }
|