Lecture: Automatic Extraction of Causal Relations in German
To use text data in digital systems such as search engines, a computer needs to understand the semantics of the text. An important task in computational linguistics is therefore the automatic detection of semantic relations. Causality is a particularly challenging relation, since it usually depends heavily on context and rarely holds in general; even humans struggle to annotate causal relations. Nevertheless, state-of-the-art systems often rely on machine learning techniques for this task, which require large amounts of annotated data.
To address this problem, I built a system as part of my Bachelor's thesis that automatically detects and extracts causal entities in unannotated German corpora. I focus on specific causal structures with nominal triggers (e.g. The defective toaster was the CAUSE of the fire). The system is given a few manually selected examples and uses a bootstrapping algorithm to find new examples, triggers and causal patterns. To avoid semantic drift, several filtering methods based on word embeddings and the resource GermaNet are tested and evaluated. In my talk, I will present this system and its results.
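To give a rough idea of how such a bootstrapping loop with a drift filter might look, here is a minimal sketch in Python. It is not the system presented in the talk: the toy corpus (in English, for readability), the hand-made two-dimensional vectors in TOY_VECTORS, the threshold of 0.8 and all function names are assumptions made purely for illustration. The patterns are plain word sequences between cause and effect, and the similarity check merely stands in for a filter based on real word embeddings; a GermaNet-based filter is not shown.

```python
import math
import re

# Toy corpus (hypothetical); a real run would use unannotated German text.
CORPUS = [
    "the toaster was the cause of the fire",
    "the storm was the cause of the flood",
    "the storm led to the flood",
    "the flood led to the evacuation",
    "the meeting led to the lunch",
]

# Hand-made stand-ins for word embeddings (assumption, not real vectors).
TOY_VECTORS = {
    "fire": (1.0, 0.0),
    "flood": (0.95, 0.1),
    "evacuation": (0.8, 0.3),
    "lunch": (0.0, 1.0),
}


def cosine(u, v):
    """Cosine similarity of two 2-d vectors."""
    dot = u[0] * v[0] + u[1] * v[1]
    return dot / (math.hypot(*u) * math.hypot(*v))


def is_plausible(pair, reference="fire", threshold=0.8):
    """Keep a pair only if its effect is close to a seed effect."""
    _, effect = pair
    if effect not in TOY_VECTORS:
        return False
    return cosine(TOY_VECTORS[effect], TOY_VECTORS[reference]) >= threshold


def extract_patterns(sentences, pairs):
    """Collect the words between a known cause and its effect."""
    patterns = set()
    for sentence in sentences:
        for cause, effect in pairs:
            regex = (r"\b" + re.escape(cause) + r"\s+(.+?)\s+"
                     + re.escape(effect) + r"\b")
            match = re.search(regex, sentence)
            if match:
                patterns.add(match.group(1))
    return patterns


def apply_patterns(sentences, patterns):
    """Find (cause, effect) candidates on either side of a known pattern."""
    candidates = set()
    for sentence in sentences:
        for pattern in patterns:
            regex = r"(\w+)\s+" + re.escape(pattern) + r"\s+(\w+)"
            match = re.search(regex, sentence)
            if match:
                candidates.add((match.group(1), match.group(2)))
    return candidates


def bootstrap(sentences, seed_pairs, keep=lambda pair: True, iterations=5):
    """Alternate between learning patterns and harvesting new pairs."""
    pairs = set(seed_pairs)
    patterns = set()
    for _ in range(iterations):
        patterns |= extract_patterns(sentences, pairs)
        candidates = apply_patterns(sentences, patterns)
        new_pairs = {pair for pair in candidates if keep(pair)} - pairs
        if not new_pairs:  # nothing new was found, so stop early
            break
        pairs |= new_pairs
    return pairs, patterns


if __name__ == "__main__":
    seeds = {("toaster", "fire")}
    pairs, patterns = bootstrap(CORPUS, seeds, keep=is_plausible)
    print(sorted(pairs))
    print(sorted(patterns))
```

Starting from the single seed pair (toaster, fire), the filtered run in this toy setting also picks up (storm, flood) and (flood, evacuation) and learns the pattern "led to the". Without the filter, the same pattern would admit (meeting, lunch) as well, which is the kind of semantic drift the filtering step is meant to prevent.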