Talk: DURel Annotation Tool
Measuring Patterns of Contextual Word Meaning over Time
DURel is a tool for annotating pairs of sentences that contain the same word. The annotations are used to form sense clusters for the word and to visualize them over time.
We present an online annotation interface for pairs of sentences containing the same target word. Annotators are asked to judge the degree of semantic relatedness of pairs of word uses, such as the two uses of arm in (1) and (2), on a scale of 1 (unrelated) to 4 (identical).
(1) and taking a knife from her pocket, she opened a vein in her little arm, and dipping a feather in the blood, wrote something on a piece of white cloth, which was spread before her.
(2) It stood behind a high brick wall, its back windows overlooking an arm of the sea which, at low tide, was a black and stinking mud-flat.
The annotated data for a word are then represented in a Word Usage Graph (WUG), in which nodes represent word uses and edge weights represent the (median) semantic relatedness judgment for a pair of uses, such as (1) and (2). The final WUGs are clustered with a variation of correlation clustering and split into subgraphs containing the nodes and edges from different time periods. Clusters are then interpreted as word senses, and changes in clusters over time as lexical semantic change.
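As a rough illustration of the graph construction described above, the following minimal Python sketch aggregates judgments into median edge weights and extracts the subgraph for one time period. The data layout, the use IDs, the period labels, and the function names (build_wug, subgraph_for_period) are assumptions made for this example and are not taken from the DURel tool; the clustering step is deliberately left out.

```python
from collections import defaultdict
from statistics import median

# Hypothetical judgments: (use_a, use_b, annotator, score on the 1-4 scale).
judgments = [
    ("u1", "u2", "ann1", 1),
    ("u1", "u2", "ann2", 2),
    ("u1", "u2", "ann3", 1),
    ("u1", "u3", "ann1", 4),
    ("u1", "u3", "ann2", 3),
    ("u2", "u4", "ann1", 4),
    ("u2", "u4", "ann2", 4),
]

# Hypothetical time period of each word use.
period = {"u1": "t1", "u2": "t1", "u3": "t2", "u4": "t2"}


def build_wug(judgments):
    """Map each annotated use pair to the median of its judgments."""
    scores = defaultdict(list)
    for use_a, use_b, _annotator, score in judgments:
        # frozenset makes the pair unordered: (u1, u2) equals (u2, u1).
        scores[frozenset((use_a, use_b))].append(score)
    return {pair: median(values) for pair, values in scores.items()}


def subgraph_for_period(wug, period, target):
    """Keep only nodes from one period and the edges between them."""
    nodes = {use for use, p in period.items() if p == target}
    edges = {pair: w for pair, w in wug.items() if pair <= nodes}
    return nodes, edges


wug = build_wug(judgments)
nodes_t1, edges_t1 = subgraph_for_period(wug, period, "t1")
```

Edges between uses from different periods are dropped in each period subgraph here; whether such edges should be kept for other analyses is a design choice this sketch does not address.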
The interface allows users to upload a project, i.e., use samples for several target words, which are combined into use pairs per word and presented to annotators in random order. Users can manage their projects by assigning them to registered annotators. Annotation can be stopped at any point, and the annotated data can be downloaded. The system also allows users to cluster the data directly and visualize it over time as interactive WUGs.
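The pairing step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it forms every unordered pair of uses for each word and shuffles all pairs together, whereas the actual tool may sample pairs or randomize differently. The function name make_use_pairs, the seed parameter, and the sample sentences are hypothetical.

```python
import itertools
import random


def make_use_pairs(uses_by_word, seed=None):
    """Combine the uses of each word into pairs and shuffle the result."""
    rng = random.Random(seed)  # seeded only to make the order reproducible
    pairs = []
    for word, uses in uses_by_word.items():
        # Pairs are formed within a word only, never across words.
        for use_a, use_b in itertools.combinations(uses, 2):
            pairs.append((word, use_a, use_b))
    rng.shuffle(pairs)
    return pairs


# Hypothetical project: use samples for two target words.
project = {
    "arm": ["a vein in her little arm", "an arm of the sea", "he broke his arm"],
    "plane": ["the plane landed", "a plane surface"],
}
pairs = make_use_pairs(project, seed=0)
```

With n uses of a word this yields n(n-1)/2 pairs, so a real project with many uses per word would likely need to sample pairs rather than annotate all of them.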
The DURel Tool may be of interest to researchers who want to measure the semantic patterns underlying a set of word uses from a corpus, as is common in, e.g., lexical and historical semantics, lexicography, or digital humanities.