UDPipe is a trainable pipeline for tokenization, tagging, lemmatization and dependency parsing of CoNLL-U files. The software has been developed at the Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics, Charles University, Czech Republic. This document describes the installation at Kielipankki - The Language Bank of Finland. Using UDPipe on CSC's servers requires a CSC user account: https://research.csc.fi/accounts-and-projects
UDPipe is installed on CSC's Taito cluster in the following configuration:
Software: UDPipe 1.2.0
UDPipe was compiled and installed from source without local modifications. Please refer to the user's manual linked in the Documentation section.
The tool was installed using Ansible scripts, which are available at: https://github.com/CSCfi/Kielipankki-palvelut/tree/Dec2018/commandline/roles/udpipe
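Since UDPipe reads and writes CoNLL-U, it can help to see how such a file is structured before processing the output further. The following is a minimal sketch in Python, not part of the Kielipankki installation: the function name `read_conllu` and the sample sentence are illustrative assumptions, and the sample is hand-written rather than actual UDPipe output. It relies only on the ten tab-separated columns defined by the CoNLL-U format, and it does no validation beyond checking the column count.

```python
# The ten CoNLL-U columns, in the order defined by Universal Dependencies.
FIELDS = ["id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc"]


def read_conllu(text):
    """Split CoNLL-U text into sentences, each a list of token dicts.

    Hypothetical helper for illustration; comment lines (starting with '#')
    are skipped and blank lines are treated as sentence boundaries.
    """
    sentences, tokens = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if tokens:
                sentences.append(tokens)
                tokens = []
            continue
        if line.startswith("#"):
            # Sentence-level metadata such as "# text = ..." is ignored here.
            continue
        cols = line.split("\t")
        if len(cols) != len(FIELDS):
            raise ValueError(f"expected {len(FIELDS)} columns, got {len(cols)}")
        tokens.append(dict(zip(FIELDS, cols)))
    if tokens:
        # Handle a final sentence that is not followed by a blank line.
        sentences.append(tokens)
    return sentences


# Hand-written sample for illustration only; NOT actual UDPipe output.
SAMPLE = (
    "# text = Dogs bark.\n"
    "1\tDogs\tdog\tNOUN\t_\tNumber=Plur\t2\tnsubj\t_\t_\n"
    "2\tbark\tbark\tVERB\t_\t_\t0\troot\t_\t_\n"
    "3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_\n"
    "\n"
)

if __name__ == "__main__":
    for tok in read_conllu(SAMPLE)[0]:
        print(tok["id"], tok["form"], tok["upos"], tok["head"], tok["deprel"])
```

Every column is kept as a string, so multiword-token ranges such as `1-2` in the ID column are preserved unchanged. Convert `head` to an integer only after filtering out such lines.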