Compilation of the Corpus of International Treaties
Abstract
This paper describes the five-language corpus «PEST-INTER» and the process of its compilation and incorporation. The aim is to give step-by-step instructions for compiling the corpus. A further purpose is to present practical solutions to the problems that arise at different stages of corpus compilation. Describing the decisions taken and the strategies followed, I discuss corpus planning, going into depth on web crawling, character and corpus encoding, automatic alignment, and editing of the compiled texts.
Article Details
Suggested policy for journals that offer open access
Authors who publish with this journal agree to the following terms:
1. Authors retain copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License, which allows others to share the work with an acknowledgement of authorship of the work and initial publication in this journal.
2. Authors may enter into additional contractual arrangements for non-exclusive distribution of the published version of the paper in the journal (e.g., submission to an institutional repository), with an acknowledgement of its initial publication in this journal.
3. Authors are allowed and encouraged to publish their work prior to the final version published in this journal once accepted (e.g., in institutional repositories or on their website), as it can lead to productive exchanges, as well as earlier and higher citation of the published work (see The Open Access Effect).