Further requirements for annotated publications
However, certain providers may decide to run the TDM or annotation software on their own premises and upload the results of the processing directly into OpenMinTeD (e.g. publications annotated with structural markup, or with recognized acknowledgement or citation sections).
In these cases, the annotated output (publication) is considered a new resource and should follow the technical specifications that have been set for processing resources inside the OpenMinTeD platform. More specifically, it should be:
- encoded in the XML Metadata Interchange (XMI) format, and more specifically as a UIMA CAS serialized to XMI
- described with its own metadata record, using metadata for annotated publications
- packaged with all other raw and annotated publications into a corpus and registered following the instructions for sharing corpora and further requirements for annotated corpora.
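As a rough illustration of the first requirement, the sketch below checks whether a file looks like a UIMA CAS serialized as XMI before it is uploaded. It is only a heuristic pre-check and not the validation that OpenMinTeD itself performs: the function name `looks_like_uima_cas_xmi` and the sample document are hypothetical, and the check assumes the namespaces that UIMA's standard XMI serializer normally emits (`http://www.omg.org/XMI` for the root element and `http:///uima/cas.ecore` for the `Sofa` element holding the document text).

```python
import xml.etree.ElementTree as ET

# Namespaces assumed from UIMA's standard XMI serialization.
XMI_NS = "http://www.omg.org/XMI"
CAS_NS = "http:///uima/cas.ecore"


def looks_like_uima_cas_xmi(xml_text: str) -> bool:
    """Heuristic check (hypothetical helper): is this a UIMA CAS in XMI form?"""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # Not well-formed XML at all.
        return False
    # The root element must be xmi:XMI.
    if root.tag != f"{{{XMI_NS}}}XMI":
        return False
    # A serialized CAS carries at least one cas:Sofa (the subject of analysis).
    return root.find(f"{{{CAS_NS}}}Sofa") is not None


# Illustrative sample only; real annotated publications will contain many
# more annotation elements from their own type system.
SAMPLE = (
    '<xmi:XMI xmlns:xmi="http://www.omg.org/XMI" '
    'xmlns:cas="http:///uima/cas.ecore" xmi:version="2.0">'
    '<cas:NULL xmi:id="0"/>'
    '<cas:Sofa xmi:id="1" sofaNum="1" sofaID="_InitialView" '
    'mimeType="text" sofaString="We thank our funders."/>'
    '<cas:View sofa="1" members=""/>'
    '</xmi:XMI>'
)

if __name__ == "__main__":
    print(looks_like_uima_cas_xmi(SAMPLE))
```

A check of this kind only confirms the outer shape of the file. Whether the annotations conform to the expected type system, and whether the accompanying metadata record is complete, still has to be verified against the OpenMinTeD specifications.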