
Upload and download documents

Upload documents

To upload additional documents in a library:

  1. In the list view of the Documents tab, in the toolbar to the right, select Upload documents.

    • (Optional) Select Show advanced settings to override automatic detection:

      • To disable automatic language detection, turn off Autodetect language and choose the language from the Select language drop-down list.
      • To disable automatic character encoding detection, turn off Autodetect encoding and choose the encoding from the Select encoding drop-down list.
      • When done, select Hide advanced settings.
  2. Select Add files.

  3. Select the files to upload.
    The selected files are displayed in a list. To remove a file from the list, select the "X" button to the right of its name.
  4. Select Upload.


When the upload completes, a prompt in the lower right corner asks you to reload the library so that the document count is updated.


Supported formats and limits

Supported document formats are those managed by the Apache Tika toolkit. Documents are automatically converted to plain text files during upload.

Documents are ignored if:

  • They are empty.
  • They mainly consist of nonsense words.
  • Their language is unrecognized or not supported (only when automatic language detection is on).
  • They exceed the following size limits:

    • 200 MB for .zip files.
    • 1 MB for .txt files.
    • 100 MB for other file types.

At most, the first 50 KB of text is considered for each document.
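
The size rules above can be checked locally before uploading. The following is a minimal sketch, not part of the product: the function names are hypothetical, it assumes 1 MB means 1024 × 1024 bytes (the page does not say), and it covers only the empty-file and size rules, not the nonsense-word or language checks.

```python
from pathlib import Path

# Assumption: 1 MB = 1024 * 1024 bytes; the documentation does not specify.
MB = 1024 * 1024

# Limits from the list above, keyed by lowercase file extension.
SIZE_LIMITS = {".zip": 200 * MB, ".txt": 1 * MB}
DEFAULT_LIMIT = 100 * MB   # all other file types
TEXT_CAP = 50 * 1024       # at most the first 50 KB of text is considered


def size_limit_for(filename: str) -> int:
    """Return the size limit in bytes that applies to this file name."""
    return SIZE_LIMITS.get(Path(filename).suffix.lower(), DEFAULT_LIMIT)


def would_be_ignored(filename: str, size_bytes: int) -> bool:
    """True if the file is empty or exceeds the limit for its type."""
    return size_bytes == 0 or size_bytes > size_limit_for(filename)


def considered_text(text: str) -> str:
    """Approximate the 50 KB cap by truncating the UTF-8 bytes of the text.

    A character cut in half at the boundary is dropped rather than garbled.
    """
    return text.encode("utf-8")[:TEXT_CAP].decode("utf-8", errors="ignore")
```

For example, `would_be_ignored("notes.txt", 2 * MB)` returns `True`, while the same size for `report.pdf` returns `False` because the 100 MB limit applies.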

Download documents

  1. In the list view of the Documents tab, in the toolbar to the right, select Export.
  2. In the Export panel of the Export documents window, enter a file name or keep the suggested one, then confirm the Extension.
  3. To export documents with their original spacing, check Include original text spaced documents. Documents are pre-processed when uploaded, which changes their spacing: for example, a sequence of empty lines is compressed to one.
  4. Select the Export filter to filter the document set to export:

    • Keep All documents (the default) to download the whole corpus.
    • Select Filtered documents to download only the documents that match one of the following filters:

      • Documents with annotations.
      • Documents with extractions.
      • Current list of filtered documents.
  5. Select Export.
  6. In the Download tab, or in the notification in the lower right corner, select Download to download the exported document set.
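
To illustrate the spacing change mentioned in step 3, the sketch below collapses each run of empty lines into a single one. It only reproduces the example given there; the platform's actual pre-processing is not documented on this page, and the function name is hypothetical.

```python
import re


def collapse_empty_lines(text: str) -> str:
    """Replace each run of two or more empty (or whitespace-only) lines
    with a single empty line. Illustration only, not the platform's code."""
    return re.sub(r"\n(?:[ \t]*\n){2,}", "\n\n", text)


original = "first\n\n\n\nsecond"
print(repr(collapse_empty_lines(original)))  # 'first\n\nsecond'
```

A document exported without the original-spacing option would resemble the collapsed form; with the option checked, the export keeps the spacing of the uploaded file.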