- `TXT file`: each line contains one text, so a text cannot contain line breaks (`\n`).
- `CSV file`: the file must either contain a header with a `text` column or be a one-column CSV file.
- `JSON file`: each line contains a JSON object with a `text` key. The JSON format supports rendering line breaks.
> Notice: Doccano won't render line breaks on the annotation page for the sequence labeling task because of an indentation problem, but the exported JSON file still contains the line breaks.
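
As a rough illustration of the three import formats described above, the following sketch writes the same texts as a TXT, a CSV, and a JSON file. The function name `write_samples` and the `sample.*` file names are made up for this example and are not part of doccano. Flattening line breaks for the CSV file is a cautious assumption, since only the JSON format is stated to support them.

```python
import csv
import json
from pathlib import Path


def write_samples(out_dir, texts):
    """Write `texts` in the three import formats (hypothetical helper)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # TXT: one text per line, so embedded line breaks are replaced by spaces.
    flat = [" ".join(t.splitlines()) for t in texts]
    (out / "sample.txt").write_text("\n".join(flat) + "\n", encoding="utf-8")

    # CSV: a header row with a `text` column, then one text per row.
    # Assumption: line breaks are flattened here as well, to be safe.
    with open(out / "sample.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["text"])
        writer.writerows([t] for t in flat)

    # JSON: one object per line with a `text` key. json.dumps escapes
    # line breaks as \n, so each object still fits on a single line.
    with open(out / "sample.json", "w", encoding="utf-8") as f:
        for t in texts:
            f.write(json.dumps({"text": t}) + "\n")
```

Calling `write_samples("import_samples", ["He lives in Newark, Ohio."])` would produce three small files that follow the rules listed above.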
@ -135,6 +135,8 @@ He lives in Newark, Ohio.
...
```
Any other columns (for CSV) or keys (for JSON) are preserved and exported as is in the `metadata` column or key.
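
The metadata rule can be pictured with a minimal sketch: everything except `text` is moved under `metadata`. The helper `to_document` is hypothetical and only mirrors the behavior described here; doccano's actual import code may differ.

```python
import json


def to_document(record):
    """Split an imported record into its text and its metadata (illustrative only)."""
    # Every key other than `text` is kept unchanged under `metadata`.
    metadata = {key: value for key, value in record.items() if key != "text"}
    return {"text": record["text"], "metadata": metadata}


line = '{"text": "EU rejects German call to boycott British lamb.", "external_id": 1}'
doc = to_document(json.loads(line))
print(doc["metadata"])  # prints {'external_id': 1}
```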
Once you have selected a TXT/JSON file on your computer, click the `Upload dataset` button. After the upload finishes, you will see the `Dataset` page (you can also open it by clicking the `Dataset` button in the left bar). This page displays all the documents uploaded to the project.
### Define labels
@ -156,7 +158,22 @@ After the annotation step, you can download the annotated data. Click the `Edit
You can export the data as a CSV file or a JSON file by clicking the button. The export file formats are described here: [Export File Formats](https://github.com/chakki-works/doccano/wiki/Export-File-Formats).
Each exported document has a `metadata` column or key, which contains the additional columns or keys from the imported document. The primary use case for metadata is to let you match exported data with another system by adding an `external_id` to the imported file. For example, the input file may look like this:
`import.json`
```JSON
{"text": "EU rejects German call to boycott British lamb.", "external_id": 1}
```
and the exported file will look like this:
`output.json`
```JSON
{"doc_id": 2023, "text": "EU rejects German call to boycott British lamb.", "labels": ["news"], "username": "root", "metadata": {"external_id": 1}}