@ -58,20 +58,19 @@ Doccano can be deployed to AWS ([Cloudformation](https://docs.aws.amazon.com/AWS
> Notice: (1) An EC2 KeyPair cannot be created automatically, so make sure you have an existing EC2 KeyPair in one region, or [create one yourself](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-key-pairs.html#having-ec2-create-your-key-pair). (2) If you want to access doccano via HTTPS in AWS, see [these instructions](https://github.com/chakki-works/doccano/wiki/HTTPS-setting-for-doccano-in-AWS).
## Features
* Collaborative annotation
* Multi-Language support
* Emoji :smile: support
* (future) Auto labeling
- Collaborative annotation
- Multi-Language support
- Emoji :smile: support
- (future) Auto labeling
## Requirements
* Python 3.6+
* Django 2.1.7+
* Node.js 8.0+
* Google Chrome (highly recommended)
- Python 3.6+
- Django 2.1.7+
- Node.js 8.0+
- Google Chrome (highly recommended)
## Installation
@ -164,7 +163,9 @@ Finally, to start the server, run the following command:
```bash
python manage.py runserver
```
Optionally, you can change the bind IP and port using the following command:
```bash
python manage.py runserver <ip>:<port>
```
@ -199,20 +200,26 @@ After creating a project, you will see the "Import Data" page, or click `Import
- `CSV file`: file must contain a header with a `text` column or be a one-column CSV file.
- `JSON file`: each line contains a JSON object with a `text` key. The JSON format supports rendering line breaks.
You can upload the following types of files (depending on project type):
- `Text file`: file must contain one sentence/document per line separated by new lines.
- `CSV file`: file must contain a header with `"text"` as the first column or be a one-column CSV file. If labels are included, the second column must contain them.
- `Excel file`: file must contain a header with `"text"` as the first column or be a one-column Excel file. If labels are included, the second column must contain them. Multiple sheets are supported as long as they share the same format.
- `JSON file`: each line contains a JSON object with a `text` key. The JSON format supports rendering line breaks.
> Notice: Doccano won't render line breaks on the annotation page for sequence labeling tasks due to an indentation problem, but the exported JSON file still contains the line breaks.
`example.txt` (or `example.csv`)
```python
`example.txt/csv/xlsx`
```txt
EU rejects German call to boycott British lamb.
President Obama is speaking at the White House.
He lives in Newark, Ohio.
...
```
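As a rough illustration of the CSV layout described above, the following sketch writes a two-column file with Python's standard `csv` module. The file name `example.csv`, the `label` header name, and the label values are assumptions made for illustration; the description above only states that `text` comes first and labels come second.

```python
import csv

# Hypothetical rows of (text, label); the label values are made up for illustration.
rows = [
    ("EU rejects German call to boycott British lamb.", "news"),
    ("President Obama is speaking at the White House.", "news"),
]


def write_csv(path, rows):
    """Write a header row followed by one (text, label) row per document."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        # "text" must be the first column; the "label" header name is an assumption.
        writer.writerow(["text", "label"])
        writer.writerows(rows)


write_csv("example.csv", rows)
```

Using `csv.writer` rather than manual string joining keeps commas and quotes inside the text correctly escaped.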
`example.json`
```JSON
{"text": "EU rejects German call to boycott British lamb."}
{"text": "President Obama is speaking at the White House."}
@ -220,7 +227,7 @@ He lives in Newark, Ohio.
...
```
Any other columns (for csv) or keys (for json) are preserved and will be exported in the `metadata` column or key as is.
Any other columns (for csv/excel) or keys (for json) are preserved and will be exported in the `metadata` column or key as is.
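A minimal sketch of producing such a JSON import file from Python is shown below. The `category` key and the sample line break are hypothetical, added only to illustrate an extra key and a multi-line text; the only key the import relies on is `text`.

```python
import json

# Hypothetical records; "category" is an extra key used only for illustration.
records = [
    {"text": "EU rejects German call\nto boycott British lamb.", "category": "politics"},
    {"text": "He lives in Newark, Ohio."},
]


def to_json_lines(records):
    """Serialize each record as one JSON object per line."""
    # json.dumps escapes the newline inside "text" as \n,
    # so every record stays on a single physical line.
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records) + "\n"


with open("example.json", "w", encoding="utf-8") as f:
    f.write(to_json_lines(records))
```

Because each object is serialized separately, a line break inside a text never splits a record across lines.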
Once you select a TXT/JSON file on your computer, click the `Upload dataset` button. After the upload finishes, you will see the `Dataset` page (you can also open it by clicking the `Dataset` button in the left bar). This page displays all the documents uploaded to the project.
@ -230,7 +237,6 @@ Click `Labels` button in left bar to define your own labels. You should see the
Now you are ready to annotate the texts. Click the `Annotate Data` button in the navigation bar to start annotating the documents you uploaded.
@ -251,11 +257,14 @@ by adding `external_id` to the imported file. For example:
Input file may look like this:
`import.json`
```JSON
{"text": "EU rejects German call to boycott British lamb.", "external_id": 1}
```
and the exported file will look like this:
`output.json`
```JSON
{"doc_id": 2023, "text": "EU rejects German call to boycott British lamb.", "labels": ["news"], "username": "root", "metadata": {"external_id": 1}}
```
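One way to use this, sketched under the assumption that the exported records follow the shape shown above, is to look up labels by `external_id` after export. The helper name `labels_by_external_id` is hypothetical and not part of doccano.

```python
import json

# Lines copied from the import and export examples above.
import_lines = [
    '{"text": "EU rejects German call to boycott British lamb.", "external_id": 1}',
]
export_lines = [
    '{"doc_id": 2023, "text": "EU rejects German call to boycott British lamb.", '
    '"labels": ["news"], "username": "root", "metadata": {"external_id": 1}}',
]


def labels_by_external_id(export_lines):
    """Map each external_id found under "metadata" to the exported labels."""
    result = {}
    for line in export_lines:
        record = json.loads(line)
        external_id = record.get("metadata", {}).get("external_id")
        if external_id is not None:
            result[external_id] = record["labels"]
    return result


labels = labels_by_external_id(export_lines)
for line in import_lines:
    source = json.loads(line)
    # Look up the annotation for each original record by its external_id.
    print(source["external_id"], labels.get(source["external_id"]))
```

Records exported without an `external_id` in `metadata` are skipped rather than raising an error.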
@ -272,7 +281,6 @@ As with any software, doccano is under continuous development. If you have reque
Here are some tips that might be helpful: [How to Contribute to Doccano Project](https://github.com/chakki-works/doccano/wiki/How-to-Contribute-to-Doccano-Project)
## Contact
For help and feedback, please feel free to contact [the author](https://github.com/Hironsan).