mirror of https://github.com/doccano/doccano.git
doccano
doccano is a document annotation tool designed to make the annotation process efficient. First, you manually label a small amount of data in minutes using the labeling interface. Second, doccano trains its built-in classification model on the labeled data and classifies the unlabeled data, assigning a probability to each prediction. Then, it sorts the data in ascending order of probability, so the examples the model is least sure about come first. This lets you annotate the data efficiently.
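The sorting step can be illustrated with a minimal sketch. It is not doccano's actual implementation: the function name `rank_by_confidence` and the example probabilities are hypothetical, and the sketch assumes that "the probability" means the probability of the most likely class for each document (for example, one row of the output of a scikit-learn classifier's `predict_proba`).

```python
from typing import List, Sequence, Tuple


def rank_by_confidence(
    probabilities: Sequence[Sequence[float]],
) -> List[Tuple[int, float]]:
    """Return (document index, confidence) pairs, least confident first.

    Confidence is taken to be the probability of the most likely class.
    This is an assumption about what "the probability" refers to.
    """
    confidences = [(index, max(row)) for index, row in enumerate(probabilities)]
    # Ascending sort: documents the model is least sure about come first.
    return sorted(confidences, key=lambda pair: pair[1])


# Hypothetical class probabilities for four unlabeled documents (two classes).
example_probabilities = [
    [0.95, 0.05],
    [0.55, 0.45],
    [0.20, 0.80],
    [0.51, 0.49],
]

for index, confidence in rank_by_confidence(example_probabilities):
    print(f"document {index}: confidence {confidence:.2f}")
```

Annotating in this order spends manual effort on the documents where the model's prediction is closest to a coin flip, which is the usual motivation for this kind of active learning.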
Features
- Active Learning based annotation
Requirements
- Python 3.6+
- numpy 1.14.3+
- scikit-learn 0.19.1+
- scipy 1.1.0+
- django 2.0.5+
Put your data into the doccano/data directory.
Installation
To install doccano, simply run:
$ git clone https://github.com/chakki-works/doccano.git
$ cd doccano
$ pip install -r requirements.txt
Usage
First, run the web application:
$ cd doccano/server
$ python run_server.py
Then, open http://localhost:8080 in your browser.