doccano

doccano is a document annotation tool designed to make the annotation process efficient. First, you manually label a small amount of data in a few minutes using the labeling interface. Second, doccano trains its built-in classification model on the labeled data and classifies the unlabeled data, attaching a probability to each prediction. Then it sorts the data in ascending order of probability, so the lowest-probability predictions appear first and you can annotate the data efficiently.
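The sorting step described above can be illustrated with a minimal sketch. It is not doccano's actual code: the function name rank_for_annotation and the (text, label, probability) tuple layout are hypothetical, and the probabilities are assumed to come from whatever classifier was trained on the labeled data.

```python
from typing import List, Tuple

# Hypothetical record layout: (text, predicted_label, probability).
# doccano's real data structures may differ.
Prediction = Tuple[str, str, float]


def rank_for_annotation(predictions: List[Prediction]) -> List[Prediction]:
    """Return predictions sorted in ascending order of probability.

    The lowest-probability predictions come first, so the annotator
    reviews them before the ones the model is more confident about.
    """
    return sorted(predictions, key=lambda p: p[2])


if __name__ == "__main__":
    # Made-up example values, only to show the ordering.
    unlabeled = [
        ("great movie", "positive", 0.97),
        ("it was fine I guess", "positive", 0.52),
        ("terrible plot", "negative", 0.88),
    ]
    for text, label, prob in rank_for_annotation(unlabeled):
        print(f"{prob:.2f}  {label:<8}  {text}")
```

Sorting returns a new list and leaves the input untouched, so the same predictions can be re-ranked after each retraining round.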

Features

  • Active Learning based annotation

Requirements

  • Python 3.6+
  • numpy 1.14.3+
  • scikit-learn 0.19.1+
  • scipy 1.1.0+
  • django 2.0.5+

Put your data into the doccano/data directory.

Installation

To install doccano, simply run:

$ git clone https://github.com/chakki-works/doccano.git
$ cd doccano
$ pip install -r requirements.txt

Usage

First, run the web application:

$ cd doccano/server
$ python run_server.py

Then, open http://localhost:8080 in your browser.
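If the page does not load, a quick script can confirm whether anything is answering at that address. This is only a sketch using the Python standard library; the helper name server_is_up is hypothetical and is not part of doccano.

```python
import urllib.error
import urllib.request


def server_is_up(url: str = "http://localhost:8080", timeout: float = 3.0) -> bool:
    """Return True if an HTTP server responds at `url`, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just with an error status (e.g. 404 or 500).
        return True
    except OSError:
        # Connection refused, timeout, DNS failure, and similar errors.
        return False


if __name__ == "__main__":
    print("server reachable" if server_is_up() else "server not reachable")
```

An HTTP error status is treated as "up" because it still proves that a server process is listening on the port.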