# doccano

doccano is a document annotation tool designed to make the annotation process efficient. First, you manually label a small amount of data in minutes using the labeling interface. Second, doccano trains its built-in classification model on the labeled data and classifies the unlabeled data, attaching a probability to each prediction. Then it sorts the data in ascending order of that probability, so you can annotate the remaining data efficiently.

![doccano](docs/demo.png)

## Features

* Active Learning based annotation
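
The loop described in the introduction (train on a small labeled set, score the unlabeled documents, sort by probability in ascending order) can be sketched roughly as follows. This is only an illustration under stated assumptions, not doccano's actual implementation: the built-in model is not specified in this README, so `TfidfVectorizer` and `LogisticRegression` from scikit-learn are stand-ins, and `rank_by_uncertainty` is a hypothetical helper name.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression


def rank_by_uncertainty(labeled_texts, labels, unlabeled_texts):
    """Order unlabeled texts from least to most confident prediction.

    Hypothetical helper for illustration; not part of doccano.
    """
    # Assumption: a TF-IDF bag-of-words representation of the documents.
    vectorizer = TfidfVectorizer()
    X_labeled = vectorizer.fit_transform(labeled_texts)
    X_unlabeled = vectorizer.transform(unlabeled_texts)

    # Assumption: logistic regression stands in for the built-in classifier.
    model = LogisticRegression()
    model.fit(X_labeled, labels)

    # Probability of the most likely class for each unlabeled document.
    confidence = model.predict_proba(X_unlabeled).max(axis=1)

    # Ascending sort: documents the model is least sure about come first.
    order = np.argsort(confidence, kind="stable")
    return [(unlabeled_texts[i], float(confidence[i])) for i in order]


if __name__ == "__main__":
    labeled = ["great movie", "loved it", "terrible film", "hated it"]
    labels = ["pos", "pos", "neg", "neg"]
    unlabeled = ["great film", "loved the movie", "it was terrible"]
    for text, conf in rank_by_uncertainty(labeled, labels, unlabeled):
        print(f"{conf:.2f}  {text}")
```

Annotating from the top of this ranking puts the least certain predictions in front of the annotator first, which is the usual motivation for sorting by probability in ascending order.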

## Requirements

* Python 3.6+
* numpy 1.14.3+
* scikit-learn 0.19.1+
* scipy 1.1.0+
* Django 2.0.5+

Put your data into the [doccano/data](https://github.com/chakki-works/doccano/tree/master/data) directory.

## Installation

To install doccano, simply run:

```bash
$ git clone https://github.com/chakki-works/doccano.git
$ cd doccano
$ pip install -r requirements.txt
```

## Usage

First, run the web application:

```bash
$ cd doccano/server
$ python run_server.py
```

Then, open <http://localhost:8080> in your browser.