# doccano

doccano is a document annotation tool designed to make the annotation process efficient. First, you manually label a small amount of data in minutes using the labeling interface. Second, the built-in classification model is trained on the labeled data and classifies the unlabeled data, assigning each prediction a probability. Then, the data is sorted in ascending order of probability, so you can annotate it efficiently, starting with the examples the model is least sure about.

![doccano](docs/demo.png)
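
The workflow described above can be sketched roughly as follows. This is a minimal illustration, not doccano's actual implementation: the function name `rank_for_annotation` is hypothetical, and TF-IDF features with scikit-learn's `LogisticRegression` are assumed here as a stand-in for the built-in classification model.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression


def rank_for_annotation(labeled_texts, labels, unlabeled_texts):
    """Return (text, predicted_label, probability) tuples, least confident first.

    Hypothetical helper: the model choice is an assumption, not doccano's own code.
    """
    # Train a simple text classifier on the small, manually labeled set.
    vectorizer = TfidfVectorizer()
    classifier = LogisticRegression()
    classifier.fit(vectorizer.fit_transform(labeled_texts), labels)

    # Classify the unlabeled data and keep the probability of the predicted class.
    probabilities = classifier.predict_proba(vectorizer.transform(unlabeled_texts))
    confidence = probabilities.max(axis=1)
    predicted = classifier.classes_[probabilities.argmax(axis=1)]

    # Sort in ascending order of probability so uncertain examples come first.
    order = confidence.argsort()
    return [(unlabeled_texts[i], predicted[i], float(confidence[i])) for i in order]


if __name__ == "__main__":
    ranked = rank_for_annotation(
        ["great movie", "wonderful film", "terrible movie", "awful film"],
        ["pos", "pos", "neg", "neg"],
        ["great wonderful film", "movie", "awful terrible"],
    )
    for text, label, probability in ranked:
        print(f"{probability:.2f}  {label}  {text}")
```

Examples at the top of the returned list are the ones the model is least certain about, so reviewing them first is where a manual label is likely to add the most information.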
## Features

* Active learning-based annotation

## Requirements

* Python 3.6+
* numpy 1.14.3+
* scikit-learn 0.19.1+
* scipy 1.1.0+

Put your data into the [doccano/data](https://github.com/chakki-works/doccano/tree/master/data) directory.
## Installation

To install doccano, simply run:

```bash
$ git clone https://github.com/chakki-works/doccano.git
$ cd doccano
$ pip install -r requirements.txt
```
## Usage

First, run the web application:

```bash
$ cd doccano/server
$ python run_server.py
```

Then, open <http://localhost:8080> in your browser.