# doccano
doccano is a document annotation tool designed to make the annotation process efficient. First, you manually label a small amount of data in minutes using the labeling interface. Second, doccano trains a built-in classification model on the labeled data and classifies the unlabeled data, attaching a probability to each prediction. The data is then sorted in ascending order of probability, so the examples the model is least sure about come first and you can annotate the data efficiently.

![doccano](docs/placeholder.png)
<!--
## Demo
-->
## Features
* Active Learning based annotation
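
The label, train, and rank loop described above can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not doccano's actual implementation: the function name `rank_by_confidence`, the TF-IDF plus logistic regression model, and the sample texts are all hypothetical.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression


def rank_by_confidence(labeled_texts, labels, unlabeled_texts):
    """Return (text, predicted_label, probability) tuples, least confident first."""
    # Assumed model choice: TF-IDF features fed to logistic regression.
    vectorizer = TfidfVectorizer()
    classifier = LogisticRegression()
    classifier.fit(vectorizer.fit_transform(labeled_texts), labels)

    # One row per unlabeled document, one column per class.
    probs = classifier.predict_proba(vectorizer.transform(unlabeled_texts))
    best = probs.argmax(axis=1)
    ranked = [
        (text, classifier.classes_[i], float(row[i]))
        for text, i, row in zip(unlabeled_texts, best, probs)
    ]
    # Ascending order: the documents the model is least sure about come first.
    ranked.sort(key=lambda item: item[2])
    return ranked


# Hypothetical toy data standing in for a small manually labeled set.
labeled = ["great movie", "loved it", "terrible plot", "awful acting"]
labels = ["pos", "pos", "neg", "neg"]
unlabeled = ["great acting", "awful movie", "loved the plot"]

ranking = rank_by_confidence(labeled, labels, unlabeled)
for text, label, prob in ranking:
    print(f"{prob:.2f}  {label}  {text}")
```

Reviewing the top of this ranking first puts annotation effort on the examples where a corrected label is likely to help the model most.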
## Requirements
* Python 3.6+
* numpy 1.14.3+
* scikit-learn 0.19.1+
* scipy 1.1.0+

Put your data into the [doccano/data](https://github.com/chakki-works/doccano/tree/master/data) directory.
## Installation
To install doccano, simply run:
```bash
$ git clone https://github.com/chakki-works/doccano.git
$ cd doccano
$ pip install -r requirements.txt
```
## Usage
First, run the web application:
```bash
$ cd doccano/server
$ python run_server.py
```
Then, open <http://localhost:8080> in your browser.