# Get started with doccano

## What is doccano?

doccano is an open-source data labeling tool for machine learning practitioners. It supports several types of labeling tasks and many data formats. You can try doccano on the [demo page](http://doccano.herokuapp.com).

![Demo image](https://raw.githubusercontent.com/doccano/doccano/master/docs/images/demo/demo.gif)

You can also integrate doccano with your own scripts because it exposes its features as REST APIs. Through the APIs you can, for example, label your data with a machine learning model. See the API documentation for details.
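
To give a feel for what scripting against the REST APIs might look like, here is a minimal sketch that prepares a JSON request with Python's standard library. The base URL is the one used in the quick start; the `/v1/auth/login/` path, the credentials, and the helper names `build_json_request` and `send` are assumptions made for illustration, so check the API documentation for the actual routes and authentication scheme.

```python
import json
import urllib.request

# Address from the quick start; change it if your server runs elsewhere.
BASE_URL = "http://localhost:8000"


def build_json_request(path, payload=None, method="GET"):
    """Build (but do not send) a JSON request against the doccano server.

    `path` should be an endpoint listed in the API documentation; the path
    used further down in this sketch is an assumption, not a confirmed route.
    """
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=BASE_URL + path,
        data=data,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method=method,
    )


def send(request):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical login endpoint and placeholder credentials.
login_request = build_json_request(
    "/v1/auth/login/",
    payload={"username": "admin", "password": "change-me"},
    method="POST",
)
```

Calling `send(login_request)` against a running server would perform the request. How follow-up calls are authenticated depends on the server, so treat this only as a starting point.
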

## Labeling workflow with doccano

A labeling project with doccano runs from start to finish in the following steps:

1. Install doccano.
2. Run doccano.
3. Set up the labeling project. Select the type of labeling project and configure the project settings.
4. Import a dataset. You can also import datasets that are already labeled.
5. Add users to the project.
6. Define the annotation guideline.
7. Start labeling the data.
8. Export the labeled dataset.

## Quick start

1. Install doccano:

```bash
pip install doccano
```

2. Run doccano:

```bash
doccano init
doccano createuser
doccano webserver
# In another terminal, run the following command:
doccano task
```

3. Open the doccano UI at <http://localhost:8000>.
4. Sign in with the username and password you created with `doccano createuser`.
5. Click `Create` to create a project and start labeling data.
6. Click `Import dataset` on the dataset page and import the dataset you want to use.
7. Click `Start annotation` and label the data.
8. Click `Export dataset` on the dataset page and export the labeled dataset.
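
Before importing a dataset, you need a file to upload. The sketch below writes a small JSON Lines file, assuming a text classification project that accepts one JSON object per line with `text` and `label` keys. That layout, the sample records, and the helper names `to_jsonl` and `write_dataset` are assumptions for illustration; the import dialog shows the formats your project type actually accepts.

```python
import json
from pathlib import Path

# Hypothetical sample records; replace them with your own texts and labels.
RECORDS = [
    {"text": "Terrible customer service.", "label": ["negative"]},
    {"text": "Really great transaction.", "label": ["positive"]},
]


def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records) + "\n"


def write_dataset(records, path):
    """Write the records to `path` as a UTF-8 encoded JSON Lines file."""
    Path(path).write_text(to_jsonl(records), encoding="utf-8")
```

For example, `write_dataset(RECORDS, "dataset.jsonl")` creates a file you could then select in the import dialog.
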

## Architecture

You can customize doccano to suit your needs. Its architecture consists of two parts, a backend and a frontend:

| Module | Technology | Description |
| ---------------- | ------------------------------------------- | ------------------------------------------ |
| [doccano backend](https://github.com/doccano/doccano/tree/master/backend) | Python, [Django](https://www.djangoproject.com/), and [Django Rest Framework](https://www.django-rest-framework.org/) | Perform data labeling via REST APIs. |
| [doccano frontend](https://github.com/doccano/doccano/tree/master/frontend) | JavaScript web app using [Vue.js](https://vuejs.org/) and [Nuxt.js](https://nuxtjs.org/) | Perform data labeling in a user interface. |

## Contact

For help and feedback, please feel free to contact [the author](https://github.com/Hironsan).