# Developer Guide

The important directories are as follows:

```bash
├── backend/
├── docker/
├── frontend/
└── tools/
```

## backend

The `backend/` directory contains the code for the backend's REST API. The API is built with [Python 3.8+](https://www.python.org/) and [Django 4.0+](https://www.djangoproject.com). All packages are managed with Poetry, a Python packaging and dependency management tool. The directory structure mainly follows the standard [Django](https://www.djangoproject.com) layout. The following table shows the main files and directories:

| File or directory | Description |
| ----------------- | ----------- |
| api/ | Django application. In older versions, this managed all the APIs. Now it contains only an API to check the status of Celery tasks. |
| auto_labeling/ | Django application. This manages the features related to auto labeling. |
| config/ | Django settings. This includes multiple settings files, such as production and development. |
| data_export/ | Django application. This manages the features related to data export. |
| data_import/ | Django application. This manages the features related to data import. |
| examples/ | Django application. This manages the features related to manipulating [examples](https://developers.google.com/machine-learning/glossary#example). |
| label_types/ | Django application. This manages the features related to label types. |
| labels/ | Django application. This manages the features related to labeling. |
| metrics/ | Django application. This manages the features related to project metrics, such as the progress of each user and the label distribution. |
| projects/ | Django application. This manages the features related to project manipulation. A project includes its members, examples, label types, and labels. |
| roles/ | Django application. This manages the features related to roles. There are three roles: administrator, annotator, and approver. These roles are assigned to project members and define their permissions. |
| users/ | Django application. This manages the features related to users. |
| cli.py | Defines the command-line interface. If you install doccano as a Python package, this file is used to set up the database, create a superuser, run the web server, and so on. |
| manage.py | Django management script. See [django-admin and manage.py](https://docs.djangoproject.com/en/4.0/ref/django-admin/) for details. |
| poetry.lock | Poetry's lock file. It pins the versions of your dependencies so that you do not automatically get the latest ones. See [Basic usage](https://python-poetry.org/docs/basic-usage/) in the Poetry documentation. |
| pyproject.toml | Contains the build system requirements and package information that pip uses to build the package. See [pyproject.toml](https://pip.pypa.io/en/stable/reference/build-system/pyproject-toml/) and [The pyproject.toml file in Poetry](https://python-poetry.org/docs/pyproject/) for details. |

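
To make the term "Django application" concrete: Django only loads an application once it is listed in the `INSTALLED_APPS` setting. The following is a minimal sketch of what such a list could look like for the directories above. It assumes each application is registered under its directory name; the actual entries in `config/` may differ (for example, they may use dotted `AppConfig` paths), so treat this as an illustration rather than the project's real settings.

```python
# A sketch of a Django INSTALLED_APPS list, NOT copied from doccano's settings.

# Apps shipped with Django itself (these package names are real).
DJANGO_APPS = [
    "django.contrib.auth",
    "django.contrib.contenttypes",
]

# Assumption: every Django application directory listed in the table
# is registered under its plain directory name.
LOCAL_APPS = [
    "api",
    "auto_labeling",
    "data_export",
    "data_import",
    "examples",
    "label_types",
    "labels",
    "metrics",
    "projects",
    "roles",
    "users",
]

# Django reads this single list at startup to discover models,
# migrations, and management commands.
INSTALLED_APPS = DJANGO_APPS + LOCAL_APPS
```
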
To set up the backend environment, see the [installation guide](./install_and_upgrade_doccano.md#install-from-source).

You can also set the following environment variables:

| Environment variable | Description |
| -------------------- | ----------- |
| SECRET_KEY | A secret key for a particular doccano installation. It is used for cryptographic signing and should be set to a unique, unpredictable value. You should change the fixed default value. See [SECRET_KEY](https://docs.djangoproject.com/en/4.1/ref/settings/#std-setting-SECRET_KEY) for details. |
| DEBUG | A boolean that turns debug mode on or off. If `DEBUG` is `True`, detailed error messages are shown. The default value is `True`. See [DEBUG](https://docs.djangoproject.com/en/4.1/ref/settings/) for details. |
| DATABASE_URL | A string that specifies the database configuration. The URL schema follows [dj-database-url](https://github.com/jazzband/dj-database-url). See that page for details. |
| IMPORT_BATCH_SIZE | A number that specifies the batch size used when importing a dataset. The larger the value, the faster the dataset imports. The default value is `1000`. |
| MAX_UPLOAD_SIZE | A number that specifies the maximum size of an uploaded file, in bytes. The default value is `1073741824` (1024^3 bytes = 1 GiB). |
| ENABLE_FILE_TYPE_CHECK | A boolean that turns the file type check on or off when importing datasets. If `ENABLE_FILE_TYPE_CHECK` is `True`, the MIME types of the files are checked. |
| CELERY_BROKER_URL | A string that points to your broker's service URL. See [Configuration and defaults](https://docs.celeryq.dev/en/stable/userguide/configuration.html) for details. |

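
As a rough illustration of how these variables can be interpreted, the sketch below reads them from a mapping and applies the defaults stated in the table. It is not doccano's settings code: the helpers `env_bool`, `env_int`, and `load_settings` are hypothetical, the default for `ENABLE_FILE_TYPE_CHECK` is an assumption (the table does not give one), and the `DATABASE_URL` is only split with the standard library to show its parts, whereas the backend relies on dj-database-url for this.

```python
import os
from urllib.parse import urlsplit

# Strings treated as "true"; this set is an assumption for the sketch.
TRUE_VALUES = {"1", "true", "yes", "on"}


def env_bool(env, name, default):
    """Hypothetical helper: read a boolean from an environment mapping."""
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in TRUE_VALUES


def env_int(env, name, default):
    """Hypothetical helper: read an integer from an environment mapping."""
    raw = env.get(name)
    return default if raw is None else int(raw)


def load_settings(env=None):
    """Collect the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    settings = {
        "DEBUG": env_bool(env, "DEBUG", True),
        "IMPORT_BATCH_SIZE": env_int(env, "IMPORT_BATCH_SIZE", 1000),
        "MAX_UPLOAD_SIZE": env_int(env, "MAX_UPLOAD_SIZE", 1024 ** 3),  # 1 GiB
        # Assumption: the table states no default; False is a placeholder.
        "ENABLE_FILE_TYPE_CHECK": env_bool(env, "ENABLE_FILE_TYPE_CHECK", False),
    }
    url = env.get("DATABASE_URL")
    if url:
        # Split the URL into its components for illustration only.
        parts = urlsplit(url)
        settings["DATABASE"] = {
            "scheme": parts.scheme,
            "host": parts.hostname,
            "port": parts.port,
            "name": parts.path.lstrip("/"),
        }
    return settings
```

For example, `load_settings({"DEBUG": "False", "DATABASE_URL": "postgres://user:pass@localhost:5432/doccano"})` turns debug mode off and exposes the host, port, and database name from the URL. The credentials and database name in that URL are placeholders.
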
## docker

| File | Description |
| ---- | ----------- |
| nginx/ | Contains the NGINX configuration files. They are used only in `docker-compose.prod.yml`. |
| .env.example | An example `.env` file. It is used only in `docker-compose.prod.yml`. |
| docker-compose.prod.yml | The Docker Compose configuration for running a production environment. It adopts a three-tier architecture. |
| Dockerfile | The Dockerfile. You can pull the image from [doccano/doccano](https://hub.docker.com/r/doccano/doccano). |
| Dockerfile.heroku | The Dockerfile for Heroku. |
| Dockerfile.nginx | The Dockerfile that builds the nginx container. It is used only in `docker-compose.prod.yml`. |
| Dockerfile.prod | The Dockerfile that builds the application container. It is used only in `docker-compose.prod.yml`. |

The architecture of `docker-compose.prod.yml` is as follows:

![](images/developer_guide/architecture_docker_compose.png)

By contrast, the architecture of the `Dockerfile` is as follows:

![](images/developer_guide/architecture_docker.png)

## frontend

The `frontend/` directory contains the frontend code. Its directory structure follows the [Nuxt.js](https://ru.nuxtjs.org) layout. See the [Nuxt.js documentation](https://nuxtjs.org/guide/directory-structure/) for details.

## tools

The `tools/` directory contains shell scripts. They are mainly used in Docker containers:

| File | Description |
| ---- | ----------- |
| create-package.sh | Creates doccano's Python package. Note that yarn and Poetry must already be installed. |
| heroku.sh | Creates Django's superuser on Heroku. |
| prod-celery.sh | Runs Celery in `docker-compose.prod.yml`. |
| prod-flower.sh | Runs Flower in `docker-compose.prod.yml`. |
| prod-django.sh | Runs gunicorn in `docker-compose.prod.yml`. It also creates the roles and a superuser, and runs the migrations. |
| run.sh | Used in the `Dockerfile`. After creating the roles and a superuser, it runs gunicorn and Celery. |

## Architecture of Python package

![](images/developer_guide/architecture_python_package.png)