# Install doccano

Install doccano locally or in the cloud. Choose the installation method that works best for your environment:

- [Install doccano](#install-doccano)
  - [System requirements](#system-requirements)
    - [Web browser support](#web-browser-support)
    - [Port requirements](#port-requirements)
  - [Install with pip](#install-with-pip)
    - [Use PostgreSQL as a database](#use-postgresql-as-a-database)
    - [Use RabbitMQ as a message broker](#use-rabbitmq-as-a-message-broker)
    - [Use Flower to monitor Celery tasks](#use-flower-to-monitor-celery-tasks)
  - [Install with Docker](#install-with-docker)
    - [Build a local image with Docker](#build-a-local-image-with-docker)
    - [Use Flower](#use-flower)
  - [Install with Docker Compose](#install-with-docker-compose)
  - [Install from source](#install-from-source)
    - [Backend](#backend)
    - [Frontend](#frontend)
    - [How to create a Python package](#how-to-create-a-python-package)
  - [Install to cloud](#install-to-cloud)
  - [Upgrade doccano](#upgrade-doccano)
    - [After v1.6.0](#after-v160)
    - [Before v1.6.0](#before-v160)
## System requirements

You can install doccano on a Linux, Windows, or macOS machine running Python 3.8+.

### Web browser support

doccano is tested with the latest version of Google Chrome and is expected to work in the latest versions of:

- Google Chrome
- Apple Safari

If you use another web browser, or an older version of a supported one, you may see unexpected behavior.

### Port requirements

doccano uses port 8000 by default. To use a different port, pass it with the `--port` option when you run `doccano webserver`.
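
As a minimal sketch, the port can be kept in a variable so that it is set in one place. `DOCCANO_PORT` is a name invented for this example; doccano itself does not read it. The snippet only prints the command it would run.

```bash
# DOCCANO_PORT is a hypothetical variable name used only in this sketch.
# Fall back to doccano's default port when it is unset or empty.
PORT="${DOCCANO_PORT:-8000}"

# Show the command that would start the web server on that port.
echo "doccano webserver --port ${PORT}"
```

If you adopt something like this, remember that the address you open in the browser changes with the port.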

## Install with pip

To install doccano with pip, you need Python 3.8+. Run the following:

```bash
pip install doccano
```

After you install doccano, start the server with the following commands:

```bash
# Initialize database. First time only.
doccano init
# Create a super user. First time only.
doccano createuser --username admin --password pass
# Start a web server.
doccano webserver --port 8000
```

In another terminal, run the following command:

```bash
# Start the task queue to handle file upload/download.
doccano task
```

Open <http://localhost:8000/>.

### Use PostgreSQL as a database

By default, doccano uses SQLite 3 as its database. You can also use other database systems such as PostgreSQL or MySQL. This section shows how to use PostgreSQL.

First, install `psycopg2-binary` as an additional dependency:

```bash
pip install psycopg2-binary
```

Next, set up PostgreSQL. You can install PostgreSQL directly, but here we use Docker. Run the `docker run` command with the user name (`POSTGRES_USER`), password (`POSTGRES_PASSWORD`), and database name (`POSTGRES_DB`). For other options, refer to the [official documentation](https://hub.docker.com/_/postgres).

```bash
docker run -d \
  --name doccano-postgres \
  -e POSTGRES_USER=doccano_admin \
  -e POSTGRES_PASSWORD=doccano_pass \
  -e POSTGRES_DB=doccano \
  -v doccano-db:/var/lib/postgresql/data \
  -p 5432:5432 \
  postgres:13.8-alpine
```

Then, set the `DATABASE_URL` environment variable according to your PostgreSQL credentials. The URL format follows dj-database-url. Refer to the [official documentation](https://github.com/jazzband/dj-database-url) for details.

```bash
# export DATABASE_URL="postgres://${POSTGRES_USER}:${POSTGRES_PASSWORD}@${POSTGRES_HOST}:${POSTGRES_PORT}/${POSTGRES_DB}?sslmode=disable"
export DATABASE_URL="postgres://doccano_admin:doccano_pass@localhost:5432/doccano?sslmode=disable"
```

That's it. Now you can initialize the database by running the `doccano init` command.
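
If your credentials differ from the example, the commented template above can be filled in from shell variables. The following is a sketch under the assumption that the values match the `docker run` example; `POSTGRES_HOST` and `POSTGRES_PORT` are not read by doccano and exist here only to build the URL.

```bash
# Assumed values: they mirror the `docker run` example. Replace them with your own.
POSTGRES_USER="doccano_admin"
POSTGRES_PASSWORD="doccano_pass"
POSTGRES_HOST="localhost"
POSTGRES_PORT="5432"
POSTGRES_DB="doccano"

# Double quotes let the shell expand each ${...} placeholder.
export DATABASE_URL="postgres://${POSTGRES_USER}:${POSTGRES_PASSWORD}@${POSTGRES_HOST}:${POSTGRES_PORT}/${POSTGRES_DB}?sslmode=disable"

# Print the result so you can check it before running `doccano init`.
echo "${DATABASE_URL}"
```

Keeping the parts in separate variables makes it harder to end up with a URL that disagrees with the container's settings.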

### Use RabbitMQ as a message broker

doccano uses Celery and a message broker to handle long-running tasks such as importing and exporting datasets. By default, SQLite3 is used as the message broker. You can also use other message brokers such as RabbitMQ or Redis. This section shows how to use RabbitMQ.

First, set up RabbitMQ. You can install RabbitMQ directly, but here we use Docker. Run the `docker run` command with the user name (`RABBITMQ_DEFAULT_USER`) and password (`RABBITMQ_DEFAULT_PASS`). For other options, refer to the [official documentation](https://hub.docker.com/_/rabbitmq).

```bash
docker run -d \
  --hostname doccano \
  --name doccano-rabbit \
  -e RABBITMQ_DEFAULT_USER=doccano_rabit \
  -e RABBITMQ_DEFAULT_PASS=doccano_pass \
  -p 5672:5672 \
  rabbitmq:3.10.7-alpine
```

Then, set the `CELERY_BROKER_URL` environment variable according to your RabbitMQ credentials. For the URL format, refer to the [official documentation](https://docs.celeryq.dev/en/stable/userguide/configuration.html#broker-settings).

```bash
# export CELERY_BROKER_URL="amqp://${RABBITMQ_DEFAULT_USER}:${RABBITMQ_DEFAULT_PASS}@localhost:5672//"
export CELERY_BROKER_URL='amqp://doccano_rabit:doccano_pass@localhost:5672//'
```

That's it. Now you can start the web server and the task queue by running the `doccano webserver` and `doccano task` commands. Note that both commands need the `DATABASE_URL` and `CELERY_BROKER_URL` environment variables if you changed them, so export the variables in each terminal.
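
Because the web server and the task queue usually run in separate terminals, it is easy to export the variables in only one of them. The check below is a minimal sketch; `require_env` is a hypothetical helper written for this example, not a doccano command.

```bash
# require_env is a hypothetical helper, not part of doccano.
# It returns 0 only if every named variable is set to a non-empty value.
require_env() {
  missing=0
  for name in "$@"; do
    # Read the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "${value}" ]; then
      echo "missing environment variable: ${name}" >&2
      missing=1
    fi
  done
  return "${missing}"
}

if require_env DATABASE_URL CELERY_BROKER_URL; then
  echo "ready: run 'doccano webserver' or 'doccano task' from this shell"
else
  echo "export the missing variables in this terminal first" >&2
fi
```

Run the check in each terminal before starting its command.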

### Use Flower to monitor Celery tasks

If you want to monitor and manage Celery tasks, you can use [Flower](https://flower.readthedocs.io/en/latest/index.html). The `--basic_auth` option accepts _user:password_ pairs separated by commas. If it is configured, any client that tries to access this Flower instance is prompted for the credentials given in this argument:

```bash
doccano flower --basic_auth=user1:password1,user2:password2
```

Open <http://localhost:5555/>.

## Install with Docker

doccano is also available as a [Docker](https://www.docker.com/) container. Make sure you have Docker installed on your machine.

To create a doccano container that serves <http://localhost:8000>, run the following commands:

```bash
docker pull doccano/doccano
docker container create --name doccano \
  -e "ADMIN_USERNAME=admin" \
  -e "ADMIN_EMAIL=admin@example.com" \
  -e "ADMIN_PASSWORD=password" \
  -v doccano-db:/data \
  -p 8000:8000 doccano/doccano
```

Next, start doccano by running the container:

```bash
docker container start doccano
```

To stop the container, run `docker container stop doccano -t 5`.
All data created in the container persists across restarts.

If you want to use the latest features, specify the `nightly` tag:

```bash
docker pull doccano/doccano:nightly
```

### Build a local image with Docker

If you want to build a local image, run:

```bash
docker build -t doccano:latest . -f docker/Dockerfile
```

### Use Flower

Set the `FLOWER_BASIC_AUTH` environment variable and publish port `5555`. The variable accepts _user:password_ pairs separated by commas.

```bash
docker container create --name doccano \
  -e "ADMIN_USERNAME=admin" \
  -e "ADMIN_EMAIL=admin@example.com" \
  -e "ADMIN_PASSWORD=password" \
  -e "FLOWER_BASIC_AUTH=username:password" \
  -v doccano-db:/data \
  -p 8000:8000 -p 5555:5555 doccano/doccano
```

## Install with Docker Compose

You need to install Git and clone the repository:

```bash
git clone https://github.com/doccano/doccano.git
cd doccano
```

To install and start doccano at <http://localhost>, run the following command:

```bash
docker-compose -f docker/docker-compose.prod.yml --env-file .env up
```

You can override the default settings by editing the `.env` file. See [./docker/.env.example](https://github.com/doccano/doccano/blob/master/docker/.env.example) for details.

## Install from source

If you want to develop doccano, consider downloading the source code using Git and running doccano locally. First of all, clone the repository:

```bash
git clone https://github.com/doccano/doccano.git
cd doccano
```

### Backend

The doccano backend is written in Python 3.8+ and uses [Poetry](https://github.com/python-poetry/poetry) as a dependency manager. If you haven't installed them yet, see the [Python](https://www.python.org/downloads/) and [Poetry](https://python-poetry.org/docs/) documentation.

First, install the project's dependencies by running the `install` command. After that, activate the virtual environment by running the `shell` command:

```bash
cd backend
poetry install
poetry shell
```

Second, set up the database and run the development server. doccano uses [Django](https://www.djangoproject.com/) and [Django Rest Framework](https://www.django-rest-framework.org/) for the backend. You can set them up with Django commands:

```bash
python manage.py migrate
python manage.py create_roles
python manage.py create_admin --noinput --username "admin" --email "admin@example.com" --password "password"
python manage.py runserver
```

In another terminal, you need to run Celery to use the dataset import/export feature:

```bash
cd doccano/backend
celery --app=config worker --loglevel=INFO --concurrency=1
```

After you change the code, don't forget to run [mypy](https://mypy.readthedocs.io/en/stable/index.html), [flake8](https://flake8.pycqa.org/en/latest/), [black](https://github.com/psf/black), and [isort](https://github.com/PyCQA/isort). These ensure code consistency. To run them, use the following commands:

```bash
poetry run task mypy
poetry run task flake8
poetry run task black
poetry run task isort
```

Similarly, you can run the tests by executing the following command:

```bash
poetry run task test
```

Did you pass the test? Great!

### Frontend

The doccano frontend runs on Node.js and uses [Yarn](https://yarnpkg.com/) as a package manager. If you haven't installed them yet, see the [Node.js](https://nodejs.org/en/) and [Yarn](https://yarnpkg.com/) documentation.

First, install the project's dependencies by running the `install` command:

```bash
cd frontend
yarn install
```

Then run the `dev` command to serve with hot reload at <http://localhost:3000>:

```bash
yarn dev
```

After you change the code, don't forget to run the following commands to ensure code consistency:

```bash
yarn lintfix
yarn precommit
yarn fix:prettier
```

### How to create a Python package

During development, you may want to create a Python package and verify that it works correctly. In that case, create a package by running the following command in the root directory of the project:

```bash
./tools/create-package.sh
```

This command builds the frontend, copies the files, and packages them. This takes a few minutes. When the command finishes, you will find the `sdist` and `wheel` in `backend/dist`:

```bash
Building doccano (1.5.5.post335.dev0+6be6d198)
- Building sdist
- Built doccano-1.5.5.post335.dev0+6be6d198.tar.gz
- Building wheel
- Built doccano-1.5.5.post335.dev0+6be6d198-py3-none-any.whl
```

Then, you can install the package with the `pip install` command:

```bash
pip install doccano-1.5.5.post335.dev0+6be6d198-py3-none-any.whl
```
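
The version string in the file name changes with every build, so typing it by hand gets tedious. As a rough sketch, the newest wheel can be looked up instead. `latest_wheel` is a hypothetical helper written for this example and is not part of doccano's tooling; it assumes the wheels follow the `doccano-*.whl` naming shown in the build output.

```bash
# latest_wheel is a hypothetical helper, not part of doccano's tooling.
# It prints the most recently modified doccano wheel in the given directory,
# or nothing if the directory holds no matching file.
latest_wheel() {
  dist_dir="${1:-backend/dist}"
  ls -t "${dist_dir}"/doccano-*.whl 2>/dev/null | head -n 1
}

wheel="$(latest_wheel backend/dist)"
if [ -n "${wheel}" ]; then
  # Print the install command instead of running it, so you can review it first.
  echo "pip install ${wheel}"
else
  echo "no wheel found; run ./tools/create-package.sh first" >&2
fi
```

Run it from the root directory of the project, the same place where you ran `./tools/create-package.sh`.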

## Install to cloud

doccano also supports one-click deployment to cloud providers. Click one of the following buttons, configure the environment, and access the UI.

| Service | Button |
|---------|---|
| AWS | [![AWS CloudFormation Launch Stack SVG Button](https://cdn.rawgit.com/buildkite/cloudformation-launch-stack-button-svg/master/launch-stack.svg)](https://console.aws.amazon.com/cloudformation/home?#/stacks/new?stackName=doccano&templateURL=https://doccano.s3.amazonaws.com/public/cloudformation/template.aws.yaml) |
| Heroku | [![Deploy](https://www.herokucdn.com/deploy/button.svg)](https://dashboard.heroku.com/new?template=https%3A%2F%2Fgithub.com%2Fdoccano%2Fdoccano) |

## Upgrade doccano

Caution: If you use SQLite3 as a database, upgrading the package can cause you to lose your database.

The `migrate` command has been supported since v1.6.0.

### After v1.6.0

To upgrade to the latest version of doccano, reinstall or upgrade using pip:

```bash
pip install -U doccano
```

If you need to update the database schema, run the following:

```bash
doccano migrate
```

### Before v1.6.0

First, if you use SQLite3, copy the database file and the media directory:

```bash
mkdir -p ~/doccano
# Replace the paths with your own.
cp venv/lib/python3.8/site-packages/backend/db.sqlite3 ~/doccano/
cp -r venv/lib/python3.8/site-packages/backend/media ~/doccano/
```
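
Because the source path depends on your virtual environment and Python version, a wrong path can go unnoticed until the data is already gone. The function below is a minimal sketch that stops early when the database file is not where you expect it. `backup_doccano` is a hypothetical helper written for this example, not a doccano command.

```bash
# backup_doccano is a hypothetical helper, not a doccano command.
# Usage: backup_doccano <backend-dir> <backup-dir>
backup_doccano() {
  src="$1"
  dest="$2"
  # Refuse to continue if the database file is not at the given path.
  if [ ! -f "${src}/db.sqlite3" ]; then
    echo "no db.sqlite3 in ${src}; check the path" >&2
    return 1
  fi
  mkdir -p "${dest}"
  cp "${src}/db.sqlite3" "${dest}/"
  # Copy the media directory only if it exists.
  if [ -d "${src}/media" ]; then
    cp -r "${src}/media" "${dest}/"
  fi
}

# Example call, using the same paths as the commands above (adjust to your setup):
# backup_doccano venv/lib/python3.8/site-packages/backend ~/doccano
```

Only upgrade the package once the function has returned without an error.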

Then, upgrade the package:

```bash
pip install -U doccano
```

Finally, run the migration:

```bash
doccano migrate
```