  1. # Install doccano
  2. Install doccano locally or in the cloud. Choose the installation method that works best for your environment:
  3. - [Install doccano](#install-doccano)
  4. - [System requirements](#system-requirements)
  5. - [Web browser support](#web-browser-support)
  6. - [Port requirements](#port-requirements)
  7. - [Install with pip](#install-with-pip)
  8. - [Use PostgreSQL as a database](#use-postgresql-as-a-database)
  9. - [Use RabbitMQ as a message broker](#use-rabbitmq-as-a-message-broker)
  10. - [Use Flower to monitor Celery tasks](#use-flower-to-monitor-celery-tasks)
  11. - [Install with Docker](#install-with-docker)
  12. - [Build a local image with Docker](#build-a-local-image-with-docker)
  13. - [Use Flower](#use-flower)
  14. - [Install with Docker Compose](#install-with-docker-compose)
  15. - [Install from source](#install-from-source)
  16. - [Backend](#backend)
  17. - [Frontend](#frontend)
  18. - [How to create a Python package](#how-to-create-a-python-package)
  19. - [Install to cloud](#install-to-cloud)
  20. - [Upgrade doccano](#upgrade-doccano)
  21. - [After v1.6.0](#after-v160)
  22. - [Before v1.6.0](#before-v160)
  23. ## System requirements
  24. You can install doccano on a Linux, Windows, or macOS machine running Python 3.8+.
  25. ### Web browser support
  26. doccano is tested with the latest version of Google Chrome and is expected to work in the latest versions of:
  27. - Google Chrome
  28. - Apple Safari
  29. If using other web browsers, or older versions of supported web browsers, unexpected behavior could occur.
  30. ### Port requirements
  31. doccano uses port 8000 by default. To use a different port, pass the `--port` option when running `doccano webserver`.
  32. ## Install with pip
  33. To install doccano with pip, you need Python 3.8+. Run the following:
  34. ```bash
  35. pip install doccano
  36. ```
  37. After you install doccano, start the server with the following command:
  38. ```bash
  39. # Initialize database. First time only.
  40. doccano init
  41. # Create a super user. First time only.
  42. doccano createuser --username admin --password pass
  43. # Start a web server.
  44. doccano webserver --port 8000
  45. ```
  46. In another terminal, run the following command:
  47. ```bash
  48. # Start the task queue to handle file upload/download.
  49. doccano task
  50. ```
  51. Open <http://localhost:8000/>.
  52. ### Use PostgreSQL as a database
  53. By default, doccano uses SQLite 3 as its database. You can also use other database systems such as PostgreSQL or MySQL. This section shows how to use PostgreSQL.
  54. First, you need to install `psycopg2-binary` as an additional dependency:
  55. ```bash
  56. pip install psycopg2-binary
  57. ```
  58. Next, set up PostgreSQL. You can install PostgreSQL directly, but this example uses Docker. Run the `docker run` command with the user name (`POSTGRES_USER`), password (`POSTGRES_PASSWORD`), and database name (`POSTGRES_DB`). For other options, refer to the [official documentation](https://hub.docker.com/_/postgres).
  59. ```bash
  60. docker run -d \
  61. --name doccano-postgres \
  62. -e POSTGRES_USER=doccano_admin \
  63. -e POSTGRES_PASSWORD=doccano_pass \
  64. -e POSTGRES_DB=doccano \
  65. -v doccano-db:/var/lib/postgresql/data \
  66. -p 5432:5432 \
  67. postgres:13.8-alpine
  68. ```
  69. Then, set the `DATABASE_URL` environment variable according to your PostgreSQL credentials. The URL format follows dj-database-url. Refer to the [official documentation](https://github.com/jazzband/dj-database-url) for details.
  70. ```bash
  71. # export DATABASE_URL="postgres://${POSTGRES_USER}:${POSTGRES_PASSWORD}@${POSTGRES_HOST}:${POSTGRES_PORT}/${POSTGRES_DB}?sslmode=disable"
  72. export DATABASE_URL="postgres://doccano_admin:doccano_pass@localhost:5432/doccano?sslmode=disable"
  73. ```
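If you prefer to keep the credentials in separate variables, the commented template above can be sketched as follows. The values mirror the `docker run` example. `POSTGRES_HOST` and `POSTGRES_PORT` are plain shell variables taken from that template and are assumed here not to be read by doccano itself. If your password contains characters with special meaning in URLs, it will probably need to be percent-encoded.

```bash
# Connection parameters; these values mirror the `docker run` example above.
POSTGRES_USER="doccano_admin"
POSTGRES_PASSWORD="doccano_pass"
POSTGRES_HOST="localhost"
POSTGRES_PORT="5432"
POSTGRES_DB="doccano"

# Double quotes are required so that the shell expands the variables.
export DATABASE_URL="postgres://${POSTGRES_USER}:${POSTGRES_PASSWORD}@${POSTGRES_HOST}:${POSTGRES_PORT}/${POSTGRES_DB}?sslmode=disable"

# Print the result so you can check it before starting doccano.
echo "${DATABASE_URL}"
```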
  74. That's it. Now you can initialize the database by running the `doccano init` command.
  75. ### Use RabbitMQ as a message broker
  76. doccano uses Celery and a message broker to handle long-running tasks such as importing and exporting datasets. By default, SQLite3 is used as the message broker. You can also use other message brokers such as RabbitMQ or Redis. This section shows how to use RabbitMQ.
  77. First, set up RabbitMQ. You can install RabbitMQ directly, but this example uses Docker. Run the `docker run` command with the user name (`RABBITMQ_DEFAULT_USER`) and password (`RABBITMQ_DEFAULT_PASS`). For other options, refer to the [official documentation](https://hub.docker.com/_/rabbitmq).
  78. ```bash
  79. docker run -d \
  80. --hostname doccano \
  81. --name doccano-rabbit \
  82. -e RABBITMQ_DEFAULT_USER=doccano_rabit \
  83. -e RABBITMQ_DEFAULT_PASS=doccano_pass \
  84. -p 5672:5672 \
  85. rabbitmq:3.10.7-alpine
  86. ```
  87. Then, set the `CELERY_BROKER_URL` environment variable according to your RabbitMQ credentials. For the URL format, refer to the [official documentation](https://docs.celeryq.dev/en/stable/userguide/configuration.html#broker-settings).
  88. ```bash
  89. # export CELERY_BROKER_URL="amqp://${RABBITMQ_DEFAULT_USER}:${RABBITMQ_DEFAULT_PASS}@localhost:5672//"
  90. export CELERY_BROKER_URL='amqp://doccano_rabit:doccano_pass@localhost:5672//'
  91. ```
  92. That's it. Now you can start the web server and task queue by running the `doccano webserver` and `doccano task` commands. Note that both commands need the `DATABASE_URL` and `CELERY_BROKER_URL` environment variables if you changed them from the defaults.
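Because the web server and the task queue run in separate terminals, one way to keep their settings in sync is to store the exports in a file and source it in each terminal. The following is a minimal sketch, assuming you configured both PostgreSQL and RabbitMQ as shown above. The file name `doccano.env` is hypothetical, and doccano does not read it on its own.

```bash
# Write the settings once. The file name doccano.env is hypothetical.
cat > doccano.env <<'EOF'
export DATABASE_URL="postgres://doccano_admin:doccano_pass@localhost:5432/doccano?sslmode=disable"
export CELERY_BROKER_URL="amqp://doccano_rabit:doccano_pass@localhost:5672//"
EOF

# Load the settings into the current shell. Repeat this line in every terminal.
. ./doccano.env

# Terminal 1 would then run: doccano webserver --port 8000
# Terminal 2 would then run: doccano task
echo "${CELERY_BROKER_URL}"
```

Since the file contains credentials, consider restricting its permissions and keeping it out of version control.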
  93. ### Use Flower to monitor Celery tasks
  94. If you want to monitor and manage Celery tasks, you can use [Flower](https://flower.readthedocs.io/en/latest/index.html). The `--basic_auth` option accepts _user:password_ pairs separated by commas. If configured, any client trying to access this Flower instance will be prompted for the credentials specified in this argument:
  95. ```bash
  96. doccano flower --basic_auth=user1:password1,user2:password2
  97. ```
  98. Open <http://localhost:5555/>.
  99. ## Install with Docker
  100. doccano is also available as a [Docker](https://www.docker.com/) container. Make sure you have Docker installed on your machine.
  101. To install and start doccano at <http://localhost:8000>, run the following command:
  102. ```bash
  103. docker pull doccano/doccano
  104. docker container create --name doccano \
  105. -e "ADMIN_USERNAME=admin" \
  106. -e "ADMIN_EMAIL=admin@example.com" \
  107. -e "ADMIN_PASSWORD=password" \
  108. -v doccano-db:/data \
  109. -p 8000:8000 doccano/doccano
  110. ```
  111. Next, start doccano by running the container:
  112. ```bash
  113. docker container start doccano
  114. ```
  115. To stop the container, run `docker container stop doccano -t 5`.
  116. All data created in the container persist across restarts.
  117. If you want to use the latest features, specify the `nightly` tag:
  118. ```bash
  119. docker pull doccano/doccano:nightly
  120. ```
  121. ### Build a local image with Docker
  122. If you want to build a local image, run:
  123. ```bash
  124. docker build -t doccano:latest . -f docker/Dockerfile
  125. ```
  126. ### Use Flower
  127. Set the `FLOWER_BASIC_AUTH` environment variable and publish port `5555`. The variable accepts _user:password_ pairs separated by commas.
  128. ```bash
  129. docker container create --name doccano \
  130. -e "ADMIN_USERNAME=admin" \
  131. -e "ADMIN_EMAIL=admin@example.com" \
  132. -e "ADMIN_PASSWORD=password" \
  133. -e "FLOWER_BASIC_AUTH=username:password" \
  134. -v doccano-db:/data \
  135. -p 8000:8000 -p 5555:5555 doccano/doccano
  136. ```
  137. ## Install with Docker Compose
  138. You need to install Git and clone the repository:
  139. ```bash
  140. git clone https://github.com/doccano/doccano.git
  141. cd doccano
  142. ```
  143. To install and start doccano at <http://localhost>, run the following command:
  144. ```bash
  145. cd docker
  146. cp .env.example .env
  147. # Edit with the editor of your choice, in this example nano is used (ctrl+x, then "y" to save).
  148. nano .env
  149. docker-compose -f docker-compose.prod.yml --env-file .env up
  150. ```
  151. You can override the default settings by editing the `.env` file. See [./docker/.env.example](https://github.com/doccano/doccano/blob/master/docker/.env.example) for details.
  152. ## Install from source
  153. If you want to develop doccano, consider downloading the source code using Git and running doccano locally. First of all, clone the repository:
  154. ```bash
  155. git clone https://github.com/doccano/doccano.git
  156. cd doccano
  157. ```
  158. ### Backend
  159. The doccano backend is built in Python 3.8+ and uses [Poetry](https://github.com/python-poetry/poetry) as a dependency manager. If you haven't installed them yet, please see [Python](https://www.python.org/downloads/) and [Poetry](https://python-poetry.org/docs/) documentation.
  160. First, install the project's dependencies by running the `install` command. After that, activate the virtual environment by running the `shell` command:
  161. ```bash
  162. cd backend
  163. poetry install
  164. poetry shell
  165. ```
  166. Second, set up the database and run the development server. doccano uses [Django](https://www.djangoproject.com/) and [Django Rest Framework](https://www.django-rest-framework.org/) as a backend. You can set them up using Django commands:
  167. ```bash
  168. python manage.py migrate
  169. python manage.py create_roles
  170. python manage.py create_admin --noinput --username "admin" --email "admin@example.com" --password "password"
  171. python manage.py runserver
  172. ```
  173. In another terminal, run Celery to enable the dataset import/export feature:
  174. ```bash
  175. cd doccano/backend
  176. celery --app=config worker --loglevel=INFO --concurrency=1
  177. ```
  178. After you change the code, don't forget to run [mypy](https://mypy.readthedocs.io/en/stable/index.html), [flake8](https://flake8.pycqa.org/en/latest/), [black](https://github.com/psf/black), and [isort](https://github.com/PyCQA/isort). These ensure code consistency. To run them, just run the following commands:
  179. ```bash
  180. poetry run task mypy
  181. poetry run task flake8
  182. poetry run task black
  183. poetry run task isort
  184. ```
  185. Similarly, you can run the tests by executing the following command:
  186. ```bash
  187. poetry run task test
  188. ```
  189. Did you pass the test? Great!
  190. ### Frontend
  191. The doccano frontend is built in Node.js and uses [Yarn](https://yarnpkg.com/) as a package manager. If you haven't installed them yet, please see [Node.js](https://nodejs.org/en/) and [Yarn](https://yarnpkg.com/) documentation.
  192. First, install the project's dependencies by running the `install` command:
  193. ```bash
  194. cd frontend
  195. yarn install
  196. ```
  197. Then run the `dev` command to serve with hot reload at <http://localhost:3000>:
  198. ```bash
  199. yarn dev
  200. ```
  201. After you change the code, don't forget to run the following commands to ensure code consistency:
  203. ```bash
  204. yarn lintfix
  205. yarn precommit
  206. yarn fix:prettier
  207. ```
  208. ### How to create a Python package
  209. During development, you may want to create a Python package and verify it works correctly. In such a case, you can create a package by running the following command in the root directory of your project:
  210. ```bash
  211. ./tools/create-package.sh
  212. ```
  213. This command builds the frontend, copies the files, and packages them. It takes a few minutes. After the command finishes, you will find the `sdist` and `wheel` files in `backend/dist`:
  214. ```bash
  215. Building doccano (1.5.5.post335.dev0+6be6d198)
  216. - Building sdist
  217. - Built doccano-1.5.5.post335.dev0+6be6d198.tar.gz
  218. - Building wheel
  219. - Built doccano-1.5.5.post335.dev0+6be6d198-py3-none-any.whl
  220. ```
  221. Then, you can install the package with the `pip install` command:
  222. ```bash
  223. pip install doccano-1.5.5.post335.dev0+6be6d198-py3-none-any.whl
  224. ```
  225. ## Install to cloud
  226. doccano also supports one-click deployment to cloud providers. Click one of the following buttons, configure the environment, and access the UI.
  227. | Service | Button |
  228. |---------|---|
  229. | AWS | [![AWS CloudFormation Launch Stack SVG Button](https://cdn.rawgit.com/buildkite/cloudformation-launch-stack-button-svg/master/launch-stack.svg)](https://console.aws.amazon.com/cloudformation/home?#/stacks/new?stackName=doccano&templateURL=https://doccano.s3.amazonaws.com/public/cloudformation/template.aws.yaml) |
  230. | Heroku | [![Deploy](https://www.herokucdn.com/deploy/button.svg)](https://dashboard.heroku.com/new?template=https%3A%2F%2Fgithub.com%2Fdoccano%2Fdoccano) |
  231. ## Upgrade doccano
  232. Caution: If you use SQLite3 as a database, upgrading the package may cause you to lose your database.
  233. The migrate command has been supported since v1.6.0.
  234. ### After v1.6.0
  235. To upgrade to the latest version of doccano, reinstall or upgrade using pip.
  236. ```bash
  237. pip install -U doccano
  238. ```
  239. If you need to update the database schema, run the following:
  240. ```bash
  241. doccano migrate
  242. ```
  243. ### Before v1.6.0
  244. If you use SQLite3, first copy the database file and media directory to a safe location:
  245. ```bash
  246. mkdir -p ~/doccano
  247. # Replace your path.
  248. cp venv/lib/python3.8/site-packages/backend/db.sqlite3 ~/doccano/
  249. cp -r venv/lib/python3.8/site-packages/backend/media ~/doccano/
  250. ```
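The copy step above can be wrapped in a small helper that stops early when the database file is missing, so that you do not upgrade without a backup. This is a minimal sketch; the function name `backup_doccano` is hypothetical and is not part of doccano. It assumes the same layout as the commands above, with `db.sqlite3` and `media` in one directory.

```bash
# backup_doccano SRC_DIR DEST_DIR
# Copies db.sqlite3 and the media directory from SRC_DIR into DEST_DIR.
backup_doccano() {
  src="$1"
  dest="$2"
  # Refuse to continue if there is nothing to back up.
  if [ ! -f "${src}/db.sqlite3" ]; then
    echo "No db.sqlite3 found in ${src}" >&2
    return 1
  fi
  mkdir -p "${dest}"
  cp "${src}/db.sqlite3" "${dest}/"
  # Copy the media directory only if it is present in SRC_DIR.
  if [ -d "${src}/media" ]; then
    cp -r "${src}/media" "${dest}/"
  fi
}

# Example call; replace the source path with the one for your environment.
# backup_doccano venv/lib/python3.8/site-packages/backend ~/doccano
```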
  251. Then, upgrade the package:
  252. ```bash
  253. pip install -U doccano
  254. ```
  255. Finally, run the migration:
  256. ```bash
  257. doccano migrate
  258. ```