# Install doccano

Install doccano locally or in the cloud. Choose the installation method that works best for your environment:

- [Install doccano](#install-doccano)
  - [System requirements](#system-requirements)
    - [Web browser support](#web-browser-support)
    - [Port requirements](#port-requirements)
  - [Install with pip](#install-with-pip)
    - [Use PostgreSQL as a database](#use-postgresql-as-a-database)
    - [Use RabbitMQ as a message broker](#use-rabbitmq-as-a-message-broker)
  - [Install with Docker](#install-with-docker)
    - [Build a local image with Docker](#build-a-local-image-with-docker)
  - [Install with Docker Compose](#install-with-docker-compose)
  - [Install from source](#install-from-source)
    - [Backend](#backend)
    - [Frontend](#frontend)
    - [How to create a Python package](#how-to-create-a-python-package)
  - [Install to cloud](#install-to-cloud)
  - [Upgrade doccano](#upgrade-doccano)
    - [After v1.6.0](#after-v160)
    - [Before v1.6.0](#before-v160)

## System requirements

You can install doccano on a Linux, Windows, or macOS machine running Python 3.8+.

### Web browser support

doccano is tested with the latest version of Google Chrome and is expected to work in the latest versions of:

- Google Chrome
- Apple Safari

If you use other web browsers, or older versions of supported web browsers, unexpected behavior could occur.

### Port requirements

doccano uses port 8000 by default. To use a different port, pass the `--port` option when you run `doccano webserver`.
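
As a minimal sketch of choosing another port, the helper below validates a port number and prints the matching start command. `webserver_cmd_for_port` is a hypothetical name used only for illustration; only the `--port` option comes from doccano itself.

```bash
# Hypothetical helper: print the doccano start command for a given port.
webserver_cmd_for_port() {
  local port="${1:-8000}"   # fall back to the default port
  case "$port" in
    *[!0-9]*|??????*)       # reject non-digits and numbers with 6+ digits
      echo "invalid port: $port" >&2
      return 1
      ;;
  esac
  if [ "$port" -lt 1 ] || [ "$port" -gt 65535 ]; then
    echo "port out of range: $port" >&2
    return 1
  fi
  echo "doccano webserver --port $port"
}

# Usage: print the command for port 8080, then run it yourself.
webserver_cmd_for_port 8080
```
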
## Install with pip

To install doccano with pip, you need Python 3.8+. Run the following:

```bash
pip install doccano
```

After you install doccano, initialize it and start the server with the following commands:

```bash
# Initialize database. First time only.
doccano init
# Create a super user. First time only.
doccano createuser --username admin --password pass
# Start a web server.
doccano webserver --port 8000
```

In another terminal, run the following command:

```bash
# Start the task queue to handle file upload/download.
doccano task
```

Open <http://localhost:8000/>.

### Use PostgreSQL as a database

By default, doccano uses SQLite 3 as its database. You can also use other database systems such as PostgreSQL or MySQL. This section shows how to use PostgreSQL.

First, install `psycopg2-binary` as an additional dependency:

```bash
pip install psycopg2-binary
```

Next, set up PostgreSQL. You can install PostgreSQL directly, but this guide uses Docker. Run the `docker run` command with the user name (`POSTGRES_USER`), password (`POSTGRES_PASSWORD`), and database name (`POSTGRES_DB`). For other options, refer to the [official documentation](https://hub.docker.com/_/postgres).

```bash
docker run -d \
  --name doccano-postgres \
  -e POSTGRES_USER=doccano_admin \
  -e POSTGRES_PASSWORD=doccano_pass \
  -e POSTGRES_DB=doccano \
  -v doccano-db:/var/lib/postgresql/data \
  -p 5432:5432 \
  postgres:13.8-alpine
```

Then, set the `DATABASE_URL` environment variable according to your PostgreSQL credentials. The URL format follows dj-database-url. Refer to the [official documentation](https://github.com/jazzband/dj-database-url) for details.

```bash
# export DATABASE_URL="postgres://${POSTGRES_USER}:${POSTGRES_PASSWORD}@${POSTGRES_HOST}:${POSTGRES_PORT}/${POSTGRES_DB}?sslmode=disable"
export DATABASE_URL="postgres://doccano_admin:doccano_pass@localhost:5432/doccano?sslmode=disable"
```

That's it. Now you can start by running the `doccano init` command.
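
The commented template above can also be filled in from separate variables. The sketch below assumes the credentials from the `docker run` example and a PostgreSQL server on `localhost:5432`; adjust the values for your own setup.

```bash
# Connection settings; these values match the docker run example above.
POSTGRES_USER="doccano_admin"
POSTGRES_PASSWORD="doccano_pass"
POSTGRES_HOST="localhost"
POSTGRES_PORT="5432"
POSTGRES_DB="doccano"

# Assemble the URL from its parts and export it for doccano.
export DATABASE_URL="postgres://${POSTGRES_USER}:${POSTGRES_PASSWORD}@${POSTGRES_HOST}:${POSTGRES_PORT}/${POSTGRES_DB}?sslmode=disable"
echo "$DATABASE_URL"
```

If the password contains characters that are special in URLs, such as `@` or `/`, it most likely needs to be percent-encoded before it goes into the URL.
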
### Use RabbitMQ as a message broker

doccano uses Celery and a message broker to handle long-running tasks such as importing and exporting datasets. By default, SQLite3 is used as the message broker. You can also use other message brokers such as RabbitMQ or Redis. This section shows how to use RabbitMQ.

First, set up RabbitMQ. You can install RabbitMQ directly, but this guide uses Docker. Run the `docker run` command with the user name (`RABBITMQ_DEFAULT_USER`) and password (`RABBITMQ_DEFAULT_PASS`). For other options, refer to the [official documentation](https://hub.docker.com/_/rabbitmq).

```bash
docker run -d \
  --hostname doccano \
  --name doccano-rabbit \
  -e RABBITMQ_DEFAULT_USER=doccano_rabit \
  -e RABBITMQ_DEFAULT_PASS=doccano_pass \
  -p 5672:5672 \
  rabbitmq:3.10.7-alpine
```

Then, set the `CELERY_BROKER_URL` environment variable according to your RabbitMQ credentials. For the URL format, refer to the [official documentation](https://docs.celeryq.dev/en/stable/userguide/configuration.html#broker-settings).

```bash
# export CELERY_BROKER_URL="amqp://${RABBITMQ_DEFAULT_USER}:${RABBITMQ_DEFAULT_PASS}@localhost:5672//"
export CELERY_BROKER_URL='amqp://doccano_rabit:doccano_pass@localhost:5672//'
```

That's it. Now you can start the web server and the task queue by running the `doccano webserver` and `doccano task` commands. Note that both commands need the `DATABASE_URL` and `CELERY_BROKER_URL` environment variables if you changed them.
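
Because the web server and the task queue run in separate terminals, one way to keep their settings identical is to store the exports in a small file and source it in each terminal. This is only a sketch: `doccano.env` is a hypothetical file name, and the values are the example credentials used in this guide.

```bash
# Write both settings to a file once (doccano.env is a hypothetical name).
cat > doccano.env <<'EOF'
export DATABASE_URL="postgres://doccano_admin:doccano_pass@localhost:5432/doccano?sslmode=disable"
export CELERY_BROKER_URL="amqp://doccano_rabit:doccano_pass@localhost:5672//"
EOF

# Load the settings in each terminal before running
# `doccano webserver` or `doccano task`.
. ./doccano.env
```
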
## Install with Docker

doccano is also available as a [Docker](https://www.docker.com/) container. Make sure you have Docker installed on your machine.

To pull the image and create a container that serves doccano at <http://localhost:8000>, run the following commands:

```bash
docker pull doccano/doccano
docker container create --name doccano \
  -e "ADMIN_USERNAME=admin" \
  -e "ADMIN_EMAIL=admin@example.com" \
  -e "ADMIN_PASSWORD=password" \
  -v doccano-db:/data \
  -p 8000:8000 doccano/doccano
```

Next, start doccano by running the container:

```bash
docker container start doccano
```

To stop the container, run `docker container stop doccano -t 5`.
All data created in the container persists across restarts.

### Build a local image with Docker

If you want to build a local image, run:

```bash
docker build -t doccano:latest . -f docker/Dockerfile
```

## Install with Docker Compose

Install Git and clone the repository:

```bash
git clone https://github.com/doccano/doccano.git
cd doccano
```

To install and start doccano at <http://localhost>, run the following command:

```bash
docker-compose -f docker/docker-compose.prod.yml --env-file .env up
```

You can override the default settings by editing the `.env` file. See [./docker/.env.example](https://github.com/doccano/doccano/blob/master/docker/.env.example) for details.
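
The `docker-compose` command above reads `.env` from the current directory, so the file has to exist first. One way to create it, sketched below, is to copy the linked example file. `init_env_file` is a hypothetical helper, and the sketch assumes you run it from the repository root.

```bash
# Hypothetical helper: create .env from the example file if it is missing.
init_env_file() {
  local example="docker/.env.example"
  if [ -f .env ]; then
    # Never overwrite settings that were already customized.
    echo ".env already exists; leaving it untouched"
  elif [ -f "$example" ]; then
    cp "$example" .env
    echo "created .env from $example"
  else
    echo "cannot find $example; run this from the repository root" >&2
    return 1
  fi
}

# Usage (from the repository root):
#   init_env_file
```

After the file exists, edit the values in `.env` before starting the containers.
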
## Install from source

If you want to develop doccano, consider downloading the source code using Git and running doccano locally. First, clone the repository:

```bash
git clone https://github.com/doccano/doccano.git
cd doccano
```

### Backend

The doccano backend is built in Python 3.8+ and uses [Poetry](https://github.com/python-poetry/poetry) as a dependency manager. If you haven't installed them yet, see the [Python](https://www.python.org/downloads/) and [Poetry](https://python-poetry.org/docs/) documentation.

First, install the project's dependencies by running the `install` command. After that, activate the virtual environment by running the `shell` command:

```bash
cd backend
poetry install
poetry shell
```

Second, set up the database and run the development server. doccano uses [Django](https://www.djangoproject.com/) and [Django Rest Framework](https://www.django-rest-framework.org/) as a backend. You can set them up with the following Django commands:

```bash
python manage.py migrate
python manage.py create_roles
python manage.py create_admin --noinput --username "admin" --email "admin@example.com" --password "password"
python manage.py runserver
```

In another terminal, run Celery to enable the dataset import/export feature:

```bash
cd doccano/backend
celery --app=config worker --loglevel=INFO --concurrency=1
```

After you change the code, don't forget to run [mypy](https://mypy.readthedocs.io/en/stable/index.html), [flake8](https://flake8.pycqa.org/en/latest/), [black](https://github.com/psf/black), and [isort](https://github.com/PyCQA/isort). These tools ensure code consistency. To run them, use the following commands:

```bash
poetry run task mypy
poetry run task flake8
poetry run task black
poetry run task isort
```
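
The four checks above can also be chained so that the first failure stops the run. This is a minimal sketch; `run_checks` is a hypothetical helper and not part of doccano.

```bash
# Hypothetical helper: run each check in order and stop at the first failure.
# "$@" is the command prefix that launches a task, e.g. `poetry run task`.
run_checks() {
  local check
  for check in mypy flake8 black isort; do
    "$@" "$check" || { echo "check failed: $check" >&2; return 1; }
  done
}

# Usage (inside the backend directory):
#   run_checks poetry run task
```
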

Similarly, you can run the tests by executing the following command:

```bash
poetry run task test
```

Did you pass the tests? Great!

### Frontend

The doccano frontend is built in Node.js and uses [Yarn](https://yarnpkg.com/) as a package manager. If you haven't installed them yet, see the [Node.js](https://nodejs.org/en/) and [Yarn](https://yarnpkg.com/) documentation.

First, install the project's dependencies by running the `install` command:

```bash
cd frontend
yarn install
```

Then run the `dev` command to serve with hot reload at <http://localhost:3000>:

```bash
yarn dev
```

After you change the code, don't forget to run the following commands to ensure code consistency:

```bash
yarn lintfix
yarn precommit
yarn fix:prettier
```

### How to create a Python package

During development, you may want to create a Python package and verify that it works correctly. To do so, run the following command in the root directory of the project:

```bash
./tools/create-package.sh
```

This command builds the frontend, copies the files, and packages them. It takes a few minutes. When the command finishes, you will find the `sdist` and `wheel` files in `backend/dist`:

```bash
Building doccano (1.5.5.post335.dev0+6be6d198)
- Building sdist
- Built doccano-1.5.5.post335.dev0+6be6d198.tar.gz
- Building wheel
- Built doccano-1.5.5.post335.dev0+6be6d198-py3-none-any.whl
```

Then, you can install the package with the `pip install` command:

```bash
pip install doccano-1.5.5.post335.dev0+6be6d198-py3-none-any.whl
```
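
Because the version string in the file name changes with every build, it can be convenient to pick the newest wheel automatically instead of typing the name. The sketch below assumes the wheels are in `backend/dist`, as described above; `latest_wheel` is a hypothetical helper.

```bash
# Hypothetical helper: print the most recently modified doccano wheel.
latest_wheel() {
  local dist_dir="${1:-backend/dist}"
  # ls -t sorts by modification time, newest first.
  ls -t "$dist_dir"/doccano-*.whl 2>/dev/null | head -n 1
}

# Usage (from the repository root):
#   pip install "$(latest_wheel backend/dist)"
```
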

## Install to cloud

doccano also supports one-click deployment to cloud providers. Click one of the following buttons, configure the environment, and access the UI.

| Service | Button |
|---------|---|
| AWS | [![AWS CloudFormation Launch Stack SVG Button](https://cdn.rawgit.com/buildkite/cloudformation-launch-stack-button-svg/master/launch-stack.svg)](https://console.aws.amazon.com/cloudformation/home?#/stacks/new?stackName=doccano&templateURL=https://doccano.s3.amazonaws.com/public/cloudformation/template.aws.yaml) |
| Heroku | [![Deploy](https://www.herokucdn.com/deploy/button.svg)](https://dashboard.heroku.com/new?template=https%3A%2F%2Fgithub.com%2Fdoccano%2Fdoccano) |

## Upgrade doccano

Caution: If you use SQLite3 as the database, upgrading the package can destroy your database.

The `migrate` command has been supported since v1.6.0.

### After v1.6.0

To upgrade to the latest version of doccano, reinstall or upgrade using pip:

```bash
pip install -U doccano
```

If you need to update the database schema, run the following:

```bash
doccano migrate
```

### Before v1.6.0

First, if you use SQLite3, copy the database file and the media directory to a safe location:

```bash
mkdir -p ~/doccano
# Replace the source path with your own.
cp venv/lib/python3.8/site-packages/backend/db.sqlite3 ~/doccano/
cp -r venv/lib/python3.8/site-packages/backend/media ~/doccano/
```

Then, upgrade the package:

```bash
pip install -U doccano
```

Finally, run the migration:

```bash
doccano migrate
```
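
The backup step for versions before v1.6.0 depends on where pip installed the package. The sketch below wraps the same copy commands in a function that takes the source directory as an argument. `backup_doccano_data` is a hypothetical helper, and the usage lines assume doccano was installed with pip into the currently active Python environment.

```bash
# Hypothetical helper: copy db.sqlite3 and media/ to a backup directory.
# $1: directory that contains db.sqlite3 and media/
# $2: destination directory (defaults to ~/doccano, as in the steps above)
backup_doccano_data() {
  local src="$1"
  local dest="${2:-$HOME/doccano}"
  [ -f "$src/db.sqlite3" ] || { echo "no db.sqlite3 in $src" >&2; return 1; }
  mkdir -p "$dest"
  cp "$src/db.sqlite3" "$dest/"
  # The media directory only exists once files have been uploaded.
  if [ -d "$src/media" ]; then
    cp -r "$src/media" "$dest/"
  fi
}

# Usage: look up the site-packages directory instead of hard-coding it.
#   site_packages="$(python -c "import sysconfig; print(sysconfig.get_paths()['purelib'])")"
#   backup_doccano_data "$site_packages/backend"
```
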