From e6badb20c01e0b9966e89a433397073b3af64595 Mon Sep 17 00:00:00 2001
From: Hironsan
Date: Thu, 13 Jun 2019 15:12:44 +0900
Subject: [PATCH] Add documentation by mkdocs

---
 docs/advanced/aws_https_settings.md | 100 +++++++++++++++++++++++++++
 docs/advanced/oauth2_settings.md    |  36 ++++++++++
 docs/faq.md                         |  55 +++++++++++++++
 docs/getting-started.md             | 102 ++++++++++++++++++++++++++++
 docs/index.md                       |  59 ++++++++++++++++
 ROADMAP.md => docs/roadmap.md       |  18 ++---
 mkdocs.yml                          |  41 +++++++++++
 7 files changed, 402 insertions(+), 9 deletions(-)
 create mode 100644 docs/advanced/aws_https_settings.md
 create mode 100644 docs/advanced/oauth2_settings.md
 create mode 100644 docs/faq.md
 create mode 100644 docs/getting-started.md
 create mode 100644 docs/index.md
 rename ROADMAP.md => docs/roadmap.md (94%)
 create mode 100644 mkdocs.yml

diff --git a/docs/advanced/aws_https_settings.md b/docs/advanced/aws_https_settings.md
new file mode 100644
index 00000000..5b84d57f
--- /dev/null
+++ b/docs/advanced/aws_https_settings.md
@@ -0,0 +1,100 @@
+# HTTPS settings for doccano in AWS
+
+
+1. Create hosted zone in Route 53
+2. Create certificate in ACM
+3. Create EC2 instance
+4. Create ELB
+5. Create A record in Route 53
+
+
+
+# Create hosted zone in Route 53
+
+HTTPS needs a domain name. If you don't have one, you can register one with the AWS Route 53 service or get one from another domain registrar.
+
+After you get a domain name, create a hosted zone in Route 53.
+
+If you registered the domain through Route 53, you can find it under `Hosted Zone`.
+
+![2B0FF02C-42DA-41D1-BFA1-31018BE006ED](https://camo.githubusercontent.com/998dab1eca0e9673ab98d92b65b199cb4e2f96ea/68747470733a2f2f7773332e73696e61696d672e636e2f6c617267652f303036744b665463677931673132397a346c3733726a333131783065673078332e6a7067)
+
+
+
+# Create certificate in ACM
+
+![22F3520E-909A-4215-B73A-DBB452E3D4E2](https://camo.githubusercontent.com/e3e0a24d2265728072d9e65220a41d2ddd6b42bb/68747470733a2f2f7773322e73696e61696d672e636e2f6c617267652f303036744b6654636779316731326132653362306a6a3331666c3062683433312e6a7067)
+
+Replace the domain name with your own.
+
+![image-20190314145326046](https://camo.githubusercontent.com/faf83a9ee1774d92a01de9f69e48ed002c7a827e/68747470733a2f2f7773312e73696e61696d672e636e2f6c617267652f303036744b66546367793167313261336a356d33756a333166393066613077342e6a7067)
+
+
+
+
+
+![image-20190314145344449](https://camo.githubusercontent.com/874362144a3547629383ad93e1f13831e35d0b82/68747470733a2f2f7773312e73696e61696d672e636e2f6c617267652f303036744b665463677931673132613375736232626a33306b6b3039626a73762e6a7067)
+
+
+
+![4FC120A2-6DB5-4F03-A209-12C22EDD6097](https://camo.githubusercontent.com/b75bc07e8d96b796872c697de951ab44d74d04d3/68747470733a2f2f7773342e73696e61696d672e636e2f6c617267652f303036744b665463677931673132613873643730786a3331667630686637646d2e6a7067)
+
+
+
+Don't forget to create the record in Route 53 in step 4.
+
+After you request a certificate, wait a while. The status should change to 'Issued'.
+
+
+
+![3AAE20BC-FC34-4738-AED0-D7D67929F6FF](https://camo.githubusercontent.com/82528820652678c19ee46ff5a0f07dbfaba31f5e/68747470733a2f2f7773322e73696e61696d672e636e2f6c617267652f303036744b66546367793167313261356a776270726a333136743066387139622e6a7067)
+
+# Create EC2 instance
+
+In this part, just click the launch button to create an EC2 instance.
+
+[![AWS CloudFormation Launch Stack SVG Button](https://cdn.rawgit.com/buildkite/cloudformation-launch-stack-button-svg/master/launch-stack.svg)](https://us-east-1.console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/create/review?templateURL=https://s3-external-1.amazonaws.com/cf-templates-10vry9l3mp71r-us-east-1/20190732wl-new.templatexloywxxyimi&stackName=doccano)
+
+# Create ELB
+
+![image-20190314150439785](https://camo.githubusercontent.com/158c2fb2957546ed8bb82694497b60b9c7f38aa5/68747470733a2f2f7773332e73696e61696d672e636e2f6c617267652f303036744b6654636779316731326166376a676a746a3330663230337a3734742e6a7067)
+
+Click the `Create Load Balancer` button and select `Application Load Balancer`.
+
+Fill in the name, change the protocol to HTTPS, and add at least two availability zones. Make sure the zone where the EC2 instance was created is included.
+
+![02BE83A7-4C43-48BE-BCF0-95D2DF7C603D](https://camo.githubusercontent.com/c4cc530aea78e66ea99eab905804cae66ab20a04/68747470733a2f2f7773342e73696e61696d672e636e2f6c617267652f303036744b665463677931673132616861756566736a3330796e306d6e6774732e6a7067)
+
+Select the certificate we created earlier.
+
+![image-20190314151004337](https://camo.githubusercontent.com/455140fc7b7a22a18e96e5f2aa31d9fd0e7c7722/68747470733a2f2f7773312e73696e61696d672e636e2f6c617267652f303036744b665463677931673132616b75693576366a333071763063303431382e6a7067)
+
+You can select the same security group that was created with the EC2 instance.
+
+![image-20190314151110756](https://camo.githubusercontent.com/5d029d4fa494420ed077be6b57ab60935d378e7f/68747470733a2f2f7773322e73696e61696d672e636e2f6c617267652f303036744b665463677931673132616c7a796735756a33313272306139676f392e6a7067)
+
+Or you can create a new one.
+
+![image-20190314151253917](https://camo.githubusercontent.com/e620c6738ff95f3311edf708b80a949f8b79f565/68747470733a2f2f7773312e73696e61696d672e636e2f6c617267652f303036744b665463677931673132616e736d3931706a333163313062646469652e6a7067)
+
+
+
+Fill in the target group name and leave the other settings at their defaults.
+
+![image-20190314151314109](https://camo.githubusercontent.com/f22b99c57ca9b8114683f1501942dcc3cc0874f1/68747470733a2f2f7773322e73696e61696d672e636e2f6c617267652f303036744b665463677931673132616f34797661746a3330716630666a74616d2e6a7067)
+
+Add the instance to the registered targets.
+
+![image-20190314151358736](https://camo.githubusercontent.com/515649dce66466e9cefa730fc1a35a398ecb260d/68747470733a2f2f7773322e73696e61696d672e636e2f6c617267652f303036744b665463677931673132616f777667736f6a333136793066346164672e6a7067)
+
+Then review and create.
+
+# Create A record in Route 53
+
+Go back to Route 53 and click `Create Record Set`. Fill in the subdomain name, and enter the ELB name in `Alias Target`.
+
+![image-20190314151601030](https://camo.githubusercontent.com/82944e13e1ef3f4015484417a50635c9352dae33/68747470733a2f2f7773312e73696e61696d672e636e2f6c617267652f303036744b665463677931673132617231383931666a33306278306e6d6a746d2e6a7067)
+
+Finally, you can access doccano via HTTPS.
+
+![image-20190314151841872](https://camo.githubusercontent.com/85dfef30b4b01df5e0d8e339b38e5a31592dd103/68747470733a2f2f7773332e73696e61696d672e636e2f6c617267652f303036744b6654636779316731326174746563636b6a3330716730396d6a73612e6a7067)
diff --git a/docs/advanced/oauth2_settings.md b/docs/advanced/oauth2_settings.md
new file mode 100644
index 00000000..77afb836
--- /dev/null
+++ b/docs/advanced/oauth2_settings.md
@@ -0,0 +1,36 @@
+This document explains how to set up OAuth for doccano. doccano now supports social login via GitHub and Active Directory, added in [#75](https://github.com/chakki-works/doccano/pull/75). This document uses GitHub OAuth as an example.
+
+
+## Create OAuth App
+
+1. In the upper-right corner of GitHub, click your profile photo, then click **Settings**.
+2. In the left sidebar, click **Developer settings**.
+3. In the left sidebar, click **OAuth Apps**.
+4. Click **New OAuth App**.
+5. In "Application name", type the name of your app.
+6. In "Homepage URL", type the full URL to your app's website.
+7. In "Authorization callback URL", type the callback URL (e.g. ) of your app.
+8. Click **Register application**.
+
+## Set environment variables
+
+Once the application is registered, your app's `Client ID` and `Client Secret` will be displayed on the following page:
+![image](https://user-images.githubusercontent.com/6737785/51811605-1073d480-22f1-11e9-8be0-726a8ee5e832.png)
+
+1. Copy the `Client ID` and `Client Secret` from the Developer Applications page of your app on GitHub.
+2. 
Set the `Client ID` and `Client Secret` as environment variables:
+
+```bash
+export OAUTH_GITHUB_KEY=YOUR_CLIENT_ID
+export OAUTH_GITHUB_SECRET=YOUR_CLIENT_SECRET
+```
+
+## Run server
+
+```bash
+python manage.py runserver
+```
+
+Go to the login page:
+
+![image](https://user-images.githubusercontent.com/6737785/51812454-e7edd980-22f4-11e9-80c6-2f18fbc49108.png)
\ No newline at end of file
diff --git a/docs/faq.md b/docs/faq.md
new file mode 100644
index 00000000..b17c5bf7
--- /dev/null
+++ b/docs/faq.md
@@ -0,0 +1,55 @@
+## I can't install doccano.
+
+The following list is ordered from easiest to hardest. If you are not familiar with Python development, please consider one of the easier setups.
+
+1. [One click deployment to Cloud Service.](https://github.com/chakki-works/doccano#deployment)
+    * All you have to do is create an account. In particular, [Heroku](https://www.heroku.com/home) does not require a credit card on the free plan.
+    * [![Deploy to Azure](https://azuredeploy.net/deploybutton.svg)](https://portal.azure.com/#create/Microsoft.Template/uri/https%3A%2F%2Fraw.githubusercontent.com%2Fchakki-works%2Fdoccano%2Fmaster%2Fazuredeploy.json)
+    * [![Deploy](https://www.herokucdn.com/deploy/button.svg)](https://heroku.com/deploy)
+    * [![AWS CloudFormation Launch Stack SVG Button](https://cdn.rawgit.com/buildkite/cloudformation-launch-stack-button-svg/master/launch-stack.svg)](https://us-east-1.console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/create/review?templateURL=https://s3-external-1.amazonaws.com/cf-templates-10vry9l3mp71r-us-east-1/20190732wl-new.templatexloywxxyimi&stackName=doccano)
+    * > Notice: (1) EC2 KeyPair cannot be created automatically, so make sure you have an existing EC2 KeyPair in one region, or [create one yourself](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-key-pairs.html#having-ec2-create-your-key-pair). 
(2) If you want to access doccano via HTTPS in AWS, see these [instructions](https://github.com/chakki-works/doccano/wiki/HTTPS-setting-for-doccano-in-AWS).
+2. [Use Docker](https://docs.docker.com/install/)
+    * Docker spares you problems with the OS, Python version, and so on, because the application's environment is packaged as a container.
+    * Get doccano's image: `docker pull chakkiworks/doccano`
+    * Create & run doccano container: `docker run -d --name doccano -p 8000:80 chakkiworks/doccano`
+    * Create a user: `docker exec doccano tools/create-admin.sh "admin" "admin@example.com" "password"`
+    * Stop doccano container: `docker stop doccano`
+    * Relaunch doccano container: `docker start doccano`
+3. Install from source
+    * **Keep in mind that this is the hardest way to set up doccano. You have to install Python and Node.js and type many commands.**
+    * [Install Python](https://www.python.org/downloads/)
+    * [Install Node.js](https://nodejs.org/en/download/)
+    * Get the source code of doccano: `git clone https://github.com/chakki-works/doccano.git`
+    * Move to the doccano directory: `cd doccano`
+    * Create an environment for doccano: `virtualenv venv`
+    * Activate the environment: `source venv/bin/activate`
+    * Install required packages: `pip install -r requirements.txt`
+    * Move to the server directory: `cd app/server`
+    * Build frontend library: `npm install`
+    * Build frontend source code: `npm run build`
+    * Go back to the parent directory: `cd ../`
+    * Initialize doccano: `python manage.py migrate`
+    * Create user: `python manage.py createsuperuser`
+    * Run doccano: `python manage.py runserver`
+    * Stop doccano: Ctrl+C
+    * Relaunch doccano: `python manage.py runserver` (Confirm you are in the `app/server` directory and the environment is active).
+
+## I can't upload my data.
+
+Please check the following list.
+
+- File encoding: `UTF-8` is appropriate.
+- Filename: an alphabetic file name is suitable.
+- File format selection: The file format radio button must match the format of your file.
+- When you are using JSON/JSONL: Confirm that the JSON data is valid.
+    - You can use [JSONLint](https://jsonlint.com/) or some other tool (for JSONL, pick one record and check it).
+- When you are using CSV: Confirm that the CSV data is valid.
+    - You can use Excel or another tool with a CSV import feature.
+- Blank lines: The data file should not contain blank lines.
+- Blank fields: The data file should not contain blank fields.
+
+**You don't need your full, real data to validate the file format. A sample or masked data is suitable if your data is large or confidential.**
+
+## I want to add annotators.
+
+* You can create other annotators from the [Django Admin site](https://djangobook.com/django-admin-site/).
diff --git a/docs/getting-started.md b/docs/getting-started.md
new file mode 100644
index 00000000..c52c707c
--- /dev/null
+++ b/docs/getting-started.md
@@ -0,0 +1,102 @@
+# Getting started
+
+## Quick install guide
+
+First of all, you have to clone the repository:
+
+```bash
+git clone https://github.com/chakki-works/doccano.git
+cd doccano
+```
+
+To install doccano, there are three options:
+
+### Option1: Pull the production Docker image
+
+```bash
+docker pull chakkiworks/doccano
+```
+
+### Option2: Pull the development Docker-Compose images
+
+```bash
+docker-compose pull
+```
+
+### Option3: Set up Python environment
+
+First we need to install the dependencies. Run the following commands:
+
+```bash
+pip install -r requirements.txt
+cd app
+```
+
+Next we need to start the webpack server so that the frontend gets compiled continuously.
+Run the following commands in a new shell:
+
+```bash
+cd server/static
+npm install
+npm run build
+# npm start # for developers
+cd ..
+```
+
+## Usage
+
+Let’s start the development server and explore it.
+
+Depending on your installation method, there are three options:
+
+### Option1: Running the Docker image as a Container
+
+First, run a Docker container:
+
+```bash
+docker run -d --name doccano -p 8000:80 chakkiworks/doccano
+```
+
+Then, execute the `create-admin.sh` script to create a superuser.
+
+```bash
+docker exec doccano tools/create-admin.sh "admin" "admin@example.com" "password"
+```
+
+### Option2: Running the development Docker-Compose stack
+
+We can use docker-compose to set up the webpack server, django server, database, etc. all in one command:
+
+```bash
+docker-compose up
+```
+
+Now, open a Web browser and go to . You should see the login screen:
+
+![Login form](./login_form.png)
+
+### Option3: Running Django development server
+
+Before running the server, we need to apply the migrations. Run the following command:
+
+```bash
+python manage.py migrate
+```
+
+Next we need to create a user who can log in to the admin site. Run the following command:
+
+```bash
+python manage.py create_admin --noinput --username "admin" --email "admin@example.com" --password "password"
+```
+
+Developers can also validate that the project works as expected by running the tests:
+
+```bash
+python manage.py test server.tests
+```
+
+Finally, to start the server, run the following command:
+
+```bash
+python manage.py runserver
+```
\ No newline at end of file
diff --git a/docs/index.md b/docs/index.md
new file mode 100644
index 00000000..e60f5cf0
--- /dev/null
+++ b/docs/index.md
@@ -0,0 +1,59 @@
+# Welcome to doccano
+
+## Text Annotation for Humans
+
+doccano is an open source text annotation tool for humans. It provides annotation features for text classification, sequence labeling and sequence to sequence. So, you can create labeled data for sentiment analysis, named entity recognition, text summarization and so on. Just create a project, upload data, and start annotating. You can build a dataset in hours.
+
+
+## Demo
+
+You can try the [annotation demo](http://doccano.herokuapp.com).
+
+### [Named entity recognition](https://doccano.herokuapp.com/demo/named-entity-recognition/)
+
+The first demo is a sequence labeling task, named-entity recognition. You just select text spans and annotate them. Since doccano supports shortcut keys, you can annotate text spans quickly.
+
+![Named Entity Recognition](./named_entity_annotation.gif)
+
+### [Sentiment analysis](https://doccano.herokuapp.com/demo/text-classification/)
+
+The second demo is a text classification task, topic classification. Since there may be more than one category, you can assign multiple labels.
+
+![Text Classification](./text_classification.gif)
+
+### [Machine translation](https://doccano.herokuapp.com/demo/translation/)
+
+The final demo is a sequence to sequence task, machine translation. Since there may be more than one response in sequence to sequence tasks, you can create multiple responses.
+
+![Machine Translation](./translation.gif)
+
+## Quick Deployment
+
+### Azure
+
+Doccano can be deployed to Azure ([Web App for Containers](https://azure.microsoft.com/en-us/services/app-service/containers/) +
+[PostgreSQL database](https://azure.microsoft.com/en-us/services/postgresql/)) by clicking on the button below:
+
+[![Deploy to Azure](https://azuredeploy.net/deploybutton.svg)](https://portal.azure.com/#create/Microsoft.Template/uri/https%3A%2F%2Fraw.githubusercontent.com%2Fchakki-works%2Fdoccano%2Fmaster%2Fazuredeploy.json)
+
+### Heroku
+
+Doccano can be deployed to [Heroku](https://www.heroku.com/) by clicking on the button below:
+
+[![Deploy](https://www.herokucdn.com/deploy/button.svg)](https://heroku.com/deploy)
+
+You can also deploy doccano with [heroku-cli](https://devcenter.heroku.com/articles/heroku-cli).
+
+```bash
+heroku create
+heroku stack:set container
+git push heroku master
+```
+
+### AWS
+
+Doccano can be deployed to AWS ([CloudFormation](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/Welcome.html)) by clicking on the button below:
+
+[![AWS CloudFormation Launch Stack SVG Button](https://cdn.rawgit.com/buildkite/cloudformation-launch-stack-button-svg/master/launch-stack.svg)](https://us-east-1.console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/create/review?templateURL=https://s3-external-1.amazonaws.com/cf-templates-10vry9l3mp71r-us-east-1/20190732wl-new.templatexloywxxyimi&stackName=doccano)
+
+> Notice: (1) EC2 KeyPair cannot be created automatically, so make sure you have an existing EC2 KeyPair in one region, or [create one yourself](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-key-pairs.html#having-ec2-create-your-key-pair). (2) If you want to access doccano via HTTPS in AWS, see these [instructions](https://github.com/chakki-works/doccano/wiki/HTTPS-setting-for-doccano-in-AWS).
diff --git a/ROADMAP.md b/docs/roadmap.md
similarity index 94%
rename from ROADMAP.md
rename to docs/roadmap.md
index 68bc42c9..ecb963c9 100644
--- a/ROADMAP.md
+++ b/docs/roadmap.md
@@ -2,7 +2,7 @@
 
 Doccano is a fast-moving, community supported project. This roadmap provides guidance about priorities and focus areas of the doccano team and lists the functionality expected in upcoming releases of doccano. Many of these areas are driven by community use cases, and we welcome further contributions to doccano.
 
-# Current status
+## Current status
 
 Doccano is now able to:
 
@@ -18,11 +18,11 @@ Doccano is now able to:
 * Confirm annotation statistics.
 * Access via web API.
 
-# Roadmap
+## Roadmap
 
 This is a list of features on the short term roadmap and beyond:
 
-## APIs
+### APIs
 
 * Implement login and registration API.
 * Implement OAuth 2.0 API.
@@ -32,14 +32,14 @@ This is a list of features on the short term roadmap and beyond:
 * Optimize performance for statistics API.
 * More documentation and tutorials.
 
-## Project management
+### Project management
 
 * Enable to manage user by project administrators.
 * Implement RBAC and enable to assign a role to a user by project administrators.
 * Enhance annotation statistics.
 
 
-## Annotation
+### Annotation
 
 * Increase the number of annotation tasks such as relation extraction, entity linking, aspect-based sentiment analysis, visual question answering and so on.
 * Introduce a plugin feature to define custom tasks by a user.
@@ -49,7 +49,7 @@ This is a list of features on the short term roadmap and beyond:
 * More documentation and tutorials.
 
 
-## Upload and download
+### Upload and download
 
 * Enable to import data from cloud storage like s3.
 * Improve UX by showing progress bar.
@@ -57,7 +57,7 @@ This is a list of features on the short term roadmap and beyond:
 * Support for custom tokenization.
 * More performance optimizations.
 
-## Accessibility
+### Accessibility
 
 * Support smartphone to enable anyone to annotate anywhere.
 * Enable to customize font and font-family.
@@ -65,7 +65,7 @@ This is a list of features on the short term roadmap and beyond:
 * Enable to customize site theme per user.
 
 
-## Entire project
+### Entire project
 
 * Design Vue component and use it to implement frontend.
 * Introduce frontend testing framework.
@@ -76,7 +76,7 @@ This is a list of features on the short term roadmap and beyond:
 * Improve project management structure to accelerate the project improvement.
 * Create GitHub page by using mkdocs and move wiki contents to it.
 
-## Community and engagement
+### Community and engagement
 
 * New resources for community discussion and feedback.
 * Gather and highlight novel doccano use cases.
diff --git a/mkdocs.yml b/mkdocs.yml
new file mode 100644
index 00000000..8b124b9f
--- /dev/null
+++ b/mkdocs.yml
@@ -0,0 +1,41 @@
+# Project information
+site_name: 'doccano'
+site_description: 'A Text Annotation tool for Humans'
+site_author: 'Hiroki Nakayama'
+site_url: 'https://chakki-works.github.io/doccano/'
+
+# Repository
+repo_name: 'chakki-works/doccano'
+repo_url: 'https://github.com/chakki-works/doccano'
+
+# Copyright
+copyright: 'Copyright © 2018 - 2019 Hiroki Nakayama'
+
+theme:
+  name: 'material'
+  palette:
+    primary: 'cyan'
+    accent: 'cyan'
+  show_sidebar: true
+
+extra:
+  social:
+    - type: 'github'
+      link: 'https://github.com/Hironsan'
+    - type: 'twitter'
+      link: 'https://twitter.com/Hironsan13'
+
+# Page tree
+nav:
+  - Doccano: index.md
+  - Getting started: getting-started.md
+  - Advanced:
+    - AWS HTTPS settings: advanced/aws_https_settings.md
+    - OAuth2 settings: advanced/oauth2_settings.md
+  #- Release notes: release-notes.md
+  #- Author's notes: authors-notes.md
+  - FAQ: faq.md
+  - Contributing: CONTRIBUTING.md
+  - Code of Conduct: CODE_OF_CONDUCT.md
+  - Roadmap: roadmap.md
+  - License: LICENSE.md
\ No newline at end of file
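
Note on the upload checklist in `docs/faq.md` above: the checks it lists for JSONL files (UTF-8 encoding, valid JSON on each line, no blank lines, no blank fields) could be automated before uploading. The following is only a minimal sketch under stated assumptions, not part of this patch or of doccano itself. The helper name `check_jsonl` is hypothetical, and treating empty strings and `null` as "blank fields" is an assumption about what the FAQ means.

```python
import json


def check_jsonl(path):
    """Return a list of problems found in a JSONL file (hypothetical helper).

    Mirrors the FAQ checklist: UTF-8 encoding, one valid JSON object per
    line, no blank lines, no blank fields. Not part of doccano.
    """
    try:
        with open(path, encoding="utf-8") as handle:
            lines = handle.readlines()
    except UnicodeDecodeError:
        # The FAQ recommends UTF-8, so anything else is reported up front.
        return ["file is not valid UTF-8"]

    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            problems.append(f"line {number}: blank line")
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError as error:
            problems.append(f"line {number}: invalid JSON ({error.msg})")
            continue
        if not isinstance(record, dict):
            problems.append(f"line {number}: expected a JSON object")
            continue
        for key, value in record.items():
            # Assumption: "blank field" means an empty string or null.
            if value in ("", None):
                problems.append(f"line {number}: field {key!r} is blank")
    return problems
```

Calling `check_jsonl("sample.jsonl")` on a small sample or masked copy of the data, as the FAQ suggests, returns an empty list when none of these problems are found.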