
Add documentation by mkdocs

pull/251/head
Hironsan 5 years ago
parent
commit
e6badb20c0
7 changed files with 402 additions and 9 deletions
  1. 100
      docs/advanced/aws_https_settings.md
  2. 36
      docs/advanced/oauth2_settings.md
  3. 55
      docs/faq.md
  4. 102
      docs/getting-started.md
  5. 59
      docs/index.md
  6. 18
      docs/roadmap.md
  7. 41
      mkdocs.yml

100
docs/advanced/aws_https_settings.md

@ -0,0 +1,100 @@
# HTTPS settings for doccano in AWS
1. Create hosted zone in Route 53
2. Create certificate in ACM
3. Create EC2 instance
4. Create ELB
5. Create A record in Route 53
# Create hosted zone in Route 53
HTTPS needs a domain name. If you don't have one, you can register one through AWS Route 53 or buy one from another domain registrar.
Once you have a domain name, create a hosted zone for it in Route 53.
If you registered the domain through Route 53, you can find it under `Hosted Zone`.
![2B0FF02C-42DA-41D1-BFA1-31018BE006ED](https://camo.githubusercontent.com/998dab1eca0e9673ab98d92b65b199cb4e2f96ea/68747470733a2f2f7773332e73696e61696d672e636e2f6c617267652f303036744b665463677931673132397a346c3733726a333131783065673078332e6a7067)
# Create certificate in ACM
![22F3520E-909A-4215-B73A-DBB452E3D4E2](https://camo.githubusercontent.com/e3e0a24d2265728072d9e65220a41d2ddd6b42bb/68747470733a2f2f7773322e73696e61696d672e636e2f6c617267652f303036744b6654636779316731326132653362306a6a3331666c3062683433312e6a7067)
Replace the domain name with your own.
![image-20190314145326046](https://camo.githubusercontent.com/faf83a9ee1774d92a01de9f69e48ed002c7a827e/68747470733a2f2f7773312e73696e61696d672e636e2f6c617267652f303036744b66546367793167313261336a356d33756a333166393066613077342e6a7067)
![image-20190314145344449](https://camo.githubusercontent.com/874362144a3547629383ad93e1f13831e35d0b82/68747470733a2f2f7773312e73696e61696d672e636e2f6c617267652f303036744b665463677931673132613375736232626a33306b6b3039626a73762e6a7067)
![4FC120A2-6DB5-4F03-A209-12C22EDD6097](https://camo.githubusercontent.com/b75bc07e8d96b796872c697de951ab44d74d04d3/68747470733a2f2f7773342e73696e61696d672e636e2f6c617267652f303036744b665463677931673132613873643730786a3331667630686637646d2e6a7067)
Don't forget to create the record in Route 53 in step 4.
After you request the certificate, wait a while. The status should change to 'Issued'.
![3AAE20BC-FC34-4738-AED0-D7D67929F6FF](https://camo.githubusercontent.com/82528820652678c19ee46ff5a0f07dbfaba31f5e/68747470733a2f2f7773322e73696e61696d672e636e2f6c617267652f303036744b66546367793167313261356a776270726a333136743066387139622e6a7067)
# Create EC2 instance
In this part, you can just click the launch button to create an EC2 instance.
[![AWS CloudFormation Launch Stack SVG Button](https://cdn.rawgit.com/buildkite/cloudformation-launch-stack-button-svg/master/launch-stack.svg)](https://us-east-1.console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/create/review?templateURL=https://s3-external-1.amazonaws.com/cf-templates-10vry9l3mp71r-us-east-1/20190732wl-new.templatexloywxxyimi&stackName=doccano)
# Create ELB
![image-20190314150439785](https://camo.githubusercontent.com/158c2fb2957546ed8bb82694497b60b9c7f38aa5/68747470733a2f2f7773332e73696e61696d672e636e2f6c617267652f303036744b6654636779316731326166376a676a746a3330663230337a3734742e6a7067)
Click the `Create Load Balancer` button and select `Application Load Balancer`.
Fill in the name, change the protocol to HTTPS, and add at least two availability zones. Make sure the zone where the EC2 instance was created is included.
![02BE83A7-4C43-48BE-BCF0-95D2DF7C603D](https://camo.githubusercontent.com/c4cc530aea78e66ea99eab905804cae66ab20a04/68747470733a2f2f7773342e73696e61696d672e636e2f6c617267652f303036744b665463677931673132616861756566736a3330796e306d6e6774732e6a7067)
Select the certificate we created earlier.
![image-20190314151004337](https://camo.githubusercontent.com/455140fc7b7a22a18e96e5f2aa31d9fd0e7c7722/68747470733a2f2f7773312e73696e61696d672e636e2f6c617267652f303036744b665463677931673132616b75693576366a333071763063303431382e6a7067)
You can select the same security groups that were created when you created the EC2 instance.
![image-20190314151110756](https://camo.githubusercontent.com/5d029d4fa494420ed077be6b57ab60935d378e7f/68747470733a2f2f7773322e73696e61696d672e636e2f6c617267652f303036744b665463677931673132616c7a796735756a33313272306139676f392e6a7067)
Or you can create a new one.
![image-20190314151253917](https://camo.githubusercontent.com/e620c6738ff95f3311edf708b80a949f8b79f565/68747470733a2f2f7773312e73696e61696d672e636e2f6c617267652f303036744b665463677931673132616e736d3931706a333163313062646469652e6a7067)
Fill in the target group name and leave the other settings at their defaults.
![image-20190314151314109](https://camo.githubusercontent.com/f22b99c57ca9b8114683f1501942dcc3cc0874f1/68747470733a2f2f7773322e73696e61696d672e636e2f6c617267652f303036744b665463677931673132616f34797661746a3330716630666a74616d2e6a7067)
Add the instance to the registered targets.
![image-20190314151358736](https://camo.githubusercontent.com/515649dce66466e9cefa730fc1a35a398ecb260d/68747470733a2f2f7773322e73696e61696d672e636e2f6c617267652f303036744b665463677931673132616f777667736f6a333136793066346164672e6a7067)
Then review and create.
# Create A record in Route 53
Go back to Route 53 and click `Create Record Set`. Fill in the subdomain name, and enter the ELB name in `Alias Target`.
![image-20190314151601030](https://camo.githubusercontent.com/82944e13e1ef3f4015484417a50635c9352dae33/68747470733a2f2f7773312e73696e61696d672e636e2f6c617267652f303036744b665463677931673132617231383931666a33306278306e6d6a746d2e6a7067)
Finally, you can access doccano over HTTPS.
![image-20190314151841872](https://camo.githubusercontent.com/85dfef30b4b01df5e0d8e339b38e5a31592dd103/68747470733a2f2f7773332e73696e61696d672e636e2f6c617267652f303036744b6654636779316731326174746563636b6a3330716730396d6a73612e6a7067)
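
To confirm from a script that the new record really serves the ACM certificate, a check along the following lines may help. It is only a sketch using the Python standard library, not part of doccano; the function names are made up for illustration, and the host you pass in is whatever record you created in Route 53.

```python
import socket
import ssl
import time


def fetch_not_after(host, port=443, timeout=5.0):
    """Return the 'notAfter' field of the certificate served by host:port.

    The default context verifies the certificate chain and the host name,
    so a wrong or missing certificate raises ssl.SSLError here.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after, now=None):
    """Convert a 'notAfter' string (e.g. 'Jan  5 09:34:43 2018 GMT') to whole days left."""
    expires = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    # Floor division: a negative result means the certificate has expired.
    return int((expires - now) // 86400)
```

Calling `days_until_expiry(fetch_not_after("your.domain.example"))` with your own host name would then report how many days the certificate has left.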

36
docs/advanced/oauth2_settings.md

@ -0,0 +1,36 @@
This document explains how to set up OAuth for doccano. doccano now supports social login via GitHub and Active Directory as of [#75](https://github.com/chakki-works/doccano/pull/75). This document uses GitHub OAuth as an example.
## Create OAuth App
1. In the upper-right corner of GitHub, click your profile photo, then click **Settings**.
2. In the left sidebar, click **Developer settings**.
3. In the left sidebar, click **OAuth Apps**.
4. Click **New OAuth App**.
5. In "Application name", type the name of your app.
6. In "Homepage URL", type the full URL to your app's website.
7. In "Authorization callback URL", type the callback URL (e.g. <https://example.com/social/complete/github/>) of your app.
8. Click **Register application**.
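
The callback URL in step 7 is the site's base URL followed by `social/complete/github/`. If you generate it in a script, a small helper like the one below avoids missing or doubled slashes. This is a sketch only: the helper name is hypothetical, and the path is taken from the example URL above rather than from doccano's source.

```python
from urllib.parse import urljoin


def github_callback_url(base_url):
    """Build the GitHub callback URL from the site's base URL."""
    # Normalise to exactly one trailing slash so urljoin appends the path
    # instead of replacing the last segment of the base URL.
    return urljoin(base_url.rstrip("/") + "/", "social/complete/github/")
```

For example, `github_callback_url("https://example.com")` yields the URL shown in step 7.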
## Set environment variables
Once the application is registered, your app's `Client ID` and `Client Secret` will be displayed on the following page:
![image](https://user-images.githubusercontent.com/6737785/51811605-1073d480-22f1-11e9-8be0-726a8ee5e832.png)
1. Copy the `Client ID` and `Client Secret` from the Developer Applications of your app on GitHub.
2. Set the `Client ID` and `Client Secret` as environment variables:
```bash
export OAUTH_GITHUB_KEY=YOUR_CLIENT_ID
export OAUTH_GITHUB_SECRET=YOUR_CLIENT_SECRET
```
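
Before starting the server, it can be useful to confirm that both variables are visible to the process. The snippet below is a minimal sketch of such a check. It is not doccano's settings code, and the function name is an assumption made for illustration; only the two variable names come from the commands above.

```python
import os


def missing_github_oauth_vars(env=None):
    """Return the names of the GitHub OAuth variables that are unset or empty."""
    env = os.environ if env is None else env
    required = ("OAUTH_GITHUB_KEY", "OAUTH_GITHUB_SECRET")
    return [name for name in required if not env.get(name)]
```

An empty list means both values are set in the environment the server will inherit.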
## Run server
```bash
python manage.py runserver
```
Go to the login page:
![image](https://user-images.githubusercontent.com/6737785/51812454-e7edd980-22f4-11e9-80c6-2f18fbc49108.png)

55
docs/faq.md

@ -0,0 +1,55 @@
## I can't install doccano.
The following options are ordered from easiest to hardest. If you are not familiar with Python development, please consider one of the easier setups.
1. [One click deployment to Cloud Service.](https://github.com/chakki-works/doccano#deployment)
    * All you have to do is create an account. In particular, [Heroku](https://www.heroku.com/home) does not require a credit card on the free plan.
    * [![Deploy to Azure](https://azuredeploy.net/deploybutton.svg)](https://portal.azure.com/#create/Microsoft.Template/uri/https%3A%2F%2Fraw.githubusercontent.com%2Fchakki-works%2Fdoccano%2Fmaster%2Fazuredeploy.json)
    * [![Deploy](https://www.herokucdn.com/deploy/button.svg)](https://heroku.com/deploy)
    * [![AWS CloudFormation Launch Stack SVG Button](https://cdn.rawgit.com/buildkite/cloudformation-launch-stack-button-svg/master/launch-stack.svg)](https://us-east-1.console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/create/review?templateURL=https://s3-external-1.amazonaws.com/cf-templates-10vry9l3mp71r-us-east-1/20190732wl-new.templatexloywxxyimi&stackName=doccano)
    * > Notice: (1) EC2 KeyPair cannot be created automatically, so make sure you have an existing EC2 KeyPair in one region. Or [create one yourself](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-key-pairs.html#having-ec2-create-your-key-pair). (2) If you want to access doccano via HTTPS in AWS, here is an [instruction](https://github.com/chakki-works/doccano/wiki/HTTPS-setting-for-doccano-in-AWS).
2. [Use Docker](https://docs.docker.com/install/)
    * Docker spares you problems with the OS, the Python version, and so on, because the application's environment is packaged as a container.
    * Get doccano's image: `docker pull chakkiworks/doccano`
    * Create and run the doccano container: `docker run -d --name doccano -p 8000:80 chakkiworks/doccano`
    * Create a user: `docker exec doccano tools/create-admin.sh "admin" "admin@example.com" "password"`
    * Stop the doccano container: `docker stop doccano`
    * Relaunch the doccano container: `docker start doccano`
3. Install from source
    * **Please note that this is the hardest way to set up doccano. You have to install Python and Node.js and type many commands.**
    * [Install Python](https://www.python.org/downloads/)
    * [Install Node.js](https://nodejs.org/en/download/)
    * Get the source code of doccano: `git clone https://github.com/chakki-works/doccano.git`
    * Move to the doccano directory: `cd doccano`
    * Create an environment for doccano: `virtualenv venv`
    * Activate the environment: `source venv/bin/activate`
    * Install the required packages: `pip install -r requirements.txt`
    * Move to the server directory: `cd app/server`
    * Install the frontend libraries: `npm install`
    * Build the frontend source code: `npm run build`
    * Go back up one directory: `cd ../`
    * Initialize doccano: `python manage.py migrate`
    * Create a user: `python manage.py createsuperuser`
    * Run doccano: `python manage.py runserver`
    * Stop doccano: Ctrl+C
    * Relaunch doccano: `python manage.py runserver` (confirm that you are in the `app/server` directory and the environment is active).
## I can't upload my data.
Please check the following list.
- File encoding: `UTF-8` is appropriate.
- File name: an alphabetic file name is suitable.
- File format selection: the correct file format radio button should be selected.
- When you are using JSON/JSONL: confirm that the JSON data is valid.
    - You can use [JSONLint](https://jsonlint.com/) or some other tool (for JSONL, pick one record and check it).
- When you are using CSV: confirm that the CSV data is valid.
    - You can use Excel or another tool that can import CSV.
- Blank lines: the data file should not contain blank lines.
- Blank fields: the data file should not contain blank fields.

**You don't need your real, complete data to validate the file format. A small sample or masked data is suitable if your data is large or confidential.**
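
The checks in the list above can also be run locally before uploading. The following is a rough sketch using only the Python standard library. It is not doccano's own validation, and the function names are invented for illustration.

```python
import csv
import io
import json


def read_utf8(path):
    """Read a file as UTF-8; raises UnicodeDecodeError if the encoding is wrong."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()


def check_jsonl(text):
    """Return (line_number, problem) pairs for JSONL content."""
    problems = []
    for number, line in enumerate(io.StringIO(text), start=1):
        if not line.strip():
            problems.append((number, "blank line"))
            continue
        try:
            json.loads(line)
        except json.JSONDecodeError as error:
            problems.append((number, "invalid JSON: " + error.msg))
    return problems


def check_csv(text):
    """Return (row_number, problem) pairs for CSV content."""
    problems = []
    for number, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if not row:
            # csv.reader yields an empty list for an empty line.
            problems.append((number, "blank line"))
        elif any(not field.strip() for field in row):
            problems.append((number, "blank field"))
    return problems
```

An empty result from `check_jsonl(read_utf8("data.jsonl"))` means none of these particular problems were found; the upload can still fail for other reasons, such as a wrong format selection.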
## I want to add annotators.
* You can create other annotators through the [Django Admin site](https://djangobook.com/django-admin-site/).

102
docs/getting-started.md

@ -0,0 +1,102 @@
# Getting started
## Quick install guide
First of all, you have to clone the repository:
```bash
git clone https://github.com/chakki-works/doccano.git
cd doccano
```
To install doccano, there are three options:
### Option1: Pull the production Docker image
```bash
docker pull chakkiworks/doccano
```
### Option2: Pull the development Docker-Compose images
```bash
docker-compose pull
```
### Option3: Setup Python environment
First we need to install the dependencies. Run the following commands:
```bash
pip install -r requirements.txt
cd app
```
Next we need to start the webpack server so that the frontend gets compiled continuously.
Run the following commands in a new shell:
```bash
cd server/static
npm install
npm run build
# npm start # for developers
cd ..
```
## Usage
Let’s start the development server and explore it.
Depending on your installation method, there are three options:
### Option1: Running the Docker image as a Container
First, run a Docker container:
```bash
docker run -d --name doccano -p 8000:80 chakkiworks/doccano
```
Then, execute the `create-admin.sh` script to create a superuser.
```bash
docker exec doccano tools/create-admin.sh "admin" "admin@example.com" "password"
```
### Option2: Running the development Docker-Compose stack
We can use docker-compose to set up the webpack server, django server, database, etc. all in one command:
```bash
docker-compose up
```
Now, open a Web browser and go to <http://127.0.0.1:8000/login/>. You should see the login screen:
![Login form](./login_form.png)
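
The stack can take a while to come up, so the login page may not respond immediately. A small polling helper such as the one below can wait for it. This is a sketch under the assumption that the login URL above returns HTTP 200 once the server is ready; the function names and default values are illustrative.

```python
import time
import urllib.error
import urllib.request


def http_status(url, timeout=3.0):
    """Return the HTTP status code for url, or None if nothing answered."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None  # connection refused, timeout, DNS failure, ...


def wait_for_login_page(url="http://127.0.0.1:8000/login/",
                        attempts=30, delay=2.0, probe=http_status):
    """Poll url until it returns 200; give up after the given number of attempts."""
    for _ in range(attempts):
        if probe(url) == 200:
            return True
        time.sleep(delay)
    return False
```

The `probe` parameter exists so the polling loop can be exercised without a running server.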
### Option3: Running Django development server
Before running the server, we need to apply the database migrations. Run the following command:
```bash
python manage.py migrate
```
Next we need to create a user who can log in to the admin site. Run the following command:
```bash
python manage.py create_admin --noinput --username "admin" --email "admin@example.com" --password "password"
```
Developers can also validate that the project works as expected by running the tests:
```bash
python manage.py test server.tests
```
Finally, to start the server, run the following command:
```bash
python manage.py runserver
```

59
docs/index.md

@ -0,0 +1,59 @@
# Welcome to doccano
## Text Annotation for Human
doccano is an open source text annotation tool for humans. It provides annotation features for text classification, sequence labeling and sequence to sequence tasks, so you can create labeled data for sentiment analysis, named entity recognition, text summarization and so on. Just create a project, upload data and start annotating. You can build a dataset in hours.
## Demo
You can try the [annotation demo](http://doccano.herokuapp.com).
### [Named entity recognition](https://doccano.herokuapp.com/demo/named-entity-recognition/)
The first demo is a sequence labeling task: named-entity recognition. You just select text spans and annotate them. doccano supports shortcut keys, so you can annotate text spans quickly.
![Named Entity Recognition](./named_entity_annotation.gif)
### [Sentiment analysis](https://doccano.herokuapp.com/demo/text-classification/)
The second demo is a text classification task: topic classification. Since there may be more than one category, you can assign multiple labels.
![Text Classification](./text_classification.gif)
### [Machine translation](https://doccano.herokuapp.com/demo/translation/)
The final demo is a sequence to sequence task: machine translation. Since there may be more than one response in sequence to sequence tasks, you can create multiple responses.
![Machine Translation](./translation.gif)
## Quick Deployment
### Azure
Doccano can be deployed to Azure ([Web App for Containers](https://azure.microsoft.com/en-us/services/app-service/containers/) +
[PostgreSQL database](https://azure.microsoft.com/en-us/services/postgresql/)) by clicking on the button below:
[![Deploy to Azure](https://azuredeploy.net/deploybutton.svg)](https://portal.azure.com/#create/Microsoft.Template/uri/https%3A%2F%2Fraw.githubusercontent.com%2Fchakki-works%2Fdoccano%2Fmaster%2Fazuredeploy.json)
### Heroku
Doccano can be deployed to [Heroku](https://www.heroku.com/) by clicking on the button below:
[![Deploy](https://www.herokucdn.com/deploy/button.svg)](https://heroku.com/deploy)
Of course, you can deploy doccano by using [heroku-cli](https://devcenter.heroku.com/articles/heroku-cli).
```bash
heroku create
heroku stack:set container
git push heroku master
```
### AWS
Doccano can be deployed to AWS ([Cloudformation](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/Welcome.html)) by clicking on the button below:
[![AWS CloudFormation Launch Stack SVG Button](https://cdn.rawgit.com/buildkite/cloudformation-launch-stack-button-svg/master/launch-stack.svg)](https://us-east-1.console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/create/review?templateURL=https://s3-external-1.amazonaws.com/cf-templates-10vry9l3mp71r-us-east-1/20190732wl-new.templatexloywxxyimi&stackName=doccano)
> Notice: (1) EC2 KeyPair cannot be created automatically, so make sure you have an existing EC2 KeyPair in one region. Or [create one yourself](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-key-pairs.html#having-ec2-create-your-key-pair). (2) If you want to access doccano via HTTPS in AWS, here is an [instruction](https://github.com/chakki-works/doccano/wiki/HTTPS-setting-for-doccano-in-AWS).

ROADMAP.md → docs/roadmap.md

@ -2,7 +2,7 @@
Doccano is a fast-moving, community supported project. This roadmap provides guidance about priorities and focus areas of the doccano team and lists the functionality expected in upcoming releases of doccano. Many of these areas are driven by community use cases, and we welcome further contributions to doccano.
# Current status
## Current status
Doccano is now able to:
@ -18,11 +18,11 @@ Doccano is now able to:
* Confirm annotation statistics.
* Access via web API.
# Roadmap
## Roadmap
This is a list of features on the short term roadmap and beyond:
## APIs
### APIs
* Implement login and registration API.
* Implement OAuth 2.0 API.
@ -32,14 +32,14 @@ This is a list of features on the short term roadmap and beyond:
* Optimize performance for statistics API.
* More documentation and tutorials.
## Project management
### Project management
* Enable to manage user by project administrators.
* Implement RBAC and enable to assign a role to a user by project administrators.
* Enhance annotation statistics.
## Annotation
### Annotation
* Increase the number of annotation tasks such as relation extraction, entity linking, aspect-based sentiment analysis, visual question answering and so on.
* Introduce a plugin feature to define custom tasks by a user.
@ -49,7 +49,7 @@ This is a list of features on the short term roadmap and beyond:
* More documentation and tutorials.
## Upload and download
### Upload and download
* Enable to import data from cloud storage like s3.
* Improve UX by showing progress bar.
@ -57,7 +57,7 @@ This is a list of features on the short term roadmap and beyond:
* Support for custom tokenization.
* More performance optimizations.
## Accessibility
### Accessibility
* Support smartphone to enable anyone to annotate anywhere.
* Enable to customize font and font-family.
@ -65,7 +65,7 @@ This is a list of features on the short term roadmap and beyond:
* Enable to customize site theme per user.
## Entire project
### Entire project
* Design Vue component and use it to implement frontend.
* Introduce frontend testing framework.
@ -76,7 +76,7 @@ This is a list of features on the short term roadmap and beyond:
* Improve project management structure to accelerate the project improvement.
* Create GitHub page by using mkdocs and move wiki contents to it.
## Community and engagement
### Community and engagement
* New resources for community discussion and feedback.
* Gather and highlight novel doccano use cases.

41
mkdocs.yml

@ -0,0 +1,41 @@
# Project information
site_name: 'doccano'
site_description: 'A Text Annotation tool for Human'
site_author: 'Hiroki Nakayama'
site_url: 'https://chakki-works.github.io/doccano/'
# Repository
repo_name: 'chakki-works/doccano'
repo_url: 'https://github.com/chakki-works/doccano'
# Copyright
copyright: 'Copyright &copy; 2018 - 2019 Hiroki Nakayama'
theme:
  name: 'material'
  palette:
    primary: 'cyan'
    accent: 'cyan'
show_sidebar: true
extra:
  social:
    - type: 'github'
      link: 'https://github.com/Hironsan'
    - type: 'twitter'
      link: 'https://twitter.com/Hironsan13'
# Page tree
nav:
  - Doccano: index.md
  - Getting started: getting-started.md
  - Advanced:
    - AWS HTTPS settings: advanced/aws_https_settings.md
    - OAuth2 settings: advanced/oauth2_settings.md
  #- Release notes: release-notes.md
  #- Author's notes: authors-notes.md
  - FAQ: faq.md
  - Contributing: CONTRIBUTING.md
  - Code of Conduct: CODE_OF_CONDUCT.md
  - Roadmap: roadmap.md
  - License: LICENSE.md
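
The `nav` tree references several pages (for example `CONTRIBUTING.md` and `LICENSE.md`) that are not among the files added in this commit, so it may be worth checking which entries resolve to real files under the docs directory. Below is a minimal sketch of such a check. The nav is re-typed as a Python structure to avoid depending on a YAML parser, the helper names are hypothetical, and `docs` is assumed as the documentation directory because that is the mkdocs default.

```python
from pathlib import Path

# Hand-copied from the nav section above: a list of one-key mappings.
NAV = [
    {"Doccano": "index.md"},
    {"Getting started": "getting-started.md"},
    {"Advanced": [
        {"AWS HTTPS settings": "advanced/aws_https_settings.md"},
        {"OAuth2 settings": "advanced/oauth2_settings.md"},
    ]},
    {"FAQ": "faq.md"},
    {"Contributing": "CONTRIBUTING.md"},
    {"Code of Conduct": "CODE_OF_CONDUCT.md"},
    {"Roadmap": "roadmap.md"},
    {"License": "LICENSE.md"},
]


def nav_paths(nav):
    """Flatten a nested nav into a list of page paths, in order."""
    paths = []
    for entry in nav:
        for target in entry.values():
            if isinstance(target, list):
                paths.extend(nav_paths(target))  # a sub-section
            else:
                paths.append(target)
    return paths


def missing_pages(nav, docs_dir="docs"):
    """Return the nav paths that do not exist as files under docs_dir."""
    root = Path(docs_dir)
    return [path for path in nav_paths(nav) if not (root / path).is_file()]
```

Running `missing_pages(NAV)` from the repository root lists the nav entries that still need a file under `docs/`.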