Building a Development Environment for Customized Medical Research Applications

ANPANMAN Co.,Ltd.

Jinseob Kim

October 26, 2018

Executive Summary

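For customized medical research applications:

  1. Build a microservice architecture

    • Build a Docker image with RStudio Server and Shiny Server installed

    • Deploy with Docker swarm

    • Independent of the type and number of servers

  2. Assign subdomain addresses secured with HTTPS

    • Use Traefik, a dynamic proxy server

    • Applied automatically whenever a service is added (e.g., homepage, Jupyter)

  3. ShinyApps

    • Implement commonly used medical statistics methods as ShinyApps and deploy them to the environment above

    • Use data label information: publication-ready tables and figures with labels applied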

1. Microservice Architecture

https://blog.philipphauer.de/microservices-nutshell-pros-cons/

Travel pouch

https://funshop.akamaized.net/products/0000045775/HF-INLUGGAGE-POUCH-LINGERIE-%EC%83%81%EC%84%B8%ED%8E%98%EC%9D%B4%EC%A7%80_01.jpg

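Travel pouch: pros and cons

Pros

  1. Tidy.

  2. Easy to clear away.

  3. Easy to move to another bag.

  4. Works with any kind of bag.

Cons

  1. Less usable space.

  2. Sorting items into pouches is a chore.

  3. Finding an item means opening one more zipper.

Microservice: pros and cons

Pros

  1. Tidy.

  2. Easy to remove.

  3. Easy to reinstall on another computer.

  4. Independent of the type of computer or server.

Cons

  1. Less usable capacity.

  2. Building a module for every service is a chore.

  3. Possible performance degradation.

Virtual machines are the classic example of this approach.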

Docker

https://doi.org/10.1371/journal.pone.0152686

Docker Hub usage example

http://edu.delestra.com/docker-slides/img/docker_hub_auto_build.png

rshiny Dockerfile

# Pinned to 18.04 (bionic) to match the bionic-cran35 repository used below
FROM ubuntu:18.04

RUN sed -i 's/archive.ubuntu.com/mirror.kakao.com/g' /etc/apt/sources.list && \
    sed -i 's/security.ubuntu.com/mirror.kakao.com/g' /etc/apt/sources.list  && \
    sed -i 's/extras.ubuntu.com/mirror.kakao.com/g' /etc/apt/sources.list

LABEL maintainer="Jinseob Kim <jinseob2kim@gmail.com>"

# Setup apt to be happy with no console input
ENV DEBIAN_FRONTEND=noninteractive


# Install dependencies and Download 
RUN apt-get update && apt-get install -y \
    udev \
    locales \
    software-properties-common \
    file \
    curl \
    git \
    sudo \
    wget \
    gdebi-core \
    vim \
    psmisc \
    tzdata \
    libxml2-dev \
    libcairo2-dev \
    libgit2-dev \
    tk-table \
    libcurl4-gnutls-dev \
    libssl-dev \
    libxt-dev \
    supervisor && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# Prevent bugging us later about timezones
RUN ln -fs /usr/share/zoneinfo/Asia/Seoul /etc/localtime && dpkg-reconfigure --frontend noninteractive tzdata

# Use UTF-8
RUN locale-gen en_US.UTF-8
ENV LANG en_US.UTF-8


# Update R -latest version
RUN echo "deb http://cran.rstudio.com/bin/linux/ubuntu bionic-cran35/" | sudo tee -a /etc/apt/sources.list && \
    gpg --keyserver keyserver.ubuntu.com --recv-key E084DAB9 && \
    gpg -a --export E084DAB9 | sudo apt-key add - && \
    apt-get update && \
    apt-get install -y r-base r-base-dev

# Install Rstudio-server
ARG RSTUDIO_VERSION

RUN RSTUDIO_LATEST=$(wget --no-check-certificate -qO- https://s3.amazonaws.com/rstudio-server/current.ver) && \ 
    [ -z "$RSTUDIO_VERSION" ] && RSTUDIO_VERSION=$RSTUDIO_LATEST || true && \
    wget -q http://download2.rstudio.org/rstudio-server-${RSTUDIO_VERSION}-amd64.deb && \
    dpkg -i rstudio-server-${RSTUDIO_VERSION}-amd64.deb && \
    rm rstudio-server-*-amd64.deb 


# Install Shiny server
RUN wget --no-verbose https://s3.amazonaws.com/rstudio-shiny-server-os-build/ubuntu-14.04/x86_64/VERSION -O "version.txt" && \
    VERSION=$(cat version.txt)  && \
    wget --no-verbose "https://s3.amazonaws.com/rstudio-shiny-server-os-build/ubuntu-14.04/x86_64/shiny-server-$VERSION-amd64.deb" -O ss-latest.deb && \
    gdebi -n ss-latest.deb && \
    rm -f version.txt ss-latest.deb && \
    R -e "install.packages(c('shiny', 'rmarkdown', 'DT', 'data.table', 'ggplot2', 'devtools', 'epiDisplay', 'tableone', 'svglite', 'plotROC', 'pROC', 'labelled', 'geepack', 'lme4', 'PredictABEL', 'shinythemes', 'maxstat', 'manhattanly', 'Cairo', 'future', 'promises', 'GGally', 'fst', 'blogdown', 'metafor', 'roxygen2'), repos='https://cran.rstudio.com/')" && \
    R -e "devtools::install_github(c('jinseob2kim/jstable', 'jinseob2kim/jskm', 'emitanaka/shinycustomloader', 'Appsilon/shiny.i18n', 'metrumresearchgroup/sinew'))" 
    


## User setting
COPY ini.sh /etc/ini.sh


## Github
RUN git config --system credential.helper 'cache --timeout=3600' && \ 
    git config --system push.default simple 


## Multiple run
COPY supervisord.conf /etc/supervisor/conf.d/supervisord.conf
RUN mkdir -p /var/log/supervisor \
    && chmod 777 -R /var/log/supervisor


EXPOSE 8787 3838 


CMD ["/usr/bin/supervisord", "-c", "/etc/supervisor/conf.d/supervisord.conf"] 

Rocker project

https://www.rocker-project.org/images/

Development environment concept

Running the Docker image

docker run --rm -d \
    -p 3838:3838 -p 8787:8787 \
    -e USER=js -e PASSWORD=js -e ROOT=TRUE \
    jinseob2kim/docker-rshiny

On a local computer, connect to http://localhost:8787 (RStudio Server) and http://localhost:3838 (Shiny Server). On a server, use Your IP:8787 and Your IP:3838.

How many servers?

A microservice architecture that is independent of the type and number of servers…

https://www.penflip.com/akira.ohio/appcatalyst-hands-on-lab-en/blob/master/images/docker-ppt-swarm-1.png/?raw=true

Docker swarm

https://www.upcloud.com/support/docker-swarm-orchestration/

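Process

  1. Install Docker on every server.

  2. Join the servers into a cluster: manager servers and worker servers.

  3. When a Docker image is run as a service from the manager server, it is automatically placed on an idle server.

  4. The service can be reached through any server's address.
    • Both manager IP:8787 and worker IP:8787 work.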

Manager & worker node

https://pbs.twimg.com/media/DP5VZC8UIAAnV6j.jpg:large

Reachable through any node's IP

http://callistaenterprise.se/assets/blogg/docker/docker-in-swarm-mode-on-docker-in-docker/docker-swarm.png

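Example: joining two servers with Docker swarm

Two servers with Docker installed: a manager node and a worker node

In manager node

  1. Init Docker Swarm mode
# Placeholder value: replace it with the manager's real IP address
manager_ip=123.456.789.10
docker swarm init --advertise-addr $manager_ip
  2. Get Swarm tokens
worker_token=$(docker swarm join-token worker -q)

In worker node

  1. Join worker nodes
# $worker_token and $manager_ip must hold the values obtained on the manager node
docker swarm join --token $worker_token $manager_ip:2377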

https://www.youtube.com/watch?v=2RQbpnRxx-Y

Caution (1): Port settings for swarm

AWS Security Group Example
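
The security group screenshot is not reproduced in this text, so here is a minimal sketch of the ports that Docker swarm mode needs open between nodes. The function name `swarm_ports` and the use of `ufw` are illustrative assumptions (the slide itself refers to an AWS security group), and the loop only prints the rules instead of applying them.

```shell
#!/bin/sh
# Ports Docker swarm mode uses between nodes:
#   2377/tcp  cluster management
#   7946/tcp  node-to-node communication
#   7946/udp  node-to-node communication
#   4789/udp  overlay network traffic (VXLAN)
swarm_ports() {
  printf '%s\n' 2377/tcp 7946/tcp 7946/udp 4789/udp
}

# Dry run: print one firewall rule per port instead of applying it.
# "ufw" is an assumption; with a cloud provider, add the same ports
# to the security group or firewall of every node instead.
for port in $(swarm_ports); do
  echo "ufw allow $port"
done
```

The same four entries would go into the inbound rules of every node, in addition to the service ports such as 8787 and 3838.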

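Caution (2)

Only servers that can reach each other over the network can be joined.

  1. AWS with AWS (O), Azure with Azure (O), DigitalOcean with DigitalOcean (O)

  2. AWS with Azure (X), AWS with DigitalOcean (X)

  3. AWS (or Azure, DigitalOcean) with an on-premises server (X)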
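
Whether two servers can reach each other can be checked before attempting a join. The sketch below is one possible check, not part of the original deck: the helper name `can_reach` is hypothetical, it assumes `bash` and the coreutils `timeout` command are installed, and it only tests the TCP management port (the UDP ports are not covered).

```shell
#!/bin/sh
# Return 0 if a TCP connection to host:port succeeds within 3 seconds.
# Assumes bash (for /dev/tcp) and coreutils "timeout" are available.
can_reach() {
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Run on the worker before "docker swarm join".
# manager_ip is a placeholder; replace it with the manager's real address.
manager_ip=127.0.0.1
if can_reach "$manager_ip" 2377; then
  echo "manager reachable on 2377/tcp"
else
  echo "manager NOT reachable on 2377/tcp"
fi
```

If the check fails between two providers, the join command would most likely fail in the same way.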

Running services: RStudio & Shiny server

Using our own image, docker-rshiny

docker service create \
    --publish 8787:8787 \
    --publish 3838:3838 \
    -e USER=js -e PASSWORD=js -e ROOT=TRUE \
    --name rshiny \
    jinseob2kim/docker-rshiny

Extra: running the TensorFlow Docker image

docker service create \
    --name tf \
    --publish 8888:8888 \
    tensorflow/tensorflow

What if the number of users grows?

Use the docker service scale command to run replicas of the image on multiple servers.

docker service scale rshiny=2

Scaling back down

docker service scale rshiny=1
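
To see where the replicas ended up after scaling, the task list can be summarized per node. This is a small sketch: the helper `count_by_node` is hypothetical, and the sample input stands in for real `docker service ps` output so that the snippet runs without a swarm.

```shell
#!/bin/sh
# Count how many tasks of a service run on each node.
# Reads "task-name node-name" pairs, one per line, from stdin.
count_by_node() {
  awk '{ n[$2]++ } END { for (k in n) print k, n[k] }' | sort
}

# Sample input standing in for real "docker service ps" output.
printf '%s\n' 'rshiny.1 manager1' 'rshiny.2 worker1' | count_by_node

# On a swarm manager, the real input would come from:
#   docker service ps rshiny --filter desired-state=running \
#       --format '{{.Name}} {{.Node}}' | count_by_node
```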

Optional: Docker-machine

https://docs.docker.com/machine/overview/#whats-the-difference-between-docker-engine-and-docker-machine

Installing Docker-machine

base=https://github.com/docker/machine/releases/download/v0.15.0 &&
curl -L $base/docker-machine-$(uname -s)-$(uname -m) >/tmp/docker-machine &&
sudo install /tmp/docker-machine /usr/local/bin/docker-machine
docker-machine version

Example: DigitalOcean - create a server named manager

export DIGITALOCEAN_ACCESS_TOKEN=<YOUR_DIGITALOCEAN_ACCESS_TOKEN>
export DIGITALOCEAN_IMAGE="ubuntu-18-04-x64"
export DIGITALOCEAN_REGION="sgp1"
echo "### Creating manager nodes ..."

for c in {1..1} ; do
  docker-machine create \
     --driver digitalocean \
     --digitalocean-access-token $DIGITALOCEAN_ACCESS_TOKEN \
     --digitalocean-image $DIGITALOCEAN_IMAGE \
     --digitalocean-region $DIGITALOCEAN_REGION \
     --digitalocean-size "s-2vcpu-4gb" \
     manager$c &&\
  docker-machine ssh manager$c "adduser js --gecos 'First Last,RoomNumber,WorkPhone,HomePhone' --disabled-password && sh -c 'echo js:js | sudo chpasswd' && usermod -aG sudo js"
done

AWS

export AWS_ACCESS_KEY_ID=<YOUR_AWS_ACCESS_KEY_ID>
export AWS_SECRET_ACCESS_KEY=<YOUR_AWS_SECRET_ACCESS_KEY>
export AWS_INSTANCE_TYPE="t2.micro" 
export AWS_INSTANCE_REGION="ap-northeast-2"
export AWS_SECURITY_GROUP="launch-wizard-2"
export AWS_VPC_ID=<YOUR_AWS_VPC_ID>
export AWS_ZONE=c


for c in {1..1} ; do
docker-machine create \
  --driver amazonec2 \
  --amazonec2-access-key $AWS_ACCESS_KEY_ID \
  --amazonec2-secret-key $AWS_SECRET_ACCESS_KEY \
  --amazonec2-region $AWS_INSTANCE_REGION \
  --amazonec2-vpc-id $AWS_VPC_ID \
  --amazonec2-open-port 3838 \
  --amazonec2-open-port 8787 \
  --amazonec2-open-port 8000 \
  --amazonec2-open-port 8080 \
  --amazonec2-open-port 2377 \
  --amazonec2-open-port 7946 \
  --amazonec2-open-port 7946/udp \
  --amazonec2-open-port 4789 \
  --amazonec2-open-port 4789/udp \
  --amazonec2-open-port 8888 \
  --amazonec2-open-port 80 \
  --amazonec2-open-port 443 \
  manager$c && \
  docker-machine ssh manager$c "adduser js --gecos 'First Last,RoomNumber,WorkPhone,HomePhone' --disabled-password && sh -c 'echo js:js | sudo chpasswd' && usermod -aG sudo js"
done

AZURE

export sub=<YOUR_AZURE_SUBSCRIPTION_VALUE>

for c in {1..1} ; do
docker-machine create \
    --driver azure \
    --azure-location "koreacentral" \
    --azure-size Standard_B1s \
    --azure-subscription-id $sub \
    --azure-open-port 3838 \
    --azure-open-port 8787 \
    --azure-open-port 8000 \
    --azure-open-port 8080 \
    --azure-open-port 2377 \
    --azure-open-port 7946 \
    --azure-open-port 7946/udp \
    --azure-open-port 4789 \
    --azure-open-port 4789/udp \
    --azure-open-port 8888 \
    --azure-open-port 80 \
    --azure-open-port 443 \
    manager$c && \
    docker-machine ssh manager$c "adduser js --gecos 'First Last,RoomNumber,WorkPhone,HomePhone' --disabled-password && sh -c 'echo js:js | sudo chpasswd' && usermod -aG sudo js"
done

Adding a server to join: worker node

export DIGITALOCEAN_SIZE="s-1vcpu-1gb"
echo "### Creating worker nodes ..."
for c in {1..1} ; do
    docker-machine create \
  --driver digitalocean \
  --digitalocean-access-token $DIGITALOCEAN_ACCESS_TOKEN \
  --digitalocean-image $DIGITALOCEAN_IMAGE \
  --digitalocean-region $DIGITALOCEAN_REGION \
  --digitalocean-size $DIGITALOCEAN_SIZE \
  worker$c && \
  docker-machine ssh worker$c "adduser js --gecos 'First Last,RoomNumber,WorkPhone,HomePhone' --disabled-password && sh -c 'echo js:js | sudo chpasswd' && usermod -aG sudo js"
done

Joining the servers: using Docker-machine

Join the manager1 and worker1 nodes with Docker swarm.

# Get IP from leader node
leader_ip=$(docker-machine ip manager1)

# Init Docker Swarm mode
echo "### Initializing Swarm mode ..."
eval $(docker-machine env manager1)
docker swarm init --advertise-addr $leader_ip

# Swarm tokens
manager_token=$(docker swarm join-token manager -q)
worker_token=$(docker swarm join-token worker -q)

# Join additional manager nodes (manager1 already initialized the swarm, so start from manager2)
echo "### Joining manager nodes ..."
num_managers=1
for ((c = 2; c <= num_managers; c++)); do
    eval $(docker-machine env manager$c)
    docker swarm join --token $manager_token $leader_ip:2377
done

# Join worker nodes
echo "### Joining worker nodes ..."
for c in {1..1} ; do
    eval $(docker-machine env worker$c)
    docker swarm join --token $worker_token $leader_ip:2377
done


# Clean Docker client environment
echo "### Cleaning Docker client environment ..."
eval $(docker-machine env -u)

2. Dynamic proxy & https: Traefik

Problem

A reverse proxy is needed.

https://diarmuid.ie/media/nginx-docker-reverse-proxy.png

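Problem: nginx

  1. Poor fit with Docker?
  2. HTTPS is not applied automatically (certificates must be configured by hand)

News headline: HTTP on its way out of Google Chrome… "Not secure" warnings from July

  3. Subdomains are not assigned automatically (each one needs manual configuration)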

Traefik

A dynamic proxy for Docker swarm

https://ian-says.com/articles/traefik-proxy-docker-lets-encrypt/

Overview Traefik

https://hub.docker.com/r/ghiltoniel/traefik-react/

Run Traefik

  1. Add the domain: *.DOMAINNAME

A CNAME record for *.DOMAINNAME must be added in the DNS settings.
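
A wildcard record covers the subdomains but not the bare domain itself, which needs its own record. The sketch below illustrates that rule with plain string matching; the helper `covered_by_wildcard` is hypothetical and does not query DNS, and the hostnames are the ones used elsewhere in this deck.

```shell
#!/bin/sh
# Return 0 if hostname $1 is a subdomain of $2, i.e. would be matched
# by a "*.$2" record. The bare domain itself is not matched.
covered_by_wildcard() {
  case "$1" in
    "$2")    return 1 ;;  # apex domain: needs a separate record
    *".$2")  return 0 ;;  # any name ending in ".domain"
    *)       return 1 ;;
  esac
}

DOMAINNAME="anpanman.co.kr"
for host in "app.$DOMAINNAME" "server.$DOMAINNAME" "$DOMAINNAME"; do
  if covered_by_wildcard "$host" "$DOMAINNAME"; then
    echo "$host: covered by *.$DOMAINNAME"
  else
    echo "$host: needs its own record"
  fi
done
```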

2. Create a network for Traefik

# Run in manager node
eval $(docker-machine env manager1)

# Create network for swarm
docker network create --driver=overlay traefik-net

3. Let’s Encrypt setup

# For Let's Encrypt
docker-machine ssh manager1 "DOMAINNAME=anpanman.co.kr && \
                             mkdir -p /home/js/opt/traefik && \
                             cd /home/js/opt/traefik && \
                             touch acme.json && chmod 600 acme.json && \
                             wget -O traefik.toml https://raw.githubusercontent.com/jinseob2kim/swarm-setting/master/opt/traefik/traefik.toml"
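
The `chmod 600` step matters because, as far as I know, Traefik 1.x rejects an ACME storage file whose permissions are more open than that. The following sketch checks the mode locally; the helper `is_mode_600` is hypothetical, and a temporary directory stands in for /home/js/opt/traefik.

```shell
#!/bin/sh
# Create an empty ACME storage file readable and writable by its owner only.
# A temporary directory stands in for /home/js/opt/traefik (demo assumption).
workdir=$(mktemp -d)
touch "$workdir/acme.json"
chmod 600 "$workdir/acme.json"

# Return 0 if the file mode is exactly rw------- (600).
is_mode_600() {
  [ "$(ls -l "$1" | cut -c1-10)" = "-rw-------" ]
}

if is_mode_600 "$workdir/acme.json"; then
  echo "acme.json permissions OK"
else
  echo "acme.json permissions too open" >&2
fi
```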

traefik.toml

defaultEntryPoints = ["http", "https"]

logLevel = "INFO"

[api]
dashboard = true
address = ":8080"

[entryPoints]
  [entryPoints.http]
  address = ":80"
    [entryPoints.http.redirect]
      entryPoint = "https"
  [entryPoints.https]
  address = ":443"
    [entryPoints.https.redirect]
      regex = "^https://anpanman.co.kr/(.*)"
      replacement = "https://www.anpanman.co.kr/$1"
      permanent = true
    [entryPoints.https.tls]  


[acme]
email = "jinseob2kim@gmail.com"
storage = "acme.json"
entryPoint = "https"
onHostRule = true
onDemand = false


## *.anpanman.co.kr & anpanman.co.kr should be in DNS "A or CNAME": digitalocean case.
[acme.dnsChallenge]
  provider = "digitalocean"
  delayBeforeCheck = 0 

4. Run Traefik

eval $(docker-machine env manager1)
DOMAINNAME="anpanman.co.kr"

# Create traefik service
# The flags and labels below use Traefik v1 syntax, so the image is pinned to a 1.x tag.
# The bind-mount sources point at the files created in /home/js/opt/traefik in step 3.
docker service create \
    --name traefik \
    --constraint=node.role==manager \
    --publish 80:80 --publish 443:443 \
    --mount type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock \
    --mount type=bind,source=/home/js/opt/traefik/acme.json,target=/acme.json \
    --mount type=bind,source=/home/js/opt/traefik/traefik.toml,target=/traefik.toml \
    -e DO_AUTH_TOKEN=$DIGITALOCEAN_ACCESS_TOKEN \
    -l traefik.port=8080 \
    -l traefik.frontend.rule=Host:monitor.$DOMAINNAME \
    --network traefik-net \
    traefik:1.7 \
    --logLevel=INFO \
    --docker \
    --docker.swarmMode \
    --docker.watch \
    --docker.domain=$DOMAINNAME

The dashboard is available at https://monitor.anpanman.co.kr.

Restarting the services: RStudio & Shiny server

Run the service again, this time behind Traefik.

docker service create \
    --name rshiny \
    --label traefik.shiny.port=3838 \
    --label traefik.rstudio.port=8787 \
    --label traefik.shiny.frontend.rule="Host:app.$DOMAINNAME" \
    --label traefik.rstudio.frontend.rule="Host:server.$DOMAINNAME" \
    -e USER=js -e PASSWORD=js -e ROOT=TRUE \
    --network traefik-net \
    jinseob2kim/docker-rshiny

RStudio Server is available at https://server.anpanman.co.kr and Shiny Server at https://app.anpanman.co.kr.

Adding a service: homepage

The homepage was built with the blogdown package and is served by the Docker image of nginx, a web and proxy server.

docker service create \
    --name nginx \
    --label traefik.port=80 \
    --label traefik.frontend.rule="Host:${DOMAINNAME},www.${DOMAINNAME}" \
    --network traefik-net \
    nginx

The running nginx service can be seen at https://anpanman.co.kr and https://www.anpanman.co.kr.

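Interim summary

  1. The required services were built as Docker images in advance.

  2. Docker-machine was used to create several cloud servers with Docker installed, and

  3. Docker swarm joined those servers into one cluster.

  4. Now, when a service is started, the swarm picks a suitable server and runs it there.

  5. Traefik automatically assigns a matching subdomain address whenever a service is added.

  6. HTTPS certificates from Let’s Encrypt are applied automatically.

Retrospective (late September)

  1. Docker swarm is overkill.

    • Does the service need to run around the clock?

    • Must the service stay up even if one server goes down?

    • Is this a large-scale project?

  2. For a cottage-industry operation, plain Docker is enough.

Current setup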

3. Building ShinyApps for Medical Research

Main packages used

DT

library(DT)
datatable(iris, extensions = "Buttons", rownames = FALSE,
          options = list(dom = '<lf<rt>Bip>', lengthMenu = list(c(10, 25, -1), c('10', '25', 'All')), pageLength = 10,
                        buttons = list('copy', 'print', 
                                       list(extend = 'collection', 
                                            buttons = list(list(extend = 'csv', filename= "table"),
                                                           list(extend = 'excel', filename= "table"), 
                                                           list(extend = 'pdf', filename= "table")
                                                           ), 
                                            text = 'Download')
                                       )
                        )
          ) %>% 
  formatStyle('Sepal.Length', fontWeight = styleInterval(5, c('normal', 'bold'))) %>%
  formatStyle(
    'Sepal.Width',
    color = styleInterval(c(3.4, 3.8), c('white', 'blue', 'red')),
    backgroundColor = styleInterval(3.4, c('gray', 'yellow'))
  ) %>%
  formatStyle(
    'Petal.Length',
    background = styleColorBar(iris$Petal.Length, 'steelblue'),
    backgroundSize = '100% 90%',
    backgroundRepeat = 'no-repeat',
    backgroundPosition = 'center'
  ) %>%
  formatStyle(
    'Species',
    transform = 'rotateX(45deg) rotateY(20deg) rotateZ(30deg)',
    backgroundColor = styleEqual(
      unique(iris$Species), c('lightblue', 'lightgreen', 'lightpink')
    )
  )

shinycustomloader

https://user-images.githubusercontent.com/7620319/38162696-cafcd18e-3531-11e8-8228-f08defa97ae0.gif

Label

Table 1: tableone package

https://github.com/kaz-yos/tableone

Main results

library(epiDisplay)
model0 <- glm(case ~ induced + spontaneous, family=binomial, data=infert)
logistic.display(model0, crude = TRUE, crude.p.value = TRUE)$table
## Log-likelihood = -139.806
## No. of observations = 248
## AIC value = 285.612

Plot

Shiny module and Rstudio addin

jsmodule

Rstudio addin: propensity score analysis

Multilingual support

https://cdn-ak.f.st-hatena.com/images/fotolife/k/ksmzn/20171209/20171209204102.gif

Examples

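  1. General-purpose basic statistics app

  2. General-purpose propensity score analysis app

  3. Health survey report

  4. Colorectal cancer patient study: Gangneung Asan Hospital

  5. Diastolic pressure calculation: Samsung Medical Center

  6. Cardiac disease risk factor study: Keimyung University Dongsan Medical Center

  7. Multilingual: Korean/English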

Q & A