Installation Runbook (AEN 4.1.0)

Overview

This installation runbook walks through the steps needed to install a basic Anaconda Enterprise Notebooks system consisting of the front-end server, gateway, and two or more compute machines.

The installation runbook is designed to be used as a step-by-step guide during the installation process. It is intended for two audiences:

  • Those who have direct access to the internet for installation, and
  • Those whose internet access is restricted for security reasons or is otherwise unavailable.

For these restricted (referred to as “Air gap”) environments, Continuum ships the entire Anaconda product suite on a portable storage medium or as a downloadable TAR archive. Please follow the “Air gap” instructions where noted.

If you have any questions about the installation runbook instructions, please contact your sales representative or Priority Support team.

Components

AEN Server: The administrative front-end to the system. This is where users log in to the system, where user accounts are stored, and where admins can manage the system.

AEN Gateway: The gateway is a reverse proxy that authenticates users and automatically directs them to the proper AEN Compute machine for their project. Users will not notice this component as it automatically routes them. One could put a gateway in each datacenter in a tiered scale-out fashion.

AEN Compute nodes: This is where projects are stored and run. AEN Compute machines only need to be reachable by the AEN Gateway, so they can be completely isolated by a firewall.

../../../_images/ae-notebooks/4.1.0/install/components.png

Table of Contents

  • Anaconda Enterprise Installation Run Book
    • Overview
    • Audience
    • Components
    • Table of Contents
  • Installation Requirements
    • Hardware Requirements
    • Software Requirements
    • Linux System Accounts Required
    • Software Prerequisites
    • Security Requirements
    • Network Requirements
    • Other Requirements
  • Install Preparation
    • Download the Installers
    • Gather IP addresses or FQDNs
  • Install AEN Server
    • AEN Server Preparation - Prerequisites
      • Download Prerequisite RPMs
      • Install Prerequisite RPMs
    • Run the AEN Server Installer
      • Setup Variables and Change Permissions
      • Run AEN Server Installer
      • Start ElasticSearch
      • Test the AEN Server Install
      • Update the License
  • Install AEN Gateway
    • Setup Variables and Change Permissions
    • Run Wakari Gateway Installer
    • Register the AEN Gateway
      • Ensure Proper Permissions
      • Start the Gateway
      • Verify the AEN Gateway has Registered
  • Install AEN Compute
    • Set Variables and Change Permissions
    • Run AEN Compute Installer
    • Configure AEN Compute Node
    • Configure conda to use local on-site Anaconda Enterprise Repo
      • Edit the condarc on the Compute Node
      • Configure Anaconda Client
  • Optional Configuration
    • Configure common AEN Compute options
      • Change the project directory
      • Create groups with the same id
      • Use numeric usernames
    • Verify and Tune Search Indexing
    • Setting a Default Project Environment
    • Configure a remote mongodb
    • SELinux Enforcing Mode
  • Wrapping Up

Installation Requirements

Hardware Requirements

AEN Server

  • 2+ GB RAM
  • 2+ CPU cores
  • 20 GB storage

AEN Gateway

  • 2 GB RAM
  • 2 CPU cores

AEN Compute (N-machines)

Configure to meet the needs of the projects. At least:

  • 2 GB RAM
  • 2 CPU cores
  • 20 GB storage

NOTE: We recommend putting /opt/wakari and /projects on the same filesystem. If the project and conda env directories are on separate filesystems then more disk space will be required on compute nodes and performance will be worse.

Software Requirements

  • RHEL/CentOS versions 6.5 to 6.8 on all nodes (Other operating systems are supported, however this document assumes RHEL or CentOS)
  • /opt/wakari: Ability to install here and at least 5GB of storage.
  • /projects: Size depends on number and size of projects. At least 20GB of storage.
  • ACL: These directories need the filesystem mounted with POSIX ACL support (POSIX.1e). Check with mount and tune2fs -l /path/to/filesystem | grep options
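The ACL check above can be scripted. The following is a minimal sketch, not part of the product: has_acl_option is a hypothetical helper name, and it only inspects text you pass to it, such as the "Default mount options" line printed by tune2fs -l or the options shown by mount.

```shell
# Hypothetical helper (not shipped with AEN): succeeds when the given
# options text contains "acl" as a whole word, so "noacl" does not match.
has_acl_option() {
    printf '%s\n' "$1" | grep -qw 'acl'
}

# Example usage on a real system (the device path is a placeholder):
#   opts=$(sudo tune2fs -l /dev/sdXN | grep 'Default mount options')
#   has_acl_option "$opts" && echo "ACL is enabled by default"
```

Passing the text in as an argument keeps the helper independent of how the options were obtained, so the same check works for mount output.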

Linux System Accounts Required

Some Linux system accounts (UIDs) are added to the system during installation. If your organization requires special actions, here is the list of UIDs:

  • mongod (RHEL) or mongodb (Ubuntu/Debian): created by the RPM or deb package
  • elasticsearch: created by RPM or deb package
  • nginx: created by RPM or deb package
  • AEN_SRVC_ACCT: created during installation of Anaconda Enterprise Notebooks, and defaults to “wakari”

Software Prerequisites

AEN Server

  • Mongo Version: >= 2.6.8 and < 3.0
  • Nginx version: >= 1.4.0
  • ElasticSearch: >= 1.7.2
  • Oracle JRE 7 and 8
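A version range such as the MongoDB requirement above can be checked in a script. This is a sketch under stated assumptions: version_ge and mongo_version_ok are hypothetical helper names, and the comparison relies on the -V (version sort) option of GNU sort.

```shell
# version_ge A B: succeeds when version A >= version B.
# Assumes GNU sort, which provides -V for version ordering.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# mongo_version_ok V: succeeds when 2.6.8 <= V < 3.0, the range listed above.
mongo_version_ok() {
    version_ge "$1" "2.6.8" && ! version_ge "$1" "3.0"
}

# Example usage with a version string you have already looked up:
#   mongo_version_ok "2.6.8" && echo "MongoDB version is in range"
```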

AEN Compute

  • git
  • X Window System

Security Requirements

  • root or sudo access
  • SELinux in Permissive or Disabled mode - check with getenforce
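The SELinux requirement can be turned into a small pre-install check. The sketch below assumes only that getenforce prints one of Enforcing, Permissive or Disabled; selinux_mode_ok is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds when the given SELinux mode is one of the
# modes this runbook accepts for a default installation.
selinux_mode_ok() {
    case "$1" in
        Permissive|Disabled) return 0 ;;
        *) return 1 ;;
    esac
}

# Example usage on a node:
#   selinux_mode_ok "$(getenforce)" || echo "SELinux is enforcing; see the optional SELinux section"
```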

Network Requirements

  • TCP Ports
direction | type | port | protocol | optional | configurable | comments
inbound | TCP | 80 | HTTP | No | No | Server
in/out | TCP | 8089 | | No | No | Gateway
in/out | TCP | 5002 | | No | No | Compute

Other Requirements

Assuming the above requirements are met, there are no additional dependencies necessary for AEN.

Note: While not a requirement for running the software, these instructions use curl to download packages used in the install process. You may use other appropriate means to put the needed files into the installation directory.

Installation Preparation

Download the Installers

Download the installers and copy them to the corresponding servers.

  • Regular Installation:

    curl -O $RPM_CDN/aen-server-4.1.0-Linux-x86_64.sh
    curl -O $RPM_CDN/aen-gateway-4.1.0-Linux-x86_64.sh
    curl -O $RPM_CDN/aen-compute-4.1.0-Linux-x86_64.sh
    

Note: the $RPM_CDN server will be provided by your sales rep.

Gather IP addresses or FQDNs

AEN is very sensitive to the IP address or domain name used to connect to the Server and Gateway components. If users will be using the domain name, you should install the components using the domain name instead of the IP addresses. The authentication system requires the proper hostnames when authenticating users between the services.

Fill in the domain names or IP addresses of the components below. After installing the AEN Server component, also record the user name and auto-generated password for the administrative user account.

Component | Name or IP address
AEN Server |
AEN Gateway |
AEN Compute |

Notes:

  • We will refer to the values of these IP entries or DNS entries as, e.g., <AEN_SERVER_IP> or <AEN_SERVER_FQDN>, particularly in examples of shell commands. Consider actually assigning those values to environment variables with similar names.

Setup Variables

AEN Server Address

Define an environment variable for the AEN Server address (FQDN or IP):

export AEN_SERVER=<AEN_SERVER_IP>  # <from table above>

Note that the address (FQDN or IP) specified for the AEN server must be resolvable by your intended AEN users' web clients. You may verify your hostname as follows:

echo $AEN_SERVER

AEN Functional ID

AEN must be installed and executed by a Linux account called the AEN Service Account. The username of the AEN Service Account is called the AEN Functional ID (NFI). The AEN Service Account is created during AEN installation if it does not exist and is used to run all AEN services.

The default NFI username is wakari. Another popular choice is aen_admin. Set the environment variable “AEN_SRVC_ACCT” to “wakari” or your chosen name before installation:

export AEN_SRVC_ACCT="aen_admin"

This name will then be the username of the AEN Service Account and the username of the AEN Admin account.

When upgrading AEN, set the NFI to the NFI of the current installation.

AEN Functional Group

The name of the AEN Functional Group (NFG) is often set to “wakari” or “aen_admin” but may be given any name. This Linux group includes the AEN Service Account, so all files and directories that have the owner NFI also have the group NFG.

When upgrading AEN, set the NFG to the NFG of the current installation.

Set the AEN Functional Group (NFG) with this command before installation, either using “wakari” or replacing it with your chosen name:

export AEN_SRVC_GRP="aen_admin"

AEN Install sudo Command

During AEN installation, the installers perform various operations that require root-level privileges. By default, the installers use the sudo command to perform these operations. To override this default, set the environment variable AEN_SUDO_INSTALL_CMD *before* installation. Set it to an alternative command, or to an empty string when the user running the installers already has root privileges and the sudo command is not needed or not available.

Examples:

export AEN_SUDO_INSTALL_CMD=""
export AEN_SUDO_INSTALL_CMD="sudo2"

AEN sudo Command

By default, the AEN services use sudo -u to perform operations on behalf of other users. Such operations include mkdir, chmod, cp and mv. To override the default sudo command when sudo is not available on the system, set the environment variable AEN_SUDO_CMD before installation.

Note: AEN must be able to perform operations on behalf of other users, so this environment variable cannot be set to an empty string or null. The command named in AEN_SUDO_CMD must support the -u command-line parameter in the same way the sudo command does.

Example:

export AEN_SUDO_CMD="sudo2"
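Before running an installer, it can help to confirm that the variables described in this section are set. The following is a minimal sketch, assuming you follow this runbook in exporting AEN_SERVER, AEN_SRVC_ACCT and AEN_SRVC_GRP; check_aen_env is a hypothetical helper and is not part of the installers.

```shell
# Hypothetical pre-flight check (not part of the AEN installers).
# Reports each problem on stderr and returns non-zero if any was found.
check_aen_env() {
    status=0
    for name in AEN_SERVER AEN_SRVC_ACCT AEN_SRVC_GRP; do
        # Indirect lookup of the variable named in $name (POSIX-compatible).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name" >&2
            status=1
        fi
    done
    # AEN_SUDO_CMD may be left unset (sudo is then used), but if it is set
    # it must not be empty, as noted above.
    if [ "${AEN_SUDO_CMD+set}" = "set" ] && [ -z "$AEN_SUDO_CMD" ]; then
        echo "AEN_SUDO_CMD must not be empty" >&2
        status=1
    fi
    return $status
}

# Example usage:
#   check_aen_env && echo "environment looks complete"
```

AEN_SUDO_INSTALL_CMD is deliberately not checked, because an empty value is a valid setting for it.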

Note on Post-Install Customization

Please review the post-installation documentation for additional information on configuration options.

While root/sudo privileges are required during installation, they are not required during normal operations after install if user accounts are managed outside the software (for example, via LDAP). However, root/sudo privileges are required to start the services, so the service config files may still need an AEN_SUDO_CMD entry.

Install AEN Server

The AEN server is the administrative front-end to the system. This is where users log in to the system, where user accounts are stored, and where admins can manage the system.

AEN Server Preparation - Prerequisites

Download Prerequisite RPMs

  • Regular Installation:
RPM_CDN="https://820451f3d8380952ce65-4cc6343b423784e82fd202bb87cf87cf.ssl.cf1.rackcdn.com"
curl -O $RPM_CDN/nginx-1.6.2-1.el6.ngx.x86_64.rpm
curl -O $RPM_CDN/mongodb-org-tools-2.6.8-1.x86_64.rpm
curl -O $RPM_CDN/mongodb-org-shell-2.6.8-1.x86_64.rpm
curl -O $RPM_CDN/mongodb-org-server-2.6.8-1.x86_64.rpm
curl -O $RPM_CDN/mongodb-org-mongos-2.6.8-1.x86_64.rpm
curl -O $RPM_CDN/mongodb-org-2.6.8-1.x86_64.rpm
curl -O $RPM_CDN/elasticsearch-1.7.2.noarch.rpm
curl -O $RPM_CDN/jre-8u65-linux-x64.rpm

Install Prerequisite RPMs

sudo yum install -y *.rpm
sudo /etc/init.d/mongod start
sudo /etc/init.d/elasticsearch stop
sudo chkconfig --add elasticsearch

Run the AEN Server Installer

Set Variables and Change Permissions

export AEN_SERVER=<FQDN HOSTNAME> # Use the real FQDN
chmod a+x aen-*.sh                # Set installer to be executable

Run AEN Server Installer

sudo -E ./aen-server-4.1.0-Linux-x86_64.sh -w $AEN_SERVER
<license text>
...
...

PREFIX=/opt/wakari/wakari-server
Logging to /tmp/wakari_server.log
Checking server name
Ready for pre-install steps
Installing miniconda
...
...
Checking server name
Loading config from /opt/wakari/wakari-server/etc/wakari/config.json
Loading config from /opt/wakari/wakari-server/etc/wakari/wk-server-config.json


===================================

Created password '<RANDOM_PASSWORD>' for user 'wakari'

===================================


Starting Wakari daemons...
installation finished.

When the installation script completes successfully, the installer has created the administrator account (the AEN_SRVC_ACCT user) and assigned it a password:

Created password '<RANDOM_PASSWORD>' for user 'wakari'

Record this password. It will be needed in the following steps. It is also available in the installation log file found at /tmp/wakari_server.log
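If the console output is no longer available, the password can be pulled out of the log with a small filter. This sketch assumes the log contains the "Created password" line in the same form as the console output shown above; extract_admin_password is a hypothetical helper name.

```shell
# Hypothetical helper: reads installer output on stdin and prints the
# password from the last "Created password '...' for user '...'" line.
extract_admin_password() {
    sed -n "s/.*Created password '\([^']*\)' for user '[^']*'.*/\1/p" | tail -n 1
}

# Example usage on the AEN Server:
#   sudo cat /tmp/wakari_server.log | extract_admin_password
```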

Start ElasticSearch

Start elasticsearch to read the new config file:

sudo service elasticsearch start

Test the AEN Server install

Visit http://$AEN_SERVER. You should be shown the “license expired” page.

Update the License

From the “license expired” page, follow the onscreen instructions to upload your license file. After submitting, you should see the login page.

Install AEN Gateway

The gateway is a reverse proxy that authenticates users and automatically directs them to the proper AEN Compute machine for their project. Users will not notice this component as it automatically routes them.

Set Variables and Change Permissions

export AEN_SERVER=<FQDN HOSTNAME> # Use the real FQDN
export AEN_GATEWAY_PORT=8089
export AEN_GATEWAY=<FQDN HOSTNAME>  # will be needed shortly
chmod a+x aen-*.sh                # Set installer to be executable

Run Wakari Gateway Installer

sudo -E ./aen-gateway-4.1.0-Linux-x86_64.sh -w $AEN_SERVER
<license text>
...
...

PREFIX=/opt/wakari/wakari-gateway
Logging to /tmp/wakari_gateway.log
...
...
Checking server name
Please restart the Gateway after running the following command
to connect this Gateway to the AEN Server
...

Note: replace password with the password of the NFI user that was generated during server installation.

Register the AEN Gateway

The AEN Gateway needs to register with the AEN Server. Registration must be authenticated, so use the credentials of the NFI user created during the AEN Server install. Run the command with sudo or as root so that it can write the configuration file: /opt/wakari/wakari-gateway/etc/wakari/wk-gateway-config.json

sudo /opt/wakari/wakari-gateway/bin/wk-gateway-configure \
--server http://$AEN_SERVER --host $AEN_GATEWAY \
--port $AEN_GATEWAY_PORT --name Gateway --protocol http \
--summary Gateway --username $AEN_SRVC_ACCT \
--password '<USE PASSWORD SET ABOVE>'

Ensure Proper Permissions

sudo chown $AEN_SRVC_ACCT /opt/wakari/wakari-gateway/etc/wakari/wk-gateway-config.json

Start the Gateway

sudo service wakari-gateway start

Verify the AEN Gateway has Registered

  1. Log in to the AEN Server as the AEN_SRVC_ACCT user, using the Chrome or Firefox browser.

  2. Click the Admin link in the toolbar

    ../../../_images/ae-notebooks/4.1.0/install/admin-menu.png
  3. Click the Datacenters subsection and then click your datacenter:

    ../../../_images/ae-notebooks/4.1.0/install/datacenter-leftnav.png
  4. Verify that your datacenter is registered and status is {"status": "ok", "messages": []}

    ../../../_images/ae-notebooks/4.1.0/install/datacenter.png

Install AEN Compute

This is where projects are stored and run. Adding multiple AEN Compute machines lets you scale out horizontally to increase capacity. Projects can be created on individual compute nodes to spread the load.

Set Variables and Change Permissions

export AEN_SERVER=<FQDN HOSTNAME> # Use the real FQDN
chmod a+x aen-*.sh                # Set installer to be executable

Run AEN Compute Installer

sudo -E ./aen-compute-4.1.0-Linux-x86_64.sh -w $AEN_SERVER
...
...
PREFIX=/opt/wakari/wakari-compute
Logging to /tmp/wakari_compute.log
Checking server name
...
...
Initial clone of root environment...
Starting Wakari daemons...
installation finished.
Do you wish the installer to prepend the wakari-compute install location
to PATH in your /root/.bashrc ? [yes|no]
[no] >>> yes

Configure AEN Compute Node

Once installed, you need to configure the Compute Launcher on AEN Server.

  1. Point your browser at the AEN Server
  2. Log in as the AEN_SRVC_ACCT user
  3. Click on the Admin link in the top navbar
  4. Click on Enterprise Resources in the left navbar
  5. Click on Add Resource
  6. Select the correct (probably the only) Data Center to associate this Compute Node with
  7. For URL, enter http://$AEN_COMPUTE:5002.

Note: If the Compute Launcher is located on the same box as the Gateway, we recommend using http://localhost:5002 for the URL value.

  8. Add a Name and Description for the compute node
  9. Click the Add Resource button to save the changes.

Configure conda to use local on-site Anaconda Enterprise Repo

This configures Anaconda Enterprise Notebooks to use a local on-site Anaconda Enterprise Repository server instead of Anaconda.org.

Edit the condarc on the Compute Node

Note: If there are some channels below that you haven’t mirrored, you should remove them from the configuration.

#/opt/wakari/anaconda/.condarc
channels:
    - defaults

create_default_packages:
    - anaconda-client
    - python
    - ipython-we
    - pip

# Default channels is needed for when users override the system .condarc
# with ~/.condarc.  This ensures that "defaults" maps to your Anaconda Repository and not
# repo.anaconda.com
default_channels:
    - http://<your Anaconda Repository name>:8080/conda/anaconda
    - http://<your Anaconda Repository name>:8080/conda/wakari
    - http://<your Anaconda Repository name>:8080/conda/anaconda-cluster
    - http://<your Anaconda Repository name>:8080/conda/r-channel

# Note:  You must add the "conda" subdirectory to the end
channel_alias: http://<your Anaconda Repository name>:8080/conda

Configure Anaconda Client

Anaconda client lets users work with the Anaconda Repository from the command line, for example to search for packages, log in, and upload packages.

Run the following command, filling in the proper value, to set the default Anaconda Repository URL for anaconda-client for all users on the compute node. The command requires sudo because the config file is written to the root file system: /etc/xdg/binstar/config.yaml.

sudo /opt/wakari/anaconda/bin/anaconda config --set url http://<your Anaconda Repository>:8080/api -s

Congratulations! You’ve now successfully installed and configured Anaconda Enterprise Notebooks.

Optional Configuration

Optional: Configure common AEN Compute options

To make any of the changes described below, please edit the following file: /opt/wakari/wakari-compute/etc/wakari/wk-compute-launcher-config.json

Then restart the AEN Compute service:

sudo service wakari-compute restart

Increase HTTP timeout between Gateway and Compute nodes

The default timeout is 600 seconds (10 minutes). To adjust this, edit the httpTimeout key:

"httpTimeout": 600

Note: The httpTimeout must also be set on the Gateway Node with the same key at /opt/wakari/wakari-gateway/etc/wakari/wk-gateway-config.json

Change the project directory

NOTE: We recommend putting /opt/wakari and /projects on the same filesystem. If the project and conda env directories are on separate filesystems then more disk space will be required on compute nodes and performance will be worse.

To make the AEN Compute service use a directory other than /projects for storing the projects, modify the configuration file referenced above as follows:

"projectRoot" : "/nfs/storage/services/wakari/projects",

The directory /nfs/storage/services/wakari/projects specified as projectRoot above must exist for this to succeed.

Create groups with the same id

Additionally, if the /projects folder resides on an NFSv3 volume and you have a setup with several compute nodes, AEN will create local users with a different uid on each node.

To make the AEN Compute service create groups with the same id, edit the configuration file referenced above so that it contains the key identicalGID with the value true, as in the following example. If you don’t see the identicalGID key, add it, and note that you must add a comma at the beginning of the line. If you add this line as the last key, remove any comma at the end of the line.

, "identicalGID": true

Use numeric usernames

To use numeric usernames, modify the configuration file referenced above so that it contains the key numericUsernames with the value true, as in the following example. If you don’t see the numericUsernames key, add it, and note that you must add a comma at the beginning of the line. If you add this line as the last key, remove any comma at the end of the line.

, "numericUsernames": true
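Because a misplaced comma makes the configuration file invalid JSON, it can be worth validating the file after editing and before restarting the service. The following is a sketch under stated assumptions: validate_json is a hypothetical helper, and it assumes a python3 or python interpreter on the PATH that provides the standard json.tool module.

```shell
# Hypothetical helper: succeeds when the file at $1 parses as JSON.
# Returns 2 when no Python interpreter is found on the PATH.
validate_json() {
    py=$(command -v python3 || command -v python) || return 2
    "$py" -m json.tool "$1" > /dev/null 2>&1
}

# Example usage on a compute node:
#   validate_json /opt/wakari/wakari-compute/etc/wakari/wk-compute-launcher-config.json \
#       && sudo service wakari-compute restart
```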

Optional: Verify and Tune Search Indexing

Verify that the AEN Compute node can communicate with the AEN Server. This is required for search indexing to work correctly.

curl -m 5 $AEN_SERVER > /dev/null

Ensure that there are sufficient inotify watches available for the number of subdirectories within the project root filesystem. Some Linux distributions default to a low number of watches, which may prevent the search indexer from monitoring project directories for changes.

cat /proc/sys/fs/inotify/max_user_watches

If necessary, this can be increased with the following command:

echo fs.inotify.max_user_watches=100000 | sudo tee -a /etc/sysctl.conf && sudo sysctl -p

Ensure that there are sufficient inotify user instances available, at least one per project.

cat /proc/sys/fs/inotify/max_user_instances

If necessary, this can be increased with the following command:

echo fs.inotify.max_user_instances=1000 | sudo tee -a /etc/sysctl.conf && sudo sysctl -p
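To judge whether the current watch limit is high enough, the number of directories under the project root can be compared with it. This is a rough sketch, assuming one watch per directory is a reasonable estimate; count_dirs and watches_sufficient are hypothetical helper names.

```shell
# count_dirs ROOT: prints the number of directories under ROOT, including ROOT.
count_dirs() {
    find "$1" -type d | wc -l | tr -d ' '
}

# watches_sufficient ROOT LIMIT: succeeds when LIMIT covers every directory.
watches_sufficient() {
    [ "$(count_dirs "$1")" -le "$2" ]
}

# Example usage on a compute node:
#   watches_sufficient /projects "$(cat /proc/sys/fs/inotify/max_user_watches)" \
#       || echo "consider raising fs.inotify.max_user_watches"
```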

Optional: Setting up a Default Project Environment

Anaconda Enterprise Notebooks includes a full installation of the Anaconda Python distribution, along with several additional packages, located in the root conda environment in the path /opt/wakari/anaconda. A copy of this environment is created for each new AEN Project.

To configure a different set of packages as the defaults, create a new conda environment in the directory /opt/wakari/anaconda/envs/default. For example, to do so using a Python 3.4 base environment, run the following command:

sudo -u $AEN_SRVC_ACCT /opt/wakari/anaconda/bin/conda create -p /opt/wakari/anaconda/envs/default python=3.4

Then use conda to install any additional packages into the environment as needed. After creating the environment, clone it once to ensure that it works correctly:

sudo -u $AEN_SRVC_ACCT /opt/wakari/anaconda/bin/conda create -p /opt/wakari/testenv --clone /opt/wakari/anaconda/envs/default
sudo -u $AEN_SRVC_ACCT rm -rf /opt/wakari/testenv

The default project environment will be cloned into the project workspace the first time the project is started. To convert an existing project, run the following command to clone the environment, replacing /projects/owner/project/envs/<ENV_NAME> with the path to the new environment you would like to create within the project:

sudo -u $AEN_SRVC_ACCT /opt/wakari/anaconda/bin/conda create -p /projects/owner/project/envs/<ENV_NAME> --clone /opt/wakari/anaconda/envs/default

Then open the Compute Resource Config for the project and set the project environment path there.

Configure a remote mongodb

First, stop the AEN Server, AEN Gateway and AEN Compute services:

sudo service wakari-server stop
sudo service wakari-gateway stop
sudo service wakari-compute stop

To configure a remote database for the AEN Server, edit /opt/wakari/wakari-server/etc/wakari/config.json and add a key called MONGO_URL whose value is the connection information for the database. The final file should look like:

{
  "MONGO_URL": "mongodb://MONGO-USER:MONGO-PASSWORD@MONGO-URL:MONGO-PORT",
  "WAKARI_SERVER": "http://YOUR-IP",
  "USE_SES": false,
  "CDN": "http://YOUR-IP/static/",
  "ANON_USER": "anonymous"
}

You can migrate the data from the former database into the new one. The MongoDB documentation website has a guide on this. Once the migration has been performed, restart the services with:

sudo service wakari-server start
sudo service wakari-gateway start
sudo service wakari-compute start

Optional: SELinux Enforcing Mode

To run SELinux in Enforcing mode, a few ports must be configured, which can be done using the semanage port command.

The semanage command relies on policycoreutils-python. To install (if needed):

sudo yum -y install policycoreutils-python

Enable port 5000 for core aen-server:

sudo semanage port -m -t http_port_t -p tcp 5000

The -m flag is for modifying an existing usage of a port. If you get the error "Port tcp/5000 is not defined", change the flag to -a to add the port.

Enable ports 9200 and 9300 for elasticsearch:

sudo semanage port -a -t http_port_t -p tcp 9200
sudo semanage port -a -t http_port_t -p tcp 9300

Please see the Administrative documentation for additional information.

Wrapping Up

Congratulations. You now have a fully installed Anaconda Enterprise Notebooks system!

For additional documentation on topics such as creating user accounts, and for instructions for users who wish to use the system for collaborative analysis, please see other documentation resources.

Should you encounter any issues while installing AEN or have additional questions, please do not hesitate to contact your enterprise support representative.