Installation (AEN 4.1.2)

Overview

This installation procedure covers the steps needed to install a basic Anaconda Enterprise Notebooks (AEN) system consisting of a front-end Server, one or more Gateways, and one or more Compute Nodes.

If you have any questions about installation instructions, please contact your sales representative or Priority Support team.

Components

The AEN platform consists of three main service groups: AEN Server, AEN Gateway, and AEN Compute. These services can either be distributed across multiple servers (recommended) or run on a single machine.

Server

The Server component is the administrative front-end to the system. This is where users log in, where user accounts are stored, and where admins manage the system. It also interfaces with the database.

The Server is the main entry point for all users. It handles setting up projects and ensuring users are sent to the correct Data Center for a given Project.

Anaconda Enterprise Notebooks uses MongoDB to store internal data. This is typically run on the same host as the Server but can also be deployed on a separate host.

The Server uses NGINX to handle the user-facing web interface. NGINX acts as a request proxy: the actual Server web process runs on a high-numbered port listening only on localhost, and NGINX forwards requests there. NGINX also serves static content.

Gateway

The Gateway is a reverse proxy that authenticates users and automatically directs them to the proper AEN Compute machine for their project.

The Gateway provides a single access point to a set of Compute Nodes, and acts as a proxy service to manage authorization and mapping of URLs and ports to services that are running on Compute Nodes, thus providing a consistent uniform interface for the user.

For firewall reasons, you generally need one Gateway for each physical location in your organization that uses AEN.

Users will not notice the Gateway as it automatically routes requests to the proper Compute Node.

Compute Nodes

Compute Nodes are where Apps (such as Jupyter Notebook and Workbench) actually run. They are also the hosts that a user sees in a terminal session or when using SSH to access a node, and they are where all user-visible programs run. Each Project is associated with one or more Compute Nodes, and these in turn are part of a single Data Center. Compute Nodes need only be reachable by the AEN Gateway, so they can be completely isolated by a firewall.

Component organization

[Figure: AEN component organization diagram]

Organizationally, each Anaconda Enterprise Notebooks installation has exactly one Server instance. One or more Gateway instances can be configured, and each Compute Node can connect to only one Gateway. The collection of Compute Nodes served by a single Gateway is referred to as a Data Center. New Data Centers can be added to the AEN installation at any time.

For example, an Anaconda Enterprise Notebooks deployment with two Data Centers, where one Gateway has a cluster of 20 physical computers and the second Gateway has 30 virtual machines, would have the following complement of services installed and running:

1  AEN Server instance
2  AEN Gateway instances
50 AEN Compute instances (20 + 30)

Anaconda Enterprise Notebooks users interact with the system predominantly through Projects. A Project is a set of conda environments, Jupyter Notebooks, and other Apps that can be accessed by a Team of users.

Each Project is associated with a single Data Center within the AEN environment. The Team of users includes one Owner, the user who created the Project.

Since Anaconda Enterprise Notebooks is web-based, it uses configurable HTTP ports on the Server.

Installers

The Anaconda Enterprise Notebooks installers are available to paid customers only. If you are interested in a demonstration of Anaconda Enterprise Notebooks, please contact us.

Distributed install

In a distributed install the Server and Gateway run on separate hosts.

Single box install

In a single-box install the Server and the Gateway run on the same host. Because they are independent services, each needs its own external port.

Installation requirements

Ensure you have the proper hardware and software resources before installing AEN.

Hardware requirements

See System Requirements for all Anaconda Enterprise hardware requirements.

NOTE: We recommend putting ``/opt/wakari`` and ``/projects`` on the same filesystem. If the project and conda environment directories are on separate filesystems, Compute Nodes need more disk space and performance is worse.

Software requirements

  • Red Hat/CentOS versions 6.5 to 7.2 on all nodes (Other Linux distros are supported, but this installation document assumes Red Hat or CentOS.)
  • Linux home directories are required since Jupyter looks in $HOME for profiles and extensions.
  • /opt/wakari: Ability to install here and at least 10 GB of storage.
  • /projects: Size depends on number and size of projects. At least 20 GB of storage.
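
The storage minimums above, and the recommendation to keep both directories on one filesystem, can be checked before installing. The following is a minimal sketch rather than part of the AEN tooling: the helper names nearest_existing, mount_point_of, and avail_gb are illustrative, and the script only reads information from df.

```shell
#!/bin/sh
# Walk up from a path until something that exists is found, so the check
# also works before /opt/wakari or /projects has been created.
nearest_existing() {
    p=$1
    while [ ! -e "$p" ]; do
        p=$(dirname "$p")
    done
    printf '%s\n' "$p"
}

# Print the mount point of the filesystem holding a path. `df -P` prints one
# header line and one data line; the mount point is the last column.
mount_point_of() {
    df -P "$1" | awk 'NR == 2 { print $NF }'
}

# Print the free space on that filesystem in whole gigabytes. `df -Pk`
# reports available space in 1024-byte blocks in the fourth column.
avail_gb() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

wakari_dir=$(nearest_existing /opt/wakari)
projects_dir=$(nearest_existing /projects)

echo "/opt/wakari: $(avail_gb "$wakari_dir") GB free (at least 10 GB required)"
echo "/projects:   $(avail_gb "$projects_dir") GB free (at least 20 GB required)"

if [ "$(mount_point_of "$wakari_dir")" = "$(mount_point_of "$projects_dir")" ]; then
    echo "Both directories are on the same filesystem (recommended)"
else
    echo "The directories are on different filesystems"
fi
```

When both directories share a filesystem, the reported free space is the same number for both, so compare it against the combined requirement.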

Linux system accounts required

Some Linux system accounts (UIDs) are added to the system during installation. If your organization requires special actions, here is the list of UIDs:

  • mongod (Red Hat) or mongodb (Ubuntu/Debian): Created by the RPM or deb package
  • elasticsearch: Created by RPM or deb package
  • nginx: Created by RPM or deb package
  • AEN_SRVC_ACCT: Created during installation of Anaconda Enterprise Notebooks, and defaults to “wakari”
  • ANON_USER: An account such as public or anonymous on the Compute Node. If this user is not found, AEN_SRVC_ACCT will try to create it, and if this fails, projects will fail to start.
  • ACL: The /opt/wakari and /projects directories need a filesystem mounted with POSIX ACL (Access Control List) support (POSIX.1e). Check with mount and tune2fs -l /path/to/filesystem | grep options
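
To see which of these accounts already exist on a host, and whether a filesystem lists ACL support, something like the sketch below may help. It is an illustration only: account_exists and has_default_acl are hypothetical helper names, "wakari" is the default AEN_SRVC_ACCT value mentioned above, and the ACL helper only inspects tune2fs -l output, which applies to ext2/ext3/ext4 filesystems.

```shell
#!/bin/sh
# Succeed when an account with the given name is known to the system.
account_exists() {
    id -u "$1" >/dev/null 2>&1
}

# Succeed when `tune2fs -l` output, read from stdin, lists "acl" among
# the default mount options.
has_default_acl() {
    grep -i '^Default mount options:' | grep -qw acl
}

# Report on the accounts named in the list above (Red Hat naming).
for name in mongod elasticsearch nginx wakari; do
    if account_exists "$name"; then
        echo "$name: already present"
    else
        echo "$name: not present yet"
    fi
done

# ACL check on a real host (as root), using the same placeholder path
# as the text above:
#   tune2fs -l /path/to/filesystem | has_default_acl && echo "acl enabled"
```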

Additional software requirements

AEN Server

  • MongoDB version: >= 2.6.8 and < 3.0
  • NGINX version: >= 1.6.2
  • Elasticsearch version: >= 1.7.2
  • Oracle JRE 7 or 8
  • bzip2

AEN Gateway

No additional software prerequisites.

AEN Compute Node

  • git
  • bzip2
  • bash (Red Hat default) or zsh
  • X Window System

Note: If you don’t want to install the whole X Window System, you still need to install the following packages for R plotting support:

sudo yum install libXrender libXext libXdmcp libSM libICE libXt \
dejavu-sans-fonts dejavu-serif-fonts dejavu-fonts-common \
fontpackages-filesystem
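
The command-line prerequisites listed for the Compute Node (git, bzip2, and a shell) can be checked in one pass before installing. This is a small sketch, assuming the tools are expected on the PATH of the user running the check; missing_commands is an illustrative helper name.

```shell
#!/bin/sh
# Print, one per line, each named command that is not found on PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

missing=$(missing_commands git bzip2 bash)
if [ -z "$missing" ]; then
    echo "All Compute Node command-line prerequisites were found"
else
    echo "Missing prerequisites:"
    echo "$missing"
fi
```

On a Server host, the same helper can be called with bzip2 alone.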

Security requirements

  • Root or sudo access
  • SELinux in Permissive or Disabled mode

One way to change SELinux to permissive or disabled mode is to edit the /etc/sysconfig/selinux file, using root or sudo access, and set the SELINUX parameter to either permissive or disabled:

/etc/sysconfig/selinux

Change the SELINUX line shown below, then reboot for the change to take effect:

# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.

SELINUX=enforcing

# SELINUXTYPE= can take one of these two values:
#     targeted - Targeted processes are protected,
#     mls - Multi Level Security protection.

SELINUXTYPE=targeted

Verify changes with getenforce.
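
The edit described above can also be scripted. The sketch below is one possible approach, not an official procedure: set_selinux_mode is a hypothetical helper that rewrites only the SELINUX= line and writes the result back through the existing file, so a symbolic link at /etc/sysconfig/selinux is left in place. The demonstration runs on a scratch copy and does not touch the system configuration.

```shell
#!/bin/sh
# Rewrite the SELINUX= line of an SELinux config file to the given mode.
# The SELINUXTYPE= line is untouched because the pattern requires "="
# immediately after "SELINUX".
set_selinux_mode() {
    file=$1
    mode=$2
    case "$mode" in
        permissive|disabled) ;;
        *) echo "unsupported mode: $mode" >&2; return 1 ;;
    esac
    tmp=$(mktemp) || return 1
    if sed "s/^SELINUX=.*/SELINUX=$mode/" "$file" > "$tmp"; then
        # Write back through the original path rather than replacing it.
        cat "$tmp" > "$file"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}

# Demonstration on a scratch file so nothing on the system changes.
scratch=$(mktemp)
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$scratch"
set_selinux_mode "$scratch" permissive
cat "$scratch"
rm -f "$scratch"

# On a real host (as root), followed by a reboot and getenforce:
#   set_selinux_mode /etc/sysconfig/selinux permissive
```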

Network/TCP requirements

Note that all port numbers are configurable, but defaults are shown below.

Direction  Type  Default Port  Protocol       Optional  Configurable  Comments
Inbound    TCP   80            HTTP or HTTPS  No        Yes           Server
Inbound    TCP   8089          HTTP or HTTPS  No        Yes           Gateway
Inbound    TCP   5002          HTTP           No        Yes           Compute
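
Before installing, it can be useful to confirm that nothing else is already listening on the default ports (and, after installing, that the AEN services are). The following is a rough sketch under the assumption that the ss utility is available; port_in_listing is an illustrative helper that scans ss -ltn output for a local port.

```shell
#!/bin/sh
# Succeed when the given TCP port appears as a local listening socket in
# `ss -ltn` output read from stdin. The local address is the fourth
# column, written as ADDRESS:PORT, so the port is the text after the
# last colon.
port_in_listing() {
    awk -v port="$1" '
        NR > 1 { n = split($4, parts, ":"); if (parts[n] == port) found = 1 }
        END    { exit !found }
    '
}

# Default ports from the table above; adjust if you configure others.
for entry in Server:80 Gateway:8089 Compute:5002; do
    name=${entry%%:*}
    port=${entry##*:}
    if command -v ss >/dev/null 2>&1 && ss -ltn | port_in_listing "$port"; then
        echo "$name: a process is listening on port $port"
    else
        echo "$name: no listener detected on port $port"
    fi
done
```

This only inspects the local host. Whether a port is reachable from other machines also depends on firewall rules, which the sketch does not examine.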

Other requirements

Assuming the above requirements are met, there are no additional dependencies necessary for AEN.

Note: While not a requirement for running the software, these instructions use curl or wget to download packages used in the install process. You may use other appropriate means to put the needed files into the installation directory.

Install Steps

Carry out the procedures linked from the table below to perform a complete install of all Anaconda Enterprise Notebooks components.

The following optional install procedures may need to be performed, depending on how you set up your Data Center:

Additional post-install information: