Installation Instructions (AEN 4.0)

Anaconda Enterprise Notebooks (AEN) is a Python data analysis environment from Continuum Analytics. Accessed through a browser, Anaconda Enterprise Notebooks is a ready-to-use, powerful, fully-configured Python analytics environment. We believe that programmers, scientists, and analysts should spend their time analyzing data, not working to set up a system. Data should be shareable, and analysis should be repeatable. Reproducibility should extend beyond just code to include the runtime environment, configuration, and input data.

This installation guide walks through the steps needed to install a basic Anaconda Enterprise Notebooks system comprising the front-end Server, Gateway, and Compute machines.

If you have any questions about these instructions, please contact your sales representative or, if applicable, your Priority Support team.

System overview

The Anaconda Enterprise Notebooks (AEN) platform consists of three main service groups: AEN Server, AEN Gateway, and AEN Compute, below referred to simply as Server, Gateway, and Compute nodes respectively. These services can be run on a single machine or distributed across multiple servers.

  • Server: the entry point to Anaconda Enterprise Notebooks; it manages users and projects.
  • Gateway: a proxy service that handles URL and port mapping to ancillary services.
  • Compute Node: each Compute Node in the Anaconda Enterprise Notebooks system requires a Compute Launcher service to mediate access to the Server and the Gateway.


Organizationally, each Anaconda Enterprise Notebooks installation has exactly one Server instance. One or more Gateway instances can be configured, and each Compute Node can connect to only one Gateway. The collection of Compute Nodes served by a single Gateway is referred to as a Data Center. New Data Centers can be added to the AEN installation at any time.

For example, an Anaconda Enterprise Notebooks deployment with 2 Data Centers, where one Gateway serves a cluster of 20 physical computers and the second Gateway serves 30 virtual machines, would have the following complement of services installed and running:

1  AEN Server instance
2  AEN Gateway instances
50 AEN Compute instances (20 + 30)
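
The tally above follows from the topology rules: one Server per installation, one Gateway per Data Center, and one Compute instance per node. The following is a minimal Python sketch of that arithmetic. The `DataCenter` class, the `service_counts` function, and the Data Center names are illustrative assumptions, not part of AEN.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCenter:
    """Hypothetical model: one Gateway plus the Compute Nodes it serves."""
    name: str
    compute_nodes: int


def service_counts(data_centers):
    """Return how many instances of each service an installation runs."""
    return {
        "server": 1,                   # exactly one Server per installation
        "gateway": len(data_centers),  # one Gateway per Data Center
        "compute": sum(dc.compute_nodes for dc in data_centers),
    }


# The example deployment: 20 physical computers and 30 virtual machines.
example = [
    DataCenter("physical-cluster", 20),
    DataCenter("virtual-machines", 30),
]
counts = service_counts(example)
print(counts)
```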

Anaconda Enterprise Notebooks users interact with the system predominantly through Projects.

  • Project: a set of conda environments, Jupyter Notebooks, and other artefacts that can be accessed by a Team of users

Projects are associated with a single Data Center within the AEN environment. The Team of users includes one Owner, who is the user that created the Project.
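
The relationship between a Project, its Team, its Owner, and its Data Center can be sketched as a small data model. This is an illustration only: the `Project` class, its field names, and the sample users are assumptions made for the example and do not reflect AEN's internal schema.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    """Hypothetical model of a Project; all names here are illustrative."""
    name: str
    owner: str        # the user who created the Project
    data_center: str  # a Project belongs to exactly one Data Center
    team: set = field(default_factory=set)

    def __post_init__(self):
        # The Owner is always a member of the Team.
        self.team.add(self.owner)

    def add_member(self, user):
        """Add a user to the Team."""
        self.team.add(user)

    def can_access(self, user):
        """Only Team members can access the Project."""
        return user in self.team


project = Project(name="sales-forecast", owner="alice", data_center="dc-east")
project.add_member("bob")
```

Here `alice` and `bob` can access the Project, while any user outside the Team cannot.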

Since Anaconda Enterprise Notebooks is web-based, it uses the standard HTTP port 80 or HTTPS port 443 on the Server.

Installers

The Anaconda Enterprise Notebooks installers are available only to paid customers. If you are interested in a demonstration of Anaconda Enterprise Notebooks, please contact us.

Components

Server

The Server component is responsible for login, accounts, administration, project creation and management, and interfacing with the database. The Server is the main entry point for all users. It handles setting up projects and ensuring that users are sent to the correct Data Center for a given Project.

Anaconda Enterprise Notebooks uses MongoDB for its internal data persistence. MongoDB typically runs on the same host as the Server but can also be deployed on a separate host.

The Server uses nginx to handle the user-facing web interface. nginx acts as a request proxy: the actual Server web process runs on a high-numbered port, listening only on localhost, and nginx forwards requests there. nginx also serves static content.
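
The proxy arrangement described above can be demonstrated with a small, self-contained Python sketch. It does not use nginx or any AEN code: a standard-library HTTP server stands in for the Server web process bound to localhost, and a second one stands in for nginx by forwarding each GET request to it. The handler names, the `/login` path, and the use of OS-assigned ports are assumptions made for the example.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Opener that ignores any proxy settings in the environment.
OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def send_text(handler, body):
    """Write a 200 plain-text response with the given bytes."""
    handler.send_response(200)
    handler.send_header("Content-Type", "text/plain")
    handler.send_header("Content-Length", str(len(body)))
    handler.end_headers()
    handler.wfile.write(body)


class BackendHandler(BaseHTTPRequestHandler):
    """Stands in for the Server web process listening on localhost."""

    def do_GET(self):
        send_text(self, f"backend saw {self.path}".encode())

    def log_message(self, *args):  # silence per-request logging
        pass


def make_forwarding_handler(backend_port):
    class ForwardingHandler(BaseHTTPRequestHandler):
        """Stands in for nginx: forwards each GET to the backend."""

        def do_GET(self):
            url = f"http://127.0.0.1:{backend_port}{self.path}"
            with OPENER.open(url) as upstream:
                send_text(self, upstream.read())

        def log_message(self, *args):
            pass

    return ForwardingHandler


def start(server):
    """Serve in a background thread and return the server."""
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


# Port 0 asks the OS for a free port; a real deployment uses fixed ports.
backend = start(HTTPServer(("127.0.0.1", 0), BackendHandler))
front = start(HTTPServer(("127.0.0.1", 0),
                         make_forwarding_handler(backend.server_port)))

with OPENER.open(f"http://127.0.0.1:{front.server_port}/login") as resp:
    reply = resp.read().decode()
print(reply)

for server in (front, backend):
    server.shutdown()
    server.server_close()
```

The client only ever talks to the front-facing server, yet the response body is produced by the backend, which is the same division of labor that nginx and the Server web process have.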

Gateway

The Gateway provides a single access point to a set of Compute Nodes. It acts as a proxy service that manages authorization and maps URLs and ports to the services running on Compute Nodes, giving the user a uniform interface. The Gateway may also be referred to as a Data Center because it serves as the proxy for the collection of Compute Nodes.
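
To make the idea of URL and port mapping concrete, here is a minimal Python sketch of a routing function that checks authorization and then maps a request path to a service on a Compute Node. Everything in it is hypothetical: the `ROUTES` table, the `/project/app/...` path layout, the host names, and the port numbers are assumptions chosen for illustration and do not describe the Gateway's actual URL scheme or implementation.

```python
# Hypothetical routing table: (project, app) -> (Compute Node host, port).
ROUTES = {
    ("sales-forecast", "notebook"): ("compute-01.internal", 8888),
    ("sales-forecast", "terminal"): ("compute-01.internal", 7681),
    ("churn-model", "notebook"): ("compute-02.internal", 8889),
}


def resolve(path, authorized_projects):
    """Map a path such as '/sales-forecast/notebook/tree' to an upstream URL.

    Raises PermissionError if the user may not access the project and
    LookupError if the path names no known service.
    """
    parts = path.strip("/").split("/", 2)
    if len(parts) < 2:
        raise LookupError(f"no service in path: {path!r}")
    project, app = parts[0], parts[1]
    rest = parts[2] if len(parts) > 2 else ""

    # Authorization is checked before any routing information is used.
    if project not in authorized_projects:
        raise PermissionError(f"not authorized for project {project!r}")

    try:
        host, port = ROUTES[(project, app)]
    except KeyError:
        raise LookupError(f"no route for {project}/{app}") from None
    return f"http://{host}:{port}/{rest}"


upstream = resolve("/sales-forecast/notebook/tree", {"sales-forecast"})
print(upstream)
```

Because the user only ever sees the path on the Gateway, the Compute Node host and port behind it can change without affecting the URL the user works with.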

Compute node

Compute Nodes are where Apps (such as Jupyter Notebook and Workbench) and all other user-visible programs actually run. They are also the hosts that a user sees in a terminal session or when accessing a node over SSH. Each Project is associated with one or more Compute Nodes, and these in turn are part of a single Data Center.

Distributed install

In a distributed install the Server and Gateway run on separate hosts.