Anaconda Scale

Distributed Computing

Note

This product is discontinued. This archived copy of the product documentation is provided for those customers who are still using it.

Anaconda Scale deploys Anaconda® packages and a distributed computation framework across a cluster. It helps you manage multiple conda environments and packages, including Python and R, on cluster nodes.

Anaconda Scale provides different options for deploying Anaconda on a cluster, including:

  • Centrally managed installation of Anaconda, including multiple environments such as Python and R
  • Anaconda parcel for Cloudera CDH, including custom-generated parcels
  • Deployment of conda packages and environments with Spark jobs
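
As an illustration of the last option, the sketch below builds a spark-submit command that ships a packed conda environment with a job on YARN. It is a minimal sketch under stated assumptions, not Anaconda Scale's own mechanism: the helper build_spark_submit, the archive name analytics_env.zip, the alias conda_env, and the script job.py are all hypothetical, and the archive is assumed to contain bin/python at its root. The --archives flag and the PYSPARK_PYTHON settings are standard Spark-on-YARN options.

```python
import shlex


def build_spark_submit(script, env_archive, env_alias="conda_env"):
    """Return a spark-submit argument list that ships a conda environment.

    env_archive is a packed conda environment (hypothetical file name);
    env_alias is the directory name YARN unpacks it under on each node.
    """
    # Assumes the archive root contains bin/python.
    python_path = f"./{env_alias}/bin/python"
    return [
        "spark-submit",
        "--master", "yarn",
        "--deploy-mode", "cluster",
        # Distribute the archive and unpack it as ./<env_alias> on each node.
        "--archives", f"{env_archive}#{env_alias}",
        # Point the driver and the executors at the shipped interpreter.
        "--conf", f"spark.yarn.appMasterEnv.PYSPARK_PYTHON={python_path}",
        "--conf", f"spark.executorEnv.PYSPARK_PYTHON={python_path}",
        script,
    ]


cmd = build_spark_submit("job.py", "analytics_env.zip")
print(shlex.join(cmd))  # shell-quoted command line, for inspection only
```

The function only assembles the arguments; running the command requires a working Spark installation and a YARN cluster.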

Features

  • Easily install Anaconda, including Anaconda Accelerate, across multiple cluster nodes
  • Provision distributed compute services with Dask
  • Perform interactive, distributed computations with single-user Jupyter Notebook
  • Easily launch and configure a cloud-based cluster on Amazon EC2

Compatibility

Anaconda Scale can be used with distributed computation frameworks such as Spark or Dask and works alongside enterprise Hadoop distributions such as Cloudera CDH or Hortonworks HDP. Anaconda Scale has been tested with the following Hadoop distributions and Spark versions:

  • Cloudera CDH 5.3.x through 5.11.x
  • Hortonworks HDP 2.2.x through 2.6.x (with Apache Ambari 2.2.x and 2.4.x)
  • Spark 1.3.x through 2.0.x
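
The tested ranges above can be checked against a cluster's reported versions with a short script. This is a minimal sketch: the names TESTED_RANGES and is_tested are hypothetical, version strings are assumed to start with "major.minor", and each range is assumed to cover every minor release between its endpoints. A version outside these ranges is untested, not necessarily incompatible.

```python
# Tested (major, minor) ranges, copied from the compatibility list above.
TESTED_RANGES = {
    "cdh": ((5, 3), (5, 11)),
    "hdp": ((2, 2), (2, 6)),
    "spark": ((1, 3), (2, 0)),
}


def is_tested(product, version):
    """Return True if version (e.g. "5.11.2") falls in the tested range."""
    low, high = TESTED_RANGES[product.lower()]
    # Compare only the major and minor components; patch levels are ignored.
    major_minor = tuple(int(part) for part in version.split(".")[:2])
    return low <= major_minor <= high


for product, version in [("cdh", "5.11.2"), ("spark", "2.1.0")]:
    status = "tested" if is_tested(product, version) else "not tested"
    print(f"{product} {version}: {status}")
```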

License

Anaconda Scale is available with Anaconda Enterprise. If you would like to use Anaconda Scale on a bare-metal, on-premises, or cloud-based cluster, please contact us.