Cloudera Manager Parcels (AER 2.25)

Anaconda Repository integrates with Cloudera Manager so that you can distribute your Anaconda data science artifacts to your Hadoop cluster. You can create custom parcels containing the packages you choose, including your own packages.

NOTE: Creating custom parcels requires a local mirror of the Anaconda packages. Anaconda Repository will not connect to https://repo.anaconda.com to fetch packages that are not available locally. See the mirroring documentation.

To create a custom parcel, navigate to /<username>/installers. You can find this link in the dropdown menu, or in the Installers card on your homepage.

Select the “Create new installer” button to open the package selection form. Choose which accounts to fetch packages from; the anaconda user is added by default. After you select an account, enter package names in the package field.

When creating a parcel, Anaconda Repository generates a file named construct.yaml, which can be used with conda constructor, and a 64-bit Linux installer that includes the specified packages. To create just the installer script, click Create installer; to create a parcel, click Create parcel.

NOTE: By default, conda is not included in the custom parcel. To add packages to your environment, use the Anaconda Repository interface. To see the list of packages included in your custom parcel, check /opt/cloudera/parcels/<PARCEL_NAME>/meta/parcel.json.
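As a rough illustration of inspecting that metadata file, the following Python sketch lists the packages recorded in a parcel.json. It assumes the file carries a top-level "packages" list whose entries have "name" and "version" fields, as in the Cloudera parcel metadata format; verify this against your own parcel. The function names and the sample data are hypothetical.

```python
import json
from pathlib import Path


def list_parcel_packages(parcel_json_text):
    """Return sorted (name, version) pairs from the text of a parcel.json file."""
    metadata = json.loads(parcel_json_text)
    # Assumption: packages sit under a top-level "packages" key, and each
    # entry is an object with "name" and "version" fields.
    packages = metadata.get("packages", [])
    return sorted((pkg.get("name", ""), pkg.get("version", "")) for pkg in packages)


def read_parcel_packages(parcel_name, parcels_root="/opt/cloudera/parcels"):
    """Read <parcels_root>/<parcel_name>/meta/parcel.json and list its packages."""
    path = Path(parcels_root) / parcel_name / "meta" / "parcel.json"
    return list_parcel_packages(path.read_text(encoding="utf-8"))


# Hypothetical sample metadata, used here so the sketch runs without a cluster.
sample = json.dumps({
    "name": "my_parcel",
    "packages": [
        {"name": "pandas", "version": "0.19.2"},
        {"name": "numpy", "version": "1.11.3"},
    ],
})

for name, version in list_parcel_packages(sample):
    print(name, version)
```

On a cluster node where the parcel is activated, read_parcel_packages("<PARCEL_NAME>") would read the file from the default location; pass parcels_root if your parcels are deployed elsewhere.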

NOTE: The parcel is generated with the prefix of /opt/cloudera/parcels/<PARCEL_NAME>. This is the default location where activated parcels are loaded. If you are deploying parcels in a different directory, you can change this prefix with the PARCELS_ROOT configuration setting.

Once you have created a custom parcel, you can distribute it to your cluster by adding http://<repository ip>:<port>/<username>/installers/parcels/ as a Remote Parcel Repository URL. Cloudera Manager will detect the parcels hosted on Anaconda Repository, and provide the option to Download and Distribute the parcels.
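The URL pattern above can be assembled programmatically, for example when scripting cluster setup. This is a minimal sketch only: the helper name, the example address 192.0.2.10, the port 8080, and the username alice are all hypothetical placeholders, not values from your installation.

```python
def parcel_repository_url(host, port, username, scheme="http"):
    """Build the Remote Parcel Repository URL for one Anaconda Repository user."""
    if not host or not username:
        raise ValueError("host and username must be non-empty")
    # Pattern from the documentation:
    # http://<repository ip>:<port>/<username>/installers/parcels/
    return f"{scheme}://{host}:{int(port)}/{username}/installers/parcels/"


# Hypothetical values for illustration.
url = parcel_repository_url("192.0.2.10", 8080, "alice")
print(url)
```

The resulting string is what you would paste into the Remote Parcel Repository URLs setting in Cloudera Manager.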

By default, Anaconda Repository generates a parcel file for every compatible distribution. You can customize which parcel distributions are created by configuring the PARCEL_DISTRO_SUFFIXES configuration setting.
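To picture the effect of limiting the distribution suffixes, the sketch below lists the parcel filenames you might expect for a given set of suffixes. It assumes the generated files follow the Cloudera parcel naming convention NAME-VERSION-DISTRO.parcel; the helper name, the parcel name and version, and the suffix values el6 and el7 are illustrative assumptions, and the exact syntax of the PARCEL_DISTRO_SUFFIXES setting is not shown here.

```python
def parcel_filenames(name, version, distro_suffixes):
    """Return one expected parcel filename per distribution suffix."""
    # Assumption: files are named <name>-<version>-<distro suffix>.parcel,
    # following the Cloudera parcel naming convention.
    return [f"{name}-{version}-{suffix}.parcel" for suffix in distro_suffixes]


# Hypothetical parcel name, version, and suffix list.
for filename in parcel_filenames("my_parcel", "1.0", ["el6", "el7"]):
    print(filename)
```

Restricting the suffix list to the distributions actually running on your cluster reduces the number of parcel files generated.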