ML Cloud User Guide

Last update: April 16, 2024

User Guide

This documentation explains various aspects of using the ML Cloud systems. It contains both introductory and advanced material. The documentation will be continuously updated.

  • All users: read the Good Conduct section. The ML Cloud is a shared resource and your actions can impact other users. (17/02/2023)

First-time users might want to start at the beginning and work through the first few chapters, skipping the sub-chapters that contain advanced material. This way, you will learn how to log in to the ML Cloud, familiarize yourself with the environment, find pre-installed software, and run your containers and experiments on the ML Cloud.

News and notifications

News about planned and unplanned downtime, changes in hardware, and important changes in software will be published on the Portal. For more information on the different situations, see below.

System status and activity

You can get a quick overview of the system utilization and status:

Maintenance and Downtime

The ML Cloud Team will schedule maintenance in one of the following three ways:

  1. Rolling reboots: Whenever possible, the ML Cloud Team will apply updates and perform other maintenance in a rolling fashion, so that ML Cloud services are affected as little as possible or not at all.
  2. Partial outages: The ML Cloud Team will do these as needed, but in a manner that impacts only some ML Cloud services at a time.
  3. Full outages: These outages affect ML Cloud services depending on the system involved, such as outages of core networking services or data storage services, or maintenance or outages of data center power or cooling systems.

In the case of a planned downtime, a reservation will be made in the queuing system, so new jobs that would not finish before the downtime begins will not start. A notification message will be shown on the Portal System Status Page, and a mailing list notification will be sent in advance. We apologize for any inconvenience this may cause.
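
The effect of such a reservation on your own jobs can be estimated with a few lines of Python. This is only a minimal sketch, not the queuing system's actual logic: the function names `fits_before_downtime` and `max_time_limit` and the dates used are hypothetical, and it assumes that a job can start only if its requested time limit ends no later than the start of the downtime.

```python
from datetime import datetime, timedelta


def fits_before_downtime(now, time_limit, downtime_start):
    """Return True if a job started at `now` with the requested
    `time_limit` would end no later than `downtime_start`."""
    return now + time_limit <= downtime_start


def max_time_limit(now, downtime_start):
    """Return the longest time limit that still lets a job start now.
    The result is zero if the downtime has already begun."""
    return max(downtime_start - now, timedelta(0))


# Hypothetical example values, chosen for illustration only.
now = datetime(2024, 4, 16, 9, 0)
downtime_start = datetime(2024, 4, 17, 8, 0)

print(fits_before_downtime(now, timedelta(hours=12), downtime_start))  # True
print(fits_before_downtime(now, timedelta(hours=48), downtime_start))  # False
print(max_time_limit(now, downtime_start))  # 23:00:00
```

Under this assumption, requesting a shorter time limit shortly before an announced downtime may allow a job to start that would otherwise wait until the maintenance is over.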

AI Conference Deadlines

The ML Cloud Team is aware of AI conference deadlines and will try to schedule regular maintenance around them. If we have missed a conference from the list, please let us know via this form.