Ferranti Storage

File System Access Points

| File System | Quota | Key Features |
| --- | --- | --- |
| /weka/group/user | GROUP Quota 0.5TB | NVMe based. Not backed up. |
| /scratch_local | none | Local to the compute node. Data in $SCRATCH is not shared across nodes. Purged regularly. |

Available Datasets

Some commonly used datasets have been provided for the users:

| Dataset Name | Location |
| --- | --- |
| ImageNet-ffcv | /weka/datasets/ImageNet-ffcv |
| CLEVR_v1.0 | /weka/datasets/CLEVR_v1.0 |
| coco | /weka/datasets/coco |
| Falcor3D_down128 | /weka/datasets/Falcor3D_down128 |
| ffcv_imagenet_data | /weka/datasets/ffcv_imagenet_data |
| imagenet-styletransfer | /mnt/qb/datasets/imagenet-styletransfer |
| kitti | /weka/datasets/kitti |
| laion400m | /weka/datasets/laion400m |
| laion_aesthetics | /weka/datasets/laion_aesthetics |
| ModelNet40 | /weka/datasets/ModelNet40 |
| NMR_Dataset | /weka/datasets/NMR_Dataset |
| stl10_binary | /weka/datasets/stl10_binary |
| WeatherBench | /mnt/qb/datasets/WeatherBench |
| PUG Dataset | /weka/datasets/PUG |
| C4 (en, noclean) | /weka/datasets/c4 |
| synthclip | /weka/datasets/SynthCLIP |
| gobjaverse (tar version) | /weka/datasets/gobjaverse |
| mlcommons | /weka/datasets/mlcommons |
| objaverse | /weka/datasets/objaverse |
| ImageNet-C | /weka/datasets/ImageNet-C |
| Imagenet2012 | /weka/datasets/ImageNet2012 |
| Imagenet-r | /weka/datasets/imagenet-r |

Do You Want A New Dataset?

If you would like an additional dataset installed for general use, please use the following form and/or contact us through the ticketing system.

Ferranti Ceph

For each group in MLCloud we have created a Ceph S3 bucket named {{group}}0. For example, the bucket for the group mladm is mladm0.

Each user may only put data under their own prefix in the group's bucket. For example, user mfa624, who belongs to group mladm, can only upload data to mladm0/mfa624/.

Each bucket has a quota of 10TB.
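
The naming rule above can be sketched as a small shell helper. The function name bucket_prefix is hypothetical and not part of the cluster tooling; it only joins the group and user names in the way described.

```shell
# Hypothetical helper: build the upload prefix for a user from the
# {{group}}0 naming rule described above.
# Usage: bucket_prefix <group> <user>
bucket_prefix() {
    printf 's3://%s0/%s/\n' "$1" "$2"
}

bucket_prefix mladm mfa624   # prints s3://mladm0/mfa624/
```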

How to access Ferranti Ceph S3 Buckets:

1. Generating access credentials for Ferranti Ceph S3 buckets

This section describes how to generate Ceph S3 access credentials for the Ferranti cluster.

On ferranti-login001 or ferranti-login002, first generate a token by running the following script. It asks for your LDAP username and LDAP password (the credentials you use to access Nextcloud). If you have forgotten your password, you can reset it on Nextcloud.

/usr/share/custom-scripts/generate_token.sh

NOTE: While you enter your password, no characters appear on screen.

The output of this script is your Access Key, which you will need in the next step. It is a string like this (a random example, not a valid key):

s9uk2E00L8Y6O741jDV9d61h73092LPk498Miq8AA1n2nX7142z24u1D376tk4346734d63Qvf36U23n50891w60P818ze98Du5116b38E94Z00M3rM1u5mpO0z0j64PC4aD5EOb87vgQTGb1v801181G3IeY2GM286r34s09349125Sjn3x85a=

NOTE: DO NOT SHARE THIS TOKEN WITH ANYONE, BECAUSE IT CONTAINS CLUSTER LOGIN INFORMATION.
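
Because the token must stay private, one option is to keep it in a file that only you can read. The following is a minimal sketch, not a cluster-provided tool: the function name save_token and the example path are assumptions.

```shell
# Hypothetical helper: read a token from standard input and store it in a
# file that only the owner can read or write.
# Usage: save_token <file>   (paste the token, then press Ctrl-D)
save_token() {
    # umask applies inside the subshell only; chmod also covers the case
    # where the file already existed with wider permissions.
    ( umask 077; cat > "$1" ) && chmod 600 "$1"
}

# Example (assumed location, any private path works):
# save_token "$HOME/.ferranti_s3_token"
```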

How to Interact with Ceph buckets:

There are multiple CLI tools for interacting with S3 buckets. Two of the most popular are s3cmd and awscli.

How to use s3cmd

  • Generate the ~/.s3cfg configuration file:

Run s3cmd --configure and answer the prompts as follows:

Enter new values or accept defaults in brackets with Enter.

Refer to user manual for detailed description of all options.

Access key and Secret key are your identifiers for Amazon S3. Leave them empty for using the env variables.

Access Key: the token you generated in the previous step.

Secret Key: Enter the word secret

Default Region [US]: Press Enter

S3 Endpoint [s3.amazonaws.com]: ferranti-s3.mlcloud.uni-tuebingen.de

DNS-style bucket+hostname:port template for accessing a bucket [%(bucket)s.s3.amazonaws.com]: ferranti-s3.mlcloud.uni-tuebingen.de

Encryption password is used to protect your files from reading by unauthorized persons while in transfer to S3

Encryption password: Press Enter

Path to GPG program [/usr/bin/gpg]: Press Enter

Use HTTPS protocol [Yes]: Press Enter

HTTP Proxy server name: Press Enter

Test access with supplied credentials?: y

Save settings?: y
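
As an alternative to the interactive prompts, the same answers can be written to a configuration file directly. This is a sketch under assumptions: write_s3cfg is a hypothetical helper, and it sets only the options that correspond to the prompts above (access_key, secret_key, host_base, host_bucket, use_https), leaving every other s3cmd setting at its default.

```shell
# Hypothetical helper: write a minimal s3cmd configuration file.
# Usage: write_s3cfg <token> <output-file>
write_s3cfg() {
    (
        umask 077   # the file contains the token, so keep it private
        cat > "$2" <<EOF
[default]
access_key = $1
secret_key = secret
host_base = ferranti-s3.mlcloud.uni-tuebingen.de
host_bucket = ferranti-s3.mlcloud.uni-tuebingen.de
use_https = True
EOF
    )
}

# Example: write_s3cfg "<your-token>" "$HOME/.s3cfg"
```

Running s3cmd ls afterwards is a quick way to check whether the file is accepted.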

  • Verify that your access works

First, try the following command:

s3cmd ls

If this command does not output an error, your configuration is valid.

The following is a list of useful commands for interacting with S3 buckets.

s3cmd Examples:

  • List a bucket's contents:

s3cmd ls s3://<bucket>

  • Upload a file to a bucket:

s3cmd put /weka/group/user/local/file/path/filename.txt s3://<bucket>/<user>/

  • Download a file from a bucket to Ferranti Weka storage:

s3cmd get s3://<bucket>/<user>/<file-name> /weka/group/user/local/file/path/
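
For whole directories, s3cmd sync can replace repeated put calls. The sketch below wraps it in a hypothetical helper named upload_dir; the argument order and the choice to keep the directory name as a sub-prefix are assumptions made for illustration.

```shell
# Hypothetical helper: upload a whole directory to your prefix in the bucket.
# Usage: upload_dir <local-dir> <bucket> <user>
upload_dir() {
    # The trailing slash on the source uploads the directory's contents;
    # the directory name is kept as a sub-prefix on the remote side.
    s3cmd sync "${1%/}/" "s3://$2/$3/$(basename "$1")/"
}

# Example: upload_dir /weka/group/user/results mladm0 mfa624
```

Adding --dry-run to the s3cmd call shows what would be transferred without uploading anything.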

How to use awscli

  • Create the aws config and credentials files by running:

aws --profile <profile-name> configure

You can give the profile any name, for example your LDAP username.

For example, the LDAP user mfa624 would create a profile named mfa624.

Run the command and answer the prompts as follows:

aws --profile mfa624 configure
AWS Access Key ID [None]: the token you generated in the previous step
AWS Secret Access Key [None]: "" (type exactly two double-quote characters)
Default region name [None]: us-east-1
Default output format [None]: json
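
The same profile can probably be created without prompts using aws configure set. This is a sketch, not a tested recipe for this cluster: configure_profile is a hypothetical helper, and it assumes that storing the literal two double quotes as the secret key behaves the same as typing them at the prompt.

```shell
# Hypothetical helper: create an AWS CLI profile without interactive prompts.
# Usage: configure_profile <profile-name> <token>
configure_profile() {
    aws configure set aws_access_key_id "$2" --profile "$1"
    # Two double-quote characters, matching the interactive answer above.
    aws configure set aws_secret_access_key '""' --profile "$1"
    aws configure set region us-east-1 --profile "$1"
    aws configure set output json --profile "$1"
}

# Example: configure_profile mfa624 "<your-token>"
```

Note that a token passed on the command line can end up in your shell history, so the interactive prompt may be the safer choice.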

  • Verify that your access works

aws --profile mfa624 --endpoint-url https://ferranti-s3.mlcloud.uni-tuebingen.de s3 ls

If this command does not output an error, your configuration is valid. Replace mfa624 with your profile name.

aws Examples:

  • List a bucket's contents:

aws --profile <profile-name> --endpoint-url https://ferranti-s3.mlcloud.uni-tuebingen.de s3 ls s3://<bucket>

  • Upload a file to a bucket:

aws --profile <profile-name> --endpoint-url https://ferranti-s3.mlcloud.uni-tuebingen.de s3 cp /weka/group/user/local/file/path/filename.txt s3://<bucket>/<user>/

  • Download a file from a bucket to Ferranti Weka storage:

aws --profile <profile-name> --endpoint-url https://ferranti-s3.mlcloud.uni-tuebingen.de s3 cp s3://<bucket>/<user>/<file-name> /weka/group/user/local/file/path/
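
Since every aws command above repeats the same profile and endpoint options, a small wrapper function can shorten them. This is a sketch: the function name ferranti_aws and the variable FERRANTI_PROFILE are hypothetical names chosen for illustration.

```shell
# Hypothetical wrapper: run the AWS CLI against the Ferranti endpoint.
# FERRANTI_PROFILE is an assumed variable name; set it to your profile name.
ferranti_aws() {
    aws --profile "${FERRANTI_PROFILE:?set FERRANTI_PROFILE first}" \
        --endpoint-url https://ferranti-s3.mlcloud.uni-tuebingen.de "$@"
}

# Example:
# FERRANTI_PROFILE=mfa624
# ferranti_aws s3 ls s3://mladm0
```

The function could be placed in your shell startup file so that it is available in every session.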


Last update: July 16, 2025
Created: June 21, 2024