Warning: work in progress.

In this blog post you will learn how easy it is to set up object storage (comparable to AWS S3) using the Rook Ceph operator in Kubernetes. The Rook object storage quickstart (https://rook.io/docs/rook/v1.9/ceph-object.html) is a good resource, but I experienced a few bumps in the road, and I learned how easy it is to expose the storage with a user-friendly web interface.

Requirements

The other day we needed a storage solution with the following requirements:

  • Store large files (packages of sorts) in a folder structure
  • Fast and reliable
  • Easy to use: a web interface to navigate the storage and download files
  • Remotely accessible uploads, easy to use in automation
  • Fine-grained authentication (for writes) if possible

The tools at hand

At the office we have an OpenShift cluster (the open source OKD version) with lightning-fast NVMe SSD disks.

The Rook Ceph operator is used to create a Ceph cluster with 3 OSDs (roughly: disks) spread across failure domains (hosts).
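For reference, a minimal sketch of such a Rook CephCluster resource could look like the following. The names, Ceph image version, and the "use all nodes/devices" storage selection are assumptions for illustration; our actual cluster definition differs in the details:

```yaml
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    image: quay.io/ceph/ceph:v16.2.9
  dataDirHostPath: /var/lib/rook
  mon:
    # three monitors, each on a different node
    count: 3
    allowMultiplePerNode: false
  storage:
    # let Rook create an OSD on every eligible disk it finds
    useAllNodes: true
    useAllDevices: true
```

With one eligible disk per host, this yields the 3 OSDs and host-level failure domains described above.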

Some options

A ‘traditional’ approach would be a storage server running some RAID-configured storage. A web server such as Apache httpd could expose a folder on the machine, and protocols such as NFS or SSH could be alternative ways to access the storage with authentication and offer write access.

My first idea was to leverage the Ceph File System and deploy a pod with a web server and the mounted filesystem. Ceph also supports exporting this storage over NFS; since the Ceph storage itself is not exposed outside the cluster, NFS would be the easiest way to get write access from outside Kubernetes.

So most requirements are covered, but I prefer to avoid NFS when possible, and the Ceph NFS implementation might be a bit …, well … ‘young’. It also seems like a bunch of work, and who likes that, right?

Another option is object storage, and after stumbling on this document, the road was clear: https://docs.ceph.com/en/latest/cephfs/app-best-practices/#do-you-need-a-file-system

Building object storage

So let’s assume we have a Kubernetes cluster with a Ceph cluster deployed via Rook, with at least 3 OSDs and failure domains. To satisfy the reliability requirement we want 3 replicas of our data, so 3 failure domains (= hosts) is the minimum.
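The object store itself is a CephObjectStore resource. A sketch along the lines of the quickstart, with 3 replicas and host-level failure domains as discussed above (the store name `my-store` is an assumption), could look like this:

```yaml
apiVersion: ceph.rook.io/v1
kind: CephObjectStore
metadata:
  name: my-store
  namespace: rook-ceph
spec:
  metadataPool:
    # each piece of metadata on 3 different hosts
    failureDomain: host
    replicated:
      size: 3
  dataPool:
    # each object replicated on 3 different hosts
    failureDomain: host
    replicated:
      size: 3
  gateway:
    # the RADOS gateway (RGW) pods serving the S3 API
    port: 80
    instances: 1
```

Rook then deploys the RGW pods that expose the S3-compatible API in the cluster.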

I followed the Rook quickstart, customizing the YAML (small things like names) where appropriate, and created a StorageClass as described.
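That StorageClass is what lets applications request buckets via an ObjectBucketClaim. A sketch following the quickstart, assuming the object store is named `my-store` and lives in the `rook-ceph` namespace:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-bucket
# Rook's bucket provisioner handles ObjectBucketClaims
provisioner: rook-ceph.ceph.rook.io/bucket
reclaimPolicy: Delete
parameters:
  objectStoreName: my-store
  objectStoreNamespace: rook-ceph
```

An ObjectBucketClaim referencing this StorageClass gets a bucket plus a Secret and ConfigMap with the S3 credentials and endpoint.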

Expose web interface with ‘folder’ structure

How to sync data