The term ‘cloud’ has been much more than just a buzzword for a long time now. The cloud, or rather the clouds, have become an essential part of our daily business. But what if – even though the wheat has long since been separated from the chaff in terms of IT expertise – I still don't know much about the cloud? Simply fly blind into the big, wide world of public clouds? Knock something together in a web frontend with a few clicks and see how it feels? ‘Cloudformation’? ‘DevOps’? Hang on… What?! To be able to test properly, I also need to dump my data into some bucket and get busy with it…? Well, everyone knows that practice makes perfect – so let’s go for it!
Well, we find that approach a little reckless. But what if there were a safe sandpit – so safe that I could play with the cloud almost totally blind, to build my experience and know-how? Sounds good, eh? Well, that's just what we thought too, and that's why we built precisely this fully containerised sandpit: the ‘Open Cloud Platform’ or ‘OCP’ for short.
But who are ‘we’ exactly? We’re a virtual team from the Cloud & Infrastructure Services Division at Otto Group IT. Our backgrounds span virtualization, storage, networking and DevOps.
We’ve operated the VMware private cloud environment ‘SPICE’ successfully for many years. Our customers include internal teams as well as Group companies that acquire ‘Managed Services’ from us, usually in the form of Virtual Machines.
With the shift towards the public cloud, calls for an agile, scalable private cloud environment within the Group were also getting louder. We quickly realised there was an urgent need for action here in order to stay on the cutting edge and offer our customers the best possible service. And so the Open Cloud Platform was born.
OCP is a private cloud within the Otto Group campus networks that’s closely based on hyperscaler functionalities. The users themselves hold the reins here. They can act with complete freedom within their private cloud area, try ideas out, build them, tear bits off, break them down – and start all over again.
Of course, OCP is also ready for production use and already hosts the first production environments such as Th!ngs. Th!ngs is an application that bundles many microservices – to cross-charge customers for Otto Group IT services via SAP, for instance. Public cloud accounts that Otto Group IT provides to its customers have also been cross-charged in Th!ngs since 2022.
Highly complex platforms with their own (virtual) networks, routers, DNS services, security groups, load balancers, containers or compute instances – all these cloud resources can be consumed in OCP via the web UI or the REST API. Just a moment… REST API, what was that? Exactly: tools such as Terraform talk to the REST API to set up the required infrastructure fully automatically, an approach known as Infrastructure as Code (IaC). We have created a private cloud in which the possibilities are practically unlimited. It also comes with one small but important difference: it’s impossible to accidentally upload something to the Internet that really doesn’t belong there. Everything runs on pretty hefty cloud nodes in our data centres on the OTTO Campus and at an externally operated co-location site in the Greater Hamburg area.
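To make the REST API idea a little more tangible, here is a minimal Python sketch of the kind of calls an IaC tool ends up making against an OpenStack-style cloud: one request to Keystone for a token and one to Neutron to create a network. The host name, ports, user and project names are placeholders we made up for illustration, not OCP's real endpoints, and the requests are only built, never sent.

```python
import json
from urllib.request import Request

# Hypothetical endpoints for illustration only; a real cloud publishes
# its own URLs in the Keystone service catalog.
KEYSTONE_URL = "https://ocp.example.internal:5000/v3"
NEUTRON_URL = "https://ocp.example.internal:9696/v2.0"


def token_request(user, password, project, domain="Default"):
    """Build (but do not send) a Keystone v3 password-auth request."""
    body = {
        "auth": {
            "identity": {
                "methods": ["password"],
                "password": {
                    "user": {
                        "name": user,
                        "domain": {"name": domain},
                        "password": password,
                    }
                },
            },
            "scope": {"project": {"name": project, "domain": {"name": domain}}},
        }
    }
    return Request(
        f"{KEYSTONE_URL}/auth/tokens",
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_network_request(token, name):
    """Build (but do not send) a Neutron request that creates a network."""
    body = {"network": {"name": name, "admin_state_up": True}}
    return Request(
        f"{NEUTRON_URL}/networks",
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json", "X-Auth-Token": token},
        method="POST",
    )


if __name__ == "__main__":
    req = create_network_request("example-token", "sandbox-net")
    print(req.get_method(), req.full_url)
    print(req.data.decode())
```

In a real session, Keystone returns the token in the X-Subject-Token response header, and that value is what goes into X-Auth-Token on every later call. Tools such as Terraform hide exactly this plumbing behind a declarative description of the desired infrastructure.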
"Our entire tech stack is a colourful mix of 100% open source software, something we’re more than proud of."
Automation starts as soon as a tech has bolted the physical server into the rack and switched it on. This still relatively cold chunk of hardware blips a DHCP request into the network and receives the necessary boot parameters via PXE – we use MaaS for this. The server is kitted out fully automatically with an OS; the basic network configuration as well as hard disk partitioning etc. are also carried out here. After certain tags are assigned within MaaS that determine the role of the node (e.g. compute node), the machine is in ‘ready to deploy’ state, meaning the server is ready to become a cloud node. Yay!
Our cloud is based on the ‘OpenStack’ cloud framework, which has been tried and tested over many years. Anyone who knows OpenStack knows that installing it is not (normally) a case of ‘step 1, step 2, step 3 – done’. For us it already was that simple, but we’ve gone one step further and done away with ‘step 1, step 2, step 3 – done’ completely.
Deep in our automation engine room is a GitLab pipeline that scans for nodes with precisely this status via the MaaS API. As soon as it identifies such a node, another well-known friend enters the race – Ansible. Some people might now think, "OK, they take Ansible and install the OpenStack packages." Nope! We run OpenStack 100% containerized in Docker, including the hypervisor and the SDN (software-defined networking). For us, Ansible is really just a slightly better Docker Compose: it rolls out the required containers, selected based on the tags from MaaS, to the node and takes care of configuring the various OpenStack components such as Nova, Neutron and Keystone.
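The selection logic described here can be sketched in a few lines of Python. This is an illustration under assumptions, not our pipeline code: the machine records mimic the shape of a MaaS machine listing (hostname, status_name, tag_names), the sample data is invented, and the tag-to-container mapping in ROLE_CONTAINERS is a hypothetical example.

```python
# Hypothetical mapping from MaaS role tags to the containers a node
# with that role should run. The names are illustrative only.
ROLE_CONTAINERS = {
    "compute": ["nova_compute", "nova_libvirt", "neutron_openvswitch_agent"],
    "control": ["keystone", "nova_api", "neutron_server"],
}


def select_ready_nodes(machines):
    """Keep only machines that are waiting in the 'Ready' state."""
    return [m for m in machines if m.get("status_name") == "Ready"]


def containers_for(machine):
    """Collect the containers for all role tags of one machine, without duplicates."""
    containers = []
    for tag in machine.get("tag_names", []):
        for name in ROLE_CONTAINERS.get(tag, []):
            if name not in containers:
                containers.append(name)
    return containers


def rollout_plan(machines):
    """Map each ready node's hostname to the containers it should receive."""
    return {m["hostname"]: containers_for(m) for m in select_ready_nodes(machines)}


if __name__ == "__main__":
    # Invented sample data standing in for an API response.
    sample = [
        {"hostname": "node-a", "status_name": "Ready", "tag_names": ["compute"]},
        {"hostname": "node-b", "status_name": "Deployed", "tag_names": ["compute"]},
        {"hostname": "ctl-1", "status_name": "Ready", "tag_names": ["control", "rack-7"]},
    ]
    for host, containers in rollout_plan(sample).items():
        print(host, containers)
```

Tags that do not name a role (such as the made-up rack-7 above) are simply ignored, so operators could keep using tags for other bookkeeping without affecting the rollout.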
Once again, we didn't reinvent the wheel here. The OpenStack community has realized that installing and operating OpenStack from packages means a whole lot of pain. Our deployment tool of choice became Kolla/Kolla-Ansible. Since we have very special requirements, we still had to write our own code to make Kolla/Kolla-Ansible suit our needs. In line with the open source ethic, we have of course contributed this code back to the community.
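Kolla-Ansible decides what runs where from an Ansible inventory with top-level groups such as control, network, compute, monitoring and storage. As a rough idea of how such an inventory could be generated rather than written by hand, here is a small Python sketch. The function name and the host names are our own illustration; it is not the code we contributed upstream.

```python
# Top-level host groups as used in Kolla-Ansible's sample multinode inventory.
KOLLA_GROUPS = ("control", "network", "compute", "monitoring", "storage")


def render_inventory(hosts_by_group):
    """Render an INI-style Ansible inventory from a group -> hosts mapping.

    Groups without hosts are still emitted (empty) so that every
    top-level group is defined in the resulting file.
    """
    lines = []
    for group in KOLLA_GROUPS:
        lines.append(f"[{group}]")
        lines.extend(sorted(hosts_by_group.get(group, [])))
        lines.append("")  # blank line between groups
    return "\n".join(lines)


if __name__ == "__main__":
    # Invented host names for illustration.
    print(render_inventory({"control": ["ctl-1"], "compute": ["node-b", "node-a"]}))
```

A real Kolla-Ansible inventory also defines many per-service child groups below these top-level ones; the sketch only covers the part that changes when hardware is added or removed.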
So what else needs to be done after all OpenStack services are up and running on the new node? In a nutshell – nothing. Once the pipelines have completed successfully, the node is in production and can usually start running newly created cloud resources right away.
The reverse, retiring a node, of course works the same way. In OCP, we scale the hardware up and down fully automatically.
Does automation stop here? Not by a long shot! Our monitoring and logging are also thoroughly automated. This means the usual suspects such as Grafana, Prometheus, ELK, CheckMK, Alerta and many more are rolled out and configured automatically in containers. The resources that need to be monitored are also automatically integrated into the above-mentioned tools.
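One common way to integrate new resources into Prometheus automatically is file-based service discovery: a job writes a JSON file of targets and Prometheus picks up changes on its own. The following Python sketch shows the idea under assumptions. The node records, the role label and the use of node_exporter on its default port 9100 are illustrative and not a description of our actual setup.

```python
import json

NODE_EXPORTER_PORT = 9100  # default port of Prometheus node_exporter


def file_sd_targets(nodes):
    """Group nodes by role into Prometheus file_sd target groups."""
    by_role = {}
    for node in nodes:
        target = f'{node["hostname"]}:{NODE_EXPORTER_PORT}'
        by_role.setdefault(node["role"], []).append(target)
    return [
        {"targets": sorted(targets), "labels": {"role": role}}
        for role, targets in sorted(by_role.items())
    ]


if __name__ == "__main__":
    # Invented nodes for illustration.
    nodes = [
        {"hostname": "node-b", "role": "compute"},
        {"hostname": "ctl-1", "role": "control"},
        {"hostname": "node-a", "role": "compute"},
    ]
    # The JSON printed here is what would be written to a file_sd file.
    print(json.dumps(file_sd_targets(nodes), indent=2))
```

Sorting roles and targets keeps the generated file stable between runs, so an unchanged set of nodes produces an unchanged file and no needless reloads.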
The Open Cloud Platform is being further developed and optimized constantly and will offer its users many more new features going forward.
We are happy to have created a product that offers a high level of data protection, security and agility at the same time, and which is now used by more and more Otto Group Holding Divisions and Group companies. OCP provides a protected space for cloud newbies and is aimed at agile development teams for whom a public cloud is out of the question due to the sensitivity of their data.