I got asked to write a few sentences about OpenStack and why I don’t like it (anymore).
Background: I’m working at one of the largest hosting providers in Europe, HEG. We got asked a few times if we want to start selling public cloud stuff (we already have VPS and private cloud in our portfolio), so we started to evaluate OpenStack, because this seemed to be “the new hotness”.
My requirements:
- host more than 10,000 instances
- shared storage for all of them
- provide private ip space for each customer
- full support for adding public ipv4 and ipv6 addresses to instances
- no single point of failure
- provide scalable bandwidth from 100 Mbit/s to at least 10 Gbit/s for each customer
- API for everything
- easy to maintain and to update/upgrade
These are a few of my personal requirements, so let’s compare them to OpenStack.

Storage: for a long time OpenStack used iSCSI to provide block devices for virtual machines. This is fine in small environments but doesn’t scale. Luckily it now also supports Ceph, which works fine.

APIs and complexity: OpenStack has many, many services, and all of them have APIs to interact with, but the whole thing is such a complex construct that it is hard to understand and to modify. The APIs have docs, but they weren’t easy to understand (still useful though, and hey, nothing is perfect).

Upgrades: major upgrades weren’t supported. The recommended way was to install new servers and migrate, which really sucks (but this should have changed since the last release?).

Network: the network design is a huge issue and a no-go. The Neutron service is designed to build a fully meshed network via GRE tunnels (it relies on Open vSwitch), and you only have one gateway to the outside. This is a huge SPOF and not acceptable. It is possible to build active/passive Neutron nodes, but even this is bad, because it is hard to build a single node that handles multiple 10G links and the traffic of more than 10,000 machines. Also, GRE doesn’t scale: more than 50 nodes in one availability zone weren’t recommended (one zone = one fully meshed GRE setup).

For a few months now you can use OpenContrail as an alternative to Open vSwitch. Its development is really slow, the code is unstable/partly broken, and it uses way too many technologies: RabbitMQ, Cassandra, ZooKeeper, Redis, Python, C/C++, and many more.
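To put some numbers on the full-mesh and single-gateway points, here is a small back-of-the-envelope sketch in Python. It is not OpenStack code: the function names are made up for illustration, the tunnel count is just the generic formula for a full mesh of n nodes (n * (n - 1) / 2), and the four 10G uplinks on the gateway are an assumption of mine, not a measured setup.

```python
def full_mesh_tunnels(nodes: int) -> int:
    """Total point-to-point tunnels needed to fully mesh `nodes` hosts."""
    if nodes < 0:
        raise ValueError("nodes must be non-negative")
    return nodes * (nodes - 1) // 2


def tunnels_per_node(nodes: int) -> int:
    """Tunnels each single host has to terminate in a full mesh."""
    return max(nodes - 1, 0)


def gateway_share_mbit(link_gbit: float, links: int, instances: int) -> float:
    """Average bandwidth per instance (Mbit/s) if all traffic leaves
    through one gateway and every instance sends at the same time."""
    if instances <= 0:
        raise ValueError("instances must be positive")
    return link_gbit * 1000 * links / instances


if __name__ == "__main__":
    # Tunnel count grows quadratically with the number of nodes.
    for n in (10, 50, 100, 500):
        print(f"{n:>4} nodes: {full_mesh_tunnels(n):>7} tunnels total, "
              f"{tunnels_per_node(n)} per node")

    # Hypothetical gateway with 4 x 10G uplinks serving 10,000 instances.
    share = gateway_share_mbit(link_gbit=10, links=4, instances=10_000)
    print(f"single gateway, 4 x 10G, 10,000 instances: {share} Mbit/s each")
```

At 50 nodes the mesh already needs 1225 tunnels, and doubling the node count roughly quadruples that. The gateway calculation is a worst case (everybody sending at once), but it shows how far a single node is from the 100 Mbit/s to 10 Gbit/s per customer I listed in my requirements.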
Conclusion: OpenStack is nice and may work fine in smaller environments (fewer than 100 nodes?), but it simply doesn’t scale in larger networks. It was easier to build a KVM infrastructure from scratch than to deploy OpenStack. I’ve written down some information on GitHub about a FOSS KVM solution that scales way better.