It’s funny how much better some words sound in a foreign language. Cycling fans will be well aware of how inspiring the cries of “Allez, Allez, Allez” are when riders push their bodies to the limit climbing Pyrenean mountains. It sounds far better than the English equivalent, “Go, Go, Go.”

And so it is with the Cloud Native Computing Foundation (CNCF), which today voted in a new project, Vitess. I’m pretty sure whoever named the storage project realized that “Vitesse” (with the final “e,” as per correct French spelling) has a far more alluring ring to it than “Speed.”

Anyway, nomenclature aside, the CNCF has indeed welcomed another storage project into its midst, only a week after doing the same with Rook. Vitess, the 16th hosted project within the CNCF (alongside Kubernetes, Prometheus, OpenTracing, Fluentd, Linkerd, gRPC, CoreDNS, containerd, rkt, CNI, Envoy, Jaeger, Notary, TUF, and Rook), was originally created as an internal solution by YouTube to handle scaling for massive amounts of traffic.

Vitess is a database orchestration system for horizontal scaling of MySQL through generalized sharding. By encapsulating shard-routing logic, Vitess allows application code and database queries to remain agnostic to the distribution of data onto multiple shards. With Vitess, organizations can even split and merge shards as needs grow, with an atomic cutover step that takes only a few seconds. Companies like BetterCloud, Flipkart, Quiz of Kings, Slack, Square Cash, Stitch Labs and YouTube are using Vitess across various stages of production and deployment. Companies including Booking.com, GitHub, HubSpot, Slack, and Square are also active contributors to the project.
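
To make the idea of encapsulated shard routing a little more concrete, here is a minimal sketch in Python. It is not Vitess code and does not use any Vitess API; the shard names, the one-byte key ranges, and the `keyspace_id` and `route` helpers are all assumptions made for illustration. The point is only that the application asks for a key, and a routing layer decides which shard holds it.

```python
import hashlib

# Hypothetical shards: each owns a half-open range [start, end) of one-byte
# hash values. The names and ranges are illustrative only.
SHARDS = {
    "shard_low": (0x00, 0x80),
    "shard_high": (0x80, 0x100),
}


def keyspace_id(sharding_key: str) -> int:
    """Map a sharding key to an integer in [0, 256) via the first SHA-256 byte."""
    return hashlib.sha256(sharding_key.encode("utf-8")).digest()[0]


def route(sharding_key: str, shards=SHARDS) -> str:
    """Return the name of the shard whose range contains the key's hash."""
    kid = keyspace_id(sharding_key)
    for name, (start, end) in shards.items():
        if start <= kid < end:
            return name
    raise LookupError(f"no shard covers keyspace id {kid}")


if __name__ == "__main__":
    # The caller never names a shard; it only supplies the key.
    for user in ("user-1", "user-2", "user-3"):
        print(user, "->", route(user))
```

Because the caller only ever passes a key, the shard map can change behind the routing function without any change to application code, which is the property the paragraph above describes.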

The “why” behind Vitess is best summed up by Sugu Sougoumarane, CTO at PlanetScale Data and co-creator of Vitess:

Faced with rapid organic and internal growth at YouTube, we had to come up with something that would leap ahead of the curve instead of just fighting fires. When we finally built the initial feature list for Vitess, it was obvious that we were addressing problems that are common to all growing organizations. Our collaboration with Kubernetes over the last two years means anyone can now run Vitess the way YouTube does: dynamically scaled and scheduled in a container cluster.

Vitess has actually been under development since long before Kubernetes was even a glimmer in the eye of the Google crew. First started in 2010, the earliest version was a connection proxy that helped to buy some headroom, but over time the features evolved, while the tools and servers grew to be more efficient, fault tolerant, and manageable. This iterative journey led to what Vitess has become today: a distributed, cloud-based storage solution that exhibits some of the best properties of a relational database.

A more recent user of Vitess is Slack. Michael Demmer, Senior Staff Engineer at Slack, explains the company’s use case:

Slack is in the midst of a major migration of the MySQL infrastructure at the core of our service, driven by the need for an architecture that scales to meet the growing demands of our largest customers and features under the pressure to maintain a stable and performant service that executes billions of MySQL transactions per hour. We needed a solution that would offer a familiar full featured SQL interface, and wanted to continue to use MySQL as the backing store to maintain our operations knowledge and comfort level. Vitess is a natural choice for this purpose and has served us well so far.

The main features that Vitess offers include:

  • Combines important MySQL features with the scalability of a NoSQL database
  • Enables MySQL to run in the cloud
  • Cloud-native functionality such as support for automatic failover/recovery, replication, and rolling upgrades
  • Vertical and horizontal sharding support, and virtually seamless dynamic re-sharding
  • Multiple sharding schemes, with the ability to plug in custom ones
  • Query routing, rewriting and sanitization, blacklisting, streaming, and de-duping
  • Master management tools (handles reparenting)
  • Performance analysis tools
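
As a rough illustration of what horizontal re-sharding involves, the sketch below splits one key range into two halves and checks that every key still has exactly one owner. This is a toy model written for this article, not Vitess’s implementation: the `split_shard` and `owner` helpers, the shard names, and the 0 to 255 key space are all hypothetical, and a real system also has to copy data and cut traffic over.

```python
def owner(shards: dict, key_id: int) -> str:
    """Return the single shard whose half-open range [start, end) holds key_id."""
    matches = [name for name, (start, end) in shards.items() if start <= key_id < end]
    if len(matches) != 1:
        raise LookupError(f"key {key_id} has {len(matches)} owners")
    return matches[0]


def split_shard(shards: dict, name: str) -> dict:
    """Return a new shard map in which `name` is replaced by its two halves."""
    start, end = shards[name]
    if end - start < 2:
        raise ValueError("range is too small to split")
    mid = (start + end) // 2
    # Build a new map rather than mutating the old one, so the old map stays
    # usable until the caller switches over to the new one.
    new_shards = {k: v for k, v in shards.items() if k != name}
    new_shards[f"{name}_a"] = (start, mid)
    new_shards[f"{name}_b"] = (mid, end)
    return new_shards


if __name__ == "__main__":
    before = {"s0": (0, 256)}
    after = split_shard(before, "s0")
    print(after)
    print(owner(before, 200), "->", owner(after, 200))
```

Returning a fresh map instead of editing the old one in place mirrors, very loosely, the idea of preparing the new shards first and then switching over in a single step.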

On the technical side, Vitess’s backend components are implemented in Go, and the project is continuously tested against Ubuntu 14.04 (Trusty) and Debian 8 (Jessie). CNCF suggests that other Linux distributions should work as well. Vitess currently supports MySQL 5.6, MariaDB 10.0, and any newer versions. The VTGate server is the main entry point that applications use to connect to Vitess. For backups, Vitess can write data to either a network mount (e.g. NFS) or a blob store.

Welcome to CNCF, Vitess. We all look forward to seeing how this project develops.

Ben Kepes

Ben Kepes is a technology evangelist, an investor, a commentator and a business adviser. Ben covers the convergence of technology, mobile, ubiquity and agility, all enabled by the Cloud. His areas of interest extend to enterprise software, software integration, financial/accounting software, platforms and infrastructure as well as articulating technology simply for everyday users.