Apache Cassandra 4.0 promises increased stability, speed and ease of use

Future changes will be focused on lowering the learning curve, say DataStax executives

Apache Cassandra 4.0 was released on 20 July as a public beta for community testing. The timing of the final GA release is not yet known, but changes from here on will be minor refinements and bug fixes rather than breaking changes.

In a video presentation, DataStax co-founder and CTO Jonathan Ellis and chief product officer Ed Anuff discussed the changes to the community version of Cassandra and to the company's enterprise and cloud products, and what they signify for how the database will fit into a landscape increasingly dominated by cloud in the years ahead.

Cassandra 4.0 represents a coming of age for the NoSQL database originally spawned by Facebook in 2008. Future improvements to the architecture and feature set of Cassandra will be mostly incremental, said Ellis. The focus will instead be on making it easier to set up and configure without in-depth, low-level knowledge of the stack, and simpler to operate for companies and teams that lack "high priests of Cassandra".

"We want Cassandra to just work," he said.

Anuff spoke of the need to provide developers with everything they need to use Cassandra as a general-purpose store for the data their applications require in an increasingly diverse environment. For example, 4.0 adds first-class support for the Java frameworks Spring Boot, Spring Data and Quarkus. Third-party access tools available to DataStax Enterprise (DSE) include the DataStax Kafka Connector and Cassandra Reaper, a management tool built by recent DataStax acquisition The Last Pickle. There have also been moves to expose data held in Cassandra to a wider range of interfaces.

"API access to data from microservices via REST or GraphQL are becoming important for a whole new generation of developers using Node.js or Python," said Anuff. "Developers need a variety of different ways to access data and it's important the database is designed with that in mind."

Choosing a database that supports both rapid agile development and scale-out operational performance "should not be an either/or choice," he added.

Apache Cassandra is developed and maintained by multiple vendors including Amazon, Netflix and DataStax, with numerous contributions from end users too. Many of the improvements from the DataStax side have come from the company's experience of rolling out Astra, its managed cloud database service, which runs on Kubernetes.

"It's made us have to walk in our customers' shoes, living the process of having to run Cassandra at scale," added Anuff. "We've channelled those insights into DataStax Enterprise and into the Cassandra 4.0 community."

This experience informed the creation of the Cassandra Kubernetes operator released in March, as well as a forthcoming AIOps product called Vector, currently in private beta, which is designed to automatically monitor the health of Cassandra clusters. Another new product in Astra and DSE is Guardrails, a configurable safety solution that enforces best practice and prevents developers from making common errors with Cassandra. Graph features will be available in Astra early in 2021, said Anuff.

The community version of Cassandra 4.0 focuses on stability, with more than a thousand bug fixes since the last release. Scaling operations are now five times faster, according to Apache, and there have been improvements in maintaining consistency across replicas. Observability has been boosted with new auditing capabilities and tools, and there are controls to enable granular data access on a per-data-centre basis.