How Postgres is taking the fight to the NoSQL pretenders

Long-time Postgres developer, EnterpriseDB's Dave Page, explains how open-source software development has changed and what's next for the venerable database

Dave Page's involvement with open source software began long before collaborative coding started to get taken seriously by business.

As a developer, he began working on the PostgreSQL database (often known simply as "Postgres") in 1997, shortly after the project, which originated at the University of California, Berkeley, had been released from academia. Page is now one of a core team of seven members who control the direction of development and a leading light in PostgreSQL Europe, the support group and event organiser, alongside his day job as chief architect and head of IT at Postgres vendor EnterpriseDB.

So what has changed since the early days?

"The community has certainly grown," he says. "But it's not grown massively, partly because Postgres has very high standards for the code. It makes it very hard for new programmers to join the project. That's unfortunate, and we are trying to do something about it, but it's kind of a necessary evil to maintain the quality."

In line with this, the typical Postgres developer has become more professional too.

"When I started out the vast majority of people in the community were hobbyists or people who wanted to use Postgres at work, like I did, and were contributing. Nowadays the majority of the senior contributors are working for a company that's supporting Postgres in some way."

Among these supporting companies are Fujitsu, which has developed its own fork of Postgres, and NTT Japan, which has created a version solely for its own internal use. These big players are joined in the development effort by myriad small consultancies and development firms. Then there is EnterpriseDB, which has modified the code to make it compatible with Oracle and which delivers enterprise and cloud-based versions on a subscription basis, along with related toolsets and support.

Page estimates he spends about 20 per cent of his time doing "community work" with the rest devoted to his company's products.

"EnterpriseDB is very supportive of the community," he says. "A number of people who have joined from the community are given time to work on the open-source side of things. They're very hands off."

Joining forces with competitors to help develop Postgres means there is always a certain tension, he says. It also means that some new features make it into the free community version while others, such as EnterpriseDB's Oracle compatibility, remain proprietary. Page explains how this works.

"With Postgres we collaborate, and in business we compete. We tend to only fork Postgres to create new features the community as a whole isn't interested in."

He contrasts this approach with that of MySQL, which is owned and controlled by a company that arguably has an interest in killing it off or at least starving it of attention.

"Oracle are in a strange position. They have a database that wants to be enterprise class [MySQL] and they have an enterprise class database. So it's not in their interest to spend too much effort on MySQL because it's going to start encroaching on the standard edition and enterprise edition revenue."

As Oracle seeks to lock in these revenue streams - in part by intensifying its audit activity - Page says the take-up of EnterpriseDB's Postgres offerings is increasing.

"Licensing is driving a lot of people to us, they want to get away from that," says Page, who also questions Oracle's commitment to cloud.

"It'll be interesting to see where they go with that, competing with Amazon. Even Amazon isn't making a profit in cloud."

SQL, NoSQL and other SQL

The once conservative world of databases has been turned on its head in recent years by new arrivals, including the NoSQL firms such as MongoDB, Couchbase, Basho and DataStax.

The new choices available mean that developers can now select the right tool for any particular job. So, many companies will run MongoDB together with Postgres and Oracle somewhere in their set-up rather than having to settle for a one-size-fits-all approach.

One such company is polling firm YouGov, whose executive technical director Jason Coombs has moved some distributed workloads away from Postgres and onto MongoDB, which he claims provides better performance for distributed data, while retaining Postgres for relational tasks. Another is Temetra, whose director Paul Barry uses Riak for distributed workloads but said, "We still run Postgres, it has very desirable features in terms of transactions and data integrity for certain types of data - high reliability or high consistency data - which is better suited to the relational model."

Both companies have moved tasks away from Postgres. Perhaps in view of this trend, distributed computing is clearly a market that the Postgres community is trying to crack.

In late 2014, PostgreSQL 9.4 introduced JSONB, a binary version of JSON storage, and a key-value store, giving the database NoSQL capabilities and taking the fight to the newcomers. Page claims that, rather than moving away from Postgres, customers are instead shifting back from experiments with NoSQL, although he declines to provide any examples. He does say, though, that for certain queries Postgres is the better choice.

"We significantly outperformed MongoDB in terms of JSON storage and retrieval, and querying because we can query right down to the documents very, very efficiently so you can query on a specific attribute," says Page.

Another area in which Page sees considerable promise is the OpenStack cloud platform for private and hybrid cloud systems.

"I use OpenStack a lot, I really like it as an end-user," he says. "It gets me over all those qualms I had about pushing our internal services out to Amazon. I could have the flexibility of the cloud but I know it's running on systems that are six feet away from where I'm sitting.

"But one of the things it's really missing at the moment is the database layer. They have Trove but it is very immature, designed for managing individual database servers, and I think that's where there's a lot of opportunity for us."