Case Study: Twitter's use of MySQL
Social network explains its reasons for using Oracle's open-source relational database management system
Social networking site Twitter told delegates at Oracle's MySQL Connect event in San Francisco today that it uses MySQL because its engineers understand the system and because it can handle the high volume of queries the site receives.
Jeremy Cole, DBA Team Manager at Twitter, said that the site processes around 400 million tweets per day, all of which have to be stored in the firm's databases.
"Processing 20,000 tweets per second is our current record," he said, explaining the scale of the task before his team.
Cole described his mission as to "keep the tweets flowing". He explained that this involves adding capacity, fixing broken databases and replacing hardware among the firm's several thousand MySQL servers.
One of his recent projects was to add server-side statement timeout to kill queries that run for more than five seconds, in order to prevent a broken or inefficient query from using up precious resources and delaying the entire queue.
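The timeout policy Cole describes can be sketched as a simple selection rule: any active query that has run past the five-second limit is flagged for termination. This is a hypothetical illustration, not Twitter's actual implementation; the row shape mirrors MySQL's `SHOW PROCESSLIST` output, but the function names and data structures are assumptions.

```python
# Hypothetical sketch of a statement-timeout policy: pick out queries
# that have run longer than the threshold so they can be killed.
# The 5-second limit comes from the article; everything else is assumed.

TIMEOUT_SECONDS = 5.0

def queries_to_kill(processlist, timeout=TIMEOUT_SECONDS):
    """Return ids of queries that have exceeded the timeout.

    `processlist` is a list of dicts shaped like SHOW PROCESSLIST rows:
    {"id": 42, "command": "Query", "time": 7.3}.
    Idle ("Sleep") connections are left alone.
    """
    return [
        row["id"]
        for row in processlist
        if row["command"] == "Query" and row["time"] > timeout
    ]
```

In a real deployment this logic would run against `SHOW PROCESSLIST` and issue `KILL QUERY <id>` for each flagged id; later MySQL versions (5.7+) also offer a built-in `max_execution_time` limit for SELECT statements.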
He is also working on optimising Twitter's databases to operate on SSDs (solid state drives), onto which most of the firm's server fleet is planned to migrate.
Twitter uses MySQL rather than other relational databases principally because it has the in-house expertise to run it. Cole said that he has contributed to MySQL's development, which as an open-source project welcomes contributions from its community, and that his team has extensive experience of operating the system.
"MySQL is operable, we know how to use it and upgrade and downgrade it, to push out new releases and fix bugs.
"It's also a high-performance system, that means for us we can have most of our fleet running tens of thousands of queries per second per server. Twitter is all about real time, so if a query takes 50 seconds to run, that doesn't help us. We measure latency in terms of microseconds, so if it's fast, that helps us. Other RDBMS [relational database management systems] tend not to be faster despite their claims."
He admitted, though, that MySQL isn't a complete solution by itself, adding that Twitter uses it as a building block onto which it bolts other tools for certain functions.
"[MySQL] has really strong cores of features we understand and functionality we can trust, then we build other solutions on top, like Gizzard [a sharding framework for creating distributed datastores] and Galera [a synchronous multi-master cluster for MySQL].
"We use MySQL replication, with about 25 traditional master slave clusters, with anywhere from three to 100 machines across a few datacentres.
"We also use InnoDB as a stable and well understood storage system that doesn't lose data. Sharding of the data ensures we can scale the system up when needed. That allows us to grow from 10 to 30 servers, up to several hundred, which is our largest so far. There's no sign at this point that it couldn't keep going as far as we need."
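The master-slave clusters Cole mentions imply a routing decision for every statement: writes go to the master, reads can be spread across the replicas. The sketch below is purely illustrative; the article does not describe Twitter's routing layer, so the class, hostnames and round-robin policy are all assumptions.

```python
import itertools

class ReplicaPool:
    """Illustrative read/write routing for one master-slave cluster.

    Writes always go to the master; reads rotate round-robin across
    the replicas. Twitter's real routing layer is not described in
    the article, so this is only a sketch of the general pattern.
    """

    def __init__(self, master, replicas):
        self.master = master
        self._cycle = itertools.cycle(replicas)

    def host_for_write(self):
        # All mutations must hit the master to keep replication simple.
        return self.master

    def host_for_read(self):
        # Spread read load evenly over the replicas.
        return next(self._cycle)
```

With clusters of "anywhere from three to 100 machines", even this naive rotation keeps any single replica from absorbing all of the read traffic.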
When users tweet, the information is stored in an internal system called T-bird, which also stores metadata. This system is built on top of Gizzard, which is itself built on MySQL. It runs at lower velocity than other systems at the firm, processing only tens of thousands of queries per second.
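The core idea behind a Gizzard-style datastore is that every record id maps deterministically to one MySQL-backed partition. The sketch below uses simple modulo placement; Gizzard actually uses configurable forwarding tables rather than a fixed hash, and the shard count here is an assumption, so this only illustrates the concept.

```python
NUM_SHARDS = 16  # assumption; the real shard count isn't given in the article

def shard_for(tweet_id, num_shards=NUM_SHARDS):
    """Map a tweet id to a shard index by simple modulo.

    Gizzard uses configurable forwarding tables rather than a fixed
    hash, so this is only an illustration of the idea: each id lands
    deterministically on exactly one MySQL-backed partition.
    """
    return tweet_id % num_shards
```

Deterministic placement is what lets the fleet grow "from 10 to 30 servers, up to several hundred": both reads and writes for a given id always route to the same shard.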
Cole added that despite the high volume capabilities he has built already, there is still a lot of work remaining for his team in terms of future development.
"There's a lot of work still to do, we're still looking at support for fine tuning caching behaviour. We need a smarter allocation of cache tables to improve our SLAs [service level agreements]. Only 10 per cent of the data is count-related, but 90 per cent of the queries relate to this, so the information needs to stay in the cache ideally [to be more readily accessible and improve query response times]. But it doesn't stay there at the moment, so we need to improve that.
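The skew Cole describes, where roughly 10 per cent of the data attracts 90 per cent of the queries, suggests a greedy allocation: pin the hottest tables (queries served per byte of cache) first. This is a hypothetical sketch of that reasoning, not Twitter's cache allocator; the function and data shape are assumptions.

```python
def pick_cached_tables(stats, cache_budget):
    """Greedy cache allocation sketch (hypothetical, not Twitter's).

    `stats` is a list of (name, size_bytes, queries_per_sec) tuples.
    Tables are ranked by queries served per byte, then pinned in that
    order until the cache budget is exhausted.
    """
    ranked = sorted(stats, key=lambda t: t[2] / t[1], reverse=True)
    chosen, used = [], 0
    for name, size, _qps in ranked:
        if used + size <= cache_budget:
            chosen.append(name)
            used += size
    return chosen
```

With the proportions from the article, the small count-related tables dominate the queries-per-byte ranking and get pinned first, which is exactly the behaviour Cole says he wants from a smarter allocator.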
"We need better auditing and logging support, for compliance as well as DBA investigation and debugging.
"And we monitor all systems at sixty-second intervals, but need to get that down to one to five seconds for greater accuracy and to enable us to better parse the data."
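Tightening a monitoring loop from sixty seconds to one-to-five seconds makes timer drift matter: naively sleeping for the interval after each sample lets sampling time accumulate as error. A common fix is to schedule against fixed wall-clock ticks, sketched below; this is a general pattern under assumed names, not Twitter's monitoring system.

```python
import time

def run_monitor(sample_fn, interval_seconds, iterations):
    """Drift-free polling loop (generic sketch, not Twitter's monitor).

    Samples are taken at fixed ticks on the monotonic clock: the next
    deadline is advanced by the interval each time, so the cost of
    `sample_fn` itself doesn't push later samples progressively late.
    """
    samples = []
    next_tick = time.monotonic()
    for _ in range(iterations):
        samples.append(sample_fn())
        next_tick += interval_seconds
        delay = next_tick - time.monotonic()
        if delay > 0:
            time.sleep(delay)
    return samples
```

At one-second intervals this scheduling detail is the difference between a metric stream that stays aligned across thousands of servers and one that slowly skews.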