Three trends are driving the need for robust database architecture that can manage real-time data transactions and analysis for high volumes of simultaneous users accessing data around the globe: the growth of cloud computing; the transformative potential of hyperlocal and contextual products, where the ability to make decisions or buy things from any location while on the move is changing application design; and the growth of mobile apps in the enterprise and beyond.
The lifecycle maturity for many businesses building real-time and transactional applications is determined by magnitude. Matt Vella, CTO of Forensiq, an online advertising fraud service, points out that transactions for their service first grew by a factor of 100 to 1,000, but it is only when scaling another 1,000 times on top of that that the need for robust, high-volume transactional database architecture comes into play. That architecture may take some time to put in place, and along the way there are strategies that can be employed to manage transactional growth.
Steven Willmott is CEO and cofounder of 3Scale, an API provider that transacts billions of API calls each day on behalf of its customers. Those customers have their own APIs, each of which sees high transaction volume daily, allowing end users to look up data, make payments, and alter data details. In addition, 3Scale needs to monitor the data usage of all its customers in order to ensure performance of the API management system and to match call volumes with usage limits and authorizations. With the growth in APIs as the core way for its customers to manage data, 3Scale has faced the need to scale its database architecture to continue handling high-speed, high-volume transactions in real time.
Volume, Latency and Fail-over
“There are three dimensions to scaling data transactions that are important,” says Willmott. “First there is raw volume. You want to be able to handle a large volume of transactions and data requests without any performance degradation. Most systems have a point where performance drops radically: you don’t go down, you just begin to manage the data very slowly.
“Secondly there is latency. Different calls have different resource needs: so if some of those calls are taking a second to return, that is a horrible experience for the end user.
“Finally, there is fail-over or distribution. If you have data requests coming from all over the world, you will want to have data centers or distribution points all over the world. If one of your data-centers in one location goes down, you will want to have fail-over so that the end user experience is not interrupted, and you will want to have data centers close to where you and your users are.”
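The fail-over dimension Willmott describes can be illustrated with a minimal sketch: route each request to the closest healthy data center, and move to the next-closest one when it goes down. The region names, latencies and the `pick_region` helper are hypothetical values chosen for illustration, not details of 3Scale's setup.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    latency_ms: float  # measured round-trip time from the client (assumed value)
    healthy: bool      # result of the most recent health check


def pick_region(regions):
    """Return the lowest-latency healthy region, or None if all are down."""
    candidates = [r for r in regions if r.healthy]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r.latency_ms)


regions = [
    Region("eu-west", 20.0, True),
    Region("us-east", 95.0, True),
    Region("ap-south", 180.0, True),
]

primary = pick_region(regions)    # nearest region while everything is up
regions[0].healthy = False        # simulate the nearest data center failing
fallback = pick_region(regions)   # traffic moves to the next-closest region
print(primary.name, "->", fallback.name)
```

In a real deployment the health flag and latency figures would come from monitoring, and the routing decision is often made by DNS or a load balancer rather than application code.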
Controlling Sharding: The Alternatives
While database services like Redis Labs are focusing on building in sharding features that help manage datasets that grow beyond the size of a single machine, others like NuoDB are creating features that provide alternatives to sharding.
Willmott sees sharding as something they wanted to maintain complete control over when handling the billions of transactions they need to manage.
“We use Redis in all of our locations, and as we were serving more and more traffic (we handle billions of transactions a month) we needed to shard it.” Willmott echoes what database services Aerospike and NuoDB have pointed out: “Redis doesn’t have any native clustering. So we actually use the Twitter open source project, Twemproxy. It’s working really well for us. Netflix also just did a fork of it, called Dynomite.
“One of the problems with Redis is that you don’t know how many shards you need at present. It took us three to six months to get our sharding in place. It’s very challenging. We did it ourselves as we have extreme performance needs, so we didn’t feel a cloud solution was right for us.
“It is complicated to manage and you have to be really in the weeds. If you are not Redis, then having a third party that can shard might be a good solution.”
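One reason the number of shards is hard to change later can be shown with a small sketch. It assumes the simplest possible scheme, hashing each key and taking the remainder by the shard count, which is not necessarily how Twemproxy or 3Scale distribute keys. The key names are made up for illustration.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard using a stable hash (md5 is used only to spread keys)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def fraction_moved(keys, old_shards: int, new_shards: int) -> float:
    """Share of keys that land on a different shard after resizing."""
    moved = sum(
        1 for k in keys if shard_for(k, old_shards) != shard_for(k, new_shards)
    )
    return moved / len(keys)


# Hypothetical counter keys, one per customer.
keys = [f"customer:{i}:calls" for i in range(10_000)]
moved = fraction_moved(keys, old_shards=4, new_shards=5)
print(f"{moved:.0%} of keys change shard when growing from 4 to 5 shards")
```

With modulo placement, a key stays put only when its hash leaves the same remainder for both shard counts, so going from four to five shards relocates roughly four in five keys. Consistent hashing is a commonly used way to reduce that movement, at the cost of a more involved placement scheme.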
By using Twemproxy to manage their sharding capabilities, Willmott is comfortable with sticking with Redis: “There are a lot of choices around what data infrastructure you can use. What Redis is very good at is keeping track of things, so you have atomic increments. We need to have real-time tracking of transactions: Redis is great, you can’t really do that with other SQL databases. With anything else, it is read the value, check the value, write the value: it is even longer to say.”
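The difference Willmott describes between an atomic increment and a read-check-write sequence can be sketched without a database at all. The `CounterStore` class below is a toy in-memory stand-in, not a Redis client; its `incr` method plays the role that the Redis `INCR` command plays on a real server. The key names are hypothetical.

```python
import threading


class CounterStore:
    """Toy in-memory key-value store used only to illustrate the two patterns."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def get(self, key):
        return self._data.get(key, 0)

    def set(self, key, value):
        self._data[key] = value

    def incr(self, key):
        # Read, add and write happen as one indivisible step.
        with self._lock:
            self._data[key] = self._data.get(key, 0) + 1
            return self._data[key]


store = CounterStore()

# Read-check-write: two clients interleave and one update is lost.
a = store.get("calls")       # client A reads 0
b = store.get("calls")       # client B reads 0 before A has written
store.set("calls", a + 1)    # A writes 1
store.set("calls", b + 1)    # B also writes 1, overwriting A's update
lost_update_total = store.get("calls")


# Atomic increment: many threads, no lost updates.
def worker():
    for _ in range(1000):
        store.incr("atomic_calls")


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
atomic_total = store.get("atomic_calls")

print(lost_update_total, atomic_total)  # prints "1 8000"
```

Two calls were made in the first scenario but only one was counted, which is the failure mode that matters when the counter is used to enforce usage limits.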
Hybrid Database Architecture at Scale and Speed
While Redis Labs and Aerospike both offer database-as-a-service products, NuoDB, like competitor FoundationDB, is a software product that customers deploy on their own, whether on-premises or in private or public cloud architectures.
CTO of NuoDB, Seth Proctor, is seeing customers use the database solution to scale up their transactional capabilities in any configuration:
“There are many value propositions in moving to a distributed technology (continuous availability, physical distribution for lower latencies in multiple geographies, etc.). Our customers need TPS rates anywhere from low hundreds for very complex logic up to hundreds of thousands per second for more rapid applications. We have shown demos of the software running at several million TPS on public clouds. Most of our customers are taking existing, rate-limited systems and moving to us to handle capacity today, but have room to grow as needed. Our distributed model means that customers don’t need to think about sizing for TPS up-front.”
Interim Scaling Solutions: Addressing External Factors and Creating API Abstraction Layers
Willmott insists that it is just as important to look at external pressures when businesses are scaling their internal database architecture to enhance volume, latency and fail-over capabilities. Rate limiting of API calls, for example, is one of the most common ways to shape data traffic before it increases volume pressure on a database. Willmott has previously presented five techniques to help businesses and application developers deal with scaling impacts by first managing the external pressure points.
“This doesn’t absolve you from needing to have very good scaling infrastructure; you still want to grow the number of transactions on the backend each month. But, at least from an API perspective, most people forget about the external opportunities and you want to look at both. The external strategies to manage transaction volume and latency risks are usually ones that don’t cost a lot, don’t require a lot of backend infrastructure investment or code rewrites, and they are usually strategies that will mean a different part of the company will look after it.”
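Rate limiting, mentioned above as a common way to shape traffic before it reaches the database, is often implemented with a token bucket. The sketch below is one such illustration, not a description of how 3Scale enforces limits. The class name, the rate and capacity values, and the explicit `now` argument (used so the example does not depend on the wall clock) are all assumptions.

```python
class TokenBucket:
    """Allow short bursts up to `capacity`, refilling at `rate_per_sec`."""

    def __init__(self, rate_per_sec: float, capacity: float, now: float = 0.0):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = capacity  # start with a full bucket
        self.last = now

    def allow(self, now: float) -> bool:
        # Refill according to the time elapsed since the last call.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0  # spend one token for this request
            return True
        return False            # caller should reject or delay the request


bucket = TokenBucket(rate_per_sec=2.0, capacity=3.0)
burst = [bucket.allow(now=0.0) for _ in range(5)]  # five calls at the same instant
later = bucket.allow(now=1.0)                      # one second later, tokens have refilled
print(burst, later)  # prints "[True, True, True, False, False] True"
```

Rejected calls never reach the backend, which is why this kind of control reduces database load without any change to the database itself.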
SlashDB: Using a single API to enable transactions
SlashDB also believes it has a technology ideally suited to businesses that already have legacy database systems in place but need to open up access to their data via API, as a way to address database scaling needs on the way to high-volume transactional infrastructure.
“Traditional databases such as Oracle, MS SQL Server and DB2 are the cornerstone of business data management infrastructure: the so-called stores of record,” explains Victor Olex, Founder of VT Enterprise which runs SlashDB. “But in today’s world, data management has to extend beyond enterprise walls and those systems do not always work as well at web scale. While NoSQL databases offer a scalable substitute, they come with a hidden cost: time and expense required to rewrite existing business applications or at the very least to feed those new Big Data stores with important enterprise data from the stores of record.
“When it comes to leveraging investments already made in traditional databases for the purposes of web and mobile, or to connect with NoSQL, SlashDB offers a thin API facade that instantly turns SQL databases into HTTP resources, complete with authorization, search, data format conversion and caching features. As a result, previously siloed SQL data can now be obtained in JSON, XML and other formats that both NoSQL and web applications can seamlessly work with.”
For database architecture based on legacy systems, Olex sees SlashDB as a solution that can help enterprises use data stored in systems of record as part of the data flow to enable application transactions in real time: “Reverse HTTP proxy caching is used to protect the database from having to respond to repetitive queries and boosts the system performance to web server speeds and scalability.”
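The caching idea Olex describes, answering repeated queries from a cache so the database only sees the first one, can be sketched in a few lines. This is an in-process illustration of the principle, not SlashDB's implementation or a real reverse proxy. The `TTLCache` class, the `query_database` stand-in, the `/customers` path and the 30-second lifetime are all hypothetical.

```python
class TTLCache:
    """Serve repeated requests from memory until an entry is older than the TTL."""

    def __init__(self, fetch, ttl_seconds, clock):
        self._fetch = fetch      # function that actually hits the database
        self._ttl = ttl_seconds
        self._clock = clock      # injected so the example is deterministic
        self._entries = {}       # key -> (time stored, value)

    def get(self, key):
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]        # fresh entry: the database is not touched
        value = self._fetch(key)
        self._entries[key] = (now, value)
        return value


backend_calls = []


def query_database(path):
    """Stand-in for an expensive SQL query behind an HTTP resource."""
    backend_calls.append(path)
    return {"path": path, "rows": 3}


fake_time = [0.0]
cache = TTLCache(query_database, ttl_seconds=30.0, clock=lambda: fake_time[0])

cache.get("/customers")
cache.get("/customers")                  # served from the cache
calls_after_repeat = len(backend_calls)

fake_time[0] = 31.0                      # entry is now older than the TTL
cache.get("/customers")                  # goes back to the database
calls_after_expiry = len(backend_calls)

print(calls_after_repeat, calls_after_expiry)  # prints "1 2"
```

The lifetime is the main design choice: a longer one shields the database more, while a shorter one keeps responses closer to the current state of the store of record.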
Scaling real-time transactional processing is increasingly moving to production level for enterprise and mid-tier businesses. As more applications need to run on mobile, as datasets expand globally and collect billions of data points, and as IoT data becomes a key data source in the design of new products and services, scalable, high-volume transactional data architecture will become the new normal. Along the way, businesses need to manage traffic in ways that reduce the load on their architecture, while also assessing how to take advantage of new feature sets available from the increasingly crowded marketplace of database service providers.