So you have a great idea for the next great startup. What technologies and processes do you use to make it happen as painlessly as possible? More importantly, how will you make sure it scales as your business takes off?
There are technologies that help your system scale quickly, and there are technologies that don’t scale as well but are easier to deploy, so they can help you get your business started more quickly.
You will need both, advised Reddit chief technology officer Martin Weiner. Weiner was also one of the technical leads of Pinterest when that service was launched, and so offered lots of practical advice in a presentation at the SXSW Interactive conference, held in Austin, Texas, this month.
A lot of tools may not scale w/ you but they will get you started faster-@MartyWeiner @reddit on #scalability #SXSWi pic.twitter.com/gM1hwnnvP2
— InApps (@thenewstack) March 13, 2016
The first year of any new startup’s life will be “scalability hell,” he said. “This is the problem every tech startup faces.”
One approach that many designers might take would be to use all the latest buzz-generating technologies. The designers have a clean slate, so why not start out ahead of the adoption curve? This can be a mistake, Weiner warned, given the amount of attention these new, and probably still immature, technologies need.
The truth is, your requirements will not be that different from other startups founded in the last few years, and so a more mature product would probably be your best bet.
How can you tell if software is mature? A geek at heart, he offered a formula:
You could judge software’s maturity by the amount of work done on it by its creators, divided by the complexity inherent in using the software itself.
So a caching system like Redis, for instance, is a fairly simple single-threaded piece of software that has been solidified by a large number of contributors from companies like Twitter and Pinterest. Redis would therefore rank more highly in the maturity equation than, say, HBase, which has many contributors but is based on some pretty sophisticated ideas and is difficult to maintain.
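As a rough illustration, the formula can be written as a simple ratio. The sketch below is not from the talk: the `Tool` class, the 1-to-10 scale, and the scores given to Redis and HBase are all made-up assumptions, chosen only to show how the comparison above falls out.

```python
from dataclasses import dataclass


@dataclass
class Tool:
    name: str
    effort: float      # hypothetical proxy for work invested by creators and contributors
    complexity: float  # hypothetical proxy for how hard the tool is to use and operate


def maturity(tool: Tool) -> float:
    """Weiner's heuristic: work invested divided by inherent complexity."""
    if tool.complexity <= 0:
        raise ValueError("complexity must be positive")
    return tool.effort / tool.complexity


# Made-up scores on an arbitrary 1-10 scale, for illustration only.
candidates = [
    Tool("HBase", effort=8, complexity=8),
    Tool("Redis", effort=8, complexity=2),
]

# Highest maturity first.
ranked = sorted(candidates, key=maturity, reverse=True)
```

With equal effort assumed for both, the simpler tool ranks first, which is the point of dividing by complexity.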
Basically, you want to use mature technology whenever you can, because it will help you get your operations running as smoothly as possible. If it doesn’t scale to the level of a Pinterest-sized business, that is a problem to solve if and when your own business reaches Pinterest-level popularity.
“Just use MySQL, the boring tech revolution is here” @MartyWeiner @reddit on #scalability #SXSW2016 #SXSWi pic.twitter.com/qRaNmnILmf
— InApps (@thenewstack) March 13, 2016
Mature technologies offer a wider pool of talent and support. For a mature technology like MySQL, “If it is 3 A.M., and your site is broken, because it will break, whatever the problem is with MySQL, the answer will be up on Google; and not only that, it will be up on Stack Overflow and someone will be calling you a newb. That’s a real test of maturity,” Weiner said.
Performance and stability are also common issues with immature tech and “what you want is predictability,” Weiner said. An immature tech may be humming along perfectly fine, then for some reason the latency will shoot through the roof. And the documentation and debugging probably won’t be there to support you either.
“Most likely you won’t need immature tech for a really long time,” he said.
Another advantage mature technology brings is that it makes it easy to pivot the business, to focus on some aspect of the operations that is working.
“There are a lot of tools that may not scale with you, but they will get you going faster,” he said. Pick well-known technologies like NGINX, Ubuntu, GitHub, and Python. “Python is a really mature tech. Everyone knows how to use it, and you can hire for it,” he said.
For most common data capture tasks, such as storing user comments or log-in credentials, MySQL is the way to go. AWS offers a hosted MySQL service through RDS. AWS manages the service, with capabilities such as automatic failover and configuration, and as the company grows you can transfer responsibility over to your own staff.
“The boring tech revolution is here,” he said. And don’t worry about optimizing your queries yet. “Don’t optimize anything on the backend unless you have to,” he said.
In fact, AWS is a great resource for startups. “It just works, and it will evolve and scale with you as you grow,” he said. It is competitively priced and there are a lot of tools for working with it. Basic search capabilities can be provided by AWS CloudSearch. For storage, go for AWS’ Simple Storage Service (S3). “It is super durable,” Weiner said. “It has four 9’s of availability.” For domain name services, look to another AWS service, Route 53.
Startups: Don’t do #ML. It’s a giant black hole of death and you probably don’t need it yet — @MartyWeiner @reddit on #scalability #SXSWi
— InApps (@thenewstack) March 13, 2016
Another tool you shouldn’t have to worry about is Apache Zookeeper, which is great for managing lots of services. A company just starting out, however, could just use AWS Config.
After launch, “If you grow, you’ll start breaking soon” — @MartyWeiner @reddit on #scalability #SXSWi pic.twitter.com/RoMPEdDoB0
— InApps (@thenewstack) March 13, 2016
Everything Breaks
If your startup takes off, you will then have to begin thinking about how to scale operations. You can shed some of your basic technologies because now you may have more IT staff to help you keep things going along.
Stats are important here. “If you start to grow, you will start to break,” he said. Instrument everything! Here, Weiner recommended StatsD, a Node.js daemon developed by Etsy that collects counters and timers sent from the application’s interface points: database calls, APIs, calls to other services, even background tasks such as backups.
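To make “instrument everything” concrete, here is a minimal sketch of a StatsD-style client in Python. It relies on the plain-text StatsD line format (`name:value|type`, with `c` for counters and `ms` for timers) sent over UDP, by default to port 8125. The `StatsClient` class, the metric names, and the `transport` hook are illustrative assumptions, not anything shown in the talk; a real project would more likely use an existing client library.

```python
import socket
import time
from contextlib import contextmanager


def format_metric(name, value, metric_type):
    """Encode one metric in the StatsD line format: <name>:<value>|<type>."""
    return f"{name}:{value}|{metric_type}".encode("ascii")


class StatsClient:
    """Hypothetical fire-and-forget client; metrics must never break the app."""

    def __init__(self, host="127.0.0.1", port=8125, transport=None):
        if transport is None:
            # Default transport: one UDP datagram per metric.
            sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
            transport = lambda payload: sock.sendto(payload, (host, port))
        self._transport = transport

    def _send(self, payload):
        try:
            self._transport(payload)
        except OSError:
            pass  # best effort: drop the metric rather than fail the request

    def incr(self, name, count=1):
        self._send(format_metric(name, count, "c"))

    @contextmanager
    def timer(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed_ms = int((time.perf_counter() - start) * 1000)
            self._send(format_metric(name, elapsed_ms, "ms"))


# Demo: capture payloads in a list instead of sending them over the network.
sent = []
stats = StatsClient(transport=sent.append)
stats.incr("signup.completed")
with stats.timer("db.get_user"):
    time.sleep(0.01)  # stand-in for a database call
```

Wrapping each database call, API handler, or background job in `timer(...)` is what gives you a latency graph to look at when something starts to break.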
Also, do logging. Service calls, page views, user registrations, anything you can query against later should be logged. “It will save your butt one day. One day you will bork your database. I guarantee it. And [logging] will save you,” he said.
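One way to keep logs queryable later is to write each event as a single line of JSON. The sketch below uses only Python’s standard `logging` and `json` modules; the event names, field names, and the in-memory buffer standing in for a log file are assumptions made for illustration.

```python
import io
import json
import logging
import time


def make_event_logger(stream):
    """Return a logger that writes one bare message per line to `stream`."""
    logger = logging.getLogger("events")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    logger.propagate = False  # keep events out of the root logger's output
    return logger


def log_event(logger, event, **fields):
    """Serialize one event as a JSON object on its own line."""
    record = {"ts": time.time(), "event": event, **fields}
    logger.info(json.dumps(record, sort_keys=True))


buffer = io.StringIO()  # stand-in for a log file or log shipper
events = make_event_logger(buffer)
log_event(events, "user_registered", user_id=42)
log_event(events, "page_view", user_id=42, path="/pins/1")

# Later: read the log back and query it like data.
parsed = [json.loads(line) for line in buffer.getvalue().splitlines()]
registrations = [e for e in parsed if e["event"] == "user_registered"]
```

Because every line is self-describing, the same log can be replayed to rebuild state after a database mishap, which is the scenario Weiner warns about.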
Only as the workloads increase should you think about optimization. If you are using RDS, scaling the database should be easy enough. Now is the time to profile SQL queries and optimize them so more of them can be cached. Denormalize the data, remove the joins and add indexes. “You will actually take away MySQL’s notion of what is related,” he said.
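The effect of adding an index can be checked by asking the database for its query plan before and after. The sketch below uses Python’s built-in `sqlite3` module purely as a stand-in so the example is self-contained; MySQL has its own `EXPLAIN` statement with different output. The `comments` table, its columns, and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments (id INTEGER PRIMARY KEY, user_id INTEGER, body TEXT)"
)
conn.executemany(
    "INSERT INTO comments (user_id, body) VALUES (?, ?)",
    [(i % 100, f"comment {i}") for i in range(1000)],
)


def query_plan(sql, params=()):
    """Return SQLite's plan description for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " ".join(row[-1] for row in rows)  # last column holds the detail text


QUERY = "SELECT body FROM comments WHERE user_id = ?"

plan_before = query_plan(QUERY, (7,))  # full table scan
conn.execute("CREATE INDEX idx_comments_user ON comments (user_id)")
plan_after = query_plan(QUERY, (7,))   # lookup through the new index
```

Printing `plan_before` and `plan_after` shows the planner switching from scanning the whole table to searching the index, the same kind of change to look for when profiling slow queries in production.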
Look for where caching could help with frequently requested static material: Redis, Memcached, Amazon ElastiCache, Varnish.
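The usual pattern with such a cache is “cache-aside”: check the cache first, and only go to the database on a miss. The sketch below uses an in-process dictionary with a time-to-live as a stand-in for Redis or Memcached; the `TTLCache` class and the `load_profile` function are hypothetical names for illustration.

```python
import time


class TTLCache:
    """In-memory stand-in for a cache server, with per-entry expiry."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: skip the expensive call
        value = loader(key)  # cache miss: fetch from the source of truth
        self._store[key] = (now + self.ttl, value)
        return value


calls = []  # records how often the "database" is actually hit


def load_profile(user_id):
    calls.append(user_id)
    return {"id": user_id, "name": f"user-{user_id}"}


cache = TTLCache(ttl_seconds=60)
first = cache.get_or_load(42, load_profile)
second = cache.get_or_load(42, load_profile)  # served from the cache
```

The time-to-live bounds how stale an entry can get, which is why this works best for material that rarely changes.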
Over time, a CTO should be thinking not in terms of technologies but in services. Weiner likened services to Legos, plastic bricks that can be used to make larger objects, like toy cars. “Legos are not the raw plastics, and they are not finished cars. They are right in-between,” Weiner said.
“A great service, like Legos, can drive creativity,” Weiner said. Services free up developers to work in their preferred languages. Only the interfaces must share the same schema. A game company, for instance, can quickly build a new game using a selection of pre-built services, such as software for joystick control.
Here is where ZooKeeper comes in handy for service discovery. “ZooKeeper is fancy stuff. It’s a pain, but it’s worth it,” Weiner said. Every time a new service is introduced, it can be registered in ZooKeeper.
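The idea behind service discovery can be sketched without ZooKeeper itself: services register an address under a name, and callers look the name up instead of hard-coding hosts. The in-memory `ServiceRegistry` below is a hypothetical stand-in that shows the shape of the idea only. A real ZooKeeper setup would typically store each instance as an ephemeral node that disappears when the service’s session ends, something this sketch does not model.

```python
class ServiceRegistry:
    """Toy in-memory registry: service name -> list of instance addresses."""

    def __init__(self):
        self._instances = {}  # name -> list of addresses
        self._next = {}       # name -> round-robin counter

    def register(self, name, address):
        addresses = self._instances.setdefault(name, [])
        if address not in addresses:
            addresses.append(address)

    def deregister(self, name, address):
        addresses = self._instances.get(name, [])
        if address in addresses:
            addresses.remove(address)

    def lookup(self, name):
        """Return one address for the service, rotating through instances."""
        addresses = self._instances.get(name)
        if not addresses:
            raise LookupError(f"no live instances of {name!r}")
        index = self._next.get(name, 0) % len(addresses)
        self._next[name] = index + 1
        return addresses[index]


registry = ServiceRegistry()
registry.register("email", "10.0.0.5:9000")  # made-up addresses
registry.register("email", "10.0.0.6:9000")
picks = [registry.lookup("email") for _ in range(3)]
```

Callers only need to know the name `email`; instances can be added or removed behind it without touching the calling code.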
This way, a service such as email can be reused across multiple products and easily augmented with new features.
“Engineers are now responsible for their services. It’s their baby. If it breaks in the middle of the night, they wake up and fix it,” Weiner said.
Feature art: Austin Street art, from David Lowe. Slides from Marty Weiner’s presentation deck.