Addressing Network Congestion in Riak Clusters
In high-scale distributed systems like Riak, an open-source distributed database written in Erlang, the network can make or break system reliability and availability. In this talk, Steve will discuss an experimental approach to alleviating the effects of network congestion, such as timeouts and throughput collapse, in Riak clusters under extreme load. He will cover the basics of Riak, explain which features of Riak can cause networking problems at scale, and then discuss the results of using a new Erlang network driver to try to address those problems.
- Discuss problems with TCP in clustered systems under extreme load, and present alternatives.
- Explore the design of an Erlang network driver that provides a non-TCP reliable protocol.
Target audience: Erlang developers interested in distributed systems, particularly clustered systems featuring heavy TCP usage.
Steve Vinoski, Distributed Systems Expert
Steve Vinoski is an architect at Basho Technologies in Cambridge, MA, USA. He's worked on distributed systems and middleware systems for nearly 30 years, including distributed object systems, service-oriented systems, and RESTful web services. His interest in software quality and development productivity led Steve to start exploring and using Erlang in 2006, and he's used it as his primary development language ever since.