Francesco Cesarini, Viktória Fördős – Scale, Manage and Prevent! – ElixirConf EU 2015

By Erlang Central | Published: May 20, 2015

Making distributed systems that scale to many machines is easier done with Erlang/OTP than most other technologies. Deploying, managing and monitoring thousands of Erlang nodes prepared for massive load, however, remains a tough (and repetitive) challenge. In the context of the EU funded RELEASE project, the main task of WombatOAM is to provide the scalable infrastructure for deploying thousands of Erlang nodes. It provides a broker layer capable of dynamically scaling heterogeneous cloud clusters based on capability profile matching.

In this talk, I will tell you how the scalability and robustness capabilities of WombatOAM were addressed, allowing us to deploy and monitor 10,000 Erlang nodes running an ant farm simulation in 4 minutes. The talk will cover the journey – focusing on the analysis of WombatOAM (using WombatOAM, as we like dog food), on the applied techniques and on the key decisions taken when advancing WombatOAM.

Talk objectives:

Learn how to detect bottlenecks in concurrent system.
Explore an approach to scaling Erlang clusters.
Discover how to predict and prevent possible outages.