EUC 2014 – Distributed Deterministic Dataflow Programming – Christopher Meiklejohn
Erlang implements a message-passing execution model in which concurrent processes send each other asynchronous messages. This model is inherently non-deterministic: a process can receive messages from any process that knows its process identifier, so the number of possible executions grows exponentially with the number of messages received. Concurrent programs in non-deterministic languages are notoriously hard to prove correct, and have led to many well-known disasters.
In addition, Erlang natively provides distribution and clustering as part of the runtime environment. This allows processes to communicate asynchronously across the network between different instances of the virtual machine, which further increases the amount of non-determinism.
We propose an alternative execution model for Erlang, namely deterministic dataflow programming. This execution model provides concurrency while eliminating all observable non-determinism. Given the same input values, a program written in deterministic dataflow style will always return the same output values, or never return. Our proposed solution implements distributed deterministic dataflow transparently on top of distributed Erlang, making highly-available, fault-tolerant, deterministic computations possible.
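
As a rough illustration of the deterministic dataflow idea, the sketch below models single-assignment dataflow variables with threads. It is written in Python rather than Erlang, and it is not the API presented in the talk: the names DataflowVar, bind, read, spawn and run_pipeline are hypothetical, chosen only for this example. Readers block until a variable is bound, so the result does not depend on the order in which the threads are scheduled. If a variable is never bound, the reader never returns.

```python
import threading


class DataflowVar:
    """Hypothetical single-assignment variable: readers block until it is bound."""

    def __init__(self):
        self._bound = threading.Event()
        self._lock = threading.Lock()
        self._value = None

    def bind(self, value):
        # Binding twice is allowed only if the value is the same;
        # otherwise the result would depend on which writer ran first.
        with self._lock:
            if self._bound.is_set():
                if self._value != value:
                    raise ValueError("already bound to a different value")
                return
            self._value = value
            self._bound.set()

    def read(self):
        # Blocks until some thread binds the variable.
        self._bound.wait()
        return self._value


def spawn(fn):
    """Run fn concurrently in a daemon thread."""
    thread = threading.Thread(target=fn, daemon=True)
    thread.start()
    return thread


def run_pipeline(a_in, b_in):
    a, b, total = DataflowVar(), DataflowVar(), DataflowVar()
    # The consumer is started first on purpose: it waits for a and b.
    spawn(lambda: total.bind(a.read() + b.read()))
    spawn(lambda: b.bind(b_in * 2))
    spawn(lambda: a.bind(a_in + 1))
    return total.read()
```

Calling run_pipeline with the same arguments gives the same result on every run, whichever of the three threads happens to be scheduled first. The only shared state is the set of variables, and each of them can take at most one value.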