One of the vital software components of RAINBOW is the Distributed Data Storage and Sharing Service. Its role is to provide persistent and in-memory data storage to nodes scattered across the fog continuum, so that collaborating entities (i.e., analytics, orchestration, routing) have timely access to recently collected monitoring data. Data stored locally on fog nodes must be accessible with low latency. Data must also be replicated and partitioned across nodes to satisfy user-defined data quality constraints for reliability and fault tolerance. On-the-fly, real-time analysis is crucial for Edge Computing applications and streaming data. Scalability is equally critical in Edge Computing, as the number of network-connected devices can increase considerably.
Two distributed databases that could potentially fill that role are Redis and Apache Ignite. The basic properties of these two databases can be seen in the following table:

|Property|Redis|Apache Ignite|
|---|---|---|
|ACID compliance|Not always guaranteed|Full support for distributed ACID transactions|
|Cluster model|Leader-worker|All nodes are equal|
|Consistency|Not guaranteed|Yes (if persistence is enabled)|
|Data sharing|Only replicated and partitioned data|Supports replicated, partitioned and local data|

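To make the difference between partitioned and replicated data concrete, the following minimal Python sketch shows which nodes would hold a given key under each mode. It is a toy illustration only: the modulo-hash placement, the `backups` parameter and the function names are assumptions made for this example and do not reproduce the actual placement scheme of either Redis or Ignite.

```python
import hashlib


def partition_owner(key: str, nodes: list[str]) -> str:
    """Pick a single owner for `key` by hashing it onto the node list.

    Assumption: simple modulo hashing, used here only for illustration.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


def placement(key: str, nodes: list[str], mode: str, backups: int = 0) -> list[str]:
    """Return the nodes that store `key` (node names are assumed unique).

    "replicated":  every node keeps a full copy.
    "partitioned": one owner plus `backups` further nodes, taken in ring order.
    """
    if mode == "replicated":
        return list(nodes)
    if mode == "partitioned":
        start = nodes.index(partition_owner(key, nodes))
        copies = min(1 + backups, len(nodes))
        return [nodes[(start + i) % len(nodes)] for i in range(copies)]
    raise ValueError(f"unknown mode: {mode!r}")
```

With full replication any node can serve a read locally, at the cost of writing every update to all nodes; with partitioning each update touches only the owner and its backups, but a read may require a network hop.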
To assess their performance, we employed the well-known Yahoo! Cloud Serving Benchmark (YCSB) suite, a standard benchmarking framework for evaluating different cloud systems. YCSB offers six standard workloads that correspond to different application needs and help expose the performance trade-offs of different systems.
Since our purpose was to assess both the performance and the scalability of the Ignite and Redis distributed databases, we ran both databases in different settings on network configurations of 6, 10, 14 and 20 nodes, using in-memory configurations with data partitioning and with full replication. As our testbed, we used the Fogify fog computing emulator to rapidly model and configure the experiment scenarios and to deploy an emulated geo-distributed environment. Nodes were set up as Docker containers, and we restricted the (internal) network interface of each container to a bandwidth of 1000 Mbps and a network latency of 3 ms.
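The experiment space described above can be written down as a small scenario matrix. The sketch below is plain Python, not Fogify's own model format, and it assumes that every database was run with every node count and every data mode, which the text does not state explicitly; the class and field names are hypothetical.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Scenario:
    """One emulated deployment (field names are illustrative)."""
    database: str
    nodes: int
    data_mode: str
    bandwidth_mbps: int = 1000  # per-container bandwidth cap from the text
    latency_ms: int = 3         # per-container network latency from the text


DATABASES = ("ignite", "redis")
NODE_COUNTS = (6, 10, 14, 20)
DATA_MODES = ("partitioned", "replicated")


def build_scenarios() -> list[Scenario]:
    """Enumerate the assumed full cross-product of settings."""
    return [
        Scenario(db, n, mode)
        for db, n, mode in product(DATABASES, NODE_COUNTS, DATA_MODES)
    ]
```

Under the cross-product assumption this yields 16 scenarios, each of which would then be exercised with the selected YCSB workloads.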
Indicative results for a subset of the workloads are presented in the following two figures; the behavior is similar for the remaining workloads. The chosen workloads are A (update-heavy) and B (read-mostly), since a mix of frequent updates and reads is the access pattern the service is expected to handle on the RAINBOW platform.
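For reference, the YCSB core workload definitions give workload A a 50/50 split between reads and updates and workload B a 95/5 split. The short Python sketch below samples an operation sequence with these proportions. It only mimics the operation mix: YCSB itself is a Java tool that also controls key selection and record sizes, and the function name and `seed` parameter are assumptions made for this example.

```python
import random

# Read/update proportions of YCSB core workloads A and B.
WORKLOAD_MIX = {
    "A": {"read": 0.50, "update": 0.50},  # update-heavy
    "B": {"read": 0.95, "update": 0.05},  # read-mostly
}


def sample_operations(workload: str, count: int, seed: int = 0) -> list[str]:
    """Draw `count` operation types according to the workload's mix."""
    rng = random.Random(seed)  # seeded so that a run is repeatable
    mix = WORKLOAD_MIX[workload]
    return rng.choices(list(mix), weights=list(mix.values()), k=count)
```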
In most cases, Redis outperforms Ignite by at least a small margin. The Redis results also show less deviation than those of Ignite, as the respective throughput values are more closely packed. However, as the number of operations increases, Ignite appears to scale better than Redis, narrowing the performance gap. This trend suggests that Ignite may behave better in dynamic environments such as the RAINBOW ecosystem, where many operations are needed.
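The three observations above (a throughput gap, a difference in deviation, and a gap that shrinks with load) can be quantified with a few simple statistics. The following Python sketch shows one possible way to do so; the choice of metrics and the function names are assumptions for illustration and are not necessarily the analysis used to produce the figures.

```python
from statistics import mean, stdev


def relative_gap(redis_tput: float, ignite_tput: float) -> float:
    """Fraction by which Redis throughput exceeds Ignite throughput."""
    return (redis_tput - ignite_tput) / ignite_tput


def coefficient_of_variation(samples: list[float]) -> float:
    """Spread of repeated throughput measurements relative to their mean."""
    return stdev(samples) / mean(samples)


def gap_is_narrowing(redis_series: list[float], ignite_series: list[float]) -> bool:
    """True if the relative gap strictly shrinks from each load level to the next.

    Both series hold one throughput value per operation count, in increasing order.
    """
    gaps = [relative_gap(r, i) for r, i in zip(redis_series, ignite_series)]
    return all(later < earlier for earlier, later in zip(gaps, gaps[1:]))
```

A lower coefficient of variation corresponds to throughput values that are more closely packed, and a narrowing gap corresponds to the better scaling behavior attributed to Ignite.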
Of course, performance is only one of the factors that need to be taken into account in the final selection of a database. ACID compliance, persistence and consistency requirements are also important factors that may steer the selection of an appropriate database system.