RAINBOW’s Distributed Data Storage and Sharing Service provides persistent and in-memory data storage to nodes scattered across the fog continuum, ensuring timely access to the collected monitoring data by collaborating entities such as the Analytics service.
The storage needs to complete I/O requests without imposing heavy I/O overhead on the fog node. Furthermore, it must cope with the dynamic nature of the fog ecosystem, where failures and disconnections are frequent and nodes may enter or exit the cluster’s ad-hoc network at any time.
Moreover, data placement in such environments must consider not only fog node stability but also the user-defined and orchestrator queries that run over the fog nodes. A query’s latency can be strongly affected by the location of its input data, mainly due to low bandwidth, but also for privacy and security reasons. It is therefore crucial to carefully decide where to place and replicate input data in order to mitigate overheads that may arise during query execution.
To address these challenges of the fog continuum, RAINBOW’s Distributed Data Storage service offers dynamic data placement techniques and employs mechanisms different from those typically found in modern cloud-hosted storage solutions. To this end, it incorporates two dynamic data placement policies: the first is based on fog node stability metadata, while the second is driven by the analytic applications running on top of the distributed storage, taking into account their latency, data freshness, and quality objectives.
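To illustrate the general idea behind such a stability- and latency-aware placement policy, the following is a minimal sketch, not RAINBOW’s actual implementation: it assumes a hypothetical per-node stability score (e.g. derived from uptime and disconnection history) and an observed latency to the consuming node, combines them into a weighted placement score, and selects the top-scoring nodes as replica targets. All names, weights, and the 100 ms latency budget are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class FogNode:
    name: str
    stability: float   # hypothetical score in [0, 1], e.g. from uptime/disconnection history
    latency_ms: float  # observed latency from this node to the query's consumer

def placement_score(node: FogNode,
                    w_stability: float = 0.7,
                    w_latency: float = 0.3) -> float:
    """Weighted score: higher stability raises it, higher latency lowers it.

    Latency is normalised against an assumed 100 ms budget and capped at 1.
    """
    latency_penalty = min(node.latency_ms / 100.0, 1.0)
    return w_stability * node.stability + w_latency * (1.0 - latency_penalty)

def choose_replica_targets(nodes: list[FogNode], k: int = 2) -> list[FogNode]:
    # Place k replicas on the k highest-scoring candidate nodes.
    return sorted(nodes, key=placement_score, reverse=True)[:k]

if __name__ == "__main__":
    nodes = [
        FogNode("edge-a", stability=0.95, latency_ms=20),
        FogNode("edge-b", stability=0.40, latency_ms=5),   # fast but unstable
        FogNode("edge-c", stability=0.80, latency_ms=60),
    ]
    targets = choose_replica_targets(nodes, k=2)
    print([n.name for n in targets])  # → ['edge-a', 'edge-c']
```

In this toy scoring, the unstable but low-latency node `edge-b` loses out to the more stable `edge-c`, mirroring the text’s point that placement must weigh node stability alongside query latency; an analytics-driven variant would additionally fold data freshness and quality objectives into the score.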