Funding year 2018 / Scholarship Call #13 / Project ID: 3793 / Project: Data Management Strategies for Near Real-Time Edge Analytics
Current network infrastructures will struggle to support the future expansion of IoT data sources, given the increased energy consumption and high cost of data transmission. This calls for a novel Edge Data Management Framework (EDMFrame).
To support accurate, automated near real-time decisions, future IoT systems must deal with three issues: incomplete data, large data volumes, and the limited storage capacity at the network edge. One potential solution is a three-layer architecture model for efficient data management, called EDMFrame.
Architectural elements of EDMFrame
Figure 1 illustrates the EDMFrame architecture with its modules. The monitoring process of a targeted application can span three layers: the gathering layer, the edge layer and the cloud layer. Although the introduction of the edge layer moves many data processing activities away from the cloud layer, each layer still plays its own role in the data life cycle. To make the reference architecture concrete, consider a real-world IoT application: managing data produced by smart buildings.
Gathering layer takes care of sensor data collection
The gathering layer transmits measurements from IoT sources to the edge layer, which reduces communication costs and bandwidth usage and helps meet latency requirements. Gateways can aggregate sensor data and send them in an appropriate format to the monitoring component. In step (1), data are collected from smart buildings; in step (2), they are transferred to the edge layer.
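The gateway aggregation mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the function name `aggregate_readings`, the tuple layout `(timestamp_s, sensor_id, value)`, the 60-second window and the output record format are all hypothetical and are not specified by EDMFrame.

```python
from statistics import mean


def aggregate_readings(readings, window_s=60):
    """Average raw readings per sensor and per time window.

    `readings` is an iterable of (timestamp_s, sensor_id, value) tuples.
    The tuple layout and the window length are assumptions for illustration.
    """
    buckets = {}
    for ts, sensor_id, value in readings:
        # Align each reading to the start of its time window.
        window_start = int(ts // window_s) * window_s
        buckets.setdefault((sensor_id, window_start), []).append(value)

    # One compact record per sensor and window is sent on to the edge layer.
    return [
        {
            "sensor": sensor_id,
            "window_start": window_start,
            "mean": mean(values),
            "count": len(values),
        }
        for (sensor_id, window_start), values in sorted(buckets.items())
    ]


if __name__ == "__main__":
    raw = [(0, "t1", 20.0), (30, "t1", 22.0), (65, "t1", 24.0)]
    for record in aggregate_readings(raw):
        print(record)
```

Sending one record per window instead of every raw reading is one simple way a gateway could lower the transmission volume; the window length would in practice depend on the monitoring frequency the application requires.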
Edge layer takes care of efficient data handling and local storage
The edge layer manages data through the components of EDMFrame to perform accurate and timely analytics. It consists of the following modules:
Monitoring component. This component receives and analyses data to detect outliers and missing values. It can notify the mediator component about incomplete data, prepare data for recovery, and trigger control commands to IoT actuators based on local edge analytics. It can also derive other data characteristics useful for local analytics.
Specification list. Once a certain amount of data has been transmitted to the edge layer, user specifications are checked in step (3). The specification list includes application-dependent and user-defined information useful for both data recovery and edge storage management, for example the forecast horizon, monitoring frequency, accuracy threshold for prediction maintenance, and other conditions or rules.
Data recovery mechanism. The adaptive recovery process is performed in step (4). It receives data from the monitoring component and recovers multiple gaps over a series of recovery cycles. The output is a dataset without gaps and cleaned of outliers.
Limited storage. Edge storage has limited capacity, so it stores only relevant data and interacts with edge storage management, the mediator component and the local edge analytics processes.
Edge storage management. In step (5), the edge storage management mechanism maintains the limited storage, keeping only the data most relevant for near real-time decisions. It checks the available data, validates the specification list and implements the edge storage management phases. The available data are used in step (6) for local analytics, whose output is forwarded either to the storage or to the monitoring component, which sends commands to actuators in step (7).
Mediator component. The mediator manages recovery maps to support the data recovery mechanism. In step (8), the mediator component communicates with the cloud data repository and transfers the necessary range of data to or from the cloud. It can compress data to improve data transfer between the edge and cloud layers.
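The flow through the monitoring, recovery and storage modules can be sketched in a few lines. This is a simplified sketch, not the mechanisms used in EDMFrame: the median-absolute-deviation outlier test, linear interpolation as the gap-filling method, and a fixed-size buffer that keeps only the most recent values are all stand-in assumptions, and the names `find_outliers`, `recover` and `process_batch` are hypothetical.

```python
from collections import deque
from statistics import median


def find_outliers(values, k=5.0):
    """Monitoring step: indices whose distance from the median exceeds k * MAD."""
    present = [v for v in values if v is not None]
    if not present:
        return set()
    med = median(present)
    mad = median([abs(v - med) for v in present])
    if mad == 0:
        return set()
    return {
        i for i, v in enumerate(values)
        if v is not None and abs(v - med) > k * mad
    }


def recover(values):
    """Recovery step: fill None gaps by linear interpolation (a stand-in method).

    Gaps at either end copy the nearest known value.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    out = list(values)
    if not known:
        return out
    for i in range(len(out)):
        if out[i] is not None:
            continue
        left = max((j for j in known if j < i), default=None)
        right = min((j for j in known if j > i), default=None)
        if left is None:
            out[i] = values[right]
        elif right is None:
            out[i] = values[left]
        else:
            frac = (i - left) / (right - left)
            out[i] = values[left] + frac * (values[right] - values[left])
    return out


def process_batch(values, store):
    """Mask outliers as gaps, recover all gaps, then append to bounded storage."""
    outliers = find_outliers(values)
    masked = [None if i in outliers else v for i, v in enumerate(values)]
    cleaned = recover(masked)
    store.extend(cleaned)  # a deque with maxlen silently drops the oldest values
    return cleaned


if __name__ == "__main__":
    edge_store = deque(maxlen=5)  # assumed capacity, for illustration only
    batch = [20.0, 21.0, None, 23.0, 500.0, 25.0, 26.0]
    print(process_batch(batch, edge_store))
    print(list(edge_store))
```

Treating detected outliers as additional gaps lets a single recovery pass produce a dataset that is both complete and cleaned. Keeping only the newest values is the simplest possible retention rule; a real storage manager would presumably decide relevance from the specification list rather than from recency alone.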
Cloud layer takes care of resource-intensive data processing tasks
The cloud layer hosts the data repository that stores historical data collected from IoT systems. It performs compute-intensive big data batch analytics and delivers information and results based on the entire dataset.
Although some preliminary results for the data recovery and edge storage management mechanisms are shown in Lujic et al. (2018) and Lujic et al. (2017), respectively, several challenges remain, such as how to automate data management activities without third-party intervention, especially the timely recovery of incomplete data.
Lee, I., & Lee, K. (2015). The Internet of Things (IoT): Applications, investments, and challenges for enterprises. Business Horizons, 58(4), 431-440.
Lujic, I., De Maio, V., & Brandic, I. (2018, May). Adaptive Recovery of Incomplete Datasets for Edge Analytics. In 2018 IEEE 2nd International Conference on Fog and Edge Computing (ICFEC) (pp. 1-10). IEEE.
Lujic, I., De Maio, V., & Brandic, I. (2017, May). Efficient edge storage management based on near real-time forecasts. In 2017 IEEE 1st International Conference on Fog and Edge Computing (ICFEC) (pp. 21-30). IEEE.