Self-adaptive Analytics Placement - Use Case

Using Data Locality for Edge-Cloud Placements (20.03.2020)

Funding year 2018 / Scholarship Call #13 / Project ID: 3793 / Project: Data Management Strategies for Near Real-Time Edge Analytics

This blog post introduces a use case for self-adaptive placement of data analytics in the edge-cloud environment based on data locality. The aim is to place the analysis efficiently where the data is located.

Performing data analytics at the edge still means dealing with a rapidly growing amount of data, limited storage capacities and high failure probabilities of edge nodes. The growing volume of data produced at the edge hinders centralized data collection and the traditional, centralized way of performing analytics. As a result, datasets often no longer reside where they were initially produced. Datasets can be moved, migrated or replicated due to:

limited storage capacities;
edge node failure probabilities;
the need to meet certain service level objectives (e.g., availability and resilience to failures);
workload balancing.

This makes it difficult to track datasets and to deploy analytics to the right locations. Thus, data locality is one of the critical problems for on-demand analytics placement at the edge.

This problem appears in a typical application such as edge video analytics. Video analytics is often considered the killer app for edge computing. Object detection is a popular video analytics workflow in which video frames are analyzed by a model (usually a neural network) to produce a list of objects with varying degrees of confidence. Figure 1 shows the challenge of tracking a specific video dataset whose location can change over time. The motivating scenario is the search for a missing child who was captured on video frames initially stored at Location A at time 1. For reasons such as limited storage capacity (time 2) or edge node failure probabilities (time 3), the dataset can be migrated and replicated to new locations. Finally, at time 4, the query for detecting and locating the missing child has to be deployed on a dataset that is no longer stored at its initial location. Because the dataset of a specific camera can be in different locations at different points in time, users are prevented from executing queries in a timely and accurate manner.

Figure 1: A use case scenario to illustrate the problem of tracking datasets for video analytics placement on-demand.
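The timeline in Figure 1 can be made concrete with a small sketch. It is only an illustration under stated assumptions, not the mechanism used in the project: the `DatasetCatalog` class, its method names, the dataset id `camera-7` and the destination names `location-B` and `location-C` are all hypothetical. The idea is simply that every move or copy of a dataset is recorded, so that a later query can ask where the data lives now instead of assuming the original location.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class DatasetCatalog:
    """Hypothetical registry: dataset id -> nodes currently storing a copy."""

    locations: Dict[str, Set[str]] = field(default_factory=dict)

    def store(self, dataset: str, node: str) -> None:
        # Initial placement of a dataset on the node that produced it.
        self.locations.setdefault(dataset, set()).add(node)

    def migrate(self, dataset: str, src: str, dst: str) -> None:
        # Move: the copy disappears from src and appears on dst.
        nodes = self.locations.setdefault(dataset, set())
        nodes.discard(src)
        nodes.add(dst)

    def replicate(self, dataset: str, dst: str) -> None:
        # Copy: existing locations stay valid, dst is added.
        self.locations.setdefault(dataset, set()).add(dst)

    def locate(self, dataset: str) -> Set[str]:
        # Return a copy so callers cannot mutate the catalog by accident.
        return set(self.locations.get(dataset, set()))


catalog = DatasetCatalog()
catalog.store("camera-7", "location-A")                  # time 1: frames stored
catalog.migrate("camera-7", "location-A", "location-B")  # time 2: storage limit
catalog.replicate("camera-7", "location-C")              # time 3: failure risk
current = catalog.locate("camera-7")                     # time 4: query arrives
print(sorted(current))
```

In this sketch, a query issued at time 4 would be directed to one of the nodes returned by `locate` rather than to Location A, where the frames were originally stored.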

This problem also arises in other event-driven scenarios, such as finding missing pets, locating a suspect's vehicle, fraud analytics, post-crime investigation, and failure prevention in smart buildings and smart manufacturing. Automatically placing analytics at the right locations is one of the crucial factors that significantly affect critical decision-making.

Many edge nodes (e.g., micro data centers, edge gateways, smart devices) can execute analytics tasks such as object detection in video surveillance, predictive maintenance in smart manufacturing or medical data analysis in eHealth. To ensure that critical analytics tasks are placed at the right locations, we propose an approach called Self-adaptive Analytics Placement (SAP) based on data locality. The main ideas are: (i) enabling the tracking of datasets for accurate analytics placement; (ii) allowing on-the-fly adaptation and deployment of analytics tasks to nodes storing the datasets within a single cluster; (iii) introducing a prototype for using data locality in SAP across a multi-cluster environment such as a hybrid cloud.
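As a rough illustration of ideas (ii) and (iii), the following sketch picks a target node for an analytics task from the nodes known to hold a dataset. It is a minimal example under stated assumptions and not the SAP implementation: the function `place_analytics`, its parameters, and the node and cluster names are hypothetical, and the preference for the local cluster over a remote one is an assumed policy chosen here for illustration.

```python
from typing import Dict, Optional, Set


def place_analytics(
    dataset: str,
    replicas: Dict[str, Set[str]],   # dataset id -> nodes storing a copy
    node_cluster: Dict[str, str],    # node -> cluster it belongs to
    healthy: Set[str],               # nodes currently reachable
    local_cluster: str,              # cluster where the query was issued
) -> Optional[str]:
    """Return a healthy node that stores `dataset`, or None if there is none."""
    # Only nodes that hold the data and are reachable are candidates.
    # Sorting makes the choice deterministic for this example.
    candidates = sorted(n for n in replicas.get(dataset, set()) if n in healthy)

    # Assumed policy: prefer a replica inside the local cluster.
    for node in candidates:
        if node_cluster.get(node) == local_cluster:
            return node

    # Otherwise fall back to a replica in another cluster (multi-cluster case).
    return candidates[0] if candidates else None


replicas = {"camera-7": {"edge-2", "edge-3", "cloud-1"}}
node_cluster = {
    "edge-2": "edge-cluster",
    "edge-3": "edge-cluster",
    "cloud-1": "cloud-cluster",
}

# edge-2 has failed, so the task goes to the remaining local replica.
target = place_analytics(
    "camera-7", replicas, node_cluster, {"edge-3", "cloud-1"}, "edge-cluster"
)

# With no healthy local replica, the task follows the data to the cloud cluster.
fallback = place_analytics(
    "camera-7", replicas, node_cluster, {"cloud-1"}, "edge-cluster"
)
print(target, fallback)
```

The point of the sketch is that the analytics task follows the data: if the set of replicas or the set of healthy nodes changes, calling the function again yields a new target without moving the dataset itself.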

Ivan Lujic
