Fog Computing Context Analytics

Developments in measurement science have enhanced the quality of measurement procedures, improving their efficiency and increasing their accuracy [1]. The implementation of intelligent and adaptive distributed systems using the Internet of Things (IoT) has recently seen a rapid increase. The synergy of these developments requires an extensive use of sensing systems that can measure, analyze and provide accurate results in real time [2]. However, an increase in the number of sensors leads to challenges, such as the necessity for complex processing strategies to handle multivariate data [3]. It also provides opportunities for analyzing multivariate, sensor-generated data to infer contextual knowledge that can be useful in decision making. Multivariate context information can offset the ambiguities that arise when single-sensor data are used for inference, and can uncover relationships previously unknown or difficult to confirm through conventional approaches [4]. In this article, we introduce the basic methods of context classification and prediction and present their application to facilitate fog infrastructure management. A fog infrastructure typically includes cloud, stationary and mobile computing devices that host distributed software systems. Fog infrastructures utilize sophisticated algorithms and advanced software or hardware sensors to make intelligent decisions that increase quality of service, utilization of resources and scalability. Context classification and prediction uses multivariate, sensor-generated data to infer contextual knowledge about the status of a fog resource, which can facilitate decision making with respect to infrastructure management by supporting auto-scaling and infrastructure optimization strategies.
Context Classification and Prediction

Many machine learning methods and mathematical representations of fog infrastructures have been used to convert low-level sensor data into higher-level context, which can facilitate decision making. Indicative methods used for context classification include Hidden Markov Models, Markov Chains, Bayesian Networks, Nearest Neighbor, Time Series, Threshold-based Learning and Gaussian Mixture Models [5]. We increasingly observe ecosystems of fog devices, where multiple devices work together towards improved context sensing. For example, the Energy Efficient Mobile Sensing System (EEMSS) hierarchically orders sensors with respect to their energy consumption and activates high-resolution, power-hungry sensors only when low-consumption ones sense an interesting event [6]. The variety of existing classification methods and sensor data features can be overwhelming. In a model-driven machine learning approach, the designer of a context classification algorithm must carefully analyze the purpose and the intended use of context, select relevant data features and choose the applicable machine learning method. Context prediction, especially in the context of fog infrastructure management, has focused on energy-related predictions [7] as well as prediction of processing capability [8] and computation offloading [9]. Other works exploit multimodal data for context prediction; see, e.g., [10], whose data sets have served as a proving ground for a number of approaches to context prediction. Although we focus on how context analytics can facilitate fog infrastructure management, context analytics can benefit any distributed measurement system that employs dense or dispersed smart sensors, i.e., sensors with data processing capabilities. For example, estimation of the QoS of smart sensors can facilitate health assessment of the monitoring infrastructure of IoT [11] or Industry 4.0 applications [12].
Context in Fog Computing

Fog computing environments are dynamic and complex. Edge devices with different hardware and software configurations have different processing capabilities and performance. Factors such as the available RAM, the type of memory chips, the number of CPU cores, CPU frequency, type and generation, the current battery level (when a device lowers its CPU clock frequency in order to lower energy consumption) or the speed of storage devices (HD, SSD, Flash) may affect the QoS (e.g., total execution time) of applications and services in different ways. Fog computing is increasingly associated with Function as a Service (FaaS), a paradigm that allows the deployment of distributed application components without the complexity of building and maintaining the infrastructure typically associated with developing and launching an application. Building an application following this model is one way of achieving a "serverless" architecture, and it is typically used when building microservices applications. As part of a holistic Platform-as-a-Service that includes multi-cloud and fog resources, application functions or microservices, so-called "application fragments," can be deployed on private or public cloud resources and edge devices to bring processing capacity closer to data producers. Application fragments, when offloaded to edge devices, may have different processing requirements in terms of the number of CPU instructions, the volume of data retrieved from or stored to disk/flash, the volume of data retrieved from or stored to RAM, and the volume of data transferred over the network. So, the QoS characteristics, such as the fragment execution speed, depend on the edge device (i.e., its specific type and status) selected to execute the fragment and on the fragment's processing requirements. Moreover, a fog computing environment that contains edge devices may include multiple edge device types, such as different mobile phones, often new and unknown, which randomly enter and exit a processing topology (e.g.
when the remaining battery or the strength of the wireless signal is not sufficient). Considering the aforementioned characteristics of fog computing, with the term context we refer to any information (such as CPU or memory utilization, network type and traffic, battery state, software and hardware configuration, etc.) that can help to infer information about the current and future state of fog devices. Inferences may be deduced from a single device (for example, a device cannot handle a specific task because it does not have sufficient battery) or by combining information from multiple devices, either deterministically (e.g., there are more than 4 devices with 8 cores and low CPU utilization within a range of 100 m of a WiFi hotspot) or probabilistically (e.g., from historical information it can be inferred that a device with brightness level 90% and CPU utilization 60% will consume 5% of its battery in the next 30 minutes).

Fog Context Analytics

The proposed approach for fog context analytics and its supporting software tool, called FCA, aims to collect, analyze, classify and predict context that characterizes fog devices and to provide real-time context inference functionalities based on models constructed with machine learning methods. To illustrate the approach, we focus on predicting the QoS of deploying a certain application fragment on a certain edge device. FCA supports the estimation of the potential QoS by considering the previous and current device context along with the application fragment processing requirements. This allows real-time decision making with respect to fog infrastructure management, i.e., when and how to reconfigure the infrastructure to maintain QoS.
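The two inference styles above can be sketched in a few lines. This is a minimal illustration, not FCA's actual data model: the record fields (cores, cpu_util, hotspot_dist_m, brightness, battery_drop) and the thresholds are assumptions chosen to mirror the examples in the text.

```python
def deterministic_inference(devices):
    """Deterministic rule from the text: more than 4 devices with 8 cores
    and low CPU utilization within 100 m of a WiFi hotspot."""
    eligible = [d for d in devices
                if d["cores"] == 8
                and d["cpu_util"] < 0.2          # "low" threshold is an assumption
                and d["hotspot_dist_m"] <= 100]
    return len(eligible) > 4

def probabilistic_inference(history, brightness, cpu_util):
    """Probabilistic rule from the text: estimate the expected battery drop
    over the next 30 minutes from historical records with similar context."""
    similar = [h["battery_drop"] for h in history
               if abs(h["brightness"] - brightness) < 0.1
               and abs(h["cpu_util"] - cpu_util) < 0.1]
    if not similar:
        return None                              # no comparable history
    return sum(similar) / len(similar)
```

A fleet of five idle 8-core devices near a hotspot would satisfy the deterministic rule, while the probabilistic rule simply averages the battery drop of historically similar contexts.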
[Figure 1 shows the FCA processing logic: fragments to be deployed and hosting edge devices feed an initial data collection step (benchmarking and ANN training); after initialization, the sparse input data are passed to the probabilistic matrix factorization (mΜ‚_df = ΞΌ + b_d + b_f + q_f^T p_d, with latent factors p_d and q_f) whose predicted QoS metrics drive resource assignment, exclusion of the bottom k% of devices per fragment, optimization & scheduling, and reconfiguration, while monitoring observes metrics at run time. The factorization is controlled by the following user-defined or auto-tuned parameters:]

Metric Name: Execution_time_per_request
Factors: 2 (number of latent factors, integer)
Epochs: 20 (number of iterations, integer)
Regularization Term: 0.02 (regularization term for all SVD parameters, default 0.02)
Learning Rate: 0.005 (learning rate for all SVD parameters, default 0.005)
Factors to try: 2, 3, 4, 5, 6, 7, 110, 120, 140, 160 (comma-separated list of SVD factors to try)
Epochs to try: 20, 50, 100, 150 (comma-separated list of SVD epochs to try)
Regularization Terms to try: 0.02, 0.08, 0.1, 0.15 (comma-separated list of SVD regularization terms to try)
Learning Rates to try: 0.001, 0.003, 0.005, 0.008 (comma-separated list of SVD learning rates to try)
CV Splits: 3 (number of cross-validation K-fold splits)

Figure 1. Fog Context Analyzer Processing Logic.

Figure 1 outlines how FCA supports fog infrastructure management by facilitating run-time deployment decisions on a subset of the available fog devices based on context predictions, in our case QoS predictions such as requests served per second in stream-processing applications or execution time in batch-processing applications. To accomplish this, FCA performs an initial data collection: if there are historical records about previous or current deployments of fragments to specific edge devices (e.g., Fragment 3 deployed on Raspberry Pi 12), the observed QoS metrics are collected for each existing device-fragment combination.
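The initial data collection can be pictured as filling a sparse devices-by-fragments matrix of observed QoS metrics. The sketch below is our own minimal illustration (device and fragment names are made up); NaN marks the device-fragment combinations that were never observed.

```python
import numpy as np

def build_sparse_matrix(records, devices, fragments):
    """Arrange observed (device, fragment, metric) records into a sparse
    devices x fragments matrix, with NaN marking missing combinations."""
    m = np.full((len(devices), len(fragments)), np.nan)
    d_idx = {d: i for i, d in enumerate(devices)}
    f_idx = {f: j for j, f in enumerate(fragments)}
    for device, fragment, metric in records:
        m[d_idx[device], f_idx[fragment]] = metric
    return m
```

For example, a single record for Fragment "F1" on device "D1" yields a 2x2 matrix with one observed cell and three NaN cells awaiting prediction.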
Moreover, and if feasible, each edge device that may be registered as a potential hosting resource for fog applications accommodates the deployment of a reference fragment that, under a certain workload, provides benchmarking contextual values. In cases where this benchmarking process is not feasible, an Artificial Neural Network (ANN) can predict the benchmarking contextual values with high accuracy [13]. When there are sufficient data from past fragment executions on different fog devices, a set of static (e.g., CPU type, number of cores) and dynamic (e.g., screen brightness, battery level) observed context features can be used to train an ANN capable of inferring higher-level context values, such as the remaining battery lifetime or the expected fragment QoS metrics. The initial benchmarking contextual values set the background knowledge for the subsequent context predictions. FCA predicts the missing metric values by performing SVD on the sparse input data with SGD optimization [14], transferring and applying to our domain the collaborative filtering method described in [16]. SVD relies only on the observed metric values and does not need any other explicitly declared data features. Instead, it discovers the latent factors that contribute to the predicted QoS metric value. The input data are typically sparse because the number of device-fragment combinations for which we can obtain metrics, observed or predicted, is low compared to the number of all possible combinations. We use the SVD algorithm to derive the missing metrics based only on observed values with a probabilistic matrix-factorization method. A prerequisite for applying this method is an initialization phase (as seen in Figure 1) that provides the initial data for the SVD algorithm. Equation 1 shows that the expected metric value mΜ‚_df (e.g.
latency) for a specific fragment f executed by a specific device d is the sum of a constant (ΞΌ) that is common to all devices and fragments, a bias (b_d) that depends only on the characteristics of the specific device, a bias (b_f) that depends only on the characteristics of the specific fragment, and the product of two vectors that describe how the latent factors of the fragment (q_f) interact with the latent factors of the device (p_d). Note that the latent factors capture the contextual attributes that have a significant effect on the QoS of a certain fragment/edge device combination.

mΜ‚_df = ΞΌ + b_d + b_f + q_f^T p_d    (1)

For example, if we knew that the latent factors of a device were the CPU frequency (cpu) and the speed of the RAM chips (ram), and the latent factors of a fragment corresponded to the number of CPU cycles it consumes to process a request (cycles) and the amount of data it reads from the RAM (data), then the latency would depend on:

q = [cycles  data]^T,  p = [cpu  ram]^T    (2)

q^T p = cycles Β· cpu + data Β· ram    (3)

The estimation of the above parameters (ΞΌ, b_d, b_f, q_f, p_d) can be performed probabilistically, with the Stochastic Gradient Descent [15] or Alternating Least Squares methods, which minimize the regularized squared error of Equation 4:

βˆ‘_{m_df ∈ M_train} (m_df βˆ’ mΜ‚_df)Β² + Ξ» (b_fΒ² + b_dΒ² + β€–q_fβ€–Β² + β€–p_dβ€–Β²)    (4)

The SGD process is tuned by a set of hyper-parameters (i.e., the number of latent factors, the number of SGD epochs, the learning rate and the regularization term) that should either be defined during the initialization phase by the administrator of the system or be calculated through cross-validation by FCA's auto-tuning functionality. While SVD can calculate the values of the vectors p and q for each device and fragment, it cannot reveal the actual factor each dimension describes.
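A minimal sketch of how Equations (1) and (4) can be minimized with SGD, in the spirit of the collaborative filtering method of [16]. The hyper-parameter defaults mirror those listed for FCA, but the implementation details are our own illustration rather than FCA's actual code.

```python
import numpy as np

def fit_svd(m, n_factors=2, epochs=20, lr=0.005, reg=0.02, seed=0):
    """Factorize a sparse devices x fragments metric matrix (NaN = missing)
    by minimizing the regularized squared error of Equation (4) with SGD."""
    rng = np.random.default_rng(seed)
    n_dev, n_frag = m.shape
    observed = [(d, f) for d in range(n_dev) for f in range(n_frag)
                if not np.isnan(m[d, f])]
    mu = float(np.mean([m[d, f] for d, f in observed]))  # global constant
    b_d = np.zeros(n_dev)                                # device biases
    b_f = np.zeros(n_frag)                               # fragment biases
    p = rng.normal(0.0, 0.1, (n_dev, n_factors))         # device latent factors
    q = rng.normal(0.0, 0.1, (n_frag, n_factors))        # fragment latent factors
    for _ in range(epochs):
        for d, f in observed:
            # prediction error with respect to Equation (1)
            err = m[d, f] - (mu + b_d[d] + b_f[f] + q[f] @ p[d])
            b_d[d] += lr * (err - reg * b_d[d])
            b_f[f] += lr * (err - reg * b_f[f])
            p_old = p[d].copy()                          # update with pre-step value
            p[d] += lr * (err * q[f] - reg * p[d])
            q[f] += lr * (err * p_old - reg * q[f])
    # dense prediction of Equation (1) for every device-fragment combination
    return mu + b_d[:, None] + b_f[None, :] + p @ q.T
```

The returned dense matrix contains the predicted metric values mΜ‚_df for all combinations, including those that were NaN in the input.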
This is not a problem in many cases, where the predicted metric value is enough for decision making. The result of the probabilistic matrix-factorization method that FCA applies is a table with the predicted QoS metrics, as seen in Figure 1, i.e., estimated metric values for all possible fragment/edge device combinations. These metrics are used by FCA to support the assignment of resources according to the desired initial deployment or adaptation policy (e.g., rank the devices according to the predicted fragment QoS metric). Based on this ranking, FCA is able to exclude a configurable percentage, e.g., 30%, of the worst available edge devices per fragment. The short-listed edge devices can be relayed to an optimization library such as www.btrplace.org for further optimization of the deployment. This means that the variability space that an optimizer should examine according to a certain goal (e.g., minimize cost) is reduced by excluding devices that are expected to behave poorly with respect to QoS. To complete the overview of FCA, Figure 1 depicts at the bottom the monitoring functionality, which is used to acquire in real time sensor data (e.g., CPU utilization) that describe the low-level context of the fog infrastructure and to trigger the reconfiguration processes that guarantee QoS based on the context inferred by FCA.

Illustration & Evaluation

To illustrate our approach, we simulated a set of fog device types and a set of application fragments. Application fragments can be deployed on different devices and execute tasks. We assume that every device type and every fragment has a number of features that contribute to the QoS metric, which is the inferred context element we want to predict. A common proxy metric for QoS is the total execution time of an application fragment. In our example, we use two context features, i.e., raw data, to predict execution time: the processing capacity of the device and the processing requirements of the fragment.
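The ranking-based exclusion policy described above (rank devices per fragment by predicted QoS, then drop the bottom k% before handing the short list to an optimizer) can be sketched as follows, under the assumption that a lower predicted execution time is better.

```python
import numpy as np

def shortlist_devices(pred, exclude_pct=30):
    """pred: devices x fragments matrix of predicted execution times.
    Returns, per fragment, the indices of the devices kept after
    excluding the worst exclude_pct percent."""
    n_dev = pred.shape[0]
    n_excl = int(n_dev * exclude_pct / 100)
    shortlists = []
    for f in range(pred.shape[1]):
        order = np.argsort(pred[:, f])            # fastest devices first
        shortlists.append(order[:n_dev - n_excl].tolist())
    return shortlists
```

With 10 devices and the 30% default, each fragment keeps its 7 best-ranked devices; only these are passed on as the variability space for the optimizer.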
Accordingly, each device model is identified by two measurable features: RAM Speed (rs) and CPU Speed (cs). Each fragment is identified by two other features: CPU Instructions (ci) and RAM access instructions (ri). The time needed to run a task for a specific fragment-device combination can be computed with the formula:

TaskTime = rs Β· ri + cs Β· ci    (5)

If we have n devices and k fragments, the task times for all possible combinations can be computed by multiplying two matrices:

DT = D T^T = [ D_1Β·T_1  D_1Β·T_2  …  D_1Β·T_k
               D_2Β·T_1  D_2Β·T_2  …  D_2Β·T_k
               …
               D_nΒ·T_1  D_nΒ·T_2  …  D_nΒ·T_k ]    (6)

where D_i = [rs_i  cs_i] and T_j = [ri_j  ci_j]. Initially, we randomly generate the matrices D and T and compute the matrix DT. Then, based on DT, we generate a dataset that contains two devices per device type and assume that one fragment (the benchmark fragment) has been executed on all devices, while each device has executed one of the other fragments. Random noise with maximum value 1.5 is added to all fragment execution times except for the benchmark (Figure 2). Figure 2. Initial observed context metric values. This dataset corresponds to a fog infrastructure that contains 10 devices (named D1, D2, …, D10). These devices belong to 5 types (i.e., 2 devices per type, named NUC i7, Rpi0, Rpi3, Drone1, Drone2), and there are 4 application fragments (AudioCaptor, VideoTranscoder, FaceDetector, VideoTranscoder2). Based on this dataset, FCA predicts the execution times for all devices and fragments by probabilistically computing the SVD of the sparse matrix (Figure 3). Predicted values are used to adapt the processing infrastructure with a set of pre-defined policies: a) the worst 5 devices are excluded; b) each fragment can be executed on at most 3 different devices; c) no device executes two fragments. Figure 3. FCA simulation predicted values (iteration 1). Figure 4.
Iterations 2 and 3: Observed (left) vs. predicted values (right). Figure 5. Iteration 11: Observed (left) vs. predicted values (right). Figures 4 and 5 show the results before and after the next two simulation iterations and after a larger number of iterations, respectively. As the simulation proceeds, more observations are collected, and the predicted values are updated in each iteration. Figure 6 depicts the mean prediction error for each fragment across all devices and shows that it drops significantly during the 10 presented simulation iterations. This is expected, because FCA collects more observed values, which better reveal the latent factors that affect the execution time of each fragment on all possible devices. Figure 6. Mean Prediction Error for each fragment among all devices. After each simulation step, we execute the SVD auto-tuning functionality of the system. The auto-tuning algorithm searches for the hyper-parameters with the lowest error value by performing cross-validation; we use the Root Mean Square Error (RMSE) metric. Figure 7 shows the number of latent factors that the auto-tuning algorithm selected as optimal in each iteration. Figure 7. SVD factors (computed with cross-validation). As seen in Figure 7, the selected value for the SVD factors hyper-parameter was correct in 9 of the 11 iterations. Tuning the SVD hyper-parameters correctly is important to avoid wasting resources on the system that runs FCA, increasing the time needed to obtain the predicted metrics (which can delay the decision to adapt a processing topology), and producing predictions with large errors (e.g., due to over-fitting).
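The auto-tuning step can be sketched as a grid search over candidate latent-factor counts, scored by K-fold cross-validated RMSE on the observed entries. `fit_and_predict` below is a deliberately trivial stand-in (a mean baseline, ignoring the factor count) so that the sketch runs on its own; in FCA, the SGD factorization would take its place, and the candidate list mirrors the one in Figure 1.

```python
import numpy as np

def fit_and_predict(train, n_factors):
    """Hypothetical stand-in for the SGD factorization so the sketch is
    self-contained; it predicts the mean of the observed entries and
    ignores n_factors."""
    return np.full(train.shape, np.nanmean(train))

def autotune_factors(m, candidates=(2, 3, 4, 5, 6, 7), n_splits=3, seed=0):
    """Grid search over candidate latent-factor counts, scored by K-fold
    cross-validated RMSE on the observed entries of m (NaN = missing)."""
    rng = np.random.default_rng(seed)
    obs = np.argwhere(~np.isnan(m))
    rng.shuffle(obs)                          # random fold assignment
    folds = np.array_split(obs, n_splits)
    best_k, best_rmse = None, np.inf
    for k in candidates:
        sq_errors = []
        for i in range(n_splits):
            train = m.copy()
            for d, f in folds[i]:             # hold out one fold
                train[d, f] = np.nan
            pred = fit_and_predict(train, n_factors=k)
            sq_errors += [(pred[d, f] - m[d, f]) ** 2 for d, f in folds[i]]
        rmse = float(np.sqrt(np.mean(sq_errors)))
        if rmse < best_rmse:
            best_k, best_rmse = k, rmse
    return best_k, best_rmse
```

The same loop structure extends to the full grid (epochs, learning rates, regularization terms) by nesting over the other candidate lists.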
Conclusions

Context prediction, enabled by applying machine learning techniques on sensor measurements, is crucial for efficient operation and effective decision making in complex measurement systems and computing infrastructures such as fog topologies. As environmental concerns for green and low-power computing grow, the accurate classification and prediction of fog device context, an enabler of fog infrastructure management, becomes a necessity. Context prediction can support DevOps teams in making decisions that minimize resource utilization and power consumption while maintaining QoS. FCA enhances raw context data with inferred or predicted values of context features while avoiding machine-learning feature engineering when possible. In the context of the PrEstoCloud H2020 project, FCA has been used to accurately predict context in challenging use cases, which include the management of a fleet-management infrastructure based on data streams from GPS, on-board diagnostics and in-truck sensors; the optimization of computing resources that process and transmit live journalist video feeds; and the management of a complex surveillance infrastructure that includes both stationary (i.e., cameras) and mobile (i.e., drones) audiovisual transmitting devices. Context prediction can serve emerging, complex computing infrastructures that include sensor-enabled IoT, which are highly dynamic and otherwise require intense human supervision and intervention to maintain good operation. FCA can support infrastructure operators in cases where there is an increased demand for management automation, robustness and overall reduction of management complexity, in order to achieve good quality of service and serve the demands of different devices and applications.

References

[1] C. Dorn, S. Dasari, Y. Yang, C. Farrar, G. Kenyon, P. Welch and D. Mascarenas, "Efficient full-field vibration measurements and operational modal analysis using neuromorphic event-based imaging," J. Eng. Mechanics, vol. 144, no.
7, 2018. [2] B. Fenton, M. McGinnity and L. Maguire, "Fault diagnosis of electronic system using artificial intelligence," IEEE Instrum. Meas. Mag., vol. 5, no. 3, pp. 16-20, 2002. [3] A. Vanarse, A. Osseiran and A. Rassau, "Neuromorphic engineeringβ€”A paradigm shift for future IM technologies," IEEE Instrum. Meas. Mag., vol. 22, no. 2, pp. 4-9, 2019. [4] A. Puiatti, S. Mudda, S. Giordano and O. Mayora, "Smartphone-centred wearable sensors network for monitoring patients with bipolar disorder," in Proc. IEEE Engineering in Medicine and Biology Society (EMBC), pp. 3644-3647, 2011. [5] V. Pejovic and M. Musolesi, "Anticipatory mobile computing: A survey of the state of the art and research challenges," ACM Computing Surveys, vol. 47, no. 3, 2015. [6] Y. Wang, J. Lin, M. Annavaram, Q. A. Jacobson, J. Hong, B. Krishnamachari and N. Sadeh, "A framework of energy efficient mobile sensing for automatic user state recognition," in Proc. International Conference on Mobile Systems, Applications, and Services, ACM, pp. 179-192, 2009. [7] E. Peltonen, E. Lagerspetz, P. Nurmi and S. Tarkoma, "Constella: Crowdsourced system setting recommendations for mobile devices," Pervasive and Mobile Computing, vol. 26, pp. 71-90, 2016. [8] M. Tillenius, E. Larsson, R. Badia and X. Martorell, "Resource-aware task scheduling," ACM Transactions on Embedded Computing Systems, vol. 14, no. 5, pp. 180-199, 2015. [9] K. Akherfi, M. Gerndt and H. Harroud, "Mobile cloud computing for computation offloading: Issues and challenges," Applied Computing and Informatics, vol. 14, no. 1, pp. 1-16, 2016. [10] J. K. Laurila, D. Gatica-Perez, I. Aad, O. Bornet, T. M. Do, O. Dousse, J. Eberle and M. Miettinen, "The mobile data challenge: Big data for mobile computing research," 2012. [11] Y. Yang, M. Zhong, H. Yao, F. Yu, X. Fu and O. Postolache, "Internet of things for smart ports: Technologies and challenges," IEEE Instrumentation & Measurement Magazine, vol. 21, no. 1, pp.
34-43, February 2018, doi: 10.1109/MIM.2018.8278808. [12] F. Griffiths and M. Ooi, "The fourth industrial revolution - Industry 4.0 and IoT [Trends in Future I&M]," IEEE Instrumentation & Measurement Magazine, vol. 21, no. 6, pp. 29-43, December 2018, doi: 10.1109/MIM.2018.8573590. [13] N. Papageorgiou, D. Apostolou, Y. Verginadis, A. Tsagkaropoulos and G. Mentzas, "A situation detection mechanism for pervasive computing infrastructures," in Proc. IEEE International Conference on Information, Intelligence, Systems and Applications (IISA), pp. 1-8, 2018. [14] C. Delimitrou and C. Kozyrakis, "Quasar: resource-efficient and QoS-aware cluster management," ACM SIGARCH Computer Architecture News, vol. 42, no. 1, pp. 127-144, 2014. [15] J. Kiefer and J. Wolfowitz, "Stochastic estimation of the maximum of a regression function," The Annals of Mathematical Statistics, vol. 23, no. 3, pp. 462-466, 1952. [16] Y. Koren, R. Bell and C. Volinsky, "Matrix factorization techniques for recommender systems," Computer, vol. 42, pp. 30-37, 2009.

Bios

Nikos Papageorgiou holds a PhD in Electrical and Computer Engineering from the National Technical University of Athens, Greece. He is currently a researcher at the Institute of Communications and Computer Systems. His research interests include machine learning, cloud and fog computing. Giannis Verginadis holds a PhD in Electrical and Computer Engineering from the National Technical University of Athens, Greece. He is currently a senior researcher at the Institute of Communications and Computer Systems. His research interests include information systems, BPM, cloud and fog computing. Dimitris Apostolou, PhD, is Associate Professor in the Department of Informatics, University of Piraeus, Greece. His research interests focus on decision making, information systems, cloud and fog computing. Gregoris Mentzas, PhD, is Professor at the School of Electrical and Computer Engineering and Director of the Information Management Unit (IMU).
He serves as Director of the division of Industrial Electric Devices and Decision Systems. He holds a Ph.D. in Operations Research and Information Systems.