ECE Dissertations Spring 2021

Spring 2021 Dissertations

Li Li

Geometric Properties of the Gradient of Loss Functions in Discriminant Deep Neural Networks

Advisor: Dr. Miloš Doroslovački

March 15, 2021

ABSTRACT

Classification is an important and challenging problem in the field of machine learning (ML). In recent years, deep neural networks (DNNs) have become the rising stars for solving classification problems. DNNs marginally outperform the previous generation of ML approaches, and even humans in certain applications, owing to their capacity to learn representations of the input data automatically.

As a supervised learning approach, DNNs need to be trained to learn the parameters that map the input data to representations. The goal of training a discriminant DNN is to learn good representations that facilitate classification. Backpropagation establishes the learning rule in state-of-the-art DNNs: it updates the DNN parameters in proportion to the negative gradient of the loss function with respect to the parameters. The loss function therefore plays a key role in the training of DNNs; it determines the direction in which the parameters are updated and the form of the representations.

This thesis analyzes the geometric properties of the gradient of loss functions for discriminant DNNs. Based on these properties, a set of new loss functions is proposed to obtain better representations for classification. From an analysis of the cross-entropy loss function, the most popular loss function for discriminant DNNs, approximations of its gradient are proposed that overcome the vanishing gradient problem and accelerate the training of DNNs.
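As a point of reference, the textbook gradient of the softmax cross-entropy loss with respect to the logits is softmax(z) minus the one-hot label. The sketch below is only an illustration of that standard formula and of one gradient-descent step; it is not the approximation proposed in the thesis, and treating the logits themselves as the trainable parameters is a simplifying assumption. All function names are hypothetical.

```python
import math

def softmax(logits):
    # Subtract the maximum for numerical stability.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    # Negative log-probability assigned to the true class.
    return -math.log(softmax(logits)[label])

def cross_entropy_grad(logits, label):
    # Standard result: d(loss)/d(logit_k) = softmax_k - 1[k == label].
    probs = softmax(logits)
    return [p - (1.0 if k == label else 0.0) for k, p in enumerate(probs)]

def gradient_step(logits, label, lr=0.5):
    # Assumption: the logits are updated directly, standing in for the
    # network parameters; lr is an arbitrary illustrative step size.
    grad = cross_entropy_grad(logits, label)
    return [z - lr * g for z, g in zip(logits, grad)]

logits = [0.0, 0.0, 0.0]
label = 2
loss_before = cross_entropy(logits, label)
loss_after = cross_entropy(gradient_step(logits, label), label)

# For a sample that is already classified with high confidence,
# every gradient component is close to zero.
confident_grad = cross_entropy_grad([0.0, 0.0, 20.0], label)
```

In this toy setting the gradient shrinks toward zero once the correct class dominates, which is one way a cross-entropy gradient can become very small; whether this is the exact vanishing-gradient setting the thesis addresses is not stated in the abstract.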

Adversarial examples are instances with small, intentional feature perturbations that cause a machine learning model to make false predictions. Vulnerability to adversarial examples is one of the issues that limit the application of DNNs. By analyzing the geometric properties of the decision boundaries in the representation space, a new approach to generating adversarial examples is proposed that aims to move the representation monotonically closer to the decision boundaries. By using the approximation of the gradient of the cross-entropy loss function, a new form of representations that lie far from the decision boundaries can be obtained, which makes DNNs more robust against adversarial examples.

Fall 2020 Dissertations

Hongyu Fang

Efficient Defense against Covert and Side Channel Attack on Multi-core Processor Using Signal Processing Techniques

Advisor: Dr. Miloš Doroslovački

December 8, 2020

ABSTRACT

Modern application workloads are mostly deployed on cloud computing platforms. Illicit communication across user domains is prohibited by the operating system (OS) or virtual machine manager (VMM) to prevent information leakage. To bypass the software-level isolation maintained by the OS or VMM, adversaries turn to hardware for information exfiltration. Various works demonstrate that shared hardware can be exploited to leak secrets through hardware covert/side channels. This type of information leakage is hard to eliminate with traditional software-based solutions because these resources are accessed frequently and existing hardware provides limited information on the related hardware events. Frequently accessed microarchitectural components such as caches and Graphics Processing Units (GPUs) provide large attack surfaces for hardware covert/side channels, since they are shared among a large number of processes to boost performance. Simply disabling shared microarchitectural components would cause severe performance degradation. In this work, we explore efficient and robust designs to defeat adversaries that exploit shared microarchitectural components, which are critical to the performance of computer systems yet vulnerable to hardware side/covert channel attacks.

A cache timing channel attack occurs when a spy process infers secrets of another process by covertly observing its cache access pattern. Cache timing channels can be combined with speculative instructions to launch powerful attacks that enable the spy to read secret values of the kernel or of other processes. Software layers in the computing stack cannot fully eliminate cache timing channels, since caches are typically shared between multiple processes. We propose three frameworks, Prefetch-guard, Reuse-trap, and COTSKnight, to identify and obfuscate cache timing channels. Prefetch-guard detects cache timing channels by analyzing the cache access patterns of applications. Reuse-trap repurposes a performance metric, reuse distance, to identify advanced cache timing channels that try to evade detection. Both frameworks leverage the existing hardware prefetcher to obfuscate cache timing channels through noise injection.

Experimental results show that the spy processes suffer a 50% bit error rate on average, which makes it difficult or impossible for adversaries to receive any information. COTSKnight repurposes existing hardware mechanisms, namely Cache Allocation Technology (CAT) and Cache Monitoring Technology (CMT), to disband cache timing channels on commercial off-the-shelf processors. Experimental results show that COTSKnight mitigates adversaries with less than 5% overhead on benign workloads.
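One way to see why a 50% bit error rate is so damaging to the spy: if the covert channel is modeled as a binary symmetric channel with independent bit flips (a modeling assumption made here for illustration, not a claim from the dissertation), its capacity is 1 - H(p), which drops to zero at p = 0.5. A minimal sketch with hypothetical function names:

```python
import math

def binary_entropy(p):
    # H(p) in bits; defined as 0 at the endpoints.
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(bit_error_rate):
    # Capacity of a binary symmetric channel, in bits per channel use.
    return 1.0 - binary_entropy(bit_error_rate)

# Capacity falls from 1 bit per use (no errors) to 0 (errors on half the bits).
for ber in (0.0, 0.1, 0.5):
    print(f"BER={ber:.1f}  capacity={bsc_capacity(ber):.3f} bits/use")
```

Under this model, received bits at a 50% error rate are statistically independent of the transmitted bits, so no coding scheme can recover the message.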

Recent work demonstrated that modern GPUs do not erase the remnant data left in GPU memory by previous applications before an OS context switch. A spy process can steal the sensitive data of the previous application by allocating a large memory region and dumping the data within the region before it is overwritten. Initializing every memory page on each context switch would slow down the GPU, which defeats its original purpose of boosting the performance of computer systems. We propose EraseMe, an efficient framework that intelligently removes the memory pages with the highest information entropy. Preliminary experimental results show that EraseMe can increase the difficulty for the attacker by 10x.
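The abstract does not say how EraseMe estimates information entropy or selects pages. Purely as an illustration of the idea of ranking pages by entropy, the sketch below computes the byte-level Shannon entropy of each page and orders pages from highest to lowest. The 4096-byte page size, the byte-frequency estimator, and all names are assumptions.

```python
import math
from collections import Counter

PAGE_SIZE = 4096  # assumed page size in bytes, for illustration only

def byte_entropy(page: bytes) -> float:
    # Shannon entropy of the byte-value distribution, in bits per byte (0 to 8).
    if not page:
        return 0.0
    n = len(page)
    counts = Counter(page)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def rank_pages_by_entropy(pages):
    # Page indices ordered from highest to lowest entropy.
    return sorted(range(len(pages)),
                  key=lambda i: byte_entropy(pages[i]),
                  reverse=True)

zero_page = bytes(PAGE_SIZE)                              # all zeros
text_page = (b"hello gpu " * 410)[:PAGE_SIZE]             # repetitive text
dense_page = bytes(i % 256 for i in range(PAGE_SIZE))     # every byte value equally often

pages = [zero_page, text_page, dense_page]
order = rank_pages_by_entropy(pages)
```

Here the page with uniformly distributed byte values ranks first and the all-zero page ranks last, so a budget-limited eraser following this heuristic would clear the densest pages first.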

The frequency of adversarial activity in side channels can undermine prior detection frameworks, which sample hardware statistics at a predetermined frequency. When the frequency of the adversary does not match the frequency of the detector, it is difficult or impossible to extract the suspicious patterns of the side channel. To provide robust protection against various implementations of side channels, we propose SC-K9, a detection framework that can smartly tune and synchronize its sampling frequency of microarchitectural events based on the adversarial activity. The synchronization increases detection accuracy, especially when the attack frequency is unknown. We incorporate a three-phase model into our framework design to capture the victim's activity in between the setup and observation phases of the adversary (spy). SC-K9 synchronizes itself with the side channel by tracking the repetitive critical events in side channels and recording statistics between two consecutive critical events. We illustrate our design and demonstrate its effectiveness in identifying notorious side channels studied in recent work. Our experimental results show that SC-K9 can effectively spot adversaries with different transmission frequencies and incurs a very low rate of false alarms among benign workloads.

Yimeng Wang

Data-Driven Online Network Optimization through Reinforcement Learning

Advisor: Dr. Tian Lan

December 8, 2020

ABSTRACT

In the past decade, the Internet has expanded and developed rapidly. As more users gain access to the Internet, many new applications have appeared to fulfill the ever-growing needs of users, from personal entertainment to the world economy. These new applications continuously bring challenges to the whole Internet ecosystem in terms of computation resources, reliability, latency, energy, etc.

Various problems have been formulated to address these challenges, including data caching, data rescheduling/prefetching, service placement, etc. To solve each of these problems, existing works use a variety of techniques to optimize different aspects of network performance. Traditional optimization methods rely on knowledge of the problem model. Given the mathematical model, the optimal decision can be chosen at each system state to guarantee the maximum reward. For simple optimization objectives that can be easily calculated from the mathematical model, model-based methods have proved efficient in many research fields. However, in real-world network optimization, as problem complexity grows (e.g., when the optimization objective is jointly decided by multiple system metrics), the time and space needed to compute the result can become extremely large. This motivates the use of novel model-free techniques, such as machine learning.

This dissertation aims to solve optimization problems for networked applications using model-free methods; in other words, the optimizer does not require full information on the problem models. This model-free optimization task can be addressed by a state-of-the-art solution, Deep Reinforcement Learning (DRL). We summarize the advantages of DRL as follows: (i) it explores the state transitions automatically and improves its own decision making, so an accurate problem model is not required; (ii) by using neural networks, it can handle vast system state spaces, so complex problems involving multiple state variables can be optimized easily; (iii) with a trained neural network, DRL consumes a fixed amount of time and storage space, which ensures that it scales across problem applications.
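To make the model-free idea concrete, here is a minimal tabular Q-learning sketch on a hypothetical five-state chain. It is not the dissertation's method, which uses deep neural networks on network applications; the environment, hyperparameters, and names are all assumptions. The learner never inspects the transition rule inside `step`; it only observes the next state and reward returned by it.

```python
import random

N_STATES = 5        # hypothetical chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)    # 0 = move left, 1 = move right

def step(state, action):
    # Environment dynamics, treated as a black box by the learner.
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)              # explore
            else:
                best = max(q[state])                      # exploit, random tie-break
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            nxt, reward, done = step(state, action)
            # Bootstrapped target; no bootstrap term from the terminal state.
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
    return q

def greedy_policy(q):
    # Best action in each non-terminal state.
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]

q_table = q_learning()
policy = greedy_policy(q_table)
```

A table works here only because the state space is tiny. Replacing the table with a neural network is what lets DRL handle the large state spaces described in point (ii).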

In this dissertation, we use various methods to optimize the performance of different network applications, including mobile advertisements, video streaming, and service tree placement. The optimization targets include energy consumption, data cache hit ratio, data stream processing stalls, multi-user quality of experience, etc. We model the problems as decision-making processes, not limited to Markov Decision Processes (MDPs), so that DRL can be applied. Because neural networks have limited output layer sizes, we develop novel methods to break down large action spaces so that the problems remain scalable. When each optimization problem was implemented on our purpose-built testbeds, the proposed DRL optimizers showed great potential and outperformed existing baseline decision-making policies.

Shiyuan Wang

Distributed Intelligence for Online Situational Awareness and Resilience in Power Grids

Advisor: Dr. Payman Dehghanian

August 11, 2020

ABSTRACT

Electric power grids constantly confront potential fast- and slow-dynamic disruptions, including unpredictable faults, weather-driven disasters, malicious cybersecurity attacks, and load variations, among others. With the growing demand to deliver higher-quality electricity to end-use customers and mission-critical systems and services, there is an urgent need to enhance the resilience and operational endurance of the power delivery infrastructure against disruptive events and to reduce and mitigate the associated risks. This calls for fundamental advances in new, fast, and efficient analytical frameworks for online situational awareness in power grids that can accurately measure, effectively monitor, detect, adapt, and respond to a wide range of threats.

We first propose an inclusive next-generation smart sensor technology embedded with novel and sophisticated data-driven analytics for online surveillance and situational awareness in power grids. The proposed analytics take electrical signals as input and unlock the full potential of advanced signal processing and machine learning for real-time pattern recognition, event detection, and classification. A robust measurement mechanism housed within the proposed sensor technology is triggered by a detected event and guides the adaptive selection of the best-fit and most accurate synchrophasor estimation algorithm at all times. By embedding such analytics within the sensors, closer to where the data is generated, the proposed distributed intelligence mechanism mitigates the risks of communication failures and latencies, as well as malicious cyber threats, which would otherwise compromise the trustworthiness of the end-use applications in distant control centers. Our experiments demonstrate that the introduced sensor technology achieves promising event detection and classification accuracy with improved measurement quality, collectively resulting in enhanced online situational awareness in power grids. In addition, the performance of the proposed smart sensor analytics is tested and verified in several event detection applications in power grids.

Reid McCargar

Out-of-Plane Enhancement in a Discrete Random Halfspace

Advisor: Dr. Roger Lang

November 23, 2020

ABSTRACT

Although widely used to model wave propagation in random media, radiative transfer theory is insufficient when waves propagating in different directions exhibit strong correlation and comprise an appreciable portion of the total field. The literature to date has almost exclusively regarded discrepancies between radiative transfer and wave theory, termed enhancements, as a backscatter phenomenon, which is the case in an unbounded random medium. It will be shown that a strong enhancement can also appear outside the plane of incidence in a plane-stratified random medium illuminated by a plane wave. Many practical remote sensing problems conform closely to this geometry; thus, the identification of an enhancement that is not confined to backscatter is significant for a multitude of bistatic sensing applications. The enhancement is examined for both electromagnetic and acoustic waves using the Foldy-Lax and distorted Born approximations in conjunction with the two-variable perturbation method.