Tutorial: Adaptive and Flexible Model-Based AI for Wireless Systems (Part 1 of 2)
Nir Shlezinger, Sangwoo Park, Tomer Raviv, and Osvaldo Simeone
Wireless communication technologies are subject to escalating demands for connectivity, latency, and throughput. To facilitate meeting these performance requirements, emerging technologies such as mmWave and THz communication, holographic MIMO, spectrum sharing, and RISs are currently being investigated. While these technologies may support desired performance levels, they also introduce substantial design and operating complexity. For instance, holographic MIMO hardware is likely to introduce non-linearities on transmission and reception; the presence of RISs complicates channel estimation; and classical communication models may no longer apply in novel settings such as the mmWave and THz spectrum, due to violations of far-field assumptions and lossy propagation. These considerations notably affect transceiver design.
Traditional transceiver processing design is model-based, relying on simplified channel models, which may no longer be adequate to meet the requirements of next-generation wireless systems. The rise of deep learning as an enabling technology for AI has revolutionized various disciplines, including computer vision and natural language processing (NLP). The ability of deep neural networks (DNNs) to learn mappings from data has spurred growing interest in their use for transceiver design. DNN-aided transceivers can succeed where classical algorithms may fail. They can learn a detection function in scenarios that have no well-established physics-based mathematical model, a situation known as model-deficit; or when the model is too complex to give rise to tractable and efficient model-based algorithms, a situation known as algorithm-deficit.
Despite the promise of DNN-aided transceivers, several core challenges arise from the fundamental differences between wireless communications and traditional AI domains such as computer vision and NLP. The first challenge is attributed to the nature of the devices employed in communication systems. Wireless communication transceivers are highly constrained in terms of their compute and power resources, while deep learning inherently relies on the availability of powerful devices, e.g., high-performance computing servers. A second challenge stems from the nature of the wireless communication domain. Communication channels are dynamic, implying that the task, dictated by the data distribution, changes over time. This makes the standard pipeline of data collection, annotation, and training highly challenging. Specifically, DNNs rely on (typically labeled) data sets to learn from underlying data distributions that are unknown but stationary. This is not the case for wireless transceivers, whose processing task depends on the time-varying channel, which restricts the size of the training data set representing the task. These challenges imply that successfully applying AI to transceiver design requires deviating from conventional deep learning approaches. To this end, there is a need to develop communication-oriented AI techniques that not only perform well for a given channel, but are also light-weight, interpretable, flexible, and adaptive.
In the proposed tutorial, we shall present in a pedagogical fashion the leading approaches for designing practical and effective deep transceivers that address the specific limitations imposed by the use of data- and resource-constrained wireless devices and by the dynamic nature of the communication channel. We advocate that AI-based wireless transceiver design requires revisiting the three main pillars of AI, namely, (i) the architecture of AI models; (ii) the data used to train AI models; and (iii) the training algorithm that optimizes the AI model for generalization, i.e., to maximize performance outside the training set (either on the same distribution or on a completely new one). For each of these AI pillars, we survey candidate approaches from the recent literature. We first discuss how to design light-weight trainable architectures via model-based deep learning. This methodology hinges on the principled incorporation of model-based processing, obtained from domain knowledge on optimized communication algorithms, within AI architectures. Next, we investigate how labeled data can be obtained without impairing spectral efficiency, i.e., without increasing the pilot overhead. We show how transceivers can generate labeled data by self-supervision, aided by existing communication algorithms; and how they may further enrich data sets via data augmentation techniques tailored for such data. We then cover training algorithms designed to meet requirements in terms of efficiency, reliability, and robust adaptation of wireless communication systems, avoiding overfitting from limited training data while limiting training time. These methods include communication-specific meta-learning as well as generalized Bayesian learning and modular learning.
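To make the idea of self-supervision aided by existing communication algorithms concrete, the following is a minimal sketch of one possible codeword-level scheme, not the tutorial's own method. It assumes a toy repetition code with a single even-parity bit standing in for a real channel code and integrity check, and BPSK with the mapping 0 -> +1, 1 -> -1. All function names (`encode`, `decode`, `self_supervised_labels`, `sign_detector`) are hypothetical. The receiver detects a block, decodes it, and, if the integrity check passes, re-encodes the decoded message to obtain labels for every received sample, including samples the detector got wrong.

```python
import numpy as np

REP = 3  # repetition factor of the toy code (illustrative assumption)

def encode(message_bits):
    """Append an even-parity bit, then repeat every bit REP times."""
    bits = np.asarray(message_bits, dtype=int)
    with_parity = np.append(bits, bits.sum() % 2)
    return np.repeat(with_parity, REP)

def decode(coded_hard_bits):
    """Majority-vote decoding; returns (message_bits, parity_ok)."""
    groups = np.asarray(coded_hard_bits, dtype=int).reshape(-1, REP)
    decided = (groups.sum(axis=1) > REP // 2).astype(int)
    message, parity = decided[:-1], decided[-1]
    return message, bool(message.sum() % 2 == parity)

def self_supervised_labels(received, detector):
    """Return re-encoded coded bits as labels, or None if the check fails."""
    hard_bits = detector(received)
    message, parity_ok = decode(hard_bits)
    if not parity_ok:
        return None  # block is not trusted, so it yields no training data
    return encode(message)

def sign_detector(received):
    """Stand-in detector: BPSK hard decision (0 -> +1, 1 -> -1)."""
    return (np.asarray(received) < 0).astype(int)

# One block with a single corrupted sample, and no pilots involved.
true_bits = encode([1, 0, 1, 1])
received = 1.0 - 2.0 * true_bits
received[4] *= -1.0  # the channel flips one sample
labels = self_supervised_labels(received, sign_detector)
```

In this sketch, the pairs `(received, labels)` could then serve as additional training data for a trainable detector in place of `sign_detector`. A block whose integrity check fails is simply discarded, which is a design choice made here for simplicity.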
Tutorial outline:
Introduction and motivation
Dramatic success of deep learning
Gains of deep learning for wireless communications and sensing
Overcoming model deficiency
Overcoming algorithm deficiency
Applications
The fundamental differences between wireless technologies and conventional AI domains and their associated challenges
Nature of the devices
Nature of the domain
The need for AI that is light-weight, flexible, adaptive, and interpretable
Tutorial goal and outline
Deep Learning Aided Systems in Dynamic Environments
System model and main running example of deep learning aided receivers
Overview of existing approaches for handling dynamic tasks
Joint learning
Estimated environment parameters as input
Online learning
Pros and cons of each approach: when and why should AI-aided systems be trained on device?
Paradigm shift in AI needed to enable such operations:
Go beyond design of parametric models
Holistic treatment of machine learning algorithms:
Architecture
Data
Training
Architecture:
The family of mappings one can learn
From black box highly-parameterized architectures to light-weight interpretable machine learning systems via domain knowledge
Model-based deep learning methodologies
Deep unfolding and its forms:
Learned hyperparameters
Learned objective
DNN conversion
DNN-aided inference
Issues for future research
Data:
Data for learning the task under the current environment
From few pilots to large labeled data sets
Self-supervision:
Codeword-level
Decision-level
Active learning
Data augmentation
Complete data enrichment pipeline
Issues for future research
Training:
Tuning parametric architecture from data
Train rapidly with limited data, possibly exploiting model-based architectures
Deciding when to train using concept drift
Meta-learning:
Gradient-based meta-learning
Hypernetwork-based meta-learning
Bayesian learning:
End-to-end Bayesian learning
Model-based aware Bayesian learning
Continual Bayesian learning
Modular learning for model-based deep architectures
Issues for future research
Summary:
Additional aspects of federated learning not discussed in this tutorial
Hardware-aware and power-aware AI
Collaborative flexible AI for mobile wireless devices
Conclusions