Approximation Error Backtracking for Q-function in Scalable Reinforcement Learning with Tree Dependence Structure
Yuzi Yan (Tsinghua University); Yu Dong (Tsinghua University); Kai Ma (Tsinghua University); Yuan Shen (Tsinghua University)
SPS
We apply the exponential decay property from scalable RL theory to the specific scenario in which the network structure is a tree, and use Kullback-Leibler (KL) divergence to analyze how approximation error propagates along the structure over time, in order to quantify its backtracking result. We find that most of the approximation error originates from inaccurate estimation of the state of the source nodes (the root in Top-Down mode and the leaves in Bottom-Up mode), and that this error can be largely recovered by establishing multi-hop communication links.
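As a rough illustration of the two ideas in the abstract, the sketch below measures per-node approximation error with KL divergence and models its propagation along a tree under an exponential decay assumption. The decay factor `c` and the helper names (`kl_divergence`, `propagated_error`) are illustrative assumptions, not quantities or APIs from the paper.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as probability lists.

    Used here as the measure of approximation error between a node's
    estimated state distribution q and the true distribution p.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def propagated_error(source_error, hops, c=0.5):
    """Error contribution at a node `hops` links away from the source node.

    The per-hop decay factor c in (0, 1) is a hypothetical stand-in for
    the exponential decay property of scalable RL; under it, error from
    the source node (root in Top-Down mode, leaves in Bottom-Up mode)
    dominates, and distant nodes contribute little.
    """
    return source_error * (c ** hops)

# A perfectly estimated node contributes zero KL error.
exact = kl_divergence([0.5, 0.5], [0.5, 0.5])

# Error injected at the source shrinks geometrically with hop distance.
near = propagated_error(1.0, hops=1)
far = propagated_error(1.0, hops=3)
```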