Approximation Error Backtracking for Q-function in Scalable Reinforcement Learning with Tree Dependence Structure
Yuzi Yan (Tsinghua University); Yu Dong (Tsinghua University); Kai Ma (Tsinghua University); Yuan Shen (Tsinghua University)
SPS
We apply the exponential decay property from scalable RL theory to the specific scenario in which the network structure is a tree, and use Kullback-Leibler (KL) divergence to analyze how approximation error propagates along the structure over time, in order to quantify its backtracking result. We find that most of the approximation error originates from inaccurate estimation of the state of the source nodes (the root in Top-Down mode and the leaves in Bottom-Up mode), and that this error can be largely recovered by establishing multi-hop communication links.
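As a rough illustration of the two ideas in the abstract, the sketch below measures per-node approximation error with KL divergence and models its propagation along a tree under an exponential decay assumption. The decay factor `c` and the helper names (`kl_divergence`, `propagated_error`) are illustrative assumptions, not quantities or APIs from the paper.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as probability lists.

    Used here as the measure of approximation error between a node's
    estimated state distribution q and the true distribution p.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def propagated_error(source_error, hops, c=0.5):
    """Error contribution at a node `hops` links away from the source node.

    The per-hop decay factor c in (0, 1) is a hypothetical stand-in for
    the exponential decay property of scalable RL; under it, error from
    the source node (root in Top-Down mode, leaves in Bottom-Up mode)
    dominates, and distant nodes contribute little.
    """
    return source_error * (c ** hops)

# A perfectly estimated node contributes zero KL error.
exact = kl_divergence([0.5, 0.5], [0.5, 0.5])

# Error injected at the source shrinks geometrically with hop distance.
near = propagated_error(1.0, hops=1)
far = propagated_error(1.0, hops=3)
```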