Reinforcement learning (RL) has shown potential for the optimal control of heating, ventilation, and air conditioning (HVAC) systems. Although RL-based building control has received extensive research attention in recent years, there …