Project overview
This project uses interactive reinforcement learning to generate a policy that matches the user's preferences. The objective is to resolve the conflict between thermal comfort and electricity cost. We also take room occupancy into account: we forecast the probability of future occupancy from historical occupancy patterns, and use the GPS signal and direction of movement from the user's mobile phone to automatically correct the occupancy schedule. The project concentrates on continuous learning: we investigate appropriate ways for humans and machines to collaborate, and use that collaboration to reshape the reward function or the policy.
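
To make the comfort-versus-cost trade-off concrete, the following is a minimal sketch and not the project's actual implementation. It assumes a linear reward that penalises thermal discomfort (scaled by the forecast occupancy probability) and electricity cost, an occupancy forecast computed as a simple historical frequency per hour, and user feedback that nudges the reward weights. All names (`RewardWeights`, `occupancy_probability`, `reward`, `apply_feedback`), the feedback labels, the default weights, and the data layout are hypothetical. The GPS-based schedule correction is not covered here.

```python
from dataclasses import dataclass


@dataclass
class RewardWeights:
    # Hypothetical default weights; in practice they would be learned from feedback.
    comfort: float = 1.0  # weight on thermal discomfort
    cost: float = 0.5     # weight on electricity cost


def occupancy_probability(history, hour):
    """Fraction of past days on which the room was occupied at `hour`.

    Assumed format: `history` is a list of days, each a list of 24 booleans.
    """
    if not history:
        return 0.0
    return sum(1 for day in history if day[hour]) / len(history)


def reward(indoor_temp, preferred_temp, energy_kwh, price_per_kwh,
           p_occupied, weights):
    """Negative weighted sum of expected discomfort and electricity cost."""
    discomfort = abs(indoor_temp - preferred_temp)
    cost = energy_kwh * price_per_kwh
    # Discomfort only matters to the extent that someone is likely to be present.
    return -(weights.comfort * p_occupied * discomfort + weights.cost * cost)


def apply_feedback(weights, feedback, step=0.1):
    """Reshape the reward by nudging one weight according to user feedback.

    `feedback` is assumed to be 'too_uncomfortable' or 'too_expensive';
    anything else leaves the weights unchanged.
    """
    if feedback == "too_uncomfortable":
        return RewardWeights(weights.comfort + step, weights.cost)
    if feedback == "too_expensive":
        return RewardWeights(weights.comfort, weights.cost + step)
    return weights
```

In this sketch, a user who reports that the room is too expensive to condition raises the cost weight, so the same control action earns a lower reward afterwards and the learned policy shifts toward saving energy.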