Oasis distinguishes between Online Adaptation (adjusting the Digital Twin parameters as seasons change) and Policy Learning (figuring out the best control strategy).
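To make the Online Adaptation half of that distinction concrete, here is a minimal sketch of how a twin's parameters could track slow seasonal drift. It is an illustration under stated assumptions, not Oasis's actual method: the one-room thermal model, the `TwinParams` fields, and the normalized least-mean-squares update in `adapt` are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class TwinParams:
    """Hypothetical Digital Twin parameters for a one-room thermal model."""
    loss_coeff: float  # fraction of the indoor/outdoor gap lost per time step
    heat_gain: float   # degrees C added per time step at full heater power


def predict_next_temp(p: TwinParams, indoor: float, outdoor: float,
                      heater: float) -> float:
    """First-order model: lose heat to the outside, gain heat from the heater."""
    return indoor - p.loss_coeff * (indoor - outdoor) + p.heat_gain * heater


def adapt(p: TwinParams, indoor: float, outdoor: float, heater: float,
          observed_next: float, step: float = 0.5) -> TwinParams:
    """One normalized-LMS update that nudges the twin toward the observation."""
    error = observed_next - predict_next_temp(p, indoor, outdoor, heater)
    # Features that multiply each parameter in the prediction above.
    x_loss = -(indoor - outdoor)
    x_gain = heater
    norm = x_loss * x_loss + x_gain * x_gain + 1e-9  # avoids division by zero
    return TwinParams(
        loss_coeff=p.loss_coeff + step * error * x_loss / norm,
        heat_gain=p.heat_gain + step * error * x_gain / norm,
    )
```

Calling `adapt` once per sensor reading keeps the model aligned with the house as conditions change, without touching the control strategy itself. Policy Learning is a separate step.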
Reinforcement Learning (RL)
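RL is powerful but risky in physical homes: learning by trial and error on a live system can leave occupants in a freezing house while the agent explores. We therefore run RL agents (PPO, Proximal Policy Optimization) primarily in our Simulation Engine, where they can discover novel strategies safely. Only strategies that have been validated as safe are deployed to the control loop.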
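The "validate before deploy" step can be sketched as a simple gate. This is a hedged illustration, not the Oasis Simulation Engine. The toy thermal simulation, the 18 to 26 °C comfort band, and the names `simulate`, `is_safe`, and `gate` are assumptions made for the example. A candidate policy, which in practice might come from a PPO training run, is represented here by a plain function.

```python
from typing import Callable, List

# A policy maps (indoor temp, outdoor temp) to heater power in [0, 1].
Policy = Callable[[float, float], float]


def simulate(policy: Policy, outdoor_trace: List[float], start_temp: float = 20.0,
             loss: float = 0.1, gain: float = 3.0) -> List[float]:
    """Roll a policy through a toy one-room thermal model (assumed dynamics)."""
    temps = []
    indoor = start_temp
    for outdoor in outdoor_trace:
        power = min(1.0, max(0.0, policy(indoor, outdoor)))  # clamp the action
        indoor = indoor - loss * (indoor - outdoor) + gain * power
        temps.append(indoor)
    return temps


def is_safe(policy: Policy, scenarios: List[List[float]],
            min_temp: float = 18.0, max_temp: float = 26.0) -> bool:
    """A policy passes only if every simulated step stays in the comfort band."""
    return all(min_temp <= t <= max_temp
               for trace in scenarios
               for t in simulate(policy, trace))


def gate(candidate: Policy, fallback: Policy,
         scenarios: List[List[float]]) -> Policy:
    """Return the candidate if it is validated, otherwise keep the fallback."""
    return candidate if is_safe(candidate, scenarios) else fallback


def thermostat(indoor: float, outdoor: float) -> float:
    """Known-good baseline: full heat below 21 degrees C, off otherwise."""
    return 1.0 if indoor < 21.0 else 0.0


def always_off(indoor: float, outdoor: float) -> float:
    """A 'novel' energy-saving strategy that an unconstrained agent might find."""
    return 0.0


scenarios = [[10.0] * 48, [0.0] * 48]  # mild day and cold day, one value per step
deployed = gate(always_off, thermostat, scenarios)  # falls back to the thermostat
```

The point of the design is that the unsafe strategy fails only inside the simulation. The live control loop keeps running the previously validated policy.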