Reinforcement learning is often explained with an “agent” in the loop. The agent is the module of the system that makes the decisions, and the agent's policy is its decision-making process. In its simplest form a policy looks similar to a behavior tree. Other policies are defined with a Q-table (Q-learning), which can be read as an if-then matrix: if the agent is in a certain state, then the action stored for that state is executed.
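The if-then reading of a Q-table can be sketched in a few lines of Python. This is only a minimal illustration, not a full Q-learning implementation: the state names, action names and values in `q_table` are invented for the example, and `greedy_policy` is a hypothetical helper that picks the highest-valued action for the current state.

```python
# Hypothetical Q-table: one row per state, one value per action.
# All names and numbers are made up for illustration only.
q_table = {
    "obstacle_ahead": {"forward": -1.0, "turn_left": 0.5, "turn_right": 0.4},
    "path_clear": {"forward": 1.0, "turn_left": 0.0, "turn_right": 0.0},
}


def greedy_policy(state):
    """Return the action with the highest stored value for the given state."""
    action_values = q_table[state]
    return max(action_values, key=action_values.get)


# "If the state is obstacle_ahead, then execute the best action for it."
print(greedy_policy("obstacle_ahead"))
print(greedy_policy("path_clear"))
```

In real Q-learning the values in the table are not written by hand but updated from rewards during training; the lookup step shown here stays the same.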
A more elaborate way of constructing an agent is with the help of an environment model. In the literature this concept is called a model-based reflex agent. Russell/Norvig describe the idea as a reflex agent with an internal state.
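A reflex agent with an internal state might look like the following sketch. It is loosely modelled on a two-square vacuum world of the kind often used as a textbook example. The class name `ModelBasedReflexAgent`, the square names "A" and "B", and the action names are assumptions made for this illustration and are not taken from the cited sources.

```python
class ModelBasedReflexAgent:
    """Reflex agent that remembers the status of squares it cannot currently see."""

    def __init__(self):
        # Internal state: last known status of each square (None = unknown).
        self.model = {"A": None, "B": None}

    def act(self, percept):
        location, status = percept
        # Update the internal model from the current percept.
        self.model[location] = status
        if status == "dirty":
            # Record the expected effect of the chosen action.
            self.model[location] = "clean"
            return "suck"
        # Rule that depends on the remembered state, not only on the percept.
        if all(s == "clean" for s in self.model.values()):
            return "noop"
        return "right" if location == "A" else "left"


agent = ModelBasedReflexAgent()
for percept in [("A", "dirty"), ("A", "clean"), ("B", "clean")]:
    print(percept, "->", agent.act(percept))
```

The difference from a simple reflex agent is the last rule: the decision to stop depends on what the agent remembers about the other square, which a pure percept-to-action table could not express.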
A more recent explanation is given by the OPSXCQ Blog, which provides an example of an agent in a simulated world together with its source code. At the bottom of the blog a CC BY 4.0 tag is set, which signals that the blog is committed to the open access movement.
 page 3 in Suganya, K. "A Review of Intelligent Agents." International Journal of Engineering Research and General Science 2.2 (2014): 112-117.
 Russell, Stuart J., and Peter Norvig. Artificial intelligence: a modern approach. Malaysia; Pearson Education Limited, 2016.
 Artificial Intelligence - OPSXCQ Blog, https://strm.sh/post/artificial-intelligence/