<strong>Paper Title</strong><br>
Opportunistic Routing In Cognitive Radio Networks Using Reinforcement Learning<br>
<br>

<strong>Abstract</strong><br>
Cognitive radio (CR) technology is developing rapidly because of its capability for adaptive learning and 
reconfiguration. Cognitive radio networks (CRNs) can increase spectrum efficiency by allowing 
secondary users (SUs) to access the licensed band dynamically and opportunistically without interfering with the primary users 
(PUs). Daniel H. and Ryan W. Thomas define a CRN, in the context of machine learning, as a network that improves 
its performance through experience gained over a period of time, without complete information about the environment in 
which it operates. This dynamism and opportunism can be learnt through reinforcement learning (RL), which is concerned with 
how learning agents (software agents) ought to take actions in an environment so as to maximize some notion of cumulative 
reward. The paper proposes a routing scheme that uses Q-learning, which is the most widely used RL approach in wireless 
networks. In Q-learning, the learnt action value, or Q-value, Q(state, event, action), is updated using the reward and 
recorded. For each state-event pair, an appropriate action is rewarded and its Q-value is increased. Hence, the Q-value 
indicates how appropriate an action is for a given state-event pair. At any time instant, the agent chooses the action 
that maximizes the Q-value for the current state-event pair. The reward corresponds to a performance metric such as throughput.
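
The Q-value mechanism described above can be illustrated with a minimal tabular sketch. It is not the paper's implementation: the class name `QRouter`, the learning rate `alpha`, the discount factor `gamma`, the exploration rate `epsilon`, and the example hop and event labels are all assumptions introduced for illustration. The update used is the standard Q-learning rule, with the table keyed by (state, event, action) as in the abstract.

```python
import random
from collections import defaultdict


class QRouter:
    """Tabular Q-learning agent keyed by (state, event, action).

    Hypothetical sketch: alpha, gamma and epsilon are assumed parameters,
    not values taken from the paper.
    """

    def __init__(self, actions, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
        self.actions = list(actions)   # e.g. candidate next hops
        self.alpha = alpha             # learning rate
        self.gamma = gamma             # discount factor
        self.epsilon = epsilon         # exploration probability
        self.q = defaultdict(float)    # unseen Q-values default to 0.0
        self.rng = random.Random(seed)

    def best_action(self, state, event):
        # Greedy choice: the action with the largest Q-value for this pair.
        return max(self.actions, key=lambda a: self.q[(state, event, a)])

    def choose_action(self, state, event):
        # Epsilon-greedy: explore occasionally, otherwise act greedily.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return self.best_action(state, event)

    def update(self, state, event, action, reward, next_state, next_event):
        # Standard Q-learning update:
        # Q <- Q + alpha * (reward + gamma * max_a' Q(next, a') - Q)
        best_next = max(self.q[(next_state, next_event, a)] for a in self.actions)
        key = (state, event, action)
        self.q[key] += self.alpha * (reward + self.gamma * best_next - self.q[key])
        return self.q[key]


# Usage: reward one next hop (for example, with a normalized throughput of 1.0)
# and observe that the greedy choice then prefers it.
agent = QRouter(["hop_a", "hop_b"], epsilon=0.0)
agent.update("node_1", "pu_idle", "hop_a", 1.0, "node_2", "pu_idle")
chosen = agent.choose_action("node_1", "pu_idle")
```

With `epsilon=0.0` the agent is purely greedy, so after the single rewarded update `chosen` is the rewarded hop. A non-zero `epsilon` lets the agent keep probing other hops, which matters when PU activity changes which routes are usable.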