A wireless sensor network (WSN) is a network of devices, called nodes, that sense the environment and communicate the gathered data over a wireless medium to a sink node. WSNs combine low power consumption, small size, and reasonable cost, and they have a variety of applications in monitoring and tracking. However, a WSN is energy-constrained because its nodes are battery-powered and recharging is difficult in most applications. Moreover, reducing energy consumption often introduces additional latency in data delivery. To address this trade-off, many scheduling approaches have been proposed. In this paper, we discuss the applicability of reinforcement learning (RL) to multiple access design in order to reduce energy consumption and achieve low latency in WSNs. In this learning strategy, an agent learns to choose actions by interacting with the environment. Guided by the rewards it receives in response to its actions, the agent asymptotically approaches the optimal policy, which maximizes its long-term expected return.