Title:
Reinforcement Learning for Decentralized Trajectory Design in Cellular UAV Networks With Sense-and-Send Protocol
Citation:
J. Hu, H. Zhang and L. Song, “Reinforcement Learning for Decentralized Trajectory Design in Cellular UAV Networks With Sense-and-Send Protocol,” in IEEE Internet of Things Journal, vol. 6, no. 4, pp. 6177-6189, Aug. 2019, doi: 10.1109/JIOT.2018.2876513.
IEEE link to the paper:
https://ieeexplore.ieee.org/document/8494742
Novelty:
Most prior work on UAV networks focuses on either sensing or transmission alone, rather than considering UAV sensing and transmission jointly.
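System Model
Each UAV is assigned a task that is known in advance, so there is no UAV-user association problem.
The uplink transmission to the BS must reach a required signal-to-noise ratio (SNR). If it does, the transmission is assumed to complete within one time slot.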
Sense-and-Send Cycle
The process is divided into cycles, indexed by k. In each cycle, each UAV senses its task and then sends the collected sensory data to the BS.
The structure of each cycle is illustrated in the figure.
Beaconing:
- In the beaconing phase, each UAV sends its location to the BS in a beacon over the control channel.
- The BS then broadcasts the general network settings and the locations of all the UAVs, so each UAV knows the locations of the other UAVs at the beginning of each cycle.
Based on the acquired information, each UAV then decides its trajectory for the cycle and informs the BS with another beacon.
A. Sense-and-Send Cycle
After beaconing ends, each UAV flies in a straight line at constant speed until the next cycle begins.
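The straight-line, constant-speed flight within a cycle can be sketched as a linear interpolation between the UAV's position when beaconing ends and its chosen position at the start of the next cycle. This is only an illustrative sketch: the function name `position_at` and the parameters `start`, `end`, `t`, and `flight_duration` are assumptions made for this note, not notation from the paper.

```python
def position_at(start, end, t, flight_duration):
    """Position of a UAV flying from `start` to `end` at constant speed.

    `start` and `end` are (x, y, z) tuples; `t` is the time elapsed since
    beaconing ended; `flight_duration` is the time left in the cycle after
    beaconing. All names are hypothetical.
    """
    if flight_duration <= 0:
        raise ValueError("flight_duration must be positive")
    if not 0 <= t <= flight_duration:
        raise ValueError("t must lie within the flight duration")
    frac = t / flight_duration  # fraction of the segment already flown
    return tuple(s + frac * (e - s) for s, e in zip(start, end))


if __name__ == "__main__":
    # Halfway through the flight, the UAV is at the midpoint of the segment.
    print(position_at((0, 0, 100), (10, 0, 100), 5, 10))
```

Because the speed is constant, the trajectory in a cycle is fully determined by its two endpoints, which is why a UAV only needs to report its chosen destination in the second beacon.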
Transmission state:
B. Uplink Subchannel Allocation Mechanism
Channel allocation state:
The BS allocates the C available uplink subchannels (SCs) to the UAVs with uplink requirements, so as to maximize the sum of the UAVs' successful transmission probabilities.
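A minimal sketch of this allocation objective follows, under two simplifying assumptions that are not stated in these notes: each UAV receives at most one SC, and a UAV's successful transmission probability does not depend on which SC it gets. Under those assumptions, maximizing the sum reduces to picking the C UAVs with the highest probabilities. The function name and the dictionary input are hypothetical.

```python
def allocate_subchannels(success_prob, num_subchannels):
    """Return the set of UAV ids that are granted an uplink SC.

    `success_prob` maps a UAV id to its successful transmission probability
    if it is given an SC (hypothetical input). Assumes one SC per UAV and
    SC-independent probabilities, so a greedy top-C choice is optimal.
    """
    if num_subchannels < 0:
        raise ValueError("num_subchannels must be non-negative")
    # Sort by descending probability; break ties by UAV id for determinism.
    ranked = sorted(success_prob, key=lambda u: (-success_prob[u], u))
    return set(ranked[:num_subchannels])


if __name__ == "__main__":
    probs = {"uav1": 0.9, "uav2": 0.2, "uav3": 0.7}
    print(sorted(allocate_subchannels(probs, 2)))
```

If the probabilities did depend on the specific SC, the problem would become an assignment problem and the greedy choice would no longer be guaranteed optimal.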
Sense-and-Send Protocol Analysis
Outer Markov Chain of UAV Sensing
State transitions take place between cycles. Each UAV has two states in each cycle.
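To make the cycle-to-cycle transition concrete, the sketch below propagates a probability distribution over a generic two-state Markov chain, one step per cycle. The notes do not give the state labels or the transition probabilities, so the states are simply indexed 0 and 1 and the transition matrix is a placeholder supplied by the caller.

```python
def step_distribution(dist, transition):
    """Advance a two-state distribution by one cycle.

    `dist` is (p0, p1); `transition[i][j]` is P(next state = j | current = i).
    Both are hypothetical inputs, not values from the paper.
    """
    p0 = dist[0] * transition[0][0] + dist[1] * transition[1][0]
    p1 = dist[0] * transition[0][1] + dist[1] * transition[1][1]
    return (p0, p1)


def distribution_after(dist, transition, cycles):
    """Distribution over the two states after `cycles` transitions."""
    for _ in range(cycles):
        dist = step_distribution(dist, transition)
    return dist


if __name__ == "__main__":
    # Placeholder matrix: each row must sum to 1.
    transition = [[0.7, 0.3], [0.4, 0.6]]
    print(distribution_after((1.0, 0.0), transition, 5))
```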
Inner Markov Chain of UAV Transmission
Notation
A. UAV Trajectory Design Problem
So far we have analyzed the details within each cycle. In this section we treat each cycle as a single unit.
Single-Agent Q-Learning Algorithm
We first set up the model that describes the UAVs' trajectories.
We take the utility of each UAV to be the total number of successful valid sensory data transmissions for its task.
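For reference, the standard tabular Q-learning update can be sketched as below. This is the textbook rule, not necessarily the exact formulation in the paper. The mapping to this problem is an assumption made for illustration: the state is the UAV's current grid position, the action is a move direction, and the reward is 1 when a valid sensory data transmission succeeds in the cycle and 0 otherwise. The values of `alpha` and `gamma` are placeholders.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q-value.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    `q` maps (state, action) pairs to values; unseen pairs default to 0.
    """
    best_next = max(q[(next_state, a)] for a in actions)
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return q[(state, action)]


if __name__ == "__main__":
    q = defaultdict(float)
    actions = ["east", "west"]  # hypothetical action set
    # One cycle: the UAV moves east and its transmission succeeds (reward 1).
    print(q_update(q, (0, 0), "east", 1.0, (1, 0), actions, alpha=0.5))
```

Accumulating a reward of 1 per successful valid transmission makes the return match the utility defined above, up to discounting.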
Multiagent Q-Learning Algorithm