
Autonomous Driving

DS-RNN Architecture 분석


Author: KangchanRoh

Team: Reinforcement Learning Team @ CAI Lab
Date: 2022/11/30


CrowdNav DS-RNN

 

This is a concise summary of the core network architecture of the paper presented at the link above.


  • Jain et al. propose a general method called structural-RNN (S-RNN) that transforms any st-graph into a mixture of RNNs that learn the parameters of factor functions end-to-end. Our work is the first to combine S-RNN with model-free RL for crowd navigation.
  • An st-graph uses nodes to represent the problem components and edges to capture the spatio-temporal interactions.

$A.\ Problem\ Formulation$

  • $MDP:\ <\mathcal{S},\mathcal{A},P,R,\gamma,\mathcal{S_0}>$
  • $\operatorname{w}^t : robot's\ state,\ \operatorname{w}^t=[(p_x,p_y),(v_x,v_y),(g_x,g_y),v_{max},\theta,\rho]$
  • $\operatorname{u}^t_i : i-th\ human's\ state,\ \operatorname{u}^t_i=(p_x^i,p_y^i)$
  • $s_t \in \mathcal{S},\ s_t=[\operatorname{w}^t,\operatorname{u}^t_1,...,\operatorname{u}^t_n]$
  • $a_t \in \mathcal{A},\ a_t=[v_x,v_y]$
  • $r(s_t,a_t)\in R$: $$ r(s_t,a_t)=\begin{cases} -20, & \text{if } d_{min}^t<0\\ 2.5(d_{min}^t-0.25), & \text{if } 0<d_{min}^t<0.25\\ 10, & \text{if } d_{goal}^t \le \rho_{robot}\\ 2(-d_{goal}^t+d_{goal}^{t-1}), & \text{otherwise.} \end{cases} $$
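The piecewise reward above maps directly to code. This is a minimal sketch, assuming scalar distances as inputs; the function name and the default robot radius $\rho_{robot}$ are illustrative, not from the paper's code.

```python
# Hypothetical sketch of the reward cases above.
def reward(d_min, d_goal, d_goal_prev, rho_robot=0.3):
    """One-step reward, given the minimum distance to any human (d_min)
    and the goal distance at the current and previous steps."""
    if d_min < 0:                         # collision with a human
        return -20.0
    if 0 < d_min < 0.25:                  # uncomfortably close: small penalty
        return 2.5 * (d_min - 0.25)
    if d_goal <= rho_robot:               # goal reached
        return 10.0
    return 2.0 * (-d_goal + d_goal_prev)  # reward progress toward the goal
```

Note that the last case is positive when the robot moved closer to the goal since the previous step, which shapes the policy toward goal-directed motion.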

$B.\ Spatio-Temporal\ Graph\ Representation$

  • $st-graph\ components: \mathcal{G}=(\mathcal{V},\mathcal{E_S},\mathcal{E_T})$
  • $\mathcal{V}:set\ of\ nodes$
  • $\mathcal{E_S}:set\ of\ spatial\ edges$
  • $\mathcal{E_T}:set\ of\ temporal\ edges$
  • Spatial Edge : $x^t_{{u_i}w}=(p^i_x-p_x,p^i_y-p_y),\ x^t_{{u_i}w}\in\mathcal{E_S}$
  • Temporal Edge : $x^t_{ww} =(v_x,v_y),\ x^t_{ww}\in\mathcal{E_T}$
  • Node Feature : $x^t_{w} =\operatorname{w}^t=[(p_x,p_y),(v_x,v_y),(g_x,g_y),v_{max},\theta,\rho]$
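The edge and node features above are simple functions of the raw states. A minimal sketch, assuming the robot state comes as a dict and each human as a position tuple (the names `robot` and `humans` are illustrative):

```python
# Hypothetical sketch: building st-graph features from raw states.
def st_graph_features(robot, humans):
    """robot: dict with p=(x,y), v=(x,y), g=(x,y), v_max, theta, rho.
    humans: list of (p_x, p_y) positions."""
    px, py = robot["p"]
    # Spatial edges x^t_{u_i w}: each human's position relative to the robot
    spatial = [(hx - px, hy - py) for (hx, hy) in humans]
    # Temporal edge x^t_{ww}: the robot's velocity, linking consecutive steps
    temporal = robot["v"]
    # Node feature x^t_w: the full robot state w^t
    node = [*robot["p"], *robot["v"], *robot["g"],
            robot["v_max"], robot["theta"], robot["rho"]]
    return spatial, temporal, node
```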

st-graph representation

$C.\ Network\ Architecture$

DS-RNN Process

DS-RNN Network Architecture

  • $x^t_{{u_i}w}$ → $\operatorname{R}_S$ → $h_{u_{i}w}^t$ $$ h_{u_{i}w}^t = \operatorname{RNN}(h_{u_{i}w}^{t-1},\ f_{spatial}(x_{u_iw}^t)) $$
  • $x^t_{ww}$ → $\operatorname{R}_T$ → $h_{ww}^t$ $$ h_{ww}^t = \operatorname{RNN}(h_{ww}^{t-1},\ f_{temporal}(x_{ww}^t)) $$
  • $h_{u_{i}w}^t,h_{ww}^t$ → Attention Module → $v_{att}^t$ 
    • Attention Module
      • Notation
        • $V^t = [h_{u_{1}w}^t,...,h_{u_{n}w}^t]^\top$ : Output of $\operatorname{R}_S$
        • $W$: Weight
        • $v_{att}^t$ : Attention weighted sum of spatial edges
      • Linear transformations$$ Q^t=V^tW_Q,\ \ K^t=h^t_{ww}W_K $$
      • Attention weight$$ \alpha^t=\operatorname{softmax}({n \over \sqrt{d_k}}Q^t(K^t)^\top) $$
      • Output : Attention weighted spatial hidden state$$ v_{att}^t=(V^t)^\top \alpha^t $$
  • $v_{att}^t$ + $x^t_{w}$ → $\operatorname{R}_N$ → $h_{w}^t$ $$ h_{w}^t = \operatorname{RNN}(h_{w}^{t-1},\ [\ f_{edge}(v_{att}^t,h_{ww}^t),\ f_{node}(x_{w}^t)\ ]) $$
  • $h_{w}^t$ → PPO → $V(s_t),\pi(a_t|s_t)$
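The attention module above (linear maps $W_Q$, $W_K$, scaled dot-product scores with the $n/\sqrt{d_k}$ factor, and the weighted sum $v_{att}^t$) can be sketched in NumPy. The function name and the random weights in the example are assumptions; in the actual network $W_Q$ and $W_K$ are learned parameters.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical sketch of the attention module; W_Q, W_K stand in for
# the learned linear transformations.
def attention(V, h_ww, W_Q, W_K):
    """V: (n, d) spatial-edge hidden states from R_S,
    h_ww: (d,) temporal-edge hidden state from R_T.
    Returns v_att: the attention-weighted sum of the rows of V."""
    n, d_k = V.shape[0], W_K.shape[1]
    Q = V @ W_Q                                   # (n, d_k) one query per human
    K = h_ww @ W_K                                # (d_k,) single key
    alpha = softmax(n / np.sqrt(d_k) * (Q @ K))   # (n,) attention weights
    return V.T @ alpha                            # (d,) v_att
```

With a single human ($n=1$) the softmax collapses to weight 1, so $v_{att}^t$ is exactly that human's spatial hidden state; with more humans it blends them according to their relevance to the robot's own motion.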