Author: KangchanRoh
Team: Reinforcement Learning Team @ CAI Lab
Date: 2022/11/30
CrowdNav DS-RNN
Project page: sites.google.com (FoV environment: robot with a limited field of view)

This is a concise summary of the core network architecture of the paper linked above.
- Jain et al. propose a general method called structural-RNN (S-RNN) that transforms any st-graph into a mixture of RNNs that learn the parameters of factor functions end-to-end. This work is the first to combine S-RNN with model-free RL for crowd navigation.
- An st-graph uses nodes to represent the problem components and edges to capture the spatio-temporal interactions among them.
$A.\ Problem\ Formulation$
- $MDP:\ <\mathcal{S},\mathcal{A},P,R,\gamma,\mathcal{S_0}>$
- $\operatorname{w}^t : robot's\ state,\ \operatorname{w}^t=[(p_x,p_y),(v_x,v_y),(g_x,g_y),v_{max},\theta,\rho]$
- $\operatorname{u}^t_i : i-th\ human's\ state,\ \operatorname{u}^t_i=(p_x^i,p_y^i)$
- $s_t \in \mathcal{S},\ s_t=[\operatorname{w}^t,\operatorname{u}^t_1,\dots,\operatorname{u}^t_n]$
- $a_t \in \mathcal{A},\ a_t=[v_x,v_y]$
- $r(s_t,a_t)\in R,\ r(s_t,a_t)=$\begin{cases} -20, & {if\ d_{min}^t<0}\\ 2.5(d_{min}^t-0.25), & {if\ 0<d_{min}^t<0.25}\\ 10, & {if\ d_{goal}^t \le \rho_{robot}}\\2(-d_{goal}^t+d_{goal}^{t-1}), & {otherwise.} \end{cases}
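The piecewise reward above can be sketched directly in code. This is a minimal illustration, not the paper's implementation; the goal-radius default `rho_robot = 0.3` is an assumption for the example.

```python
def reward(d_min: float, d_goal: float, d_goal_prev: float,
           rho_robot: float = 0.3) -> float:
    """Piecewise reward from Sec. A.

    d_min: minimum distance between the robot and any human at step t
    d_goal: distance from the robot to its goal at step t
    d_goal_prev: same distance at step t-1
    rho_robot: robot (goal) radius -- assumed value for this sketch
    """
    if d_min < 0:                         # collision with a human
        return -20.0
    if 0 < d_min < 0.25:                  # discomfort zone penalty
        return 2.5 * (d_min - 0.25)
    if d_goal <= rho_robot:               # goal reached
        return 10.0
    return 2.0 * (-d_goal + d_goal_prev)  # reward progress toward the goal
```

Note the last case is a potential-based shaping term: it is positive exactly when the robot got closer to the goal since the previous step.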
$B.\ Spatio-Temporal\ Graph\ Representation$
- $st-graph\ components: \mathcal{G}=(\mathcal{V},\mathcal{E_S},\mathcal{E_T})$
- $\mathcal{V}:set\ of\ nodes$
- $\mathcal{E_S}:set\ of\ spatial\ edges$
- $\mathcal{E_T}:set\ of\ temporal\ edges$
- Spatial Edge : $x^t_{{u_i}w}=(p^i_x-p_x,p^i_y-p_y),\ x^t_{{u_i}w}\in\mathcal{E_S}$
- Temporal Edge : $x^t_{ww} =(v_x,v_y),\ x^t_{ww}\in\mathcal{E_T}$
- Node Feature : $x^t_{w} =\operatorname{w}^t=[(p_x,p_y),(v_x,v_y),(g_x,g_y),v_{max},\theta,\rho]$
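A minimal sketch of how these st-graph features follow from the raw states (the dict/array layout is illustrative, not the paper's data format):

```python
import numpy as np

def st_graph_features(robot_state: dict, human_positions: np.ndarray):
    """Compute st-graph edge features from raw states (Sec. B).

    robot_state: dict with keys "px", "py", "vx", "vy" (illustrative layout)
    human_positions: (n, 2) array of human positions (p_x^i, p_y^i)
    """
    p = np.array([robot_state["px"], robot_state["py"]])
    # Spatial edges x^t_{u_i w}: human positions relative to the robot
    spatial = human_positions - p
    # Temporal edge x^t_{ww}: the robot's own velocity
    temporal = np.array([robot_state["vx"], robot_state["vy"]])
    return spatial, temporal
```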
$C.\ Network\ Architecture$
DS-RNN Process
- $x^t_{{u_i}w}$ → $\operatorname{R}_S$ → $h_{u_{i}w}^t$ $$ h_{u_{i}w}^t = \operatorname{RNN}(h_{u_{i}w}^{t-1},\ f_{spatial}(x_{u_iw}^t)) $$
- $x^t_{ww}$ → $\operatorname{R}_T$ → $h_{ww}^t$ $$ h_{ww}^t = \operatorname{RNN}(h_{ww}^{t-1},\ f_{temporal}(x_{ww}^t)) $$
- $h_{u_{i}w}^t,h_{ww}^t$ → Attention Module → $v_{att}^t$
- Attention Module
- Notation
- $V^t = [h_{u_{1}w}^t,...,h_{u_{n}w}^t]^\top$ : Output of $\operatorname{R}_S$
- $W$: Weight
- $v_{att}^t$ : Attention weighted sum of spatial edges
- Linear transformations$$ Q^t=V^tW_Q,\ \ K^t=h^t_{ww}W_K $$
- Attention weight$$ \alpha^t=\operatorname{softmax}({n \over \sqrt{d_k}}Q^t(K^t)^\top) $$
- Output : Attention weighted spatial hidden state$$ v_{att}^t=(V^t)^\top \alpha^t $$
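The attention equations above can be sketched as follows. The weight matrices $W_Q$, $W_K$ are learned parameters in the paper; here they are passed in as placeholders, and the $n/\sqrt{d_k}$ scaling matches the attention-weight formula:

```python
import numpy as np

def softmax(z: np.ndarray) -> np.ndarray:
    z = z - z.max()            # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def attention(V: np.ndarray, h_ww: np.ndarray,
              W_Q: np.ndarray, W_K: np.ndarray):
    """Attention over spatial edge hidden states (Sec. C).

    V: (n, d) stacked spatial hidden states h^t_{u_i w}
    h_ww: (d_t,) temporal hidden state from R_T
    W_Q, W_K: placeholder projection matrices (learned in practice)
    """
    n = V.shape[0]
    Q = V @ W_Q                              # (n, d_k) queries
    K = h_ww @ W_K                           # (d_k,) single key from h^t_{ww}
    d_k = Q.shape[1]
    scores = (n / np.sqrt(d_k)) * (Q @ K)    # scaled scores, one per human
    alpha = softmax(scores)                  # attention weights over humans
    v_att = V.T @ alpha                      # weighted sum of spatial states
    return v_att, alpha
```

Because $h^t_{ww}$ yields a single key, the softmax runs over the $n$ humans, producing one weight per spatial edge.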
- $v_{att}^t$ + $x^t_{w}$ → $\operatorname{R}_N$ → $h_{w}^t$ $$ h_{w}^t = \operatorname{RNN}(h_{w}^{t-1},\ [\ f_{edge}(v_{att}^t,h_{ww}^t),\ f_{node}(x_{w}^t)\ ]) $$
- $h_{w}^t$ → PPO → $V(s_t),\pi(a_t|s_t)$
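The node-RNN step and the actor-critic heads above can be sketched as one update. This is a hedged illustration: a vanilla RNN cell stands in for the recurrent unit, $f_{edge}$ and $f_{node}$ are modeled as single linear layers with tanh, and a Gaussian-mean head stands in for the PPO policy; all weights are placeholders for learned parameters.

```python
import numpy as np

def node_rnn_step(h_prev: np.ndarray, v_att: np.ndarray,
                  h_ww: np.ndarray, x_w: np.ndarray, params: dict):
    """One step of the node RNN R_N plus value/policy heads (Sec. C).

    h_prev: previous node hidden state h^{t-1}_w
    v_att:  attention-weighted spatial state from the attention module
    h_ww:   temporal hidden state from R_T
    x_w:    robot node feature w^t (9-dim per Sec. A)
    params: placeholder weights standing in for learned parameters
    """
    # f_edge: embed the attention output together with the temporal state
    e = np.tanh(params["W_edge"] @ np.concatenate([v_att, h_ww]))
    # f_node: embed the robot's own node feature
    m = np.tanh(params["W_node"] @ x_w)
    # Vanilla RNN cell over the concatenated embeddings
    inp = np.concatenate([e, m])
    h = np.tanh(params["W_hh"] @ h_prev + params["W_ih"] @ inp)
    value = params["w_v"] @ h          # critic head: V(s_t)
    action_mean = params["W_pi"] @ h   # actor head: mean of pi(a_t | s_t)
    return h, value, action_mean
```

In the paper these outputs feed PPO, which trains all the RNNs and the attention module end-to-end from the value and policy losses.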