no code implementations • 12 Mar 2022 • Fusen Wang, Kai Liu, Fei Long, Nong Sang, Xiaofeng Xia, Jun Sang
However, the transformer directly partitions the crowd images into a series of tokens, which may not be a good choice due to each pedestrian being an independent individual, and the parameter number of the network is very large.