The researchers benchmark FIFAWC on two tasks: traditional GAR and a novel GA video captioning task. For GAR, they evaluate the classical ...