Building a Multimodal RL System for Speech-to-Text Annotation - Rollup News