GThinker
model
A multimodal large language model (MLLM) for reasoning that introduces "Cue-Guided Rethinking." Unlike linear Chain-of-Thought reasoning, GThinker can revisit and correct its initial visual interpretations when the reasoning runs into inconsistencies, which reduces hallucinations.
Paper
arXiv: 2501.01234
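
The rethinking idea described above can be illustrated with a toy loop. This is only a sketch of the general pattern (interpret visual cues, check them against each other, re-examine the ones that conflict) and is not GThinker's actual implementation. Every name here (`Cue`, `cue_guided_rethink`, `is_consistent`, `reinterpret`) is hypothetical, and the consistency rule is hard-coded for the example rather than produced by a model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Cue:
    """A visual cue and the model's current reading of it (hypothetical type)."""
    name: str
    interpretation: str


def cue_guided_rethink(
    cues: List[Cue],
    is_consistent: Callable[[Cue, List[Cue]], bool],
    reinterpret: Callable[[Cue, List[Cue]], str],
    max_rounds: int = 3,
) -> List[Cue]:
    """Re-examine any cue whose reading conflicts with the others."""
    # Work on copies so the caller's initial interpretations are preserved.
    current = [Cue(c.name, c.interpretation) for c in cues]
    for _ in range(max_rounds):
        flagged = [c for c in current if not is_consistent(c, current)]
        if not flagged:
            break  # nothing left to revisit
        for c in flagged:
            c.interpretation = reinterpret(c, current)
    return current


# Toy example: the sign text was misread as "SHOP" on an octagonal sign.
initial = [Cue("sign_shape", "octagon"), Cue("sign_text", "SHOP")]


def is_consistent(cue: Cue, all_cues: List[Cue]) -> bool:
    # Assumed rule for this example only: octagonal signs read "STOP".
    if cue.name != "sign_text":
        return True
    shape = next(c.interpretation for c in all_cues if c.name == "sign_shape")
    return not (shape == "octagon" and cue.interpretation != "STOP")


def reinterpret(cue: Cue, all_cues: List[Cue]) -> str:
    # Stand-in for a second look at the image region behind this cue.
    return "STOP"


revised = cue_guided_rethink(initial, is_consistent, reinterpret)
print([(c.name, c.interpretation) for c in revised])
```

In a linear chain of thought, the misread "SHOP" would carry through to the final answer. Here the conflict with the shape cue sends the loop back to that one interpretation, while the original list stays unchanged for comparison.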