Humanity's Last Exam

Use case

A benchmark that measures complex reasoning ability, on which the Gemini 2.5 Deep Think model scored 34.8%.

Related Services (1)

Services that support this use case

Gemini 2.5 Deep Think (Google)
