FlagEval

non-profit

https://flageval.baai.ac.cn/

Recent Activity

philokey updated a dataset about 2 months ago

FlagEval/coco_val2014_sampled

philokey authored a paper about 2 months ago

Do Vision-Language Models Measure Up? Benchmarking Visual Measurement Reading with MeasureBench

philokey updated a dataset about 2 months ago

FlagEval/MeasureBench

FlagEval's datasets (13)

FlagEval/ERQAPlus

Viewer • Updated Nov 27 • 800 • 31 • 1

FlagEval/coco_val2014_sampled

Viewer • Updated Nov 6 • 1k • 51

FlagEval/MeasureBench

Viewer • Updated Nov 3 • 2.44k • 299 • 1

FlagEval/EmbodiedVerse-Bench

Viewer • Updated Jun 25 • 2.04k • 549

FlagEval/Where2Place

Viewer • Updated May 29 • 100 • 435

FlagEval/SAT

Viewer • Updated May 6 • 150 • 124

FlagEval/HMMT_2025

Viewer • Updated May 6 • 30 • 697 • 1

FlagEval/ERQA

Viewer • Updated Apr 22 • 400 • 1.19k • 3

FlagEval/sub_spatial

Viewer • Updated Apr 21 • 690 • 461

FlagEval/EmbSpatial-Bench

Viewer • Updated Apr 21 • 3.64k • 334 • 3

FlagEval/documentation-images

Viewer • Updated Nov 13, 2024 • 3 • 179

FlagEval/CLCC_v1

Viewer • Updated Jul 29, 2024 • 760 • 49 • 3

FlagEval/HalluDial

Updated Jun 26, 2024 • 20 • 3