SketchUp Log Bench 3D Model

TerrisGO/BIG-bench_nlp

The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to probe large language models and extrapolate their future capabilities. The more than 200 tasks included in ...

GitHub

MSLR: Multi-Step-Reasoning-Trace Chinese Multi-Step Legal Reasoning Benchmark Dataset

With the rapid development of large language models (LLMs) in legal applications, systematically evaluating their reasoning ability in judgment prediction has become increasingly urgent. Currently, ...
