Evaluation suites
Test cases, expected criteria, judge-based automated scoring across six dimensions, and rubric design.
This section is being written. Check back soon.