GitHub / bigcode-project/bigcodebench / commits
[ICLR'25] BigCodeBench: Benchmarking Code Generation Towards AGI
| SHA | Message | Author | Date | Stats |
|---|---|---|---|---|
| 24e42d19 | feat(generate): add reasoning effort for o1 and o3 | Terry Zhuo <t****5@g****m> | 11 months ago | |
| 3314ebe2 | update doc | Terry Zhuo <t****5@g****m> | 11 months ago | |
| cb30bfcc | update model outputs | Terry Zhuo <t****5@g****m> | 11 months ago | |
| fd3cbc84 | fix(generate): use 2nd part for gemini thinking models | Terry Zhuo <t****5@g****m> | 11 months ago | |
| ba18aaf6 | fix(evaluate): postprocess the concurrent outputs for o1 | Terry Zhuo <t****5@g****m> | 11 months ago | |
| e380adff | fix(generate): rm max output tokens | Terry Zhuo <t****5@g****m> | 11 months ago | |
| e264513e | fix(generate): rm temperature for o1 | Terry Zhuo <t****5@g****m> | 11 months ago | |
| 66d9499f | fix (evaluate): update gt checking | Terry Zhuo <t****5@g****m> | 11 months ago | |
| 90adab8b | update docker | Terry Zhuo <t****5@g****m> | 11 months ago | |
| 86c05cf4 | fix: add transformers req | Terry Zhuo <t****5@g****m> | 11 months ago | |
| 201c2364 | update running example | Terry Zhuo <t****5@g****m> | 11 months ago | |
| d92b0125 | update docker files | Terry Zhuo <t****5@g****m> | 11 months ago | |
| 992fc348 | fix: pass `calibrated` into gradio api | Terry Yue Zhuo <t****5@g****m> (committed by GitHub <n****y@g****m>) | 11 months ago | |
| aa634d5d | fix(evaluate): compute pass_at_k for existing results | Terry Yue Zhuo <t****5@g****m> (committed by GitHub <n****y@g****m>) | 11 months ago | |
| 3828e623 | update setup.cfg | Terry Yue Zhuo <t****5@g****m> (committed by GitHub <n****y@g****m>) | 11 months ago | |
| b5cfea3b | fix: update modules in sanitize.py | Terry Yue Zhuo <t****5@g****m> (committed by GitHub <n****y@g****m>) | 11 months ago | |
| 3fbcbb66 | update docker | Terry Zhuo <t****5@g****m> | 11 months ago | |
| 36a905d2 | fix: update the tf version for general installation | Terry Zhuo <t****5@g****m> | 11 months ago | |
| 1fa73223 | fix: change the setup cfg for evaluate | Terry Zhuo <t****5@g****m> | 11 months ago | |
| 25afe4fd | fix: update deps for evaluate | Terry Zhuo <t****5@g****m> | 11 months ago | |
| de17cce2 | fix(evaluate): put the future completion in the executor | Terry Zhuo <t****5@g****m> | 11 months ago | |
| 8fa95f8b | feat(evaluate): do calibration by default | Terry Zhuo <t****5@g****m> | 11 months ago | |
| 06437ab9 | fix(evaluate): remove redundant code | Terry Yue Zhuo <t****5@g****m> (committed by GitHub <n****y@g****m>) | 11 months ago | |
| b9a7b178 | feat(evaluate): add backoff for concurrent issues | Terry Zhuo <t****5@g****m> | 12 months ago | |
| b888ce68 | feat(evaluate): add backoff for file reading | Terry Zhuo <t****5@g****m> | 12 months ago | |
| 13f07c9e | feat(codegen): add the progress bar for openai API | Terry Zhuo <t****5@g****m> | 12 months ago | |
| 9d7af543 | add doc for result submission | Terry Zhuo <t****5@g****m> | 12 months ago | |
| 86458639 | fix: add tokenizer customization back | Terry Zhuo <t****5@g****m> | 12 months ago | |
| fc3034a3 | merge fix: add trust_remote_code back | Terry Zhuo <t****5@g****m> (committed by GitHub <n****y@g****m>) | 12 months ago | |
| 54794ed1 | fix missing trust_remote_code parameter | LRL <l****l@l****v> | 12 months ago | |
| e517a9e2 | fix(evaluate): update backup pass_k result path | Terry Zhuo <t****5@g****m> | 12 months ago | |
| d40eceb1 | doc: add params | Terry Zhuo <t****5@g****m> | 12 months ago | |
| 0f4df3e7 | fix(codegen): remove commented code | Terry Zhuo <t****5@g****m> | 12 months ago | |
| 8ed15f69 | fix(codegen): update make_request | Terry Zhuo <t****5@g****m> | 12 months ago | |
| 9ff42cac | fix(doc): change id_range input | Terry Zhuo <t****5@g****m> | 12 months ago | |
| 570a4c8f | feat(evaluate): add no_execute flag | Terry Zhuo <t****5@g****m> | 12 months ago | |
| bc13148b | merge Merge branch 'main' of https://github.com/bigcode-project/bigcodebench | Terry Zhuo <t****5@g****m> | 12 months ago | |
| 16ec422e | fix(evaluate): update the calibration setup | Terry Zhuo <t****5@g****m> | 12 months ago | |
| 813712f9 | feat: add 3.5 haiku and grok beta | Terry Zhuo <t****5@g****m> | 12 months ago | |
| 1d9ea6af | feat: batch o1 and deepseek-chat via concurrency | Terry Zhuo <t****5@g****m> | 12 months ago | |
| 49208081 | feat: change google api request | Terry Zhuo <t****5@g****m> | 12 months ago | |
| 974e6791 | fix: make google api do n samples | Terry Zhuo <t****5@g****m> | 12 months ago | |
| cb283fdd | feat: customize instruction and response | Terry Zhuo <t****5@g****m> | 12 months ago | |
| 1e243647 | feat: using datasets to load | Terry Zhuo <t****5@g****m> | 12 months ago | |
| 21654312 | fix(codegen): stop by upper bound | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| e8798f47 | fix: change id_range type | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| a810f315 | doc: add model revision | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| c5a22bff | feat(codegen): support model revision | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| e25440ea | Update README.md | Terry Zhuo <t****5@g****m> (committed by GitHub <n****y@g****m>) | about 1 year ago | |
| e10d361a | docs: update impact | Terry Zhuo <t****5@g****m> (committed by GitHub <n****y@g****m>) | about 1 year ago | |
| 3b4a058d | fix(doc): typos | Terry Zhuo <t****5@g****m> (committed by GitHub <n****y@g****m>) | about 1 year ago | |
| e35d6257 | doc: add impact | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| 2b731258 | merge Merge branch 'main' of https://github.com/bigcode-project/bigcodebench | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| 11262303 | remove reflection model | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| 817e63b2 | doc: benchmark description | Terry Zhuo <t****5@g****m> (committed by GitHub <n****y@g****m>) | about 1 year ago | |
| 2a28c61c | doc: update model outputs link | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| 58b3f2d0 | fix: change parallel logic | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| dac3a008 | doc: update parallel arg | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| 0825835f | doc: update minimal full script | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| b8c18116 | refactor(eval): update parallel default val | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| f6de469c | docker: update Gradio.Dockerfile | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| 1cb320fc | Update ADVANCED_USAGE.md | Terry Zhuo <t****5@g****m> (committed by GitHub <n****y@g****m>) | about 1 year ago | |
| 0a5154f1 | doc: merge installation cfg | Terry Zhuo <t****5@g****m> (committed by GitHub <n****y@g****m>) | about 1 year ago | |
| 8cb06e4e | merge cfg | Terry Zhuo <t****5@g****m> (committed by GitHub <n****y@g****m>) | about 1 year ago | |
| 991e41c0 | doc: add warning | Terry Zhuo <t****5@g****m> (committed by GitHub <n****y@g****m>) | about 1 year ago | |
| 8b9b46ef | doc: update link | Terry Zhuo <t****5@g****m> (committed by GitHub <n****y@g****m>) | about 1 year ago | |
| b54dd6c7 | doc: minot update | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| 2498259d | merge Merge branch 'main' of https://github.com/bigcode-project/bigcodebench | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| 312321dd | doc: update w/ 0.2.0 | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| fce1f38d | update model list | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| 49e3b3ca | rfactor(analysis): update get results | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| c436061f | doc: update adv use | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| 549ba752 | rm: old doc | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| 22432039 | doc: minor update | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| 5a502125 | doc: minor update | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| 3d044781 | doc: minor update | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| 098ee793 | minor fix | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| 8993dcaa | refactor: update evaluate pipeline | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| 7c5c3d08 | refactor(gen): update model provider | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| e51bd31e | doc: minor update | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| 71097102 | doc: update full script | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| ed9886cc | doc: add args details | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| d367f5b5 | update dependencies | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| 625752eb | refactor(gen): remove mistral errors | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| 62f8ab1a | refactor(tools): rrename to 0.2.0 | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| 33548bbd | doc: minor fix | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| b8f5aea9 | doc: minor fix | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| 01e2fb08 | doc: update full script | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| ae94984d | doc: 0.2.0 pre-release | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| db22671b | refactor: use model provide for gen | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| 717835f7 | feat: refactor: model provider | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| f8897def | merge resolve conflict | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| f4b5978f | feat: refactor eval | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| 195a1659 | merge fix: await futures | Terry Zhuo <t****5@g****m> (committed by GitHub <n****y@g****m>) | about 1 year ago | |
| 8ffe6d3d | add customized pass k | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| 8f83db19 | feat: refactor generate pipeline | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| 129e2634 | feat: add more models | Terry Zhuo <t****5@g****m> | about 1 year ago | |
| afc1f87a | Reset timer on progress | Roy Hvaara <r****y@l****o> | about 1 year ago | |
| 6c01136c | Remove superfluous comment | Roy Hvaara <r****y@l****o> | about 1 year ago | |
| bf58e0f1 | Wait on futures in progress checker | Roy Hvaara <r****y@l****o> | about 1 year ago | |