GitHub / bigcode-project/bigcodebench / commits
[ICLR'25] BigCodeBench: Benchmarking Code Generation Towards AGI
| SHA | Message | Author | Date | Stats |
|---|---|---|---|---|
| 47eff25a | Merge pull request #112 from alexazhou/main | Terry Yue Zhuo <t****5@g****m> (committed by GitHub <n****y@g****m>) | 2 months ago | |
| 0580930f | Merge pull request #104 from KedarnathKC/main | Terry Yue Zhuo <t****5@g****m> (committed by GitHub <n****y@g****m>) | 2 months ago | |
| 45a3b783 | fix e2b sdk version, because 2.0 sdk use new interface, so makesure use the o... | AlexaZhou <A****u@1****m> | 2 months ago | |
| fe5af22c | fix: pass_k-iterable in evaluate.py | KedarnathKC <k****5@g****m> | 4 months ago | |
| 77b286f7 | fix: rm printout | Terry Zhuo <t****5@g****m> | 7 months ago | |
| 10c8327f | add models | Terry Zhuo <t****5@g****m> | 7 months ago | |
| 33ed54d4 | fix: update grok3 name for reasoning | Terry Zhuo <t****5@g****m> | 7 months ago | |
| 9b6013c0 | Merge pull request #94 from Devy99/fix-nltk-download | Terry Yue Zhuo <t****5@g****m> (committed by GitHub <n****y@g****m>) | 7 months ago | |
| bb082968 | feat: support reasoning for grok-3-mini | Terry Zhuo <t****5@g****m> | 7 months ago | |
| 821f3a54 | Update instruction prompts with ntlk fix | Alessandro Giagnorio <g****a@u****h> | 7 months ago | |
| 1bf199d4 | Fix nltk resource download | Alessandro Giagnorio <g****a@u****h> | 7 months ago | |
| 8fb8e239 | update model meta info and processing script | Terry Zhuo <t****5@g****m> | 7 months ago | |
| 9bd90fed | feat: use google genai | Terry Zhuo <t****5@g****m> | 7 months ago | |
| c9e2cbba | update model metadata | Terry Zhuo <t****5@g****m> | 7 months ago | |
| 720681b8 | feat: add max_model_len for vllm | Terry Zhuo <t****5@g****m> | 7 months ago | |
| 00fc9bb9 | fix model metadata | Terry Zhuo <t****5@g****m> | 7 months ago | |
| 3513d997 | fix: hardcode the model max length for vllm | Terry Zhuo <t****5@g****m> | 7 months ago | |
| 5f0743d0 | fix: remove vllm max length | Terry Zhuo <t****5@g****m> | 7 months ago | |
| fa21527b | feat: add model release date | Terry Zhuo <t****5@g****m> | 7 months ago | |
| d37847db | fix: customize lora output file | Terry Zhuo <t****5@g****m> | 8 months ago | |
| 82fc40df | fix: vllm lora attribute | Terry Zhuo <t****5@g****m> | 8 months ago | |
| 6d967338 | feat: support vllm lora | Terry Zhuo <t****5@g****m> | 8 months ago | |
| f087e3b0 | doc: add new model outputs | Terry Zhuo <t****5@g****m> | 8 months ago | |
| 0ecd667f | update the results analysis script | Terry Zhuo <t****5@g****m> | 8 months ago | |
| 05b7f1f9 | doc: fix endpoints | Terry Zhuo <t****5@g****m> | 8 months ago | |
| 78dceb21 | fix: only append text output | Terry Zhuo <t****5@g****m> | 8 months ago | |
| 57eb973f | fix: correctly process anthropic treaming | Terry Zhuo <t****5@g****m> | 8 months ago | |
| c05694cd | fix: remove unused args | Terry Zhuo <t****5@g****m> | 8 months ago | |
| 89309066 | feat: support anthropic extended thinking | Terry Zhuo <t****5@g****m> | 8 months ago | |
| 9059fb84 | remove check_gt_only flag | Terry Zhuo <t****5@g****m> | 8 months ago | |
| d8161d93 | feat: use v0.1.4 dataset | Terry Zhuo <t****5@g****m> | 8 months ago | |
| 72ac9195 | Merge pull request #49 from hvaara/mock-response-ok | Terry Yue Zhuo <t****5@g****m> (committed by GitHub <n****y@g****m>) | 8 months ago | |
| 9515dfcc | Move changes to new fix file and bump version numbers | Roy Hvaara <r****y@l****o> | 8 months ago | |
| 365867c1 | Merge pull request #86 from hvaara/re-progress-checker-fix | Terry Yue Zhuo <t****5@g****m> (committed by GitHub <n****y@g****m>) | 8 months ago | |
| 98482fd7 | Merge pull request #85 from hvaara/hf-serverless-inference | Terry Yue Zhuo <t****5@g****m> (committed by GitHub <n****y@g****m>) | 8 months ago | |
| 45e901f1 | Reintroduce progress checker from #48 | Roy Hvaara <r****y@l****o> | 9 months ago | |
| e646fe52 | Add docs | Roy Hvaara <r****y@l****o> | 9 months ago | |
| afa881ce | Add support for Hugging Face Serverless Inference | Roy Hvaara <r****y@l****o> | 9 months ago | |
| 04b317f1 | Fixes for tasks 211 and 215 | Roy Hvaara <r****y@l****o> | 9 months ago | |
| 8e5cc7e4 | fix: check ``` in hf prefill | Terry Zhuo <t****5@g****m> | 9 months ago | |
| 2f9f4c84 | fix: check if prefill | Terry Zhuo <t****5@g****m> | 9 months ago | |
| 41770a77 | force e2b docker update | Terry Zhuo <t****5@g****m> | 9 months ago | |
| 49d2c522 | fix: optional bool args | Terry Zhuo <t****5@g****m> | 9 months ago | |
| bdc265c1 | fix: check if selective_evaluate exists | Terry Zhuo <t****5@g****m> | 9 months ago | |
| 81aca07f | fix passk | Terry Zhuo <t****5@g****m> | 9 months ago | |
| e2de1eab | Merge pull request #80 from bigcode-project/e2b_debug | Terry Yue Zhuo <t****5@g****m> (committed by GitHub <n****y@g****m>) | 9 months ago | |
| 2ee62630 | fix | Terry Zhuo <t****5@g****m> | 9 months ago | |
| e70f177a | Merge pull request #79 from bigcode-project/e2b_debug | Terry Yue Zhuo <t****5@g****m> (committed by GitHub <n****y@g****m>) | 9 months ago | |
| 5cfe22b2 | fix | Terry Zhuo <t****5@g****m> | 9 months ago | |
| 0cd61745 | fix | Terry Zhuo <t****5@g****m> | 9 months ago | |
| 5091ff0e | fix | Terry Zhuo <t****5@g****m> | 9 months ago | |
| 5d42541e | fix | Terry Zhuo <t****5@g****m> | 9 months ago | |
| d32f19ee | fix | Terry Zhuo <t****5@g****m> | 9 months ago | |
| 035221bd | fix | Terry Zhuo <t****5@g****m> | 9 months ago | |
| cb1ddd09 | fix | Terry Zhuo <t****5@g****m> | 9 months ago | |
| 1dc1e374 | fix | Terry Zhuo <t****5@g****m> | 9 months ago | |
| de90e7ed | fix | Terry Zhuo <t****5@g****m> | 9 months ago | |
| 9c5726aa | fix | Terry Zhuo <t****5@g****m> | 9 months ago | |
| 75469f46 | fix | Terry Zhuo <t****5@g****m> | 9 months ago | |
| 8b79fc44 | fix | Terry Zhuo <t****5@g****m> | 9 months ago | |
| 6f1c33d2 | fix e2b | Terry Zhuo <t****5@g****m> | 9 months ago | |
| 0f4cf188 | fix e2b | Terry Zhuo <t****5@g****m> | 9 months ago | |
| 5beceb57 | fix e2b | Terry Zhuo <t****5@g****m> | 9 months ago | |
| f1378304 | fix e2b docker | Terry Zhuo <t****5@g****m> | 9 months ago | |
| deb41dd0 | change e2b docker | Terry Zhuo <t****5@g****m> | 9 months ago | |
| f254211e | fix for e2b | Terry Zhuo <t****5@g****m> | 9 months ago | |
| fcc4389f | Merge pull request #77 from shwinshaker/add-unique-cache | Terry Yue Zhuo <t****5@g****m> (committed by GitHub <n****y@g****m>) | 9 months ago | |
| 23d43901 | add unique cache directory before each code execution | Chengyu Dong <c****d@n****m> | 9 months ago | |
| 4381f0af | Merge pull request #75 from zhangchen-xu/main | Terry Yue Zhuo <t****5@g****m> (committed by GitHub <n****y@g****m>) | 9 months ago | |
| e9407245 | fix make_raw_chat_prompt when prefill is disabled | fly_dust <f****8@g****m> | 9 months ago | |
| 202203e2 | update execution doc | Terry Zhuo <t****5@g****m> | 9 months ago | |
| 768177ea | add more models | Terry Zhuo <t****5@g****m> | 9 months ago | |
| 0331489b | Update README.md | Terry Yue Zhuo <t****5@g****m> (committed by GitHub <n****y@g****m>) | 9 months ago | |
| fb46c011 | add swe arena | Terry Zhuo <t****5@g****m> | 9 months ago | |
| 0c97a4d8 | add models | Terry Zhuo <t****5@g****m> | 9 months ago | |
| dcff46f8 | fix prefill arg | Terry Zhuo <t****5@g****m> | 9 months ago | |
| 1a340780 | update doc | Terry Zhuo <t****5@g****m> | 9 months ago | |
| 832035eb | update eval model | Terry Zhuo <t****5@g****m> | 9 months ago | |
| 1e12249f | fix gen file name | Terry Zhuo <t****5@g****m> | 9 months ago | |
| 0f30a64c | fix: change arg | Terry Zhuo <t****5@g****m> | 9 months ago | |
| 5e173165 | feat: add no_prefill in file if being activated | Terry Zhuo <t****5@g****m> | 9 months ago | |
| 2ff547d5 | fix: change prefill to no_prefill | Terry Zhuo <t****5@g****m> | 9 months ago | |
| 746a1990 | feat: make prefill optional | Terry Zhuo <t****5@g****m> | 9 months ago | |
| a4f300ab | fix: remove sample file in example | Terry Zhuo <t****5@g****m> | 9 months ago | |
| fdbc50fc | update doc | Terry Zhuo <t****5@g****m> | 9 months ago | |
| e62c55e8 | update google thinking model api | Terry Zhuo <t****5@g****m> | 9 months ago | |
| 6ffa085a | update doc | Terry Zhuo <t****5@g****m> | 9 months ago | |
| 11c00809 | fix docker | Terry Zhuo <t****5@g****m> | 9 months ago | |
| 65305f1c | fix docker | Terry Zhuo <t****5@g****m> | 9 months ago | |
| d0cacd09 | fix docker | Terry Zhuo <t****5@g****m> | 9 months ago | |
| 57a4a3cb | add r1 reasoning effort | Terry Zhuo <t****5@g****m> | 9 months ago | |
| 092c5a3e | update e2b toml | Terry Zhuo <t****5@g****m> | 9 months ago | |
| 62f387c4 | update e2b env setup | Terry Zhuo <t****5@g****m> | 9 months ago | |
| 468eeceb | add e2b support | Terry Zhuo <t****5@g****m> | 9 months ago | |
| 342aed87 | feat: support selective eval task | Terry Zhuo <t****5@g****m> | 10 months ago | |
| 8cdcdfe6 | release bigcodebench data 0.1.3 | Terry Zhuo <t****5@g****m> | 10 months ago | |
| 80c83b69 | add AI2 into the doc | Terry Zhuo <t****5@g****m> | 10 months ago | |
| 3ac359c4 | update the result computation | Terry Zhuo <t****5@g****m> | 10 months ago | |
| 05329590 | feat: add new models | Terry Zhuo <t****5@g****m> | 10 months ago | |
| e5f27a98 | fix(generate): update the identifier for o1 and o3 | Terry Zhuo <t****5@g****m> | 11 months ago | |