GitHub / abetlen/llama-cpp-python / commits
Python bindings for llama.cpp
| SHA | Message | Author | Date | Stats |
|---|---|---|---|---|
| 7440aaa3 | feat: update llama.cpp to f449e0553 (#2312) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | about 4 hours ago | |
| ddb6a058 | chore: bump version to 0.3.30 (#2311) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 4 days ago | |
| a8042331 | feat: add Pyodide wheel support (#2309) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 4 days ago | |
| 822146b7 | feat: update llama.cpp to e3a74b299 (#2310) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 4 days ago | |
| 824565a9 | feat: update llama.cpp to 6eab47181 (#2308) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 5 days ago | |
| 541b08cc | feat: update llama.cpp to ggml-org/llama.cpp@6e9007ae6 (#2307) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 5 days ago | |
| 3850aff7 | fix(ci): use C++ compiler for Docker builds (#2304) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 6 days ago | |
| e8070920 | fix(ci): skip mtmd CLI wrappers in package builds (#2303) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 6 days ago | |
| ddc0d15b | chore: bump version to 0.3.29 (#2302) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 6 days ago | |
| a52702fc | feat(example): use MTMD batch encoding (#2301) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 6 days ago | |
| 565d3c5c | feat: update llama.cpp to ggml-org/llama.cpp@f05cf4676 (#2300) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 6 days ago | |
| 65b50ca3 | feat: update llama.cpp to ggml-org/llama.cpp@3e7bd4f39 (#2298) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 8 days ago | |
| 19ea70cf | feat: update llama.cpp to ggml-org/llama.cpp@ac4cddeb0 (#2297) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 9 days ago | |
| b5eefc82 | feat: update llama.cpp to ggml-org/llama.cpp@76da2450a (#2295) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 10 days ago | |
| 0edb5d8d | feat: update llama.cpp to ggml-org/llama.cpp@e3471b3e7 (#2294) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 11 days ago | |
| 051dda25 | feat(example): support server video inputs and Gemma text tool calls (#2291) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 11 days ago | |
| e1079999 | feat: update llama.cpp to 8f83d6c27 (#2290) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 12 days ago | |
| e8191f00 | fix(example): correct GPT-OSS tool calling config for server example (#2289) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 12 days ago | |
| d4ac2c2c | fix(example): support multi-step Responses tool streaming (#2288) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 12 days ago | |
| 7eb494d8 | fix(ci): repair Linux accelerator wheels (#2286) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 12 days ago | |
| 411e0f40 | fix(example): derive streaming response parser boundaries from schema (#2287) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 12 days ago | |
| a72325b0 | fix(example): avoid duplicate streamed response deltas (#2285) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 12 days ago | |
| 8e470ac5 | chore: bump version to 0.3.28 (#2284) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 12 days ago | |
| fddee272 | feat(example): align server MTP support with llama.cpp (#2283) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 12 days ago | |
| db66da32 | feat: update llama.cpp to ggml-org/llama.cpp@9e3b928fd (#2282) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 12 days ago | |
| fe927bd0 | feat(example): add OpenAI-compatible embeddings endpoint (#2281) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 12 days ago | |
| 380177b7 | chore: bump version to 0.3.27 (#2279) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 13 days ago | |
| cf188306 | feat: update llama.cpp to ggml-org/llama.cpp@465b1f0e7 (#2278) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 13 days ago | |
| 66635a02 | feat(example): Updated server example (batch processing, `/v1/responses` api,... | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 13 days ago | |
| ed833664 | feat: update llama.cpp to 5a69c9743 (#2277) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 14 days ago | |
| 7f16fe19 | docs: add Gemma 4 QAT Colab notebook (#2276) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 14 days ago | |
| 7a2a36d7 | docs: fix Gemma 4 Colab notebook (#2275) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 15 days ago | |
| 8949066b | docs: add Gemma 4 Colab notebook (#2274) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 15 days ago | |
| 46849851 | fix(ci): build one riscv64 release wheel (#2273) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 15 days ago | |
| 67219897 | fix(ci): index all CUDA wheel variants (#2272) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 15 days ago | |
| 7c86eae0 | fix(ci): allow empty wheel indexes (#2271) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 15 days ago | |
| 78ac75e8 | fix(ci): repair release wheel workflows (#2270) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 15 days ago | |
| 5151ac7a | chore: bump version to 0.3.26 (#2269) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 15 days ago | |
| c2e22ae8 | feat: update llama.cpp to ggml-org/llama.cpp@7c158fbb4 (#2268) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 15 days ago | |
| 9013c1d3 | chore(deps): bump docker/setup-qemu-action from 3 to 4 (#2267)<br>Co-authored-by: dependabot[bot] <4****]@u****m>, Andrei <a****n@g****m><br>Signed-off-by: dependabot[bot] <s****t@g****m> | dependabot[bot] <4****]@u****m><br>Committed by: GitHub <n****y@g****m> | 15 days ago | |
| 23fe09f0 | chore(deps): bump softprops/action-gh-release from 2 to 3 (#2266)<br>Co-authored-by: dependabot[bot] <4****]@u****m>, Andrei <a****n@g****m><br>Signed-off-by: dependabot[bot] <s****t@g****m> | dependabot[bot] <4****]@u****m><br>Committed by: GitHub <n****y@g****m> | 15 days ago | |
| 6bccad59 | chore(deps): bump actions/configure-pages from 5 to 6 (#2265)<br>Co-authored-by: dependabot[bot] <4****]@u****m>, Andrei <a****n@g****m><br>Signed-off-by: dependabot[bot] <s****t@g****m> | dependabot[bot] <4****]@u****m><br>Committed by: GitHub <n****y@g****m> | 15 days ago | |
| 64c01752 | chore(deps): bump actions/upload-pages-artifact from 3 to 5 (#2264)<br>Co-authored-by: dependabot[bot] <4****]@u****m>, Andrei <a****n@g****m><br>Signed-off-by: dependabot[bot] <s****t@g****m> | dependabot[bot] <4****]@u****m><br>Committed by: GitHub <n****y@g****m> | 15 days ago | |
| 8edcd15e | chore(deps): bump docker/build-push-action from 6 to 7 (#2263)<br>Co-authored-by: dependabot[bot] <4****]@u****m><br>Signed-off-by: dependabot[bot] <s****t@g****m> | dependabot[bot] <4****]@u****m><br>Committed by: GitHub <n****y@g****m> | 15 days ago | |
| 2dae4770 | feat: Generic Multimodal Chat Handler (#2256) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 16 days ago | |
| 9f6efb07 | chore(deps): bump conda-incubator/setup-miniconda from 3.1.0 to 4.0.1 (#2261)<br>Co-authored-by: dependabot[bot] <4****]@u****m>, Andrei <a****n@g****m><br>Signed-off-by: dependabot[bot] <s****t@g****m> | dependabot[bot] <4****]@u****m><br>Committed by: GitHub <n****y@g****m> | 16 days ago | |
| 927dde21 | chore(deps): bump actions/deploy-pages from 4 to 5 (#2260)<br>Co-authored-by: dependabot[bot] <4****]@u****m>, Andrei <a****n@g****m><br>Signed-off-by: dependabot[bot] <s****t@g****m> | dependabot[bot] <4****]@u****m><br>Committed by: GitHub <n****y@g****m> | 16 days ago | |
| b46bccf4 | chore(deps): bump docker/login-action from 3 to 4 (#2259)<br>Co-authored-by: dependabot[bot] <4****]@u****m>, Andrei <a****n@g****m><br>Signed-off-by: dependabot[bot] <s****t@g****m> | dependabot[bot] <4****]@u****m><br>Committed by: GitHub <n****y@g****m> | 16 days ago | |
| df45432b | chore(deps): bump actions/checkout from 4 to 6 (#2258)<br>Co-authored-by: dependabot[bot] <4****]@u****m>, Andrei <a****n@g****m><br>Signed-off-by: dependabot[bot] <s****t@g****m> | dependabot[bot] <4****]@u****m><br>Committed by: GitHub <n****y@g****m> | 16 days ago | |
| d0990015 | chore(deps): bump actions/download-artifact from 4 to 8 (#2257) | dependabot[bot] <4****]@u****m><br>Committed by: GitHub <n****y@g****m> | 16 days ago | |
| e2d148a9 | feat: update llama.cpp to ggml-org/llama.cpp@e3ba22d6c (#2262) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 16 days ago | |
| cc2efc58 | feat: update llama.cpp to ggml-org/llama.cpp@94a220cd6 (#2255) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 16 days ago | |
| 8d2d2699 | feat: update llama.cpp (#2254) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 16 days ago | |
| ddaac103 | feat: update llama.cpp (#2253) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 17 days ago | |
| 3754c04a | feat(ci): add ROCm wheel builds (#2252) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 17 days ago | |
| ab7a9b0c | feat(ci): add Vulkan wheel builds (#2251) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 17 days ago | |
| 6e6c4e6d | chore(deps): bump actions/setup-python from 5 to 6 (#2247)<br>Co-authored-by: dependabot[bot] <4****]@u****m>, Andrei <a****n@g****m><br>Signed-off-by: dependabot[bot] <s****t@g****m> | dependabot[bot] <4****]@u****m><br>Committed by: GitHub <n****y@g****m> | 17 days ago | |
| f8bd67df | chore(deps): bump docker/setup-buildx-action from 3 to 4 (#2246)<br>Co-authored-by: dependabot[bot] <4****]@u****m>, Andrei <a****n@g****m><br>Signed-off-by: dependabot[bot] <s****t@g****m> | dependabot[bot] <4****]@u****m><br>Committed by: GitHub <n****y@g****m> | 17 days ago | |
| b439a84a | chore(deps): bump actions/upload-artifact from 4 to 7 (#2245)<br>Co-authored-by: dependabot[bot] <4****]@u****m><br>Signed-off-by: dependabot[bot] <s****t@g****m> | dependabot[bot] <4****]@u****m><br>Committed by: GitHub <n****y@g****m> | 17 days ago | |
| aa944e4c | ci: cache embedding test model (#2250) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 17 days ago | |
| dad5d0aa | chore(deps): bump actions/cache from 4 to 5 (#2248)<br>Co-authored-by: dependabot[bot] <4****]@u****m><br>Signed-off-by: dependabot[bot] <s****t@g****m> | dependabot[bot] <4****]@u****m><br>Committed by: GitHub <n****y@g****m> | 17 days ago | |
| bbdc8518 | chore(deps): bump pypa/cibuildwheel from 2.22.0 to 3.4.1 (#2249)<br>Co-authored-by: dependabot[bot] <4****]@u****m><br>Signed-off-by: dependabot[bot] <s****t@g****m> | dependabot[bot] <4****]@u****m><br>Committed by: GitHub <n****y@g****m> | 17 days ago | |
| d185d64a | fix: handle additional `from_pretrained` files in subfolders (#2085)<br>Co-authored-by: abetlen <a****n@g****m> | Markus <m****s@x****e><br>Committed by: GitHub <n****y@g****m> | 18 days ago | |
| f1bfa117 | chore: bump version to 0.3.25 (#2243) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 18 days ago | |
| 4b66c45e | feat: update llama.cpp to 210a6570c (#2242) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 18 days ago | |
| a9b480f8 | feat: add Gemma 4 multimodal chat support (#2241) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 19 days ago | |
| 927b574e | docs: add Python 3.14 classifier (#2240) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 19 days ago | |
| 718a1ca5 | feat(ci): add CUDA 13 wheel builds (#2239) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 19 days ago | |
| 43c92a7f | feat(ci): add CUDA 11.8 wheel builds (#2238) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 19 days ago | |
| c7af423e | fix(ci): add Pascal compute capability targets to CUDA wheel builds (#2237) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 19 days ago | |
| 26633bd1 | chore: bump version to 0.3.24 (#2236) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 19 days ago | |
| 20240609 | feat: update llama.cpp to af6528e6d (#2235) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 19 days ago | |
| c3adb354 | server types: Move 'model' parameter to clarify it is used (#1786) | adam jones <d****t@g****m><br>Committed by: GitHub <n****y@g****m> | 19 days ago | |
| 52cf7475 | docs: update ROCm install instructions (#1867) | Alex Grönholm <a****m@n****i><br>Committed by: GitHub <n****y@g****m> | 19 days ago | |
| da07e463 | docs: update llama.cpp build docs link (#2056) | Matthias <d****v@s****m><br>Committed by: GitHub <n****y@g****m> | 19 days ago | |
| 86871224 | docs: fix NanoLlava chat handler name in README (#2059) | Stefano Fiorucci <s****i@g****m><br>Committed by: GitHub <n****y@g****m> | 19 days ago | |
| e3aa6b5c | docs: fix typo in README (#2072) | Imad Saddik <7****k@u****m><br>Committed by: GitHub <n****y@g****m> | 19 days ago | |
| cdb7a755 | fix: clear prompt for recurrent / hybrid models when only a partial prefix ma...<br>Co-authored-by: Ralf Waldukat <r****t@g****m> | avion23 <1****3@u****m><br>Committed by: GitHub <n****y@g****m> | 19 days ago | |
| 73ee7cd5 | fix(docs): remove double word typo in README (#1791)<br>Co-authored-by: Andrei <a****n@g****m> | Victor Oluwadare <1****0@u****m><br>Committed by: GitHub <n****y@g****m> | 19 days ago | |
| 33bf9d24 | fix: correct typo in comments and settings description (#2121)<br>Co-authored-by: thecaptain789 <t****9@u****m> | thecaptain789 <2****9@u****m><br>Committed by: GitHub <n****y@g****m> | 19 days ago | |
| 84bc143b | fix: match Transformers `tojson` in chat template rendering (#1486)<br>Co-authored-by: abetlen <a****n@g****m> | Sigbjørn Skjæret <s****t@s****m><br>Committed by: GitHub <n****y@g****m> | 19 days ago | |
| 58480208 | fix: use env var configured multimodal library override paths when loading sh...<br>Co-authored-by: Andrei <a****n@g****m> | navratil-matej <6****j@u****m><br>Committed by: GitHub <n****y@g****m> | 19 days ago | |
| e8ee64b2 | feat: add Jinja2 loop controls to chat templates (#2018)<br>Co-authored-by: Andrei <a****n@g****m> | Joshua Turner <j****r@h****m><br>Committed by: GitHub <n****y@g****m> | 19 days ago | |
| fdf38b3e | fix: avoid cleanup errors for partially initialized LlamaModel (#2173)<br>Co-authored-by: abetlen <a****n@g****m> | usernames122 <8****2@u****m><br>Committed by: GitHub <n****y@g****m> | 20 days ago | |
| 6bdab5d8 | fix: suppress stdout and stderr in Jupyter notebooks (#2181)<br>Co-authored-by: Anai-Guo <2****o@u****m> | Tai An <a****1@o****m><br>Committed by: GitHub <n****y@g****m> | 20 days ago | |
| b91460b6 | feat: enable arm64 musl builds (#2221)<br>Co-authored-by: abetlen <a****n@g****m> | Alex O'Connell <3****6@u****m><br>Committed by: GitHub <n****y@g****m> | 20 days ago | |
| f160bf7a | Fix: model fails to load when chat template uses HuggingFace generation tags ...<br>Co-authored-by: abetlen <a****n@g****m> | Tobias <5****2@u****m><br>Committed by: GitHub <n****y@g****m> | 20 days ago | |
| 2c455a5b | feat: Update llama.cpp to d749821db (#2233) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 20 days ago | |
| 3bda0914 | docs: add contributing guide (#2229) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 27 days ago | |
| 52fe54bb | feat: Update llama.cpp to c0c7e147e (#2228) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | 27 days ago | |
| 5dd9b1ce | feat: Update llama.cpp to b9a2170fc (#2223) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | about 1 month ago | |
| c7bea711 | chore: migrate llama.cpp submodule to ggml-org (#2034)<br>Co-authored-by: abetlen <a****n@g****m> | shalinib-ibm <S****i@i****m><br>Committed by: GitHub <n****y@g****m> | about 1 month ago | |
| 7664a3ed | feat: Update llama.cpp to ggerganov/llama.cpp@91e84fed6 (#2218) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | about 1 month ago | |
| 95ccb191 | fix(embedding): set kv_unified=True when embedding=True to enable batch proce...<br>Co-authored-by: abetlen <a****n@g****m> | Sanjana Brahmbhatt <9****3@u****m><br>Committed by: GitHub <n****y@g****m> | about 1 month ago | |
| 4a1a8ecd | chore: bump version to 0.3.23 (#2215) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | about 1 month ago | |
| 56841123 | feat: update llama.cpp to 7d442abf (#2214) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | about 1 month ago | |
| f8c1f36b | fix(embed): mark all tokens as output to suppress llama.cpp 'overriding' INFO... | Tai An <a****1@o****m><br>Committed by: GitHub <n****y@g****m> | about 1 month ago | |
| f7746900 | feat: update llama.cpp to 5d6f18a63 (#2207) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | about 1 month ago | |
| 128c331b | fix: configure n_seq_max for batched embeddings (#2206) | Andrei <a****n@g****m><br>Committed by: GitHub <n****y@g****m> | about 1 month ago | |
| 90e8df95 | fix(_internals): use n_tokens0 offset when enabling last-token logits in add_... | Tai An <a****1@o****m><br>Committed by: GitHub <n****y@g****m> | about 2 months ago | |