llama.cpp

Oliver Simons 021cc28bef cuda : Fix Gemma3n not executed as CUDA_GRAPH on NVGPUs (#14741)	2025-07-18 04:35:32 -07:00

* Fix Gemma3n not executed as CUDA_GRAPH on NVGPUs
  Gemma3n uses Matrix-Matrix addition as part of their input processing, wrongly triggering CUDA_GRAPH disablement on NVGPUs even when batch-size of 1 is used.
* Exclude `project_per_layer_input` by matching node names
  This ensures that all other graphs which don't exhibit this pattern do not have their behavior changed.
* Revert unnecessary formatting changes

cmake	ggml-cpu : rework weak alias on apple targets (#14146)	2025-06-16 13:54:15 +08:00
include	ggml: Add initial WebGPU backend (#14521)	2025-07-16 18:18:51 +03:00
src	cuda : Fix Gemma3n not executed as CUDA_GRAPH on NVGPUs (#14741)	2025-07-18 04:35:32 -07:00
.gitignore	vulkan : cmake integration (#8119)	2024-07-13 18:12:39 +02:00
CMakeLists.txt	ggml: Add initial WebGPU backend (#14521)	2025-07-16 18:18:51 +03:00