
Add llama_sampler_init for safe usage of llama_sampler_free #11727

Merged · 1 commit · Feb 7, 2025

Conversation

cfillion (Contributor) commented Feb 7, 2025

This pull request adds a new llama_sampler_init function to the public API so that libllama does both the allocation and deallocation of custom llama_sampler objects. The caller deals with the ctx and nothing else. This seems to be in line with the intent of the current API.

The C API in llama.h claims users can implement llama_sampler_i to create custom llama_sampler. The sampler chain takes ownership and calls llama_sampler_free on them. However, llama_sampler_free is hard-coded to use delete. This is undefined behavior if the object wasn't also allocated via new from libllama's C++ runtime. Callers in C and C-compatible languages do not use C++'s new operator. C++ callers may not be sharing the same heap as libllama.
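
For reference, the new entry point is roughly a constructor counterpart to llama_sampler_free; llama_sampler_context_t is the existing void * context typedef from llama.h:

// sketch of the added declaration: libllama allocates the llama_sampler
// wrapper itself, so the delete inside llama_sampler_free always matches
// an allocation made by the same C++ runtime
struct llama_sampler * llama_sampler_init(const struct llama_sampler_i * iface, llama_sampler_context_t ctx);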

(I kept the redundant struct keyword to match the style of the rest of the file despite the contributing guidelines.)

const std = @import("std");
const c = @cImport({
  @cInclude("llama.h");
});

pub fn main() !void {
  var gpa = std.heap.GeneralPurposeAllocator(.{}){};
  const allocator = gpa.allocator();

  var iface = std.mem.zeroes(c.llama_sampler_i);

  // crash
  {
    const smpl = try allocator.create(c.llama_sampler);
    smpl.iface = &iface;
    c.llama_sampler_free(smpl);
  }

  // workaround
  {
    var smpl: *c.llama_sampler = c.llama_sampler_init_greedy();
    std.debug.assert(smpl.ctx == null);
    smpl.iface = &iface;
    c.llama_sampler_free(smpl);
  }

  // solution
  {
    const smpl = c.llama_sampler_init(&iface, null);
    c.llama_sampler_free(smpl);
  }
}
$ zig run test.zig -Iinclude -Iggml/include -Lbuild/bin -lc -lllama
free(): invalid pointer
zsh: IOT instruction (core dumped)
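
For C callers, usage of the new function could look like the sketch below. This is only an illustration: the my_* names and the counter state are made up, and only the llama_sampler_i callback signatures plus llama_sampler_init itself are taken from llama.h.

#include <stdlib.h>
#include "llama.h"

// hypothetical per-sampler state stored behind ctx
struct my_ctx {
    int n_applied;
};

static const char * my_name(const struct llama_sampler * smpl) {
    (void) smpl;
    return "my-sampler";
}

static void my_apply(struct llama_sampler * smpl, llama_token_data_array * cur_p) {
    struct my_ctx * ctx = smpl->ctx;
    ctx->n_applied++;
    (void) cur_p; // custom sampling logic would go here
}

static void my_free(struct llama_sampler * smpl) {
    // release only the state allocated by the caller; the llama_sampler
    // wrapper is allocated and freed by libllama
    free(smpl->ctx);
}

static const struct llama_sampler_i my_iface = {
    .name  = my_name,
    .apply = my_apply,
    .free  = my_free, // remaining callbacks stay NULL
};

struct llama_sampler * my_sampler_create(void) {
    struct my_ctx * ctx = calloc(1, sizeof(*ctx));
    return llama_sampler_init(&my_iface, ctx); // wrapper allocated inside libllama
}

A sampler created this way can be handed to llama_sampler_chain_add as usual; when the chain later calls llama_sampler_free, the custom free callback releases only the ctx, and the wrapper goes back to the allocator that created it.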

Alternative

The caller could be made responsible for both the allocation and deallocation of custom sampler types.
That approach would allow flattening sampler objects (removing the ctx member and its second dynamic allocation).

It would be a breaking change:

  • llama_sampler_i::free becomes a mandatory field
  • llama_sampler_free doesn't assume how to deallocate
  • llama_sampler_clone becomes unable to clone samplers whose interface doesn't provide clone

Pseudo-code:

struct llama_sampler {
  const llama_sampler_i *vtable;
};

// C++
struct CustomSampler : llama_sampler {
  static void free(llama_sampler *smpl) { delete static_cast<CustomSampler *>(smpl); }
  CustomSampler() : llama_sampler {&custom_smpl_vtable} {}
  int m_field;
};
assert(custom_smpl_vtable.free == &CustomSampler::free);
llama_sampler_chain_add(chain, new CustomSampler);

// C
struct CustomSampler {
  llama_sampler base;
  int field;
};
struct CustomSampler *smpl = malloc(sizeof(struct CustomSampler));
smpl->base.vtable = &custom_smpl_vtable;
llama_sampler_chain_add(chain, (llama_sampler*)smpl);

// Zig
const CustomSampler = struct {
  base: c.llama_sampler,
  field: u32,
};
var smpl = try allocator.create(CustomSampler);
smpl.base.vtable = &custom_smpl_vtable;
llama_sampler_chain_add(chain, &smpl.base);
// var upcast: *CustomSampler = @fieldParentPtr("base", &smpl.base);

@ggerganov merged commit 7ee953a into ggerganov:master Feb 7, 2025
46 checks passed