
Add llama_sampler_init for safe usage of llama_sampler_free #11727

Merged · 1 commit · Feb 7, 2025

Conversation

cfillion (Contributor) commented Feb 7, 2025

This pull request adds a new llama_sampler_init function to the public API so that libllama does both the allocation and deallocation of custom llama_sampler objects. The caller deals with the ctx and nothing else. This seems to be in line with the intent of the current API.

The C API in llama.h claims users can implement llama_sampler_i to create custom llama_sampler. The sampler chain takes ownership and calls llama_sampler_free on them. However, llama_sampler_free is hard-coded to use delete. This is undefined behavior if the object wasn't also allocated via new from libllama's C++ runtime. Callers in C and C-compatible languages do not use C++'s new operator. C++ callers may not be sharing the same heap as libllama.
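
For reference, the new entry point is roughly a constructor counterpart to llama_sampler_free; llama_sampler_context_t is the existing void * context typedef from llama.h:

// sketch of the added declaration: libllama allocates the llama_sampler
// wrapper itself, so the delete inside llama_sampler_free always matches
// an allocation made by the same C++ runtime
struct llama_sampler * llama_sampler_init(const struct llama_sampler_i * iface, llama_sampler_context_t ctx);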

(I kept the redundant struct keyword to match the style of the rest of the file despite the contributing guidelines.)

const std = @import("std");
const c = @cImport({
  @cInclude("llama.h");
});

pub fn main() !void {
  var gpa = std.heap.GeneralPurposeAllocator(.{}){};
  const allocator = gpa.allocator();

  var iface = std.mem.zeroes(c.llama_sampler_i);

  // crash
  {
    const smpl = try allocator.create(c.llama_sampler);
    smpl.iface = &iface;
    c.llama_sampler_free(smpl);
  }

  // workaround
  {
    var smpl: *c.llama_sampler = c.llama_sampler_init_greedy();
    std.debug.assert(smpl.ctx == null);
    smpl.iface = &iface;
    c.llama_sampler_free(smpl);
  }

  // solution
  {
    const smpl = c.llama_sampler_init(&iface, null);
    c.llama_sampler_free(smpl);
  }
}
$ zig run test.zig -Iinclude -Iggml/include -Lbuild/bin -lc -lllama
free(): invalid pointer
zsh: IOT instruction (core dumped)
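
For C callers, usage of the new function could look like the sketch below. This is only an illustration: the my_* names and the counter state are made up, and only the llama_sampler_i callback signatures plus llama_sampler_init itself are taken from llama.h.

#include <stdlib.h>
#include "llama.h"

// hypothetical per-sampler state stored behind ctx
struct my_ctx {
    int n_applied;
};

static const char * my_name(const struct llama_sampler * smpl) {
    (void) smpl;
    return "my-sampler";
}

static void my_apply(struct llama_sampler * smpl, llama_token_data_array * cur_p) {
    struct my_ctx * ctx = smpl->ctx;
    ctx->n_applied++;
    (void) cur_p; // custom sampling logic would go here
}

static void my_free(struct llama_sampler * smpl) {
    // release only the state allocated by the caller; the llama_sampler
    // wrapper is allocated and freed by libllama
    free(smpl->ctx);
}

static const struct llama_sampler_i my_iface = {
    .name  = my_name,
    .apply = my_apply,
    .free  = my_free, // remaining callbacks stay NULL
};

struct llama_sampler * my_sampler_create(void) {
    struct my_ctx * ctx = calloc(1, sizeof(*ctx));
    return llama_sampler_init(&my_iface, ctx); // wrapper allocated inside libllama
}

A sampler created this way can be handed to llama_sampler_chain_add as usual; when the chain later calls llama_sampler_free, the custom free callback releases only the ctx, and the wrapper goes back to the allocator that created it.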

Alternative

The caller could be made responsible for both the allocation and deallocation of custom sampler types.
That approach would allow flattening sampler objects (removing the ctx member and its second dynamic allocation).

It would be a breaking change:

  • llama_sampler_i::free becomes a mandatory field
  • llama_sampler_free doesn't assume how to deallocate
  • llama_sampler_clone becomes unable to clone samplers whose interface doesn't provide clone

Pseudo-code:

struct llama_sampler {
  const llama_sampler_i *vtable;
};

// C++
struct CustomSampler : llama_sampler {
  static void free(llama_sampler *smpl) { delete static_cast<CustomSampler *>(smpl); }
  CustomSampler() : llama_sampler {&custom_smpl_vtable} {}
  int m_field;
};
assert(custom_smpl_vtable.free == &CustomSampler::free);
llama_sampler_chain_add(chain, new CustomSampler);

// C
struct CustomSampler {
  llama_sampler base;
  int field;
};
struct CustomSampler *smpl = malloc(sizeof(struct CustomSampler));
smpl->base.vtable = &custom_smpl_vtable;
llama_sampler_chain_add(chain, (llama_sampler*)smpl);

// Zig
const CustomSampler = struct {
  base: c.llama_sampler,
  field: u32,
};
var smpl = try allocator.create(CustomSampler);
smpl.base.vtable = &custom_smpl_vtable;
llama_sampler_chain_add(chain, &smpl.base);
// var upcast: *CustomSampler = @fieldParentPtr("base", &smpl.base);

@ggerganov merged commit 7ee953a into ggerganov:master Feb 7, 2025
46 checks passed