
Bad Access Error When running large jobs #20

Open
spprichard opened this issue Apr 9, 2024 · 0 comments
Describe the bug
When processing requests to prompt a model over long periods of time, the application crashes with an EXC_BAD_ACCESS error.

To Reproduce
Call the response(to:) API repeatedly under a sustained workload:

let modelResponse = try await model.response(to: prompt)
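For reference, a minimal loop that drives the API the same way; LlamaModel and its initializer are illustrative placeholders for however the model is loaded, not the project's actual API:

// Illustrative sketch only: `LlamaModel(path:)` is a placeholder.
func runSustainedWorkload(prompts: [String]) async throws {
    for prompt in prompts {
        let model = try LlamaModel(path: "model.gguf")        // model loaded per work item
        let modelResponse = try await model.response(to: prompt)
        print(modelResponse)
        // `model` goes out of scope and is deallocated at the end of each iteration
    }
}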

Expected behavior
The application should be able to process many requests to prompt a model over sustained workloads.

Screenshots
[two screenshots of the crash and debugger output attached]

When running with Thread Sanitizer enabled, the application crashes with the following logs (see screenshot above):

llama_new_context_with_model:      Metal compute buffer size =   164.00 MiB
llama_new_context_with_model:        CPU compute buffer size =    12.01 MiB
llama_new_context_with_model: graph nodes  = 1062
llama_new_context_with_model: graph splits = 2
*** Terminating app due to uncaught exception 'NSInvalidArgumentException', reason: '-[MTLBufferLayoutDescriptorInternal stride]: unrecognized selector sent to instance 0x10a873060'
*** First throw call stack:
(
	0   CoreFoundation                      0x00000001862f2ccc __exceptionPreprocess + 176
	1   libobjc.A.dylib                     0x0000000185dda788 objc_exception_throw + 60
	2   CoreFoundation                      0x00000001863a502c -[NSObject(NSObject) __retain_OA] + 0
	3   CoreFoundation                      0x000000018625ca80 ___forwarding___ + 976
	4   CoreFoundation                      0x000000018625c5f0 _CF_forwarding_prep_0 + 96
	5   MetalTools                          0x0000000186a08dc0 -[MTLDebugComputeCommandEncoder dispatchThreadgroups:threadsPerThreadgroup:] + 716
	6   GPUToolsCapture                     0x00000001081eaf30 -[CaptureMTLComputeCommandEncoder dispatchThreadgroups:threadsPerThreadgroup:] + 116
	7   App                                 0x0000000101bfadd0 __ggml_metal_graph_compute_block_invoke + 16480
	8   libclang_rt.tsan_osx_dynamic.dylib  0x00000001065528c8 __wrap_dispatch_apply_block_invoke + 124
	9   libdispatch.dylib                   0x00000001063e6be4 _dispatch_client_callout2 + 20
	10  libdispatch.dylib                   0x0000000106402290 _dispatch_apply_invoke3 + 376
	11  libdispatch.dylib                   0x00000001063e6ba4 _dispatch_client_callout + 20
	12  libdispatch.dylib                   0x00000001063e8af0 _dispatch_once_callout + 160
	13  libdispatch.dylib                   0x00000001064011a8 _dispatch_apply_redirect_invoke + 272
	14  libdispatch.dylib                   0x00000001063e6ba4 _dispatch_client_callout + 20
	15  libdispatch.dylib                   0x00000001063fed0c _dispatch_root_queue_drain + 992
	16  libdispatch.dylib                   0x00000001063ff6d0 _dispatch_worker_thread2 + 188
	17  libsystem_pthread.dylib             0x000000010648fd04 _pthread_wqthread + 228
	18  libsystem_pthread.dylib             0x0000000106497a94 start_wqthread + 8
)
libc++abi: terminating due to uncaught exception of type NSException

Desktop (please complete the following information):

  • Chip: Apple M2 Pro
  • Memory: 32GB
  • OS: macOS 14.4

Additional context
I am attempting to create a server that can process many items of work, where each item must be run through an LLM. I am using Vapor as the HTTP/server framework and Redis to manage the queues of work. Each work item gets two passes through the LLM: the first pass generates information, which is used by the second pass. The model is kept in memory for the processing of one work item (two prompts to generate). Once the work item completes, the job stops and the model is deallocated.
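For concreteness, here is roughly how such a job is structured; this is a sketch assuming Vapor's Queues package with the Redis driver, and LlamaModel, its initializer, and the WorkItem payload are hypothetical placeholders rather than the project's real types:

import Vapor
import Queues

// Hypothetical payload shape; the real work item is not shown in this issue.
struct WorkItem: Codable {
    let prompt: String
}

struct LLMJob: AsyncJob {
    typealias Payload = WorkItem

    func dequeue(_ context: QueueContext, _ payload: WorkItem) async throws {
        // The model lives only for the duration of a single work item.
        let model = try LlamaModel(path: "model.gguf")         // placeholder initializer
        // 1st pass: generate intermediate information.
        let intermediate = try await model.response(to: payload.prompt)
        // 2nd pass: consume the 1st pass output.
        let final = try await model.response(to: intermediate)
        context.logger.info("work item complete (\(final.count) characters)")
        // `model` is deallocated when the job returns.
    }
}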
