[Inference PagedAttention] Integrate initial paged attention implementation into maxengine (2/N) #1686
| Job | Run time |
|---|---|
| | 7s |
| | 43s |
| | 4m 34s |
| | 4m 11s |
| | 7m 30s |
| | 20m 42s |
| | 7m 11s |
| | 7s |
| | 4s |
| Total | 45m 9s |