Search Results for author: Ekaterina Aidova

Found 1 paper, 1 paper with code

Leveraging Speculative Sampling and KV-Cache Optimizations Together for Generative AI using OpenVINO

1 code implementation • 8 Nov 2023 • Haim Barad, Ekaterina Aidova, Yury Gorbachev

Inference optimizations are critical for improving user experience and reducing infrastructure costs and power consumption.

Quantization, Text Generation
