Dnotitia Unveils STAR-KV, Achieving Up to 20x KV Cache Compression, Selected as an ICML 2026 Spotlight Paper

By Money Compass | July 1, 2026 | PR Newswire
  • Introduces a low-rank-based approach to KV cache compression, one of the key bottlenecks in long-context AI
  • Speeds up attention computation by up to 6.9x and overall generation throughput by up to 3.1x, moving beyond memory savings to faster inference
  • Selected as a Spotlight paper at ICML 2026, representing about 2.2% of reviewed submissions and about 8.4% of accepted papers
  • Following the attention around Google’s TurboQuant at ICLR 2026, STAR-KV presents another approach to advancing KV cache compression
  • Paper available on arXiv; source code released on GitHub

SEOUL, South Korea, July 2, 2026 /PRNewswire/ — Dnotitia Inc. (Dnotitia), a company specializing in long-term memory AI and semiconductor-based AI infrastructure technologies, has released the paper and source code for “STAR-KV: Low-Rank KV Cache Compression via Soft Thresholding for Adaptive Rank Control.” The technology was developed through a joint research effort involving UC San Diego’s VVIP Lab and Dnotitia researchers, and the paper was selected as a Spotlight paper at ICML 2026 (International Conference on Machine Learning 2026), one of the world’s leading conferences in machine learning.

Dnotitia contributed STAR-KV, selected as an ICML 2026 Spotlight Paper, achieving up to 20x KV cache compression and faster inference through low-rank compression and GPU optimization

In the experiments reported in the paper, low-rank compression alone reduced the KV cache by up to 75%. Combined with the mixed-precision quantization method proposed in the paper, STAR-KV compressed the full KV cache by up to 20x. The technology also improves computation speed through custom GPU kernels, increasing attention computation speed by up to 6.9x and overall generation throughput by up to 3.1x. STAR-KV also showed higher accuracy than major existing KV cache compression methods.
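
As a rough reading of how these headline figures relate, the arithmetic below converts the 75% reduction into a compression factor and asks what the quantization stage would have to contribute to reach 20x. This is only a sketch under two assumptions that the release does not state: that the two stages multiply, and that the uncompressed cache is stored at 16 bits per element. The function name `compression_factor` is hypothetical, and since each "up to" figure may come from a different experimental setting, the result is illustrative rather than a statement of the paper's configuration.

```python
def compression_factor(reduction_fraction: float) -> float:
    """Convert a fractional size reduction (0.75 = 75% smaller) to an 'Nx' factor."""
    if not 0.0 <= reduction_fraction < 1.0:
        raise ValueError("reduction_fraction must be in [0, 1)")
    return 1.0 / (1.0 - reduction_fraction)


# Figures quoted in the release.
LOW_RANK_REDUCTION = 0.75   # low-rank compression alone: up to 75% smaller
TOTAL_FACTOR = 20.0         # low-rank plus mixed-precision quantization: up to 20x

# Assumption (not from the release): the cache starts at 16 bits per element.
BASELINE_BITS = 16.0

low_rank_factor = compression_factor(LOW_RANK_REDUCTION)

# Assumption (not from the release): the two stages compose multiplicatively.
implied_quant_factor = TOTAL_FACTOR / low_rank_factor
implied_avg_bits = BASELINE_BITS / implied_quant_factor

print(f"low-rank factor:          {low_rank_factor:.1f}x")
print(f"implied quantization:     {implied_quant_factor:.1f}x")
print(f"implied average bitwidth: {implied_avg_bits:.1f} bits per element")
```

Under those assumptions, a 75% reduction corresponds to 4x, leaving roughly 5x for quantization, which is equivalent to an average of a little over 3 bits per stored element.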

KV cache compression has become a key technical challenge in AI infrastructure. As research into reducing the memory bottleneck of long-context AI gains momentum, including the attention around Google’s TurboQuant at ICLR 2026, STAR-KV presents a new approach that combines low-rank compression with quantization and GPU execution optimization.

The KV cache is temporary memory stored on the GPU so that a large language model (LLM) does not have to recompute context it has already processed. As AI evolves into agentic systems that use multiple documents, conversation history, code, search results, and outputs from external tools, the amount of context a model must process is growing rapidly. In this environment, the KV cache has emerged as a key bottleneck affecting both GPU memory usage and inference cost.
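
The idea of caching keys and values can be illustrated with a toy single-head attention loop. This is a minimal sketch and not STAR-KV or any production inference engine: the projections `to_key` and `to_value` are hypothetical stand-ins for learned weight matrices, and the `KVCache` class is an illustrative name. It shows that appending one key/value pair per step gives the same outputs as recomputing every key and value for the whole prefix, while the cache grows by one entry per token.

```python
import math


def to_key(token):
    """Hypothetical toy key projection (stands in for a learned matrix)."""
    return [2.0 * x for x in token]


def to_value(token):
    """Hypothetical toy value projection (stands in for a learned matrix)."""
    return [x + 1.0 for x in token]


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(dim)]


class KVCache:
    """Stores the keys and values of every token processed so far."""

    def __init__(self):
        self.keys = []
        self.values = []

    def append(self, key, value):
        self.keys.append(key)
        self.values.append(value)


def decode_with_cache(tokens):
    """Project each token once, then reuse the cached keys and values."""
    cache = KVCache()
    outputs = []
    for token in tokens:
        cache.append(to_key(token), to_value(token))
        outputs.append(attend(token, cache.keys, cache.values))
    return outputs, cache


def decode_without_cache(tokens):
    """Recompute keys and values for the whole prefix at every step."""
    outputs = []
    for step in range(1, len(tokens) + 1):
        prefix = tokens[:step]
        keys = [to_key(t) for t in prefix]
        values = [to_value(t) for t in prefix]
        outputs.append(attend(tokens[step - 1], keys, values))
    return outputs


tokens = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.5], [-0.2, 0.4]]
cached_outputs, cache = decode_with_cache(tokens)
recomputed_outputs = decode_without_cache(tokens)
```

The trade-off is the one the paragraph above describes: the cached version avoids repeated projection work, but `cache.keys` and `cache.values` hold one entry per token, so memory grows linearly with context length.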

According to the STAR-KV paper, when a LLaMA-3.1-8B model processes a 128K-token context at a batch size of 4, the KV cache accounts for about 81% of total GPU memory. As long-context AI becomes more widely used, KV cache compression is increasingly viewed as a core AI infrastructure technology for processing long context at lower cost.
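
A back-of-the-envelope calculation gives a figure in the same range. The sketch below assumes values that are not stated in the release: 32 layers, 8 key/value heads, and a head dimension of 128 (the publicly documented LLaMA-3.1-8B configuration as best recalled here), 16-bit storage, "128K" read as 131,072 tokens, about 8.03 billion parameters, and a memory total that counts only model weights and the KV cache. The paper's own accounting may include other terms, so this is a plausibility check and not a reproduction of its measurement.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim,
                   seq_len, batch_size, bytes_per_elem=2):
    """Bytes needed to hold keys and values for every layer, head, and token."""
    keys_and_values = 2  # one key tensor and one value tensor per layer
    return (keys_and_values * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_elem)


# Assumed model configuration (not taken from the release).
NUM_LAYERS = 32
NUM_KV_HEADS = 8
HEAD_DIM = 128
NUM_PARAMS = 8.03e9
BYTES_PER_ELEM = 2  # 16-bit weights and cache entries

# Workload quoted in the release: 128K-token context, batch size 4.
SEQ_LEN = 128 * 1024
BATCH_SIZE = 4

kv_bytes = kv_cache_bytes(NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM,
                          SEQ_LEN, BATCH_SIZE, BYTES_PER_ELEM)
weight_bytes = NUM_PARAMS * BYTES_PER_ELEM

# Simplification: ignore activations and framework overhead.
kv_share = kv_bytes / (kv_bytes + weight_bytes)

GIB = 2 ** 30
print(f"KV cache:      {kv_bytes / GIB:.1f} GiB")
print(f"Model weights: {weight_bytes / GIB:.1f} GiB")
print(f"KV share:      {kv_share:.1%}")
```

With these assumptions the cache comes to 64 GiB against roughly 15 GiB of weights, which puts the KV cache at about 81% of the combined total.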

ICML, where the STAR-KV paper was accepted, is widely regarded as one of the top international conferences in AI and machine learning, alongside NeurIPS and ICLR. ICML 2026 will be held from July 6 to 11 at COEX in Seoul. This year, 23,918 papers entered review, 6,352 were accepted, and 536 were selected as Spotlight papers. Spotlight papers account for about 2.2% of all reviewed submissions and about 8.4% of accepted papers.

Going forward, Dnotitia plans to further advance STAR-KV for use in real-world AI service environments and explore its application to open-source LLM inference frameworks such as vLLM.

“Technologies that help AI process longer context faster and at lower cost are advancing rapidly,” said MK Chung, CEO of Dnotitia. “STAR-KV addresses the core bottlenecks in KV cache capacity and attention processing speed, and Dnotitia aims to contribute to the AI inference ecosystem through open sourcing.”
