DeepSeek launches NSA for ultra-fast long-context training and inference

2025-02-18 16:37:45

ChainCatcher reports, citing Jin10, that DeepSeek has launched NSA (Native Sparse Attention).

DeepSeek describes NSA as a hardware-aligned, natively trainable sparse attention mechanism for ultra-fast long-context training and inference. According to the company, its design is optimized for modern hardware, which speeds up inference and reduces pre-training costs without compromising performance.

On general benchmarks, long-context tasks, and instruction-based reasoning, DeepSeek says NSA performs on par with or better than full-attention models.
