
DeepSeek launches NSA for ultra-fast long-context training and inference

2025-02-18 16:37:45

ChainCatcher news: according to Jin10, DeepSeek has launched NSA (Native Sparse Attention).

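DeepSeek says NSA is a hardware-aligned, natively trainable sparse attention mechanism for ultra-fast long-context training and inference. With a design optimized for modern hardware, NSA speeds up inference and lowers pretraining costs without compromising performance.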

On general benchmarks, long-context tasks, and instruction-based reasoning, NSA performs on par with or better than full attention models.
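
For readers unfamiliar with the term, the difference between full attention and sparse attention can be illustrated with a toy example. The sketch below is not DeepSeek's NSA design, which this report does not describe; it is a generic top-k variant, assumed here purely for illustration, in which a query attends only to its `k` highest-scoring keys rather than to all of them. The function names and the selection rule are hypothetical, and the toy still scores every key, so it shows the idea of sparsity rather than any real speedup.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def full_attention(query, keys, values):
    """Scaled dot-product attention for one query over all keys."""
    scale = 1.0 / math.sqrt(len(query))
    weights = softmax([dot(query, key) * scale for key in keys])
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


def topk_sparse_attention(query, keys, values, k):
    """Illustrative top-k sparse attention (an assumption, not NSA itself).

    Only the k keys with the highest scores take part in the softmax
    and the weighted sum; all other positions are ignored.
    """
    scale = 1.0 / math.sqrt(len(query))
    scores = [dot(query, key) * scale for key in keys]
    # Indices of the k highest-scoring keys.
    keep = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    weights = softmax([scores[i] for i in keep])
    dim = len(values[0])
    return [sum(w * values[i][d] for w, i in zip(weights, keep)) for d in range(dim)]


if __name__ == "__main__":
    query = [1.0, 0.0]
    keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
    values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    print(full_attention(query, keys, values))
    print(topk_sparse_attention(query, keys, values, k=1))
```

With `k` equal to the number of keys, the sparse variant reduces to full attention; with a smaller `k`, the output is built from fewer positions, which is the general trade-off sparse attention methods aim to manage.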
