
DeepSeek Launches NSA for Ultra-Fast Long-Context Training and Inference

2025-02-18 16:37:45

ChainCatcher news: according to Jin10, DeepSeek has launched NSA.

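DeepSeek describes NSA (Native Sparse Attention) as a hardware-aligned, natively trainable sparse attention mechanism for ultra-fast long-context training and inference. According to the company, a design optimized for modern hardware lets NSA speed up inference and lower pre-training costs without compromising performance.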

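On general benchmarks, long-context tasks, and instruction-based reasoning, DeepSeek says NSA performs on par with or better than full-attention models.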
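
For readers unfamiliar with the term, the general idea behind sparse attention can be illustrated with a small sketch. This is not DeepSeek's NSA implementation; it is a generic, simplified block-selection scheme written for illustration only. The function name `block_sparse_attend`, the mean-key block score, and the parameters `block_size` and `top_blocks` are all assumptions made for this example. A query scores each block of keys cheaply, keeps only the highest-scoring blocks, and runs ordinary softmax attention over the tokens in those blocks.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def block_sparse_attend(query, keys, values, block_size, top_blocks):
    """Illustrative single-query block-sparse attention (hypothetical helper).

    Returns (output_vector, attended_indices). Setting top_blocks to the
    total number of blocks makes this equivalent to full attention.
    """
    n, d = len(keys), len(query)
    scale = 1.0 / math.sqrt(d)

    # Partition token positions into contiguous blocks.
    blocks = [list(range(s, min(s + block_size, n)))
              for s in range(0, n, block_size)]

    # Cheap block score (an assumption of this sketch): query · mean key.
    def block_score(block):
        mean_key = [sum(keys[i][j] for i in block) / len(block)
                    for j in range(d)]
        return dot(query, mean_key)

    # Keep only the highest-scoring blocks.
    chosen = sorted(blocks, key=block_score, reverse=True)[:top_blocks]
    idx = sorted(i for block in chosen for i in block)

    # Ordinary scaled dot-product attention over the selected tokens only.
    weights = softmax([dot(query, keys[i]) * scale for i in idx])
    out_dim = len(values[0])
    output = [sum(w * values[i][j] for w, i in zip(weights, idx))
              for j in range(out_dim)]
    return output, idx


query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.9, 0.0], [-1.0, 0.0], [-1.0, 0.1]]
values = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]

sparse_out, sparse_idx = block_sparse_attend(query, keys, values,
                                             block_size=2, top_blocks=1)
full_out, full_idx = block_sparse_attend(query, keys, values,
                                         block_size=2, top_blocks=2)
print(sparse_idx, sparse_out)
print(full_idx, full_out)
```

In this toy input, the sparse call attends to only the first block of two tokens, while the second call covers all four. The cost of the final attention step scales with the number of selected tokens rather than the full context length, which is the kind of saving sparse attention methods aim for on long inputs.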
