Data Center Liquid Cooling Limits

AI芯片与散热 | 2025-11-13


CoolIT and Accelsius Push Data Center Liquid Cooling Limits Amid Soaring Rack Densities


By Matt Vincent, April 11, 2025

Image: Accelsius



Translator's Note

Heat dissipation has become one of the main factors limiting further increases in rack power density. Innovation in chip- and server-level cooling will remain a focus for equipment vendors and a key driver of higher rack power densities.


As racks climb toward 600 kW, vendors are delivering thermal innovations to match. Meanwhile, Dell’Oro’s revised forecast sees data center cooling and power distribution growing 14% annually through 2029.


Powered by Accelsius, NeuCool is a complete liquid cooling solution, using a highly efficient two-phase process and a dielectric refrigerant that is entirely safe for electronics. The NeuCool system supports 2200W+ per socket and up to 100kW per rack (80kW direct-to-chip cooling).
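
For context, a rough budget check of those figures is sketched below, assuming every socket draws the full 2,200 W and that only the 80 kW direct-to-chip share of the rack limit feeds sockets (the announcement does not break the numbers down this way):

```python
# Rough rack-budget sketch for the NeuCool figures quoted above.
# Assumptions: every socket draws the full 2,200 W, and only the
# 80 kW direct-to-chip share of the 100 kW rack limit serves sockets.

SOCKET_POWER_W = 2_200      # per-socket capability quoted by Accelsius
DTC_BUDGET_W = 80_000       # direct-to-chip portion of the rack limit
RACK_BUDGET_W = 100_000     # total per-rack capability

max_sockets = DTC_BUDGET_W // SOCKET_POWER_W      # 36 sockets at full load
residual_w = RACK_BUDGET_W - DTC_BUDGET_W         # 20,000 W left for air/rear-door cooling

print(f"Sockets coolable at full load: {max_sockets}")
print(f"Heat left for other cooling paths: {residual_w} W")
```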


The data center industry’s thermal baseline is shifting—fast. With NVIDIA’s next-gen racks targeting 600kW and AI workloads straining traditional infrastructure, cooling innovation has become a front-line imperative. That urgency is now being met with record-setting performance from two select liquid cooling leaders: Accelsius and CoolIT Systems. This article explains how their latest systems push the limits of what’s possible at the chip, rack, and row levels of the data center, offering critical pathways to support the explosive growth in accelerated computing.


These announcements land as the Dell’Oro Group raises its forecast for data center physical infrastructure to $61 billion by 2029, citing not just stronger-than-expected 2024 results, but also growing momentum among Tier 2 cloud providers and telco-backed AI buildouts. With rack power densities jumping from today’s 15 kW to as high as 120 kW in AI deployments, all of this news is evidence that the race is definitively on to build thermal and power infrastructure that can scale without compromise.


01

Accelsius Hits 4,500W Milestone as Two-Phase Liquid Cooling Heats Up for AI Racks

As next-generation AI workloads drive up power densities across GPUs, servers, and full racks, the challenge of thermal management is rapidly becoming existential for data center design. This week, Accelsius, an Austin-based cooling startup with deep roots in two-phase thermal technologies, announced that its NeuCool platform has achieved industry-leading performance benchmarks for direct-to-chip liquid cooling—setting new thresholds for power, temperature resilience, and scalability.


In a series of R&D tests simulating next-gen AI workloads, Accelsius pushed its NeuCool cold plate to 4,500 watts on a thermal test vehicle designed to mimic a GPU socket. That figure is not only the highest documented load handled by any direct-to-chip cooling solution to date—it also signals thermal headroom for future AI accelerators and edge inference devices already headed toward 4,000W+ TDPs. Importantly, the test ended not because of thermal failure, but because the test infrastructure itself hit its power limit—a meaningful distinction.


The second milestone came at the rack level. Accelsius demonstrated that its in-row two-phase CDU, paired with retrofitted cold plates on a four-way H100 server, could cool a fully loaded 250kW rack even when fed with facility water at 40℃—well above conventional thresholds. Using 375 liters per minute (LPM) of flow and standard PG25 coolant, the system kept GPU junction temperatures below NVIDIA’s thermal throttle limit (~87℃) even under full load, validating the resilience of two-phase cooling in warm-water scenarios.
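
A quick sanity check on those figures, using the sensible-heat relation Q = ṁ·cp·ΔT with assumed PG25 properties (the release does not state them), suggests the coolant picks up roughly 10°C across the rack, i.e., roughly a 40°C-in, 50°C-out loop:

```python
# Sensible-heat check: Q = m_dot * cp * dT, with assumed PG25 properties
# (25% propylene glycol); actual values vary with temperature and mix.

Q_W = 250_000                  # rack heat load [W]
FLOW_LPM = 375                 # quoted coolant flow [L/min]
RHO_KG_PER_L = 1.02            # assumed PG25 density
CP_J_PER_KG_K = 3_900          # assumed PG25 specific heat

m_dot = FLOW_LPM / 60 * RHO_KG_PER_L          # ~6.4 kg/s
delta_t = Q_W / (m_dot * CP_J_PER_KG_K)       # temperature rise across the rack

print(f"Coolant temperature rise: {delta_t:.1f} K")   # ~10 K
```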


According to Accelsius, this capability to operate at 6–8℃ higher inlet temperatures than typical single-phase systems could translate into over 25% energy savings for cooling, while also unlocking significantly more free cooling hours in diverse climates. The solution’s stability across 20℃, 30℃, and 40℃ inlet water tests suggests the platform is not only robust today, but adaptable to emerging infrastructure trends like 600kW Kyber racks and vertically oriented servers.


“We’re showing customers that we can easily meet current performance requirements and scale our performance to meet the needs of the recently announced 600kW racks,” said Dr. Richard Bonner, CTO at Accelsius. “Our R&D team has also prepared us for rapidly evolving chip and server architectures, such as 4,500W TDP sockets and vertically oriented blade servers.”


Accelsius will showcase both test results and ongoing cold plate research at Data Center World (Booth #524) in Washington, DC (April 15–17), followed by a technical presentation at the OCP EMEA Summit in Dublin on April 29. The latter will focus on cooling vertically mounted server blades, an increasingly relevant design element for high-density enclosures like NVIDIA’s Vera Rubin Ultra.


In a statement timed with the announcement, Accelsius CEO Josh Claman emphasized the strategic significance of thermal innovation for the AI age:


Jensen Huang's keynote at GTC highlighted the tremendous innovation we are experiencing in AI. Recent breakthroughs touch every aspect of the industry and exponentially increase the necessity for the infrastructure supporting AI to evolve at the same pace. We are seeing the limitations of current AI cooling infrastructure and must invest in solutions that can meet and scale with these future requirements. As AI systems grow more complex and power-intensive, in data centers and edge locations, the industry must prioritize advancements in cooling technologies to ensure that innovation isn’t bottlenecked by outdated infrastructure. This was particularly apparent in NVIDIA’s announcement of Vera Rubin Ultra, which at 600 kW per rack, speaks to the continued innovation required to power and cool future AI workloads.


02

CoolIT Raises the Bar with 1.5MW Row-Based CDU to Power AI’s Thermal Future

The Accelsius announcement highlights just how rapidly liquid cooling solutions are evolving—but they're not the only ones staking out leadership in this critical space. Just weeks earlier, Calgary-based CoolIT Systems introduced the CHx1500, a new high-water mark for row-based coolant distribution units (CDUs), combining raw cooling power, compact form factor, and serviceability in a design aimed squarely at the needs of AI and HPC deployments.


03

CoolIT CHx1500 Liquid-to-Liquid CDU

With a peak cooling capacity of 1,500 kW, CoolIT contends that the CHx1500 stands as the highest-performing liquid-to-liquid CDU of its class.


Developed in close collaboration with hyperscalers and leading processor manufacturers, the unit delivers 1.2 liters per minute per kilowatt (LPM/kW) at a 5°C approach temperature difference (ATD)—enabling support for the most thermally aggressive deployments on the roadmap, including 9 x NVIDIA GB200 NVL72 racks.
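
Scaling that specification to full load gives a feel for what the unit moves internally; the sketch below uses the same assumed PG25-like coolant properties as earlier and is an illustration, not a published CoolIT figure:

```python
# What 1.2 LPM/kW at 1,500 kW implies for secondary-loop flow and dT.
# Coolant properties are assumed (PG25-like), not published by CoolIT.

CAPACITY_KW = 1_500
FLOW_LPM_PER_KW = 1.2          # at a 5 C approach temperature difference
RHO_KG_PER_L = 1.02            # assumed
CP_J_PER_KG_K = 3_900          # assumed

flow_lpm = CAPACITY_KW * FLOW_LPM_PER_KW                 # 1,800 L/min
m_dot = flow_lpm / 60 * RHO_KG_PER_L                     # ~30.6 kg/s
delta_t = CAPACITY_KW * 1_000 / (m_dot * CP_J_PER_KG_K)  # ~12.6 K loop rise

print(f"Secondary flow: {flow_lpm:.0f} LPM, loop temperature rise: {delta_t:.1f} K")
```

Roughly speaking, the 5°C ATD means the secondary (server-side) supply runs about 5°C warmer than the primary (facility) supply at rated load.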


“The CHx1500 sets the standard for CDU performance,” said Neil Mulji, Vice President of Product at CoolIT. “It provides the best cost per kW while delivering the features and functionality our OEM and hyperscale customers demand.”


Beyond brute force, CoolIT's latest CDU is optimized for density and pressure. A comparison chart provided by the company shows the CHx1500 outperforming other major CDUs on virtually every front:


· 27% to 148% greater total cooling load than rival models.
· 53% to 463% higher cooling load density (up to 1,516 kW/m²).
· 26% to 75% higher secondary pressure head, peaking at 44 psi.


And it does so in a tight physical footprint: 750mm x 1200mm—a single rack-sized unit that maintains front and back serviceability. The design includes hot-swappable critical components, built-in 25-micron filters, and redundant systems, combining uptime reliability with operational flexibility. Intelligent onboard controls dynamically regulate temperature, flow, and pressure, accessible through a 10-inch touchscreen or remotely via Redfish, SNMP, Modbus, TCP/IP, and other common protocols.
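
Because the unit speaks standard management protocols, its telemetry can be folded into existing DCIM or observability tooling. The sketch below polls a Redfish endpoint with plain HTTP as an illustration; the resource path, field names, and credentials are assumptions for this example rather than CoolIT's documented schema, and SNMP or Modbus integration would follow a similar pattern:

```python
# Hypothetical Redfish polling sketch for a CDU-class device.
# The resource path and JSON field names are illustrative assumptions,
# not CoolIT's published schema; check the vendor's Redfish documentation.

import requests

CDU_HOST = "https://cdu.example.internal"           # placeholder address
RESOURCE = "/redfish/v1/ThermalEquipment/CDUs/1"    # assumed resource path

resp = requests.get(
    CDU_HOST + RESOURCE,
    auth=("monitor", "secret"),   # placeholder credentials
    verify=False,                 # BMC-class devices often use self-signed certs
    timeout=5,
)
resp.raise_for_status()
data = resp.json()

# Field access below is illustrative only.
print("Health:", data.get("Status", {}).get("Health"))
print("Payload keys:", sorted(data.keys()))
```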


The CHx1500’s construction reflects CoolIT’s 24 years of DLC experience, using stainless-steel piping and high-grade wetted materials to meet the rigors of enterprise and hyperscale data centers. It’s also designed to scale: not just for today's most power-hungry processors, but for future platforms expected to surpass today’s limits.


The CHx1500 is now available for global orders, and CoolIT offers full lifecycle support in over 75 countries, including system design, installation, CDU-to-server certification, and maintenance services—critical ingredients as liquid cooling shifts from a high-performance niche to a requirement for AI infrastructure at scale.


04

Capex Follows Thermals: Dell’Oro Forecast Signals Surge in Cooling and Rack Power Infrastructure

Between Accelsius and CoolIT, the message is clear: direct liquid cooling is stepping into its maturity phase, with products engineered not just for performance, but for mass deployment.


Still, technology alone doesn’t determine the pace of adoption. The surge in thermal innovation from Accelsius and CoolIT isn’t happening in a vacuum. As the capital demands of AI infrastructure rise, the industry is turning a sharper eye toward how data center operators account for, prioritize, and report their AI-driven investments.


To wit: According to new market data from Dell’Oro Group, the transition toward high-power, high-density AI racks is now translating into long-term investment shifts across the data center physical layer. Dell’Oro has raised its forecast for the Data Center Physical Infrastructure (DCPI) market, predicting a 14% CAGR through 2029, with total revenue reaching $61 billion.
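
Those two endpoints also pin down the implied starting point. Working backwards from $61 billion at 14% per year over five years (2024 to 2029, a horizon assumed from the article's framing) puts the 2024 base around $31-32 billion:

```python
# Back out the implied 2024 base from the forecast endpoints.
# Assumes the 14% CAGR applies over the five years from 2024 to 2029.

TARGET_2029_B = 61.0    # forecast 2029 revenue, $ billions
CAGR = 0.14
YEARS = 5

base_2024_b = TARGET_2029_B / (1 + CAGR) ** YEARS
print(f"Implied 2024 DCPI revenue: ~${base_2024_b:.1f}B")   # ~$31.7B
```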


That revision stems from stronger-than-expected 2024 results, particularly in the adoption of accelerated computing by both Tier 1 and Tier 2 cloud service providers. The research firm cited three catalysts for the upward adjustment:


· Accelerated server shipments outpaced expectations.
· Demand for high-power infrastructure is spreading to smaller hyperscalers and regional clouds.
· Governments and Tier 1 telecoms are joining the buildout effort, reinforcing AI as a decade-long infrastructure wave.


The report singles out thermal management as a defining pivot point. While average rack densities still hover around 15 kW, AI workloads are pushing requirements into the 60 to 120 kW range—well beyond the reach of traditional air cooling. As Dell’Oro founder Tam Dell’Oro noted, “The biggest change is unfolding in thermal management – the transition from air to liquid cooling.”
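
The arithmetic behind that claim is simple: rejecting 120 kW to air at a typical 15°C rack temperature rise (an assumed figure) would require on the order of 14,000 CFM through a single rack, which is impractical to move and to silence:

```python
# Why ~120 kW racks outrun air cooling: airflow needed at a 15 C delta-T.
# Air properties assumed at roughly standard conditions.

Q_W = 120_000          # rack heat load [W]
DELTA_T_K = 15         # assumed inlet-to-outlet air temperature rise
CP_AIR = 1_005         # J/(kg*K)
RHO_AIR = 1.2          # kg/m^3
CFM_PER_M3S = 2118.88  # unit conversion

m_dot = Q_W / (CP_AIR * DELTA_T_K)     # ~8 kg/s of air
vol_m3s = m_dot / RHO_AIR              # ~6.6 m^3/s
cfm = vol_m3s * CFM_PER_M3S            # ~14,000 CFM

print(f"Required airflow: {cfm:,.0f} CFM")
```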


That transition is already materializing in the product strategies and R&D roadmaps of vendors like Accelsius and CoolIT. Whether it’s the 4,500W socket-level tolerance demonstrated by Accelsius or the 1.5 MW rack-scale CDU performance of CoolIT’s CHx1500, the new generation of liquid cooling systems is being engineered to align directly with the rack-level demands cited by Dell’Oro.


The report also highlights geographic diversification, with North America, EMEA, and Asia Pacific (ex-China) leading growth.


Meanwhile, the study indicates that colocation providers—long relegated to trailing innovation curves—are now poised to take a central role in hosting inferencing infrastructure. This shift underscores the growing importance of flexible, serviceable, and efficient cooling platforms that can be deployed rapidly in shared environments.


05

From Cooling to Capex: The Infrastructure Flywheel is Spinning Up

Together, these announcements and forecasts underscore a broader thesis taking shape across the AI infrastructure ecosystem: Thermal and power innovations are no longer trailing indicators of IT change—they are leading enablers of what's next.


Liquid cooling is no longer just a specialty tech for labs and proof-of-concepts. It is now an investment-grade infrastructure category, validated by performance data, embraced by hyperscalers, and tracked in five-year market forecasts. For OEMs, colos, and cloud providers, the question is no longer whether to adopt advanced cooling, but how fast they can standardize it across their portfolios.


And with 600kW racks and vertically oriented servers coming into view, the pressure is quite literally on.



