
Dell Brings Turnkey GPUaaS to VMware Using Bitfusion

Dell EMC is bringing a new GPU-as-a-Service (GPUaaS) offering to market. On top of the GPU hardware, Dell EMC is using software from VMware, including technology from its Bitfusion acquisition, to help drive adoption and utilization of accelerated computing. Instead of targeting the leading companies that have already deployed an AI or HPC solution, Dell EMC is hoping to capture the next wave of adoption by making the task easier.


Dell is using this graphic to frame the conversation. If 14.6% of the market is using AI today, those are the early adopters who lead the field. That leaves 85.4% of the market that is not leading, and that is the group Dell hopes to serve with its solutions.

Dell Brings Turnkey GPUaaS for AI and HPC
As part of its broader strategy, Dell Technologies is leaning on its VMware integration to bring GPU-accelerated AI and HPC to vSphere. One will notice that while the company says this is for HPC, it is not for high-performance supercomputer HPC. Instead, it can be used for “some HPC workloads.”

Most companies that have embarked on this journey have already solved this problem. Indeed, Kubernetes has made solving the basic problem Dell points out trivial. If you want to run VMware Tanzu and have a typical VMware environment, however, your available GPUs likely sit in separate silos. A great example is GPUs that are used for VDI during the day but sit idle in the evening. Likewise, different groups may each have small GPU servers. Dell EMC knows that this leaves expensive hardware underutilized and is looking to make it more efficient.

With the Bitfusion acquisition, VMware vSphere admins can manage pools of GPUs. They can allocate GPUs and GPU memory to different user groups from one pool.
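To make the pooling idea concrete, here is a minimal sketch of handing out GPU memory to different user groups from one shared pool. It is a toy model only: the `Gpu` and `GpuPool` classes, the first-fit placement policy, and the group names are all assumptions made for illustration, and none of this is Bitfusion's or vSphere's actual API.

```python
from dataclasses import dataclass, field


@dataclass
class Gpu:
    """One physical GPU in the pool (hypothetical model)."""
    name: str
    memory_gb: float
    free_gb: float = field(init=False)

    def __post_init__(self) -> None:
        # A freshly added GPU has all of its memory available.
        self.free_gb = self.memory_gb


class GpuPool:
    """Toy shared GPU pool; an illustration, not Bitfusion's real interface."""

    def __init__(self, gpus):
        self.gpus = list(gpus)
        self.leases = {}  # group name -> list of (gpu, gigabytes) leases

    def allocate(self, group, memory_gb):
        """First-fit: lease memory on the first GPU with enough free space."""
        for gpu in self.gpus:
            if gpu.free_gb >= memory_gb:
                gpu.free_gb -= memory_gb
                self.leases.setdefault(group, []).append((gpu, memory_gb))
                return gpu.name
        return None  # no single GPU can satisfy the request

    def release(self, group):
        """Return everything a group holds to the pool."""
        for gpu, memory_gb in self.leases.pop(group, []):
            gpu.free_gb += memory_gb

    def utilization(self):
        """Fraction of total pool memory currently leased."""
        total = sum(g.memory_gb for g in self.gpus)
        used = sum(g.memory_gb - g.free_gb for g in self.gpus)
        return used / total


# Assumed example hardware: one 32 GB V100 and one 16 GB T4.
pool = GpuPool([Gpu("v100-0", 32), Gpu("t4-0", 16)])
pool.allocate("data-science", 16)  # half of the V100
pool.allocate("vdi", 16)           # the other half of the V100
pool.allocate("analytics", 8)      # half of the T4
print(f"utilization: {pool.utilization():.0%}")

pool.release("vdi")                # e.g. VDI desktops shut down in the evening
print(f"utilization after release: {pool.utilization():.0%}")
```

In this sketch, memory released by one group (the VDI users at the end of the day, for instance) immediately becomes available to any other group, which is the efficiency argument for pooling GPUs rather than leaving them in silos.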

Overall, the combination of Dell EMC PowerEdge servers, PowerSwitch networking, and Isilon storage is designed as a Ready Solution that can be quickly deployed according to a pre-defined formula.

The formula for the GPUaaS Ready Solution for AI uses fairly standard Intel and NVIDIA components.

As a quick aside, Dell EMC’s offering is what one would expect from a larger corporate offering. Still, if you look at what leading AI companies such as Tesla, and even smaller but well-funded autonomous driving companies like Zoox, use, they are not building out massive arrays of NVIDIA V100s and T4s as their GPUs of choice. Those companies use different GPUs in systems more similar to the Dell EMC DSS 8440 for their scale-out GPU clusters. Higher-end scale-up NVIDIA-based AI work will happen on NVSwitch-based solutions such as the Inspur NF5488M5 we reviewed, the new DGX A100 (when it is available), new HGX-2 based 16x GPU solutions, and the like.

Dell’s offering is not necessarily focused on those who see AI infrastructure as a critical competency, but rather on IT departments that want to provide solutions based on Dell EMC and VMware. We reviewed the Dell EMC PowerEdge R740xd and know that Dell has many great corporate HPC/AI customers for the Dell EMC PowerEdge C4140, but this is a different solution for a different type of customer.

The HPC solution is interesting because it builds upon the AI solution, with additional hardware options such as Mellanox (now NVIDIA) InfiniBand as well as AMD EPYC-based nodes. This is a good indication that Dell EMC is seeing interest in AMD EPYC Rome and future Milan CPUs in the HPC space.

Dell also has scale-out storage solutions available. One will quickly notice that the HPC design has a lot more options on its validated hardware.

Final Words
In a pre-briefing, Dell said that not only will the offering be part of its Ready Solutions lineup, but the company will go one step further and configure the rack at the factory so that it is a turnkey experience.

Overall, this makes a lot of sense. Dell has an enormous market within its customer base of Dell-VMware shops that can utilize this type of solution. Although the leading-edge deployments are always interesting, one can say that the sweet spot is making the technology available to more buyers. Perhaps the underlying message here is not that Dell EMC has a solution aimed at the leading-edge ~15%. Instead, it is that Dell EMC has a solution that is going after the next 70% of customers. As Dell and VMware make it easier to deploy AI infrastructure, a natural separation will occur in which those who do not adopt, the last 15% or so, will not be able to compete. In a sense, by democratizing the technology with VMware and Bitfusion, Dell can help make structural winners and losers driven by the CIO’s staff at large corporations. Perhaps heading into the next economic cycle, it is those who take the opportunity to integrate now that make it out.
