The middle tier of AI still needs powerful hardware

This article is part of the VB Special Issue “Fit for Purpose: Adapting AI Infrastructure.” Catch all the other stories here.

As organizations look to build more AI applications and even AI agents, it is becoming increasingly clear that they need to use different language models and databases to achieve the best results.

However, switching an application from Llama 3 to Mistral on the fly may require some technical infrastructure finesse. This is where the context and orchestration layer comes in: the middle layer that connects the underlying models to applications, ideally controlling the traffic of API calls to the models to execute tasks.

That middle layer consists mainly of software, such as LangChain or LlamaIndex, which helps bridge databases and applications. But the question remains: will middleware remain solely a software concern, or does hardware have a role to play beyond powering the many models behind AI applications?
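To make the routing idea concrete, here is a minimal, hypothetical sketch of such a middle layer in plain Python. The task names, model choices and call_* helpers are illustrative stand-ins, not the actual API of LangChain, LlamaIndex or any other framework.

```python
def call_llama3(prompt: str) -> str:
    # Stand-in for an API call to a hosted Llama 3 endpoint.
    return f"[llama3] {prompt}"

def call_mistral(prompt: str) -> str:
    # Stand-in for an API call to a hosted Mistral endpoint.
    return f"[mistral] {prompt}"

# The routing table is the heart of the middle layer: it decides which
# model's API gets the traffic for each kind of task.
ROUTES = {
    "summarize": call_llama3,   # e.g. longer-context summarization
    "classify": call_mistral,   # e.g. cheap, fast classification
}

def orchestrate(task: str, prompt: str) -> str:
    """Route a task to the model registered for it.

    Swapping Llama 3 for Mistral becomes a one-line change to ROUTES
    instead of a rewrite of the application code.
    """
    handler = ROUTES.get(task)
    if handler is None:
        raise ValueError(f"no model registered for task: {task}")
    return handler(prompt)

print(orchestrate("summarize", "Summarize this quarterly report..."))
```

The design point is that the application talks only to orchestrate(); which model serves a task, and on which hardware it runs, stays a deployment decision behind the routing table.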

The answer is that hardware's role is to support frameworks like LangChain and the databases that bring applications to life. Enterprises need hardware stacks that can handle massive data flows, and in some cases devices that can do much of a data center's work locally.

“While it is true that AI middleware is primarily a software concern, hardware providers can significantly impact its performance and efficiency,” said Scott Gnau, head of data platforms at data management company InterSystems.

While software supports AI orchestration, it could not handle the heavy data movement without servers and GPUs.

In other words, for the software AI orchestration layer to work, the hardware layer must be smart and efficient, focusing on high-bandwidth, low-latency connections to data and models that can handle heavy workloads.

“This model orchestration layer needs to be powered by fast chips,” said Matt Candy, managing partner for generative AI at IBM Consulting, in an interview. “I could see a world where silicon/chips/servers could be optimized based on the type and size of the model used for different tasks, as the orchestration layer switches between them.”

Existing GPUs will do the job, if you can access them

John Roese, global CTO and chief AI officer at Dell, told VentureBeat that hardware, like that made by Dell, still has a role to play in that middleware.

“This is both a hardware and a software issue, because what people forget about AI is that it presents as software,” Roese said. “Software always runs on hardware, and AI software is the most demanding software we have ever developed, so you have to understand the performance layer, where the MIPS are and where the compute is, to make these things work properly.”

This AI middleware may need fast and powerful hardware, but there is no need for new dedicated hardware beyond the GPUs and other chips currently available.

“Of course hardware is an important enabler, but I don’t know if there’s any specific hardware that will really push this forward, other than GPUs that make models run faster,” Gnau said. “I think software and architecture are the places where you can really optimize the ability to minimize data movement.”

AI agents make AI orchestration even more important

The rise of AI agents has made strengthening the middle layer even more critical. When AI agents start talking to other agents and making multiple API calls, the orchestration layer directs that traffic, and fast servers become crucial.
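A hedged illustration of why agent traffic raises the stakes: each agent step is another routed model call, so one user request fans out into several API calls. The orchestrate() function below is a stand-in for the router from the earlier sketch, not a real framework call.

```python
def orchestrate(task: str, prompt: str) -> str:
    # Stand-in router; in the earlier sketch this dispatches the call
    # to whichever model is registered for the task.
    return f"[{task}] {prompt}"

def research_agent(question: str) -> str:
    """An agent that chains several orchestrated calls for one request."""
    plan = orchestrate("classify", f"Pick the steps that answer: {question}")
    answer = orchestrate("summarize", f"Write an answer following: {plan}")
    return answer

# One user request becomes multiple model calls; multiply that by agents
# talking to agents, and the routing layer becomes the hot path that
# fast servers have to keep up with.
print(research_agent("Which model should handle customer emails?"))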

“This layer also provides seamless API access to all the different types of AI models and technologies, and a seamless user experience layer that spans them all,” said IBM’s Candy. “I call this middleware stack the AI controller.”

AI agents are the hottest topic in the industry right now, and they will likely influence how businesses build much of their AI infrastructure going forward.

Roese added something else businesses should consider: on-device AI, another hot topic in the space. He said companies will want to think through when AI agents need to work locally, because an internet connection could go down.

“The second thing to consider is where you are running it,” Roese said. “That’s where things like the AI PC come in, because once I have a bunch of agents working on my behalf and they can talk to each other, they all need to be in the same place.”

He added that Dell is exploring the possibility of putting “concierge” agents on the device, “so if your internet connection drops, you can still do your job.”

The tech stack is ballooning now, but it won’t stay that way

Generative AI has allowed the technology stack to balloon, as more tasks are abstracted away and new databases, AIOps services and other service providers enter the GPU space. It won’t stay like this forever, said Uniphore CEO Umesh Sachdev, and businesses need to remember that.

“The tech stack has exploded but I think we will see it return to normal,” Sachdev said. “Eventually people will bring things in-house and demand for capacity on GPUs will decrease. The layer and vendor explosion happens all the time with new technologies, and we’ll see the same with AI.”

It’s clear that for enterprises, considering the entire AI ecosystem, from software to hardware, is the most logical approach to AI workflows.