As AI is integrated into business operations, CIOs are under pressure to deliver the best results. With customers also demanding real-time responses, edge computing has emerged as a remedy for safe, efficient and accurate AI deployments. Andre Reitenbach, CEO of Gcore, shares more.
Andre Reitenbach, CEO of Gcore
While the origins of AI go back decades, the arrival of OpenAI's tools has put easy-to-use AI and machine learning in nearly everyone's hands, and many have taken advantage of it. In the past two years, advances in AI have seen a growing number of companies and startups building large language models to improve business operations and using AI to gain enterprise-level insights and improve decision-making.
CIOs are now under pressure to adopt AI more broadly across applications and business processes to deliver more personalized customer experiences and better manage sensitive data. But AI isn’t something you can take off the shelf and plug in. Integrating AI into an enterprise organization requires a deep understanding of the infrastructure that supports it and the requirements AI places on both computing power and energy usage as models scale.
According to the World Economic Forum, a 10x improvement in the efficiency of AI models could lead to a surge in demand for computing power of up to 10,000x, while the energy required to run AI tasks is growing by 26% to 36% every year.
In tandem with this is the desire for real-time responses, which is where edge AI becomes so important. For example, Apple has announced that Siri will gain a generative AI system that will not only answer one question at a time but also hold a conversation. Of course, Apple device users will expect this functionality to be available wherever they are in the world, which makes it entirely dependent on a global edge AI deployment that eliminates the latency of sending data to and from a remote cloud or data center.
Another key issue for CIOs to consider is privacy. We live in a globalized world, and leaders are increasingly concerned about data sovereignty: ensuring that the data they collect or store about citizens is protected without disrupting business operations. The question is not just how AI systems will manage and use data in certain sectors, such as healthcare or finance, but whether the infrastructure can run models for a particular location while complying with local regulations. Edge AI offers the advantage of increased privacy and security, with sensitive data kept local.
Key Steps to Preparing for Edge AI
Training and inference
When companies develop AI models, they typically go through three stages: training, inference, and distribution. Companies need powerful computing resources, such as GPUs, to train large AI models on large datasets. The world’s leading technology companies use hundreds of thousands of GPUs in this training phase, but CIOs just starting on this journey can begin more conservatively and scale their models over time as their use of AI becomes more complex.
Once a model is pre-trained, you can run inference on it: using the model to generate outputs such as text, images, or predictions from new data inputs. This inference stage often takes place in the cloud and, although far less demanding than training, still requires a fair amount of computing power.
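As an illustration of what an individual inference request looks like in practice, here is a minimal sketch using the open-source Hugging Face transformers library and the small distilgpt2 model; both are assumptions chosen for the example, not a reference to any particular production stack.

```python
# Minimal inference sketch: load a pretrained open model and generate text.
# `distilgpt2` is an illustrative placeholder; a production deployment would
# swap in a fine-tuned model and serve it behind an API.
from transformers import pipeline

# Load once at service start-up; loading the weights is the expensive part.
generator = pipeline("text-generation", model="distilgpt2")

def answer(prompt: str) -> str:
    # Each request is a single forward pass -- far cheaper than training,
    # but still GPU-bound at scale.
    result = generator(prompt, max_new_tokens=50, do_sample=True)
    return result[0]["generated_text"]

if __name__ == "__main__":
    print(answer("Edge AI matters because"))
```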
Finally, the outputs of inference need to be distributed globally to serve end users around the world with minimal latency. Large content delivery networks (CDNs) with points of presence (PoPs) around the world (we have over 180 PoPs) are ideal for this purpose, using edge AI to run models closer to end users. The closer your end customers are to a PoP, the faster they can interact with your AI model.
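To make the proximity point concrete, the toy sketch below picks the closest PoP to a user by great-circle distance. The PoP coordinates are invented placeholders, and real CDNs steer traffic with anycast routing and DNS rather than application code.

```python
# Toy illustration of PoP proximity -- real CDNs steer traffic with anycast
# routing and DNS, not application code. Coordinates are examples only.
from math import radians, sin, cos, asin, sqrt

POPS = {  # hypothetical PoP locations: (latitude, longitude)
    "frankfurt": (50.11, 8.68),
    "singapore": (1.35, 103.82),
    "sao-paulo": (-23.55, -46.63),
}

def haversine_km(a, b):
    """Great-circle distance between two (lat, lon) points in km."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(h))

def nearest_pop(user_location):
    # A shorter distance roughly means a lower round-trip time for inference.
    return min(POPS, key=lambda name: haversine_km(user_location, POPS[name]))

print(nearest_pop((48.85, 2.35)))  # a user in Paris -> "frankfurt"
```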
Low latency and low cost
Like other digital services, AI applications require low latency to deliver a responsive user experience. That demand was initially driven by the needs of gaming, e-commerce, finance, and entertainment; AI is following the same trend and benefits from secure, high-performance global networks and edge computing resources.
Centralized clouds offer vast computing power for training, but struggle to match edge platforms on privacy, low latency, and cost-effectiveness. CIOs will benefit from distributed AI inference, which can cost-effectively serve less intensive inference workloads to end users on an on-demand, pay-as-you-go basis. Costs are calculated per GPU consumed, allowing businesses to start with basic GPUs and elastically scale resources as usage grows.
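As a back-of-the-envelope illustration of the pay-as-you-go model, the sketch below estimates a monthly inference bill; the hourly rates and throughput figures are invented placeholders, not any provider's actual pricing.

```python
# Back-of-the-envelope pay-as-you-go GPU cost model. Rates and throughput
# figures are invented placeholders -- substitute your provider's pricing.
GPU_RATES_PER_HOUR = {"basic": 0.50, "mid": 1.50, "high-end": 4.00}

def monthly_inference_cost(requests_per_day, requests_per_gpu_hour, tier="basic"):
    gpu_hours = requests_per_day * 30 / requests_per_gpu_hour
    return gpu_hours * GPU_RATES_PER_HOUR[tier]

# Start small, then move up a tier or add GPUs only when demand grows.
print(f"${monthly_inference_cost(100_000, 5_000):.2f}/month")  # -> $300.00/month
```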
Use cases and challenges
Companies are still working out the potential use cases for AI across industries, from automation systems and robotics to human resources and customer service chatbots. But they also need to weigh factors such as data management, compliance, skills, and budgets.
Large language models like ChatGPT offer powerful yet general-purpose AI capabilities. For production use, enterprises may prefer to fine-tune such models with their own data to increase relevance, accuracy, and control. They can leverage open-source models or contract AI experts to develop customized private models.
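As one illustrative route (assuming the open-source Hugging Face transformers, peft, and datasets libraries; the model name and data file are placeholders), parameter-efficient fine-tuning such as LoRA lets a team adapt an open model on its own data without retraining every weight:

```python
# Sketch of parameter-efficient fine-tuning (LoRA) on an open model with
# private data. Assumes the `transformers`, `peft`, and `datasets` libraries;
# the model name and data file are illustrative placeholders.
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)
from peft import LoraConfig, get_peft_model
from datasets import load_dataset

base = "distilgpt2"  # placeholder; any open model you are licensed to tune
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA trains small adapter matrices instead of all model weights,
# keeping GPU and cost requirements modest.
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM"))

def tokenize(example):
    out = tokenizer(example["text"], truncation=True, max_length=256,
                    padding="max_length")
    out["labels"] = out["input_ids"].copy()  # causal LM: predict the next token
    return out

# "company_docs.jsonl" is a hypothetical file of {"text": ...} records.
data = load_dataset("json", data_files="company_docs.jsonl")["train"].map(tokenize)

Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=data,
).train()
```

Because only the small adapter layers are updated, a sketch like this can run on a single modest GPU, which fits the "start conservatively, scale later" approach described above.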
Some of the key considerations for CIOs include: understanding the specific use cases and desired outcomes that suit their business; evaluating build-versus-buy options for model development based on in-house AI expertise and available resources; choosing scalable infrastructure for training and inference; determining the best route to low latency; handling data governance and privacy; and optimizing costs.
Overall, AI offers a transformative opportunity, but operationalizing it requires careful planning around technology infrastructure, economic models, use cases, and data strategies. Working with specialized AI cloud and consulting providers who understand and can deliver on the promise of AI at the edge can accelerate effective AI adoption.