The hidden challenges of AI development no one talks about


Yuichiro Chino/Getty Images

I have been a DigitalOcean customer for years. When I first encountered the company back in 2016, it provided a very easy-to-spin-up Linux server with a variety of distros as options. It differentiated itself from web hosting providers by offering infrastructure — rather than software — as a service. 

Dillon Erb, DigitalOcean VP (Image: Dillon Erb)

Most web hosting providers give you a control panel to navigate the web hosting experience for your site. You have no control over the virtual machine. What DigitalOcean does is give you a virtual bare-metal server, letting you do whatever the heck you want. This appealed to me greatly.

DigitalOcean was essentially Amazon Web Services (AWS) but with a much more understandable pricing structure. When I first started running servers on it, it was substantially less expensive than AWS for the kind of work I was doing. DigitalOcean has since expanded its service offerings to provide a wide variety of infrastructure capabilities, all in the cloud.

Beyond bare-metal virtual Linux servers, I haven’t used its additional capabilities, but I still appreciate the ability to quickly and easily spin up and down a Linux machine for any purpose, and at a very reasonable price. I do this to test out systems, to run some low-traffic servers, and generally as part of my extended infrastructure.
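If you want to try that workflow yourself, here is a minimal sketch of spinning a droplet up and tearing it down programmatically, using the community python-digitalocean client. The droplet name, region, and image slug below are illustrative; substitute any valid values from your own account.

```python
# Minimal sketch: create and destroy a DigitalOcean droplet with the
# community python-digitalocean client (pip install python-digitalocean).
# Names and slugs are illustrative, not prescriptive.
import os
import digitalocean

token = os.environ["DIGITALOCEAN_TOKEN"]  # API token from your DO account

droplet = digitalocean.Droplet(
    token=token,
    name="scratch-box",           # illustrative droplet name
    region="nyc1",                # data center region
    image="ubuntu-22-04-x64",     # distro image slug
    size_slug="s-1vcpu-1gb",      # smallest standard size
)
droplet.create()                  # provision the virtual machine

# ... run your tests or low-traffic workload ...

droplet.destroy()                 # tear it down so billing stops
```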

With the big push into artificial intelligence (AI), it makes sense that DigitalOcean is beginning to provide infrastructure for AI operations as well. That’s what we’ll be exploring today with Dillon Erb, the company’s vice president of AI advocacy and partnerships. Let’s dig in.

ZDNET: Could you provide a brief overview of your role at DigitalOcean?

Dillon Erb: I was the co-founder and CEO of Paperspace, the first dedicated GPU cloud computing company. In July of 2023, Paperspace was acquired by DigitalOcean to bring AI tooling and GPU infrastructure to a whole new audience of hobbyists, developers, and businesses alike.

Currently, I am the VP of AI Strategy, working on both exciting product offerings and key ecosystem partnerships to ensure that DigitalOcean can continue to be the go-to cloud for developers.

ZDNET: What are the most exciting AI projects you are currently working on at DigitalOcean?

DE: Expanding our GPU cloud to a much larger scale in support of rapid onboarding for a new generation of software developers creating the future of artificial intelligence.

Deep integration of AI tooling across the full DigitalOcean Platform to enable a streamlined AI-native cloud computing platform.

Bringing the full power of GPU compute and LLMs to our existing customer base to enable them to consistently deliver more value to their customers.

ZDNET: What historical factors have contributed to the dominance of large enterprises in AI development?

DE: The cost of GPUs is the most talked-about reason why it has been difficult for smaller teams and developers to build competitive AI products. The cost of pretraining a large language model (LLM) can be astronomical, requiring thousands, if not hundreds of thousands, of GPUs.

However, there has also been a tooling gap which has made it hard for developers to utilize GPUs even when they have access to them. At Paperspace, we built a full end-to-end platform for training and deploying AI models.

Our focus on simplicity, developer experience, and cost transparency continues here at DigitalOcean where we are expanding our product offering substantially and building deep integrations with the entire DigitalOcean product suite.

ZDNET: Can you discuss the challenges startups face when trying to enter the AI space?

DE: Access to resources, talent, and capital is a common challenge startups face when entering the AI arena.

Currently, AI developers spend too much of their time (up to 75%) on the “tooling” they need to build applications. Unless they have technology that reduces that tooling burden, these companies won’t be able to scale their AI applications. To add to the technical challenges, nearly every AI startup is reliant on NVIDIA GPU compute to train and run its AI models, especially at scale.

Developing a good relationship with hardware suppliers or cloud providers like Paperspace can help startups, but the cost of purchasing or renting these machines quickly becomes the largest expense any smaller company will run into.

Additionally, there is currently a battle to hire and keep AI talent. We’ve seen recently how companies like OpenAI are trying to poach talent from other heavy hitters like Google, which makes the process for attracting talent at smaller companies much more difficult.

ZDNET: What are some specific barriers that prevent smaller businesses from accessing advanced AI technologies?

DE: Currently, GPU offerings, which are crucial for the development of AI/ML applications, are largely affordable only to large companies. While everyone has been trying to adopt AI offerings or make their current AI offerings more competitive, demand for NVIDIA H100 GPUs has risen.

These data center GPUs have improved significantly with each new release of a GPU microarchitecture. These new GPUs are accelerators that significantly reduce training periods and model inference response times. In turn, they can run large-scale AI model training for any company that needs it.

However, the cost of these GPU offerings can be out of reach for many, making it a barrier to entry for smaller players looking to leverage AI.

Now that the initial waves of the Deep Learning revolution have kicked off, we are starting to see the increased capitalization and retention of technologies by successful ventures. The most notable of these is OpenAI, which achieved its huge market share by converting its GPT-3.5 model into the immensely successful ChatGPT API and web applications.

As more companies seek to emulate the success of companies like OpenAI, we may see more and more advanced technologies in Deep Learning not being released to the open-source community. This could affect startups if the gap between commercial and research model efficacy becomes insurmountable.

As the technologies get better, it may only be possible to achieve state-of-the-art results with certain models, like LLMs, through truly massive resource allocations.

ZDNET: How does DigitalOcean aim to level the playing field for startups and smaller businesses in AI development?

DE: Creating a level playing field in AI development is something that we, early on, recognized would be critical to the growth of the industry as a whole. While the top researchers in any field can justify large expenses, new startups seeking to capitalize on emerging technologies rarely have these luxuries.

In AI, this effect feels even more apparent. Training a Deep Learning model is almost always extremely expensive. This is a result of the combined costs of the hardware itself, data collection, and employees.

In order to ameliorate this issue facing the industry’s newest players, we aim to achieve several goals for our users: creating an easy-to-use environment, introducing inherent replicability across our products, and providing access at the lowest possible cost.

By providing a simple interface, we ensure startups don’t have to burn time or money training themselves on our platform. They simply need to plug in their code and go! This lends itself well to the replicability of work on DigitalOcean: it’s easy to share and experiment with code across all our products. Together, these combine to assist with the final goal of reducing costs.

At the end of the day, providing the most affordable experience with all of the functionality they require is the best way to meet startups’ needs.

ZDNET: How important is it for AI development to be inclusive of smaller players, and what are the potential consequences if it is not?

DE: The truth of the matter is that developing AI is incredibly resource-intensive. The steady, practically exponential growth in the size and complexity of Deep Learning datasets and models means that smaller players could be unable to attain the capital required to keep up with bigger players like the FAANG companies [Facebook/Meta, Apple, Amazon, Netflix, Google/Alphabet].

Furthermore, the vast majority of NVIDIA GPUs are being sold to hyperscalers like AWS or Google Cloud Platform. This makes it much more difficult for smaller companies to get access to these machines at affordable pricing due to the realities of the GPU supply chain.

Effectively, these practices reduce the number of diverse research projects that can potentially get funding, and startups may find themselves hindered from pursuing their work simply because of low machine availability. In the long run, this could cause stagnation or even introduce dangerous biases into the development of AI.

At DigitalOcean, we believe a rising tide lifts all boats, and that by supporting independent developers, startups, and small businesses, we support the industry as a whole. By providing affordable access with minimal overhead, our GPU Machines offer opportunities for greater democratization of AI development in the cloud.

Through this, we aim to give smaller companies the opportunity to use the powerful machines they need to continue pushing the AI revolution forward.

ZDNET: What are the main misconceptions about AI development for startups and small businesses?

DE: The priority should always be split evenly between optimizing infrastructure and software development. At the end of the day, Deep Learning technologies are entirely reliant on the power of the machines on which they are trained or used for inference.

It’s common to meet people who have fantastic ideas but misjudge how much work needs to be put into each of these areas. Startups can compensate for this with broad hiring practices to ensure that they do not end up stonewalled by a lack of development in a certain direction.

ZDNET: How can smaller companies overcome the knowledge gap in AI technology and development?

DE: Hiring the young entrepreneurs and enthusiasts who are making open-source technology popular is a great way to stay on top of the knowledge you need to succeed. Of course, hiring PhD-level senior developers and machine learning engineers will always give the greatest boost, but the young entrepreneurs popularizing these technologies are scrappy operators on the bleeding edge.

In the realms of popular technologies like Stable Diffusion and the Llama LLM, we can see this in real time today. There is a plethora of open-source projects, like ComfyUI and LangChain, that are taking the world by storm. It’s by drawing on both senior, experienced engineers and the newer developers behind these entrepreneurial open-source projects that I think startups can secure their future.

ZDNET: What advice would you give to entrepreneurs looking to integrate AI into their business models?

DE: Consider open-source options first. There are so many new businesses out there that are essentially repackaging an existing, popular open-source resource, especially LLMs. That means it is relatively simple to implement these yourself with a little practice. At the very least, any entrepreneur should learn the basic Python skills needed to run a basic LLM.
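As a concrete starting point for that advice, here is a minimal sketch of running a small open-source LLM in Python with the Hugging Face transformers library. The model name is illustrative; any small instruction-tuned model that supports text generation will work.

```python
# Minimal sketch: run a small open-source LLM locally with Hugging Face
# transformers (pip install transformers torch). The model name is
# illustrative; substitute any small text-generation model.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",  # illustrative model
)

output = generator(
    "In one sentence, why do LLMs need GPUs?",
    max_new_tokens=60,
)
print(output[0]["generated_text"])
```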

ZDNET: What future advancements in AI do you foresee that will particularly benefit startups and growing digital businesses?

DE: The cost of LLMs (especially for inference) is declining rapidly. Furthermore, the tooling and ecosystem of open-source model development is expanding rapidly. Combined, these are making AI accessible to startups of all scales, regardless of budget.

ZDNET: Any final thoughts or recommendations for startups looking to embark on their AI journey?

DE: The emergence of LLMs like GPT signaled a major leap in AI capabilities. These models didn’t just enhance existing applications; they opened doors to new possibilities, reshaping the landscape of AI development and its potential.

The scientists have built something that the engineers can now run with. AI is having an “API moment,” and this time the entire development process has been upended.

There are still big open questions [like], “How does one deal with non-deterministic APIs? What types of programming languages should we use to talk to this new intelligence? Do we use behavior-driven development, test-driven development, or AI-driven development?” And more.
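To make the first of those open questions concrete, here is one common pattern, sketched with the official openai Python client (the model name is illustrative): pin the temperature to zero to reduce variance, validate the output, and retry on failure, treating the model as an unreliable dependency rather than a deterministic function.

```python
# Hedged sketch: coping with a non-deterministic LLM API by validating
# output and retrying. Uses the official openai client (pip install openai);
# the model name is illustrative, and OPENAI_API_KEY must be set.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask_for_json(prompt: str, retries: int = 3) -> dict:
    """Ask the model for JSON and retry until it parses, up to `retries` times."""
    for _ in range(retries):
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # illustrative model name
            messages=[{"role": "user", "content": prompt}],
            temperature=0,        # reduces, but does not eliminate, variance
        )
        text = response.choices[0].message.content
        try:
            return json.loads(text)  # validate: parseable JSON counts as success
        except json.JSONDecodeError:
            continue                 # non-deterministic failure: try again
    raise ValueError("Model never returned valid JSON")
```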

The opportunity, however, is massive, and a whole new wave of category-defining startups will be created.

What do you think?

Did Dillon’s discussion give you any ideas about how to move forward with your AI projects? Let us know in the comments below.


You can follow my day-to-day project updates on social media. Be sure to subscribe to my weekly update newsletter, and follow me on Twitter/X at @DavidGewirtz, on Facebook at Facebook.com/DavidGewirtz, on Instagram at Instagram.com/DavidGewirtz, and on YouTube at YouTube.com/DavidGewirtzTV.
