Introduction
Generative AI applications are going to play a crucial role in improving productivity, efficiency and quality, and in reducing costs for Enterprises. Generative AI applications generate content from a natural language text-based prompt, using Large Language Models (LLMs). LLMs are AI models trained on large volumes of data from the internet and other sources, and they produce human-like responses to a query (a prompt). They can be used to generate text, images, video, music and other forms of content. Generative AI will bring in new types of automation given its ability to take natural language input, process it semantically and generate an output. Automation in the past has focused mainly on repetitive tasks and less on cognitive, knowledge-based tasks.
The year 2023 saw a phenomenal pace of development in Generative AI: launches of several different types of applications, back-to-back releases of advanced LLMs, and a stream of new open source as well as closed source models.
Given the fast-changing landscape of Generative AI technology, it is important to be cognizant of where Generative AI applications are headed and to build applications so that they are not locked in to, or dependent on, a specific LLM. Several LLM choices are available to application builders, each with its own pros and cons, and several events have already demonstrated the need to decouple applications from the underlying LLM as much as possible.
DALL·E-generated image on adaptable solutions
Need-based LLM selection
Generative AI can be used for a wide variety of Enterprise and Consumer use cases. For Enterprises especially, it is crucial to build applications that ensure reliability, availability and security. Choosing an LLM has to be based on your business and application needs. Several factors play a role in LLM selection; some of the key ones are listed below:
· The task you want to address.
· Response quality of the LLM.
· Cost of the overall solution.
· Latency of LLM responses.
· Throughput — the volume of requests towards the LLM.
· Technical expertise available in the Enterprise.
· Availability of the LLM.
· Security needs of the user.
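As an illustration, these factors can be combined into a simple weighted score per candidate model. The weights and per-factor ratings below are hypothetical assumptions, not benchmarks; each Enterprise would set its own.

```python
# Hypothetical weighted scoring to compare candidate LLMs.
# Weights and per-factor ratings (0-10) are illustrative assumptions.

WEIGHTS = {
    "response_quality": 0.30,
    "cost": 0.25,
    "latency": 0.15,
    "throughput": 0.10,
    "availability": 0.10,
    "security": 0.10,
}

def score(ratings: dict) -> float:
    """Weighted sum of per-factor ratings for one candidate LLM."""
    return sum(WEIGHTS[f] * ratings[f] for f in WEIGHTS)

# Two made-up candidates, rated against the factors above.
candidates = {
    "closed-source-llm": {"response_quality": 9, "cost": 5, "latency": 7,
                          "throughput": 8, "availability": 9, "security": 6},
    "open-source-llm":   {"response_quality": 7, "cost": 8, "latency": 6,
                          "throughput": 6, "availability": 7, "security": 9},
}

best = max(candidates, key=lambda name: score(candidates[name]))
print(best, round(score(candidates[best]), 2))
```

A real evaluation would of course be richer (benchmarks, trials, security reviews), but making the weights explicit forces the selection criteria to be stated up front.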
Several LLM models are now available, both open source and closed source, and each approach has its own advantages and disadvantages. Both are covered briefly below.
Closed source LLM models
These include models like OpenAI GPT, Anthropic Claude, Google PaLM and others. Closed source models are hosted by the providers in their own environments, and they expose APIs for application developers to use on a paid basis. Closed source models are usually charged on a pay-as-you-go model.
Advantages of Closed source LLM models:
· Ease of use by means of exposed APIs.
· Infrastructure simplicity — users don’t have to worry about hosting complexity.
· Up to date models.
· Scaled solution — users don’t have to worry about scaling the LLM.
Disadvantages of Closed source LLM models:
· Cost, which can grow steeply if your volume of requests to the LLM becomes very high.
· Dependency or lock-in to the LLM provider.
· Data privacy/residency concerns.
· Limited customization control to make any specific change.
Open source LLM models
These include models like Llama 2, Mistral and others. These models are available for users and developers to download and host in the user’s own environment. Thus the responsibility for hosting, scaling and maintaining these models lies with the user.
Advantages of Open source LLM models:
· Cost control when the volume of requests to the LLM grows very high.
· LLM management and control is completely with the user.
· Data privacy/residency concerns can be managed well.
· Customization as needed can be done by user/developer.
Disadvantages of Open source LLM models:
· Advanced technical expertise is needed by the user/developer to host, scale and maintain the LLM in the user’s infrastructure.
· Infrastructure complexity — LLMs are demanding to host and require complex infrastructure.
· Scaling responsibility lies with the user, who has to ensure that adequate resources are provisioned as the volume load grows.
· Maintenance of the LLM is the responsibility of the user. Instantiation, monitoring and upgrades/downgrades all have to be taken care of by the user.
Given the above, you can see there is no one straightforward answer as to which is better or worse; it all depends on the user’s needs and the factors listed earlier. It is good to start with a closed source model for POCs and trials, given its initial simplicity. Even in production, closed source models can be used until a reasonable request volume to the LLM is reached. Once the volume justifies hosting your own open source model, that is a good changeover point to optimize cost.
Cost Comparison Graph of LLMs
Above is an indicative cost comparison graph for LLMs; the actual curves will vary based on the specific LLMs you use. For closed source the curve is linear, reflecting pay-as-you-go pricing, whereas for open source models it is a stepped curve: a fixed hosting cost irrespective of volume, increasing by another step each time volume crosses a particular threshold.
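The changeover point can be estimated with simple break-even arithmetic. The prices below are purely hypothetical placeholders, not actual vendor rates.

```python
# Illustrative break-even between pay-as-you-go (closed source) and
# fixed-cost self-hosting (open source). All figures are assumptions.

CLOSED_COST_PER_1K_TOKENS = 0.002   # hypothetical API price, USD
OPEN_FIXED_MONTHLY_COST = 3000.0    # hypothetical GPU hosting cost, USD/month

def closed_monthly_cost(tokens_per_month: float) -> float:
    """Linear pay-as-you-go cost for a closed source API."""
    return tokens_per_month / 1000 * CLOSED_COST_PER_1K_TOKENS

def breakeven_tokens_per_month() -> float:
    """Volume at which self-hosting starts to pay off (first cost step)."""
    return OPEN_FIXED_MONTHLY_COST / CLOSED_COST_PER_1K_TOKENS * 1000

print(f"break-even: {breakeven_tokens_per_month():,.0f} tokens/month")
```

With these made-up numbers the break-even is 1.5 billion tokens per month; below that volume the pay-as-you-go line stays under the fixed hosting cost, above it self-hosting wins (until the next capacity step).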
Building Applications agnostic to a specific LLM
It is good to have multiple LLM options given the varying needs of users. At the same time, advances keep happening in LLMs and other Generative AI tools, which makes it very important to build Generative AI applications in a manner that is adaptable to the ever-changing landscape of Generative AI technology.
Being agnostic to any single LLM is key here, for both technical and non-technical reasons. Technically, new and better models and capabilities are being introduced on a regular basis, so it is crucial to stay flexible and LLM-agnostic. At the same time this is difficult, because applications need to fine-tune their prompts/inputs to get a better response from a given LLM. Thus it is not a straightforward thing to do: on every change of LLM, adaptation or customization will be needed to suit the characteristics of that LLM.
Non-technical reasons to build LLM-agnostic applications include uncertainty and events like the Sam Altman ouster saga that unfolded at OpenAI. Things could have gone significantly wrong with OpenAI’s operations, with severe implications for the applications and businesses using OpenAI models. Events like this make the agnostic nature of applications even more critical, along with a reliability mechanism to handle situations such as an LLM provider going out of business or other such uncertainties.
Given the above technical and non-technical reasons, it is recommended to build applications from day one with the requirement of being as agnostic as possible. It certainly isn’t straightforward, but if this philosophy is built into the design and implementation, adapting to other LLM models becomes easier and smoother.
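One common way to build this philosophy into the design is an abstraction layer: the application depends only on a small provider interface, and each concrete LLM sits behind an adapter. The provider classes below are illustrative stubs; real adapters would wrap the actual vendor SDK or self-hosted endpoint behind the same interface.

```python
# Sketch of a provider-agnostic LLM layer. Provider classes here are
# hypothetical stand-ins; real ones would call actual model endpoints.
from abc import ABC, abstractmethod

class LLMProvider(ABC):
    """The only LLM surface the application is allowed to depend on."""
    @abstractmethod
    def complete(self, prompt: str) -> str: ...

class ClosedSourceProvider(LLMProvider):
    def complete(self, prompt: str) -> str:
        # Placeholder: a real adapter would call the hosted API here,
        # including any provider-specific prompt adaptation.
        return f"[closed-api] {prompt}"

class SelfHostedProvider(LLMProvider):
    def complete(self, prompt: str) -> str:
        # Placeholder: a real adapter would call the self-hosted model.
        return f"[self-hosted] {prompt}"

class Application:
    """App logic depends only on LLMProvider, so swapping models is a
    configuration change rather than a rewrite."""
    def __init__(self, provider: LLMProvider):
        self.provider = provider

    def answer(self, question: str) -> str:
        return self.provider.complete(question)

app = Application(ClosedSourceProvider())
print(app.answer("Summarize this contract"))
app.provider = SelfHostedProvider()  # changeover without touching app logic
print(app.answer("Summarize this contract"))
```

The adapter is also the natural place to keep per-LLM prompt customizations, so that the inevitable per-model tuning stays out of the application code itself.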
Summary
Generative AI applications are going to play a crucial role in improving productivity, efficiency and quality, and in reducing costs for Enterprises. Generative AI will bring in new types of automation given its ability to take natural language input, process it semantically and generate an output; past automation focused mainly on repetitive tasks rather than cognitive, knowledge-based ones.
Given the fast-changing landscape of Generative AI technology, applications should be built so that they are not locked in to, or dependent on, a specific LLM. Several LLM choices are available to application builders, each with its own pros and cons, and several events have already shown why applications should be decoupled from the underlying LLM as much as possible.
The bottom line is that Generative AI applications need to be built in a manner that is adaptable to the ever-changing landscape of Generative AI technology. This is needed in order to benefit from the continual advancements happening in Generative AI and to ensure continuity and reliability for Enterprises.