Small language models (SLMs) are touted to be the next big thing in AI


While companies are pouring money into large language models (LLMs), some AI industry experts believe small language models (SLMs) will become the next big thing.

This comes as activity in the industry continues to grow ahead of the festive season, with tech companies investing more to develop their technology.

The future is in small language models

xAI, run by multi-billionaire Elon Musk, raised an additional $5 billion from Andreessen Horowitz, Qatar Investment Authority, Sequoia, and Valor Equity Partners, while Amazon invested an additional $4 billion in Anthropic, a rival of OpenAI.

While these big tech companies and others are investing billions of dollars in developing LLMs that handle many different tasks, the reality is that AI is not one-size-fits-all: businesses need task-specific models.

According to AWS chief executive officer Matt Garman, in a release on the companies' expanding partnership and investment, there has already been an overwhelming response from AWS customers who are developing generative AI powered by Anthropic.

For most companies, LLMs are still the first choice for certain projects, but for others they can be expensive in money, energy, and computing resources.

Steven McMillan, president and CEO of Teradata, which offers an alternative path for some businesses, takes a different view. He is confident the future lies in SLMs.

“As we look to the future, we think that small and medium language models and controlled environments such as domain-specific LLMs, will provide much better solutions.”

~ McMillan

SLMs produce customized outputs for specific types of data because they are trained specifically for that purpose. Because the data generated by SLMs is kept internally, the models can be trained on potentially sensitive data.

Whereas LLMs are energy-intensive, SLMs are built to scale both computing and energy use to a project's actual needs. As a result, SLMs can run efficiently at a lower cost than current large models.

For users who want AI for specialized knowledge, there is the option of domain-specific LLMs, which do not offer broad knowledge. Such a model is trained to deeply understand only one category of information, for example the needs of a CMO versus those of a CFO, and to respond more accurately in that domain.

Why SLMs are a preferred option

According to the Association of Data Scientists (ADaSci), fully developing an SLM with 7 billion parameters for a million users would require just 55.1 MWh (megawatt-hours).

ADaSci found that training GPT-3, with its 175 billion parameters, consumed an estimated 1,287 MWh of electricity, a figure that does not include the energy used once the model is in public use. By these figures, an SLM uses a little over 4% of the energy consumed in training an LLM (55.1 MWh against 1,287 MWh).
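
The comparison is easy to check by hand. The short Python sketch below uses only the two ADaSci figures quoted above; the variable and function names are illustrative and do not come from ADaSci.

```python
# Figures quoted from ADaSci in this article, in megawatt-hours (MWh).
SLM_ENERGY_MWH = 55.1        # 7-billion-parameter SLM, developed for a million users
GPT3_TRAINING_MWH = 1287.0   # GPT-3 training run, 175 billion parameters


def energy_share(small_mwh: float, large_mwh: float) -> float:
    """Return the small model's energy as a fraction of the large model's."""
    return small_mwh / large_mwh


share = energy_share(SLM_ENERGY_MWH, GPT3_TRAINING_MWH)
print(f"SLM energy as a share of GPT-3 training energy: {share:.1%}")
```

The result is a little over 4%. Note that the two figures are not measured on the same basis: the SLM number covers full development for a million users, while the GPT-3 number covers training only.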

Large models usually run on cloud computers because they need more computing power than is typically available on an individual device. This creates complications for companies: they lose control over their information as it moves to the cloud, and responses slow down as they travel across the internet.

Looking ahead, business adoption of AI will not be one-size-fits-all. The focus will be on efficiency and on selecting the best and least expensive tool for each task, which means picking the right-sized model for each project.

That choice will apply to every kind of model, whether a general-purpose LLM or a smaller, domain-specific one, depending on which will deliver better results, require fewer resources, and reduce the need for data to migrate to the cloud.

In the next phase, AI will be vital to business decisions, as the public has high confidence in AI-generated answers.

“When you think of training AI models, they must be built on the foundation of great data.”

~ McMillan

“That is what we are all about, providing that trusted data set and then providing the capabilities and analytics capabilities so clients, and their customers, can trust the outputs,” added McMillan.

With efficiency and accuracy in high demand, smaller and domain-specific LLMs offer another option for delivering results that companies and the broader public can rely on.
