Hassan Taher Explains Chatbots and Large Language Models

Artificial intelligence-based chatbots like ChatGPT are the latest generation of conversational software tools. They promise to perform a wide array of tasks, from searching the web and producing creative writing to serving as a repository of the world’s knowledge. These chatbots are built on large language models (LLMs), and understanding how LLMs work is crucial to getting the most out of them. Hassan Taher is a noted expert and author in the AI field who has recently shared his insights into chatbots and their use of LLMs.

An LLM uses a neural network with many parameters, typically in the billions. It trains on large amounts of text, either on its own or with the aid of human supervisors. The first LLMs appeared in 2018 and already perform well across a variety of tasks, shifting the course of natural language processing research away from training for specialized functions like mathematical reasoning, entity recognition, and sentiment analysis and toward more generalized functionality. Taher has written at length about the issues that could result from this development, including in his first book, The Rise of Intelligent Machines, which was highly successful. He has since written other books on the subject, including The Future of Work in an AI-Powered World and AI and Ethics: Navigating the Moral Maze.

The tasks that LLMs can perform and the skill with which they perform them are primarily a function of the resources they can access: data, processing power, and parameters. Experts don’t expect advances in design to change this paradigm, meaning these factors will remain critical indicators of an LLM’s performance. Training, which shows an LLM how to use those resources effectively, is also crucial to its capabilities.

Data Sources

LLMs require a large amount of data to perform tasks well, but developers haven’t disclosed details about exactly where that data comes from. It’s possible, however, to get some clues about the sources they might be using to train their LLMs. The paper that introduced the Language Model for Dialogue Applications (LaMDA) family of LLMs from Google mentions public sources such as Wikipedia and online forums. It also refers to code documents from programming sites, which likely include Stack Overflow.

These references strongly suggest that LLMs have been trained extensively on sites that are free to access, meaning developers are scraping and analyzing publicly available content. This practice may need to change in the near future, however, as many of these sites plan to start charging for access. Reddit has stated that it intends to charge for access to its text conversations, which span the past 18 years, and Stack Overflow has also announced plans to begin charging for access to its content.

The possibility that LLM developers will need to begin paying for access to training material represents a significant change that would affect the development of AI-based software. Hassan Taher has consistently urged those in this field to continue learning, saying, “The field of AI technology is constantly evolving, and staying up to date on the latest developments is crucial to staying relevant and competitive.”

Transformers

An LLM’s neural network processes text data regardless of where it comes from. This architecture consists of many layers of nodes whose weights are continually adjusted during training based on factors such as the results of previous trials. LLMs typically use a specific type of neural network known as a transformer, which is well suited to natural language processing.
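
As a rough illustration of this layered structure, here is a minimal sketch in Python using PyTorch (a library not mentioned in the article, chosen purely for illustration). The sizes are toy-scale; a production LLM would use far larger layers, a causal attention mask, and billions of parameters.

    # Minimal sketch of a transformer-based language model (toy sizes for illustration;
    # real LLMs use far larger dimensions, a causal mask, and billions of parameters).
    import torch
    import torch.nn as nn

    class TinyTransformerLM(nn.Module):
        def __init__(self, vocab_size=10_000, d_model=128, n_heads=4, n_layers=2):
            super().__init__()
            # Each token ID is mapped to a learned vector (an "embedding").
            self.embed = nn.Embedding(vocab_size, d_model)
            # A stack of transformer layers whose weights are adjusted during training.
            layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads, batch_first=True)
            self.layers = nn.TransformerEncoder(layer, num_layers=n_layers)
            # A final projection scores every word in the vocabulary as the next token.
            self.to_vocab = nn.Linear(d_model, vocab_size)

        def forward(self, token_ids):          # token_ids: (batch, seq_len)
            x = self.embed(token_ids)          # (batch, seq_len, d_model)
            x = self.layers(x)                 # contextualized representations
            return self.to_vocab(x)            # scores ("logits") over the vocabulary

    model = TinyTransformerLM()
    print(sum(p.numel() for p in model.parameters()), "parameters")  # millions, not billions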

Transformers are able to identify patterns in the way humans use words and phrases, allowing them to predict the words that should come next in a sentence. In this sense, LLMs are extremely powerful autocomplete engines: they’re very good at determining the most commonly used sequences of words, but they don’t have any real knowledge of what those sequences mean. Once they reach a sufficiently advanced stage, they merely appear capable of original thought and creativity.
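
To make the autocomplete analogy concrete, here is a toy next-word predictor in plain Python. It simply counts which word most often follows each word in a tiny sample text; an LLM performs the same kind of statistical prediction over vastly more data and context, and neither approach involves understanding what the words mean.

    from collections import Counter, defaultdict

    # Toy "autocomplete": count which word most often follows each word in the text.
    training_text = "the cat sat on the mat and the cat slept on the sofa"
    words = training_text.split()

    next_word_counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        next_word_counts[current][following] += 1

    def predict_next(word):
        """Return the most frequently observed next word (a purely statistical guess,
        with no notion of what the words mean)."""
        candidates = next_word_counts.get(word)
        return candidates.most_common(1)[0][0] if candidates else None

    print(predict_next("the"))   # 'cat': the most common continuation in this toy text
    print(predict_next("sat"))   # 'on'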

This common misconception presents a business opportunity for a platform that connects AI technology experts with organizations and individuals who need their services, according to Hassan Taher. He has stated that such a platform “could help bridge the gap between those with the expertise and those who need it.”

Innovations

Self-attention is one of the key innovations of transformers: rather than selecting words in isolation, the model weighs the relationships of the words in a passage to one another. This feature greatly improves a transformer’s ability to simulate a human level of understanding of text. Transformers also use an element of randomness in their word selection, often controlled by a setting known as temperature, to create variation in a chatbot’s responses to the same question. However, this variation can also introduce errors, since chatbots don’t directly know whether a statement is accurate. All they can do is produce a response that matches the data they’ve trained on and sounds plausible.
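
The sketch below, written in Python with NumPy purely for illustration (it is not how any particular chatbot is implemented), shows both ideas in simplified form: a scaled dot-product self-attention step that weighs how strongly each word relates to the others, and a temperature-controlled sampling step that adds randomness to the choice of the next word.

    import numpy as np

    rng = np.random.default_rng()

    def self_attention(X):
        """Simplified scaled dot-product self-attention over word vectors X (seq_len, d).
        Each output row mixes information from every position, weighted by how relevant
        the words are to one another. Real transformers also apply learned query/key/value
        projections, omitted here for brevity."""
        d = X.shape[-1]
        scores = X @ X.T / np.sqrt(d)                    # pairwise relevance scores
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)   # softmax over positions
        return weights @ X                               # each word attends to all the others

    def sample_next_word(logits, temperature=0.8):
        """Turn next-word scores into a random pick. Higher temperature means more
        variation in responses, but also a higher chance of an implausible choice."""
        scaled = logits / temperature
        probs = np.exp(scaled - scaled.max())
        probs /= probs.sum()
        return rng.choice(len(logits), p=probs)

    X = rng.standard_normal((5, 16))                     # 5 words, 16-dimensional vectors
    print(self_attention(X).shape)                       # (5, 16)
    print(sample_next_word(np.array([2.0, 1.0, 0.1])))   # usually 0, sometimes 1 or 2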

Human Training

Chatbots often generate clichéd or generic text when trying to create new responses from existing text. In a simplified sense, this process is somewhat like finding the average of a group of numbers, which can result in unremarkable output. Humans are therefore essential for training chatbots, as they can point out mistakes and rank the quality of the chatbots’ responses. This assistance gives chatbots a goal to strive for and is technically known as RLHF, or reinforcement learning from human feedback.
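
As a heavily simplified, hypothetical illustration of how human rankings become a training signal, the Python sketch below computes a standard pairwise preference loss: the loss is small when a reward model scores the response the human preferred above the one the human rejected, and large when it gets the ranking backwards. The numbers are made up for illustration.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def preference_loss(reward_chosen, reward_rejected):
        """Pairwise (Bradley-Terry style) loss for training a reward model from human
        rankings: smaller when the human-preferred response gets the higher score."""
        return -np.log(sigmoid(reward_chosen - reward_rejected))

    # A human ranked response A above response B. The reward model currently scores
    # them 1.2 and 0.4, so it agrees with the human and the loss is small.
    print(preference_loss(1.2, 0.4))   # ~0.37
    # If the model scored the rejected response higher, the loss would be much larger.
    print(preference_loss(0.4, 1.2))   # ~1.17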

This iterative process allows LLMs to continually refine their neural networks, with the goal of creating better results each time. RLHF is still in its earliest stages of development, but it’s already producing noticeable improvements in chatbots’ responses. Hassan Taher finds this trend particularly exciting for its applications in health care, saying, “With the potential to revolutionize patient care and outcomes, I believe that AI has the power to make a significant impact on people’s lives.”

Future Developments

The capabilities of LLMs will continue to increase as they become larger and more complex. OpenAI hasn’t disclosed details about its LLMs, but experts believe GPT-3.5 has about 175 billion parameters. By comparison, GPT-4 is believed to have at least 1 trillion parameters, significantly increasing the number of word relationships the model can capture and ChatGPT’s ability to associate words in meaningful ways.

LLMs are already excellent at imitating the text they’ve been trained on, allowing them to generate responses that sound natural. They also get their facts right most of the time, although there is still room for improvement in this area. The next step in improving LLMs will be getting them to recognize that the most likely response isn’t always the correct one. Hassan Taher believes this potential will allow AI technology like chatbots to bring positive change to the world, saying, “While some may disagree, I believe that with responsible use, AI can actually make the world a better place for everyone.”