Deep Learning/Machine Learning
Selected resources and suggested keywords for Deep Learning/Machine Learning
Further Research: Concepts and Terminologies Across AI Dimensions
- AI accelerator: a class of high-performance parallel computing hardware that efficiently processes AI workloads like neural networks. AI accelerators can be used to train both supervised and unsupervised models, and are often used in conjunction with GPUs. – Source link
- AI agents: software programs that can interact with their environment, collect data, and use that data to perform self-determined tasks to meet predetermined goals. Humans set the goals, but an AI agent independently chooses the best actions it needs to perform to achieve them. For example, consider a contact center AI agent that aims to resolve customer queries: the agent will automatically ask the customer different questions, look up information in internal documents, and respond with a solution. – Source link
- AI alignment: In the field of artificial intelligence (AI), AI alignment research aims to steer AI systems towards humans' intended goals, preferences, or ethical principles. An AI system is considered aligned if it advances its intended objectives. – Source link
- AI Inference vs. Training Models:
- Trained Models are the products of rigorous learning from historical data. They encapsulate the knowledge acquired during the training phase, storing information about the relationships between inputs and outputs.
- AI inference is the step where a trained model is used to process new data: rather than learning, the model applies the patterns it acquired during training to make predictions about inputs it has never seen. This process is akin to a well-trained human expert making decisions based on their wealth of experience. – Source link
- Another way to compare them (sketched in the example below): training chips process large datasets to build the model, while inference chips efficiently execute these models, delivering quick and accurate outputs.
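A minimal sketch of the two phases, using a toy linear model in NumPy (the data, learning rate, and step count are illustrative): training builds the weights from historical data, and inference simply reuses the frozen weights on new inputs.

```python
import numpy as np

def train(x, y, steps=1000, lr=0.1):
    w, b = 0.0, 0.0                       # weights start at arbitrary values
    for _ in range(steps):                # iterative adjustment from data
        pred = w * x + b
        w -= lr * 2 * np.mean((pred - y) * x)
        b -= lr * 2 * np.mean(pred - y)
    return w, b                           # the "trained model"

def infer(w, b, new_x):
    return w * new_x + b                  # no learning: just apply the model

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 200)
y = 3.0 * x + 1.0 + rng.normal(0, 0.1, 200)   # historical data

w, b = train(x, y)                        # training phase (compute-heavy)
print(infer(w, b, np.array([0.5])))       # inference phase: roughly [2.5]
```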
- ASICs (Application-Specific Integrated Circuits):
- Specialized: designed for a single, specific purpose (e.g., cryptocurrency mining, AI inference).
- High performance: excel at the designated task due to optimized hardware and architecture.
- Energy efficient: often consume less power per unit of work compared to GPUs for their specific task.
- Limited versatility: cannot be easily adapted to other tasks.
- Costly development: high upfront costs for research and development. – Source: Gemini
- Bidirectional Encoder Representations from Transformers (BERT): a language model based on Transformers, a deep learning architecture in which every output element is connected to every input element and the weightings between them are dynamically calculated based upon their connection. – Source link
- Categorization in AI: AI data classification is a process where AI systems are trained to categorize data into predefined classes or labels. By learning from patterns in historical data, AI classification sorts through vast amounts of data, creating order from the digital chaos. – Source link
- Chain of thought (CoT): mirrors human reasoning, facilitating systematic problem-solving through a coherent series of logical deductions. Chain-of-thought prompting is an approach in artificial intelligence that simulates human-like reasoning by delineating complex tasks into a sequence of logical steps towards a final resolution. In other words, CoT rests on the cognitive strategy of breaking elaborate problems down into manageable, intermediate thoughts that sequentially lead to a conclusive answer. – Source link
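For illustration only, a hypothetical pair of prompts contrasting direct prompting with chain-of-thought prompting; no particular LLM API is assumed.

```python
# Direct prompting asks only for the answer.
direct_prompt = (
    "Q: A bat and a ball cost $1.10; the bat costs $1.00 more than the ball. "
    "How much is the ball? A:"
)

# Chain-of-thought prompting includes worked intermediate steps, which
# encourages the model to reason step by step on new questions.
cot_prompt = (
    "Q: A bat and a ball cost $1.10; the bat costs $1.00 more than the ball. "
    "How much is the ball?\n"
    "A: Let's think step by step. Let the ball cost x. Then the bat costs "
    "x + 1.00, so x + (x + 1.00) = 1.10, giving 2x = 0.10 and x = 0.05. "
    "The ball costs $0.05."
)
```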
- ChatGPT: short for Chat Generative Pre-trained Transformer, a type of generative AI that helps with content creation and information retrieval. It is a conversational AI, meaning it can understand and generate human-like responses in natural language conversations. ChatGPT is trained on large amounts of text data, such as books, articles, and web pages. OpenAI uses a publicly available corpus of web pages called the Common Crawl, which includes billions of web pages. – Source link
- Chinchilla Scaling Law (Hoffmann scaling laws): a set of principles that optimize the training of large language models (LLMs). The laws state that the number of model parameters (N) and the number of tokens for training the model (D) should scale in approximately equal proportions when given an increased budget (in FLOPs). For every parameter in a model, there should be approximately 20 tokens in the training data. For example, to train a large language model with a billion parameters, a data set of around 20 billion tokens is required (a worked sketch follows).
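A worked sketch of the 20-tokens-per-parameter heuristic, assuming the commonly cited approximation that training compute is roughly 6 × N × D FLOPs:

```python
# Chinchilla heuristic: D ~= 20 * N tokens, with compute C ~= 6 * N * D FLOPs
# (the 6*N*D figure is a widely used approximation, not part of the law itself).
def chinchilla_optimal(params: float) -> dict:
    tokens = 20 * params            # D ~= 20 * N
    flops = 6 * params * tokens     # C ~= 6 * N * D
    return {"parameters": params, "tokens": tokens, "approx_flops": flops}

print(chinchilla_optimal(1e9))
# {'parameters': 1e+09, 'tokens': 2e+10, 'approx_flops': 1.2e+20}
```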
- CHIPs in AI: FPGA and GPU processors can execute an AI algorithm much more quickly than a CPU. GPUs are most often used for initially developing and refining AI algorithms; this process is known as “training.” FPGAs are mostly used to apply trained AI algorithms to real-world data inputs; this is often called “inference.” ASICs can be designed for either training or inference. – Source link
- Convolutional Neural Networks (CNNs): a type of deep learning algorithm that can learn from data. CNNs are particularly useful for image and video analysis (a minimal sketch follows this list). They can:
- Identify objects, classes, and categories by finding patterns in images
- Classify audio, time-series, and signal data
- Extract features from images and videos
- Classify or detect objects or scenes – Source link
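A minimal PyTorch sketch of a CNN image classifier (layer sizes are illustrative and assume 28×28 grayscale inputs with 10 classes):

```python
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),  # learn local patterns
            nn.ReLU(),
            nn.MaxPool2d(2),                             # 28x28 -> 14x14
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                             # 14x14 -> 7x7
        )
        self.classifier = nn.Linear(32 * 7 * 7, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        return self.classifier(x.flatten(1))

model = TinyCNN()
logits = model(torch.randn(8, 1, 28, 28))  # a batch of 8 fake images
print(logits.shape)                        # torch.Size([8, 10])
```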
- Compute Unified Device Architecture (CUDA): a proprietary parallel computing platform and application programming interface (API), created by Nvidia in 2006, that allows software to use certain types of graphics processing units (GPUs) for accelerated general-purpose processing, an approach called general-purpose computing on GPUs. CUDA extends languages like C, C++, and Fortran, enabling programmers to write code that can execute on the massively parallel cores of GPUs. Unlike CPUs, which are optimized for sequential task execution, CUDA accelerates computations by dividing programs into smaller, independent tasks that can run concurrently across multiple GPU cores, significantly enhancing performance for parallelizable workloads. – Source link
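CUDA's native interface is C/C++, but the same execution model can be sketched from Python with the third-party Numba library: many GPU threads each compute one element of the result concurrently. A minimal example (requires a CUDA-capable GPU):

```python
import numpy as np
from numba import cuda

@cuda.jit
def add_kernel(x, y, out):
    i = cuda.grid(1)              # this thread's global index
    if i < x.size:                # guard against surplus threads
        out[i] = x[i] + y[i]      # each thread handles one element

n = 1_000_000
x = np.ones(n, dtype=np.float32)
y = np.full(n, 2.0, dtype=np.float32)
out = np.zeros(n, dtype=np.float32)

threads_per_block = 256
blocks = (n + threads_per_block - 1) // threads_per_block
add_kernel[blocks, threads_per_block](x, y, out)  # launch the parallel grid
print(out[:3])  # [3. 3. 3.]
```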
- DALL-E, DALL·E 2, and DALL·E 3: are text-to-image models developed by OpenAI using deep learning methodologies to generate digital images from natural language descriptions, called "prompts." – Source link
- Data Labeling: the process of identifying raw data (images, text files, videos, etc.) and adding one or more meaningful and informative labels to provide context so that a machine learning model can learn from it. For example, labels might indicate whether a photo contains a bird or car, which words were uttered in an audio recording, or if an x-ray contains a tumor. - Source link
- Explainable AI (XAI): a set of processes and methods that allows human users to comprehend and trust the results and output created by machine learning algorithms. As AI becomes more advanced, humans are challenged to comprehend and retrace how an algorithm came to a result; explainable AI is used to describe an AI model, its expected impact, and its potential biases. – Source link
By contrast, ChatGPT is not an explainable AI, as it is a text-to-text generative AI app that takes text as input and produces text as output. - Source link
- Generative adversarial networks (GANs): a type of deep learning architecture that uses two neural networks, a generator and a discriminator, which compete against each other to produce artificial data that looks like real data. The core idea of a GAN is "indirect" training through the discriminator, a second network that judges how "realistic" the generator's output seems. - Source link
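A minimal sketch of the adversarial setup in PyTorch (network sizes, data distribution, and step counts are illustrative): the discriminator learns to separate real from fake samples while the generator learns to fool it.

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))   # generator
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))   # discriminator
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
loss = nn.BCEWithLogitsLoss()

for step in range(200):
    real = torch.randn(32, 1) * 2 + 3            # "real" data: N(3, 2)
    fake = G(torch.randn(32, 8))                 # generated samples

    # Discriminator: label real as 1, fake as 0.
    d_loss = (loss(D(real), torch.ones(32, 1)) +
              loss(D(fake.detach()), torch.zeros(32, 1)))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator: try to make the discriminator call fakes real.
    g_loss = loss(D(fake), torch.ones(32, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
```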
- Generative pre-trained transformers (GPTs): a type of large language model (LLM) and a prominent framework for generative artificial intelligence. GPTs are artificial neural networks used in natural language processing tasks; they are based on the transformer architecture, pre-trained on large data sets of unlabeled text, and able to generate novel human-like content. – Source link
- Language Model for Dialogue Applications (LaMDA): a collection of large language models (LLMs) developed by Google for conversational applications. LaMDA is based on the Transformer architecture and trained on human dialogue and stories. – Source link
- Large language model (LLM): a language model notable for its ability to achieve general-purpose language generation. LLMs acquire these abilities by learning statistical relationships from text documents during a computationally intensive self-supervised and semi-supervised training process. However, LLMs can inherit inaccuracies and biases from the data they are trained on, and they can sometimes produce false outputs. – Source link … "current LLMs are not capable of genuine logical reasoning; instead, they attempt to replicate the reasoning steps observed in their training data" – Source link.
- Massive Multitask Language Understanding (MMLU): a benchmark designed to measure knowledge acquired during pretraining by evaluating models exclusively in zero-shot and few-shot settings. - Source link
- Mixture of Experts (MoE) Models: MoE models process data by designating a number of “experts,” each its own sub-network within a larger neural network, and training a gating network (or router) to activate only the specific expert(s) best suited to a given input. The primary benefit of the MoE approach is that by enforcing sparsity, rather than activating the entire neural network for each input token, model capacity can be increased while essentially keeping computational costs constant. - Source link
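A minimal sketch of top-k routing in PyTorch (sizes and the looped dispatch are illustrative; production MoE layers use batched, optimized routing): the gate activates only 2 of 4 experts per input.

```python
import torch
import torch.nn as nn

class TinyMoE(nn.Module):
    def __init__(self, dim=16, n_experts=4, k=2):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_experts))
        self.gate = nn.Linear(dim, n_experts)   # the router
        self.k = k

    def forward(self, x):                       # x: (batch, dim)
        scores = self.gate(x)                   # (batch, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)
        weights = weights.softmax(dim=-1)       # mix only the chosen experts
        out = torch.zeros_like(x)
        for b in range(x.size(0)):              # sparse dispatch, per input
            for slot in range(self.k):
                expert = self.experts[int(idx[b, slot])]
                out[b] += weights[b, slot] * expert(x[b])
        return out

moe = TinyMoE()
print(moe(torch.randn(3, 16)).shape)  # torch.Size([3, 16])
```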
- Model Weights: the parameters of a machine learning model that determine how input data is processed to produce an output. The initial values of weights are often random; during the training process, the model adjusts these weights iteratively based on the input data and the desired output.
- Neural network (also artificial neural network or neural net, abbreviated ANN or NN) - A neural network is a method in artificial intelligence (AI) that teaches computers to process data in a way that is inspired by the human brain. It is a type of machine learning (ML) process, called deep learning, that uses interconnected nodes or neurons in a layered structure that resembles the human brain. Training neural networks has helped pave the way for AI systems like ChatGPT. - Source link
- OpenAI o1 series models (Reasoning models): OpenAI o1 series models are new large language models trained with reinforcement learning to perform complex reasoning. o1 models think before they answer, producing a long internal chain of thought before responding to the user. – Source link
- Retrieval-augmented generation (RAG): an AI framework for improving the quality of LLM-generated responses by grounding the model on external sources of knowledge to supplement the LLM's internal representation of information. Implementing RAG in an LLM-based question answering system has two main benefits: it ensures that the model has access to the most current, reliable facts, and it gives users access to the model's sources, so that its claims can be checked for accuracy and ultimately trusted. - Source link … In a nutshell, RAG rests on two foundational elements, retrieval models and generative models: the retrieval model acts as a specialized 'librarian,' pulling in relevant information from a database or a corpus of documents; this information is then fed to the generative model, which acts as a 'writer,' crafting coherent and informative text based on the retrieved data. The two work in tandem to provide answers that are not only accurate but also contextually rich. - Source link.
- RAG process: when users ask an LLM a question, the AI model sends the query to another model that converts it into a numeric format so machines can read it. The numeric version of the query is sometimes called an embedding or a vector. The embedding model then compares these numeric values to vectors in a machine-readable index of an available knowledge base. When it finds a match or multiple matches, it retrieves the related data, converts it to human-readable words, and passes it back to the LLM. Finally, the LLM combines the retrieved words and its own response to the query into a final answer it presents to the user, potentially citing sources the embedding model found (a minimal sketch of this pipeline follows). - Source link
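A minimal sketch of this pipeline with a stand-in embedding function (a normalized bag-of-words vector, not a real embedding model); the retrieval step finds the nearest document and prepends it to the prompt:

```python
import re
import numpy as np

def embed(text, vocab):
    # Stand-in for a real embedding model: normalized word-count vector.
    words = re.findall(r"[a-z]+", text.lower())
    v = np.array([float(words.count(w)) for w in vocab])
    n = np.linalg.norm(v)
    return v / n if n else v

docs = [
    "CUDA was created by Nvidia in 2006.",
    "BERT is based on the Transformer architecture.",
    "Chinchilla suggests about 20 tokens per parameter.",
]
vocab = sorted({w for d in docs for w in re.findall(r"[a-z]+", d.lower())})
index = np.stack([embed(d, vocab) for d in docs])   # the knowledge-base index

query = "Who created CUDA?"
scores = index @ embed(query, vocab)    # cosine similarity (unit vectors)
context = docs[int(scores.argmax())]    # best match: the CUDA document

prompt = f"Context: {context}\nQuestion: {query}\nAnswer:"
print(prompt)   # the augmented prompt that would be sent to the LLM
```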
- Reinforcement Learning from AI Feedback (RLAIF): a hybrid learning approach that integrates classical reinforcement learning (RL) algorithms, which mimic humans' trial and error, with feedback generated from other AI models. This approach allows the learning agent to refine its actions based not only on rewards from its environment but also on insights garnered from other AI systems, thus enriching the learning process. - Source link
- Recursive self-improvement (RSI): A process where a system, such as artificial intelligence (AI), writes its own code in repeated cycles of improvement. The system can make adjustments to its own functionality, resulting in improved performance. The development of recursive self-improvement raises significant ethical and safety concerns, as such systems may evolve in unforeseen ways and could potentially surpass human control or understanding. - Source link
- Synthetic data: data generated from computer simulations or algorithms that provides an inexpensive alternative to real-world data; it is increasingly used to improve AI models, protect sensitive data, and mitigate bias. - Source link
- Supervised learning: a category of machine learning that uses labeled datasets to train algorithms to predict outcomes and recognize patterns. Unlike unsupervised learning, supervised learning algorithms are given labeled training data to learn the relationship between the inputs and the outputs (see the sketch below). - Source link
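A minimal scikit-learn sketch (toy data): the labels in y are what make the learning "supervised".

```python
from sklearn.linear_model import LogisticRegression

X = [[1], [2], [3], [8], [9], [10]]   # input feature: hours studied
y = [0, 0, 0, 1, 1, 1]                # labeled outcomes: 0 = fail, 1 = pass

clf = LogisticRegression().fit(X, y)  # learn the input -> label relationship
print(clf.predict([[7]]))             # predict for an unseen input: [1]
```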
- Tensor Processing Unit (TPU): Google Cloud TPUs are custom-designed AI accelerators optimized for training and inference of large AI models. They are ideal for a variety of use cases, such as chatbots, code generation, media content generation, synthetic speech, vision services, recommendation engines, and personalization models, among others. Google's Gemini uses TPUs and will exclusively use these chips. - Source link
- Transformer model: a neural network that learns context and thus meaning by tracking relationships in sequential data like the words in this sentence. Transformer models apply an evolving set of mathematical techniques, called attention or self-attention, to detect subtle ways even distant data elements in a series influence and depend on each other. - Source link-1 and Source link-2
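A minimal NumPy sketch of scaled dot-product attention, the mechanism at the core of the transformer (sizes are illustrative; real models add learned projections, multiple heads, and masking):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

seq_len, d = 4, 8                       # 4 tokens, 8-dimensional embeddings
rng = np.random.default_rng(0)
Q = rng.normal(size=(seq_len, d))       # queries
K = rng.normal(size=(seq_len, d))       # keys
V = rng.normal(size=(seq_len, d))       # values

scores = Q @ K.T / np.sqrt(d)           # how strongly each token relates to others
weights = softmax(scores)               # each row sums to 1
output = weights @ V                    # context-aware token representations
print(weights.round(2))                 # the attention pattern over the sequence
```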
- Vector database: A vector database is a collection of data stored as mathematical representations. Vector databases make it easier for machine learning models to remember previous inputs, allowing machine learning to be used to power search, recommendations, and text generation use-cases. Data can be identified based on similarity metrics instead of exact matches, making it possible for a computer model to understand data contextually. - Source link
- Zero-shot learning, few-shot learning and one-shot learning: are all techniques that allow a machine learning model to make predictions for new classes with limited labeled data. - Source link
- Zero-Shot Learning: teaching a computer to recognize things it has never seen by giving it a general idea or description of them. For instance, training it on various fruits and then asking it to identify, by description alone, a fruit it has never encountered (a toy sketch follows this list).
- One-Shot Learning: teaching a computer about something using only one example.
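A toy sketch of the zero-shot idea, with hand-made attribute vectors standing in for learned embeddings: the "banana" class is never seen in training, only described.

```python
import numpy as np

# Hand-made attribute vectors: [is_round, is_yellow, is_long, grows_on_trees]
descriptions = {
    "apple":  np.array([1.0, 0.0, 0.0, 1.0]),   # seen during training
    "banana": np.array([0.0, 1.0, 1.0, 1.0]),   # never seen: description only
}

new_fruit = np.array([0.1, 0.9, 1.0, 1.0])      # attributes of an unseen fruit

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

label = max(descriptions, key=lambda k: cosine(descriptions[k], new_fruit))
print(label)  # 'banana' -- recognized with zero labeled banana examples
```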