Artificial intelligence (AI) has been adopted globally at a rapid rate. Here are a few key terms you’ll encounter when reading or talking about AI.
Algorithms are detailed sets of computational instructions that machines follow. They can describe ways to solve problems, perform tasks, and predict patterns like weather or behavior. Algorithms control how a device frames and processes data for its intended purposes.
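To make the idea concrete, here's a minimal sketch of an algorithm in Python; the task and data are made up for illustration:

```python
# An algorithm is a precise, step-by-step procedure. This one finds
# the warmest reading in a list of daily temperatures.
def warmest_day(temperatures):
    warmest = temperatures[0]
    for reading in temperatures[1:]:  # examine each reading in turn
        if reading > warmest:
            warmest = reading
    return warmest

print(warmest_day([21, 25, 19, 30, 27]))  # prints 30
```

The same logic could be written in any language; what makes it an algorithm is the unambiguous sequence of steps, not the syntax.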
Algorithmic bias describes the negative impacts of tools like AI when they draw from large datasets skewed by historical or selection bias. If the data inputs are biased, the outputs will be biased. Algorithmic bias is typically an unintentional byproduct of human programming. It can also arise from systemic inequalities reflected in the training data. Bias can be reduced by improving data collection methods and data training workflows. Businesses, legislators and watchdog groups are supervising AI and developing policies to correct algorithmic biases.
Artificial intelligence refers to computer systems, software or processes designed to mimic human tasks, conversational tone and reasoning. AI programs are trained on enormous datasets to accomplish tasks like analyzing speech, digitizing visual perception and learning. AI is not intelligent in the human sense, but it can process, learn and improve datasets to aid humans in sophisticated ways. AI can also exhibit “narrow intelligence,” meaning its applications are focused on specific tasks or subject matter, rather than general human-like reasoning. Examples of narrow intelligence include chatbots, spam filters and recommendation systems.
Artificial general intelligence (AGI) is a theoretical form of AI that would be designed to perform any task a human could. It would be self-teaching, able to take on unfamiliar tasks without task-specific training. AGI doesn’t exist yet, but debates about how to ethically develop and classify it continue. There are numerous ethical and technical challenges to consider, like ensuring AI’s alignment with human values and safeguarding against misuse. For example, an AGI could handle unfamiliar tasks, learn, and secure its own survival by applying what it has learned and coming to new conclusions. AGI is different from generative AI.
Generative AI responds to human commands or “prompts” to create text, images, video, audio and computer code. It processes data quickly, making it an effective assistive tool for repetitive tasks and data analysis. Generative AI depends on its training data or “datasets” to make inferences, recognize patterns and generate content. The datasets and algorithms used to train an AI influence its specialty. For example, an AI that uses generative adversarial networks (GANs) will focus more on visual and audio generation, while one that uses large language models (LLMs) will focus on text, data and programming code generation. (More on GANs and LLMs below.)
Generative AI creates lifelike outputs but isn’t conscious, and it differs from AGI. In U.S. legal contexts, AI-generated content cannot be copyrighted; only human-created content can be.
Training data is a dataset used to teach an algorithm or a machine learning model how to make predictions. The dataset can include anything, such as written text, images of faces, historical data, weather information, traffic patterns, or arrest records.
As noted above, training data is the basis of a machine’s intelligence. If the data input is limited or skewed, the output will be too. To improve AI results, it helps to clean and normalize data before training.
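As a concrete sketch of one common cleaning step, here's a minimal min-max normalization in Python; the function name and sample values are illustrative:

```python
# Min-max normalization rescales raw values into the range [0, 1] so
# features measured on different scales contribute comparably to training.
def min_max_normalize(values):
    lo, hi = min(values), max(values)
    if hi == lo:  # avoid division by zero for constant columns
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

ages = [18, 35, 52, 70]
print(min_max_normalize(ages))  # smallest maps to 0.0, largest to 1.0
```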
Transfer learning is a way to customize an off-the-shelf AI tool to fit your needs. This technique is used to fine-tune AI for specialized tasks, like tailoring it for customer service chatbots across different industries.
For example, say you install the generative AI ChatGPT using its pretrained data. First, you fine-tune it with data specific to your industry, such as commercial insurance. Later, you pair each employee with their own ChatGPT to further train it to simulate the employee’s writing style. Over time, the generative AI model becomes more valuable and customized to your business and employees.
A pretrained model can be more efficient than training from scratch. But some businesses might have intellectual property and data privacy concerns.
Large language models (LLMs) are AI systems that use written text patterns to generate responses. They depend on large, diverse volumes of data to create coherent, human-like responses (hence the term “large language”). ChatGPT is an example of an LLM. It can use human speech patterns to create text that simulates interactive knowledge content. LLMs don’t yet understand text like humans do. They rely on human prompts and probabilistic predictions determined by algorithms.
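To illustrate what "probabilistic prediction" means, here is a toy bigram model in Python. Real LLMs use neural networks with billions of parameters; this counter-based sketch (with a made-up corpus and function name) only shows the core idea of predicting the most likely next word from past text:

```python
from collections import Counter, defaultdict

# Count which word follows each word in a tiny training corpus.
corpus = "the cat sat on the mat the cat ran".split()
next_words = defaultdict(Counter)
for current, following in zip(corpus, corpus[1:]):
    next_words[current][following] += 1

def predict_next(word):
    # Return the most frequently observed follower of `word`.
    return next_words[word].most_common(1)[0][0]

print(predict_next("the"))  # "cat" follows "the" most often in the corpus
```

An LLM does something analogous at vastly greater scale, over tokens rather than whole words, and with learned weights instead of raw counts.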
Hallucination is a slang term for AI producing inaccurate or illogical information in response to prompts. People may prefer “confabulation” or “inaccuracies” to avoid parallels with human mental illness. However, the term remains a common way to describe AI’s mistakes.
Neural networks are a series of algorithms that identify underlying relationships in a dataset. They’re like the digital version of the human brain’s neural network. GANs are a type of neural network used in generative AI.
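As a rough sketch of the building block involved, here is a single artificial neuron in Python; the input values, weights and bias are arbitrary illustration:

```python
import math

# A neuron sums its weighted inputs plus a bias, then passes the result
# through an activation function (here, the sigmoid) to produce an output.
def neuron(inputs, weights, bias):
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-total))  # sigmoid squashes output into (0, 1)

output = neuron(inputs=[0.5, 0.8], weights=[0.4, -0.2], bias=0.1)
print(round(output, 3))  # about 0.535
```

A neural network chains many such neurons into layers, and training adjusts the weights and biases so the network's outputs match the patterns in the data.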
Generative adversarial networks (GANs) are pairs of neural networks designed to work against each other to create realistic outputs. They are trained on the same dataset. One neural network (the “generator”) generates new data. At the same time, the second neural network (the “discriminator”) evaluates whether the generator’s output is real (based on the original dataset) or fake (created by the generator). Each neural network learns and improves from its interactions. The goal is to make the generator so good that its generated results fool the discriminator into thinking they are real.
Deepfakes are a well-known product of GANs. The goal is to create data so realistic that the discriminator AI can’t tell it’s fake anymore. The result is increasingly realistic image, video and audio outputs. You could think of GANs as master forgers, constantly improving their techniques until their forgeries are indistinguishable from the real thing.
Deep learning knits algorithms together to mimic how a human brain processes information inside its neural networks. The algorithms form an artificial neural network that can independently learn and make intelligent decisions.
Deep learning is a subset of machine learning. It can recognize patterns with extreme complexity. It’s the tech behind realistic AI-generated images and voice synthesis. Examples of deep learning applications include autonomous vehicles, medical image analysis and natural language processing.
Machine learning is an AI process that identifies patterns in data to make decisions and predictions. The machine generates these decisions without explicit programming. Machine learning applications include face recognition, language translation and self-driving cars. For example, a self-driving car uses machine learning to analyze images from onboard cameras, understand its environment, and make decisions like slowing down if it detects a pedestrian crossing the road.
Don’t use AI and machine learning interchangeably. AI encompasses a broad concept of a machine’s capability to perform tasks that humans consider smart. Machine learning is a specific AI system that uses algorithms to learn from data and predict outcomes, not just react based on predefined rules. Engineers use a few strategies for machine learning:
- Supervised learning involves seeing the correct answers (labeled data) and using that data to make predictions.
- Unsupervised learning allows the AI to access data and group similar things (finding patterns or clusters) without guidance.
- Reinforcement learning praises the AI (reward) when it gets things correct and ignores incorrect behavior. This encourages the AI to adjust its actions to improve its performance based on trial and error. (More on this below.)
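To make supervised learning concrete, here is a minimal nearest-centroid classifier in Python. The model sees labeled examples (the "correct answers") and uses them to predict labels for new points; the data and function names are invented for illustration:

```python
# Train: average the points in each labeled group to get one "centroid"
# per label. Predict: assign a new point the label of its closest centroid.
def train(points, labels):
    centroids = {}
    for label in set(labels):
        group = [p for p, l in zip(points, labels) if l == label]
        centroids[label] = tuple(sum(c) / len(group) for c in zip(*group))
    return centroids

def predict(centroids, point):
    def dist(a, b):  # squared Euclidean distance
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: dist(centroids[label], point))

model = train([(1, 1), (2, 1), (8, 9), (9, 8)], ["small", "small", "large", "large"])
print(predict(model, (2, 2)))  # closest to the "small" group's centroid
```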
Reinforcement learning is when an AI agent learns to make decisions by interacting with an environment. Through trial and error, the AI agent makes decisions and receives feedback through rewards or penalties to guide it toward achieving the maximum rewards over time.
For example, imagine a robotic arm on an assembly line. It should pick up a metal part, move it to a face plate and solder it together. At first, the robot doesn’t know how to perform the task. It starts by trying random movements, often failing, but occasionally getting it right. Every time it succeeds, it gets a reward. The robot uses this feedback to adjust its movements and learns the most effective way to place the part correctly and solder it. In training, it’s given obstacles to improve its ability to respond to environmental distractions. Eventually, it completes the task, overcoming different environmental variables.
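The trial-and-error loop above can be sketched in miniature as a "two-armed bandit" in Python. The action names and reward probabilities are made up; the point is that the agent starts by guessing, receives rewards, and gradually prefers the action that pays off more often:

```python
import random

random.seed(0)
success_rate = {"left": 0.2, "right": 0.8}  # hidden from the agent
totals = {a: 0.0 for a in success_rate}     # reward earned per action
counts = {a: 0 for a in success_rate}       # times each action was tried

def best_action():
    # The action with the highest observed average reward so far.
    return max(totals, key=lambda a: totals[a] / max(counts[a], 1))

for _ in range(1000):
    # Explore randomly 10% of the time; otherwise exploit the best estimate.
    if random.random() < 0.1:
        action = random.choice(list(success_rate))
    else:
        action = best_action()
    reward = 1 if random.random() < success_rate[action] else 0  # trial and error
    totals[action] += reward
    counts[action] += 1

print(best_action())  # the agent settles on the arm that pays off more often
```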
Data augmentation is a machine learning technique that generates more data from an initial pool of training data. Traditionally, AI systems were developed using a “model-centric” approach: collect massive amounts of data, then train algorithms on it to produce results.
Model-centric development requires large datasets to be effective, and collecting, labeling and validating that data can be expensive. Not every industry has huge datasets to work with. For example, a health care company researching a cure for a rare disease could have trouble with a model-centric system if it doesn’t have enough data. Parsing big data is expensive and requires high processing power. Large datasets can also embed bias and be vulnerable to cyberattacks.
Data augmentation can be used for all types of data, including images. It can flip, crop and rotate images for training. These training techniques encourage the AI to learn from what it sees in pictures rather than memorize static images.
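As a minimal sketch of image-style augmentation in Python, an image can be represented as a grid of pixel values; flipping and rotating produce new training examples with the same content (the tiny grid here stands in for a real image):

```python
# A 2x3 "image" of pixel values.
image = [
    [1, 2, 3],
    [4, 5, 6],
]

def flip_horizontal(img):
    # Mirror each row left-to-right.
    return [row[::-1] for row in img]

def rotate_90(img):
    # Rotate clockwise: transpose the grid, then reverse each row.
    return [list(row)[::-1] for row in zip(*img)]

print(flip_horizontal(image))  # [[3, 2, 1], [6, 5, 4]]
print(rotate_90(image))        # [[4, 1], [5, 2], [6, 3]]
```

Each transformed copy counts as a new training example, nudging the model to learn the content of the picture rather than its exact pixel layout.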
In cybersecurity, data augmentation helps AI systems identify threats like malware and phishing attempts. For instance, if there are only a few examples of a new phishing email or malicious code, data augmentation can generate variations by tweaking those examples. The AI can then spot attack patterns and identify phishing emails more efficiently, even if hackers disguise them as legitimate.
Newer AI data augmentation tools are steering toward cleaner data from the start. Data-centric systems make data the focal point. They aim to improve the data quality by choosing better labels, using complete and representative data and minimizing data bias.
AI on trend
AI and its uses will continue to transform, as will the terminology. This list isn’t exhaustive, but it can give you a quick knowledge infusion to stay on trend.