3.5 Conditioning

Achievable AP Psychology

Observing behavior and mental processes

At the core of psychological science is learning: the process that allows organisms to adjust their actions and emotional reactions based on experience.

In the early 1900s, psychology went through a major methodological shift as the behavioral perspective emerged. This approach moved attention away from introspective reports of mental processes and toward observable, measurable behavior.

Two major models support this approach:

Classical conditioning, which focuses on forming associations between stimuli
Operant conditioning, which explains how consequences shape voluntary behavior

Key figures such as Ivan Pavlov, John Watson, and B.F. Skinner argued that the environment’s effects on behavior could be studied systematically. Because internal mental processes were often viewed as difficult to verify directly, learning theory and conditioning became central tools for explaining behavior across species. These ideas still influence research and applications in areas such as therapy and education.

Classical conditioning: associating stimuli

Classical conditioning is a learning process in which an initially neutral stimulus gains the ability to trigger a response that originally occurred in reaction to another stimulus. In other words, through repeated pairings, an organism learns to anticipate what’s coming next.

This involves:

Unconditioned stimulus (UCS): A trigger that naturally causes a reflexive reaction. For example, food (which naturally causes a dog to salivate).
Unconditioned response (UCR): The automatic, unlearned reaction to the UCS. For example, salivation when food is present.
Conditioned stimulus (CS): A stimulus that starts out neutral but comes to trigger a response after repeated pairing with the UCS (for example, a bell rung before feeding).
Conditioned response (CR): The learned reaction to the CS alone. For example, salivating when hearing the bell even when no food is present.

The learning process in classical conditioning is called acquisition. It happens in a predictable sequence:

Before conditioning, the UCS naturally produces the UCR.
During conditioning, the CS is repeatedly paired with the UCS.
After acquisition, the CS alone can trigger the response. At this point, the response is called the CR.

Several factors determine whether classical conditioning succeeds and how long its effects last:

Order and timing: The presentation order and timing of the CS and UCS are important. The neutral stimulus must come before the unconditioned stimulus and occur close to it in time so that it becomes a reliable predictor.
Extinction: If the CS is no longer paired with the UCS (for example, the CS is repeatedly presented without the UCS), the CR may weaken over time or stop occurring. Extinction can also happen if the CS and UCS are separated by long intervals or presented randomly, which weakens the association. Importantly, extinction does not erase the learned bond; it suppresses its expression.
Spontaneous recovery: After a rest period following extinction, presenting the CS alone can again elicit the CR, though typically in a weaker form. This suggests the original learning remains accessible rather than erased.

Classical conditioning studies have also demonstrated the following phenomena:

Stimulus generalization: This occurs when stimuli similar to the CS also elicit the CR, such as a dog salivating to tones resembling the original bell.
Stimulus discrimination: Over time, learners refine their responses by distinguishing the CS from similar but irrelevant stimuli, responding only to the original predictive cue.
Higher-order conditioning: Once a neutral stimulus becomes a CS, it can be paired with another neutral stimulus to create a second conditioned stimulus that can elicit the CR (without direct association with the UCS). For instance, pairing a light with the bell eventually causes the light alone to evoke salivation. This allows learned cues to be chained together, producing more complex behavioral responses.
Emotional responses: These can also become classically conditioned. Many fears and phobias develop when neutral objects become paired with traumatic experiences. Therapists may use counterconditioning, which replaces fear responses with relaxation or positive feelings by associating the feared stimulus with soothing experiences.
Taste aversion: A learned response in which an organism (humans included) develops a strong avoidance of a specific food after it becomes associated with illness. This classically conditioned reaction shows biological preparedness (organisms are biologically predisposed to learn certain associations quickly for survival) and one-trial conditioning (the association can form after a single pairing of taste and illness, without needing repeated pairings). Notably, the aversion can form even when the illness begins hours after the food was eaten, although the usual timing principles would predict weak learning across such a delay. This reflects the evolutionary priority of avoiding toxins.

Habituation is a separate, non-associative form of learning - distinct from classical conditioning - in which repeated exposure to a harmless stimulus leads to a decreased response. For example, a loud fan noise may initially startle someone but eventually be ignored. Unlike classical conditioning, no stimulus pairing is involved; the organism simply learns that a stimulus carries no consequence.

Operant conditioning: learning from consequences

Operant conditioning explains how voluntary behaviors change based on their outcomes.

Reinforcement makes a behavior more likely to occur again, while punishment makes it less likely. By looking at the types of reinforcement and punishment, and at how behaviors are learned and maintained over time, you can see how powerful consequences are in shaping behavior.

Reinforcement and punishment

Edward Thorndike introduced the Law of Effect, which states that behaviors followed by pleasant consequences tend to increase, while behaviors followed by unpleasant consequences tend to decrease. B.F. Skinner later expanded this framework by describing how reinforcement and punishment shape voluntary actions.

Reinforcement increases the likelihood that a behavior will be repeated and comes in two varieties:

Positive reinforcement: Adding a rewarding stimulus (the reinforcement) after a behavior, such as giving a child praise for completing homework.
Negative reinforcement: Removing an unpleasant stimulus, like silencing a loud alarm (the reinforcement) when a button is pressed (the behavior), which increases the behavior that caused the removal.

Reinforcers fall into two categories:

Primary reinforcers directly satisfy innate biological needs, including food, warmth, and water.
Secondary reinforcers acquire their motivating power through association with primary reinforcers. Examples include money, grades, or praise, which are valued because of learned connections.

While reinforcers increase behavior, punishment decreases the likelihood of a behavior. There are two types:

Positive punishment: Introducing an aversive outcome, for instance scolding a dog (the punishment) to deter an unwanted action (the behavior).
Negative punishment: Taking away a desirable item, such as confiscating a teen’s phone (the punishment) for missing curfew (the behavior).

Though both reinforcement and punishment affect behavior, reinforcement tends to produce more reliable and longer-lasting changes than punishment.

“Negative” does not mean decreasing behavior. In operant conditioning, “positive” and “negative” refer to whether a stimulus is added or removed - not to whether behavior increases or decreases. Negative reinforcement removes an unpleasant stimulus and increases behavior, just like positive reinforcement does. Punishment (positive or negative) is what decreases behavior. Students often confuse negative reinforcement with punishment - remember: if behavior goes up, it’s reinforcement; if it goes down, it’s punishment.
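
The two-question rule above (was a stimulus added or removed, and did the behavior go up or down?) can be written as a small lookup. This is only an illustrative sketch: the function name `classify_consequence` and its string arguments are made up for this example and are not standard psychology notation.

```python
def classify_consequence(stimulus_change: str, behavior_change: str) -> str:
    """Name the operant procedure from two observations.

    stimulus_change: "added" or "removed" (what happened to the stimulus)
    behavior_change: "increases" or "decreases" (what happened to the behavior)
    """
    # "Positive"/"negative" only describe adding or removing a stimulus.
    sign = {"added": "positive", "removed": "negative"}
    # Reinforcement vs. punishment only describes the effect on behavior.
    kind = {"increases": "reinforcement", "decreases": "punishment"}
    if stimulus_change not in sign or behavior_change not in kind:
        raise ValueError("expected added/removed and increases/decreases")
    return f"{sign[stimulus_change]} {kind[behavior_change]}"


# The four examples used in this section.
examples = {
    "praise for completing homework": ("added", "increases"),
    "alarm silenced by a button press": ("removed", "increases"),
    "scolding a dog": ("added", "decreases"),
    "phone confiscated after missed curfew": ("removed", "decreases"),
}
for label, (stimulus, behavior) in examples.items():
    print(f"{label}: {classify_consequence(stimulus, behavior)}")
```

Keeping the two questions in separate lookups mirrors the point of the paragraph: the positive/negative label and the reinforcement/punishment label are decided independently.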

What is the difference between positive reinforcement and negative reinforcement in operant conditioning?

Positive reinforcement adds a rewarding stimulus to increase a behavior, while negative reinforcement removes an unpleasant stimulus to increase the behavior.

Variations of reinforcement

Studies of operant conditioning in animals (including humans) have also documented the following phenomena:

Stimulus discrimination: Learning to respond only to a specific stimulus that is linked with reinforcement (such as a dog sitting only when hearing the exact command “sit”).
Generalization: Responses extend to stimuli similar to the one originally reinforced (for example, responding to commands with slight variations in tone).
Instinctive drift: Not every behavior can be shaped by reinforcement. Ingrained natural behaviors sometimes interfere with learned ones. For example, a trained animal might revert to instinctual patterns that conflict with its conditioning, showing the limits of how far conditioning alone can shape behavior.
Shaping: Complex behaviors can be built by reinforcing successive approximations toward a target action. For example, if the desired behavior of a pigeon is to peck at a button, you might reward the pigeon first for moving toward the button, and then reward only for pecking it.
Superstitious behavior: Happens when consequences reinforce behaviors that are actually unrelated to the consequence. In this case, behaviors are reinforced by coincidence rather than causality. A baseball player wearing a “lucky” uniform because it coincided with successes is an example.
Learned helplessness: A consequence of learning that you have no control over outcomes in a situation. When individuals repeatedly experience uncontrollable negative events, they may stop attempting positive actions (even when escape is possible), contributing to depression and passivity.

The pattern and timing of reinforcement influence how behavior is learned and maintained. Different schedules produce different response patterns and different resistance to extinction. The two main categories of reinforcement schedules are:

Continuous reinforcement: Reinforcing after every correct response. This is effective for learning new behaviors but often leads to rapid extinction once reinforcement stops.
Partial reinforcement: Reinforcing only after the behavior occurs a certain number of times (fixed or variable ratio) or on a time-based schedule (fixed or variable interval). This often produces slower learning but greater persistence of the behavior over time.

Each type of partial reinforcement produces a distinctive pattern when responses are graphed (with the x-axis showing time and the y-axis showing the cumulative number of responses):

Schedule	When reinforcement occurs	Response rate & pattern	Example
Fixed ratio	After a set number of responses	Fast, steady rate; post-reinforcement pauses (“break and run”)	Worker paid per item produced
Variable ratio	After an unpredictable number of responses	Highest and steadiest rate of any schedule; virtually no post-reinforcement pause; most resistant to extinction	Slot machines
Fixed interval	After a fixed amount of time	Scalloped curve - slow after reinforcement, accelerating as the interval nears completion	Weekly quizzes
Variable interval	After unpredictable time intervals	Steady, moderate rate with minimal pauses	Checking email or social media
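
The "When reinforcement occurs" column of the table can be sketched as four small decision rules. This is a simplified illustration, not a model of real learners: it only decides when a reinforcer is delivered and does not reproduce the response patterns in the third column. All function names, the one-response-per-second learner, and the way the variable schedules draw their random requirements are assumptions made for this example.

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th response."""
    count = 0

    def on_response(t):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return on_response


def variable_ratio(mean_n, rng):
    """Reinforce each response with probability 1/mean_n (assumed rule),
    so the number of responses per reinforcer is unpredictable but averages mean_n."""

    def on_response(t):
        return rng.random() < 1.0 / mean_n

    return on_response


def fixed_interval(seconds):
    """Reinforce the first response made at least `seconds` after the last reinforcer."""
    last = 0.0

    def on_response(t):
        nonlocal last
        if t - last >= seconds:
            last = t
            return True
        return False

    return on_response


def variable_interval(mean_seconds, rng):
    """Like fixed_interval, but the required wait is redrawn after each reinforcer
    (uniform between 0 and twice the mean is an assumed choice)."""
    last = 0.0
    wait = rng.uniform(0, 2 * mean_seconds)

    def on_response(t):
        nonlocal last, wait
        if t - last >= wait:
            last = t
            wait = rng.uniform(0, 2 * mean_seconds)
            return True
        return False

    return on_response


def count_reinforcers(schedule, response_times):
    """Feed each response time to the schedule and count the reinforcers delivered."""
    return sum(schedule(t) for t in response_times)


# Hypothetical learner: one response per second for 60 seconds.
times = [float(t) for t in range(1, 61)]
rng = random.Random(0)  # seeded so repeated runs give the same result
print("Fixed ratio 5:", count_reinforcers(fixed_ratio(5), times))
print("Variable ratio 5:", count_reinforcers(variable_ratio(5, rng), times))
print("Fixed interval 10 s:", count_reinforcers(fixed_interval(10), times))
print("Variable interval 10 s:", count_reinforcers(variable_interval(10, rng), times))
```

With this learner, the fixed ratio schedule delivers 12 reinforcers (every fifth of 60 responses) and the fixed interval schedule delivers 6 (one per 10 seconds). The variable schedules depend on the random draws, which is exactly what makes them unpredictable to the learner.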

How do classical conditioning and operant conditioning differ in the way behaviors are learned?

Classical conditioning involves forming associations between stimuli to elicit involuntary responses, while operant conditioning involves learning behaviors based on their consequences, such as reinforcement or punishment.

Classical conditioning

Behavioral perspective: emerged from learning theories focused on conditioning, emphasizing observable behavior over internal thought.
Classical conditioning: learning occurs by associating two stimuli so one triggers a response originally caused by the other (acquisition).
The UCS (unconditioned stimulus) naturally elicits the UCR (unconditioned response). After a neutral stimulus is repeatedly paired with the UCS, it becomes the CS (conditioned stimulus) and elicits the CR (conditioned response).
Correct sequence of presenting CS and UCS is critical for learning.
Extinction: CR weakens or stops when the CS is repeatedly presented without the UCS; spontaneous recovery: CR reappears after a rest period, typically in a weaker form.
Stimulus generalization: response spreads to similar stimuli.
Stimulus discrimination: response specific to the trained stimulus.
Higher-order conditioning: an established CS is paired with a new neutral stimulus, which then becomes a second CS.
Emotional reactions can be classically conditioned. Counterconditioning uses this principle to treat fears and phobias.
Taste aversion research: demonstrates one-trial learning and biological predispositions to learn some associations faster than others.
Habituation: reduced response to repeated or prolonged exposure to a stimulus.

Operant conditioning

Operant conditioning: behavior shaped by consequences
Law of Effect: reinforced behaviors increase, punished behaviors decrease.
Reinforcement/punishment can be positive (adding) or negative (removing).
Reinforcers can be primary (biological value) or secondary (learned value).
Discrimination: behavior occurs only in specific contexts
Generalization: behavior spreads to similar contexts.
Shaping: rewarding steps toward a target behavior
Instinctive drift: some behaviors resist shaping due to biological tendencies.
Superstitious behavior: unrelated actions reinforced by chance.
Learned helplessness: belief that outcomes cannot be controlled, reducing effort.
Reinforcement schedules affect learning strength:
- Continuous: reinforcement after every correct response.
- Partial: reinforcement based on time (fixed or variable interval) or number of responses (fixed or variable ratio). Patterns produce distinct graph shapes.

Conditioning

Observing behavior and mental processes

At the core of psychological science is learning: the process that allows organisms to adjust their actions and emotional reactions based on experience.

Two major models support this approach:

Classical conditioning, which focuses on forming associations between stimuli
Operant conditioning, which explains how consequences shape voluntary behavior

Classical conditioning: associating stimuli

This involves:

Unconditioned stimulus (UCS): A trigger that naturally causes a reflexive reaction. For example, food (which naturally causes a dog to salivate).
Unconditioned response (UCR): The automatic, unlearned reaction to the UCS. For example, salivation when food is present.
Conditioned stimulus (CS): A stimulus that starts out neutral but comes to trigger a response after repeated pairing with the UCS (for example, a bell rung before feeding).
Conditioned response (CR): The learned reaction to the CS alone. For example, salivating when hearing the bell even when no food is present.

The learning process in classical conditioning is called acquisition. It happens in a predictable sequence:

Before conditioning, the UCS naturally produces the UCR.
During conditioning, the CS is repeatedly paired with the UCS.
After acquisition, the CS alone can trigger the response. At this point, the response is called the CR.

When trying to successfully achieve classical conditioning, one must consider some key nuances:

The presentation order and timing of CS and UCS is important: the neutral stimulus must come before the unconditioned stimulus and occur close in time so it becomes a reliable predictor.
Extinction: If the CS is no longer paired with the UCS (for example, the CS is repeatedly presented without the UCS), the CR may weaken over time or stop occurring. Extinction can also happen if the CS and UCS are separated by long intervals or presented randomly, which weakens the association. Importantly, extinction does not erase the learned bond; it suppresses its expression.
Spontaneous recovery: After a rest period following extinction, presenting the CS alone can again elicit the CR, though typically in a weaker form. This suggests the original learning remains accessible rather than erased.

Other observations have been demonstrated by classical conditioning studies such as:

Stimulus generalization: This occurs when stimuli similar to the CS also elicit the CR, such as a dog salivating to tones resembling the original bell.
Stimulus discrimination: Over time, learners refine their responses by distinguishing the CS from similar but irrelevant stimuli, responding only to the original predictive cue.
Higher-order conditioning: Once a neutral stimulus becomes a CS, it can be paired with another neutral stimulus to create a second conditioned stimulus that can elicit the CR (without direct association with the UCS). For instance, pairing a light with the bell eventually causes the light alone to evoke salivation. This allows learned cues to be chained together, producing more complex behavioral responses.
Emotional responses: These can also become classically conditioned. Many fears and phobias develop when neutral objects become paired with traumatic experiences. Therapists may use counterconditioning, which replaces fear responses with relaxation or positive feelings by associating the feared stimulus with soothing experiences.
Taste aversion: A learned response in which an animal (including humans) develops a strong avoidance of a specific food after it becomes associated with illness. This classically conditioned reaction shows biological preparedness (organisms are biologically predisposed to learn certain associations quickly for survival) and one-trial conditioning (the association can form after a single pairing of taste and illness, without needing repeated pairings). Interestingly, a person may avoid a food that caused sickness hours earlier, even though typical timing principles would predict weaker learning. This reflects evolutionary priorities in avoiding toxins.

Operant conditioning: learning from consequences

Operant conditioning explains how behaviors under conscious control change based on their outcomes.

Reinforcement and punishment

Reinforcement increases the likelihood that a behavior will be repeated and comes in two varieties:

Positive reinforcement: Adding a rewarding stimulus (the reinforcement) after a behavior, such as giving a child praise for completing homework.
Negative reinforcement: Removing an unpleasant stimulus, like silencing a loud alarm (the reinforcement) when a button is pressed (the behavior), which increases the behavior that caused the removal.

Reinforcers can be either:

Primary reinforcers directly satisfy innate biological needs, including food, warmth, and water.
Secondary reinforcers acquire their motivating power through association with primary reinforcers. Examples include money, grades, or praise, which are valued because of learned connections.

While reinforcers increase behavior, punishment decreases the likelihood of a behavior. There are two types:

Positive punishment: Introducing an aversive outcome, for instance scolding a dog (the punishment) to deter an unwanted action (the behavior).
Negative punishment: Taking away a desirable item, such as confiscating a teen’s phone (the punishment) for missing curfew (the behavior).

Though both reinforcement and punishment affect behavior, reinforcement tends to produce more reliable and longer-lasting changes than punishment.

What is the difference between positive reinforcement and negative reinforcement in operant conditioning?

(spoiler)

Positive reinforcement adds a rewarding stimulus to increase a behavior, while negative reinforcement removes an unpleasant stimulus to increase the behavior.

Variations of reinforcement

Studies of operant conditioning show that the reactions and behaviors of animals (including humans) can also demonstrate:

Stimulus discrimination: Learning to respond only to a specific stimulus that is linked with reinforcement (such as a dog sitting only when hearing the exact command “sit”).
Generalization: Behavior that occurs when responses extend to similar stimuli (for example, responding to commands with slight variations in tone).
Instinctive drift: Only certain behaviors can be influenced by reinforcement. Sometimes, ingrained natural behaviors interfere with learned behaviors. For example, a trained animal might revert to instinctual patterns that conflict with conditioning, showing limits on how far conditioning alone can shape behavior.
Shaping: Complex behaviors can be built by reinforcing successive approximations toward a target action. For example, if the desired behavior of a pigeon is to peck at a button, you might reward the pigeon first for moving toward the button, and then reward only for pecking it.
Superstitious behavior: Happens when consequences reinforce behaviors that are actually unrelated to the consequence. In this case, behaviors are reinforced by coincidence rather than causality. A baseball player wearing a “lucky” uniform because it coincided with successes is an example.
Learned helplessness: A consequence of learning that you have no control over outcomes in a situation. When individuals repeatedly experience uncontrollable negative events, they may stop attempting positive actions (even when escape is possible), contributing to depression and passivity.

Continuous reinforcement: Reinforcing after every correct response. This is effective for learning new behaviors but often leads to rapid extinction once reinforcement stops.
Partial reinforcement: Reinforcing only after the behavior occurs a certain number of times (fixed or variable ratio) or on a time-based schedule (fixed or variable interval). This often produces slower learning but greater persistence of the behavior over time.

For each type of partial reinforcement, the response can be graphed in a way that results in unique patterns (with the X axis being time and Y axis being the cumulative number of responses):

Schedule	When reinforcement occurs	Response rate & pattern	Example
Fixed ratio	After a set number of responses	Fast, steady rate; post-reinforcement pauses (“break and run”)	Worker paid per item produced
Variable ratio	After an unpredictable number of responses	Highest and steadiest rate of any schedule; virtually no post-reinforcement pause; most resistant to extinction	Slot machines
Fixed interval	After a fixed amount of time	Scalloped curve - slow after reinforcement, accelerating as the interval nears completion	Weekly quizzes
Variable interval	After unpredictable time intervals	Steady, moderate rate with minimal pauses	Checking email or social media

How do classical conditioning and operant conditioning differ in the way behaviors are learned?

(spoiler)

Achievable AP Psychology

3. Development & learning

Our AP Psychology course is now in "early access" - get 50% off for a limited time.

Conditioning

9 min read

Font

Discuss

Feedback

Observing behavior and mental processes

At the core of psychological science is learning: the process that allows organisms to adjust their actions and emotional reactions based on experience.

Two major models support this approach:

Classical conditioning, which focuses on forming associations between stimuli
Operant conditioning, which explains how consequences shape voluntary behavior

Classical conditioning: associating stimuli

This involves:

Unconditioned stimulus (UCS): A trigger that naturally causes a reflexive reaction. For example, food (which naturally causes a dog to salivate).
Unconditioned response (UCR): The automatic, unlearned reaction to the UCS. For example, salivation when food is present.
Conditioned stimulus (CS): A stimulus that starts out neutral but comes to trigger a response after repeated pairing with the UCS (for example, a bell rung before feeding).
Conditioned response (CR): The learned reaction to the CS alone. For example, salivating when hearing the bell even when no food is present.

The learning process in classical conditioning is called acquisition. It happens in a predictable sequence:

Before conditioning, the UCS naturally produces the UCR.
During conditioning, the CS is repeatedly paired with the UCS.
After acquisition, the CS alone can trigger the response. At this point, the response is called the CR.

When trying to successfully achieve classical conditioning, one must consider some key nuances:

The presentation order and timing of CS and UCS is important: the neutral stimulus must come before the unconditioned stimulus and occur close in time so it becomes a reliable predictor.
Extinction: If the CS is no longer paired with the UCS (for example, the CS is repeatedly presented without the UCS), the CR may weaken over time or stop occurring. Extinction can also happen if the CS and UCS are separated by long intervals or presented randomly, which weakens the association. Importantly, extinction does not erase the learned bond; it suppresses its expression.
Spontaneous recovery: After a rest period following extinction, presenting the CS alone can again elicit the CR, though typically in a weaker form. This suggests the original learning remains accessible rather than erased.

Other observations have been demonstrated by classical conditioning studies such as:

Stimulus generalization: This occurs when stimuli similar to the CS also elicit the CR, such as a dog salivating to tones resembling the original bell.
Stimulus discrimination: Over time, learners refine their responses by distinguishing the CS from similar but irrelevant stimuli, responding only to the original predictive cue.
Higher-order conditioning: Once a neutral stimulus becomes a CS, it can be paired with another neutral stimulus to create a second conditioned stimulus that can elicit the CR (without direct association with the UCS). For instance, pairing a light with the bell eventually causes the light alone to evoke salivation. This allows learned cues to be chained together, producing more complex behavioral responses.
Emotional responses: These can also become classically conditioned. Many fears and phobias develop when neutral objects become paired with traumatic experiences. Therapists may use counterconditioning, which replaces fear responses with relaxation or positive feelings by associating the feared stimulus with soothing experiences.
Taste aversion: A learned response in which an animal (including humans) develops a strong avoidance of a specific food after it becomes associated with illness. This classically conditioned reaction shows biological preparedness (organisms are biologically predisposed to learn certain associations quickly for survival) and one-trial conditioning (the association can form after a single pairing of taste and illness, without needing repeated pairings). Interestingly, a person may avoid a food that caused sickness hours earlier, even though typical timing principles would predict weaker learning. This reflects evolutionary priorities in avoiding toxins.

Operant conditioning: learning from consequences

Operant conditioning explains how behaviors under conscious control change based on their outcomes.

Reinforcement and punishment

Reinforcement increases the likelihood that a behavior will be repeated and comes in two varieties:

Positive reinforcement: Adding a rewarding stimulus (the reinforcement) after a behavior, such as giving a child praise for completing homework.
Negative reinforcement: Removing an unpleasant stimulus, like silencing a loud alarm (the reinforcement) when a button is pressed (the behavior), which increases the behavior that caused the removal.

Reinforcers can be either:

Primary reinforcers directly satisfy innate biological needs, including food, warmth, and water.
Secondary reinforcers acquire their motivating power through association with primary reinforcers. Examples include money, grades, or praise, which are valued because of learned connections.

While reinforcers increase behavior, punishment decreases the likelihood of a behavior. There are two types:

Positive punishment: Introducing an aversive outcome, for instance scolding a dog (the punishment) to deter an unwanted action (the behavior).
Negative punishment: Taking away a desirable item, such as confiscating a teen’s phone (the punishment) for missing curfew (the behavior).

Though both reinforcement and punishment affect behavior, reinforcement tends to produce more reliable and longer-lasting changes than punishment.

What is the difference between positive reinforcement and negative reinforcement in operant conditioning?

(spoiler)

Positive reinforcement adds a rewarding stimulus to increase a behavior, while negative reinforcement removes an unpleasant stimulus to increase the behavior.

Variations of reinforcement

Studies of operant conditioning show that the reactions and behaviors of animals (including humans) can also demonstrate:

Stimulus discrimination: Learning to respond only to a specific stimulus that is linked with reinforcement (such as a dog sitting only when hearing the exact command “sit”).
Generalization: Behavior that occurs when responses extend to similar stimuli (for example, responding to commands with slight variations in tone).
Instinctive drift: Only certain behaviors can be influenced by reinforcement. Sometimes, ingrained natural behaviors interfere with learned behaviors. For example, a trained animal might revert to instinctual patterns that conflict with conditioning, showing limits on how far conditioning alone can shape behavior.
Shaping: Complex behaviors can be built by reinforcing successive approximations toward a target action. For example, if the desired behavior of a pigeon is to peck at a button, you might reward the pigeon first for moving toward the button, and then reward only for pecking it.
Superstitious behavior: Happens when consequences reinforce behaviors that are actually unrelated to the consequence. In this case, behaviors are reinforced by coincidence rather than causality. A baseball player wearing a “lucky” uniform because it coincided with successes is an example.
Learned helplessness: A consequence of learning that you have no control over outcomes in a situation. When individuals repeatedly experience uncontrollable negative events, they may stop attempting positive actions (even when escape is possible), contributing to depression and passivity.

Continuous reinforcement: Reinforcing after every correct response. This is effective for learning new behaviors but often leads to rapid extinction once reinforcement stops.
Partial reinforcement: Reinforcing only after the behavior occurs a certain number of times (fixed or variable ratio) or on a time-based schedule (fixed or variable interval). This often produces slower learning but greater persistence of the behavior over time.

For each type of partial reinforcement, the response can be graphed in a way that results in unique patterns (with the X axis being time and Y axis being the cumulative number of responses):

Schedule	When reinforcement occurs	Response rate & pattern	Example
Fixed ratio	After a set number of responses	Fast, steady rate; post-reinforcement pauses (“break and run”)	Worker paid per item produced
Variable ratio	After an unpredictable number of responses	Highest and steadiest rate of any schedule; virtually no post-reinforcement pause; most resistant to extinction	Slot machines
Fixed interval	After a fixed amount of time	Scalloped curve - slow after reinforcement, accelerating as the interval nears completion	Weekly quizzes
Variable interval	After unpredictable time intervals	Steady, moderate rate with minimal pauses	Checking email or social media

How do classical conditioning and operant conditioning differ in the way behaviors are learned?

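(spoiler)

Classical conditioning forms associations between stimuli, so a previously neutral stimulus comes to trigger an automatic response. Operant conditioning links a voluntary behavior to its consequences, so reinforcement makes the behavior more likely and punishment makes it less likely.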

Key points

Classical conditioning

Behavioral perspective: emerged from learning theories focused on conditioning, emphasizing observable behavior over internal thought.
Classical conditioning: learning occurs by associating two stimuli so one triggers a response originally caused by the other (acquisition).
The UCS (unconditioned stimulus) naturally elicits the UCR (unconditioned response). After a CS (conditioned stimulus) is repeatedly paired with the UCS, the CS alone elicits a CR (conditioned response).
Order and timing matter: the CS must come shortly before the UCS for learning to occur.
Extinction: the CR weakens when the CS is repeatedly presented without the UCS. Spontaneous recovery: the CR reappears, usually in weaker form, when the CS is presented after a rest period.
Stimulus generalization: response spreads to similar stimuli.
Stimulus discrimination: response specific to the trained stimulus.
Higher-order conditioning: an established CS is paired with a new neutral stimulus, which then elicits the CR on its own.
Emotional reactions can be classically conditioned. Counterconditioning uses this principle to treat certain mental disorders.
Taste aversion research: demonstrates one-trial learning and biological predispositions to learn some associations faster than others.
Habituation: reduced response to repeated or prolonged exposure to a stimulus.

Operant conditioning

Operant conditioning: voluntary behavior is shaped by its consequences.
Law of Effect: reinforced behaviors increase, punished behaviors decrease.
Reinforcement/punishment can be positive (adding) or negative (removing).
Reinforcers can be primary (biological value) or secondary (learned value).
Discrimination: behavior occurs only in specific contexts.
Generalization: behavior spreads to similar contexts.
Shaping: rewarding steps toward a target behavior.
Instinctive drift: some behaviors resist shaping due to biological tendencies.
Superstitious behavior: unrelated actions reinforced by chance.
Learned helplessness: belief that outcomes cannot be controlled, reducing effort.
Reinforcement schedules affect how quickly a behavior is learned and how well it resists extinction:
- Continuous: reinforcement after every correct response.
- Partial: reinforcement based on time (fixed or variable interval) or number of responses (fixed or variable ratio). Patterns produce distinct graph shapes.
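
The positive/negative and reinforcement/punishment distinctions summarized above form a two-by-two grid, which the sketch below spells out. The function name, its string arguments, and the example situations are hypothetical and chosen only for illustration.

```python
def classify_consequence(stimulus, behavior):
    """Name the operant procedure from two observations.

    stimulus: "added" or "removed" (what happened right after the behavior)
    behavior: "increases" or "decreases" (what the behavior does afterward)
    """
    sign = {"added": "positive", "removed": "negative"}[stimulus]
    kind = {"increases": "reinforcement", "decreases": "punishment"}[behavior]
    return f"{sign} {kind}"


# Illustrative situations, one for each cell of the grid.
examples = [
    ("praise given after homework", "added", "increases"),
    ("alarm stops when a button is pressed", "removed", "increases"),
    ("scolding after an unwanted action", "added", "decreases"),
    ("phone taken away after a missed curfew", "removed", "decreases"),
]
for situation, stimulus, behavior in examples:
    print(f"{situation}: {classify_consequence(stimulus, behavior)}")
```

In this grid, "positive" and "negative" describe whether a stimulus is added or removed, not whether the outcome is pleasant. Reinforcement always increases the behavior and punishment always decreases it.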
