Operant conditioning is a learning process in which organisms come to associate a specific behavior with its consequent outcome, which may be either reinforcement or punishment.
When a behavior is followed by a pleasant or desirable consequence, that outcome encourages the behavior to recur in the future; conversely, if the behavior is followed by a negative consequence, it is less likely to be repeated.
Because this learning process emphasizes behavior rather than merely forming associations between events, reinforcement plays a critical role. In this framework, the labels “positive” and “negative” do not denote moral value but indicate whether a stimulus is added or removed.
Positive reinforcement involves adding a desirable stimulus to increase the likelihood of a behavior, whereas negative reinforcement entails removing an undesirable stimulus to achieve the same effect.
Punishment is the inverse of reinforcement: whether administered in a positive or negative form, it is intended to decrease the probability of a behavior. All reinforcers, positive or negative, increase the likelihood of a behavioral response, while all punishers reduce it.
In positive punishment, an aversive stimulus is added following a behavior, while in negative punishment, a desirable stimulus is removed, both serving to decrease the probability that the behavior will occur again.
Type | Reinforcement | Punishment |
Positive | Something is added to increase the likelihood of a behavior. | Something is added to decrease the likelihood of a behavior. |
Negative | Something is removed to increase the likelihood of a behavior. | Something is removed to decrease the likelihood of a behavior. |
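The two-by-two grid above can be read as a simple lookup on two yes/no questions. The following is a minimal sketch of that reading; the function name and its parameters are illustrative assumptions, not standard terminology.

```python
def classify_consequence(stimulus_added: bool, behavior_increases: bool) -> str:
    """Label a consequence using the 2x2 grid (hypothetical helper).

    stimulus_added: True if something is added, False if something is removed.
    behavior_increases: True if the behavior becomes more likely afterwards.
    """
    # "Positive"/"negative" describes adding or removing a stimulus, not good or bad.
    sign = "positive" if stimulus_added else "negative"
    # Reinforcement raises the likelihood of the behavior; punishment lowers it.
    kind = "reinforcement" if behavior_increases else "punishment"
    return f"{sign} {kind}"
```

For example, taking away a desirable stimulus so that a behavior happens less often corresponds to `classify_consequence(False, False)`, which is negative punishment.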
One of the most effective methods for teaching a new behavior is known as shaping. Instead of rewarding only the final target behavior, shaping involves reinforcing successive approximations of that behavior. This process is necessary because it is highly unlikely for an organism to exhibit a complex behavior spontaneously. In shaping, the behavior is broken down into many small, achievable steps: initially, any response that bears some resemblance to the target behavior is reinforced; then, only responses that more closely approximate the desired behavior are rewarded, until eventually only the precise target behavior is reinforced. This technique is especially useful for teaching complex behaviors or a series of interrelated actions.
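The tightening criterion in shaping can be illustrated with a toy model. This is a sketch under simplifying assumptions: responses and the target behavior are represented as numbers, closeness is the absolute difference between them, and the criterion is halved after every reinforced response. The function name `shape` and its parameters are hypothetical.

```python
def shape(responses, target, initial_tolerance, tighten=0.5):
    """Return (response, reinforced) pairs under a shrinking criterion.

    A response is reinforced when it lies within the current tolerance
    of the target. After each reinforcement the tolerance is multiplied
    by `tighten`, so later responses must be closer approximations.
    """
    tolerance = initial_tolerance
    log = []
    for response in responses:
        reinforced = abs(response - target) <= tolerance
        log.append((response, reinforced))
        if reinforced:
            # Demand a closer approximation next time.
            tolerance *= tighten
    return log


# A response of 6 earns reinforcement early on but not once the criterion has tightened.
history = shape([2, 6, 8, 6, 9.5, 10], target=10, initial_tolerance=8)
```

In this run the second response (6) is reinforced, but the same response is no longer reinforced when it recurs as the fourth, because by then only closer approximations meet the criterion.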
Reinforcement schedules
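Reinforcement schedules specify when a behavior is reinforced: after a predictable or unpredictable amount of time (interval schedules), or after a predictable or unpredictable number of responses (ratio schedules). Among the four schedules summarized below, the variable ratio schedule is noted for its high productivity and resistance to extinction, because the unpredictable number of responses required for a reward encourages continued responding. The fixed interval schedule, by contrast, tends to be less productive and more easily extinguished.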
Reinforcement schedule | Description | Result | Example |
Fixed interval | Reinforcement is delivered at predictable time intervals (e.g., after 5, 10, 15, and 20 minutes). | Moderate response rate with significant pauses after reinforcement | Hospital patient uses patient-controlled, doctor-timed pain relief |
Variable interval | Reinforcement is delivered at unpredictable time intervals (e.g., after 5, 7, 10, and 20 minutes). | Moderate yet steady response rate | Checking social media |
Fixed ratio | Reinforcement is delivered after a predictable number of responses (e.g., after 2, 4, 6, and 8 responses). | High response rate with pauses after reinforcement | Piecework—factory worker getting paid for every x number of items manufactured |
Variable ratio | Reinforcement is delivered after an unpredictable number of responses (e.g., after 1, 4, 5, and 9 responses). | High and steady response rate | Gambling |
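The four schedules in the table can be sketched as two small functions, one counting responses and one tracking elapsed time. This is an illustrative sketch only: the function names are hypothetical, the "variable" schedules use a fixed list of requirements so the output is reproducible (a real variable schedule would be unpredictable), and the interval timer is assumed to restart at each reinforced response.

```python
from itertools import cycle


def ratio_schedule(n_responses, requirements):
    """Return the 1-based indices of the responses that are reinforced.

    requirements: number of responses needed for each successive reinforcer,
    repeated in order. [2] behaves like a fixed ratio of 2; a varied list
    such as [1, 3, 1, 4] stands in for a variable ratio.
    """
    reinforced = []
    needed = cycle(requirements)
    target = next(needed)
    count = 0
    for i in range(1, n_responses + 1):
        count += 1
        if count >= target:
            reinforced.append(i)
            count = 0
            target = next(needed)
    return reinforced


def interval_schedule(response_times, intervals):
    """Return the times of the responses that are reinforced.

    The first response made once the current interval has elapsed is
    reinforced; earlier responses earn nothing. The next interval is
    timed from that reinforced response (an assumption of this sketch).
    """
    reinforced = []
    waits = cycle(intervals)
    available_at = next(waits)
    for t in sorted(response_times):
        if t >= available_at:
            reinforced.append(t)
            available_at = t + next(waits)
    return reinforced


every_minute = range(1, 21)  # one response per minute for 20 minutes
fixed_ratio = ratio_schedule(8, [2])
variable_ratio = ratio_schedule(9, [1, 3, 1, 4])
fixed_interval = interval_schedule(every_minute, [5])
variable_interval = interval_schedule(every_minute, [5, 2, 3, 10])
```

With these inputs the four results reproduce the example numbers in the table: responses 2, 4, 6, 8 and 1, 4, 5, 9 for the ratio schedules, and minutes 5, 10, 15, 20 and 5, 7, 10, 20 for the interval schedules. Note that on the interval schedules, responding every minute earns no more reinforcement than responding only when the interval has elapsed.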
In addition to reinforcement and punishment, escape learning and avoidance learning are central to operant conditioning. Escape learning occurs when an organism engages in a behavior to leave a dangerous or uncomfortable situation, such as withdrawing a hand from a hot stovetop. Avoidance learning, by contrast, involves acting in advance to stay clear of a harmful situation before it occurs.
Cognitive processes also contribute significantly to associative learning. Latent learning refers to the acquisition of knowledge that remains hidden until there is a reason to demonstrate it; for example, a child who learns the route from home to school while being driven may later guide someone along that route when required.
Problem solving involves formulating and implementing a plan of action to overcome obstacles or resolve challenges, often requiring the individual to pause, reassess, and explore multiple potential solutions.
Finally, certain biological predispositions and instinctive drift affect how learning occurs. Biological predispositions, which are often genetically determined, influence the way an organism acquires new behaviors, while instinctive drift describes the tendency for learned behaviors to revert to innate patterns, particularly when the learned behavior conflicts with natural instincts.