Operant Conditioning

Operant Conditioning (Worth 30 Points)

The purpose of this writing assignment is to apply critical thinking skills to conduct a real-life application of operant conditioning.

Learning Objectives 3c and 5c

Select a target behavior that you would like to strengthen in a person or animal with whom you have daily contact. For example, you may choose to have your child pick up his or her toys more often, try to get more hugs from your significant other, or train a dog to sit on command. Avoid selecting a target behavior you would like to weaken, which would require the use of positive punishment (punishment by application) or negative punishment (punishment by removal).

Step 1 Written Portion: State your target behavior. If you choose a target behavior in an animal, include the animal’s name, age, gender, and breed. If you choose a target behavior in a person, include his or her first name, age, and relationship to you (such as a friend, co-worker, child, or significant other).

Once you have decided on a target behavior, collect data over the next day to find out how often the target behavior occurs without your guidance or reinforcement. In other words, just observe and count the times the target behavior occurs on its own. For example, if you choose the following target behavior: Teaching your dog how to roll over on command, then you would give the roll over command and count the times the dog rolls over (without your interference or guidance). This data is called the baseline frequency.

Step 2 Written Portion: State your baseline frequency data. Describe your data collection, including the number of hours observed, where you observed the target behavior, and any other relevant information. Also, report any biases that may be introduced in your baseline frequency data collection. For instance, if you are doing your baseline frequency count on the number of times your dog sits on command, and you observe your pet during an obedience class, a bias will be introduced.

*Please note*: A baseline frequency of one day will introduce a bias into your study. Report the bias, stating that a baseline frequency observed and recorded over several days would produce a more valid and reliable record of the target behavior.

On the next day, begin the process of operant conditioning. The first time the target behavior occurs, reinforce it with a consequence that you believe has meaning to the person/animal. Think through your operant conditioning terms. For instance, if the target behavior occurs and you respond with “Great job,” your compliment is positive reinforcement with a secondary/conditioned reinforcer, which increases the likelihood the target behavior will occur again.

If the baseline frequency is 0, in other words, if the target behavior does not occur on its own, then you will need to employ the technique of shaping.

Step 3 Written Portion: Write a paragraph reporting the number of times the target behavior occurred during the operant conditioning phase. Explain why you think the target behavior increased, decreased, or stayed the same. Use your operant conditioning terms to describe what you did, including your use of primary or secondary/conditioned reinforcers (positive reinforcement). Also, explain if and how you used escape or avoidance conditioning (negative reinforcement). In addition, identify whether you stayed with one type of effective reinforcer or used several. If you used shaping because the target behavior did not occur on its own, discuss how you applied it. Lastly, describe what you might have done differently, and report any conclusions you have reached about your operant conditioning efforts.

Step 4: Review the grading rubric (webpage, opens in new tab), which explains the expectations for your writing assignment.

Chapter 1


Learning is a relatively permanent change in behavior due to experience. In psychology, learning is associated with the behavioral theories of Ivan Pavlov’s Classical Conditioning and B. F. Skinner’s Operant Conditioning. Also, Albert Bandura’s Social Learning Theory identifies a cognitive pathway in which we learn by observing others.

Classical Conditioning

In the early part of the 20th century, the Russian physiologist Ivan Pavlov (1849–1936) was studying the digestive system of dogs when he noticed an interesting behavioral phenomenon: the dogs began to salivate when the lab technicians (who normally fed them) entered the room, even though the dogs had not yet received any food. Pavlov realized the dogs were salivating because they associated the arrival of the technicians with the food that soon followed their appearance in the room.

Ivan Pavlov

Ivan Pavlov’s research made substantial contributions to our understanding of learning.

LIFE Photo Archive – Wikimedia Commons – public domain.

With his team of researchers, Pavlov began studying this process in more detail. He conducted a series of experiments in which, over a number of trials, dogs were exposed to a tone immediately before receiving food. He systematically controlled the onset of the tone and the timing of the delivery of the food, and he recorded the amount of the dogs’ salivation. In other words, Pavlov sounded the tone and then gave the dogs food, and the food automatically (without learning) made the dogs salivate. Initially the dogs salivated only when they saw or smelled the food, but after several pairings of the tone and the food, the dogs began to salivate as soon as they heard the sound. The dogs learned to associate the sound of the tone with the food that followed.

Pavlov identified a fundamental associative learning process called classical conditioning. Classical conditioning refers to learning that occurs when a neutral stimulus (a tone) becomes associated with a stimulus (food) that naturally produces a behavior (salivation). After the association is learned, the previously neutral stimulus (the tone) is sufficient to produce the behavior (salivation).

Psychologists use specific terms to identify the stimuli and the responses in classical conditioning. The unconditioned stimulus (UCS) is something (such as food) that triggers a naturally occurring response, and the unconditioned response (UCR) is the naturally occurring response (such as salivation) that follows the unconditioned stimulus. The conditioned stimulus (CS) is a formerly neutral stimulus that, after being repeatedly presented prior to the unconditioned stimulus, comes to evoke a response similar to the one produced by the unconditioned stimulus. In Pavlov’s experiment, the sound of the tone served as the conditioned stimulus which, after learning, produced the conditioned response (CR), the acquired response to the formerly neutral stimulus. Note that the UCR and the CR are the same behavior (in this case salivation), but they are given different names because they are produced by different stimuli (the UCS and the CS, respectively).

Image of Whistle and Dog

Top left: Before conditioning, the unconditioned stimulus (UCS), which in the image above is the food, naturally produces in the dog the unconditioned response (UCR), the salivation. Top right: Before conditioning, the neutral stimulus (the whistle) does not produce the salivation response in the dog. Bottom left: The unconditioned stimulus (UCS), in this case the food, is repeatedly presented immediately after the neutral stimulus (the whistle) is presented to the dog. Bottom right: After the dog experiences the classical conditioning association, the neutral stimulus (now known as the conditioned stimulus or CS), which is the whistle, is sufficient to produce the conditioned response (CR), which is the dog’s salivation.

This type of conditioning can be beneficial. Imagine, for instance, that an animal first smells a new food, eats it, and then gets sick. If the animal learns to associate the particular smell of that food with the sickness, then the animal, based on the smell alone, will avoid that particular food next time.


Classical Conditioning Application

Imagine you live in an apartment complex. You were heading home in the middle of an intense thunderstorm, and no close parking spots were available. You had to park far away from your apartment. You started to run for your apartment, and while running, you wondered if you left your cell phone in your car. You decided to stop under the large oak tree that was halfway between your car and your apartment to make sure you had your phone.

While you were under the large oak tree, a loud clap of thunder boomed, and you jumped and flinched. Then you immediately ran the rest of the way to your apartment.

The next day, you parked in a parking spot near where you parked during the thunderstorm. When you walked under the oak tree, you noticed that you felt jumpy and flinched. You wondered what was wrong, and then you walked the rest of the way to your apartment.

According to classical conditioning theory, what might have made you feel jumpy and flinch when you were under the tree the day after the thunderstorm?

To apply classical conditioning, begin with the reflex by identifying the UCS and UCR.

  • What is the UCS?
  • What is the UCR?

If you identified the UCS as the loud sound of the thunder, and if you identified the jumpiness and flinching as the UCR, you are correct!

The second step is to identify the process of learning by associating a neutral stimulus with a reflex.

  • What is the neutral stimulus that was paired with the reflex (the UCS and UCR)?

If you identified the neutral stimulus as the tree, which was paired with the UCS (the loud sound of the thunder) and the UCR (the jumpiness and flinching), you are correct!

The third step is to identify when learning has occurred: the neutral stimulus becomes the conditioned stimulus (CS) because it produces the response even when the UCS is not present. When this occurs, the learned response to the CS is called the conditioned response (CR).

  • What is the CS?
  • What is the CR?

If you identified the CS as the tree, you are correct! If you identified the CR as the jumpiness and flinching to the tree (even when the loud sound of the thunder is not present), you are correct!

The Persistence and Extension of Conditioning

After Pavlov demonstrated that learning could occur through association, he studied the variables that influenced the strength and the persistence of conditioning. In some studies, after the conditioning had taken place, Pavlov presented the sound repeatedly but without presenting the food afterward. The image titled “Acquisition, Extinction, and Spontaneous Recovery” shows what happened. After the initial acquisition (learning) phase in which the conditioning occurred, when the CS was then presented alone, the behavior rapidly decreased. In other words, the dogs salivated less and less to the sound, and eventually the sound did not elicit salivation at all. Extinction refers to the reduction in responding that occurs when the conditioned stimulus is presented repeatedly without the unconditioned stimulus.

Acquisition, Extinction, and Spontaneous Recovery

Acquisition: The CS and the UCS are repeatedly paired together and behavior increases. Extinction: The CS is repeatedly presented alone, and the behavior slowly decreases. Spontaneous recovery: After a pause, when the CS is again presented alone, the behavior may again occur and then again show extinction.

Although at the end of the first extinction period, the CS was no longer producing salivation, the effects of conditioning had not entirely disappeared. Pavlov found that, after a pause, sounding the tone again elicited salivation, although to a lesser extent than before extinction took place. The increase in responding to the CS following a pause after extinction is known as spontaneous recovery. When Pavlov again presented the CS alone, the behavior again showed extinction until it disappeared again.
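The acquisition and extinction curves described above can be sketched as a toy simulation. The update rule and the learning rate below are illustrative assumptions of my own, not values from Pavlov’s data; the point is only that repeated CS–UCS pairings drive responding up, and repeated CS-alone presentations drive it back down.

```python
# Toy model (illustrative, not from the chapter): associative strength
# rises toward 1 during acquisition and decays toward 0 during extinction.

def update(strength, ucs_present, rate=0.3):
    """Move associative strength toward 1 when the CS is paired with the
    UCS (acquisition) and toward 0 when the CS appears alone (extinction).
    The rate of 0.3 is an arbitrary illustrative learning rate."""
    target = 1.0 if ucs_present else 0.0
    return strength + rate * (target - strength)

strength = 0.0

# Acquisition: 10 CS-UCS pairings; responding climbs toward its maximum.
for _ in range(10):
    strength = update(strength, ucs_present=True)
acquired = strength

# Extinction: 10 presentations of the CS alone; responding fades.
for _ in range(10):
    strength = update(strength, ucs_present=False)
extinguished = strength

# Spontaneous recovery after a pause can be sketched as a partial rebound
# (the 0.4 rebound fraction is purely illustrative).
recovered = extinguished + 0.4 * (acquired - extinguished)

assert acquired > 0.9 and extinguished < 0.1
assert extinguished < recovered < acquired
```

Consistent with the figure, the sketch shows recovery reaching only part of the original acquired level, after which further CS-alone trials would extinguish it again.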

Although the behavior disappeared, extinction is never complete. If conditioning is again attempted, the animal will learn the new associations much faster than it did the first time. For instance, if the CS and the UCS are paired again after extinction, reconditioning can occur. Imagine you are working with a client who has a snake phobia. As the therapist, you recommend a behavioral treatment called systematic desensitization, which is based on the principles of classical conditioning. Systematic desensitization involves three phases: The first phase is teaching relaxation skills, because the fear response (the CR) and physiological relaxation cannot occur simultaneously. The second phase is to create a fear hierarchy with the client, ordered from least feared to most feared as related to the phobic object or situation (breaking down the CS). The third phase involves exposure to the CS (beginning with the least feared stimulus on the hierarchy) while the client masters relaxation in response to each stimulus (replacing the CR of fear with relaxation).

Then, the client progresses to the next fear on the hierarchy. After systematic desensitization is complete and the phobia is extinguished, imagine that the client is bitten by a threatening snake. The phobia can then be reconditioned: in reconditioning, the CS (the snake) is paired again with the UCS (the painful bite), re-establishing the CR of fear. Systematic desensitization may need to be repeated to replace the CR of fear with relaxation.

Pavlov experimented with presenting new stimuli that were similar, but not identical to, the original conditioned stimulus. For instance, if the dog had been conditioned to being scratched before the food arrived, the stimulus would be changed to being rubbed rather than scratched. He found that the dogs also salivated upon experiencing the similar stimulus, a process known as stimulus generalization. Stimulus generalization refers to the tendency to respond to stimuli that resemble the original conditioned stimulus. The ability to generalize has significance. If we eat some red berries and they make us sick, it would be a good idea to think twice before we eat some purple berries. Although the berries are not exactly the same, they nevertheless are similar and may have the same negative properties.

Lewicki (1985) conducted research that demonstrated the influence of stimulus generalization and how quickly and easily it can happen. In his experiment, high school students first had a brief interaction with a female experimenter who had short hair and glasses. The study was set up so that the students had to ask the experimenter a question, and (according to random assignment) the experimenter responded either in a negative way or a neutral way toward the students. Then, the students were told to go into a second room in which two experimenters were present, and to approach either one of them. However, the researchers arranged it so that one of the two experimenters looked a lot like the original experimenter, while the other one did not (she had longer hair and no glasses). The students were significantly more likely to avoid the experimenter who looked like the earlier experimenter when that experimenter had been negative to them than when she had treated them more neutrally. The participants showed stimulus generalization such that the new, similar-looking experimenter created the same negative response in the participants as had the experimenter in the prior session.

The flip side of stimulus generalization is stimulus discrimination, which is the tendency to respond differently to stimuli that are similar but not identical. Pavlov’s dogs quickly learned, for example, to salivate when they heard the specific tone that had preceded food, but not upon hearing similar tones that had never been associated with food.

In some cases, an existing conditioned stimulus can “behave like” an unconditioned stimulus for a pairing with a new conditioned stimulus, a process known as second-order conditioning. In one of Pavlov’s studies, for instance, he first conditioned the dogs to salivate to a sound, and then repeatedly paired a new CS, a black square, with the sound. Eventually he found that the dogs would salivate at the sight of the black square alone, even though it had never been directly associated with the food. Second-order conditioning in everyday life includes our attractions to experiences that stand for or remind us of something else, such as when we feel good on a Friday because it has become associated with the paycheck that we receive on that day, which itself is a conditioned stimulus for the pleasures that the paycheck buys us.

The Role of Nature in Classical Conditioning

Scientists associated with the behaviorist school argued that all learning is driven by experience and that nature plays little to no role. Classical conditioning, which is based on learning through experience, represents an example of the importance of the environment. However, classical conditioning cannot be understood entirely in terms of experience; nature also plays a part. According to the evolutionary approach, we are biologically prepared to learn some associations more readily than others. Biological preparedness is a term associated with the evolutionary approach that explains that, through genetic adaptations made over time for survival, we are biologically prepared to fear certain situations (such as drowning or falling) and certain objects (such as spiders or snakes) more than other situations (such as sitting or walking) or other objects (such as books and desks).

Clinical psychologists make use of classical conditioning to explain the learning of a specific phobia, which is a strong and irrational fear of a specific object, activity, or situation. For example, driving a car is a neutral event that would not normally elicit a fear response in most people. However, if a person were to experience a panic attack in which he suddenly experienced strong negative emotions while driving, he may learn to associate driving with the panic response. The driving has become the CS that now creates the fear response.

Psychologists have also discovered that people do not develop phobias to just anything. Although people may in some cases develop a driving phobia, they are more likely to develop phobias toward objects or situations (such as snakes, spiders, heights, and open spaces) that have been dangerous to people in the past. Currently, it is relatively rare for humans to be bitten by deadly spiders or snakes, to fall from trees or buildings, or to be attacked by a predator in an open area. According to the evolutionary perspective, in the past, the potential of being bitten by snakes or spiders, falling out of a tree, or being trapped in an open space were important concerns that threatened survival. Therefore, humans are still biologically prepared to learn these associations over others (Öhman & Mineka, 2001; LoBue & DeLoache, 2010).

Another type of conditioning is conditioning related to food. In important research on food conditioning, John Garcia and his colleagues (Garcia, Kimeldorf, & Koelling, 1955; Garcia, Ervin, & Koelling, 1966) attempted to condition rats by presenting either a taste, a sight, or a sound as a neutral stimulus before the rats were given drugs (the UCS) that made them nauseous. Garcia discovered that taste conditioning was extremely powerful: the rats learned to avoid the taste associated with illness, even if the illness occurred several hours later. However, conditioning the behavioral response of nausea to a sight or a sound was much more difficult. These results contradicted the idea that conditioning occurs entirely as a result of environmental events, such that it would occur equally for any kind of unconditioned stimulus that followed any kind of conditioned stimulus. Rather, Garcia’s research showed that genetics matters: organisms are biologically prepared to learn some associations more easily than others. The ability to associate tastes with illness is an important survival mechanism, allowing the organism to quickly learn to avoid foods that are poisonous.


The Research of Thorndike and Skinner

Psychologist Edward L. Thorndike (1874–1949) was the first scientist to systematically study operant conditioning. In his research, Thorndike (1898) observed cats that had been placed in a “puzzle box” from which they tried to escape. At first, the cats scratched, bit, and swatted haphazardly. Eventually, and accidentally, they pressed the lever that opened the door and exited to their prize, a scrap of fish. The next time a cat was confined in the box, it attempted fewer of the ineffective responses before carrying out the successful escape. After several trials, the cat learned to perform the correct response almost immediately.

Observing these changes in the cats’ behavior led Thorndike to develop his law of effect, the principle that responses that create a typically pleasant outcome in a particular situation are more likely to occur again in a similar situation, whereas responses that produce a typically unpleasant outcome are less likely to occur again in the situation (Thorndike, 1911). The essence of the law of effect is that successful responses, because they are pleasurable, are “stamped in” by experience and thus occur more frequently. Unsuccessful responses, which produce unpleasant experiences, are “stamped out” and subsequently occur less frequently.

The influential behavioral psychologist B. F. Skinner (1904–1990) expanded on Thorndike’s ideas to develop a more complete set of principles to explain operant conditioning. Skinner created specially designed environments known as operant chambers (usually called Skinner boxes) to systematically study learning. A Skinner box (or operant chamber) is a structure big enough to fit a rodent or bird that contains a bar or key the organism can press or peck to release food or water, along with a device to record the animal’s responses.

The most basic of Skinner’s experiments was quite similar to Thorndike’s research with cats. A rat placed in the chamber reacted as one might expect, scurrying about the box and sniffing and clawing at the floor and walls. Eventually, the rat chanced upon a lever, which it pressed to release pellets of food. The next time around, the rat took a little less time to press the lever, and on successive trials, the time it took to press the lever became shorter and shorter. Eventually, the rat was pressing the lever as fast as it could eat the food that appeared. As predicted by the law of effect, the rat had learned to repeat the action that brought about the food and cease the actions that did not.

Reinforcement and Punishment

Skinner studied, in detail, how animals changed their behavior through reinforcement and punishment, and he developed terms that explained the processes of operant conditioning. Skinner used the term reinforcer to refer to any event that strengthens or increases the likelihood of a behavior, and the term punisher to refer to any event that weakens or decreases the likelihood of a behavior. He used the term positive to refer to “giving or applying” and the term negative to refer to “reducing or removing.” Thus, positive reinforcement strengthens a response by giving or applying something pleasant after the response, and negative reinforcement strengthens a response by reducing or removing something unpleasant. For example, giving a child praise for completing homework represents positive reinforcement, whereas taking aspirin to reduce or remove the pain of a headache represents negative reinforcement. In both cases, the reinforcement makes it more likely that the behavior will occur again in the future.

How Positive Reinforcement, Negative Reinforcement, and Punishment Influence Behavior

Operant Conditioning Term | Description | Outcome | Example
Positive Reinforcement | Add or increase a pleasant stimulus | Behavior is strengthened | Giving a student a prize after he gets an ‘A’ on a test
Negative Reinforcement | Reduce or remove an unpleasant stimulus | Behavior is strengthened | Taking painkillers that eliminate pain increases the likelihood you will take painkillers again
Positive Punishment | Present or add an unpleasant stimulus | Behavior is weakened | Giving a student extra homework after misbehavior in class
Negative Punishment | Reduce or remove a pleasant stimulus | Behavior is weakened | Taking away an adolescent’s phone after missing curfew

Reinforcement, either positive or negative, works by increasing the likelihood of a behavior. Punishment, on the other hand, refers to any event that weakens or reduces the likelihood of a behavior. Positive punishment, which is also referred to as punishment by application, weakens a response by presenting something unpleasant after the response, whereas negative punishment, also called punishment by removal, weakens a response by reducing or removing something pleasant. A child who is given additional chores after fighting with a sibling (positive punishment because it “gives” the undesired chores) or who loses the opportunity to go to recess after getting a poor grade (negative punishment because it “takes away” the desired time in recess) would be less likely to repeat these operants (behaviors that initiate consequences), such as fighting or getting poor grades.

Although the distinction between reinforcement (which increases behavior) and punishment (which decreases it) is usually clear, in some cases it is difficult to determine whether a reinforcer is positive or negative. On a hot day a cool breeze could be seen as a positive reinforcer (because it brings in the desired cool air) or a negative reinforcer (because it takes away the undesired hot air). In other cases, reinforcement can be both positive and negative. One may smoke a cigarette both because it brings pleasure (positive reinforcement) and because it eliminates the craving for nicotine (negative reinforcement).

Types of Positive Reinforcement and Negative Reinforcement

Types of positive reinforcement that are effective in everyday life are categorized as primary reinforcers (e.g., food, drink, and conditions necessary for survival) and secondary or conditioned reinforcers (e.g., experiences we learn to like, such as verbal praise or approval, the awarding of status or prestige, and money). Secondary or conditioned reinforcers are culturally dependent. For instance, in an individualist culture, the reward of a promotion is considered reinforcing (positive reinforcement, secondary or conditioned reinforcer), whereas in a collectivist culture, being singled out as better than the collective group may produce shame (positive punishment by producing or giving the undesired experience of shame).

Types of negative reinforcement are categorized as escape conditioning or avoidance conditioning. For instance, if an aversive stimulus (such as a headache) starts and the person demonstrates an action (such as taking a painkiller) to make the headache stop, escape conditioning is demonstrated. The person performs an action to “escape” the existing pain (thus taking away the undesired headache). On the other hand, if the person takes a painkiller prior to the beginning of a headache, then avoidance conditioning is demonstrated. The person performs an action to “avoid” the pain of a headache from beginning (thus taking away the undesired headache before it starts).

Schedule of Reinforcement

Perhaps you remember watching a movie or being at a show in which an animal, maybe a dog, a horse, or a dolphin, performed some amazing behaviors. The trainer gave a command and the dolphin swam to the bottom of the pool, picked up a ring on its nose, jumped out of the water through a hoop in the air, and then took the ring to the trainer at the edge of the pool. The animal was trained to do the trick, and the principles of operant conditioning were used to train the animal. These complex behaviors are a far cry from the simple stimulus-response (or operant-consequence) relationships that we have considered thus far.

One way to expand the use of operant conditioning is to modify the schedule on which the reinforcement is applied. To this point, we have only discussed a continuous reinforcement schedule, in which the desired response is reinforced every time it occurs; for instance, each time your dog rolls over on command, you give it a biscuit. Continuous reinforcement results in relatively fast learning but also rapid extinction of the desired behavior once the reinforcer disappears.

Most real-world reinforcers are not continuous. They occur on a partial (or intermittent) reinforcement schedule, a schedule in which the responses are sometimes reinforced, and sometimes not reinforced. In comparison to continuous reinforcement, partial reinforcement schedules lead to slower initial learning, but they also lead to greater resistance to extinction. Because the reinforcement does not appear after every behavior, it takes longer for the learner to determine that the reward is no longer coming, and thus extinction is slower. The four types of partial reinforcement schedules are summarized in the chart titled “Partial (or Intermittent) Reinforcement Schedules.”

Partial (or Intermittent) Reinforcement Schedules

Reinforcement Schedule | Explanation | Real-world example
Fixed-ratio | Behavior is reinforced after a specific number of responses. | Factory workers who are paid according to the number of products they produce
Variable-ratio | Behavior is reinforced after an average, but unpredictable, number of responses. | Payoffs from slot machines and other games of chance
Fixed-interval | Behavior is reinforced for the first response after a specific amount of time has passed. | People who earn a monthly salary
Variable-interval | Behavior is reinforced for the first response after an average, but unpredictable, amount of time has passed. | A person who checks voice mail for messages

Partial reinforcement schedules are determined by whether the reinforcement is presented on the basis of the time that elapses between reinforcements (interval) or the number of responses the organism makes (ratio), and by whether the reinforcement occurs on a regular (fixed) or unpredictable (variable) schedule. In a fixed-interval schedule, reinforcement occurs for the first response made after a specific amount of time has passed. In a variable-interval schedule, the reinforcer appears on an interval schedule, but the timing is varied around the average interval, making the actual appearance of the reinforcer unpredictable. An example might be checking your e-mail: you are reinforced by receiving messages that arrive, on average, every 30 minutes, but at unpredictable times. Interval reinforcement schedules tend to produce slow and steady rates of responding.

In a fixed-ratio schedule, a behavior is reinforced after a specific number of responses. For instance, a rat’s behavior may be reinforced after it has pressed a key 20 times, or a salesperson may receive a bonus after he or she has sold 10 products. Once the organism has learned to act in accordance with the fixed-ratio schedule, it will pause only briefly when reinforcement occurs before returning to a high level of responding. A variable-ratio schedule provides reinforcers after an average, but unpredictable, number of responses. Winning money from slot machines or on a lottery ticket are examples of reinforcement that occurs on a variable-ratio schedule; a slot machine may be programmed to provide a win every 20 handle pulls, on average. Ratio schedules tend to produce high rates of responding because reinforcement increases as the number of responses increases.
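As a rough sketch, the four partial schedules can be expressed as simple delivery rules. The function names and the specific ratios and intervals below are hypothetical illustrations of my own, not part of the chapter.

```python
# Illustrative delivery rules for the four partial reinforcement schedules.
import random

def fixed_ratio(n_responses, ratio=20):
    """Reinforce every `ratio`-th response (e.g., a bonus per 10 sales)."""
    return n_responses > 0 and n_responses % ratio == 0

def variable_ratio(avg_ratio=20):
    """Reinforce each response with probability 1/avg_ratio, so payoffs
    arrive after an unpredictable number of responses (slot machines)."""
    return random.random() < 1 / avg_ratio

def fixed_interval(seconds_since_reinforcer, interval=60):
    """Reinforce the first response after `interval` seconds have passed
    (e.g., a salary paid on a regular schedule)."""
    return seconds_since_reinforcer >= interval

def variable_interval(seconds_since_reinforcer, avg_interval=60):
    """Reinforce the first response after a randomly varying wait that
    averages `avg_interval` seconds (e.g., checking voice mail)."""
    return seconds_since_reinforcer >= random.uniform(0, 2 * avg_interval)

# On a fixed-ratio-20 schedule, the 20th, 40th, ... responses pay off.
assert fixed_ratio(40) and not fixed_ratio(41)
# On a fixed-interval-60 schedule, a response at 61 s is reinforced.
assert fixed_interval(61) and not fixed_interval(30)
```

The ratio rules depend only on counts of responses, while the interval rules depend only on elapsed time, which mirrors the chart’s ratio/interval distinction.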

Complex behaviors are also created through shaping, the process of guiding an organism’s behavior to the desired outcome by reinforcing successive approximations until the desired behavior occurs. In other words, shaping is a process of reinforcing in gradual, progressive steps until the entire target behavior occurs on its own. Skinner made extensive use of this procedure in his boxes. For instance, he could train a rat to press a bar two times to receive food. First, Skinner provided food when the animal moved near the bar. After that behavior was learned, Skinner provided food only when the rat touched the bar. Further shaping limited the reinforcement to only when the rat pressed the bar, and then to when it pressed the bar and touched it a second time. Finally, food was provided only when the rat pressed the bar twice. Shaping is also used in everyday behaviors, such as when you teach your puppy to roll over on command. Another example of shaping is when you potty train a child. Through shaping, operant conditioning principles are applied by reinforcing a chain of behaviors that leads to the organism completing the entire target behavior.

Chapter 3

Learning Through Cognitive Processes


Although classical and operant conditioning play a key role in learning, they constitute only part of the total picture. One type of learning that is not determined only by behavioral conditioning occurs when we suddenly find the solution to a problem, as if the idea just popped into our head. This type of learning is known as insight. Insight is the sudden understanding of a solution to a problem. The German psychologist Wolfgang Köhler (1925) carefully observed what happened when he presented chimpanzees with a problem that was not easy for them to solve, such as placing food in an area that was too high in the cage to be reached. He found that the chimps first engaged in trial-and-error attempts at solving the problem, but when these efforts failed, they seemed to stop and contemplate. Then, after this period of contemplation, they suddenly seemed to have a solution by standing on a chair to reach the food or by knocking down the food with a stick. Köhler argued that it was this flash of insight, not the prior trial-and-error approaches, that allowed the animals to solve the problem.

Edward Tolman (Tolman & Honzik, 1930) studied the behavior of three groups of rats that were learning to navigate through mazes. The first group always received a reward of food at the end of the maze. The second group never received any reward, and the third group received a reward, but only beginning on the 11th day of the experimental period. As you might expect when considering the principles of operant conditioning, the rats in the first group quickly learned to negotiate the maze, while the rats of the second group seemed to wander aimlessly through it. The rats in the third group, however, although they wandered aimlessly for the first 10 days, quickly learned to navigate to the end of the maze as soon as they received food on day 11. By day 12, the rats in the third group had caught up in their learning to the rats that had been rewarded from the beginning.

Tolman concluded that the rats that had been allowed to experience the maze, even without any reinforcement, had nevertheless learned from doing so, and he called this latent learning. Latent learning refers to learning that is not reinforced and not demonstrated until there is motivation to do so. Tolman argued that the rats had formed a “cognitive map” of the maze but did not demonstrate this knowledge until they received reinforcement. A cognitive map is a mental representation of a familiar environment. A cognitive map may differ from the actual physical environment because a cognitive map tends to focus on the most important characteristics of that environment.

Observational and Vicarious Learning

Please meet Albert Bandura, the Stanford University professor who conducted the seminal “Bobo Doll” experiments.

Albert Bandura

Albert Bandura’s research made substantial contributions to our understanding of the role of cognition in learning to demonstrate behavior.

LIFE Photo Archive – Wikimedia Commons – public domain.

The Bobo Doll Experiment (1961)

The first row of four images shows the adult model demonstrating aggressive behavior. The middle row of four images shows a boy imitating aggressive behavior. The bottom row of four images shows a girl demonstrating aggressive behavior.

The Bobo doll experiment studied the effects of exposure to aggressive and non-aggressive adult models on the amount of imitative learning among 37- to 69-month-old children enrolled in Stanford University’s nursery school. The experimental group subjects were exposed to film endings with aggressive adult models striking the Bobo doll. The adult in the film ending punched the Bobo doll in the nose, kicked it, threw objects at it, and hit its head with a mallet. While demonstrating aggressive behavior toward the Bobo doll, the adult yelled, “Sockeroo!” Another film ending demonstrated the same type of aggressive actions, but a second adult rewarded the aggressive adult, saying, “You’re a champion!” The second adult also rewarded the aggressive adult with candy and soft drinks. The subjects in the control group were exposed to a film ending in which the adults were subdued and nonaggressive in their behavior toward the Bobo doll. The dependent variable, aggression, was measured by rating the subjects on four five-point rating scales by the experimenter and a nursery school teacher, both of whom were well acquainted with the children. These scales measured the extent to which subjects displayed physical aggression, verbal aggression, aggression toward inanimate objects, and aggressive inhibition. Aggressive inhibition is defined as the subjects’ tendency to inhibit aggressive reactions in the face of high instigation.

Bandura’s research found that, on the dependent variable measures, children who saw the adult rewarded for aggression demonstrated the most aggressive acts. Many of the subjects who were shown the film ending in which the adult behaved aggressively (with no rewards from the second adult) also displayed aggressive behavior. Subjects exposed to non-aggressive adult models demonstrated significantly less aggressive behavior.

Observational Learning

Observational learning, which is also called observational conditioning, refers to learning by imitation. The film ending that showed the adult model punching, hitting, kicking, and throwing objects at the Bobo doll while yelling, “Sockeroo!” initiated observational learning. The children imitated the aggressive behavior that the adult model demonstrated.

The image below shows the subject imitating the adult model’s behavior; in other words, the child learned by observational learning.

Adult Model Demonstrating and the Child Imitating Aggressive Behavior

The adult model is demonstrating aggressive behavior by hitting the Bobo doll with a mallet. The child imitates the behavior of the adult model.

Vicarious Learning

Vicarious learning, which is also called vicarious conditioning, refers to a type of observational learning in which one learns by watching or hearing about the consequences of another’s behavior. The film ending that demonstrated vicarious learning showed the adult model punching, hitting, kicking, and throwing objects at the Bobo doll while yelling, “Sockeroo!” as the second adult rewarded the aggressive behavior by saying, “You’re a champion!” and giving candy.



If you watched your co-worker ask your boss for a raise, and that co-worker was fired for asking, would you ask the same boss for a raise? Of course not! The likely reason is that you saw your co-worker get fired, and you both have the same boss. In other words, you learned by observing an experience *and* observing the consequences associated with that experience. Therefore, you learned by vicarious learning.

Research Focus: The Effects of Violent Video Games on Aggression

Frequent exposure to violence through movies, video games, and other activities tends to be associated with an increase in aggressive behavior (The Henry J. Kaiser Family Foundation, 2003; Schulenburg, 2007; Coyne & Archer, 2005). The evidence suggests that an increase in viewing media violence is associated with an increase in aggressive behavior (Anderson et al., 2003; Cantor et al., 2001). The correlation between viewing television violence and aggressive behavior is about as strong as the relation between smoking and cancer or between studying and academic grades.

A meta-analysis by Anderson and Bushman (2001) reviewed 35 research studies that tested the effects of playing violent video games on aggression. The studies included both experimental and correlational studies, with both male and female participants, in both laboratory and field settings. They found that exposure to violent video games was consistently associated with an increase in aggressive thoughts, aggressive feelings, physiological arousal (including blood pressure and heart rate), and aggressive behavior. Furthermore, a negative correlation was found between playing violent video games and altruistic behavior. In other words, as playing violent video games increased, altruistic behaviors tended to decrease.

In one experiment, Bushman and Anderson (2002) assessed the effects of viewing violent video games on aggressive thoughts and behavior. Participants were randomly assigned to play either a violent or a nonviolent video game for 20 minutes. Each participant played one of four violent video games (Carmageddon, Duke Nukem, Mortal Kombat, or Future Cop) or one of four nonviolent video games (Glider Pro, 3D Pinball, Austin Powers, or Tetra Madness). Participants then read a story and were asked to list 20 thoughts, feelings, and actions about how they would respond if they were the person in the story. One of the stories they used follows:

Todd was on his way home from work one evening when he had to brake quickly for a yellow light. The person in the car behind him must have thought Todd was going to run the light because he crashed into the back of Todd’s car, causing a lot of damage to both vehicles. Fortunately, there were no injuries. Todd got out of his car and surveyed the damage. He then walked over to the other car.

The students who had played one of the violent video games tended to respond more aggressively to the story than did those who played the nonviolent games. Some of the responses were: “Call the guy an idiot,” “Kick the other driver’s car,” and “This guy’s dead meat!” Just as importantly, Seymour, Yoshida, and Dolan (2009) found that just as children learn to be aggressive through observational and vicarious learning, they can also learn to be altruistic.