Understanding Reinforcement Learning: Strategies and Benefits

Reinforcement learning has several different meanings. In the area of human psychology, however, reinforcement refers to a very specific phenomenon: the consequence of an action increases or decreases the likelihood of that action in the future. In practice, reinforcement is more complex than this simple definition suggests.

Behaviors can be reinforced based on the behavior itself, the duration of the behavior, or the severity of the behavior. There are also different ways that behaviors can become reinforced. As such, learning about reinforcement and its role in teaching requires an in-depth exploration of the various concepts surrounding reinforcement.

Positive and Negative

To understand reinforcement, it’s first important to understand what the terms positive and negative mean in this specific context. When discussing reinforcement, positive and negative don’t refer to good or bad. Instead, positive means to add a factor while negative means to take away a factor.

Take this simple example. A child performs an action. In response, a parent adds a factor. If that factor is something the child enjoys, then it’s a positive reward. If it’s a factor the child doesn’t enjoy, it’s a positive punishment.

To put this in concrete terms, a child who performs a good behavior is given a positive reward in the form of candy. In contrast, a child who performs bad behavior is given a positive punishment in the form of a spanking. In both cases, a factor was added to the environment: candy in the first case and a spanking in the second.

The first was a reward and the second a punishment. This is how we arrive at the two psychological concepts of positive rewards (usually called positive reinforcement in the psychology literature) and positive punishments.

Now, let’s take a look at what negative means in this context. A child performs an action. In response, a parent takes away a factor. If that factor is something the child does not enjoy, then it’s a negative reward. If that factor is something the child does enjoy, then it’s a negative punishment.

So, a child who performs a good behavior has something taken from their environment that they don’t enjoy, like an educational video they’ve been forced to watch all day. In contrast, a child who performs bad behavior has something taken from their environment that they do enjoy, like a favorite toy.

The first example is called a negative reward (more commonly, negative reinforcement) because the child was rewarded for their behavior by having the hated educational video taken out of the environment. The second example is a negative punishment because it takes the favorite toy out of their environment.

For most people first trying to learn about reinforcement, these concepts can be difficult to understand, because positive and negative do not intuitively make sense in this context. So long as you remember that positive means to add something to the environment while negative means to take something out of the environment, then you have a basic understanding of positive and negative in the psychological context.
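
For readers who find a small lookup easier to follow than prose, the four combinations described above can be sketched in a few lines of Python. This is only an illustrative sketch: the names Change, classify, and child_enjoys_it are made up for this example and do not come from any psychology toolkit, and the four items are simply the candy, spanking, video, and toy examples used in this section.

```python
from enum import Enum


class Change(Enum):
    """What happens to the environment after the behavior (hypothetical helper)."""
    ADDED = "positive"    # a factor is added to the environment
    REMOVED = "negative"  # a factor is taken out of the environment


def classify(change: Change, child_enjoys_it: bool) -> str:
    """Return the article's term for a consequence.

    Adding something enjoyed, or removing something disliked, is a reward.
    Adding something disliked, or removing something enjoyed, is a punishment.
    """
    if change is Change.ADDED:
        kind = "reward" if child_enjoys_it else "punishment"
    else:
        kind = "punishment" if child_enjoys_it else "reward"
    return f"{change.value} {kind}"


# The four examples from this section: (factor, what happened to it, does the child enjoy it?)
examples = [
    ("candy", Change.ADDED, True),
    ("spanking", Change.ADDED, False),
    ("educational video", Change.REMOVED, False),
    ("favorite toy", Change.REMOVED, True),
]

for factor, change, enjoys in examples:
    print(f"{factor}: {classify(change, enjoys)}")
```

The point of the sketch is that positive and negative depend only on whether the factor is added or removed, while reward and punishment depend on how the child feels about that factor.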

Primary versus Secondary Reinforcers

In reinforcement, there are also the concepts of primary versus secondary reinforcers. Primary reinforcers are the reinforcers that are an inherent part of the human experience. People need water, food, and sleep. They instinctively respond to these, making these particularly salient and powerful reinforcers. These are therefore called primary reinforcers.

No matter how much time passes, no matter how a person grows, they will remain in need of things like food and sleep to survive. These innate, unchanging drives are the strongest reinforcers.

A simple example of using a primary reinforcer is in the form of food. If a child cleans up the entire house and is made a delicious meal in response, then they are having their behavior reinforced with a primary reinforcer. Important to the concept of primary reinforcers is the concept of pleasure. People receive innate pleasure from indulging their most basic drives, and pleasure itself is considered a primary reinforcer.

A secondary reinforcer, on the other hand, does not have the same core value to human beings. Instead, it gains its value by being associated with primary reinforcers. A commonly used example of a secondary reinforcer is praise. Verbal praise can feel empty or it can feel as if it has value.

When verbal praise is routinely linked with completing a behavior and little else, then it can feel repetitive. If, however, praise is linked to affection, then it can become much more strongly reinforcing. Imagine accomplishing a certain task during a basketball game and receiving high fives and pats on the back. The simple addition of affection, which humans are driven to receive, associates the praise with the affection and makes it a secondary reinforcer.

Money is the most obvious example of a secondary reinforcer. Money, on its own, has no inherent value. We value money as a reward because we associate it with the ability to buy things that we receive pleasure from, such as food or entertainment. However, divorced from those things, money has no value.

Dollars may carry no value in a foreign country that doesn’t exchange money or accept dollars in a store. As such, money is only a secondary reinforcer because it has no inherent value on its own.

For teachers, primary and secondary reinforcers play a role in reinforcing behavior even when the teacher isn’t aware that they are at work. It’s not uncommon for teachers to reward students with tokens, for example, that can be traded for prizes. In this case, the prize would be a primary reinforcer, since it gives the student pleasure.

In contrast, the token itself is a secondary reinforcer. The token has no value on its own but is linked to the primary reinforcer that satisfies the student’s pleasure drive. As a result, many teachers have been drawing on the concepts of primary and secondary reinforcers without ever knowing about it.

Reinforcement in the Classroom

Reinforcement therefore has value in the classroom, since it can be an important behavioral management tool. Reinforcement can be used to teach skills or behaviors that the child will benefit from and that teachers hope to see more commonly in the class. As teachers become more skilled at reinforcing behavior, they can make the reinforcement process more effective and encourage more behaviors that they wish to see in their students.

First of all, it’s important for teachers to identify what has value to their students. Reinforcement only works when what is being used actually reinforces the behavior and motivates future repetition of that behavior. However, this is impossible to do without first identifying what is of value to the student.

Once this has been done, teachers can then actually reinforce what the student is doing. This requires the teacher to become familiar with their students and create an environment in which teachers have a close relationship with their students.

Teachers can also create a reinforcer survey to gauge their classes. These surveys ask students about their preferences so that the teacher can better understand what each student values. The best reinforcers are those that are personalized to the student and have significant meaning to them. This might not always be possible for all teachers, though.

While some researchers recommend making personally tailored surveys for each student, the practical reality for most teachers is that they are already carrying a host of duties, from developing lesson plans to meeting with parents and handling different types of paperwork. However, teachers can still create a general survey that includes an open response section where students can provide some background on themselves and what they value.

Finding reinforcers can be a problem for teachers if the student has limited communication skills. Students with certain special needs may not be able to adequately communicate what it is that they enjoy. In these cases, teachers can take two approaches to reinforcing their student’s behavior.

First, teachers can observe the student to try to identify what the student values most in their environment. This can provide clues as to what acts as a reinforcer for that student.

Second, teachers should get in contact with that student’s parents. Even short discussions can help reveal what the student responds to. Parents are in a better position to understand what their child wants or enjoys than a teacher is, and communicating with them can help identify ways of reinforcing behavior in the classroom.

Phases of Reinforcement

Researchers suggest that reinforcement should be changed over time. In other words, how a behavior is reinforced the first time it occurs should differ from how it is reinforced much later in the school year.

The first principle of rewarding behavior is that, if a reward is going to be given, it should be given immediately after the behavior has been performed. There should be minimal delay between the moment a student performs the behavior and the moment their reward is given, which encourages an association between the behavior and the reward.

The longer the time between a behavior and a reward, the weaker the association between the two, which makes it less likely that the behavior will be reinforced. At the same time, teachers can pair the reward with verbal praise, a secondary reinforcer. This helps to associate the praise with the reward and gives the praise value as a reinforcer, making the verbal praise motivating and encouraging repeat behavior.

Over time, teachers should try to remain aware of whether their reinforcers are working to motivate repetition of a behavior. It’s possible that, over time, students become desensitized to the reward. They start to respond less strongly and they become less likely to repeat the behavior. When this happens, the teacher needs to start changing the ways in which they reward students.

The first way to change up the rewards system is through a concept known as deprivation. Using deprivation, teachers start depriving the student entirely of the reward.

Over time, this may motivate the student to start demonstrating the behavior again out of a desire to be rewarded again. The reward should be kept out of the environment entirely so that the student can’t indulge it in any way.

If a student responds strongly to candy, for instance, there shouldn’t be any free candy available in the classroom. This will help prevent students from indulging their desire for candy when the teacher doesn’t reward them.

Another concept that teachers can take advantage of is reinforcement thinning. Teachers don’t want students performing behaviors out of the expectation of reward alone. Students shouldn’t expect to be rewarded every time for performing a certain behavior. For this reason, teachers can start to delay the time between the behavior and the reward.

Once a student has associated a behavior with a reward, teachers can start delaying the time between when the behavior is performed and when it is rewarded. However, it is important that this occurs only after an association has been created between the two.

Another way that teachers can prevent students from performing behaviors based solely on the hope of getting a reward is by chaining several behaviors together before providing a reward. This makes it less likely for the student to associate the reward with any one behavior. Instead, they will associate a series of behaviors with a reward without the expectation that any one behavior will produce a reward. A teacher can request three different behaviors be completed before the reward is given.

Conclusion

Reinforcement can be a difficult concept for many people to grasp. The concepts of positive and negative, in a reinforcement context, aren’t always intuitive. Even so, reinforcement is a powerful means of encouraging specific behaviors in students, and rewards are particularly powerful ways of encouraging desired behaviors. How reinforcement is implemented, however, should change over time.

Teachers should reward desired behavior but change up how these rewards are given out as the year progresses. Chaining several behaviors together or depriving a student of a reward entirely may help to encourage students to perform a desired behavior.

It may take time, but after experimenting with different reinforcement schedules, teachers may become more effective at encouraging desired behavior within their classrooms.