Reinforcement versus Punishment, and why they are both a bit tricky

In behavior science terminology, Reinforcement (R) and Punishment (P) are functions or “consequences” that either strengthen (R) or weaken (P) behavior.

You won’t see reputable zoos or marine mammal trainers trying to show their animals “who’s boss,” not only because it’s a dumb idea to try to boss around a dolphin, but because punishment stimulates anxiety and unpredictable escape/avoidance behaviors. These difficult-to-control, potentially aggressive, destructive, and unhealthy behaviors can emerge (and continue) well after punishment has ended.

Many dog trainers now call themselves “force-free” or “all positive.” Although I too recognize what marine mammal trainers, zookeepers, and other behavior scientists have learned about the dangers of punishment, I don’t call myself an “all positive” trainer because I know that even a leash is not all positive from the point of view of many dogs. Any dog might understandably rather avoid a veterinarian visit, a toenail clipping, or even the end of a fun game. We can and should remove aversive things such as shock, choke, and prong collars from pet training programs, but the function of punishment is not so easily eliminated. Eliminating punishment sounds like a good idea, but it’s like eliminating gravity. In spite of our best efforts, punishment happens. Our job is to recognize what punishment is and where it is happening, and do what we can to prevent it from weakening the animal behaviors we want to cultivate.

Accidental reinforcement is not quite as harmful as accidental punishment. Kindness, compassion, generosity, and love do not ruin animals or make us mentally or emotionally ill, as excessive punishment does. Sure, excess hotdogs are fattening, but they don’t really “spoil” a dog. What “spoils” behaviors is associating desired things with undesirable behavior, thereby reinforcing (strengthening) behaviors you don’t like.

Training is about associating desired behaviors with desirable consequences, and undesirable behaviors with undesirable consequences. With humans, we can just explain (“after you do your homework, we’ll watch a movie”). But animals learn by experience, so training requires perfect timing for animals to clearly associate behavior and consequence. Delivering hotdogs right after the dog begs at the table? Toenail clipping right after the dog comes when called? That is confusing.

The environment delivers rewards as well as punishments arbitrarily, sometimes reinforcing bad behavior (I found a cookie!) as well as punishing good behavior (puppy sits and someone steps on her tail). Hate it when that happens! This is why trainers place so much emphasis on setting the animal up for success with a carefully controlled environment. But even when you are working in a carefully planned, low-distraction environment, and you’ve set up your dog for success, mistakes happen. One common mistake is misuse of cues.

Ask yourself this: do cues function as rewards? Or punishment? Neither? Both?

Maybe this gets to the crux of many training problems. To the degree that any or all cues (down/stay/sit/come etc.) are disappointing, oppressing, or bothering your dog (or kid!), expect those cues to fail. The function of punishment is to weaken or stop behavior, and it does so because animals work to escape or avoid punishment. If your “sit!” or “down!” or “off!” cue is often functioning as punishment, your dog will be working to avoid it.

But, you’re a great trainer, and your cues are welcome opportunities, fun paying jobs that your dog loves! Your cues are music to your dog’s ears, like the tinkling bell of the ice cream truck! Like Pavlov’s dog, you cue “come!” and your dog salivates!

That’s great! But if you want that conditioned response to strengthen desirable behavior, it needs to be delivered during or just after the dog is performing a behavior you like. Conditioned reinforcement strengthens the behavior it follows. So if your dog is chasing a cat and you are yelling “come!” it might actually (oopsie!) strengthen the chase response. Or, if you say “sit” or “heel” and your dog doesn’t immediately sit or heel, and you say it again, and again, you’re strengthening the dog’s poor response. Much like feeding hotdogs when dogs beg at the table, some pet owners ignore their dog’s good behaviors, and feed them cues only when they are misbehaving!

So what does this mean in real-life terms? Partly it means, don’t let your dog chase the cat. Set your dog up for success and prevent rehearsal of undesirable behaviors. You know where you and your dog are and what you are doing when your dog screws up. Don’t do that! If a plan (or no plan) is not working, change it. Make a plan to set your dog up for success and to practice and reward the behaviors you’d rather see happen. When accidents happen, as they surely do, just go get your dog and leash or crate or move her away. Don’t stand there delivering conditioned reinforcement (cues) when you know she is not responding.

When my dog breaks a sit or a down, I avoid a re-cue. Instead I can deliver a “release signal” (“okay! all done!”) which leaves her wanting more. WHAT?! The game is over? Let me try again!

Don’t set the game up to fail. That is, don’t expect to condition your dog’s response to new cues in a highly distracting environment. Condition the response you want in a low distraction environment, and then build on that by practicing in many different environments, increasing the distractions and difficulty slowly. I also train conditioned “encouragement/discouragement” signals (“yay” versus “oopsie!”) and use them to help dogs think through and solve a puzzle. Like the game “colder colder/hotter hotter,” conditioned encouragement and discouragement (“oopsie” and “yay!”) can help dogs develop confidence in solving behavior problems and finding prizes, reinforcing behaviors that I like.

“Things” can function as either reinforcement or punishment, depending on when and how you use them, and how you build your associations. Hotdogs are not always reinforcement. Lures, for example, often appear to function as punishment when trainers withhold food too long under the dog’s nose, and the dog gets frustrated and confused and gives up. The dog might be wondering, “Can I have that hotdog or can’t I? Am I supposed to follow the food in your hand, or will I get in trouble for doing that?”

Animals work to get information, and they work to avoid confusion. Animals aren’t born with any understanding of human language. Their response to cues is a conditioned response that develops through real-life learning experiences and associations, and not because you’ve shown the dog who is boss.

That’s enough for today! I enjoy comments or questions, and specific examples if you have them, below!
