Training without Pain
Written by Administrator
Saturday, 02 June 2007 22:33
(Note: this article was written for a magazine called Dobe Capers during a period when I was the consulting behaviourist for the Dobermann Club of the Cape, so it refers a lot to problems training Dobermanns. The theory which it attempts to explain is, however, completely relevant to other breeds, and in fact other species!)
If you are reading this article, you have probably trained (or tried to train) a Dobermann at some stage in your life. Perhaps it was easy and enjoyable. On the other hand, perhaps it was a constant battle of wills, a battle between you and a powerful, intelligent, strong-willed animal who loved you, but did not particularly want to do what you wanted him to, and resisted all (or most) of your efforts to make him. Anyone who has been towed along by a Dobe supposedly obeying the command to heel will, I think, recognise himself or herself in this description.
Perhaps you have a dog who is loving and affectionate at home, but bored and resistant in class. Perhaps he avoids you when it's time for practice. If he gets bored and resistant enough, perhaps you eventually lose your temper with him, shout at him, and try to force him to respond. He becomes even more resistant, and has to be physically hauled into any posture you want him to adopt, which he abandons as soon as you let him go. If you've gotten tough enough with him, you may even have been bitten.
Perhaps you've tried to do competition obedience with a Dobermann, and have watched your (naturally!) superior animal being easily outstripped by Border Collies, GSDs and other, well, nice dogs. Frustrating, isn't it?
Training like this can quickly become an unpleasant and distressing task, which is easily abandoned. (Any trainer can tell you what the dropout rate from obedience classes is like.) When your dog loves you and is so affectionate at home, why go through the misery of fighting with him week after week? It's much easier to find an excuse not to go to class. The fact that you end up with an unreliable and disobedient dog from a guarding breed with a high potential for aggression is just the cross you have to bear. Dobermanns are difficult, and that's all there is to it.
If this has been your experience, don't feel alone. There are many others like you. I have vivid and embarrassing memories of...er...training my first Dobermann, a rather plain but fortunately extremely good-natured chap called Billy. I was young at the time (this was 30 years ago) and Billy was very intelligent indeed; he certainly outwitted me every time he tried! I did basic and advanced obedience, tracking, agility (which in those days was aptly named obstacle work) and eventually manwork with him. It was uphill most of the way. He learned everything under considerable duress, except the odd exercise which he enjoyed; those he learned quickly and happily.
At the end of his training, Billy was like the little girl in the nursery rhyme; when he was good, he was very, very good, and when he was bad, he was worse than horrid. When he was good, he would work on hand signals from 100 metres. When he was bad, he wouldn't walk at heel. When I did a right turn, he would do a U-turn and go and lie down somewhere comfortable. He was trained, but he certainly wasn't obedient. Ring any bells?
Will power, stubbornness and resilience to punishment are characteristics of the Dobermann. If its superb intelligence can be harnessed, it is capable of being an outstanding working dog, one of the best in the world, but it is by no means an easy dog to train, and this has led to the German Shepherd being preferred as a police dog in many countries, including South Africa. In 1956, the New Zealand Police Dog Unit was established, using German Shepherds as the dog of choice. Its founder and Chief Trainer, Inspector Frank Riley, actually kept two Dobermanns as pets (clearly a man of taste), but had this to say about their use as police dogs, having worked with them in the UK:
"This dog makes an excellent police dog, but matures slowly and for the best results needs an experienced handler who may have to experiment a little in his training methods."
(from: Born to Obey, by Valerie and Colin Salt, Collins, 1972)
Ring any bells?
Top Cape Town obedience trainer Sandy Lombard says that the Dobermann is far more resilient to punishment than the GSD. A Dobermann will stubbornly resist a series of harsh corrections which would permanently traumatise a GSD, and come bouncing back for more. This determination and hardness of temperament is a wonderful characteristic for a police dog to have, but is offset by the difficulty of training such a dog.
Although individual Dobermanns have performed exceptionally well at obedience, the breed does not dominate in the competitive obedience world, largely because of its stubbornness. Dogs such as GSDs and Border Collies are far easier to get results with, and are thus often the choice of competitive handlers.
Dobermanns can be managed after a fashion. As I have grown older, I have become more authoritative, and am better able to persuade my dogs that I mean what I say. They listen a bit better. But I have to admit that I don't really enjoy doing obedience work with them. I don't like speaking forcefully, correcting sharply, being in any way harsh with my lovely, affectionate dogs. In fact, I'm really rather half-hearted about practising obedience with Slug (my current male), and so is he. It's a frustrating state of affairs, because I really love my dogs and enjoy spending time with them. Wouldn't it be wonderful, wouldn't it be marvellous, I have often thought wistfully, if Dobermanns were like Border Collies, always looking for work, always waiting for the next command, always eager to do what you ask them to?
Well, actually, they are.
Actually, it is easy to harness all that will power, determination and superb intelligence and persuade your Dobermann (yes, that stubborn, recalcitrant you-know-what) to apply it in looking for work, working out what you want him to do, and doing it eagerly, just like a Border Collie, and with no harsh corrections whatsoever.
Does your dog have a reliable sit? By a reliable sit, I mean that the dog is commanded to sit, sits promptly and does not move until given the next command, no matter what the distraction.
Slug doesn't have a reliable sit. He sits when told, but moves as soon as something distracts him. As he's extremely affectionate, if I tell him he's a good dog in a pleased, excited tone of voice, he leaps around me in great excitement, trying to hurl himself into my arms and lick my face. So I decided to teach him to sit until released. As I have a fairly strong background in psychology, I designed the exercise myself.
Within three minutes of starting the exercise, he was sitting, while I praised him, told him what a good boy he was, how clever he was, sending him dilly with delight. His entire bottom was waggling, his eyes were shining, his ears were up - and he was quivering with the effort of holding himself in the sit! Being a Dobermann, of course, with all that intelligence, will power and stubbornness, he succeeded. I used no correction at all. In fact, he didn't have his lead on, and I didn't touch him once during the exercise, except to praise. For the next 15 minutes, he followed me around begging for the next command! Even one of his archenemies barking in the road outside my house failed to distract him. I could not believe my eyes!
Just like a Border Collie? Streets ahead!
At this juncture, you are possibly harbouring a suspicion that I was under the influence of some or other interesting substance while writing this article. Not at all.
The method used to achieve this happy state of affairs is called operant conditioning, and has been used by animal trainers for centuries. You have probably heard the term positive reinforcement somewhere along the line. It is a term used extremely loosely and casually, but applied in its strictest sense as a training method in conjunction with the other concepts of operant conditioning, it gets results that seem little short of miraculous.
Needless to say, I am by no means the first person to have thought of training a dog this way. (In fact, it took rather a long time for the penny to drop with me.) Dr C.W. Meisterfeld, an American canine psychoanalyst who is the first dog psychologist to have been certified as an expert witness by the US judiciary, has been developing a dog training method based on positive reinforcement since 1944. In 1957 he entered the competitive world of American Kennel Club obedience to prove that these principles could be successful in training a German Shorthaired Pointer which others considered (at that time) a breed that was far too stubborn (ring any bells?) for competitive obedience. On November 10, 1957 at the Southern Michigan Obedience Training Club show, Meisterfeld's bitch "Baroness Meisterfeld" received her third leg and the Canine Distinction Award for AKC obedience for earning an average score of 196-1/2 at three consecutive shows inside of seven days. In 1962 "Baroness Meisterfeld C.D.X." won the National German Shorthair Pointer Retriever Championship with a (considered impossible) perfect score of 500 points. She retained the championship for 1963 and 1964 where she also won the 1964 National All German Pointing Breeds Championship.
And using similar methods, other trainers have achieved such titles and awards as:
and many more.
Beyond belief? Not at all.
End of part one. For a discussion of how the method works, see the next issue of Dobe Capers, which will appear in about three months' time, or possibly later, depending on how busy the committee are.
I beg your pardon? Oh, all right then, if you insist.
Operant conditioning: Operant conditioning is a natural learning process which was formally described and applied primarily by the American psychologist, Burrhus F. Skinner, an exciting and controversial figure in the development of psychology as a science. Beginning round about 1938, he developed a series of experiments to investigate learning in animals, and succeeded in doing things such as, for example, training a pigeon to peck only at a yellow disc, and to ignore discs of all other colours. Although Skinner would probably not have phrased it this way, the central idea behind his theory is that animals (and people) learn by trial and error, and will modify their behaviour in order to obtain a reward or to avoid something unpleasant. In the course of his experiments, he defined the term positive reinforcement, and several other related terms.
While it is not necessary to have a deep academic understanding of Skinner's work in order to train a dog, understanding the most important terminology is extremely helpful, as the methods used are somewhat different from those of traditional dog handling. Although Skinner worked largely with pigeons, I will try to explain his terminology using dog-handling exercises as examples.
Behaviour: A behaviour is something a dog does, such as sitting on command or pulling on the lead. It might be something you would like to get him to do, such as sitting on command. Getting him to do it the first time is called eliciting the behaviour. Getting him to learn to do it every time you ask is called establishing the behaviour. And getting him to keep on doing it instead of trying his luck is called maintaining the behaviour.
Alternatively, it might be something you want him to stop doing, such as pulling on the lead. Getting him to stop doing it as much is called weakening the behaviour. Getting him to stop doing it completely is called extinguishing the behaviour.
Positive and Negative Reinforcement: There are two ways of getting the dog to do something you want (establishing the behaviour), such as sitting on command. These are called positive reinforcement and negative reinforcement. Positive reinforcement means waiting for the dog to sit accidentally and giving him a reward, such as a food treat, as soon as he does (this is capturing the behaviour), or luring the sit and rewarding it when it occurs (this is eliciting the behaviour). He learns with remarkable rapidity that sitting on command will earn him a reward, and will repeat the behaviour if he continues to receive the reward, thus establishing the behaviour.
Negative reinforcement means doing something unpleasant to the dog, such as using a shock collar to shock him until he sits, and then immediately stopping the shock.
More simply put, positive reinforcement means giving him something good and negative reinforcement means taking away something bad. Reinforcement is always done immediately after the behaviour you want (i.e. as soon as he sits, you either give him the treat or stop shocking him), but never before (i.e. you don't give him the treat or stop shocking him to encourage him if he hasn't sat!).
Negative reinforcement works extremely well and produces behaviours which are extremely resistant to extinction (i.e. difficult to get rid of). Applying aversives correctly takes some skill, however, and getting it wrong can cause serious side effects such as stress, anxiety and even aggression, and there is also the training problem that the dog quickly learns to avoid the unpleasant stimulus which might result in his avoiding you! So unless you know exactly what you are doing, you're probably better off sticking to positive reinforcement.
Punishment and Non-reinforcement: There are several ways of getting the dog to stop doing something you don't want him to do, such as pulling on the lead. Academic psychologists call the two most common ones punishment and extinction, but to avoid confusion I am going to call them punishment and non-reinforcement.
You can punish in two ways, either by doing something unpleasant to the dog such as smacking him on the nose or spraying him with a citronella spray, or by removing something pleasurable that he has, such as his food, a toy or your attention.
Non-reinforcement means exactly what it says. You don't reinforce the behaviour, either positively or negatively. You ignore it completely, and you continue to ignore it. For ever.
Punishment works better in the short term, as it stops the unwanted behaviour immediately, but once the punishment has been over for a while, the behaviour will return (unless the punishment is severe and your timing is excellent!). This is called spontaneous recovery. In other words, if you thump your Dobe for pulling on the lead, he will stop pulling for about two minutes (if you're lucky) and then start again. Dogs and most other animals also get habituated to, or used to, a particular punishment (psychologists love long words, don't they?). In other words, the more often you punish a dog in the same way, the less effective the punishment gets. One can surmise that a stubborn dog such as a Dobermann habituates to, or gets used to, a particular punishment very quickly, and soon learns to ignore it, in which case it no longer acts as a punishment. Shouting is a good example of a punishment to which the dog is habituated. (Ring any bells?)
Positive punishment has the same sort of fallout as negative reinforcement, and very few people apply it expertly enough to make it work well without either traumatising the dog or being so poorly applied as to be useless. It is thus an approach to be avoided except in extreme cases, and needs to be carefully thought out before being applied.
Negative punishment can have quite strong emotional consequences for the dog and is a very powerful technique. Again, it needs very good timing and understanding of what you are doing, so this article will concentrate on extinction, or non-reinforcement.
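For readers who find a table easier to remember than four long names, the terms used so far can be laid out as a small lookup. This is only an illustrative sketch in Python: the names Change, Stimulus, QUADRANTS and classify are made up for this example and do not come from Skinner's work, and the four examples are the ones used in this article.

```python
from enum import Enum


class Change(Enum):
    """What the handler does: add something or take something away."""
    ADD = "add"
    REMOVE = "remove"


class Stimulus(Enum):
    """How the dog experiences the thing being added or removed."""
    PLEASANT = "pleasant"
    UNPLEASANT = "unpleasant"


# Hypothetical lookup table: (what you do, what it is) -> name of the technique.
QUADRANTS = {
    (Change.ADD, Stimulus.PLEASANT): "positive reinforcement",
    (Change.REMOVE, Stimulus.UNPLEASANT): "negative reinforcement",
    (Change.ADD, Stimulus.UNPLEASANT): "positive punishment",
    (Change.REMOVE, Stimulus.PLEASANT): "negative punishment",
}


def classify(change, stimulus):
    """Return the operant conditioning term for a handler action."""
    return QUADRANTS[(change, stimulus)]


# Examples taken from the article:
print(classify(Change.ADD, Stimulus.PLEASANT))       # giving a treat for a sit
print(classify(Change.REMOVE, Stimulus.UNPLEASANT))  # stopping the shock when he sits
print(classify(Change.ADD, Stimulus.UNPLEASANT))     # a smack on the nose
print(classify(Change.REMOVE, Stimulus.PLEASANT))    # taking away your attention
```

Reinforcement (the first two) makes a behaviour more likely, and punishment (the last two) makes it less likely; "positive" and "negative" only say whether something is added or taken away.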
Non-reinforcement won't have as much effect as punishment immediately, but over a longer period will often get rid of the behaviour completely or almost completely. It is also by far the best way to make sure that a behaviour you don't want doesn't get established in the first place. If you do not reward the dog when he pulls on the lead the first few times as a pup, he is far less likely to persist in pulling. (If that sounds rather odd to you, it's supposed to.)
Some behaviours can be extremely difficult to get rid of, or extinguish, though, because they are so well-established that even one reinforcement occasionally is enough to keep them going. In other words, if your dog has formed a really strong habit of pulling on the lead, you can ignore his pulling for three weeks and then reward him for it once, and he'll keep on doing it. (I hope you're becoming rather mystified at the moment, because of course you don't reward your dog for pulling on the lead. Or do you?)
Some behaviours are also intrinsically reinforcing, or are biologically motivated. For example, pulling on the lead is partly due to an opposition reflex called thigmotaxis, which means basically that the dog will tend to resist physical pressure by pushing or pulling in the opposite direction. Non-reinforcement can thus be very difficult to apply, and a third technique of extinguishing (getting rid of) behaviour called training an incompatible behaviour is sometimes needed.
Training an incompatible behaviour is a phenomenally complicated name for a fairly simple idea. Basically, if you want to get rid of a behaviour that you don't want, such as pulling on the lead, you use positive reinforcement to teach the dog a new, good behaviour which he can't do at the same time as doing the old, bad behaviour. The new behaviour is called the incompatible, or competing behaviour, because it competes with the old behaviour. (Original, that.) Because the new behaviour is reinforced using positive reinforcement, it becomes strongly established, and the old behaviour, which he can't do at the same time as the new behaviour, thus weakens and eventually disappears. In our example you would teach the dog to walk on a loose lead as a competing behaviour, because he can't walk on a loose lead and pull at the same time.
At this point, your mystification must be overwhelming, because you can't teach him to walk on a loose lead when he's pulling all the time. Or can you?
In order to explain this point, it's worth having a close look at how a good competition handler stops a dog from pulling. (Remember that punishment stops a behaviour immediately, but that the effect often doesn't last.) As the dog starts to forge ahead, the handler will release all the slack in the lead, swivel sharply on her left foot and take off fast in the opposite direction. The dog is brought up very sharply at the end of the lead and is often pulled off its feet if the handler moves fast enough. This is clearly a punishment. There is thus a short window during which the behaviour stops, i.e. the dog stops pulling and walks on a loose lead (new behaviour!). The reason this correction works is that a good handler will punish every incident of pulling and then immediately praise the dog for walking on a loose lead (positive reinforcement), thus establishing the new, good behaviour. If this is done quickly and emphatically enough, the dog will learn the new behaviour thoroughly and the old behaviour will disappear. In this case, the old behaviour (pulling) has been successfully replaced by a new behaviour (walking on a loose lead). In fact, what the handler has done is punishment followed by the establishment of a competing behaviour.
Unfortunately, most of us don't handle that well. We make two mistakes. We don't punish the dog each and every time he pulls, and we don't praise him quickly enough when he walks on a loose lead. Most importantly, though, what we fail to realise is that every time we let the dog pull, even if it is only for two yards, we are positively reinforcing the pulling behaviour, and thus making the behaviour harder to extinguish.
What's the difference? This is where the real difference between traditional training and positive reinforcement training starts to emerge, because of course we are not aware of rewarding the dog for pulling! We don't praise him or pat him or give him a treat when he pulls. So what's going on?
The question to ask at this point is why the dog pulls in the first place, and the simple answer is: because he wants to go somewhere. When you get towed along behind him, is he getting there? Of course he is! Every step you take behind a pulling dog reinforces him positively for pulling.
The next question to ask is how you stop reinforcing him, and the simple answer is: you stand still.
And suddenly, you have the means to stop him from pulling without hauling him around and dislocating your shoulder in the process, without shouting and yelling, without even raising your voice. Instead of applying punishment, you simply stop reinforcing the behaviour, and although this doesn't work as fast as punishment, it lasts a great deal longer.
Stopping a dog from pulling now becomes quite simple. All it requires from you is patience and consistency (remember, if you reinforce him even once, he's likely to retain the behaviour). Set out for a walk with him. As soon as he starts to pull, stop in your tracks and stand still. Don't shout at him, don't talk to him, don't yank on the lead, just stand there. When the lead slackens, start walking again and praise him effusively for every little bit that he manages to do on a loose lead. But as soon as he starts to pull again, stop. If he's a hardened puller, you might not get more than a few yards down the road on the first day, but don't give up; he'll get the idea more quickly than you think, and of course you will be reinforcing the new behaviour all the time.
Now why didn't I think of that 25 years ago?
In psycho-speak (which I hope you're learning fast), instead of punishment followed by the establishment of a competing behaviour, we now have non-reinforcement followed by the establishment of a competing behaviour. We get all the advantages of positive reinforcement without any of the disadvantages of punishment. It sounds like a good deal to me.
The real beauty of this method is that a strong-willed, determined, highly intelligent dog like a Dobermann won't stop trying to go somewhere, but he will realise very quickly that if pulling doesn't get him anywhere, he'll have to try something different; and because he's determined, he'll keep on trying until he finds something that works. It won't take long for him to discover that walking on a loose lead reinforces him in two ways: he gets to go where he wants to go and he gets praised for it! Suddenly, all that will power and energy is being put into finding out what you want him to do. The more stubborn and determined the dog, the harder he will work to find a way of getting the reward! Streets ahead of a Border Collie? Light-years ahead!
Living with positive reinforcement: When using positive reinforcement training, it's important to make that the basis of your entire relationship with the dog. The golden rule for achieving this is to stop giving him attention, affection and treats (no, it isn't harsh; bear with me) and make him work for them.
Dogs which are stroked and petted endlessly by their owners habituate to (get used to) the petting, so it loses its value as a reward. They also get bored and frustrated because they are usually not doing enough work, and will often start getting up to mischief such as excessive barking, chasing cats, digging etc. A dominant dog may also misinterpret petting as submissive behaviour, and can become very aggressive when the owner tries to discipline him, particularly if he has not been consistently handled.
To make matters worse, if your dog has an easy life being fed and petted without having to do anything in exchange, he's not going to enjoy it when you tie a chain around his neck, drag him around the garden or exercise field, push him around physically and shout at him, and will strenuously resist this treatment. Wouldn't you?
To use positive reinforcement really effectively, you need to turn every interaction with your dog into a lesson. Get him to do some small task, even if it's only a sit or his latest exercise, before you praise him, pet him, give him a treat or feed him, and don't let him demand attention from you; ignore him if he does.
There are several benefits to this:
Try it and see.
Food treats and timing: Food treats are used extensively with this method. Some old obedience hands will question this for one of two reasons: first, that the dog becomes dependent on the treat, and second, that the dog gets used to the treat and stops working for it. They are both absolutely right and completely wrong. What is critical is the timing of the treats, and this has to be planned very carefully:
Skinner and his colleagues devoted a great deal of research to what they called reinforcement schedules, and came to the conclusion that this was by far the most important element of operant conditioning, possibly even more important than the degree of pleasure afforded by the reward.
Remember the three stages of teaching a desired behaviour such as sitting on command: eliciting, establishing and maintaining. During the first two stages, i.e. when the dog is learning a new behaviour, he should be reinforced continuously. In English, that means that you give him the treat every single time he does what you want. When teaching him to sit, give him a treat every single time he sits on command.
However, there are problems associated with continuous reinforcement schedules, one being that the dog sates quickly if he is being given a treat every time he performs the desired behaviour. He simply gets full, or he gets bored with the treat; whatever happens, after a while he will stop working for it.
On the other hand, if you stop reinforcing him altogether, you are practising non-reinforcement, and again, the behaviour will disappear, and if the behaviour was originally established with a continuous reinforcement schedule, it will tend to disappear quite rapidly!
Skinner & co. found that by far the best way to maintain an established behaviour was to reward it sometimes using a variable ratio reinforcement schedule (what a mouthful!). The best way to illustrate what this means is by using an example.
The ratio is the proportion of sits which are rewarded, and the variable is the number of sits in between reinforcements. Suppose you have successfully taught your dog to sit on command and you want to maintain the behaviour. First you decide what ratio, or proportion of sits you want to reinforce. This means that you reinforce 1 in 5 sits, or 1 in 10 sits, or 1 in 20 sits, whatever ratio you decide on. Suppose you decide to reinforce 1 out of every 5 sits.
You then make sure that you average 1 reinforcement to every 5 sits, but that you vary the number of sits in between reinforcements (hence variable ratio; these names do make sense, sort of). It is very important not to reinforce him on every 5th sit, as he will see a pattern emerging. However, after a large number of sits he should have received on average 1 reward for every 5 sits.
So in 20 sits you would give 4 rewards, but NOT on the 1st, 6th, 11th and 16th sits! You might give one on the 2nd sit, one on the 9th, one on the 12th and one on the 17th; in other words, the number of sits which don't get rewarded is different each time, and the dog has no way of working out in advance which sit is going to be rewarded. This keeps him on his toes as he knows that the reward will turn up sometime in the future, but not when.
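If you would rather not make up the pattern in your head, the arithmetic above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a method from Skinner's research: the function name variable_ratio_schedule and its parameters are invented for this example, and it simply picks the rewarded sits at random, re-drawing on the rare occasion that the gaps between rewards come out identical.

```python
import random


def variable_ratio_schedule(total_sits, ratio, seed=None):
    """Pick which sits (numbered from 1) get a reward.

    Gives total_sits // ratio rewards in all, i.e. about 1 reward per
    `ratio` sits, with the rewarded sits chosen at random so that the
    gaps between rewards vary.
    """
    if ratio < 2:
        raise ValueError("a ratio of 1 is continuous reinforcement, not variable ratio")
    rng = random.Random(seed)
    n_rewards = total_sits // ratio
    while True:
        rewarded = sorted(rng.sample(range(1, total_sits + 1), n_rewards))
        gaps = [b - a for a, b in zip(rewarded, rewarded[1:])]
        # Re-draw if every gap is the same (e.g. 1st, 6th, 11th, 16th),
        # because a regular pattern is exactly what we want to avoid.
        if len(gaps) < 2 or len(set(gaps)) > 1:
            return rewarded


# 20 sits at a ratio of 1 in 5 gives 4 rewarded sits, unevenly spaced.
print(variable_ratio_schedule(total_sits=20, ratio=5))
```

Each run prints a different list of four sit numbers, which is the point: neither you nor the dog can predict the next reward from the last one.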
Behaviours reinforced in this way become extremely resistant to extinction, even in the absence of reinforcement. In other words, they become very difficult to get rid of!
It should now be a bit clearer why a behaviour such as pulling on the lead can be so difficult to eliminate. An occasional and apparently unimportant slip-up like letting the dog pull you a few yards down the road is actually a variable ratio reinforcement, the most powerful way of maintaining a behaviour over a long period of time! Non-reinforcement really does mean non-reinforcement!
Positive reinforcement in practice: All right, how did I get Slug to restrain himself from leaping all over me when I praised him? It wasn't too difficult. First, I got out a reward (a food pellet) and made sure he knew I had it. Then I told him to sit, which he did very eagerly because he wanted the pellet. Once he was sitting, I started praising him in an excited tone of voice. As soon as he jumped up at me, I turned my back to him, folded my arms and stared at the wall for a few seconds. This is non-reinforcement of an unwanted behaviour, namely jumping up. Needless to say, he didn't get the pellet.
After a few moments, I turned back (to a very anxious dog!) and repeated the whole process. On the third try (these guys have got brains!) he managed to hold his bum on the floor for a couple of seconds and I immediately gave him the pellet (positive reinforcement) and made a fuss of him. You have never seen such a happy dog! On the fourth try, he stayed sitting for a moment again and then jumped up, just as he did on the third try, so I turned round and folded my arms. On the fifth try, he managed to sit for a bit longer so I rewarded him again. And so on.
(This little-by-little approach is called behaviour shaping. I haven't actually discussed it in this article, but it's quite easy: You reinforce the dog for doing something similar to the behaviour you're after, but if he repeats the similar behaviour on the next try you withhold the reward. If he then does something even closer to the behaviour you want, you reward him again, but if he repeats that behaviour, you withhold the reward again, and so on. This tells him that he's on the right track, but hasn't got there yet. Only once he is doing exactly what you want do you start reinforcing him continuously.)
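The shaping rule in the note above (reward an improvement, withhold the reward for a mere repeat) can be written out as a short Python sketch. The function name shaping_decisions is invented for this example, and the sit durations are hypothetical numbers chosen to mirror Slug's five tries, not measurements.

```python
def shaping_decisions(seconds_held):
    """Decide, try by try, whether to give the reward.

    A try is rewarded only if the dog held the sit longer than on any
    previously rewarded try; repeating the old standard earns nothing.
    """
    best_so_far = 0.0
    decisions = []
    for held in seconds_held:
        if held > best_so_far:
            decisions.append(True)   # closer to the goal: reward
            best_so_far = held       # and raise the bar
        else:
            decisions.append(False)  # no improvement: withhold
    return decisions


# Hypothetical durations (in seconds) for five tries: two immediate
# jump-ups, a short sit, the same short sit again, then a longer one.
tries = [0, 0, 2, 2, 4]
print(shaping_decisions(tries))  # [False, False, True, False, True]
```

In real training the "closer to the goal" judgement is yours to make on the spot; the sketch reduces it to a single number only to make the rule visible.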
Within a few minutes, Slug was determinedly holding himself in a sit while I praised him and made an enormous fuss of him verbally. He would not budge until I had given him a release command. At no time during the lesson did I touch him, except to add physical praise to the pellet, which of course increased the value of the reward.
In fact, by turning my back on him, I actually punished him for jumping up as well by removing my attention and praise, which he was thoroughly enjoying. (Remember that the second type of punishment is the removal of something pleasurable to the dog.)
Easy when you know how, isn't it? Ironically, I have been fascinated by the principles of operant conditioning for several years and have in fact successfully applied them to humans! As a result of using positive reinforcement in my previous career as an IT manager, I was several times treated to the extremely entertaining spectacle of five grown men sprinting down a corridor to make sure they were on time for a meeting run by a woman! But it's taken me until now to start applying these extremely kind, sensible and successful principles to my dogs. Still, as I've just demonstrated by writing this article, you can teach an old dog (of the female variety) new tricks!
By the way, did you manage to read through all the complicated theory in this article so you could find out how I taught Slug a reliable sit? Of course you did. Here's how I got you to do it. I dangled a treat in front of you by telling you what I did with Slug, but not how I did it. I reminded you that I had a treat for you when I suggested that you might have to wait for the next issue to get the answer, thereby keeping your attention. Then I dangled a similar treat in front of you a bit further along (the mystery bit about rewarding pulling on the lead), and a few paragraphs later I gave you the treat (the bit about stopping in your tracks). This was a reward (positive reinforcement) for persevering with the behaviour I wanted from you: continuing to read the article. You thus learned that continuing to read was a behaviour which would be rewarded, and so you persevered until the end of the article and got your treat! I do hope it was worth it, and that you will find even greater satisfaction and reward in applying these principles to your own wonderful Dobermann.
Last Updated on Sunday, 30 September 2007 10:59