I hope everyone had a lovely weekend! Last week I was chit-chatting about Clicker Training and that is where we will pick up today. There are quite a few more common problems that pop up when Clicker Training so I want to get back to technical service, troubleshooting.
“Okay, so Fido will work when I don’t have food in my hand but I am having trouble getting him to not require food after each behavior. What am I doing wrong?!”
Now, in conjunction with the foundation work of keeping food out of your hands, there is a very methodical way to wean a dog from needing to be rewarded with food after every behavior. First thing to remember is that if you click, you MUST provide a food reward–so, you should condition a verbal marker (“yes!” or “good!” or something) to use in the process. I also figure out what non-food items are reinforcing to my dog–toys, play, butt skritches, etc. You can also condition secondary reinforcers. For Shayne, a hand target can be used as a reinforcer because it has been so heavily rewarded that the behavior itself is reinforcing.
There are many variations on when to start weaning but I start weaning off treats when a dog is 85-90% reliable in a given situation. Vague enough? Basically, I consider the 3-D’s (duration, distance, and distraction) along with location. If my dog is 85-90% reliable on a behavior with low level distraction, low level distance, and medium level duration in X environment (say the living room), I will start weaning off food when I ask for a behavior that fits those specific criteria. If I increase any one of the 3-D’s or change environment, I continue to reward every time until the behavior with the new criteria (in terms of the 3-D’s) is 85-90% reliable. I start weaning off food very quickly when I’m working low levels in the house–it doesn’t take too long for the pup to be at 85-90% in that environment since that’s the primary place of work at first.
So, you know when to start weaning off treats but how? Basically you start becoming more like a slot machine and less like a change machine. Both spit out quarters but one pays out the same way every time (change machine) and the other mixes it up. It’s called a “Variable Ratio” reward schedule. Basically what that means is that when you are training you are going to reward “randomly” but base it on an average number. To start, I aim to reward 1 out of 2 behaviors (as an average). But if I simply rewarded only the second time I got the desired behavior, eventually the first would begin to degrade (in terms of speed, precision, or compliance)–so, I mix it up and reward “randomly” to get the average. Example: if I ask for 10 sits in a set, I want to reward 5 of those (for the 1 out of 2 average)… my set might look like this: Sit 1: food, Sit 2: Good dog!, Sit 3: Good dog!, Sit 4: food, Sit 5: food, Sit 6: Good dog!, Sit 7: food, Sit 8: Good dog!, Sit 9: food, Sit 10: GOOD DOG!! and party (minus food)!
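If it helps to see the “random but with an average” idea spelled out, here is a rough sketch of planning one set ahead of time. This is just an illustration, not a tool I actually use: the function name `plan_set` and its parameters are made up for this example, and it assumes you fix the number of food rewards per set (5 out of 10 here) and let chance decide which sits get them.

```python
import random


def plan_set(num_behaviors=10, num_food_rewards=5, seed=None):
    """Return one label per behavior: 'food' or 'praise', in random order.

    Hypothetical helper for illustration. The number of food rewards is
    fixed, but which repetitions earn them is picked at random.
    """
    if not 0 <= num_food_rewards <= num_behaviors:
        raise ValueError("num_food_rewards must be between 0 and num_behaviors")
    rng = random.Random(seed)
    # Pick which repetitions get food, without repeats.
    food_positions = set(rng.sample(range(num_behaviors), num_food_rewards))
    return ["food" if i in food_positions else "praise"
            for i in range(num_behaviors)]


if __name__ == "__main__":
    for number, reward in enumerate(plan_set(), start=1):
        print(f"Sit {number}: {reward}")
```

Each run gives a different order, so the dog can’t learn a pattern like “only every second sit pays,” but the set still averages out to 1 food reward for every 2 sits.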
I then work to decrease that number from 5 rewards out of 10 behaviors, to 4 rewards out of 10 behaviors, then 3, etc., continuing to reduce the number of reinforcements per set (changing the ratio of rewards). This process is done over time, not in one sitting, and if at any time you see a drop in success, go back a step and increase the reward ratio. This process should be repeated in different locations and when changing one of the 3-Ds. Once the pup can go an entire set with one food reward, I play with that for a while, making sure I’m random with when that food is coming. Next step for me is to make the sets longer. So from 10 to 20 sits. When I increase this criterion I will make the ratio “easier”–so instead of 1 out of 10 sits getting a treat, I’ll go up to something like 6 out of the 20.
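The step-down (and step-back-up) logic above can be sketched as a simple ladder. Again, this is only an illustration under my own assumptions: `LADDER` and `next_step` are hypothetical names, the rungs for 2 out of 10 and 1 out of 10 are filled in from the “then 3 etc.” pattern, and I’m treating “a drop in success” as falling below 85%, borrowing the reliability number I use for when to start weaning. Your own cutoff may differ.

```python
# Each rung is (behaviors per set, food rewards per set), following the
# progression described above. The list stops at 6 out of 20 because
# that is the last rung given in the post.
LADDER = [(10, 5), (10, 4), (10, 3), (10, 2), (10, 1), (20, 6)]


def next_step(current_step, success_rate, threshold=0.85):
    """Return the ladder index to use for the next session.

    Hypothetical helper: threshold=0.85 is an assumed cutoff for
    "a drop in success", not a rule from the post.
    """
    if success_rate < threshold:
        # Success dropped: go back a step (a richer reward ratio).
        return max(current_step - 1, 0)
    # Still reliable: move on to the next, leaner rung.
    return min(current_step + 1, len(LADDER) - 1)


if __name__ == "__main__":
    step = 2  # currently at 3 food rewards per 10 sits
    step = next_step(step, success_rate=0.7)
    print(LADDER[step])  # back to 4 food rewards per 10 sits
```

In real life you would stay on a rung for a while before moving on, and you would run through the whole ladder again for each new location or change in the 3-D’s.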
Once working with bigger sets, I add in randomized jackpots to keep the behavior ultra strong (if you get a big win on a slot machine you are more likely to persist through long droughts with no wins… we use this to our advantage as trainers). *A jackpot is like an ultra-uber fantastic party for your dog… in general I say a jackpot reward should last a minimum of 20 seconds. For my dogs that’s 20 seconds of me being all happy in a high chirpy voice, dancing around, feeding them LOTS of yummy food, giving them butt skritches and making a HUGE fuss over them.
I hope that makes sense… it’s not easy to write out because while there is a general path, each dog I work with is a little different and goes at a different pace. I’m sure there are lots more specific questions but the big ideas are to use ratios to work down the number of rewards in a set, be random with your rewards to keep the behaviors strong, and use random jackpots to strengthen behaviors further so they hold up through long droughts without food rewards.
“My dog can sit and walk nicely in the house, but when we are out and about it’s like I don’t exist”
The dog trainer response is normally, “dogs don’t generalize well.” Well, that’s my answer! If they can walk nicely in your house, how about in your garage or your backyard? If the answer is no, how can you expect your pup to walk nicely outside in a new environment? You have to work up to high distraction areas–this is easier to do with some behaviors (more stationary things like sit/down/hand targeting) and more difficult with others (moving behaviors like loose leash walking and come). If a living area is too distracting (and it sometimes is), I suggest starting the training in a boring room like the bathroom. My general process is to start at the lowest distraction area where your dog easily works (this can sometimes be different depending on the behavior) but here’s my basic route to get to working outside: bathroom, living room, basement, garage, back yard, front yard, street in front of house, the neighborhood (or another really well known area, such as a pet store, if you don’t have that type of set up), and then a new environment. It really is a process of setting your dog up for success by taking small steps toward a very difficult thing. The more generalization work you do, the easier it becomes.
The other option, and one I used when working with Rio, is to do training in new places early on in the training so the pup learns to work with you regardless of environment. I worked with Rio on eye contact, name recognition, hand targeting and recall for about two weeks in the house but after that half of our training sessions were held outside and in various places (and most of the time he was dragging a leash). People often accidentally set up situations that actually teach their dogs that when the leash is off or they are outside, they don’t have to listen. I set up situations that teach my pup that regardless of location we are probably going to be working (training) so they are simply used to working in that type of environment. It’s one of the reasons I offer an outdoor training class when the weather is nice. This is not as easy a way to generalize with a dog who has already learned to not pay attention when in new places but if you have a new pup this is an interesting process (one I will devote a full post to later).
I know I’ve posted this video before, but it really does exemplify the type of focus you can get from a dog outside in distracting environments when you set them up for success by building the behavior in lower distraction environments while also teaching them that every place has the potential to be a training place (and thus a place for yum).
[youtube=http://www.youtube.com/watch?v=nDGdtKiUcC0]
We will continue with more clicker training information this week. I’m not sure if it will be every day, but keep an eye out!
Keep up the great work. I’ve been following your blog since it was on Dogster, and I love hearing about Rio and Shayne. You have the best posts! I’ve started using a clicker for tricks and obedience. Your information really helps.