# Homework 1

Due by 11:59 PM on Wednesday, September 16, 2020

Getting your assignment: You can find template code for your submission at this GitHub Classroom link. All of the code you write should go in hw1.Rmd, and please knit the Markdown file in your completed submission.

## The Rescorla-Wagner Model

The Rescorla-Wagner model, developed by Robert Rescorla and Allan Wagner in 1972, was extremely influential at the time of its publication because it was able to explain several puzzling findings in Pavlovian conditioning, especially the phenomenon of blocking. It has since been extended by researchers working in reinforcement learning to account for a number of other interesting phenomena. It is also the basis of the delta rule used for training simple neural networks, as you’ll see later in the course.

You’ll work with the same simplified version of the model that you saw in class. The model describes the change in strength associated with a conditioned stimulus ($$\Delta V$$) with this equation:

$$\Delta V = \alpha \cdot \left(\lambda - V_{total}\right)$$
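As a quick check of the arithmetic, here is the equation worked through for two consecutive reinforced trials, assuming $$\alpha = 0.1$$ and $$\lambda = 1$$ (the defaults in the rw_delta_v stub) and a starting strength of $$V_{total} = 0$$. Each trial's $$\Delta V$$ is added to the current strength before the next trial:

$$\Delta V_1 = 0.1 \cdot (1 - 0) = 0.1, \qquad V_{total} = 0 + 0.1 = 0.1$$

$$\Delta V_2 = 0.1 \cdot (1 - 0.1) = 0.09, \qquad V_{total} = 0.1 + 0.09 = 0.19$$

The change shrinks as $$V_{total}$$ approaches $$\lambda$$, which is what produces the familiar negatively accelerated learning curve.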

#### Problem 2: Write rw_delta_v (1 point).

You’ll use the stub in the R Markdown file in the GitHub repository that looks like this:

```r
rw_delta_v <- function(Vtotal, alpha = .1, lambda = 1) {

}
```

One thing to notice is that function parameters in R can have defaults specified. If you don’t pass in a value for that parameter, it will get the default value inside the function.
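To see default arguments in action without touching the homework function, here is a small sketch. The function `scale_value` and its `factor` parameter are hypothetical names made up for this illustration; they are not part of the assignment.

```r
# Hypothetical helper (not part of the assignment): multiplies x by factor.
# `factor = 2` gives the parameter a default value.
scale_value <- function(x, factor = 2) {
  x * factor
}

default_result  <- scale_value(3)              # factor falls back to its default, 2
override_result <- scale_value(3, factor = 10) # default overridden by name

print(default_result)
print(override_result)
```

Passing the argument by name, as in `factor = 10`, is optional but makes it clear which default is being overridden.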

## Simple conditioning

To get started, you’ll simulate a simple conditioning experiment in which someone experiences 10 trials of positive reinforcement in response to a conditioned stimulus. You’ll want to produce a plot of the strength of the conditioned stimulus over the course of these 10 trials so you can see the changes that the model predicts.

There are lots of ways of setting up this experiment in R, and you’re welcome to do it however you like. In case you’d like a hand to get started, here’s one strategy:

• Make a tibble with 2 columns: trial and V. The trial column will have the values $$0$$ through $$10$$, and the V column will start as all $$0$$s.

• Write a for loop that iterates over the numbers $$1$$ through $$10$$, one per trial. Trial $$i$$ lives in row $$i + 1$$ of your tibble, because row 1 holds trial $$0$$. Set the value of V for each trial to the value of V on the previous trial plus the result of calling your rw_delta_v function on that previous value.

• Make a plot with trial on the x-axis and V on the y-axis. You can make one plot for all four of your simulations if you’re feeling comfortable with the tidyverse, or 4 separate plots.
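The strategy above can be sketched roughly as follows. This is one possible setup rather than the expected solution: the body of `rw_delta_v` is an assumption read directly off the equation, and a base R `data.frame` stands in for the tibble so the sketch runs without extra packages (`tibble::tibble()` could be swapped in).

```r
# Assumed body for the stub: the simplified Rescorla-Wagner update
rw_delta_v <- function(Vtotal, alpha = .1, lambda = 1) {
  alpha * (lambda - Vtotal)
}

n_trials <- 10

# One row per trial, 0 through 10; V starts at 0 everywhere
sim <- data.frame(trial = 0:n_trials, V = 0)

for (i in 1:n_trials) {
  # Row i + 1 holds trial i; row i holds the previous trial
  sim$V[i + 1] <- sim$V[i] + rw_delta_v(sim$V[i])
}

print(sim)
```

Calling `plot(sim$trial, sim$V, type = "b")` afterwards would draw the learning curve in base R; a ggplot2 version works just as well.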

## Extinction

Now let’s see what happens if we take the reinforcer away. Set up a simulation where the participant is exposed to 10 trials of positive reinforcement in response to a conditioned stimulus, and then 10 trials in which they get no reinforcement ($$\lambda = 0$$). Then make a plot of $$V$$ over the course of the experiment.

#### Problem 4: Try the same parameter values that you used above in this new experiment. How does the extinction curve depend on $$\alpha$$ and $$\lambda$$? (2 points).

This should be a fairly straightforward extension of the code you wrote for the last problem. The critical thing is to make sure that you are using the right value of the $$\lambda$$ parameter on each trial. Remember, a trial with no reinforcer has no reward, and thus $$\lambda = 0$$.
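One way to keep track of $$\lambda$$ is to store a value per trial and look it up inside the loop. The sketch below assumes the default $$\alpha$$, an assumed body for `rw_delta_v`, and a hypothetical helper vector named `lambda_by_trial`; none of these choices is required by the assignment.

```r
# Assumed body for the stub: the simplified Rescorla-Wagner update
rw_delta_v <- function(Vtotal, alpha = .1, lambda = 1) {
  alpha * (lambda - Vtotal)
}

# Hypothetical helper: lambda for trials 1..20
# (10 reinforced trials, then 10 unreinforced trials)
lambda_by_trial <- c(rep(1, 10), rep(0, 10))
n_trials <- length(lambda_by_trial)

sim <- data.frame(trial = 0:n_trials, V = 0)

for (i in seq_len(n_trials)) {
  # Row i + 1 holds trial i; use that trial's lambda for the update
  sim$V[i + 1] <- sim$V[i] + rw_delta_v(sim$V[i], lambda = lambda_by_trial[i])
}

print(sim)
```

With $$\lambda = 0$$ the update reduces to $$\Delta V = -\alpha \cdot V_{total}$$, so V shrinks by a constant proportion on every extinction trial.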

## Blocking

Now you’re ready to test the Rescorla-Wagner model’s ability to account for the blocking phenomenon. In this simulation, you’ll have two cues: x and y. For the first 10 trials, the simulated participant will be exposed to cue x and be positively reinforced. Then, on the subsequent 30 trials, both x and y will be present and the participant will get reinforced.

One way to make this simulation work is to replace the V column with two new columns, Vx and Vy, and to add two more columns to your tibble, x_present and y_present, which indicate whether each cue is present on each trial. On each trial, update the weight of every cue that is present by $$\Delta V$$. And make sure that $$V_{total}$$ has the right value on each trial: it is the sum of the weights of the cues that are present!
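The two-cue bookkeeping might look something like the sketch below. It assumes the default parameters and an assumed body for `rw_delta_v`, uses a base R `data.frame` in place of a tibble, and treats the row for trial 0 as a starting state in which no cue has been shown yet.

```r
# Assumed body for the stub: the simplified Rescorla-Wagner update
rw_delta_v <- function(Vtotal, alpha = .1, lambda = 1) {
  alpha * (lambda - Vtotal)
}

n_trials <- 40

sim <- data.frame(
  trial     = 0:n_trials,
  x_present = c(FALSE, rep(TRUE, 40)),                 # x on every trial
  y_present = c(FALSE, rep(FALSE, 10), rep(TRUE, 30)), # y joins on trial 11
  Vx        = 0,
  Vy        = 0
)

for (i in 1:n_trials) {
  row <- i + 1  # row holding trial i; row i holds the previous trial

  # V_total sums the weights of the cues present on this trial
  # (TRUE/FALSE act as 1/0 in arithmetic)
  Vtotal <- sim$x_present[row] * sim$Vx[i] + sim$y_present[row] * sim$Vy[i]
  delta  <- rw_delta_v(Vtotal)

  # Only cues that are present get updated
  sim$Vx[row] <- sim$Vx[i] + sim$x_present[row] * delta
  sim$Vy[row] <- sim$Vy[i] + sim$y_present[row] * delta
}

print(tail(sim, 1))
```

Because x has already absorbed most of the available strength by trial 10, the shared error term is small once y appears, so Vy ends up well below Vx.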

## Conditioned inhibition

Finally, you’re ready to simulate a phenomenon we talked about in class but that you didn’t see directly: conditioned inhibition. The setup is similar to blocking: first one cue (x) is presented and reinforced, and then x and another cue y appear together. This time, however, their combination is not reinforced. As a result, people (and the model) learn a negative value for y. Intuitively, y is the reason that x appearing did not lead to a positive reinforcer.
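A sketch of this design is below, wrapped in a small function so that the cue schedule and the per-trial $$\lambda$$ are passed in as vectors. Several things here are assumptions: the body of `rw_delta_v`, the hypothetical helper `simulate_two_cues`, and the trial counts (10 trials of x alone, then 30 compound trials, borrowed from the blocking design because none are specified for this experiment).

```r
# Assumed body for the stub: the simplified Rescorla-Wagner update
rw_delta_v <- function(Vtotal, alpha = .1, lambda = 1) {
  alpha * (lambda - Vtotal)
}

# Hypothetical helper: runs a two-cue experiment given per-trial vectors
simulate_two_cues <- function(x_present, y_present, lambda, alpha = .1) {
  n  <- length(lambda)
  Vx <- numeric(n + 1)  # element 1 is the state before any trial
  Vy <- numeric(n + 1)
  for (i in seq_len(n)) {
    # V_total sums the weights of the cues present on trial i
    Vtotal <- x_present[i] * Vx[i] + y_present[i] * Vy[i]
    delta  <- rw_delta_v(Vtotal, alpha = alpha, lambda = lambda[i])
    # Only cues that are present get updated
    Vx[i + 1] <- Vx[i] + x_present[i] * delta
    Vy[i + 1] <- Vy[i] + y_present[i] * delta
  }
  data.frame(trial = 0:n, Vx = Vx, Vy = Vy)
}

sim <- simulate_two_cues(
  x_present = rep(TRUE, 40),
  y_present = c(rep(FALSE, 10), rep(TRUE, 30)),
  lambda    = c(rep(1, 10), rep(0, 30))  # compound trials are not reinforced
)

print(tail(sim, 1))
```

Under these assumptions Vy is driven below zero while Vx declines, and the two weights sum to roughly zero by the end of the compound phase.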