# Problem Set 2

>**[Q1]** What is the main question asked in this paper?

```

```

>**[Q2]** Describe the Hadza. How do they differ from market-based societies?

```

```

>**[Q3]** Summarize the experiment design. Pay attention to the source of randomization.

```

```

>**[Q4]** Why did the authors use biscuits and lighters in their design?

```

```

>**[Q5]** Summarize the main results of the experiment.

```

```

> **[Q6]** How do the results of this study compare to the sports card market study by List (2003)?

```

```

> **[Q7]** What do these results tell us about preferences? Are they endogenous or exogenous?

```

```

>**[Q8]** Why are these results valuable? What have we learned? Motivate your discussion with a real-world example.

```

```

# Replication

<p style="color:red">

*Use `theme_classic()` for all plots.*

</p>

Load the data. You may need to update your path depending on where you stored it.

```

```

## Figure 2

> **[Q9]** The column `magnola_region` is the treatment condition. Use `mutate()` to create a new column called `magnola_region_cat`, a categorical variable, that takes the value `High Exposure` if `magnola_region == 1`, otherwise `Low Exposure`. Then use `mutate()` again and `factor()` to force the new column `magnola_region_cat` into a factor variable. Factors are how categorical variables are represented in R. Do both mutations in one pipe chain.

```{r q9}

```
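As a reminder of the pattern, here is a minimal sketch on toy data (not the problem-set data). The data frame `toy` and the columns `treated` and `group` are hypothetical names chosen for illustration, and `if_else()` is one of several ways to build the labels.

```r
library(dplyr)

# Toy data: `treated` is a hypothetical 0/1 indicator
toy <- data.frame(id = 1:4, treated = c(1, 0, 0, 1))

toy <- toy %>%
  mutate(group = if_else(treated == 1, "Treated", "Control")) %>%  # character labels
  mutate(group = factor(group))                                    # character -> factor

class(toy$group)
```

Both `mutate()` calls sit in one pipe chain, so the intermediate character column never needs its own name.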

>**[Q10]** Factor variables in R have “levels” or categories. R chooses a default order for these levels. Check the order of the levels in `magnola_region_cat` with `levels()`:

```{r q10}

```

>**[Q11]** Notice how `High Exposure` is the first level. That means it will be drawn first when we re-create Figure 2. If we want to perfectly re-create Figure 2, we need `High Exposure` to be drawn second. So, we have to re-order the levels in the column. Do so with `fct_relevel()`:

```{r q11}

```
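The sketch below illustrates level ordering on a toy factor, assuming the `forcats` package is installed. The vector `size` and its labels are hypothetical and unrelated to the problem-set data.

```r
library(forcats)

# Toy factor: by default, levels are sorted alphabetically
size <- factor(c("small", "large", "small", "large"))
levels(size)

# Move "small" to the front; the remaining levels keep their relative order
size <- fct_relevel(size, "small")
levels(size)
```

Re-leveling changes only the order of the levels, not the values stored in the vector.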

> **[Q12]** Re-run `levels()` to check the new ordering of levels in `magnola_region_cat`:

```{r q12}

```

> **[Q13]** OK, let’s make Figure 2A. Use `stat_summary(fun = mean)` to plot the averages and `stat_summary(fun.data = mean_se)` to plot the error bars (hint: set the width of the error bars to something like 0.1). Assign the output to the object `fig2a`. Use `ylim()` to set the limits of the y-axis to $[0,1]$, and make sure to label both axes.

```{r q13}

```
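A minimal sketch of the two-layer `stat_summary()` pattern on toy data follows. The data frame `toy`, the columns `group` and `y`, and the choice of `geom = "bar"` for the means are assumptions made for illustration; pick whichever geom matches the figure you are reproducing.

```r
library(ggplot2)

# Toy data: hypothetical binary outcome `y` for two groups
toy <- data.frame(
  group = factor(rep(c("A", "B"), each = 5)),
  y     = c(0, 1, 0, 0, 1,  1, 1, 0, 1, 0)
)

p <- ggplot(toy, aes(x = group, y = y)) +
  stat_summary(fun = mean, geom = "bar") +                           # group means
  stat_summary(fun.data = mean_se, geom = "errorbar", width = 0.1) + # mean +/- 1 SE
  ylim(0, 1) +
  labs(x = "Group", y = "Mean of y") +
  theme_classic()
```

Note that `ylim()` drops anything falling outside the limits before drawing, so an error bar that extends past 0 or 1 would be removed with a warning.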

> **[Q14]** Figure 2B shows the fraction of subjects that traded by camp and distance to the village Mangola. This one is a bit more challenging. We have to plot distance on the x-axis and mean trade on the y-axis as a scatter plot, then size each point by total trade. Let’s start by making these summaries. Use `summarise()` to create three columns by `campname`: `mean_trade` (the average trade), `sum_trade` (the total trade), and `distance` (hint: use `unique(distance_to_mangola)`):

```{r q14, message=FALSE}

```
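The grouped-summary-then-scatter pattern can be sketched on toy data as follows. The names `toy`, `site`, `y`, `dist`, and `by_site` are hypothetical; the sketch assumes `dist` is constant within each group, which is what makes `unique()` return a single value per group.

```r
library(dplyr)
library(ggplot2)

# Toy data: `dist` takes one value per site
toy <- data.frame(
  site = c("a", "a", "b", "b", "b"),
  y    = c(1, 0, 1, 1, 0),
  dist = c(10, 10, 25, 25, 25)
)

# One row per site: group mean, group total, and the site's distance
by_site <- toy %>%
  group_by(site) %>%
  summarise(
    mean_y = mean(y),
    sum_y  = sum(y),
    dist   = unique(dist)
  )

# Scatter of the group means against distance, with point size mapped to the totals
p <- by_site %>%
  ggplot(aes(x = dist, y = mean_y, size = sum_y)) +
  geom_point() +
  theme_classic()
```

Inside a pipe, the summarised data frame becomes the first argument of `ggplot()`, after which layers are added with `+` rather than `%>%`.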

> **[Q15]** OK, now pipe the output of what you just did to `ggplot` to plot `mean_trade` as a function of `distance` and size each point by `sum_trade`. Assign the plot to `fig2b`.

“`{r q15, message=FALSE}