# Andrew Hayes discusses "Modern Integration of Mediation and Moderation Analysis"

... Let's do the same thing for M: regress M on X. You then get a regression coefficient a, which estimates how much two cases that differ by one unit on X are estimated to differ on M. That's the second equation. In the third equation we're going to estimate Y from both X and M. That will give us two regression coefficients, one for M and one for X, which I've labeled b and $c'$. So b estimates the amount by which two cases that differ by one unit on M but that are equal on X are estimated to differ on Y. $c'$ has the reverse interpretation, reversing the roles of X and M: how much two cases that differ by one unit on X but that are equal on M are estimated to differ on Y.

So we imagine these three regression equations, and that gives us these four regression coefficients. It turns out that there's a relationship between these regression coefficients, expressed here. c, which we call the total effect of X on Y, cleanly partitions into two components: the direct effect of X on Y, which is $c'$, plus the indirect effect of X on Y through M, which is the product of a and b. So that's this equation here, $c = c' + ab$: c is the total effect of X on Y, $c'$ is the direct effect of X on Y, and the product of a and b is the indirect effect of X on Y through M. A little algebra shows that the indirect effect can be thought of as the difference between the total and the direct effect, $ab = c - c'$. Those of you who have done mediation analysis know that people sometimes make a big deal of how much an effect changes when you introduce a mediator, and that's reflected here in that difference. That's just the indirect effect.

OK, so this relationship between the direct, total, and indirect effects applies to any data you can give to an OLS program. It doesn't make any difference. That will always be true. As long as you're treating M and Y as continuous, this relationship will hold. You can't give me an example where that wouldn't be true, because it always holds. It's not very often that I'm comfortable speaking in absolutes, so this is one.

Now, we do research, and often we're doing research with data that don't lend themselves to unequivocal causal interpretation. We have to deal with confounds, right? Relationships can exist for a number of reasons, among them that the variables share all kinds of influences, shared causes. It's nice to get rid of those statistically. What I want to show you here is that this relationship holds even after you account for those confounding variables. So all I've done here is I've taken my model and I've added a potential confounder variable into each of these equations. Here I've only added one, but you could have many of these confounding variables added to these models and the same will be true. We give them some weight, we let an OLS regression program figure out how to weight these things to produce the best-fitting model, and this continues to hold so long as you partial these potential confounders out of both Y and M. The example we'll use today does include three potential confounding variables.

OK, if you don't want to take my word for it, let me illustrate. So here I'm estimating the effect of entrepreneurs' reported feelings of economic stress on their withdrawal intentions. These are all scaled such that higher values represent more: higher stress, higher withdrawal intentions. I've also got three variables that I'm partialling out: the sex of the entrepreneur, their tenure, which is essentially how long they've been in this business, and ESE, which is entrepreneurial self-efficacy. So when we regress withdrawal intentions on economic stress, we get c of 0.019, if you can see that in green, but there it is. That comes from this regression, where I'm predicting withdrawal intentions from economic stress plus these confounders. Here's an SPSS command that generates that output using these data. So that's the total effect of economic stress on withdrawal intentions, holding constant sex, tenure, and entrepreneurial self-efficacy. We do the same thing estimating depressed affect from economic stress as well as the potential confounders, and we get 0.159 as our regression coefficient a.
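The claim that the total effect always splits into direct plus indirect effects, even with a confounder partialled out of both M and Y, can be checked numerically. The following is a minimal sketch on simulated data, not the entrepreneur data from the talk. The helper `ols`, the variable names, and the generating coefficients are all assumptions made for illustration.

```python
import random

def ols(columns, y):
    """Least-squares coefficients for y on an intercept plus the given columns.

    Solves the normal equations (X'X) b = X'y by Gauss-Jordan elimination.
    Returns [intercept, coef for columns[0], coef for columns[1], ...].
    """
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(k)]
    for i in range(k):
        # Partial pivoting for numerical stability
        p = max(range(i, k), key=lambda r: abs(A[r][i]))
        A[i], A[p] = A[p], A[i]
        for r in range(k):
            if r != i:
                f = A[r][i] / A[i][i]
                A[r] = [A[r][c] - f * A[i][c] for c in range(k + 1)]
    return [A[i][k] / A[i][i] for i in range(k)]

# Simulated data (hypothetical): C is a confounder of X, M, and Y.
rng = random.Random(1)
n = 200
C = [rng.gauss(0, 1) for _ in range(n)]
X = [0.5 * c + rng.gauss(0, 1) for c in C]
M = [0.4 * x + 0.3 * c + rng.gauss(0, 1) for x, c in zip(X, C)]
Y = [0.1 * x + 0.6 * m + 0.2 * c + rng.gauss(0, 1) for x, m, c in zip(X, M, C)]

c_total = ols([X, C], Y)[1]            # Y on X and C: total effect c
a = ols([X, C], M)[1]                  # M on X and C: a
_, c_prime, b, _ = ols([X, M, C], Y)   # Y on X, M, and C: c' and b

print(f"c       = {c_total:.6f}")
print(f"c' + ab = {c_prime + a * b:.6f}")
```

The two printed values should agree up to floating-point error. That agreement depends on the same confounders appearing in every equation, which is the condition stated in the talk.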

... of the unknown sampling distribution. And I say empirical because we're going to actually generate a representation of the sampling distribution through repeated sampling of the data.

So here's how this works. We have a sample; we'll call that a sample of size n. That's the original data set, size n, so n rows in a data file. Let's take a random sample of our sample, and we'll take a random sample of size n. So we're going to create a new sample from our original sample that is of the same size. We can do that without being silly about it because we're doing this with replacement. When we build a new data set from our original data set, sampling with replacement, if we draw a particular row from the data file as we're building a new sample, we put that one back in to be redrawn. If we didn't do that, obviously each time you do this you'd get just the same data set, so you have to do this with replacement.

So take a random sample of the original sample, of size n, sampling with replacement, and then we'll just say, well, this is our new data set. We'll calculate the indirect effect in that new data set, and we'll do this over and over again: sampling size n with replacement from the original data set and calculating the indirect effect in that new sample, what's called a bootstrap sample. Repeat this many, many times; I recommend at least 5,000. So after this, imagine you've got 5,000 of these estimates of the indirect effect. Well, this is a data set with 5,000 rows. There's a distribution there; you could imagine visualizing it with a histogram or something like that. That's an empirical representation of the sampling distribution of the indirect effect when sampling from the original population. So we use that distribution of the indirect effect over multiple resamples, multiple bootstrap estimates, as an approximation of the sampling distribution of the indirect effect in the original data.

With this distribution we can generate an empirical confidence interval for the indirect effect. Let's just imagine sorting these from low to high. Throw out the lower and the upper two and a half percent. Take the two endpoints that are now the lowest and the highest in that distribution; those are the endpoints of a 95% confidence interval. It makes no difference what the shape of the distribution is. It doesn't matter what's happening in the tails or what it looks like in the center, because you're only basing your inference on those two endpoints after throwing out the extreme two and a half percent of the data on each side.

There are variations on this. That's what's called the percentile approach to calculating a bootstrap confidence interval. There are things you could do, something called bias correction, or bias correction and acceleration, which in theory should be better, although in practice it turns out not to be, or at least not always. Now, it seems like a difficult thing to do, and obviously it requires a computer. That's not something you're going to do by pointing and clicking over and over, but it's a perfect task for a computer; that's what computers are good at.

So here's an empirical representation of the sampling distribution of the indirect effect in this model. I actually did ten thousand, so ten thousand bootstrap estimates of the indirect effect. Here is a representation of that sampling distribution, those ten thousand bootstrap estimates. We throw out the lower and the upper two and a half percent, and the two values at the bottom and the top of what remains are 0.056 and 0.17. So those are the endpoints of a 95% confidence interval for the indirect effect. Because zero is not in that interval, you use that as evidence that the indirect effect is different from zero with 95% confidence. It's sort of like saying p less than 0.05, although not technically that, because hypothesis testing is conditioned on a true null hypothesis, whereas this is not.

I always get questions, and I'm happy to take questions later, about the rationale for this. Is this cheating? Are we making up data? No, but it's worth spending some time talking about those issues for those who are asking. OK, so do you need special programming skills to do this? I have some programming skills, but I wouldn't want to be doing this each time I ...
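The resampling procedure described above (resample n rows with replacement, recompute the indirect effect, repeat, sort, trim 2.5% from each tail) can be sketched in a few lines. This is an illustrative sketch on simulated data for a simple mediation model without covariates; it is not the analysis from the talk. The function names `indirect_effect` and `percentile_bootstrap_ci` and the simulated coefficients are assumptions made for the example.

```python
import random

def indirect_effect(rows):
    """a*b for simple mediation, from rows of (x, m, y) tuples.

    a is the slope of M on X; b is the partial slope of Y on M holding X
    constant. Both are written in closed form from sums of squares and
    cross-products, which matches OLS for this three-variable case.
    """
    n = len(rows)
    mx = sum(r[0] for r in rows) / n
    mm = sum(r[1] for r in rows) / n
    my = sum(r[2] for r in rows) / n
    sxx = sum((r[0] - mx) ** 2 for r in rows)
    smm = sum((r[1] - mm) ** 2 for r in rows)
    sxm = sum((r[0] - mx) * (r[1] - mm) for r in rows)
    sxy = sum((r[0] - mx) * (r[2] - my) for r in rows)
    smy = sum((r[1] - mm) * (r[2] - my) for r in rows)
    a = sxm / sxx
    b = (smy * sxx - sxm * sxy) / (sxx * smm - sxm ** 2)
    return a * b

def percentile_bootstrap_ci(rows, n_boot=5000, level=0.95, seed=None):
    """Percentile bootstrap interval for the indirect effect.

    Returns (lower, upper, sorted_estimates).
    """
    rng = random.Random(seed)
    n = len(rows)
    estimates = sorted(
        indirect_effect(rng.choices(rows, k=n))  # n rows, with replacement
        for _ in range(n_boot)
    )
    cut = int((1 - level) / 2 * n_boot)  # estimates discarded in each tail
    return estimates[cut], estimates[n_boot - cut - 1], estimates

# Simulated data (hypothetical): true a = 0.8, true b = 0.8.
rng = random.Random(42)
data = []
for _ in range(200):
    x = rng.gauss(0, 1)
    m = 0.8 * x + rng.gauss(0, 1)
    y = 0.2 * x + 0.8 * m + rng.gauss(0, 1)
    data.append((x, m, y))

low, high, boot = percentile_bootstrap_ci(data, n_boot=5000, seed=7)
print(f"point estimate ab = {indirect_effect(data):.3f}")
print(f"95% percentile bootstrap CI: [{low:.3f}, {high:.3f}]")
```

If the printed interval excludes zero, that is read as evidence of a nonzero indirect effect, as in the talk. The bias-corrected variants mentioned above are not implemented in this sketch.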

... effect: large versus small, present versus absent. So here's a graphical or conceptual depiction of moderation, where X's effect on Y is itself influenced by something else. That's the arrow from M pointing at the path from X to Y. So M is depicted here to moderate the size of the effect of X on Y, so that effect depends on M, and in that case we would say that M is a moderator of the X-Y relationship. We also use the term interact. Interaction and moderation are the same thing, different terms for the same thing. X we call the focal predictor and M the moderator. It's nice to use those terms because when we talk about these models it's nice to think: what is the causal agent you're interested in? That's the focal predictor. And which is the moderator?

Now a very, very quick overview of one property of partial regression coefficients, and that's that they are unconditional effects. Let's consider a simple regression model like this, where we've got only two predictors, X and M. Here's an arbitrary example; I just pulled some coefficients out of a hat. So here's a model of Y from two variables X and M. Here are various values of X and M and the values of $\hat{Y}$ that model generates. Here's a graphical depiction of that model: X on the x-axis, $\hat{Y}$ here, and then various values of M, and the lines depict what $\hat{Y}$ is for those various combinations of X and M.

The thing to point out here is that regardless of which value of M we choose, a one-unit difference in X is associated with the same expected difference in Y. So take a distance of one unit on X: regardless of which value of M you choose, the difference in $\hat{Y}$ is the same between those two values that differ by one unit on X. It makes no difference what value of M you choose. That's why the lines are parallel; that's the property of parallel lines. It doesn't make any difference which line you pick: as you move from left to right, you change by the same amount on $\hat{Y}$. In this case $b_1$, the coefficient for X, is 0.1. As you change X by one unit, $\hat{Y}$ increases by 0.1, regardless of which value of M you choose. So we would say X's effect is independent of M; it doesn't matter which value of M. So $b_1$ is an unconditional effect. It's not conditioned on the value of M or any other variable in the model (there's only one other variable in this model in this case). That's a bad property if you're interested in moderation, because moderation means the opposite: that a variable's effect is contingent on another variable in the model.

Now, if you read ahead, you'll recognize this if you've ever done moderation analysis. Here's how we get to this. Let's let X's effect be a function of M. So what I've done here is I've replaced that partial regression coefficient $b_1$ with some function of M. Just for simplicity, let's make that function a linear function, $b_1 + b_3 M$. We could choose other functions, but linear functions are the most commonly used. So I substitute that function in, and here's what I get. Now you just rewrite this, do a little algebra, distribute the X across the two terms, and wind up with this: $\hat{Y} = b_0 + b_1 X + b_2 M + b_3 XM$. So by making X's effect a linear function of M, I end up at a model that looks like this. By adding the product of X and M as a predictor to this model, we now end up with a model in which X's effect is a linear function of M. The strength of that moderation is going to be determined by the size of $b_3$.

So here's an example of how that works. I've tacked on XM as a new predictor and I've given it some weight. So here is an example: same data set, different estimates of course for $\hat{Y}$, because it's a different model now. But what happens now? Two cases that differ by one unit on X differ by a different amount on $\hat{Y}$ depending on what value of M you choose, as reflected in the slopes, which are not parallel. How much that diverges from parallel will be determined by the size of $b_3$. If you set $b_3$ to zero, you go back to this model, where X's effect is independent of M and those lines are parallel. As you deviate from zero those lines become nonparallel; the larger $b_3$ is, the less parallel they are. So $b_3$ is doing the work.

Here's another way of representing this, and I'll be using this notation throughout the rest of today, so let me go over it. Here's our model with the product of X and M as an additional predictor. We saw that we can rewrite it this way: I group the terms involving X and then factor that X out. So that's this alternative representation of that model. Or let's just substitute this thing, $\theta_{X \to Y}$, in where I set it equal to this. That we'll call the conditional effect of X. $\theta_{X \to Y}$ is the conditional effect of X on Y, which is defined by this function $b_1 + b_3 M$. There's a nice way of representing this conceptually, where our focal predictor is X, our moderator is M, and so the effect of X on Y is this function $\theta_{X \to Y}$, which is $b_1 + b_3 M$. So $b_3$ plays an important role in moderation, because if $b_3$ is zero, that turns this whole function into just $b_1$, where X's effect is no longer dependent on M; it's independent. So we test moderation by testing the value of $b_3$ against zero.

So here's the example, from our original data set. I've changed my labels around a bit. I'm still calling economic stress X. Now we're calling depressed affect Y; before we were calling that M. And now I've got another variable I'm calling M; that's the social ties variable, the strength of their social networks, business-related social networks. So we want to know: do social ties influence the relationship between stress and depressed affect? Do social ties somehow buffer or amplify the effect of stress on depression? So here's this model in conceptual form, where my focal predictor is economic stress; that's the variable whose effect I'm interested in quantifying on the outcome Y. And then I say,
well, this effect, I believe, may be moderated by social ties. And then I've got a bunch of covariates that I'm including in here. I did this earlier, so I'm going to stick with that.

Here's the model in statistical form. This is what it would look like in the form of a structural equation model, although I've left out the error term, and you'd often correlate all these things together if you were drawing it out in Amos or another program. The important point is that this conceptual model translates into a single equation, and this is a graphical representation of that equation. Our focus is on $b_3$: we want to know whether $b_3$ is statistically different from zero. That's the coefficient for the product of X and M in this model. So here's some simple SPSS code. All I'm doing is creating a new product variable, which I'm calling interact; it's the product of economic stress and social ties. Then I include it as a predictor variable in the model, and here's the output that I get. So here is our equation, here's a graphical representation of that equation, and my regression coefficient estimate for that product term is -0.212, and it's statistically different from zero; the p-value is very small.

So from that, all we conclude is that the effect of economic stress on depressed affect seems to depend on, or is moderated by, social ties. That's all that we've got from this so far. This is a very abstract mathematical representation of the data, and how we make sense of it is anything but clear. We have an estimate, we know it's different from zero, but what does that mean exactly, except that we now know that this relationship between stress and affect is dependent on social ties? A picture is always helpful, and I would argue is in fact mandatory: before you start talking about these things, you have to draw a picture.

Here is a picture of that moderation. So what am I doing here? Well, at the top we see the regression model with those six regression coefficients now included, and $b_3$ is the regression coefficient for our product term. So there are two terms involving X in this model, so I'm rewriting this equation this way, by grouping the terms involving X and then factoring X out of those two terms. The reason why I do that is because this is that function I was calling $\theta_{X \to Y}$, right? So $\theta_{X \to Y}$, the conditional effect of X on Y, is this function $b_1 + b_3 M$. Here I plugged in some values of social ties from the data, plugged them into this function, and you get $\theta_{X \to Y}$ for those different values, and those correspond to the slopes of these lines.

So what it seems like is happening is that social ties appear to be buffering the effect of economic stress on depressed affect. Take people who report relatively few social ties. One is the lowest value in the data; remember, these are people who are members of a social networking site, so they've all got at least one person in their network. Among those people with relatively few ties, the relationship seems to be positive between stress and depression. But among those who have greater ties, the relationship is weaker. Don't think of these as absolute numbers, though; these aren't how many friends you have or something like that. It's actually an index based on frequency of contact with members in the group, and those who have more frequent contact have stronger ties. Among them the relationship is weaker, if anything even zero, and maybe even slightly negative for those who have a much stronger social network. So that's a lot easier to see than trying to interpret that -0.212. So a picture is helpful, and this picture allows us to see how these slopes link onto that function $\theta_{X \to Y}$.

We have a number of tools available for doing this, one of them being something called the pick-a-point approach, or you might hear this also called an analysis of simple slopes; same thing, different names. So when we probe an interaction, we're focusing on picking values of the moderator and then testing a hypothesis about the effect of X on Y at those values. So select a value of the moderator at which you'd like to estimate the focal predictor's effect. Then we'll derive its standard error and proceed as always. We can use that information to test a hypothesis or to generate a confidence interval for the conditional effect $\theta_{X \to Y}$.

We already know how to generate $\theta_{X \to Y}$ for any value of M that we choose. We need a standard error, and it turns out that's not a difficult problem; it's not even all that difficult to calculate by hand. You get most of the stuff you need from a regression output: standard errors and values of the coefficients. You don't get this thing called the covariance of the regression slopes automatically; you'd have to ask for that and then know where to find it. And with any luck you're plugging in the right things in the right locations, and you get a number that is almost certainly going to be wrong anyway, because rounding error is going to creep in there. You're doing these to two or three decimal places, probably, and that can be significant when you start squaring small numbers and taking square roots. So you could do this by hand, but you're likely to make mistakes along the way. There are online tools that do it for you, and PROCESS will also do this for you. So there's no need to go into Aiken and West or Cohen and find those formulas. There are easier ways of doing it; you'll see output for that later.

I've implemented the pick-a-point approach here just using those values of M that I chose earlier, producing the conditional effect of X on Y and the p-value from this method, and that's what you get. So it looks like, for those who are pretty low in social ties, arbitrarily 1 and 1.5, the relationship between stress and depression is statistically different from zero, but for these other values that I chose, 2 and 2.5, the relationship is not. Of course the problem with this, though, is how do you choose M, right? I mean, what do we use to guide our decision about which values of M you choose? And certainly which value of M we choose is going to influence ...
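The pick-a-point calculation described above can be sketched as follows. The code fits the product-term model by OLS, then evaluates the conditional effect and its standard error at chosen moderator values using the variances and covariance of the two coefficients involved. This is a sketch on simulated data, not the entrepreneur data from the talk, and not the PROCESS implementation. The helpers `fit_ols` and `conditional_effect`, the column ordering, and the generating coefficients are assumptions made for illustration.

```python
import math
import random

def fit_ols(X, y):
    """OLS via the normal equations. Returns (coefs, cov), where cov is the
    estimated covariance matrix of the coefficients, MSE * (X'X)^-1."""
    n, k = len(X), len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    # Gauss-Jordan inversion of X'X on the augmented matrix [X'X | I]
    A = [row + [float(i == j) for j in range(k)] for i, row in enumerate(xtx)]
    for i in range(k):
        p = max(range(i, k), key=lambda r: abs(A[r][i]))
        A[i], A[p] = A[p], A[i]
        piv = A[i][i]
        A[i] = [v / piv for v in A[i]]
        for r in range(k):
            if r != i:
                f = A[r][i]
                A[r] = [vr - f * vi for vr, vi in zip(A[r], A[i])]
    inv = [row[k:] for row in A]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    b = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [yi - sum(bj * xj for bj, xj in zip(b, r)) for r, yi in zip(X, y)]
    mse = sum(e * e for e in resid) / (n - k)
    return b, [[mse * v for v in row] for row in inv]

def conditional_effect(b, cov, m):
    """theta = b1 + b3*m and its standard error at moderator value m.

    Assumes the design columns are ordered [1, X, M, X*M].
    """
    theta = b[1] + b[3] * m
    se = math.sqrt(cov[1][1] + 2 * m * cov[1][3] + m * m * cov[3][3])
    return theta, se

# Simulated data (hypothetical), with a negative product-term coefficient.
rng = random.Random(3)
x = [rng.uniform(1, 7) for _ in range(250)]
m = [rng.uniform(1, 4) for _ in range(250)]
y = [1.0 + 0.6 * xi + 0.3 * mi - 0.2 * xi * mi + rng.gauss(0, 1)
     for xi, mi in zip(x, m)]

design = [[1.0, xi, mi, xi * mi] for xi, mi in zip(x, m)]
b, cov = fit_ols(design, y)

for value in (1.0, 1.5, 2.0, 2.5):
    theta, se = conditional_effect(b, cov, value)
    print(f"M = {value}: theta = {theta:.3f}, se = {se:.3f}, t = {theta / se:.2f}")
```

Each t ratio would be referred to a t distribution with n minus 4 residual degrees of freedom to get a p-value; that step is omitted here because the Python standard library has no t distribution. Working from the full-precision covariance matrix avoids the rounding error that creeps into the hand calculation.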