Good afternoon. My name is Ranell Myles, and I want to welcome you to the NIH Office of Disease Prevention Medicine: Mind the Gap series. This seminar series explores issues at the intersection of research, evidence, and clinical practice, areas in which conventional wisdom may be contradicted by recent evidence. From the role of advocacy organizations in medical research and policy to the importance of behavioral interventions, the Office of Disease Prevention hopes to engage the prevention research community in thought-provoking discussions, to challenge what we think we know, and to think critically about our role in today’s research environment.

Before we begin, I have some housekeeping items. To participate by Twitter, follow us @nihprevent and submit questions using the hashtag #nihmtg. You may also email questions to prevention@mail.nih.gov. There is also a link to a feedback form at the bottom of the videocast page where you can submit questions during the talk. At the conclusion of today’s talk, we will open the floor to questions that have been submitted via email and Twitter. Lastly, please visit the seminar page on the ODP website, prevention.nih.gov/mindthegap, following today’s talk and click the link to the seminar evaluation under the Resources section to submit your feedback about this seminar. At this time I’d like to turn things over to Dr. David Murray, Associate Director for Prevention and Director of the Office of Disease Prevention.

Thanks, Ranell. It’s my pleasure to introduce our speaker today. Dr.
Stephanie Lanza is a professor of Biobehavioral Health and the Scientific Director of The Methodology Center at Penn State. She has a background in research methods, human development, and substance use and comorbid behaviors, with papers appearing in both methodological and applied journals. She’s co-author of the book Latent Class and Latent Transition Analysis, published by Wiley in 2010, and led the development of PROC LCA, a SAS procedure for fitting latent class models. Her research interests include advances in finite mixture modeling and time-varying effect modeling to address innovative research questions in public health. She’s passionate about disseminating these methods to health, behavioral, and social science researchers, and has organized many NIH-funded dissemination conferences, taught many hands-on workshops, and written tutorial articles to enable applied researchers to use the latest methods based on her work. At this time I’d like to welcome Dr. Lanza and turn the session over to her at Penn State.

Hi. Thank you very much, David. Thank you, Ranell. I’m very pleased to be here today. The title of this Mind the Gap talk is "Time-Varying Effect Modeling to Study Developmental and Dynamic Processes," and I’ll be referring to time-varying effect modeling as TVEM for short in this talk. The talk is structured as follows. I’d like to give a little bit of background, a history of the method if you will, and then present two empirical studies in brief. The first study is focused on nicotine addiction and the recovery process, and will focus on recovery as a dynamic process. The second study is focused on the onset of e-cigarette use in adolescents in the U.S., and this study will take on more of a developmental perspective. And I’ll conclude with a few final thoughts.

Okay, so into the background. I would argue, as many would, that human behavior as it relates to health is dynamic, and this orientation is highly relevant for
understanding a lot about health behavior. For example, we can think about behavioral change. Behavior is dynamic: it may change across age, developmentally, or it may change across time, in real time. There may be changes in processes related to the behaviors, and again the change in those processes may be across developmental age or real time. We can also think very creatively about dynamic or differential intervention effects: behavioral interventions might have effects that develop, evolve, or wash out over age or real time.

One motivating example for why we’re here today is to think about negative affect and craving. Negative affect and craving to smoke cigarettes are tightly linked in addicted smokers. When they quit smoking, what does recovery look like? TVEM can address questions that we don’t typically pose or address with our standard methods. For example, is negative affect differentially associated with craving for nicotine at various points throughout the smoking cessation process,
or how does a smoking cessation intervention affect that link between negative affect and craving over time? You can get a sense of the essence of time, or dynamics, in these research questions.

So first let’s do a thought experiment. Suppose that an alcohol abuse behavioral intervention was administered in a two-armed trial, so you had the control group and the treatment group that received the intervention. After the intervention was implemented, perhaps you had one time point assessed post baseline, a measurement of heavy drinking. Another way to design the study would be to assess, say, two time points of measurement post treatment. Or you could get a little more intensive on the data collection post treatment and get multiple or even many time points post intervention. You might think about the intervention effect manifesting itself in something like a moving window of time, and you would need a lot of time points to see that.

If we had only one time point post treatment, we might see that the rate of heavy episodic drinking after the intervention was, say, around 0.46 for the placebo group and 0.25 for the intervention group. If that is found to be a statistically significant difference, we can say that our intervention worked and we’re done. But perhaps we also have a second post-treatment assessment time point. In this case we have the same time 1 data, but now we also have the individuals followed to time 2. We see that the rate of heavy episodic drinking remains fairly steady in the placebo, or control, group, but at time 2 the intervention effect may be weakening; that is, the intervention group is starting to increase in their rate of heavy episodic drinking. So you can see a time-varying effect of this intervention. Now suppose we actually followed individuals for ten time points, and so now we have the same data
from times 1 and 2, and we see this early weakening of the effect. That weakening continues, but maybe after some developmental period or after some event, around time 8, we see a resurgence of that intervention effect. This is an example of a time-varying effect that we would not be able to observe with a single post-treatment measurement. Okay, so that’s just to get us thinking about this idea of time-varying treatment effects.

So what is TVEM? What are we here to talk about? The time-varying effect model is actually a very elegant and straightforward extension of the linear regression model that we all know. The development of TVEM was originally funded by NIDA through a P50 mechanism that was awarded to The Methodology Center at Penn State, and Runze Li, my colleague here in The Methodology Center, was the statistician who brought this to life.

So why do we collect longitudinal data as behavioral researchers? We typically want to capture temporal change in some outcome over time, and often we want to capture time-varying covariates, things like contextual variables that might be changing with time, at the same time points as the outcomes. But wouldn’t it also be natural to expect that the associations between covariates and outcomes may be changing with time? TVEM is designed to evaluate whether and how those associations might change with time.

Okay, so let’s get a little more concrete. Again, TVEM is a direct extension of the regression model. So what is regression? Regression coefficients express associations between variables, between an x and a y, say. The traditional linear regression model for predicting an outcome y from a covariate x looks like this: y = β0 + β1x + ε, that is, y equals beta naught plus beta 1 times x plus an error term. It’s beta 1 that is usually of interest: we’re looking at the effect of x on y, and that’s characterized by beta 1. Beta naught is our intercept, or our mean value on the outcome for a certain value of
x. Okay, so we interpret beta naught and beta 1 as the intercept and the slope. Now TVEM modifies the equation as follows: it allows any coefficient in your equation, the intercept, the slope, or both, to be dynamic, and that’s denoted in the statistical model with a
parenthetical t for time: y = β0(t) + β1(t)x + ε. Time can take on a broad set of meanings, but we’re going to refer to it as time for now. You’ll see that the intercept is a function of time and the slope is a function of time, and that allows us to estimate regression coefficients as flexible nonparametric functions of continuous time. We can have these nonparametric functions for our slope coefficient, for our intercept coefficient, and so on. The results are typically too large to present in tables, so we always look at these regression coefficients as functions plotted in a figure. We would interpret this coefficient function, for example, as you see with the blue line on the chart: the regression coefficient around times 1 through 3 is negative, then there’s a positive regression coefficient between times 4 and 15, and then it basically goes to 0, or, I’m sorry, it’s hovering around 0, and it says minus 0.2 on the plot. So you get a sense of how regression coefficients, betas, can vary with time on the x-axis.

Okay, so just a brief history of TVEM. I think it’s interesting because the history is so brief, and it’s nice to take stock of where we’re at. In the 1990s, theoretical statisticians did work on functional regression analysis, which was introduced in the statistical literature. Some of those papers had an impact in the field of statistics, but the impact did not cross over to the health and behavioral sciences as it could have. It wasn’t until 2010 that, led by Dr. Runze Li at Penn State, the TVEM software and procedure was developed as we know it today. In 2010 Dr.
Li was able to adapt these methods for behavioral scientists and work with us to address issues that we face in real data analysis. The software at that time was implemented as a SAS macro that could be downloaded from The Methodology Center website. In 2012 we started publishing demonstration papers and doing pre-conference workshops, and we got our first R01 to apply TVEM to reanalyze existing data, in this case on smoking cessation dynamics, which I’ll talk about in study one. In 2013 we did additional papers and workshops to disseminate the method, and then NIH funded a supplemental issue of Nicotine & Tobacco Research, which came out in 2014, that was totally focused on EMA methods, and TVEM was highlighted in there. That’s when we first started to see other researchers outside of The Methodology Center adopt TVEM. Then last summer we did our annual Summer Institute on this topic, and we saw a really nice pickup of researchers across the country being able to use TVEM in their work quite readily. We have also been working on some software developments. So this is the state of the art of TVEM and the history of TVEM.

So let’s get into study one. This is a nicotine addiction study, again thinking that recovery from nicotine addiction is not something that is black and white. It doesn’t happen overnight; it’s dynamic. We know that tobacco use is a leading cause of preventable death globally. Cessation attempts almost universally end in relapse, and withdrawal symptoms are reported as a primary reason. Through this study we’d like to improve our understanding of withdrawal symptoms and how different smoking cessation treatments might alleviate them, in hopes that these results might inform future tailoring of interventions or different designs for interventions. The overall goal of this R01 was to apply innovative methods, in this case TVEM and also latent class analysis, to existing data from a
randomized trial to understand and inform the next generation of smoking interventions.

This study used data from the Wisconsin Smokers’ Health Study, which was funded by NCI and awarded to Mike Fiore, Tim Baker, and Megan Piper at Wisconsin. They did a large-scale randomized controlled trial, and for the purposes of this talk we’re going to think about the placebo group and all five treatment groups combined, which we’ll just call the treatment group. Real-time assessments were overlaid on this study design: immediately post quitting, they were assessing withdrawal symptoms, mood, behavior, and so on very intensively. The study design looks something like this. You can see that there was a long-term longitudinal prospective study, and between two weeks prior to and two weeks
post quit date, EMA data (ecological momentary assessment) were collected on a mobile device. Participants were assessed four times per day for four weeks. The goal of this particular analysis is to study the underlying dynamics of craving during the smoking cessation attempt and to estimate the time-varying effect of the intervention on craving, but also, importantly, on decoupling craving from some of its key drivers, for example negative affect. This was published in that special issue I mentioned.

The variables in this study are craving, which is our outcome and was assessed four times per day; we’re going to look at the first two weeks post quit. The predictors were baseline nicotine dependence, which is not a time-varying variable, but you will see in a moment that the effect of a baseline characteristic can actually unfold with time; and negative affect, which was assessed intensively. Then we had intervention group as our moderator.

First you have to specify your TVEM model, and the big decision points there are thinking about, in this case within each intervention group, what varies with time. Of course we’re expecting our mean outcome, our craving value, to vary with time, so that means that our intercept will be time-varying. We also know that the observed negative affect varies with time; that’s a time-varying covariate. We would also like to specify that the effect of negative affect on craving may vary with time, and also that the effect of baseline dependence, which is a time-fixed variable, may vary with time. In fact we see that this is the case.

Here’s the statistical model that was specified in this case: craving is a function of a time-varying intercept, plus a time-varying effect of negative affect, plus a time-varying effect of dependence, plus an error term. And here’s one of the results, the effect of negative affect on craving. The red line denotes the placebo group and the blue line the treatment group, and we see in
the first two days after quitting smoking (these are addicted adult smokers) that in the treatment condition, the nicotine replacement therapy and other treatments are effectively decoupling this link between negative affect and craving. So when you get into a bad mood, are you craving a cigarette in that moment? We’re seeing that the treatments are effectively reducing that link. Similarly, but with a very different result, we saw a time-varying effect of baseline dependence. What we see is that in the second week after these adults quit smoking, baseline dependence is less important for the treatment group, but in the placebo group, those individuals who were highly dependent at baseline are experiencing significantly more craving during that second week.

Okay, so what are the implications we could draw from the study? One important implication that I like to draw is this idea that we can think differently about intervention effects. Not only can the effect of an intervention change over time, but interventions can change the associations between variables, in this case between baseline dependence and craving. The intervention in this case also defused the role of negative affect, which we know is a key driver of craving, which leads to relapse. If a treatment can break that link between mood and craving, then that might be a good way to think about how we can augment treatment in the future.

As for broader implications, again we can think about static baseline variables or individual characteristics that can have effects that change over time, and I think this is highly relevant when we think about genetic variables that have effects that might change over time. The effects of treatments in standard randomized controlled trials may be static, or they may turn on or off, but more likely these are interventions whose effect unfolds with time, maybe in a more cumulative way, or
maybe it washes out, but we don’t know unless we look for time-varying effects. So we can model intervention processes that we might suppose exist, and at the end of the day, hopefully the kind of information we can get from studying dynamics of behavior using TVEM can inform the tailoring of
treatments to individuals and, importantly I think, to time, so thinking about the relevance of the timing of an intervention for adaptive intervention trial designs.

Okay, I’m going to switch now to a totally different study and a totally different application of TVEM. This study is on e-cigarette use in adolescents in the U.S., so we’re going to take a developmental perspective here. This is an early result from a new grant that NIDA awarded us to apply TVEM to existing data to study substance use and abuse, co-use, comorbidities, and so on, and we’re just starting to unpack this large area of research.

The study that I’d like to show you is still under development, but we’re very interested in the new phenomenon of e-cigarette use among adolescents. On the history of e-cigarettes, as many of you probably know better than me, they were developed as a reduced-harm product, so there’s a perception that they’re a safe alternative to traditional cigarettes, and in some ways that is indeed true. E-cigarettes, or electronic cigarettes, are inhalation-activated devices: when you inhale, heat is produced that vaporizes a solution containing nicotine and other additives. The good thing about e-cigarettes is that they eliminate combustion, but the long-term effects are inconclusive, and some negative effects have been starting to come out in the literature. We do know that the rate of adolescent use is rising very rapidly, and it is certainly concerning because of the lack of FDA regulations today. There is also some evidence building that e-cigarettes may be a gateway to other tobacco products, including combustible cigarettes.

Okay, so this study examines cross-sectional data from the National Youth Tobacco Survey. I know I just switched gears tremendously, going from intensive
longitudinal data and EMA data to now just looking at cross-sectional data, and that is right: TVEM has broad applicability in behavioral research. The NYTS study was funded by the CDC, and this is the most recent data that was released, from 2014. It was a large national study looking at tobacco-related beliefs, attitudes, and behaviors, and also at exposure to campaigns around tobacco use. The study was large, with 22,007 U.S. middle and high school students. We retained everyone who was age 11 or older in this sample; on average they were about 14 and a half years old, and the data were, as I said, nationally representative.

The goal of this first study is to examine the etiology of both traditional and e-cigarette use. First, we wanted to estimate the disparities in the rates of use across adolescents throughout ages 11 to 19, and we wanted to look at disparities across sex and race/ethnicity population subgroups. You’ll see that moderation is a theme that comes up a lot in our TVEM analyses; it’s very straightforward to examine differential effects across subgroups, and in this case we’re talking about differential time-varying effects across subgroups. A second goal was to estimate the rate of use of traditional cigarettes, e-cigarettes, and both products, that is, individual adolescents who are using both of those products, as a continuous function of age.

Okay, so the measures in this study. A predictor was current traditional cigarette smoking, coded according to whether or not they had used recently, in the past month; in the entire sample of adolescents, six and a half percent said yes. Then there was current e-cigarette use, which was our primary outcome; past-30-day e-cigarette use was reported by 9.2 percent of adolescents. A third variable we analyzed was age, and
this is the important time variable in our TVEM model. You’ll see age plotted on the x-axis in the same way that real time was in the previous study. Then we have indicators of sex and race/ethnicity as moderators.

So again, we’re at the point where we have our variables, we have our study, and we have our research questions. The point that is the most fun for me, I think, is when you’re specifying the exact model that you want to fit, thinking again about what varies with time, or in this case what varies with age. The probability of cigarette use surely varies with age; we know this. The probability of e-cigarette use varies with age. The effects of sex and race/ethnicity may vary with age, and I would refer to that as age-varying health disparities, potentially. And importantly, the effect of cigarette use on e-cigarette use may vary with age. In other words, the odds ratio linking traditional cigarette use and electronic cigarette use may vary with age. That was a purely exploratory question that we did not know the answer to.

Okay, so the model would look something like this. The log odds of cigarette use is expressed as a function of an age-varying intercept plus an age-varying slope for sex; this would allow us to look at the sex differences in the age-varying rates of cigarette use. Similarly, there would be an age-varying effect, beta 1, which indicates the age-varying link between traditional cigarette use and e-cigarette use, and we’re going to convert that to odds ratios and get age-varying odds ratios.

Okay, so here are some of the results. In the left panel is the estimated rate of e-cigarette use for males and females as a function of continuous age, and on the right is the rate of past-30-day traditional cigarette use for males and females as a nonparametric continuous function of age. We see some considerable gender differences, but they don’t emerge until around age 14 for e-cigarettes and 15
for traditional cigarettes, with males using at higher rates in that mid-adolescent period, and by around age 18 we see no gender differences in use.

One of the very interesting findings is related to race/ethnicity differences in e-cigarette use. You’ll see in the left panel that the rate of Hispanic e-cigarette use is significantly higher than that of blacks, and even than that of whites, between ages 12 and 14. This was a somewhat unexpected finding of early Hispanic adoption of e-cigarette use. In the right panel we see a less pronounced difference for Hispanic youth, and in middle adolescence we see the expected pattern of Hispanic youth falling somewhere between the rates of white youth and black youth. In e-cigarettes, that’s definitely not what we see, though.

Finally, we wanted to look at the co-use of e-cigarettes and combustible, traditional cigarettes, again thinking of e-cigarettes as a potential gateway to use of more harmful tobacco products. We see very significant associations, particularly in early adolescence. These are odds ratios on the y-axis and age on the x-axis, from 11 to 19, and we can pull out a point on this curve and interpret it. For example, among those aged 12, adolescents using e-cigarettes are more than 40 times as likely to use traditional cigarettes compared to those not using e-cigarettes. So we see incredibly large odds ratios, which drop to around 15 by high school.

So what are some implications from this particular study? Well, this approach enabled identification of key ages of risk, and that can perhaps inform targeted or age-appropriate intervention strategies. We see clearly that traditional and e-cigarette use do go hand in hand, particularly in very early adolescence, and we see that early use of e-cigarettes is significantly more likely among Hispanic youth, suggesting greater risk for future nicotine
dependence in this population subgroup.

So I’d like to make a couple of concluding notes. It’s a brave new world of big data out there, and the next steps are, I think, uncharted at this time, but there is absolutely new information contained in a lot of new, contemporary data sources. It’s a very exciting time for methodologists in behavioral research. With intensive longitudinal data, such as EMA data or data streaming from wearable devices, we’re just on the cusp of understanding how to make use of some of that data to really inform prevention efforts. Electronic medical records and genetic data are other big sources of data that we need to figure out how to use. I think big and complex data really do mean big opportunity for prevention and health, and the movement toward adaptive interventions and precision medicine is going to be very exciting in the coming years.

I do believe TVEM has a role to play in this new age of new types of complex data. TVEM can unlock new scientific knowledge from existing data, and I think it’s a wonderful way to re-examine all of the wonderful NIH-funded data we’ve collected over the years. TVEM can help elucidate complex processes that unfold with time: dynamic effects of interventions; developmental associations (for example, the risk factors for substance use at age 12 might be different from the risk factors at age 16); and associations across historical time (for example, the link between marijuana use and attitudes may be shifting with historical time). The complex link between ages of onset and later outcomes can also be examined using TVEM.

I would just like to acknowledge my key collaborators on TVEM. These are all colleagues here at The Methodology Center: John Dziak, Runze Li, Mike Russell, and Sara Vasilenko. We work closely together on the development and application of TVEM, and it’s been really great fun over the last few years. We’re very grateful to both NIDA and NCI for funding much
of this research. So again, thank you for your time.

Thank you very much, Stephanie. That was very interesting. I was struck by several things in your presentation. It’s not often that in behavioral science we see odds ratios up in the 40s. I was staring at the axis, making sure that there wasn’t a decimal point somewhere that I was missing, but it really was 40. That’s quite remarkable, and it was very interesting to hear about this new method. We have a bunch of questions that have been coming in that we want to pose to you, and Ranell and I will share that duty. Ranell, did you want to start?

So the first question is: have you found a minimum number of time points for applying TVEM in a randomized controlled trial of a behavioral intervention, or when assessing relationships using large longitudinal surveillance data sets?

So the question was about the number of time points? Yes. That’s a great question, and I wanted to have a chance to comment on that briefly. As with so many things in statistics in behavioral research, the answer is really "it depends," and it depends on so many things. So let’s take a minute and talk about data requirements for TVEM more generally. If you are collecting intensive longitudinal data, perhaps with EMA or wearable devices or step counters, then individuals will have intensive data, often sporadic data. TVEM can easily handle data from different subjects coming in at different rates, at different time points, and so on. At the end of the day, all of the data have to be mapped onto a common metric of time. What you want in these highly intensive data studies is very good coverage of your time axis, and that’s going to enable us to detect some of the nuances in changes or reactivity as a function of time.

If you have a panel study, for example Add Health, and many of us are familiar with Add Health, it has four waves of data collection that began in
the 90s and is ongoing today, going into wave five. In this case there was a very large core sample, and if you take all person-time points that
were observed across the four waves, we had over 34,000 of them. Instead of thinking about the four waves of the panel study, time 1, time 2, time 3, time 4, or wave 1, 2, 3, 4, suppose you pull out information about exactly what age an individual was at each assessment; I believe with Add Health we can code it to the nearest month. Then we have coverage across age that looks like this with four waves of panel data, and that allows us to very flexibly estimate functions across continuous age, developmental changes.

So the question that was posed is important. When we were designing panel studies years ago, two decades ago and further, there was this assumption that everyone had to be assessed at the same time points going forward, and if you had an intervention, then you would do your follow-ups at baseline, at three months, at six months, and so on. But with TVEM it’s actually a strength when you do not adhere to that. If some individuals were assessed at three, six, and nine months and others were assessed at four, eight, and twelve months, and you put it all together, the information you’re getting and the inferences you’re making about the population or the effects of the intervention can be quite nuanced with relatively few waves of data. So it very much depends on the design of the study and the research question at hand. In general, it’s safe to say that the better coverage you have of your time or age axis, the more dynamic fluctuations you’ll be able to detect.

Thank you. We have another question: how does TVEM deal with missing values?

It’s a good question. This is on the list of areas for future research, for sure. TVEM is unlike growth curve modeling, where all available data for one individual are used to get the best guess at their growth trajectory. With TVEM, all available data
from all times or ages are used, but it’s not a within-person approach; it’s a between-subjects approach. Whatever data you have at a particular point on the x-axis, a time point, those are what your regression estimates will reflect. So if you have differential attrition, you will potentially have biased results as you go on. Different things have been done. For example, in the Add Health study, if listeners are familiar with that, you could select the longitudinal sample, people who were there at all of the waves, and eventually you’ll be able to apply those longitudinal weights in TVEM, in the not-too-distant future. In that case you could use the full sample to get nationally representative results across the entire span. But for now the best thing to do is to be very thoughtful about your missing data: study it, understand it, and see at what point things start to fall apart.

In a 2012 paper in Prevention Science that was written by Mariya Shiyko, a former postdoc of the center, relapse to smoking was one of our primary outcomes. We were looking at individuals during a cessation attempt; some individuals relapsed very quickly and others did not. We looked at time-varying differences early on in the process of recovery, and we separated them in a multi-group TVEM by those who relapsed early and those who did not. It was remarkable how you could see quite different processes in those first few days after quitting for those who did and did not relapse. Of course, the ones who do relapse all attrited early, and for that reason we did not want to estimate the population-level rates of craving over the full window of time, for example. So we reduced the window of time, and we differentiated it by those who did and did not make it through the study, and it ended up being very interesting.

Okay, thank you. Another question
is this: your presentation covered several interesting issues regarding the role of baseline variables and outcomes, potentially time-varying. A challenging problem, it seems, is what implications there are for longitudinal modeling when treatment and control groups differ at baseline, as was the case in one of your first slides. Controlling for baseline could potentially introduce more problems than it solves. What is your perspective on this?

Okay, this is a complex issue and it’s kind of nuanced; I probably won’t cover all of it now. But the first study, that was a randomized controlled trial of smoking cessation treatments and their effect on relapse and craving and dynamic processes. That was truly randomized at baseline, and so baseline nicotine dependence in that study was equivalent in the placebo and the treatment groups. Individuals were randomized to treatment, and then we looked at how, within a treatment condition, that baseline dependence eventually related to their craving throughout that two-week window of recovery. And so I don’t see any issues with that in this case because, again, they were randomized at baseline. I think there’s something more to this question, but I’m not quite sure what it is.

The person who asked the question, upon looking at that early slide that you showed, may have thought, well, there’s a difference at baseline between the two arms. But that’s not actually what you were showing in the graph; you were showing an effect of the baseline level on an outcome and how that varied over time.

That’s correct. So the randomization worked very well in that study, and the individuals in the placebo and the treatment groups were essentially the same. But the effect of baseline dependence was different for those who received treatment than for those who did not receive treatment. And so it’s a very complicated story of the role of baseline dependence in this treatment process, because it’s moderated by treatment group
and is moderated by time.

Okay, thank you. Most of the people watching this, and most of the people who will watch it going forward, because it will be recorded and available on our website, are going to be new to this method. They won’t have experience with it. What advice would you give to somebody who wants to get acquainted with it, wants to get started? Are there classes, are there readings, are there other kinds of things? I know the Methodology Center has that available; I don’t know if there are others around the country that do. What advice would you give to someone who’s just starting to play with this or to think about it?

So, yes. As you are aware, I’m very passionate about the training opportunities. They are, you know, not as frequent as I would hope, and so far they’re only available through the Methodology Center. I will go to this one slide. So TVEM is a SAS macro, as I said earlier, and because it was supported by funding from NIH, it’s free for anyone to download from our website at methodology.psu.edu. There’s a corresponding users’ guide that I would highly recommend that you download at the same time. The users’ guide has very technical information about how the estimation is done, and information about model selection and so on, but it also has the how-to kind of information: it has the basics about syntax and how you specify the different parameters in the macro. So, getting the software: of course you have to be able to run SAS today to do this. So get the macro downloaded to your machine and the users’ guide to look at. And then if you’ve used SAS but you have never run a macro in your life, that’s no problem. We have a three- or four-minute video that you can watch on the same website; you can just click on it and watch how to run a SAS macro in general. And then it’s quite easy, and we’ll walk you through it in the users’ guide. The other thing that I could recommend is to find an
application that’s been published that is most similar to what you want to do, and chances are, if it was written by someone in our center, the syntax is included as an appendix.

Thank you. You can put that video on YouTube and see if it goes viral. You have another question from our audience: what about outcome measures that are highly correlated over time? How do they affect coefficients estimated by TVEM?

Outcomes that are highly correlated over time. So, you know, coming myself from a background in latent class analysis, I’m very keen on outcomes that are multidimensional, complex behavioral and multidimensional outcomes. TVEM, however, is linear regression. It is regression, and the effects are these, you know, incredibly informative nonparametric functions of time, but it is still regression, and so the same limitations that you have in linear regression you have here. The outcome in TVEM can be a continuous variable, it can be binary, it can be a count, it can even in some cases be a zero-inflated count outcome, but it is a single unidimensional outcome, and the other outcomes that you’re interested in just don’t come into play in the model at all.

You’ve used a couple of phrases that I want to make sure that I understand how they fit together. You’ve said several times that TVEM is a straightforward extension of linear regression. You’ve also referred to it as nonparametric, and I don’t think of those two things as going together. So could you help me there?

So, yes. If you think about linear regression, you’re estimating a beta for the intercept and, say, a beta for the effect of X on Y, and that is what you get in regression. In TVEM you get parametric estimates: you get a parametric estimate of the effect of X on Y and of the intercept for a given time. So if you freeze time and you take one vertical slice from those figures that I’ve shown you, those are linear regression coefficients that are parametric. What’s nonparametric is the shape of the function with age or time. With the shape, you basically let the data bring the shape of the function to the surface for you. There is, you know, some selection going on behind the scenes, or in a hands-on way, to select a function, but the functions have no parametric form as functions of time, and so whatever shape
most accurately represents how those regression coefficients change across age or time is expressed in those figures. But within a slice of time or age, it is linear regression. Does that help?

It does. And perhaps, how are you estimating those time-varying coefficients? Is that least squares, is it maximum likelihood? How are you estimating them?

The functions are estimated with spline regression. So the TVEM software has two different estimation options for the spline regression that’s underlying TVEM. One is called the truncated power spline basis function, and that’s referred to as the P-spline, and the other is a B-spline basis function. The technical details of those two different estimation approaches that are basically under the hood of TVEM are described in chapter 10 of the users’ guide. So depending on which of those two estimation techniques the user specifies, the spline regression is operating behind the scenes and estimating the functions for these coefficient curves. And then there’s, as I’ve alluded to, a model selection procedure that can be done either automated or manually to choose the exact coefficients that you’re going to report in a manuscript.

Okay. From the two examples that you showed us, it’s clear that you can handle repeated observations on the same people, so the software has to be able to take care of the correlation among observations taken on those same participants. You also showed us a cross-sectional example where we have no expectation for that kind of correlation. You had one question about correlated outcomes, but what about other kinds of correlation that often exist in data from prevention studies, from health-related intervention studies, where we may have a correlation among people who go to the same school
or who live in the same community or work in the same factory?

Yes, so these are great questions, David, and remember, these are early days of TVEM, so more functionality will be coming in the next few years. However, let’s just kind of figure out where the software is today. So can TVEM handle clustering within a person? Yes, as you said, and there are two different ways that TVEM can handle clustering within an individual, that’s repeated assessments within individuals over time. One way is through random effects, and that’s in the B-spline estimation approach: you can have a random intercept and, if you want, perhaps a random slope in there. Alternatively, there’s, I would say, the more user-friendly way, and when you’re just starting out with TVEM I would recommend the P-spline approach. That approach doesn’t have random effects, but it uses robust standard errors, which is basically GEE ...

... we have data very intensively assessed over time and we want to think about mediation, usually our mediator and our outcome are both assessed intensively over time, and how to model mediation conceptually is unclear in that framework to me today. But if you were simply wanting to know whether, if you add this other predictor in, the effects of these other predictors reduce, certainly you could do that today.

Okay, thank you. Mediation, I think, is quite complex in this context. We apologize to our audience for the technical difficulties that we just experienced a few minutes ago. I don’t know if others out in the audience had the same issues that we did, but I suspect that you lost picture and probably sound for a few minutes, and apologies for that.

You have a question from the audience. Okay, this person is wanting to better understand how much space you can have between time points. For example, say I have five data collections over 20 years on N = 1,000 folks who are within two years of the same age. If my time variable is age, I find
that there might be two- to three-year gaps with no person-time data point. Can you shed any light on this?

Well, we have not investigated to the full extent sort of what the exact data requirements would be in a situation like that. However, I will say, when we first started to apply TVEM, we thought that the data requirements would be quite intensive, that for anything that you might reasonably fit a growth curve model to, which is like five, six, seven time points per individual, you would have no business using TVEM. But I don’t think that’s quite the case. In the National Youth Tobacco Survey data that I showed in study two, we had age coded to the nearest year only, and so we had individuals that were 11, 12, 13, and on up to 19, and we were able to estimate quite interesting time-varying effects. I think as your data become more and more sparse and your time between ages becomes larger, I would say that what’s likely to happen is not that TVEM won’t work, but you just won’t pick up on any kind of complex functions. And so the nonparametric functions of time will probably tend to get quite smooth, in which case you might just be better off estimating a linear parameter for time.

Okay, I have another question: in your e-cig example you had cross-sectional data, but your method gives the impression of longitudinal data from which one could distill causative effects. Your approach with the odds ratio seemed to give the impression that e-cigs were a gateway to normal cigs, but it could also be that normal cigs are a gateway to e-cigs. How do you distinguish?

Well, you know, we would need to have certain kinds of data in the survey about which came first for an individual, and we did not have that in the NYTS dataset. I wish we did, very much. I think that the fact that at many of these ages of adolescence, from 11 to 19, many of them are using e-cigarettes more than they’re using regular cigarettes is quite
telling, and with the fact that regular cigarettes are more problematic, at least in terms of cancer incidence, thinking about e-cigarettes as a potential pathway to the more risky combustible cigarettes is of interest. However, it is... and, oh, I should just say this more broadly too: if you’re using TVEM in your research and you want to talk about these time-varying effects, it can be more useful in the body of the paper to refer to these as time-varying associations, so as to avoid this idea that there is a causal effect. These are regression coefficients and nothing more, and so they should be interpreted as such. And so the odds ratios, we could have fit the model in the opposite direction, but these are associations with age.

There was another question embedded in there, and that was about using cross-sectional data, and that it conveys this idea that we’re looking at things developmentally, or actually over time or age. And really, I think, you know, the way that I think about those models is to keep it very clear that we’re looking at a snapshot in 2014. This is very, very recent data on tobacco use, tobacco products, and at that snapshot of time in 2014 we have a huge, wonderful sample of adolescents, and they span these ages. And so we’re really thinking about age as a very important descriptor in how the behavior manifests itself in the population today. It’s important to keep that in mind, and that it is not a developmental study. The main reason why you would want to be, I think, cautious about thinking about that as developmental trends is because we know that over historical time e-cigarette use is shifting very, very rapidly, and so there’s no reason to think that those kids who are 11 in the survey in 2014 will look in eight years like the 19-year-olds do in that survey. They’ll probably look much, much riskier.

Well, thank you so much, Dr.
Lanza, for all of the useful information, and thank you to everyone who participated in today’s webinar. On the Medicine: Mind the Gap website, which is again prevention.nih.gov/mindthegap, you will find several resources for this talk, including slides, references, experts, and a link to complete an evaluation. Your feedback is very important to us as we plan the remaining sessions for 2016. Thank you again for your time. Thank you, Dr. Lanza.
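
For readers who want to see the estimation idea from the Q&A in concrete form, the following is a minimal sketch of a time-varying coefficient regression fit with a truncated power spline basis. It is not the TVEM SAS macro and does not reproduce its syntax or its model selection procedure. The function names, the simulated data, the knot placement, and the fixed ridge penalty `lam` are all assumptions made for illustration. At any single time point the model is an ordinary linear regression; only the shape of the coefficient over time is left flexible.

```python
import numpy as np

def truncated_power_basis(t, knots, degree=2):
    """Columns: 1, t, ..., t^degree, then (t - k)_+^degree for each knot k."""
    t = np.asarray(t, dtype=float)
    poly = [t ** d for d in range(degree + 1)]
    trunc = [np.clip(t - k, 0.0, None) ** degree for k in knots]
    return np.column_stack(poly + trunc)

def fit_tvem_sketch(t, x, y, knots, degree=2, lam=1e-2):
    """Penalized least squares for y = b0(t) + b1(t) * x + error.

    Hypothetical helper: the penalty value is fixed here, not selected.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    B = truncated_power_basis(t, knots, degree)
    n_basis = B.shape[1]
    # First block of columns models the intercept function b0(t),
    # second block models the slope function b1(t).
    X = np.hstack([B, B * x[:, None]])
    # Ridge penalty on the knot (truncated) terms only, so the
    # polynomial part of each function is left unpenalized.
    pen = np.zeros(2 * n_basis)
    pen[degree + 1:n_basis] = lam
    pen[n_basis + degree + 1:] = lam
    # Solve the penalized problem as an augmented least-squares system.
    X_aug = np.vstack([X, np.diag(np.sqrt(pen))])
    y_aug = np.concatenate([y, np.zeros(2 * n_basis)])
    coef, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)
    return coef[:n_basis], coef[n_basis:]

def coefficient_curve(grid, coef, knots, degree=2):
    """Evaluate an estimated coefficient function on a grid of times."""
    return truncated_power_basis(grid, knots, degree) @ coef

# Simulated data (an assumption, not data from the talk): the
# association between x and y is strong early and fades over time.
rng = np.random.default_rng(0)
n = 5000
t = rng.uniform(0.0, 1.0, size=n)
x = rng.normal(size=n)

def true_b0(s):
    return 1.0 + 0.5 * s

def true_b1(s):
    return 1.5 * np.exp(-3.0 * s)

y = true_b0(t) + true_b1(t) * x + rng.normal(scale=0.5, size=n)

knots = np.linspace(0.0, 1.0, 7)[1:-1]  # five interior knots
b0_coef, b1_coef = fit_tvem_sketch(t, x, y, knots)

grid = np.linspace(0.0, 1.0, 21)
b1_hat = coefficient_curve(grid, b1_coef, knots)
max_abs_error = float(np.max(np.abs(b1_hat - true_b1(grid))))
```

Plotting `b1_hat` against `grid` gives a coefficient curve of the kind shown in the talk’s figures. The sketch treats every observation as independent, which suits a cross-sectional design; repeated measures on the same people would additionally need robust standard errors or random effects, as discussed above, and the sketch computes no standard errors at all.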