Truth in Numbers – using online data to measure inflation and inform policy

I was born in Argentina, which is a wonderful and great country, but unfortunately we have a lot of macroeconomic problems, and one of the most terrible and pervasive is inflation, which is the persistent increase in the aggregate price level. I learned about the importance of inflation as a young kid. I was only ten years old in 1989, when the country went through a massive period of hyperinflation, a very high level of inflation. In fact, the monthly inflation rate went from 17% in March of that year to just over 190% in July. That meant that prices were tripling in a month, and the annual inflation rate for that year ended up being twenty thousand percent.

Prices were rising so fast that everyone’s lives were affected. I remember how my mother would go to the bank every single day; I thought she worked at the bank. She did that to change australes, the local currency, into dollars, in a desperate attempt to get rid of the australes and preserve the value of our family’s savings and income, just as all Argentines were doing at the time. She would keep only enough australes in her pockets to make the purchases of the day. We would go to supermarkets and nothing would even have a price, because prices were changing so quickly that the retailers wouldn’t bother posting them. Instead, they gave the people at the cash register an updated list of prices every few minutes, and as customers we had to hurry to make sure we got there before prices rose. People spent so much time trying to adjust to inflation that the economy suffered tremendously. Everyone’s lives were greatly affected, and it had a very big impact on the way Argentines perceive inflation and its problems.

So it should be no surprise that approximately 15 years later, in 2004 and 2005, when the inflation rate started to rise once again, a lot of people got worried, particularly when the annual inflation
rate went over ten percent. The government became very concerned. Most governments around the world, when they see inflation rising, ask the central bank to cool down the economy by raising interest rates so that the inflation rate falls. But that means the economy is cooling down, and it is not a very popular policy to pursue. The government of Argentina wanted to do things differently. They said, "Why don’t we talk to the people responsible for computing this number and see if we can change the way we measure the inflation rate?" That was a disaster, because it created a second problem: not only did we have an inflation rate that was rising, but from then on we could not be sure what the actual inflation rate was.

Now, the way inflation is measured around the world is roughly the same. A statistical agency sends people to physical stores, they write down the prices of all the goods that a normal consumer typically buys, and then they calculate the percentage change over time in the cost of that basket of goods. That is the inflation rate. So one of the first things the government of Argentina apparently tried to do at this time was to find out where the data was being collected, so they could go and pressure the retailers to keep their prices fixed. The national statistical agency said no, that is a statistical secret, as it is in every country, precisely because we want to avoid any bias in the index. The government still forced them to make a lot of methodological changes, which were never fully disclosed, and the inflation rate started to stabilize. Eventually they even fired the people responsible for computing the index at the national statistical agency. That is when the official inflation rate went below 10 percent, and it stayed there for a very long period of time.

Now, one thing happened here. When inflation goes above a certain level, roughly 10%, people start to notice it in the prices that they see
at the stores, just as had happened to me in 1989. And when people were asked, as Universidad Torcuato Di Tella did in Argentina with a survey on how high people felt the inflation rate was, they replied with numbers that were much higher than what the government was showing. They said inflation was over 20%, and the government was recognizing only 10%. So a lot of people started to get suspicious about the official numbers, but nobody really knew what was going on. If we want to measure inflation well, we need a reliable data source that can provide information on a very wide range of goods and consumer prices. That type of data didn’t exist at the time, because only the government had the resources to pursue such a large-scale data collection effort.

But I had a very simple idea: why don’t we use online data? I was a PhD student at the time, and I had already noticed in my thesis work that a lot of the largest retailers in Argentina were posting on their websites the prices of every single good that they sold. I realized that if we could tap into this source of information, if we could reliably collect this data over time, then I would have a very simple, low-cost way to measure inflation, compare it to the government’s number, and try to understand what was going on.

The technology to collect all this data online is actually very simple to understand. Behind every website that has information about prices there is a very structured language, the HTML code, which basically tells the browser how to display the information graphically on the page. This language is formed by tags that identify where each relevant piece of information lies on the page, and I realized I could teach software to recognize each of those tags, collect the information between them, and store it in a database. It was a lot of work initially to customize the software, but once it was done I could let it run automatically every single night. Over a period of a few months I had a huge amount of data for every single good sold by the largest retailers in Argentina, with daily prices for each one of these goods, so I could build my own
inflation statistics, using the methodology and the equations that the government was supposed to be using, just with a different source of data. And what I found was strikingly different from the official numbers.

This comes from a paper I wrote in 2012 in the Journal of Monetary Economics. The black line shows the price index produced by the government of Argentina, the aggregate price level in the economy, normalized to 100. What the black line is showing you is that after a period of four years the price level in Argentina had risen approximately 35 percent according to government numbers. When I repeated the exercise with online data, I found that the increase was over a hundred percent: prices had more than doubled in that time period. And this divergence was very persistent. It was not a matter of just a few months; it lasted for many, many years.

It was extremely puzzling, but very important to understand why it was so different, because the slope of this line is essentially telling you the inflation rate. If you use the official numbers, the inflation rate in the country according to the government was only 8.5 percent, but with the online data I was getting numbers close to 20 percent, which is, by the way, what people felt the inflation rate to be, and it was consistent with other estimates produced by provincial governments and by other economists.

The magnitude of the difference, though, is not proof of anything. Maybe I was using a data source that had different inflation rates, or there was something different in the way I was measuring the inflation rate. So the first thing I tried, to understand the difference, was to replicate the methodology in other countries of Latin America and see if I got similar or different results. And no matter where I looked, whether it was Venezuela, which had an inflation rate of almost 30 percent,
or Chile, which had a very low inflation rate of 3%, the results I got with online data and the methodology I was using were very similar to the official results. The only exception was Argentina. So I wondered: maybe the data I am using for Argentina is different, or can I find a way to replicate the official numbers?

The next thing I tried was to alter the data source and choose a different retailer to produce my index. That didn’t really change the results much, and there was still a huge difference with the official index. I also tried methodological changes, and even focusing on a very special subset of goods. The government at that time was imposing price controls on a set of goods that it considered important. A price control is essentially a maximum price at which those goods are allowed to be sold, and for a time no retailer could offer them at higher prices. The goal was to try to dampen the inflation rate, and in fact I did find that during short periods of time the inflation rate of those goods was much lower. But as soon as those goods were no longer under a price control, their prices would increase very quickly and catch up to the previous trend. So overall, if you think of the slope of this line as the inflation rate, it was again much higher than the official numbers.

There was essentially no way to replicate the official estimates, except for one. The only way I could replicate them was with a very simple algorithm, or mathematical solution: simply divide by three. When I did that, I got numbers that were extremely similar to the government’s.

Eventually everyone realized that Argentina was manipulating the price index. First it was Argentines themselves, normal people, businesses, and even the local media in Argentina, as this article from a major newspaper showed in 2008. Everyone realized the official numbers could no longer be trusted, and they stopped using them. Then it was global media organizations like The Economist magazine, which in 2012 stopped publishing the official indices and replaced them with data that we were already producing at the time. And finally it was even multilateral organizations like the IMF, to which Argentina belongs, which decided that the statistics being produced by the government could no longer be trusted. For Argentina, this whole process with the price index was
tremendously hurtful in terms of its economic impact. It undermined the stability of our economy and introduced a lot of uncertainty. Consumer confidence dropped and investment confidence dropped, which will affect our growth potential for many years to come. And it is a trend that will be very hard to reverse, because it is very easy to lose trust in a statistic like this and very hard to regain it.

But there is a bright side, and it goes beyond the simple case of Argentina. We realized that online data and these new technologies provided a reliable source of information for the measurement of inflation, a source with advantages that went beyond just being able to produce alternative indices in a country that manipulated them, like Argentina. Even if we focused on an economy like the U.S., where inflation is correctly measured, we could produce statistics that had significant advantages on many dimensions.

What are those dimensions? First of all, we could dramatically lower the cost of collecting all this data. By doing so we could not only free up resources for the statistical agency to apply elsewhere, but also dramatically improve the accessibility of this type of information. Anyone could collect data now, from someone like I was in 2007, a graduate student with an old laptop, to any firm that wanted to produce its own statistics, to even the government. We could also move from having to rely on small baskets of goods, carefully selected but still small, to collecting data for every single good that is being sold out there. As soon as new goods appeared in the stores and the preferences of consumers changed, we could start collecting information on those goods and create dynamic baskets that can better capture the important inflation trends an economy is experiencing. And finally, we could do this much faster: we could produce this almost in real time. And with that
objective in mind we created a project we called the Billion Prices Project at MIT Sloan. It is an academic initiative geared towards using all these new sources of information to improve economic research. But we had a very practical goal back then, which was to see if we could experiment and innovate in the way inflation could be measured in real time. We started with the U.S. economy. We collected data here from 2008 to 2010, and in 2010 we produced an inflation index using online data.

Here what we found was very different from Argentina. First of all, the index tracked, or co-moved, very closely with the one produced by the Bureau of Labor Statistics, the official agency in charge of this in the U.S. But our index had a big advantage: we could produce it very quickly and post it online, and it was very effective at detecting those particular moments in time when there is a big change in the trend of inflation, which can have dramatic consequences for policymakers or businesses that have to make decisions. For example, in September 2008, when Lehman Brothers went bankrupt in the U.S., some of the largest retailers in the country very quickly started dropping their prices, probably in anticipation of an expected drop in demand. Our index picked up that information immediately, and we were able to observe it right away. This was very useful information for someone, for example, working at the U.S. central bank
having to decide what policy to adopt at the time. Since then there have been many similar instances where our index was able to anticipate these changes in the inflation trend, which, by the way, are very hard to forecast unless you have information on what is happening with prices in real time. That is what our index could provide.

We replicated the same methodology all over the world. We now produce inflation indices in over 20 countries on a daily basis. These countries have over 40 percent of the world’s population and over 75 percent of the world’s GDP in current U.S. dollars. In every country where we do this, we find that we can reliably construct inflation indices that not only track the main movements of the official inflation rate but can also significantly anticipate some of the big changes that policymakers are so eager to see in real time.

The fact that we did this put us at the forefront of the whole discussion about big data. Big data is basically about a dramatic improvement in the way we do data collection. There is a revolution on that front, and as I mentioned before, anyone can collect data now: it can be a graduate student, a professor at a university, or a government agency that needs to do it itself. This will have an impact in economics mainly along two lines. One is what I have been discussing today: the ability to improve the way we measure things in our economy, the way we measure our economic lives. Just as we improved statistics by experimenting with all these methodologies in the case of inflation, we can improve other statistics. Some are things we already measure; some are new things that we would like to know about our economies and have not yet been able to measure correctly.

But there is also another big advantage in economics, particularly in terms of academic research. Economists doing research have had, for a really long time, to rely on
datasets that are collected for other purposes, or by governments that filter them and adjust them in certain ways. Now these data collection techniques allow people like me, and researchers around the world, to go out and collect the data we need, customized for our specific needs. That could dramatically increase the amount of valuable information we have to improve the models we make of our economies and our understanding of how our economies work. So it is a big opportunity, and at the Billion Prices Project at MIT Sloan we are very excited about it.

If there is anything that my experience with Argentina has shown me, both back when I was a kid and recently with the inflation index, it is that things like inflation have a tremendous impact on our lives and we have to solve them. The first step to solving them is being able to measure them correctly and giving that information to the policymakers who have to make those decisions. At the Billion Prices Project at MIT Sloan we are very excited about it. We hope to continue to be at the forefront of this revolution, and we will certainly continue to work very hard at trying to achieve it. Thank you.
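
The collection step described earlier in the talk (recognize the HTML tags around each price, collect the value between them, store it, then compare prices over time) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and not the software or formula used in the talk: the `<span class="price" data-id="...">` markup, the names `PriceParser`, `scrape_prices` and `price_relative_index`, and the choice of an unweighted geometric mean of price ratios as the index.

```python
from html.parser import HTMLParser
from math import exp, log


class PriceParser(HTMLParser):
    """Collect {product_id: price} from hypothetical tags such as
    <span class="price" data-id="milk-1l">10.00</span>."""

    def __init__(self):
        super().__init__()
        self.prices = {}
        self._current_id = None  # id of the price tag we are inside, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "span" and attrs.get("class") == "price":
            self._current_id = attrs.get("data-id")

    def handle_data(self, data):
        # Text between the opening and closing price tag is the price itself.
        if self._current_id is not None and data.strip():
            self.prices[self._current_id] = float(data.strip())
            self._current_id = None

    def handle_endtag(self, tag):
        if tag == "span":
            self._current_id = None


def scrape_prices(html):
    """Return the prices found in one page snapshot."""
    parser = PriceParser()
    parser.feed(html)
    parser.close()
    return parser.prices


def price_relative_index(base, current):
    """Unweighted geometric mean of price ratios, base period = 100.
    Only goods present in both snapshots are compared (an assumption)."""
    common = sorted(set(base) & set(current))
    if not common:
        raise ValueError("no goods in common between the two snapshots")
    mean_log = sum(log(current[k] / base[k]) for k in common) / len(common)
    return 100.0 * exp(mean_log)


# Two made-up page snapshots taken on different days.
day_1 = """
<ul>
  <li>Milk <span class="price" data-id="milk-1l">10.00</span></li>
  <li>Bread <span class="price" data-id="bread">20.00</span></li>
</ul>
"""
day_2 = """
<ul>
  <li>Milk <span class="price" data-id="milk-1l">11.00</span></li>
  <li>Bread <span class="price" data-id="bread">22.00</span></li>
  <li>Eggs <span class="price" data-id="eggs">30.00</span></li>
</ul>
"""

index = price_relative_index(scrape_prices(day_1), scrape_prices(day_2))
print(round(index, 2))
```

In this made-up example both matched goods rise by 10%, so the index comes out at roughly 110 against a base of 100. The eggs appear only on the second day and are left out of the comparison until they have a price in both snapshots.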