Street Network Models and Indicators for Every Urban Area in the World

(upbeat music) – Hi everyone, welcome. This is the third of our triple header. I started early in October to talk about these three seminars as triple headers, and now that we’re all waiting to see if the Dodgers make it, it’s becoming more relevant all the time. Welcome to, I think, our last seminar this semester; we’re in our short semester. And of course, it’s great that we have one of our own from USC Price presenting today, Geoff Boeing. Before I introduce him, I believe that we’re going to have an introduction by Lily Nei from the undergraduate planning group. So, Lily. – Thanks, Professor Giuliano. So Undergraduate Planning at Price is a student org that launched last spring, actually, so we’re very new, to specifically meet the needs and interests of undergraduate urban planning students, and just undergrads in general who are interested in planning. We host a variety of different programming, but today specifically, we’re really excited to be collaborating with METRANS on this event and to have the opportunity to engage with faculty in this new format. So thank you very much, and we really look forward to Professor Boeing’s presentation. – Wonderful, and I hope we’ve got lots of undergrads in the audience today. Okay.
I want to introduce Geoff Boeing. He is an assistant professor here in the Price School, in the Department of Urban Planning and Spatial Analysis. He’s also the director of the Urban Data Lab, which, I have to admit, I didn’t know we had until I was looking this up. He received his PhD from UC Berkeley. He spent a little bit of time on the East Coast before we persuaded him to come back to California. And as those of you who are in the school know, he is our newest data science expert: not only a transportation person, but a data science person. So I think you’re all gonna be very impressed with what he has put together, looking at street network models. And when he says every urban area of the world, he’s actually not kidding. So with that, I would like to introduce Geoff. Welcome. – Hi Jen, I appreciate it. No pressure: now I really have to show you models for every urban area in the world. (both laugh) Thanks everyone for having me here today to join you. I’m looking forward to getting to talk about this work. Some of this is work that I’ve recently finished, and some of this is work in progress that I’m currently working on and will be over the next couple of months, so I’m looking forward to your feedback on some of that too. Hopefully you can all see my slides now. So the title of this talk is street network models and indicators for every urban area in the world. That’s what it will be about. I want to set this up a little bit by thinking first: why do we analyze street networks in the first place?
So in the context of urban planning or transportation planning, what role do street network structure, form, geometry, and topology serve in the city? And I want to transition to briefly talk about some new tools, including a software package that I’ve developed called OSMnx, which allows us to easily model and analyze street networks anywhere in the world, to create the models I talked about, and also to produce the indicators to look cross-sectionally at a lot of different places. Then we’ll look at those models and indicators and think about some of the different findings coming out of this work: looking at street networks, what can we say about cities and important topics like walkability, public health, greenhouse gas emissions, and so on? Now, before we launch into all of that, I wanna briefly set this broader project up by thinking about why we study street networks in the context of urban travel, equity, et cetera, as urban planners.

So street networks and other transportation and circulation networks provide a sort of substrate and connective tissue that helps to organize all of the city’s human dynamics. Here we see four different cities’ street networks, all held at the same scale of one square mile. Holding them at that scale lets us easily look across different kinds of urban form as structured by the street network. So here we can see the pattern and texture and grain of a street network and how it reflects different transportation technologies and design paradigms, different eras of urbanization, politics and political systems, expressions of power, underlying terrain, and the local culture and economic conditions in different places at different times. For example, you can consider the different histories and values that are embedded in the classic street grid, from Hippodamus of Miletus’s early urban design work in ancient Greece to the Laws of the Indies, the US Homestead Act, and the New York Commissioners’ Plan. The grid has been used for centuries and millennia to try to express a certain human order over the landscape, or to organize transportation around spaces of power and control, or to make land more amenable to real estate speculation and development. Alternatively, consider the different values and goals that are embedded in, say, Haussmann’s renovation of Paris and its planning, or the auto-centric form of different US Sunbelt cities, and the notions of lifestyle, aspiration, privilege, or exclusion that are deliberately planned into certain types of cities and suburbs around the world. But in particular, all of these street networks demonstrate varying levels of connectedness, accessibility, compactness, resilience, and sustainability. And we can really start to see that when we add in other urban features, like building footprints here, or land parcels, or demographic information, or points of interest. Overlaying those other kinds of features lets us understand how the street network percolates into the
rest of the urban fabric, how it influences household access to jobs or schools or fresh food destinations, and how different kinds of networks are more or less resilient to disasters like floods or earthquakes: how they enable evacuation or multiple different routes when some routes get knocked out during an extreme event. And we can also think about how things change over time. Transportation planning has restructured urban networks over the centuries in cities and countries around the world. As it does that, it often deliberately redistributes access and equity for different groups. So this is a US example, looking at a typical street network section from the 1900s, a typical section from the 1940s, and one from the 1990s. And we can see the trajectory moving away from the sort of imperfect grids of the pre-automobile era, where urbanization and street network form were predicated on the logic of walking and streetcars; moving into the 1940s, as new values and ethics around automobility and a new urban future based around the car came to the forefront, where cities and suburbs were increasingly designed with certain geometric values that also lent themselves to the motorist; and then moving into the 1990s and the sort of culmination of the automobile-centric urban form, where we can see a street network with more disconnected streets, more dead ends and cul-de-sacs, more circuitous routes to get from point A to point B, and much longer blocks and much lower intersection density: a place where it’s a lot harder to walk but, in many ways, a lot easier to drive. Transportation planners have studied street networks like these for years, but today we can evaluate the outcomes of these different transportation plans using new tools coming out of network science and computation. So I want to think about some of those new tools in that context. And in particular, I want to introduce to you a software package that I have developed over the past few years called OSMnx

and think about how it can help us characterize travel, connectivity, and the urban form itself in some novel ways. So OSMnx stands for OpenStreetMap plus NetworkX. OpenStreetMap is kind of like Wikipedia meets Google Maps. It’s a worldwide mapping platform and geospatial database that anybody can contribute to, and it has mapped the entire world, with varying degrees of quality in different places for different types of things. NetworkX is a network analysis software package that was developed at Los Alamos National Lab. And as you might be able to guess from OSMnx’s name, it serves to connect OpenStreetMap data with NetworkX. So OSMnx itself is a software package that lets us easily download, model, and analyze street networks, using raw data from OpenStreetMap and turning them into graph-theoretic models using NetworkX. Its key features are downloading and modeling street networks themselves, but also algorithmically correcting and processing network topology. That’s a little bit of word salad right there. Network topology means how things are connected or configured across the network. Now, when we want to work with street network models that come from OpenStreetMap raw data, we have to do a bunch of correction and processing before we get a model that roughly approximates how we think of street networks in urban transportation or urban design. The reason for that is that OpenStreetMap just has a whole bunch of line segments, and points showing how those straight line segments are compiled together to form a larger whole, like a full street. So OSMnx handles all of that for you under the hood, turning it into a network model where you have nodes at intersections and dead ends, and network edges, or connections, for all the street segments. And once we’ve created that model from the original raw data, we can do all kinds of different analyses with it, like calculating routes, visualizing different aspects of the street network, or calculating a bunch
of geometric and topological indicators. So remember, topology refers to the idea of connections and configurations in the network; geometry refers more to the idea of angles, widths, areas, lengths, stuff like that. Now, beyond street networks themselves, OSMnx also lets us work with building footprints, point-of-interest data (like access to amenities or public transit stops), and elevation data, so that we can look at the elevations of our network and how steep the streets are in it. Before I go more into uses of OSMnx, I want to think a little bit about the raw data. All these data are coming originally from OpenStreetMap. And if you’re not familiar with OpenStreetMap, this is a screenshot of it in downtown LA. So it’s kind of like Google Maps, only all public, crowdsourced, and freely accessible to anyone in the world. Now, traditionally, when we wanted to work with street network data in the US context, we’d usually go to the Census Bureau and download TIGER/Line road shapefiles, and use that geometry to try to figure out stuff about the street network. And the problem with that is that shapefiles are not networks. They provide the geometry part of it, but they don’t tell us much about the topology, the connections and configurations. Those shapefiles also don’t lend themselves very well to bulk or automated analysis. You don’t want to be downloading one shapefile at a time, trying to process it to exactly what you need, and then doing some analyses on it; that can be really cumbersome, and it takes a human in the middle of the machine to do a lot of those steps manually. Another problem with those traditional shapefiles is that they didn’t include a lot of informal paths or pedestrian circulation, and they contained limited information about bicycle infrastructure. And of course, if you’re getting it from the US Census Bureau, that tells us nothing about the rest of the world. So street network analysis has historically been hamstrung by our inability to do big cross-sectional
work looking at how different kinds of places differ from each other, because of different digitization standards or data availability. It’s been almost impossible to understand cross-sectionally, around the world, what street networks are actually like, even though they underpin so much of our lived experience in the city. So enter OpenStreetMap.
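To make the geometry-versus-topology distinction concrete, here is a minimal sketch in plain Python. It is a toy illustration, not OSMnx’s actual code: the function name `classify_points` and the sample coordinates are hypothetical. It takes raw line segments of the kind a shapefile or raw map extract provides, and works out which points are true network nodes (intersections and dead ends) and which are just bends along a single street.

```python
from collections import defaultdict

def classify_points(segments):
    """Label each segment endpoint by how many segments touch it.

    segments: list of ((x1, y1), (x2, y2)) pairs, i.e. raw geometry only.
    Returns a dict mapping point -> "dead end", "interstitial", or "intersection".
    """
    degree = defaultdict(int)
    for start, end in segments:
        degree[start] += 1
        degree[end] += 1

    kinds = {}
    for point, d in degree.items():
        if d == 1:
            kinds[point] = "dead end"
        elif d == 2:
            kinds[point] = "interstitial"  # a bend in one street, not a real node
        else:
            kinds[point] = "intersection"
    return kinds

# Toy T-junction: a main street through (1, 0) and a side street that
# bends at (1, 1) before ending at (1, 2).
segments = [
    ((0, 0), (1, 0)),
    ((1, 0), (2, 0)),
    ((1, 0), (1, 1)),
    ((1, 1), (1, 2)),
]
kinds = classify_points(segments)

# Keep only the points that would become nodes in a street network model.
graph_nodes = sorted(p for p, k in kinds.items() if k != "interstitial")
print(kinds)
print(graph_nodes)
```

In a full model, the interstitial points would be kept as the geometry of an edge rather than discarded, so street lengths and shapes survive while the node set reflects only intersections and dead ends.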

So OpenStreetMap, as you can see here, contains roads and intersections. You can make out some public transit lines on this map, but also notice all the informal circulation paths going through the different parks, between buildings, passageways, through parking lots, and so forth. It also includes bike lanes and other kinds of bike infrastructure. You can make out the building footprints here around downtown LA, as well as a bunch of different amenities like hotels, libraries, hospitals, banks, et cetera. The nice thing about OpenStreetMap, particularly in the US context, is that its data quality is really high. It originally had a big data dump from TIGER/Line shapefiles and then has had a decade or so of additions, corrections, and improvements to get a lot more stuff in there. But OpenStreetMap can be a little bit hard to work with because its data are a little bit hard to extract. You normally have to write a bunch of complicated queries, which kind of look like querying a SQL database, to get information out of it. And that can be a big barrier to entry for a lot of people, because once you get that raw data, you’ve got to turn it into a model somehow, which can be very complicated to do. So I developed OSMnx to be a really simple software library where, with one easy, intuitive line of code, you can create a model of the street network anywhere in the world. If you’re not a coder, you don’t have to worry today: this will be the extent of the code that we look at. All that I’ve done here is I’ve imported OSMnx, and then in one line of code I asked it to create a graph from place (graph just means a model of a network of paths) for Los Angeles, California. And I specify that the network type is equal to drive. So that means that with that one line of code, I have downloaded the entire street network, within the municipal boundaries of the city of LA, that is drivable. And you can see the model here right next to it. Now, the nice thing with OpenStreetMap data is that we’re not limited to just the US, so we can just as
easily get the street network for the Medina of Tunis. We can also get drivable networks, walkable networks, bikeable networks, or all of the above if we’re just interested in looking at the infrastructure itself. I always liked this example of the Medina because you can really clearly see two different urban forms here as structured by the street network. In the center left of this figure, you can see the old medieval Medina and its more self-organized, organic urban form. And just to the right of it, you can see the orthogonal grid laid out by the French colonialist planners between the Medina and the port: very different spatial signatures of different planning paradigms. Now, beyond street networks themselves, as I mentioned earlier, you can get building footprints, points of interest, all that other kind of stuff to calculate accessibility, but you can also pull in elevation data. So here I created a model of the San Francisco street network in one line of code, and then in one more line of code, I told it to download and attach the elevations to every node in the street network. So remember, in this network model, the nodes, the points you see here, all represent intersections and dead ends; the edges, the faint gray lines you can see in the background, represent street segments, the links between any pair of intersections or dead ends. OSMnx pulls in the elevation data automatically here for all of the nodes. Once we’ve done that, we can calculate street grades to look at the incline of every street segment in the network. This is useful for a bunch of different things. For example, with that node elevation data we can quickly model sea level rise and look at which parts of the network are most prone to being swamped by sea level rise, given different projections over the coming century. We can also look at different flooding events. So given different projections of sea level rise, or just more extreme climate conditions, what parts of the network are most prone to
flooding given different events, and what happens to the rest of the network and shortest-path trips if those sections get knocked out during a flood? We can also use elevation data here to create better impedance functions for trip calculation. Now, often in network analysis studies, we’ll use some kind of shortest path to understand the route that someone will take between point A and point B. The simplest way to do that is just minimizing the distance traveled. So if I want to get from point A to point B, I can find the shortest path; that means I’ve traveled the fewest meters to get

between the two. A little bit richer than that would be minimizing travel time, which is pretty easy to do if you have speed limit data; or, in the case of OSMnx, you can impute and infer speed data for the street network, given the sparse data that exists on OpenStreetMap, and then calculate the shortest path by travel time. But if you’re a cyclist, very often you’re not interested in trying to minimize travel time, let alone distance traveled. For example, if the shortest path by time has you biking directly up over a hill and directly back down the other side to shave off four or five seconds of your total trip, you might rather just bike around the hill. So we can use that information to build better impedance functions to calculate more plausible trips for different kinds of modes. And then we can also compare them against empirical evidence about what kind of routes people take, to better understand that trip-taking behavior. Okay, so that’s the short gist of the tool. And now I want to talk about a couple of recent and ongoing projects where I’m using this tool to build some models and indicators for a couple of different empirical projects. Now, first I want to show you a little bit of recent work. This project was just published in JAPA last week. I have a link at the end of this section; if you’re interested in seeing it, you can scan it with your phone in a couple of slides. So in this recent work, I modeled the street network of the entire US at multiple scales. That includes every city and town, every urbanized area, every county, every census tract, and each Zillow-defined neighborhood. To get those boundaries, I pulled the first four from the Census Bureau, so they have the boundaries of every city, urbanized area, county, and census tract. And then from Zillow, I used their shapefiles defining individual neighborhoods across hundreds of cities around the country. Once I got those study site polygons, I used OSMnx to download the street network within each, build a model of it, and then
calculate a couple of dozen different indicators of the network for each of them. Now, those indicators cover common stuff that we think about in transportation planning and urban design: stuff like intersection density, block lengths, the proportion of four-way intersections. But they also bring in some newer stuff from network science as well: betweenness centralities and maximum betweenness centralities, which tell us about the topology of the network, and average node connectivities, which can tell us how easy it is to break the network into unreachable sections by knocking out one node at a time. Once we have all those indicators, I created an open data repository in which I deposited the models, indicators, and the code to do all of this online for public reuse. So I’m going to talk about some of those indicators here, and first introduce one that might not be as familiar for folks coming from the transportation or urban design field. This is the idea of orientation entropy. Orientation entropy has gotten kind of common in urban geography recently, and it derives from the idea of entropy from physics or math: a measure of the disorder of a system. The easiest way for me to wrap my head around the idea of orientation entropy is to look at it visually. So here I have two different cities, or I guess a borough and a city. Manhattan is on the left and Boston is on the right, and you can see their street networks. Now, next to each one of them is a polar histogram. So this is a histogram of street network orientation, and Manhattan’s an easy one to illustrate this with. In Manhattan, if you look at the street network itself, you can see that it’s aligned about 29 degrees off true north, and almost all the streets of Manhattan follow a grid, so offset north-south and east-west. We can see that reflected in its polar histogram. So this histogram is just like a regular histogram that we’re used to seeing. It shows us the relative frequency across bins;
the difference is that, rather than being stretched out horizontally, this histogram wraps in a circle around the face of a compass. Each bin in the histogram represents 10 degrees around the compass, and each bar in the histogram represents the relative frequency of streets pointing in that direction on the compass. So in Manhattan, we can see its histogram shows us that almost all the streets point

in one of those four directions. Boston tells us a very different story. Boston has a long history, over hundreds of years, of being an amalgamation of a bunch of different small townships, villages, and old farms, all agglomerated together into the metropolis we know today. Now, Boston’s city limits include a bunch of different street networks, including a couple of grids, like in South Boston or the Back Bay, that aren’t aligned with each other, as they were planned separately from one another, and then a bunch of winding paths that follow old cow paths and village routes from centuries ago. If you look at Boston’s polar histogram, you can see it’s not as neatly organized into those four bins as Manhattan’s is. Rather, in Boston, the streets point more evenly in all directions around the compass. There isn’t a simple set of a few bearings the streets tend to have. So the way that we would characterize this statistically is that we would say that Manhattan has a low-entropy street network and Boston has a high-entropy street network. We can use that idea of entropy to start thinking about griddedness. So here I’ve calculated a grid index indicator, and this combines that idea of orientation entropy with the proportion of four-way intersections, as well as the straightness of the streets, in each census tract around the country. And the idea here is that, theoretically, a grid has straight streets, it has four-way intersections, and those streets all tend to point in similar directions to one another. And we can see, across all the contiguous US census tracts, where the grid tends to exist today. In particular, across the Midwest and Great Plains, we see really high griddedness; in California’s Central Valley we do as well. But also notice that across the entire country we tend to see this little archipelago of cities, which are places of high griddedness surrounded by seas of dark purple, or lower griddedness. And this just reflects the spatial legacy of US urbanization and, in particular, 19th century
expansion from east to west across the country. We can see, written into the street networks today, the spatial legacy of the US Homestead Act and other orthogonal planning instruments that laid out space that subsequent city planning was deeply influenced by. But when you look at this, you might hypothesize other covariates here too for griddedness. For example, topography: if you’re familiar with where mountain ranges are located in the US, you can quickly pick out that there aren’t many grids across the Appalachians or the Rocky Mountains or the Sierra Nevada. And that makes sense, right? It’s hard to build a grid on uneven or steep ground, with the exception of San Francisco. But you might also think about covariates like the era of urbanization. So when places were built might be an important feature for determining how grid-like they are today, or suburbanization and a different value system and ethic in how we laid out streets in the first place. So in unpacking all of this with a regression model, we can see the same kind of story that we see here visually with these charts. Now, this is a snapshot of characteristics today of street networks built in different decades. Essentially, what I’ve done is I’ve used a set of algorithms to tag every census tract in the US with the decade in which it primarily developed. And then I can look at those decade buckets and see different street network characteristic trends over time. So here we’re looking at average values across those different decades, and we can see a really consistent story. Griddedness declined from World War II to the 1990s. Orientation order declined, so entropy rose. Straightness declined, the proportion of four-way intersections declined, and intersection density declined. Conversely, dead-end proportions rose, and average street segment lengths (that’s the block size) rose as well. And the number of vehicles per household today is much higher in places that were built in later decades than in earlier decades.
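The orientation entropy idea described above (10-degree bins around the compass, low entropy for Manhattan, high for Boston) can be sketched in a few lines of plain Python. This is a minimal illustration assuming Shannon entropy with a natural logarithm over 36 equal bins; the function name `orientation_entropy` and the sample bearings are hypothetical, and the published method may differ in details such as how bins are centered or how street lengths are weighted.

```python
import math

def orientation_entropy(bearings, num_bins=36):
    """Shannon entropy (in nats) of compass bearings binned into equal sectors.

    bearings: iterable of bearings in degrees (any real values; wrapped to 0-360).
    num_bins: 36 gives the 10-degree bins described in the talk.
    """
    width = 360 / num_bins
    counts = [0] * num_bins
    for b in bearings:
        # Wrap to [0, 360) and guard against floating-point results of exactly 360.
        counts[int((b % 360) // width) % num_bins] += 1

    total = sum(counts)
    entropy = 0.0
    for c in counts:
        if c:
            p = c / total
            entropy -= p * math.log(p)
    return entropy

# An idealized grid rotated 29 degrees off north: every street points in one
# of four directions, so the entropy is ln(4).
grid = [29, 119, 209, 299] * 25
# An idealized "every direction equally likely" network: entropy is ln(36).
uniform = [i * 10 + 5 for i in range(36)]

print(round(orientation_entropy(grid), 3))     # 1.386
print(round(orientation_entropy(uniform), 3))  # 3.584
```

The two toy cases bound the range for 36 bins: a perfect four-direction grid sits at ln(4), and a network whose streets point evenly in all directions approaches ln(36). Real cities fall in between.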
Now, we also see an interesting feature here, hinging around that low point in the 1990s: over the past two decades, we see a small return back toward more traditional urban patterns, a sort of return towards straighter streets, higher intersection density

and higher four-way intersection proportions in more newly built places over the past few years. So even when we control for these other features in our regression model, these key trends still hold across the different indicators that you see here. If you’re interested in that study that just got published last week, you can type in that URL, or if you have your phone handy, you can scan the QR code to pull up more information about it. While you’re doing that, I want to transition to the second part of this talk, which is looking at some current work. Now, the current work got started over the summer and is ongoing right now as we speak, so if you have any feedback on it, I would love to hear it. I was inspired by that previous project of looking at all the US street networks to now try to model all the street networks of every urban area in the world. And historically this has been impossible for a couple of reasons. First, we haven’t had easy-to-access street network data everywhere in the world. And second, we need to have some kind of study site definitions. OpenStreetMap helps us deal with that first problem: we now have, for drivable roads in urban areas, pretty good road coverage worldwide. But the second problem was that we need to define those study boundaries. I don’t want to just use a point representing a town in India without any concern about how big or small the actual study site around it should be. Over the past couple of years, the Global Human Settlement Layer project, which is a big international project led by a bunch of development agencies and UN working groups, has produced a new database of urban area boundaries worldwide, using remote sensing and machine learning algorithms to try to define where cities are and are not. So it’s basically a big polygon data set telling us where urbanization has occurred or where no urbanization has occurred. So I used that dataset to extract urban area boundaries worldwide, using their consistent algorithms for
saying what is or is not a city. I then used OSMnx with those boundaries to pull in the street network data for every urban area in the world and built a model of it. By the way, just to clarify, those street networks are drivable street networks for every urban area in the world. And then once that was all modeled, I calculated a few dozen indicators for each one. And then, like in the last project, I deposited those models and indicators and the code in some open repositories online for other people to work with. It might be a foolish thing to do before I’ve published anything on this, but it’s all out there if you want to start playing around with it. So I’m gonna break down that workflow for a couple of minutes here to think through how all of this worked, especially if you’re interested in doing this or replicating some of it. To get these worldwide street networks, like I mentioned, I used each urban area in the Global Human Settlement Layer database, with a couple of caveats. First, they mark true positives and false positives in their data set. It’s a combination of fitting a model based on a human training set: people went in and validated, on a random sample, which urban areas were correctly or incorrectly tagged, and then they fitted the model and ran it on the remainder of the data set to give a true positive score. So I’m only looking at the ones that were scored as true positives by the Global Human Settlement Layer. Secondly, I’m only using the urban areas that are at least one square kilometer in size. That’s pretty small, but it still helps us weed out little villages and so forth, so we’re trying to look at something larger than that really small village scale. I then used OSMnx to download and model everything, so those are drivable networks within the boundaries of those urban areas. And then I put all the code and models online. So that’s on the Harvard Dataverse; the code itself is on GitHub. I also attached elevation data to every node in every model. So
there were, I can’t remember off the top of my head, but there were tens of millions of nodes to do this to. So I had to figure out an easy way to do this and also make it reproducible. Now, OSMnx itself has built in the ability to query the Google Maps Elevation API to attach elevation data to your network. But if you exceed some threshold, it could be something like a million total nodes in a month, you have to start paying for it.

It’s not free for unlimited use. So I wanted to find open data to attach to this, also so that I could put it into an open data repository, unlike Google’s data: Google doesn’t want me to put their closed-source, for-profit data on an open data repository. So I looked at three different data sources here. The first is ASTER, which is a 30 meter by 30 meter resolution data set of elevations around the planet based on satellite imagery. It’s pretty fine resolution, but it’s fairly noisy due to error at that 30 meter by 30 meter scale. Secondly, I looked at SRTM3, which is a 90 meter by 90 meter elevation dataset. It’s coarser grained, but it’s much less noisy, so there’s a lot less error in it. And it’s also been processed to fix further errors and to fill in voids in the dataset. And then third, as a validation set, I also attached the Google Elevation API data to each of those nodes as well. So each node ended up with those three values for elevation estimates, across the tens to a hundred million nodes in my data set. Now, none of these data are perfect, so there are trade-offs to using any of them. So I came up with a set of elevation value selection rules so that I could assign a single elevation value to each node in the network. The way I did that is that, by default, I use the ASTER value. ASTER has the finest resolution, so I prefer that, but it has a bunch of null values, so I fill in those nulls with the SRTM3 data. And then I look at the two of them, SRTM3 and ASTER, and I see if they differ from each other by more than the SRTM’s specified error, which is roughly 16 meters. If they differ by more than that, then I look at the Google elevation data as a tiebreaker for which one’s closer to it, to try to triangulate across the three which one is the best value to pick. So in the end, we have a single elevation data point for each node in the network. And then I calculate edge grades to understand the steepness of streets in every street network around the world. So with that data all
put together, I then calculate a bunch of different indicators. Like I mentioned earlier, these are indicators of both network geometry, things like lengths, widths, areas, and angles, and network topology, so how things are connected and configured to one another. A lot of these are pretty common in transport, urban design, and network science. I have some examples here, including intersection density, block lengths, circuity or straightness (one is the inverse of the other), the orientation entropy which I showed you earlier with those polar histograms, and then a lot more. I’m gonna talk about the intersection density a little bit more in a minute, because I’m calculating that in a new way that I think is a big improvement on traditional methods. But first I want to talk about orientation entropy a little bit more. So I showed you earlier, in the US, looking at Manhattan and Boston, those polar histograms of what directions the streets point in. Here I’ve taken a sample of 100 cities around the world, and we can see their orientation entropy with polar histograms here. So in the top left, Chicago, for example, has the lowest entropy of any of these networks, and in the bottom right, Sao Paulo and Charlotte have the highest entropy, or the most disordered street networks according to the orderly logic of the grid. We can also look at a bunch of those indicators and do a cluster analysis to see what kinds of places are more or less similar to each other. So this is across a feature space with like six or eight dimensions in it, including orientation entropy, and I’m looking at that same set of 100 large cities here. This visualization projects that eight-dimensional feature space onto just two dimensions using t-SNE; the clustering all occurred in the eight dimensions. And these are colored by cluster, and we can see some of these different similarities between places. So for example, the big Chinese cities, Beijing and Shanghai, towards the top of the figure in
yellow are set aside in their own cluster. To the left of them, Las Vegas and Phoenix, big Sunbelt cities with very coarse-grained networks, are very similar to each other. And rotating down toward the bottom left from there, we can see the gridded American cities like Chicago, New York, and Washington. Buenos Aires is very similar to them in its street network pattern. Now, I mentioned that intersection density calculation a minute ago. Typically when we work with intersection densities, we look at a total count of intersections in a city, and then we divide it by the total area or the net area of the city to give us intersections per square kilometer or something equivalent to that. The problem with that, when we’re working with point data for intersections, is that very often, whether from TIGER/Line data or from OpenStreetMap data or any source, intersections are just wherever two lines intersect with each other. Now, a nonplanar graph model will account for bridges and tunnels, but it doesn’t account for complex intersections. In the left panel here you can see San Vicente, near where I live in LA. It’s a complex intersection at almost every point at which it crosses the little residential streets, because of the divided road and also because of the turn lanes on that street. So to model all of that, we have all these extra points representing intersections along the way. But in reality, if you’re there as a pedestrian or a motorist, you think of this thing as a single intersection. So what I don’t want to do is privilege these kinds of places that have a bunch of points representing a single intersection by over-counting their intersections and, in turn, their intersection density. Now, one simple way to account for those complex intersections is to just buffer their geometries, merge together overlapping geometries, and say that each merged overlap is one single true intersection. You can see that logic in the center panel here. So basically, if they’re within 10 to 15 meters of each other, I’ll call them a single intersection and only count them once. But I don’t really like that either, because there are times when points can be within 10 to 15 meters of each other, but it’s not really one intersection. So for example, points on either side of bollards that you can’t get through shouldn’t be joined together
into one thing. Or if you have an overpass and an underpass right by each other, just because they’re within 15 meters of each other in an x-y plane doesn’t mean that we should treat them as one single intersection. So I developed a new algorithm to let us work topologically with intersection consolidation. What that algorithm does is it traces along the network itself. It finds nodes that are within 10 or 15 meters of each other, and then only if they’re reachable along the network within that amount of distance are they consolidated into a single intersection object. So I’ve run this algorithm for all of those cities in the world. Here I’m looking at just the 4,500 largest, so that’s approximately the half that are the largest cities. And here’s what I found. I compared my topologically cleaned-up intersection densities with those raw intersection densities, where you just say how many points divided by how many square kilometers. What I find across all these cities is that the point method tends to over-count the true number of intersections by about 16% compared to the topological method. This would be roughly similar in the US whether you use TIGER/Line data or OpenStreetMap data. Then I looked regionally to figure out which places have a greater or lesser over-count. I found the worst regions were Australia and New Zealand, and Southern Europe. In both of those regions, the over-count is greater than 29%. So you’re getting 29% more intersections counted with just the point count than you do with the topological consolidation method. The worst countries are Australia, Spain, and Israel. All of them have a greater than 33% over-count. And that’s huge. That’s significant. You know, intersection counts aren’t so high that we could have a 33% over-count and still think we’re roughly talking about the right thing. And importantly here, there is a strong bias between different kinds of places. What I found is that this error isn’t evenly distributed around the world, but different
kinds of places tend to have different kinds of over-count. And in particular, one thing that I’ve found is that a 1% increase in the urban area’s GDP per capita is associated with a 0.25% increase in over-count. So what that means is wealthier places around the planet tend to have a greater over-count in that intersection density. My theory for why that is would be that places that are wealthier tend to have people who use computers more, have better internet connectivity, and tend to do more volunteered geographic information activities. And so there’s more fine-grained digitization of the road network. If there were a big arterial road in a small town in a developing country, you’re less likely to have somebody go in and digitize each divided edge, digitize individual turn lanes, and so forth. So with that added detail in places like the US and Western Europe, we also get greater intersection over-count, unless we topologically clean it up first. Now, another thing that I’ve recently been looking at is CO2 emissions. Here I’m looking at predicting CO2 emissions as a function of a bunch of different street network characteristics, with different covariates as important controls. My response here is total transport sector CO2 emissions at the urban area scale. And then I have a bunch of controls in here, like: does it have an airport or a water port? What’s the percent of open space in that city? How big is the built-up area?
How hilly is it, for which we can use the median street grade as a proxy? What are the nightlight emissions and GDP per capita, to try to understand other economic conditions in that place? And then I also have controls for the UN development group that that city is in. Now in particular, the things I’m most interested in on street network form are the average node degree, straightness, intersection density, and per capita street length. Average node degree is a measure of how connected the street network is. It means how many streets per node on average. And I find that a 1% increase in average node degree is associated with a 3.1% decrease in transport sector CO2 emissions in that urban area, while holding those other control variables constant. I also found that a 1% increase in straightness, which is a measure of the efficiency of the road network and the efficiency of trip taking, is associated with a 5.1% decrease in CO2 emissions. A 1% increase in intersection density, which tells us about how fine-grained the network is, is associated with a 0.15% decrease in CO2 emissions. And lastly, per capita street length tells us about how much road infrastructure there is in an urban area per person. A 1% increase in per capita street length is associated with a 0.45% increase in CO2 emissions. The response here, again, is transport sector CO2 emissions, specifically from non-short-cycle organic fuels, so stuff like gasoline and diesel. Now, if you’re interested in that study, I have a working paper put together right now for the data and the modeling and just a thumbnail sketch of some of the analytics. You can pull it up here at that URL. If you have your phone handy, you can scan the QR code to learn more. And like I mentioned, the models and indicators in a preliminary form are already up online. I’m going to be revising them with some improvements over the next month, and the revised version should be up in December. So briefly, to wrap this all up: my goal for this
tool OSMnx is to make these different street network analyses easier, particularly for urban planners. Too often over the past 10 to 15 years, a lot of cool new stuff coming out of network science and big data or machine learning or computational social science has been really heavily driven by computer scientists and physicists, which is great for developing the methods. But too often, it’s just a solution in search of a problem. Like, I know some nifty models and algorithms, I’ve lived in a city before, so let me try to apply them to a city. What we really need in this field right now are people who are interested in urban theory and planning practice and the lived experience of human beings on the ground, especially marginalized communities, rather than just the voices of a bunch of physicists doing urban data science. So I hope that a tool like this, that’s really easy to use and helps you generalize away a bunch of the technicalities and complicated stuff of building models, can bring more voices into these discussions and can lead to different kinds of people building better maps or better indicators, or doing community activism based on comparing their community to another community, thinking about planning impacts that could be arising in different kinds of places. My work with the models and indicators is along a similar vein. People might not feel comfortable working with even a few lines of Python code. So if they’re not, hopefully these models and indicators give them one more stepping stone to do cool analysis without having to know anything other than the current tools that they’re already working with. And along those lines, a final motivation here is around the open science movement. Too often, researchers will conduct a study and have all of the ad hoc code or tools that they used to do it put into a file on their computer, hidden away and never seen again. The more that we can make our code, our data, and our findings open source and publicly accessible, the more we can have science function in an efficient way, building on each other’s work. So I’m hoping that by placing all of this stuff online, more people can look at it and do interesting stuff with it that I would never have thought to do myself: the many hands, many eyes model of conducting science. So I will leave it at that and open it up to any questions. My contact info is here if you want to reach me. Thank you. – So, starting this off, how would you use OSMnx to quantify and/or describe changes to a street network over time?
– Yeah, and that’s a tricky thing to do. We can’t look at OpenStreetMap data longitudinally and make claims about what has changed over time in and of itself, because we don’t necessarily know what has changed over time on the ground, in the real world, versus what has changed over time in the database. So maybe that street was always there, but it wasn’t until 2018 that someone finally digitized it in the database. In the US we can make some more robust claims about that, because the OpenStreetMap network was pretty well built out about 10 years ago. What I did for that JAPA paper that I showed you a few minutes ago was I looked through census data and tax assessor and property transaction data to estimate the statistical distribution of build dates of buildings in census tracts around the country. And then I used those distributions to tag when the tract was initially built, which is a rough proxy for when the street network would have been initially laid out. Doing that, I was able to look at these snapshots of what street networks look like today in census tracts of different vintage. And that idea of today is important. We’re just looking at a snapshot today of what tracts from different eras look like, but that snapshot usually works pretty well, because once they’re built, street networks tend to be fairly stable over time. Urban renewal can change that somewhat, but in general, once patterns are built, they’re there for centuries or sometimes millennia in roughly the same form in most cases. – Thank you so much. We also have our urban planning group that provided introductions earlier. So I’m gonna queue up Tessa to join us on video to ask the question she has from her urban planning group. Thank you. – Thank you, Professor Boeing, for your presentation. My name is Tessa, I’m the co-chair for communications at UP. If I understood correctly, I believe you mentioned earlier in the presentation how the elevation data used in the street modeling can provide information on sea
level rise and help prepare for flooding. Could you elaborate on this? I’m interested specifically in how this data can intersect with climate change awareness or mitigation. – Yeah, there are two simple ways to look at that. The first would be having some kind of flooding model to understand where flows will move through the city. So for example, if there’s a torrential rainstorm, where will water accumulate, and where will it be channelized down to get to lower ground? You can use elevation data in that kind of a model to understand the flows of water through city infrastructure. The second thing you can look at would be different projections of sea level rise. Given different model estimates of what sea level rise might look like over coming decades or centuries, we can use those levels to see which parts of the street network would be subsumed by the sea over the years. You can think about that from a planning perspective of how we need to reorganize our transportation system or build something to protect it, since we’re often now faced with climate mitigation rather than adaptation, given how late in the game we’re starting to deal with a lot of these problems. So having that elevation data can let us compare parts of the street network and communities whose street networks are most vulnerable to sea level rise, and then start targeting interventions accordingly there. – Thank you so much, Tessa. Our next question is from the audience: Thank you, Professor Boeing, for the great presentation. You mentioned that OSM is an open-source application, and I’m wondering about the accuracy of the data behind it. You mentioned that anyone can contribute. Are these contributors certified by OSM to update the database? How does OSM incorporate updates and changes in general? – Yeah, good question. No, there’s no certification. Anyone can sign up and contribute to it. There are, off the top of my head, over a million contributors to OSM now. There is some editorial oversight though, and it works similar to Wikipedia, where if someone has committed vandalism on OpenStreetMap, there are people watching the change log and quickly reversing anything that’s obviously vandalism. For things that are just smaller errors, those happen, and they persist until they’re caught. So part of the drawback of this system is that anyone can contribute to it. Like on Wikipedia, you’ll get both some sampling biases in which topics get covered and also some user errors where people entered incorrect information. But that’s also the benefit of the system. If anything is ever wrong, if you’re building a model and it doesn’t look right, all you have to do is go in, click a few times, make a correction, and you’ve fixed it there. Some of this happens through a validation process as well, where people are comparing what’s been added to the system against satellite imagery and so forth to make sure it lines up generally well with the real
world. Now, I mentioned this briefly earlier too, but part of it has to do with where you’re looking and what you’re looking at. So in, say, US or Western European city centers, there’s really good data coverage. And as you move outside of cities or into, say, developing countries, you see less good data coverage. If you’re looking for drivable roads, you generally see really good data coverage and accuracy, but if you’re looking for bike lanes, you might see every single bike lane in the city of San Francisco, but probably not all of the bike lanes in Des Moines, let alone in a town in Vietnam. So it really depends on where you’re looking and what you’re looking at. – Great. Thank you so much. I’m going to go ahead and invite Issie on with his video, another one of our students from the urban planning student organization who has a question directly for you. – First of all, I’d like to thank you again for this amazing presentation. I just had a question as well. With COVID-19’s impact on urban transportation and behavior, what current trends and/or patterns do you think might change in these future models or indicators?
– Good question. You know, I’m always hesitant to weigh in on COVID-19 impacts, because everyone’s got an opinion, but we really don’t have much data by which to hypothesis-test those opinions yet. Things will probably change. I doubt that it’ll be as extreme or as permanent as a lot of the doom-and-gloom urban fearmongers are saying. I guess that’s my personal opinion. The big factor will be, you know, what’s the nature of work from home versus face-to-face interaction as time goes on. There are push and pull effects to both. If people aren’t commuting as much, we might see street network patterns predicated more around the idea of the neighborhood and being able to spend time in the public right of way as a family, rather than just driving through it at as high a speed as possible. So stuff like that could potentially change the way that we use our existing streets or plan new ones over time. But in general for COVID-19, I’m a little skeptical. I’d wait and see. This is a really hard year or two for all of us, but there will be an end in sight at some point. And there are a lot of reasons why cities exist the way they are, for good or bad, and through path dependence that makes it really hard to change existing street network patterns. So with that in mind, I’d be skeptical about the indicators changing that much if you look at LA today versus LA 10 years from now in terms of network design. – Thank you so much. Let’s see, another one of our questions from the audience, and I’m gonna quote some research here. I hope I pronounce the names correctly, but it’s Ewing and Cervero from 2010. Their meta-analysis found that certain street network metrics correlate with transportation behaviors in cities around the globe. Which of the network metrics you’ve covered today might correlate the most with observed transportation behaviors, and which of these correlations might be causal, and which might be due to other variables such as self-selection into neighborhoods by income level? – Yeah, that’s a good question. I should be able to quote the specific elasticities from that paper, because Bob Cervero was my exam committee chair, but I can’t now. As far as I recall, the gist of it was that design metrics were important, but they tended to have smaller elasticities than other stuff like job access and so forth. I believe intersection density was one of the ones that Ewing and Cervero singled out as having a consistently significant elasticity with non-motorized trip taking. In general, the idea of correlation and causation is really tough. You normally need some kind of quasi-experimental design to try to tease out which is which, or qualitative research to better understand why people live where they live. But there’s a lot of endogeneity here. Often when we’re talking about behavior change, as you mentioned, self-selection effects are really important. And just because we build more connected streets or we build denser street networks doesn’t mean we’re changing existing residents’ behavior, but it might give people the opportunity to express different travel behaviors. And I guess my personal take on a lot of this would be that that’s a better framing for it than the idea of causing people to change. Creating freedom of mode choice is really important. And so often, like in a place like LA, we design transportation infrastructure that does not allow freedom of mode choice. So if people do want to walk or bike to their jobs or school, it’s often impossible or at best dangerous to do that, because we have designed our road
infrastructure to accommodate cars over every possible inch of it. So even if we’re not necessarily changing behavior by shifting some urban design variables for street networks, we can allow people to express different kinds of behavior. The last part of your question is what I would guess would be the indicators most correlated with walking. Since I haven’t studied that question specifically with, like, a walking response variable, I would just guess the same things that we’re familiar with from the literature. Typically in the urban design and transport literature around urban form and walking, for network design we’ll often see that things like intersection density, block length, and connectivity, so how easy it is to navigate through the urban fabric, tend to all be well correlated with walking rates. But again, I’m skeptical of causal claims around that. – Thank you so much. So right now this will be our concluding question, but we may get a few more. This is from an anonymous attendee in our audience asking: how does this model deal with the modifiable areal unit problem (MAUP) with different areas around the world? – Yeah, that’s interesting with networks in general, because whenever you’re modeling a real-world network, you have to impose an artificial periphery on it. So in reality, what if you model, say, a social network? It’s not like the social network is actually a closed group, right?
Everyone in that social network knows someone outside of it. And at some point you have to cut off a threshold there. It’s the same thing with street networks. So when I do a model of, like, the Los Angeles urbanized area, I’m cutting it off where the Census Bureau has decided that population densities are low enough that it’s no longer included within that contiguous urbanized area. That means I just have a hard stop on the roads that are there. OSMnx does a couple of things under the hood to try to attenuate some of those artificial periphery effects. Whenever you make a model, it expands out with a buffer, looking about half a kilometer on all sides of the boundary that you gave it. It uses that so that when you chop off all those roads at the actual boundary of your study site, it doesn’t treat a node there as a dead end or anything like that. It says, this is a true four-way intersection, but streets that connect to it go off outside of the periphery. So I’ll count it as a four-way intersection, but I won’t model all those streets that leave the boundary. So technically that’s how I’ve approached that problem with the modeling itself. But in general, when you’re doing empirical work with network models, to avoid those problems with modifiable areas, you should look at multiple scales. So you might look at your single study site and then different buffers around it to understand, through robustness checks, how much your indicators vary as that periphery changes. Another thing you can do is look at different administrative spatial scales like neighborhoods or tracts or boroughs or cities or counties, et cetera, and see what the distribution of those indicators is across that place or within that place. Looking at between and within variation is a nice way to characterize indicators with a distribution rather than with a single value, which can really help for MAUP kinds of problems. – Thank you. I’d like to invite Tessa back to join us again, our Price student from the urban planning student organization group, with another question. Thank you, Tessa. – Hi again. So my second question is about orientation entropy. In your experience, how has the accessibility and quality of public transportation affected an urban area’s orientation entropy, if at all?
– So that’s a good one that we do have some causal knowledge of, because of how transit lines are designed. It’s much easier, both for transit rider legibility as well as for operational efficiency, if a bus line, for example, doesn’t have to make a lot of turns. So in general, grid street networks lend themselves to running buses through them, both for ease of operation and because, if you’re looking at a bus route, you can just take the 728 that runs up and down Olympic rather than trying to figure out what number goes on what part of the street segment before it turns and goes a different direction. So that’s not to say that grids are like the be-all and end-all of street design, but for buses at least, it is easier if you have fairly low-entropy street networks, for those couple of reasons. – Thank you, Tessa. I believe that’s all the questions we currently have. We might get a couple more, but Professor Geoff, is there anything that you didn’t get to cover that you’d like to talk a little bit about? We do have some time, and we’re always happy to give people their time back, but the floor is yours. – Yeah, no, I’ve brain-dumped onto all of you what I’ve recently been doing and what I’m currently doing. So we can all grab lunch if there are no further questions. – Okay, fantastic, cool. Well, I think Professor Jen Giuliano is able to jump back on. I know she had a little bit of technical difficulties. I think she’d like to jump on and give us a little conclusion for today. – [Jen] I wanted to thank you very much, Geoff, for giving the presentation, a special thank you to Harley for stepping in to handle everything, and thank you to our undergraduate planning group for being a co-sponsor. This was a great way to end our sessions for the fall semester. Stay tuned for our announcements for what we’ll be doing in the spring. So thanks very much, everybody.
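The elevation value selection rules described in the talk (prefer ASTER, fill its nulls with SRTM3, and use the Google value as a tiebreaker when the two differ by more than roughly 16 meters) can be sketched roughly as follows. This is a minimal illustration in Python and not the speaker’s actual code: the function names `select_elevation` and `edge_grade`, the constant name, and the fallbacks used when the SRTM3 or Google value is missing are all assumptions made for the example.

```python
from typing import Optional

# Approximate SRTM vertical error cited in the talk (about 16 meters).
SRTM_SPEC_ERROR_M = 16.0


def select_elevation(
    aster: Optional[float],
    srtm: Optional[float],
    google: Optional[float],
) -> Optional[float]:
    """Pick one elevation (meters) for a node from three estimates."""
    if aster is None:
        # Fill ASTER nulls with SRTM3. Falling back to Google when SRTM3
        # is also missing is an assumption, not something the talk states.
        return srtm if srtm is not None else google
    if srtm is None:
        return aster
    if abs(aster - srtm) > SRTM_SPEC_ERROR_M and google is not None:
        # The two open sources disagree by more than the SRTM error:
        # keep whichever one is closer to the Google value.
        if abs(aster - google) <= abs(srtm - google):
            return aster
        return srtm
    # Default: ASTER has the finer resolution, so prefer it.
    return aster


def edge_grade(elev_from: float, elev_to: float, length_m: float) -> float:
    """Street grade as rise over run; positive means uphill."""
    if length_m <= 0:
        raise ValueError("edge length must be positive")
    return (elev_to - elev_from) / length_m
```

For instance, a node with ASTER at 100 m, SRTM3 at 130 m, and Google at 128 m would be assigned 130 m under this sketch, because the two open sources disagree by more than 16 m and SRTM3 is closer to the Google value.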