Build effective OEM-level apps on Android Things (Google I/O '18)

[MUSIC PLAYING] DAVE SMITH: Well, good afternoon, everybody Oh, come on I got nothing? Good afternoon, everybody! [APPLAUSE] There we go There we go Well, thank you so much for spending your time with me today My name is Dave Smith I’m a Developer Advocate here at Google working on the Android Things platform And I’m here today to talk to you a little bit about building apps for the Android Things platform and how you can be more effective in the apps that you build using the Android SDK So whether you’re new to Android or whether you’ve been building Android apps since the beginning, targeting Android Things devices has some subtle differences from what you may be used to in working with Android Understanding these differences is what will ensure that you can build better apps on the platform But before we jump into too much of that, let me just do a quick overview of what Android Things is, maybe for the uninitiated So Android Things is a fully-managed platform for building connected devices at scale It’s a variant of the Android platform that is optimized for use in embedded devices It enables you to build apps for embedded and IoT devices using the same Android SDK and Google Play services that you use to build for mobile You can develop apps using the same tools, such as Android Studio, to deploy and debug your apps to devices as well It includes the Android Things Developer Console This is a place where you can securely manage your software, stability, and security updates for your devices You simply upload your apps, choose the OS version that you want to run on your device, and then deploy those updates to those devices over the air Security updates are even deployed automatically to those devices for you Android Things also supports powerful hardware that’s suitable for edge computing in production, capable of driving artificial intelligence and machine learning out to the edge This hardware is packaged into system-on-modules that make it 
easy for you to integrate into your final production designs So when you look at all these things together, the process is a little bit different than when you’re building apps for mobile devices Building a typical app for Android devices means distributing a single app binary– through the Google Play Store, typically Apps have to work on multiple devices made by multiple OEMs targeting multiple versions of the Android operating system, typically requiring you to do various compatibility checks and other things like that to make sure that your app runs well across that entire breadth of the ecosystem With Android Things, you are the device OEM You control when the OS on your device gets upgraded, and the various apps that are bundled into that system image along with it And you do all of this through the Android Things Console instead of through the Google Play Store This can greatly simplify your code, because you don’t need to incorporate a lot of those same compatibility checks But there are some things to consider that are going to be a little bit different Let’s start with displays In Android Things, displays are optional They’re supported, and you can use the full Android UI toolkit to build applications that have a Graphical User Interface, whether it’s touch-enabled or not But we’ve removed a lot of the default system UI and disabled or reworked some of the APIs that assume that graphical displays are in place, because many IoT devices will not have these pieces, and we don’t want to place those requirements in there The best example of this in practice is App Permissions So in Android Things, permissions are not granted at runtime by the end user, because we can’t assume that there’s a graphical display to show things like this dialog And we can’t even really assume that a user granting specific types of permissions is appropriate for an IoT device So instead, these permissions are actually granted by you, the developer, using the Android Things Console, 
OK? So as the owner of your device, you’re responsible for taking control of the apps that run on this device, and the permissions that those particular applications have, OK? Now, because of this, permissions may not be granted by end users So that means you don’t have to request those permissions to be granted at runtime But since they are still granted dynamically, the best practice is still for your code to verify that you have that permission, OK? Because that permission could have been revoked by one of the console users, and you don’t want your application code to behave improperly in those cases So you’ll still want to have checks like this one in your code when you are accessing dangerous permissions that could be granted or revoked

by the console, but you won’t have to include the code that requests those permissions upfront from the end user Not doing this will result in the same security exception that you would otherwise see by trying to access those protected permissions if, in fact, that permission is disabled Additionally, in Android Things 1.0, permissions are no longer granted automatically on reboot This is something that we did in some of the earlier previews and is no longer the case So that means that as a developer, you can’t simply just reboot your device to try and get all those permissions brought into your app automatically You have to actually use the tooling to make that happen So during development, what you’re going to want to do is provide the -g flag when installing the applications on your device And this will grant all the permissions requested by your app by default Android Studio actually does this for you automatically So whenever you click Build and Run out of the IDE, this process is taken care of for you But if you want to do this from the command line, you’re going to have to add that flag yourself Another option is to use the pm grant command to individually grant or revoke permissions inside of your application You can do this during development, or maybe just to test what the individual behavior is of a certain permission if you deny that inside of your application If you prefer to use the Gradle command line, or perhaps you’re running automated tests or other things where the IDE is not involved, you can actually add this to your build.gradle file using an adbOptions block to apply that same -g flag any time your application is installed Speaking of UI, we should probably talk a little bit about activities Most developers think that activities are essentially screens So if we remove displays, why do we need to keep them around? 
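Before moving on to activities, here is a quick sketch of the permission tooling just described; the package name and permission below are placeholder examples, not taken from the talk:

```shell
# Install with all requested permissions granted up front
# (this is what Android Studio does on Build and Run)
adb install -g app-debug.apk

# Grant or revoke a single permission to test your app's behavior
adb shell pm grant com.example.myapp android.permission.CAMERA
adb shell pm revoke com.example.myapp android.permission.CAMERA
```

And a sketch of the equivalent Gradle configuration in a module’s build.gradle, so the same -g flag is applied whenever Gradle installs the app:

```groovy
android {
    adbOptions {
        // Pass -g on every install so requested permissions are granted
        installOptions '-g'
    }
}
```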
And it turns out that activities are a little bit more than that An activity represents a component of user focus in Android For devices with graphical displays, that does mean that it will render the contents of the view onto the window But even for devices without displays, activities also handle all of the user input events, whether that’s coming from a touch screen input– or maybe it’s a game controller or a keyboard– or any other external input device that you may have connected All of those events are going to be delivered to the foreground activity So even without a graphical display, activities are still a very important portion of Android user interface, even though the user interface might not actually include the Graphical UI It’s important to note, also, when we’re talking about activities that activities are still vulnerable to configuration changes the same way that they are on Android So as an Android developer, you’re probably used to, at least at some point, having to deal with an orientation change of a device and having that destroy your activity and recreate a new instance of it That’s effectively a very common configuration change on Android mobile devices While on Android Things, that specific instance probably is not very common– if it would happen at all– there are still a number of other configuration changes that might still happen on Android Things devices Things like changing the default locale or connecting or disconnecting a keyboard– a physical keyboard– from the device All of these events have the same net effect in that that activity will be destroyed and recreated if it happens to be in the foreground So generally speaking, if you’re working with activities on Android Things, the same rules apply to activities in terms of the logic that you put into those components They’re effectively just as fragile in terms of their lifecycle So you’re only going to want to have view-based logic or user interface-based logic inside of these 
activities Try not to put too much additional state into these components You’re going to want to push that out into other parts of your application Android Things even uses activities to launch your primary application as part of the boot process We do this using the HOME intent, which is the same intent that’s used to trigger the App Launcher on an Android mobile device This intent starts your app– and specifically the activity inside of that application– automatically on boot And in addition to that, if that application crashes or terminates for any reason, Android is going to restart that application automatically So this becomes the main entry point into your application that is automatically managed by the Android Things platform So we don’t want to forget about activities just yet Couple other things about Android Things devices Android Things devices are also relatively memory-constrained when you compare them to an Android phone

A typical Android Things device may have 512 megabytes of RAM or so Compare that with the multiple gigabytes of RAM that you would have on an Android phone– like, say, a Pixel or a Pixel 2 What this translates to for you, the developer, is that there’s actually a much lower per-process heap size for your individual application So if you’re not familiar with this idea, Android sets a fixed heap limit on every application running on that device, and it’s significantly lower than the total available memory on that device And since Android Things devices are relatively memory-constrained, that per-process limit is significantly lower than it would be on an Android phone Because of that, if you’re porting code from an Android mobile device over to Android Things, you just have to realize that if you’re using the same amount of memory in your app, there’s going to be a lot less free memory in that same process available to you, OK? So you have to keep that in mind And you also want to realize that this can also translate into a significantly larger number of garbage collection events happening as you allocate new objects, OK? 
So you want to keep a close eye on object allocations, how often you’re doing object allocations Because you may run into that ceiling much more quickly than you otherwise would on an Android device, or you might see the garbage collector kicking in quite a bit more The Memory Profiler in Android Studio is a really great resource to help you keep an eye on what’s going on inside your memory It will allow you to track those allocations over time, as well as see, overlaid into it, the individual garbage collection events So you can get a really good idea of whether or not your application is allocating too much memory and causing trouble Some of the things that you can do to help understand your device a little bit better is use some of the ActivityManager methods to do some inspection on the memory capabilities of your particular device So for example, you can use the memoryClass value on ActivityManager This will give you the exact heap size that’s available to your application The value that is returned is the value in megabytes That is how much memory you have The largeMemoryClass value is what your application would have if you added the largeHeap attribute to your manifest I would caution you against doing this on Android Things Generally speaking, because Android Things devices are memory-constrained, the memoryClass and the largeMemoryClass of these devices are generally configured to be the same value So adding this attribute to your manifest is essentially not going to do anything, OK? 
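As a rough sketch of that inspection– this assumes you already have a Context, and it only runs inside the Android framework, so treat it as illustrative rather than a complete program:

```java
// Query the per-process heap limits from ActivityManager (values in MB)
ActivityManager am =
        (ActivityManager) context.getSystemService(Context.ACTIVITY_SERVICE);

int heapMb = am.getMemoryClass();           // normal per-process heap limit
int largeHeapMb = am.getLargeMemoryClass(); // limit with android:largeHeap="true"

Log.i("Memory", "heap=" + heapMb + "MB, largeHeap=" + largeHeapMb + "MB");
```

On many Android Things boards you would expect these two numbers to come back the same, per the point above.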
You also want to inspect the low memory threshold of this device to get a sense for what that actually looks like When the available memory on the device falls below that memory threshold, the device is in a state that we call memory pressure And we’re going to talk a little bit more about what that means and why it’s important in a little bit, but just keep it in mind for now So I want you to notice something else about this diagram that I had up before Because of this per-process heap limit that’s a fixed value for a single application, if you try to put all of your application code into a single process or a single APK, you’re going to be severely limited in your ability to fully utilize the memory that is available on this device, OK? Now, keep in mind, with Android Things, the only apps on the device are your apps So you should be able to take full advantage of those memory resources as much as you possibly can The way to do that is to split your application into multiple processes, OK? 
Because that limit essentially will apply to each one of those processes individually So if you can federate the design of your application out into multiple components that are actually running in separate APKs, you’re going to have a much better ability to fully utilize the memory available on whatever device that you’re running To make the most effective use of our device, we’re going to break this app up into multiple APKs, with the primary activity running in the foreground and additional apps running in the background with support services running inside The additional benefit of running this architecture is that it actually insulates these various components from one another So in this scenario, if a crash happens in one of these components, it’s localized just to that element And it won’t bring down your entire application and force you to restart all of that from the beginning So you can manage those individual issues just within that component and leave the rest of the applications or components running on your device unaffected It also means that you can launch or relaunch these components individually as needed by your application So if you don’t need to load everything at once at boot,

you can launch various services and components just as you need them Now, it turns out the decision to put some of your components into background apps has consequences as well Android treats foreground and background processes a little bit differently, and we need to be aware of what’s going on under the hood here Android marks application processes by priority based on how closely they are related to the foreground application And this is very important because of a system process known as the low memory killer The low memory killer is a process that is constantly prowling in the background, looking for new processes to devour Its job is to ensure that the free memory on the system is available to the foreground app at any given time So if the foreground app needs new memory and the device happens to be in a state of memory pressure, the low memory killer is going to go hunting around for processes that it can terminate to allocate that memory back to the foreground, OK? On an Android device– like a typical user-driven Android device– this can be somewhat of a nuisance to developers, because their app may get terminated from the background But at some point, the user’s going to relaunch it later, and everything will be fine On Android Things, the low memory killer could mean that you have critical device functionality that is being terminated out from underneath you, and you didn’t even know it Perhaps there’s a device driver running in that service and Android killed it because it thought it was low enough priority in the background, OK? 
So something to keep in mind as you’re moving through this In addition, Android Oreo introduced execution limits for background apps So applications can no longer be started into a background state They must either be launched from the foreground app or bound to it in some way So because of these two things, there’s a number of different common ways that you may or may not have used in the past to launch components into the background, OK? And we’re going to walk through those a little bit So the first that you might be familiar with is using the BOOT_COMPLETED broadcast to listen for a final boot message coming from the Android framework saying that the system is up and running, you can launch other apps if you would like Do not use this on Android Things The primary reason is because of those background execution limits, your background services actually can’t be properly started into that state In a lot of cases, it won’t even work, OK? And in addition to that, this background broadcast– this BOOT_COMPLETED broadcast– is very unpredictable in terms of its timing In a lot of cases, this BOOT_COMPLETED broadcast actually triggers much, much later than when the home intent and the home activity are fully up and running in the foreground, OK? 
So if you’re trying to synchronize between these two things, it’s not a very good mechanism to rely on In addition, I would recommend you don’t use startService For a similar reason, startService is limited by those same background execution limits unless you are starting a service in foreground mode Now, foreground services require you, as a developer, to actually build in a notification that would typically display to the user when that service is running Well, we took away the system UI where that notification would display, so you end up doing a bunch of work for displaying the service that doesn’t actually gain you anything And in addition to that, there are some difficulties with started services when it comes to managing their lifecycle A started service– if that crashes for some reason, you don’t have a direct connection to understand that that occurred and that you need to restart that service so that you can manage that process a little bit better Now, Android does have this thing where services can return START_STICKY, and that’s a way for applications to tell Android that this service is important, and if it crashes or terminates for some reason, I need it to be restarted However, Android usually only does this about once or twice for a given service before it just sort of gives up and realizes that at some point, the user will launch this app again Maybe this will start again and everything will be fine, OK? That type of thinking doesn’t go well for those background services that have critical functionality in them– like a device driver, OK? So we recommend you use bindService instead By using bound services, this gives the background processes an active connection to that foreground app, OK? 
So you have a good indication of when that service is running and when that service has died for some reason so that you can manage that, relaunch it if you need to, do any of those things This also has the added benefit of a built-in communication channel between the applications that are bound So you can do some more direct communication with that service without having to use intents or other mechanisms

like that to pass data back and forth So looking at this diagram again, one of the other important reasons to use bound services is that pure background applications– like those that would have been started by BOOT_COMPLETED or just startService on its own– are very low priority on the scale, OK? Whereas bound service applications are almost as high priority as the foreground app They are literally the highest priority you can get without being the foreground app So this ensures that those background processes stay safe from something like low memory killer if the device ever does get into a memory pressure situation, OK? So you get better management of those services, and you get better protection from a memory management perspective All right, let’s take a look at what this would actually look like in code So I have just a basic example of a service here that has a device driver inside of it In this case, this is just a device driver to take some button inputs and convert them into key events like they were coming off of a keyboard All of this logic can be fully encapsulated into this external service and can run on its own So we can build this service component And then from the foreground app, we can construct an intent to that service component, and we can bind to it Notice that I’m doing this from the Application class and not from the primary activity Remember our discussion from activities before and the lifecycle associated with those If this is a service that needs to be as persistent as possible, we want to bind to it from a component that is expected to be around just as long, OK? 
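The pattern just described might look something like this– a minimal sketch in Android framework code, where the class and service names are hypothetical and the whole thing is illustrative rather than runnable on its own:

```java
// Sketch: binding a background driver service from the Application class,
// so the binding lives as long as the process rather than a single activity
public class ThingsApplication extends Application {

    private final ServiceConnection connection = new ServiceConnection() {
        @Override
        public void onServiceConnected(ComponentName name, IBinder binder) {
            // Service is up and running; safe to communicate with it now
        }

        @Override
        public void onServiceDisconnected(ComponentName name) {
            // Service crashed or was killed unexpectedly; consider rebinding
        }
    };

    @Override
    public void onCreate() {
        super.onCreate();
        Intent intent = new Intent(this, ButtonDriverService.class);
        bindService(intent, connection, Context.BIND_AUTO_CREATE);
    }
}
```

Binding from the Application object, rather than an activity, keeps the connection alive for the life of the process.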
So that you don’t end up with lifecycle issues where your activity gets destroyed, recreated, and you’re rebinding to that service unnecessarily Doesn’t necessarily cause a major problem, but it’s not the best idea In addition, with bound services, this also means that you get this feedback mechanism coming through the service connection callback So when you bind to a service, you provide this callback as a service connection And when the service is up and running, you will be notified through the onServiceConnected method So you know exactly when this is now something you can interact with or communicate with if you need to In addition, onServiceDisconnected tells us anytime that service stops unexpectedly– maybe because it has crashed or something else has occurred And at that point, we probably need to take a look at restarting it, especially if it’s running some critical functionality on our device So we now have the information we need to properly manage this functionality from within our applications, which we wouldn’t get with started services or other, more independent mechanisms So here’s a final picture of that architecture again Android is going to manage that foreground app for us automatically using the HOME intent It will launch it automatically on boot, and it will relaunch it if that application crashes for any reason And then our application code can then manage these additional background support services through the bound service mechanism All right, the last thing that I want to share with you today are just a couple of quick tips on doing this type of development from within Android Studio, or within the development tools So you can manage multiple APKs from within a single Android Studio project by adding each additional package as a new module You can add multiple modules to the same project, and all of those modules can represent an APK or an individual app process This allows you to manage all of your code in one place, even 
though they’re technically separate apps Now, by default, Android Studio does not allow you to deploy an app module that does not contain a launcher activity, or an activity that has that main launcher intent filter on it This doesn’t work so well for background service apps that don’t have any activities at all in them in some cases But you can modify this behavior For a background services app, you can edit the run configuration and simply adjust the launch options for that particular module, set that target to Nothing instead of Default Activity This will enable Android Studio to deploy that service-only app to your device, and it won’t complain You can also do this from the command line And one of the advantages of doing it this way is that Android Studio does require that you deploy only one module at a time– by selecting that module from the run configuration list in the UI So if you have an application that’s constructed of four or five different modules all as individual APKs, it can be a bit cumbersome if you have to try and deploy them all individually all the time One of the advantages of using the Gradle command line is that by default, when you run a command like installDebug with no other modifiers, it builds and installs

every module in that project So with one command, you can deploy everything on the latest version to that device And you can still do individual modules, if you would prefer to do that, by just adding the module name to the command as well Once you’ve got the modules on the device, the other thing you can do directly from the command line that isn’t really supported in Android Studio today is the ability to start those individual components, whether they’re activities or services So using the am shell commands, you can trigger those services manually if you want to test out some of that behavior independently from the rest of the system, even though they may be managed by the foreground app in production All right, so let’s quickly review some of the tips that we’ve gone through here today Don’t assume a graphical UI Design for your memory constraints on these devices Break your app up into modules Bind your background services to the foreground app Don’t use started services And use the Gradle command line if you want to have more control over deploying your modules to the device Now, if you’re just as excited about Android Things as we are, I want to remind everyone that we’re doing a scavenger hunt here at Google I/O. 
If you visit the link here or use the Google I/O app, you can follow the instructions to find various items around the conference And once you’ve completed those challenges, you can then receive a free Android Things developer kit to take home To learn more about Android Things, visit the developer site and make sure to visit the Codelabs, office hours, and other demos that we have here in the Sandbox Also be sure to visit the Android Things community site to find featured community projects and additional sample code You’ll also find a lot of the sample code for some of the demos that we have here at the conference as well So thank you, everyone, for your time today, and I’m really excited to see the apps that you build with Android Things [APPLAUSE] [MUSIC PLAYING]

TensorFlow World 2019 Keynote

JEFF DEAN: I’m really excited to be here I think it was almost four years ago to the day that we were about 20 people sitting in a small conference room in one of the Google buildings We’d woken up early because we wanted to kind of time this for an early East Coast launch where we were turning on the website and releasing the first version of TensorFlow as an open source project And I’m really, really excited to see what it’s become It’s just remarkable to see the growth and all the different kinds of ways in which people have used this system for all kinds of interesting things around the world So one thing that’s interesting is the growth in the use of TensorFlow also kind of mirrors the growth in interest in machine learning and machine learning research generally around the world So this is a graph showing the number of machine learning arXiv papers that have been posted over the last 10 years or so And you can see it’s growing quite, quite rapidly, much more quickly than you might expect And that lower red line is kind of the nice doubling-every-couple-of-years exponential growth rate we got used to in computing power, due to Moore’s law, for so many years That’s now kind of slowed down But you can see that the machine learning research community is generating research ideas at faster than that rate, which is pretty remarkable We’ve replaced computational growth with growth of ideas, and we’ll see that both together will be important And really, the excitement about machine learning is because we can now do things we couldn’t do before, right? 
As little as five or six years ago, computers really couldn’t see that well And starting in about 2012, 2013, we started to have people use deep neural networks to try to tackle computer vision problems, image classification, object detection, things like that And so now, using deep learning and deep neural networks, you can feed in the raw pixels of an image and fairly reliably get a prediction of what kind of object is in that image Feed in the pixels there– red, green, and blue values in a bunch of different coordinates– and you get out the prediction leopard This works for speech as well You can feed in audio waveforms, and by training on lots of audio waveforms and transcripts of what’s being said in those waveforms, we can actually take a completely new recording, tell you what is being said, and emit a transcript Bonjour, comment allez-vous? You can even combine these ideas and have models that take in pixels, and instead of just predicting classifications of what objects are in the image, can actually write a short sentence, a short caption, that a human might write about the image– a cheetah lying on top of a car That’s one of my vacation photos, which was kind of cool And so just to show the progress in computer vision: Stanford hosts an ImageNet contest every year to see how well computer vision systems can predict one of 1,000 categories in a full color image And you get about a million images to train on, and then you get a bunch of test images your model has never seen before And you need to make a prediction In 2011, the winning entrant got 26% error, right? 
So you can kind of make out what that is But it’s pretty hard to tell We know from human experiments that the error of a well-trained human, someone who’s practiced at this particular task and really understands the 1,000 categories, is about 5% So this is not a trivial task And in 2016, the winning entrant got 3% error So just look at that tremendous progress in the ability of computers to resolve and understand computer imagery and have computer vision that actually works This is remarkably important in the world, because now we have systems that can perceive the world around us, and we can do all kinds of really interesting things with that We’ve seen similar progress in speech recognition and language translation and things like that So for the rest of the talk, I’d like to kind of structure it around this nice list of 14 challenges that the US National Academy of Engineering put out, feeling that these were important things for the science and engineering communities to work on for the next 100 years They put this out in 2008 and came up with this list of 14 things after some deliberation And I think you’ll agree that these are sort of pretty good large challenging problems, and that if we actually make progress on them, we’ll actually have a lot of progress in the world We’ll be healthier We’ll be able to learn things better We’ll be able to develop better medicines We’ll have all kinds of interesting energy solutions So I’m going to talk about a few of these And the first one I’ll talk about is restoring and improving urban infrastructure So we’re on the cusp of the sort of widespread commercialization

of a really interesting new technology that’s going to really change how we think about transportation And that is autonomous vehicles And this is a problem that has been worked on for quite a while, but it’s now starting to look like it’s actually completely possible and commercially viable to produce these things And a lot of the reason is that we now have computer vision and machine learning techniques that can take in the sort of raw forms of data that the sensors on these cars collect So they have the spinning LIDARs on the top that give them 3D point cloud data They have cameras in lots of different directions They have radar in the front bumper and the rear bumper And they can really take all this raw information in, and with a deep neural network, fuse it all together to build a high level understanding of what is going on around the car Oh, there’s another car to my side, there’s a pedestrian up here to the left, there’s a light post over there I don’t really need to worry about that moving And really help to understand the environment in which they’re operating, and then what actions they can take in the world that are both legal and safe, obey all the traffic laws, and get them from A to B And this is not some distant far-off dream Alphabet’s Waymo subsidiary has actually been running tests in Phoenix, Arizona Normally when they run tests, they have a safety driver in the front seat, ready to take over if the car does something kind of unexpected But for the last year or so, they’ve been running tests in Phoenix with real passengers in the backseat and no safety drivers in the front seat, running around suburban Phoenix So suburban Phoenix is a slightly easier training ground than, say, downtown Manhattan or San Francisco But it’s still something that is not really far off It’s something that’s actually happening And this is really possible because of things like machine learning and the use of TensorFlow in these systems Another one that I’m really, really 
excited about is advance health informatics This is a really broad area, and I think there’s lots and lots of ways that machine learning and the use of health data can be used to make better health care decisions for people So I’ll talk about one of them And really, I think the potential here is that we can use machine learning to bring the wisdom of experts, through a machine learning model, anywhere in the world And that’s really a huge, huge opportunity So let’s look at this through one problem we’ve been working on for a while, which is diabetic retinopathy So diabetic retinopathy is the fastest growing cause of preventable blindness in the world And if you’re at risk for this, if you have diabetes or early symptoms that make it likely you might develop diabetes, you should really get screened every year So there’s 400 million people around the world that should be screened every year But the screening is really specialized Most doctors can’t do it You really need an ophthalmologist level of training in order to do this effectively And the impact of the shortage is significant So in India, for example, there’s a shortage of 127,000 eye doctors to do this sort of screening And as a result, 45% of patients who are diagnosed with this disease have actually suffered either full or partial vision loss before they’re diagnosed and then treated And this is completely tragic because this disease, if you catch it in time, is completely treatable There’s a very simple 99% effective treatment We just need to make sure that the right people get treated at the right time So what can you do? 
So, it turns out diabetic retinopathy screening is also a computer vision problem, and the progress we’ve made on general computer vision problems, where you want to take a picture and tell if that’s a leopard or an aircraft carrier or a car, actually also works for diabetic retinopathy So you can take a retinal image, which is sort of the raw data that comes off the screening camera, and feed that into a model that predicts 1, 2, 3, 4, or 5 That’s how these things are graded, 1 being no diabetic retinopathy, 5 being proliferative, and the other numbers being in between So it turns out you can get a collection of retinal images and have ophthalmologists label them Turns out if you ask two ophthalmologists to label the same image, they agree with each other 60% of the time on the number 1, 2, 3, 4, or 5 But perhaps slightly scarier, if you ask the same ophthalmologist to grade the same image a few hours apart, they agree with themselves 65% of the time But you can fix this by getting each image labeled by a lot of ophthalmologists, so you’ll get it labeled by seven ophthalmologists If five of them say it’s a 2, and two of them say it’s a 3, it’s probably more like a 2 than a 3 Eventually, you have a nice, high quality data set you can train on Like many machine learning problems, high quality data is the right raw ingredient
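The aggregation step described here can be sketched in a few lines Taking the median of the seven grades is one simple, hypothetical choice; the talk doesn’t specify the exact rule used to combine the labels

```python
from statistics import median

def consensus_grade(grades):
    # Combine the 1-5 grades that several ophthalmologists gave one
    # retinal image into a single training label for the data set
    if not grades:
        raise ValueError("need at least one grade")
    return median(grades)

# Five ophthalmologists say 2, two say 3: the aggregate label is
# "more like a 2 than a 3"
print(consensus_grade([2, 2, 2, 2, 2, 3, 3]))  # -> 2
```

Any robust aggregate (median, mode, or a discussed adjudication like the one described later for retinal specialists) serves the same purpose: washing out the per-doctor disagreement in the raw labels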

But then you can apply, basically, an off-the-shelf computer vision model trained on this data set And now you can get a model that is on par or perhaps slightly better than the average board-certified ophthalmologist in the US, which is pretty amazing It turns out you can actually do better than that If you get the data labeled by retinal specialists, people who have more training in retinal disease, and change the protocol by which you label things, so that three retinal specialists look at an image, discuss it amongst themselves, and come up with what’s called an adjudicated assessment and one number, then you can train a model and now be on par with retinal specialists, which is kind of the gold standard of care in this area And that’s something you can now take and distribute widely around the world So one issue, particularly with health care kinds of problems, is you want explainable models You want to be able to explain to a clinician why we think this person has moderate diabetic retinopathy So you can take a retinal image like this, and one of the things that really helps is if you can show in the model’s assessment why this is a 2 and not a 3 And by highlighting parts of the input data, you can actually make this more understandable for clinicians and enable them to really get behind the assessment that the model is making And we’ve seen this in other areas as well There’s been a lot of work on explainability, so I think the notion that deep neural networks are sort of complete black boxes is a bit overdone There’s actually a bunch of good techniques that are being developed, and more all the time, that will improve this So a bunch of advances depend on being able to understand text And we’ve had a lot of really good improvements in the last few years on language understanding So this is a bit of a story of research and how research builds on other research So in 2017, a collection of Google researchers and interns 
came up with a new kind of model for text called the Transformer model So unlike recurrent models, where you have kind of a sequential process where you absorb one word or one token at a time, update some internal state, and then go on to the next token, the Transformer model enables you to process a whole bunch of text all at once in parallel, making it much more computationally efficient, and then to use attention on previous text to really focus on, if I’m trying to predict what the next word is, what other parts of the context to the left are relevant to predicting that? So that paper was quite successful and showed really good results on language translation tasks with a lot less compute So the BLEU score there in the first two columns, for English to German and English to French, higher is better And then the compute cost of these models shows that this is getting sort of state of the art results at that time, with 10 to 100x less compute than other approaches Then in 2018, another team of Google researchers built on the idea of Transformers So everything you see there in a blue oval is a Transformer module, and they came up with this approach called Bidirectional Encoder Representations from Transformers, or BERT It’s a little bit shorter and more catchy So BERT has this really nice property that, in addition to using context to the left, it uses context all around the language, sort of the surrounding text, in order to make predictions about text And the way it works is you start with a self-supervised objective So the one really nice thing about this is there’s lots and lots of text in the world So if you can figure out a way to use that text to train a model to be able to understand text better, that would be great So we’re going to take this text, and in the BERT training objective, to make it self-supervised, we’re going to drop about 15% of the words And this is actually pretty hard, but the model is then going to try to fill in the blanks, 
essentially Try to predict the missing words that were dropped And because we actually have the original words, we now know if the model is correct in its guesses about what goes in the box And by processing trillions of words of text like this, you actually get a very good understanding of contextual cues in language and how to fill in the blanks in a really intelligent way And so that’s essentially the training objective for BERT You take text, you drop 15% of it, and then you try to predict those missing words And one key thing that works really well is that, in step one, you can pre-train a model on lots and lots of text, using this fill-in-the-blank self-supervised objective function And then, in step two, you can take a language task you really care about Like maybe you want to predict, is this a five-star review

or a one-star review for some hotel, but you don’t have very much labeled text for that actual task You might have 10,000 reviews and know the star count of each review But you can then fine-tune the model, starting with the model trained in step one on trillions of words of text, and now use your paltry 10,000 examples for the text task you really care about And that works extremely well So in particular, BERT gave state-of-the-art results across a broad range of different text understanding benchmarks in this GLUE benchmark suite, which was pretty cool And people have been using BERT now in this way to improve all kinds of different things all across the language understanding and NLP space So one of the grand challenges was to engineer the tools of scientific discovery And I think it’s pretty clear machine learning is actually going to be an important component of making advances in a lot of these other grand challenge areas, things like autonomous vehicles or other kinds of things And it’s been really satisfying to see what we’d hoped would happen when we released TensorFlow as an open source project has actually kind of come to pass, in that lots of people would pick up TensorFlow and use it for all kinds of things People would improve the core system They would use it for tasks we would never imagine And that’s been quite satisfying So people have done all kinds of things Some of these are uses inside of Google Some are outside in academic institutions Some are scientists working on conserving whales or understanding ancient scripts, many kinds of things, which is pretty neat The breadth of uses is really amazing These are the 20 winners of the AI Impact Challenge, where people could submit proposals for how they might use machine learning and AI to really tackle a local challenge they saw in their communities And they have all kinds of things, ranging from trying to predict better ambulance dispatching to identifying sort of illegal 
logging using speech recognition or audio processing Pretty neat And many of them are using TensorFlow So one of the things we’re pretty excited about is AutoML, which is this idea of automating some of the process by which machine learning experts sit down and make decisions to solve machine learning problems So currently, you have a machine learning expert sit down, they take data, they have computation They run a bunch of experiments They kind of stir it all together And eventually, you get a solution to a problem you actually care about One of the things we’d like to be able to do, though, is see if we could eliminate a lot of the need for the human machine learning expert to run these experiments and instead automate the experimental process by which a machine learning expert comes by a high quality solution for a problem you care about So lots and lots of organizations around the world have machine learning problems, but many, many of them don’t even realize they have a machine learning problem, let alone have people in their organization that can tackle the problem So one of the earliest pieces of work our researchers did in this space was something called neural architecture search So when you sit down and design a neural network to tackle a particular task, you make a lot of decisions about shapes of this or that, and should it use 3 by 3 filters at layer 17 or 5 by 5, all kinds of things like this It turns out you can automate this process by having a model-generating model and training the model-generating model based on feedback about how well the models that it generates work on the problem you care about So the way this will work, we’re going to generate a bunch of models Those are just descriptions of different neural network architectures We’re going to train each of those for a few hours, and then we’re going to see how well they work And then use the accuracy of those models as a reinforcement learning signal for the model-generating 
model, to steer it away from models that didn’t work very well and towards models that worked better And we’re going to repeat many, many times And over time, we’re going to get better and better by steering the search to the parts of the space of models that worked well And so it comes up with models that look a little strange, admittedly A human probably would not sit down and wire up a machine learning computer vision model exactly that way But they’re pretty effective So if you look at this graph, this shows kind of the best human machine learning experts, computer vision experts, machine learning researchers in the world, producing a whole bunch of different kinds of models in the last four or five years, things like ResNet-50, DenseNet-201, Inception-ResNet, all kinds of things That black dotted line is kind of the frontier of human machine learning expert model quality on the y-axis and computational cost

on the x-axis So what you see is as you go out the x-axis, you tend to get more accuracy because you’re applying more computational cost But what you see is the blue dotted line is AutoML-based solutions, systems where we’ve done this automated experimentation instead of pre-designing any particular architecture And you see that it’s better both at the high end, where you care about the most accurate model you can get, regardless of computational cost, but it’s also better at the low end, where you care about a really lightweight model that might run on a phone or something like that And in 2019, we’ve actually been able to improve that significantly This is a set of models called EfficientNet, and it has kind of a slider where you can trade off computational cost and accuracy But they’re all way better than the human-guided experimentation on the black dotted line there And this is true for image recognition, for [INAUDIBLE] It’s true for object detection So the red line there is AutoML The other things are not It’s true for language translation So the black line there is various kinds of Transformers The red line is we gave the basic components of Transformers to an AutoML system and allowed it to fiddle with them and come up with something better It’s true for computer vision models used in autonomous vehicles So this was a collaboration between Waymo and Google Research We were able to come up with models that were significantly lower latency for the same quality, or they could trade it off and get a significantly lower error rate at the same latency It actually works for tabular data So if you have lots of customer records, and you want to predict which customers are going to be spending $1,000 with your business next month, you can use AutoML to come up with a high quality model for that kind of problem OK So what do we want? 
I think we want the following properties in a machine learning model So one is we tend to train separate models for each different problem we care about And I think this is a bit misguided Really, we want one model that does a lot of things, so that it can build on the knowledge in how it does thousands or millions of different things, so that when the million-and-first thing comes along, it can actually use its expertise from all the other things it knows how to do to get into a good state for the new problem with relatively little data and relatively little computational cost So these are some nice properties I have kind of a cartoon diagram of something I think might make sense So imagine we have a model like this where it’s very sparsely activated, so different pieces of the model have different kinds of expertise And they’re called upon when it makes sense, but they’re mostly idle, so it’s relatively computationally [INAUDIBLE] power efficient But it can do many things And now, each component here is some piece of machine learning model with different kinds of state, parameters in the model, and different operations And a new task comes along Now you can imagine something like neural architecture search becoming– squint at it just right and it turns into neural pathway search We’re going to look for components that are really good for this new task we care about, and maybe we’ll search and find that this path through the model actually gets us into a pretty good state for this new task Because maybe it goes through components that are trained on related tasks already And now maybe we want that model to be more accurate for the purple task, so we can add a bit more computational capacity, add a new component, start to use that component for this new task, continue training it, and now that new component can also be used for solving other related tasks And each component itself might be running some sort of interesting architecture search 
inside it So I think something like that is the direction we should be exploring as a community It’s not what we’re doing today, but I think it could be a pretty interesting direction OK, and finally, I’d like to touch on thoughtful use of AI in society As we’ve seen more and more uses of machine learning in our products and around the world, it’s really, really important to be thinking carefully about how we want to apply these technologies Like any technology, these systems can be used for amazing things or things we might find a little detrimental in various ways And so we’ve come up with a set of principles by which we think about applying machine learning and AI to our products And we made these public about a year and a half ago as a way of sharing our thought process with the rest of the world And I particularly like these I’ll point out many of these are areas of research that are not fully understood yet, but we aim to apply the best state-of-the-art methods, for example, for reducing bias in machine learning models, but also to continue to do research and advance the state of the art in these areas And so this is just kind of a taste of different kinds of work we’re doing in this area– how do we do machine learning with more privacy,

using things like federated learning? How do we make models more interpretable so that a clinician can understand the predictions it’s making on diabetic retinopathy sorts of examples? How do we make machine learning more fair? OK, and with that, I hope I’ve convinced you that deep neural nets and machine learning– you’re already here, so maybe you’re already convinced of this– are helping make significant advances in a lot of hard computer science problems, computer vision, speech recognition, language understanding General use of machine learning is going to push the world forward So thank you very much, and I appreciate you all being here [APPLAUSE] MEGAN KACHOLIA: Hey, everyone Good morning Just want to say, first of all, welcome Today, I want to talk a little bit about TensorFlow 2.0 and some of the new updates that we have that are going to make your experience with TensorFlow even better But before I dive into a lot of those details, I want to start off by thanking you, everyone here, everyone on the livestream, everyone who’s been contributing to TensorFlow, all of you who make up the community TensorFlow was open sourced to help accelerate the AI field for everyone You’ve used it in your experiments You’ve deployed it in your businesses You’ve made some amazing different applications that we’re so excited to showcase and talk about, some that we get to see a bit here today, which is one of my favorite parts about conferences like this And you’ve done so much more And all of this has helped make TensorFlow what it is today It’s the most popular ML ecosystem in the world And honestly, that would not happen without the community being excited and embracing and using this and giving back So on behalf of the entire TensorFlow team, I really just first want to say thank you because it’s so amazing to see how TensorFlow is used That’s one of the greatest things I get to see about my job, is the applications and the way folks are using TensorFlow I want to 
take a step back and talk a little bit about some of the different user groups and how we see them making use of TensorFlow TensorFlow is being used across a wide range of experiments and applications So here, calling out researchers, data scientists and developers, and there’s other groups kind of in-between as well Researchers use it because it’s flexible It’s flexible enough to experiment with and push the state of the art in deep learning You heard this even just a few minutes ago, with folks from Twitter talking about how they’re able to use TensorFlow and expand on top of it in order to do some of the amazing things that they want to make use of on their own platform And at Google, we see examples of this when researchers are creating advanced models like XLNet and some of the other things that Jeff referenced in his talk earlier Taking a step forward, looking at data scientists, data scientists and enterprise engineers have said they rely on TensorFlow for performance and scale in training and production environments That’s one of the big things about TensorFlow that we’ve always emphasized and looked at from the beginning How can we make sure this can scale to large production use cases? 
For example, Quantify and BlackRock use TensorFlow to test and deploy BERT in real-world NLP instances, such as text tokenization, as well as classification Hopping one step forward, looking a bit at application developers, application developers use TensorFlow because it’s easy to learn ML on the platforms that they care about Arduino wants to make ML simple on microcontrollers, so they rely on TensorFlow pre-trained models and TensorFlow Lite Micro for deployment Each of these groups is a critical part of the TensorFlow ecosystem And this is why we really wanted to make sure that TensorFlow 2.0 works for everyone We announced the alpha at our Dev Summit earlier this year And over the past few months, the team has been working very hard to incorporate early feedback Again, thank you to the community for giving us that early feedback, so we can make sure we’re developing something that works well for you And we’ve been working to resolve bugs and issues and things like that And just last month in September, we were excited to announce the final general release for TensorFlow 2.0 You might be familiar with TensorFlow’s architecture, which has always supported the ML lifecycle from training through deployment Again, one of the things we’ve emphasized since the beginning when TensorFlow was initially open sourced a few years ago But I want to emphasize how TensorFlow 2.0 makes this workflow even easier and more intuitive First, we invested in Keras, an easy-to-use package in TensorFlow, making it the default high level API Many developers love Keras because it’s easy to use and understand Again, you heard this already mentioned a little bit earlier, and hopefully, we’ll hear more about it throughout the next few days By tightly integrating Keras into 2.0, we can make Keras work even better with primitives like tf.data We can do performance optimizations behind the scenes and run distributed training Again, we really wanted 2.0 to focus on usability
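As a rough sketch of what that tight integration looks like in practice, a tf.keras model can consume a tf.data pipeline directly in its fit call The toy data, model, and hyperparameters below are hypothetical, not code from the talk

```python
# Minimal sketch of tf.keras + tf.data in TensorFlow 2.x
import numpy as np
import tensorflow as tf

# A toy dataset: 256 four-feature examples with binary labels
features = np.random.rand(256, 4).astype("float32")
labels = (features.sum(axis=1) > 2.0).astype("int32")
dataset = (
    tf.data.Dataset.from_tensor_slices((features, labels))
    .shuffle(256)
    .batch(32)
)

# Keras is the default high-level API in 2.0
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(4,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])

# Eager execution is on by default, and fit() consumes
# the tf.data pipeline directly
history = model.fit(dataset, epochs=2, verbose=0)
print(len(history.history["loss"]))  # one loss value per epoch -> 2
```

The same compile-and-fit calls work whether the input is NumPy arrays or a tf.data pipeline, which is the kind of behind-the-scenes integration being described here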

How can we make it easier for developers? How can we make it easier for users to get what they need out of TensorFlow? For instance, Lose It, a customized weight loss app, said they use tf.keras for designing their network By leveraging [INAUDIBLE] strategy distribution in 2.0, they were able to utilize the full power of their GPUs It’s feedback like this that we love to hear, and again, it’s very important for us to know how the community is making use of things, how the community is using 2.0, the things they want to see, so that we can make sure we’re developing the right framework and also make sure you can contribute back When you need a bit more control to create advanced algorithms, 2.0 comes fully loaded with eager execution, making it familiar for Python developers This is especially useful when you’re stepping through, doing debugging, making sure you can really understand step by step what’s happening This also means there’s less coding required when training your model, all without having to use sessions Again, usability is a focus To demonstrate the power of training models with 2.0, I’ll show you how you can train a state-of-the-art NLP model in 10 lines of code, using the Transformers NLP library by Hugging Face– again, a community contribution This popular package hosts some of the most advanced NLP models available today, like BERT, GPT, Transformer-XL, XLNet, and now supports TensorFlow 2.0 So let’s take a look Here, kind of just looking through the code, you can see how you can use 2.0 to train Hugging Face’s DistilBERT model for text classification You just simply load the tokenizer, model, and the data set, then prepare the data set and use the tf.keras compile and fit APIs And with a few lines of code, I can now train my model And with just a few more lines, we can use the trained model for tasks such as text classification using eager execution Again, it’s examples like this where we can see how the community takes something and is able to do 
something very exciting and amazing by making use of the platform and the ecosystem that TensorFlow is providing But building and training a model is only one part of TensorFlow 2.0 You need the performance to match That’s why we worked hard to continue to improve performance with TensorFlow 2.0 It delivers up to 3x faster training performance using mixed precision on NVIDIA Volta and Turing GPUs in a few lines of code with models like ResNet-50 and BERT As we continue to double down on 2.0 in the future, performance will remain a focus with more models and with hardware accelerators For example, in 2.1, so the next upcoming TensorFlow release, you can expect TPU and TPU pod support, along with mixed precision for GPUs So performance is something that we’re keeping a focus on as well, while also making sure usability really stays at the forefront But there’s a lot more to the ecosystem So beyond model building and performance, there are many other pieces that help round out the TensorFlow ecosystem Add-ons and extensions are a very important piece here, which is why we wanted to make sure that they’re also compatible with TensorFlow 2.0 So you can use popular libraries, like some of the ones called out here, whether it’s TensorFlow Probability, TF Agents, or TF Text We’ve also introduced a host of new libraries to help researchers and ML practitioners in more useful ways So for example, Neural Structured Learning helps to train neural networks with structured signals And the new Fairness Indicators add-on enables regular computation and visualization of fairness metrics And these are just the types of things that you can see as part of the TensorFlow ecosystem, these add-ons that, again, can help you make sure you’re able to do the things you need to do not just with your models, but beyond that Another valuable aspect of the TensorFlow ecosystem is being able to analyze your ML experiments in detail So this is showing TensorBoard TensorBoard is 
TensorFlow’s visualization toolkit, which is what helps you accomplish this It’s a popular tool among researchers and ML practitioners for tracking metrics, visualizing model graphs and parameters, and much more It’s very interesting that we’ve seen users enjoy TensorBoard so much, they’ll even take screenshots of their experiments and then use those screenshots to be able to share with others what they’re doing with TensorFlow This type of sharing and collaboration in the ML community is something we really want to encourage with TensorFlow Again, there’s so much that can happen by enabling the community to do good things That’s why I’m excited to share the preview of a new, free, managed TensorBoard experience that lets you upload and share your ML experiment results with anyone You’ll now be able to host and track your ML experiments and share them publicly

No setup required Simply upload your logs, and then share the URL, so that others can see the experiments and see the things that you are doing with TensorFlow As a preview, we’re starting off with the [INAUDIBLE] dashboard, but over time, we’ll be adding a lot more functionality to make the sharing experience even better But if you’re not looking to build models from scratch and want to reduce some computational cost, TensorFlow has always made pre-trained models available through TensorFlow Hub And today, we’re excited to share an improved experience of TensorFlow Hub that’s much more intuitive, where you can find a comprehensive repository of pre-trained models in the TensorFlow ecosystem This means you can find models like BERT and others related to image, text, video, and more that are ready to use with TensorFlow Lite and TensorFlow.js Again, we wanted to make sure the experience here was vastly improved to make it easier for you to find what you need in order to more quickly get to the task at hand And since TensorFlow is driven by all of you, TensorFlow Hub is hosting more pre-trained models from the community You’ll be able to find curated models by DeepMind, Google, Microsoft’s AI for Earth, and NVIDIA ready to use today, with many more to come We want to make sure that TensorFlow Hub is a great place to find some of these excellent pre-trained models And again, there’s so much the community is doing We want to be able to showcase those models as well TensorFlow 2.0 also highlights TensorFlow’s core strengths and areas of focus, which is being able to go from model building, experimentation, through to production, no matter what platform you work on You can deploy end-to-end ML pipelines with TensorFlow Extended or TFX You can use your models on mobile and embedded devices with TensorFlow Lite for on-device inference, and you can train and run models in the browser or Node.js with TensorFlow.js You’ll learn more about what’s new in TensorFlow in production 
during the keynote sessions tomorrow You can learn more about these updates by going to where you’ll also find the latest documentation, examples, and tutorials for 2.0 Again, we want to make sure it’s easy for the community to see what’s happening, what’s new, and enable you to just do what you need to do with TensorFlow We’ve been thrilled to see the positive response to 2.0, and we hope you continue to share your feedback Thank you, and I hope you enjoy the rest of TF World [APPLAUSE] FREDERICK REISS: Hello, everyone I’m Fred Reiss I work for IBM I’ve been working for IBM since 2006 And I’ve been contributing to TensorFlow Core since 2017 But my primary job at IBM is to serve as tech lead for CODAIT That’s the Center for Open Source Data and AI Technologies We are an open source lab located in downtown San Francisco, and we work on open source technologies that are foundational to AI And we have on staff 44 full-time developers who work only on open source software And that’s a lot of developers, a lot of open source developers Or is it? 
Well, if you look across IBM at all of the IBM-ers who are active contributors to open source, in that they have committed code to GitHub in the last 30 days, you’ll find that there are almost 1,200 IBM-ers in that category So our 44 developers are actually a very small slice of a very large pie Oh, and those numbers, they don’t include Red Hat When we closed that acquisition earlier this year, we more than doubled our number of active contributors to open source So you can see that IBM is really big in open source And more and more, the bulk of our contributions in the open are going towards the foundations of AI And when I say AI, I mean AI in production I mean AI at scale AI at scale is not an algorithm It’s not a tool It’s a process It’s a process that starts with data, and then that data turns into features And those features train models, and those models get deployed in applications, and those applications produce more data And the whole thing starts all over again And at the core of this process is an ecosystem of open source software And at the core of this ecosystem is TensorFlow, which is why I’m here, on behalf of IBM open source, to welcome you to TensorFlow World Now throughout this conference, you’re going to see talks that speak to all of the different stages of this AI lifecycle But I think you’re going to see a special emphasis on this part–

moving models into production And one of the most important aspects of moving models into production is that when your model gets deployed in a real-world application, it's going to start having effects on the real world And it becomes important to ensure that those effects are positive and that they're fair to your clients, to your users Now, at IBM, here's a hypothetical example that our researchers put together a little over a year ago They took some real medical records data, and they produced a model that predicts which patients are more likely to get sick and therefore should get additional screening And they showed that if you naively train this model, you end up with a model that has significant racial bias, but that by deploying state-of-the-art techniques to adjust the data set and the process of making the model, they could substantially reduce this bias and produce a model that is much more fair You can see a Jupyter Notebook with the entire scenario from end to end, including code and equations and results, at the URL down here Again, I need to emphasize this was a hypothetical example We built a flawed model deliberately, so we could show how to make it better But no patients were harmed in this exercise However, last Friday, I sat down with my morning coffee, and I opened up the "Wall Street Journal." And I saw this article at the bottom of page three, describing a scenario eerily similar to our hypothetical When your hypothetical starts showing up as newspaper headlines, that's kind of scary And I think it is incumbent upon us as an industry to move forward the process and technology of trust and transparency in AI, which is why IBM and IBM Research have released our toolkits of state-of-the-art algorithms in this space as open source under AI Fairness 360, AI Explainability 360, and the Adversarial Robustness Toolbox It is also why IBM is working with other members of Linux Foundation AI's Trusted AI committee to move forward
open standards in this area so that we can all move more quickly to trusted AI Now if you’d like to hear more on this topic, my colleague, Animesh Singh, will be giving a talk this afternoon at 1:40 on trusted AI for the full 40 minute session Also I’d like to give a quick shout out to my other co-workers from CODAIT who have come down here to show you cool open source demos at the IBM booth That’s booth 201 Also check out our websites, and On behalf of IBM, I’d like to welcome you all to TensorFlow World Enjoy the conference Thank you [APPLAUSE] THEODORE SUMME: Hi, I’m Ted Summe from Twitter Before I get started with my conversation today, I want to do a quick plug for Twitter What’s great about events like this is you get to hear people like Jeff Dean talk And you also get to hear from colleagues and people in the industry that are facing similar challenges as you and have conversations around developments in data science and machine learning But what’s great is that’s actually available every day on Twitter Twitter’s phenomenal for conversation on data science and machine learning People like Jeff Dean and other thought leaders are constantly sharing their thoughts and their developments And you can follow that conversation and engage in it And not only that, but you can bring that conversation back to your workplace and come off looking like a hero– just something to consider So without that shameless plug, my name’s Ted Summe I lead product for Cortex Cortex is Twitter’s central machine learning organization If you have any questions for me or the team, feel free to connect with me on Twitter, and we can follow up later So before we get into how we’re accelerating ML at Twitter, let’s talk a little bit about how we’re even using ML at Twitter Twitter is largely organized against three customer needs, the first of which is our health initiative That might be a little bit confusing to you You might think of it as user safety But we think about it as improving 
the health of conversations on Twitter And machine learning is already in use here We use it to detect spam We can algorithmically and at scale detect spam and protect our users from it Similarly, in the abuse space, we can proactively flag content as potentially abusive, toss it up for human review, and act on it before our users even get impacted by it A third space where we're using machine learning here is something called NSFW, Not Safe For Work I think you're all familiar with the acronym So how can we, at scale, identify this content and handle it accordingly? Another use of machine learning in this space There's more that we want to do here, and there's more that we're already doing Similarly, the consumer organization– this is largely what you think of, the big blue app of Twitter And here, the customer job that we're serving is helping connect our customers with the conversations

on Twitter that they’re interested in And one of the primary veins in which we do this is our timeline Our timeline today is ranked So if you’re not familiar, users follow accounts Content and tweets associated with those accounts get funneled into a central feed And we rank that based on your past engagement and interest to make sure we bring forth the most relevant conversations for you Now, there’s lots of conversations on Twitter, and you’re not following everyone And so there’s also a job that we have to serve about bringing forth all the conversations that you’re not proactively following, but are still relevant to you This has surfaced in our Recommendations product, which uses machine learning to scan the corpus of content on Twitter, and identify what conversations would be most interesting to you, and push it to you in a notification The inverse of that is when you know what the topics you want to explore are, but you’re looking for the conversations around that That’s where we use Twitter Search This is another surface area in the big blue app that we’re using machine learning The third job to be done for our customers is helping connect brands with their customers You might think of this as the ads product And this is actually the OG of machine learning at Twitter, the first team that implemented it And here, we use it for what you might expect, ads ranking That’s kind of like the timeline ranking, but instead of tweets, it’s ads and identifying the most relevant ads for our users And as signals to go into that, we also do user targeting to understand your past engagement ads, understand which ads are in your interest space And the third– oh Yeah, we’re still good And the third is brand safety You might not think about this when you think about machine learning and advertising But if you’re a company like United and you want to advertise on Twitter, you want to make sure that your ad never shows up next to a tweet about a plane crash So how do we, at 
scale, protect our brands from those off-brand conversations? We use machine learning for this as well So as you can tell, machine learning is a big part of all of these organizations today And where we have shared interests and shared investment, we want to make sure we have a shared organization that serves that And that’s the need for Cortex Cortex is Twitter’s central machine learning team, and our purpose is really quite simple– to enable Twitter with ethical and advanced AI And to serve that purpose, we’ve organized in three ways The first is our applied research group This group applies the most advanced ML techniques from industry and research to our most important surface areas, whether they be new initiatives or existing places This team you can kind of think of as like an internal task force or consultancy that we can redeploy against the company’s top initiatives The second is signals When using machine learning, having shared data assets that are broadly useful can provide us more leverage Examples of this would be our language understanding team that looks at tweets and identifies named entities inside them Those can then be offered up as features for other teams to consume in their own applications of machine learning Similarly, our media understanding team looks at images and can create a fingerprint of any image And therefore, we can identify every use of that image across the platform These are examples of shared signals that we’re producing that can be used for machine learning at scale inside the company And the third organization is our platform team And this is really the origins of Cortex Here, we provide tools and infrastructure to accelerate ML development at Twitter, increase the velocity of our ML practitioners And this is really the focus of the conversation today When we set out to build this ML platform, we decided we wanted a shared ML platform across all of Twitter And why is that important that it be shared across all of Twitter? 
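The image-fingerprinting signal mentioned above can be approximated with a classic perceptual-hashing technique. The sketch below is a hedged, stdlib-only illustration (an "average hash" over a toy grayscale grid); it is not Twitter's actual media-understanding system, and the function names are made up for this example:

```python
def average_hash(pixels):
    """Fingerprint a small grayscale image (a list of rows of 0-255
    values) as one integer: each bit records whether a pixel is
    brighter than the image's mean brightness."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits

def hamming(a, b):
    """Count differing bits between two fingerprints; near-duplicate
    images yield small distances even after mild re-encoding."""
    return bin(a ^ b).count("1")

# Two copies of the same tiny image, one slightly re-encoded:
original = [[200, 10], [10, 200]]
reencoded = [[198, 12], [11, 201]]
assert hamming(average_hash(original), average_hash(reencoded)) == 0
```

Because the hash depends only on coarse brightness structure, every upload of the same image lands on the same (or a nearby) fingerprint, which is what makes platform-wide lookup of an image feasible at scale.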
Well, we want transferability We want the great work being done in the ads team to be, where possible, transferable to benefit the health initiative where that's relevant And similarly, if we have great talent in the consumer team that's interested in moving to the ads team, if they're on the same platform, they can transfer without friction and be able to ramp up quickly So we set out with this goal of having a shared ML platform across all of Twitter And when we did that, we looked at a couple product requirements First, it needs to be scalable It needs to be able to operate at Twitter scale Second, it needs to be adaptable This space is developing quickly, so we need a platform that can evolve with data science and machine learning developments Third is the talent pool We want to make sure that we have a development environment at Twitter that appeals to the ML researchers and engineers that we're hiring and developing Fourth is the ecosystem We want to be able to lean on the partners that are developing industry-leading tools so that we can focus on technologies that are Twitter specific Fifth is documentation We want to be able to quickly unblock our practitioners as they hit issues, which is inevitable in any platform And finally, usability We want to remove friction and frustration

from the lives of our team, so that they can focus on delivering value for our end customers So considering these product requirements, let's see how TensorFlow has done against them First is scalability We validated this by putting TensorFlow, by way of our implementation we call DeepBird, up against timeline ranking So every tweet that's ranked in the timeline today runs through TensorFlow So we can consider that test validated Second is adaptability The novel architectures that TensorFlow can support, as well as the custom loss functions, allow us to react to the latest research and employ that inside the company An example that we published on this publicly is our use of a SplitNet architecture in ads ranking So TensorFlow has been very adaptable for us Third is the talent pool, and we think about the talent pool in kind of two types There's the ML engineer and the ML researcher And as a proxy for these audiences, we looked at the GitHub data on these And clearly, TensorFlow is widely adopted amongst ML engineers And similarly, the arXiv community shows strong evidence of wide adoption in the academic community On top of this proxy data, we also have anecdotal evidence of the speed of ramp-up for ML researchers and ML engineers inside the company The fourth is the ecosystem Whether it's TensorBoard, TF Data Validation, TF Model Analysis, TF Metastore, TF Hub, TFX Pipelines, there's a slew of these products out there, and they're phenomenal They allow us to focus on developing tools and infrastructure that is specific to Twitter's needs and lean on the great work of others So we're really grateful for this, and TensorFlow does great here Fifth being documentation Now, this is what you see when you go to the TensorFlow site, that phenomenal documentation, as well as great education resources But what you might not appreciate, and what we've come to really appreciate, is the value of the user-generated content What Stack Overflow and other platforms can
provide in terms of user-generated content is almost as valuable as anything TensorFlow itself can create And so TensorFlow, given its widespread adoption and its great website, has provided phenomenal documentation for ML practitioners Finally, usability And this is why we're really excited about TensorFlow 2.0 The orientation around the Keras API makes it more user friendly It also still continues to allow for flexibility for more advanced users Eager execution enables more rapid and intuitive debugging, and it closes the gap between ML engineers and modelers So clearly from this checklist, we're pretty happy with our engagement with TensorFlow And we're excited about continuing to develop the platform with them and push the limits on what it can do, with gratitude to the community for their participation and involvement in the product; we appreciate their conversation on Twitter as we advance it So if you have any questions for me, as I said before, you can connect with me, but I'm not alone here today A bunch of my colleagues are here as well So if you see them roaming the halls, feel free to engage with them Or as I shared before, you can continue the conversation on Twitter Here are their handles Thank you for your time Cheers [APPLAUSE] CRAIG WILEY: I just want to begin by saying I've been dabbling in Cloud AI and Cloud Machine Learning for a while And during that time, it never occurred to me that we'd be able to come out with something like we did today because this is only possible because Google Cloud and TensorFlow can collaborate unbelievably closely together within Google So to begin, let's talk a little bit about TensorFlow– 46 million downloads TensorFlow has seen massive growth over the last few years It's expanded from the forefront of research, which we've seen earlier this morning, to businesses taking it on as a dependency for their business to operate on a day in, day out basis It's a super exciting piece As someone who spends most
all of their time thinking about how we can bring AI and machine learning into businesses, seeing TensorFlow’s commitment and focus on deploying actual ML in production is super exciting to me With this growth, though, comes growing pains And part of that is things like support, right? When my model doesn’t do what I expected it to or my training job fails, what options do I have? And how well does your boss respond when you say, hey, yes, I don’t know why my model’s not training, but not to worry, I’ve put a question on Slack And hopefully, someone will get back to me We understand that businesses who are taking a bet on TensorFlow as a critical piece

of their hardware architecture or their stack need more than this Second, it can be a challenge to unlock the scale and performance of cloud For those of you who, like me, have gone through this journey over the last couple of years, for me, it started on my laptop Right? And then eventually, I outgrew my laptop, and so I had a gaming rig under my desk, right? With the GPU And eventually, there were eight gaming rigs under my desk And when you opened the door to my office, the whole floor knew because it sounded like [INAUDIBLE] Right? But now with today's cloud, that doesn't have to be the case You can go from that single instance all the way up to a massive scale seamlessly So with that, today, we bring you TensorFlow Enterprise TensorFlow Enterprise is designed to do three things– one, give you Enterprise grade support; two, cloud scale performance; and three, managed services when and where you want them, at the abstraction level you want them Enterprise grade support, what does that mean? Fundamentally what that means is that as these businesses take a bet on TensorFlow, many of these businesses have IT policies or requirements that the software have a certain longevity before they're willing to commit to it in production And so today, for certain versions of TensorFlow, when used on Google Cloud, we will extend that one year of support to a full three years That means that if you're building models on 1.15 today, you can know that for the next three years, you'll get bug fixes and security patches when and where you need them Simple and scalable Scaling from an idea on a single node to production at massive scale can be daunting, right?
Saying to my boss, hey, I took a sample of the data, was something that previously seemed totally reasonable But now we're asked to train on the entire corpus of data, and that can take days or weeks We can help with all of that by deploying TensorFlow on Google Cloud, a network that's been running TensorFlow successfully for years and has been highly optimized for this purpose So it's scalable across our world-class architecture, and the products are compatibility tested with the cloud and performance optimized for the cloud and for Google's world-class infrastructure What does this mean? If any of you have ever had the opportunity to use BigQuery, BigQuery is Google Cloud's massively parallel cloud-hosted data warehouse And by the way, if you haven't tried using BigQuery, I highly recommend going out and trying it It returns results faster than can be imagined That speed in BigQuery, we wanted to make sure we were taking full advantage of it And so recent changes included in TensorFlow Enterprise have increased the speed of the connection between the data warehouse and TensorFlow by three times Right?
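The connector speedup above is fundamentally about input-pipeline throughput: reading rows out of the warehouse in batches amortizes per-request overhead instead of fetching one record at a time. As a hedged, stdlib-only illustration of that batching pattern (a plain-Python stand-in, not the actual TensorFlow Enterprise BigQuery reader, whose API is not shown here):

```python
def batched(records, batch_size):
    """Group a record stream into fixed-size batches, the shape most
    training input pipelines consume; batching amortizes per-request
    overhead against the data warehouse."""
    batch = []
    for rec in records:
        batch.append(rec)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final partial batch
        yield batch

rows = range(10)  # stand-in for rows streamed out of a warehouse
assert list(batched(rows, 4)) == [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

In a real pipeline the same shape appears as a batched dataset handed to the trainer, with the connector doing the parallel, batched reads on your behalf.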
Now, all of a sudden, those jobs that were taking days take hours Unity, a wonderful customer and partner of ours You can see the quote here Unity leverages these aspects of TensorFlow Enterprise in their business Their monetization products reach more than three billion devices– three billion devices worldwide Game developers rely on a mix of scale and products to drive installs and revenue and player engagement And Unity needs to be able to quickly test, build, scale, and deploy models, all at massive scale This allows them to serve up the best results for their developers and their advertisers Managed services As I said, TensorFlow Enterprise will be available on Google Cloud and will be available as part of Google Cloud's AI Platform It will also be available in VMs if you'd prefer that, or in containers if you want to run them on Google Kubernetes Engine, or using Kubeflow on Kubernetes Engine In summary, TensorFlow Enterprise

offers Enterprise grade support– that continuation, that full three years of support that IT departments are accustomed to– cloud scale performance so that you can run at massive scale, and works seamlessly with our managed services And all of this is free and fully included for all Google Cloud users Google Cloud becomes the best place to run TensorFlow But there's one last piece, which is for companies for whom AI is their business– not companies for whom AI might help with this part of their business or that, or might help optimize this campaign or this backend system, but for companies where AI is their business, right? Where they're running hundreds of thousands of hours of training a year, petabytes of data, right? Using cutting edge models to meet their unique requirements, we are introducing TensorFlow Enterprise with white-glove support This is really for cutting edge AI, right? Engineering-to-engineering assistance when needed Close collaboration across Google allows us to fix bugs faster if needed One of the great opportunities of working in cloud– if you ask my kids, they'll tell you that the reason I work in cloud AI and in kind of machine learning is in an effort to keep them from ever learning to drive They're eight and 10 years old, so I need people to kind of hurry along this route, if you will But one of the customers and partners we have is Cruise Automation And you can see here, they're a shining example of the work we're doing On their quest towards self-driving cars, they've also experienced hiccups and challenges and scaling problems And we've been a critical partner for them in helping ensure that they can achieve the results they need to, to solve this kind of generation-defining problem of autonomous vehicles You can see not only did we improve the accuracy of their models, but also reduce training times from four days down to one day This allows them to iterate at speeds previously unthinkable So none of this, as I said, would have been
possible without the close collaboration between Google Cloud and TensorFlow I look back on Megan's recent announcement of We will be looking at bringing that type of functionality into an enterprise environment as well in the coming months But we're really, really excited to get TensorFlow Enterprise into your hands today To learn more and get started, you can go to the link, as well as sessions later today And if you are on the cutting edge of AI, we are accepting applications for the white-glove service as well We're excited to bring this offering to teams We're excited to bring this offering to businesses that want to move into a place where machine learning is increasingly a part of how they create value Thank you very much for your time today KEMAL EL MOUJAHID: Hi, my name is Kemal I'm the product director for TensorFlow So earlier, you heard from Jeff and Megan about the product direction Now, what I'd like to talk about is the most important part of what we're building, and that's the community That's you Sorry Where's the slide? Thank you So as you've seen in the video, we've got a great roadshow, 11 events spanning five continents to connect the community with the TensorFlow team I, personally, was very lucky this summer, because I got to travel to Morocco and Ghana and Shanghai, amongst other places, just to meet the community, and to listen to your feedback And we heard a lot of great things So as we're thinking about, how can we best help the community? It really came down to three things First, we would like to help you to connect with the larger community, and to share the latest and greatest of what you've been building Then, we also want to help you learn, learn about ML, learn about TensorFlow And then, we want to help you contribute and give back to the community So let's start with Connect So why connect? Well, first, the TensorFlow community has really grown a lot It's huge– 46 million downloads, 2,100 committers,

and– again, I know that we’ve been saying that all along, but I really want to say a huge thank you on behalf of the TensorFlow team for making the community what it is today Another aspect of the community that we’re very proud of is that it’s truly global This is a revised map of our GitHub stars And, as you can see, we’re covering all time zones and we keep growing So the community is huge It’s truly global And we really want to think about, how can we bring the community closer together? And this is really what initiated the idea of TensorFlow World We wanted to create an event for you We wanted an event where you could come up and connect with the rest of the community, and share what you’ve been working on And this has actually started organically Seven months ago, the TensorFlow User Groups started, and I think now we have close to 50 The largest one is in Korea It has 46,000 members We have 50 in China So if you’re in the audience or in the livestream, and you’re looking to this map, and you’re thinking, wait, I don’t see a dot where I live– and you have a TensorFlow member that you’re connecting with, and you want to start a TensorFlow User Group– well, we’d like to help you So please go to, and we’ll help you get it started So that next year, when we look at this map, we have dots all over the place So what about businesses? We’ve talked about developers What about businesses? One thing we heard from businesses is they have this business problem They think ML can help them, but they’re not sure how And that’s a huge missed opportunity when we look at the staggering $13 trillion that AI will bring to the global economy over the next decade So you have those businesses on one side, and then you have partners on the other side, who know about ML, they know how to use TensorFlow, so how do we connect those two? 
Well, this was the inspiration for launching our Trusted Partner Pilot Program, which helps you, as a business, connect to a partner who will help you solve your ML problem So if you go on, you'll find more about our Trusted Partner program Just a couple of examples of cool things that they've been working on One partner helped a car insurance company shorten the insurance claim processing time using image processing techniques Another partner helped a global med tech company by automating the shipping labeling process using object recognition techniques And you'll hear more from these partners later today I encourage you to go check out their talks Another aspect is that if you're a partner, and you're interested in getting into this program, we also would like to hear from you So let's talk about Learn We've invested a lot in producing quality material to help you learn about ML and about TensorFlow One thing that we did over the summer, which was very exciting: for the first time, we were part of the Google Summer of Code We had a lot of interest We were able to select 20 very talented students, and they got to work the whole summer with amazing mentors on the TensorFlow engineering team And they worked on very inspiring projects, going from 2.0 to Swift to JS to TF-Agents So we were so excited with the success of this program that we decided to participate, for the first time, in the Google Code-in program This is the same program, but for pre-university students from 13 to 17 It's a global online contest And it introduces teenagers to the world of contributing to open source development So as I mentioned, we've invested a lot this year in ML education material, but one thing we heard is that there's a lot of different things, and what you want is to be guided through pathways of learning So we've worked hard on that, and we've decided to announce the new Learn ML page And what this is is a learning path curated for you by the TensorFlow team, organized by
level So you have everything from beginner to advanced You can explore books, courses, and videos to help you improve your knowledge of machine learning, and use that knowledge, and TensorFlow, to solve your real-world problems And for more exciting news that will be available on the website, I'd like to play a brief video by a friend, Andrew Ng [VIDEO PLAYBACK] – Hi, everyone I'm in New York right now, and wish I could be there to enjoy the conference

But I want to share with you some exciting updates started a partnership with the TensorFlow team with a goal of making world-class education available for developers on the Coursera platform Since releasing the Deep Learning Specialization, I’ve seen so many of you, hundreds of thousands, learn the fundamental skills of deep learning I’m delighted we’ve been able to complement that with the TensorFlow in Practice Specialization to help developers learn how to build ML applications for computer vision, NLP, sequence models, and more Today, I want to share with you an exciting new project that the and TensorFlow teams have been working on together Being able to use your models in a real-world scenario is when machine learning gets particularly exciting So we’re producing a new four-course specialization called TensorFlow Data and Deployment that will let you take your ML skills to the real world, deploying models to the web, mobile devices, and more It will be available on Coursera in early December I’m excited to see what you do with these new resources Keep learning [END PLAYBACK] KEMAL EL MOUJAHID: All right This is really cool Since we started working on these programs, it’s been pretty amazing to see hundreds of thousands of people take those courses And the goal of these educational resources is to let everyone participate in the ML revolution, regardless of what your experience with machine learning is And now, Contribute So a great way to get involved is to connect with your GDE We now have 126 machine learning GDEs globally We love our GDEs They’re amazing They do amazing things for the community This year alone, they gave over 400 tech talks, 250 workshops They wrote 221 articles reaching tens of thousands of developers And one thing that was new this year is that they helped with doc sprints So docs are really important They’re critical, right? 
You really need good quality docs to work on machine learning, and often the documentation is not available in people's native languages And so this is why, when we partnered with our GDEs, we launched the doc sprints Over 9,000 API docs were updated by members of the TensorFlow community in over 15 countries We heard amazing stories of power outages, and people coming back later to finish a doc sprint, and actually writing docs on their phones So if you've been helping with docs, thank you, if you're in the room, if you're on the livestream, thank you so much If you're interested in helping translate documentation into your native language, please reach out, and we'll help you organize a doc sprint Another thing that the GDEs help with is experimenting with the latest features So I want to call out Sam Witteveen, an ML GDE from Singapore, who's already experimenting with 2.x on TPUs, and you can hear him talk later today to hear about his experience So if you want to get involved, please reach out to your GDE and start working on TensorFlow Another really great way to help is to join a SIG A SIG is a Special Interest Group, and it helps you work on the things that you're the most excited about in TensorFlow We have, now, 11 SIGs available Addons, IO, and Networking, in particular, really supported the transition to 2.0 by embracing the parts of contrib and putting them into 2.0 And SIG Build ensures that TF runs well everywhere, on any OS, any architecture, and plays well with the Python ecosystem And we have many other really exciting SIGs, so I really encourage you to join one Another really great way to contribute is through competition And for those of you who were there at the Dev Summit back in March, we launched our 2.0 challenge on DevPost And the grand prize was an invitation to this event, TensorFlow World And so we would like to honor our 2.0 Challenge winners, and I think we are lucky to have two of them in the room– Victor and Kyle, if
you’re here [APPLAUSE] So Victor worked on Handtrack.js, a library for prototyping hand gestures in the browser And then Kyle worked on a Python 3 package

to generate N-body simulations So one thing we heard, too, during our travels is, oh, that hackathon was great, but I totally missed it Can we have another one? Well, yes Let's do another one So if you go on, we're launching a new challenge You can apply your 2.0 skills and share the latest and greatest, and win cool prizes So we're really excited to see what you're going to build Another great community that we're very excited to partner with is Kaggle So we've launched a contest on Kaggle to challenge you with a question-answering model based on Wikipedia articles You can put your natural language processing skills to the test and earn $50,000 in prizes It's open for entry until January 22, so best of luck So we have a few action items for you, and they're listed on this slide But remember, we created TensorFlow World for you, to help you connect and share what you've been working on So our main action item for you in the next two days is really to get to know the community better And with that, I'd like to thank you, and I hope you enjoy the rest of TF World Thank you [APPLAUSE]

Jupyter Meets the Earth – Community Forum

FERNANDO PÉREZ: Good morning, and welcome, everyone, to our short Jupyter Meets the Earth EarthCube workshop. We first want to thank the EarthCube organizers, in particular Lynne Schreiber, who has been extremely helpful in getting us all set up, and I also want to thank the team, in particular Lindsey Heagy, who has done a huge amount of work on the logistics and the material to make this a successful workshop. We're going to try to give you an overview of what this team's project has done, and hopefully we'll have a chance to talk with you afterwards.

First, the usual Zoom etiquette that by now everyone is probably used to: please stay muted unless you're speaking, so we don't get too much background noise, and remember that we are recording, so if you would rather not be on camera, you can leave your camera off. The link on the slide is for a Slack channel that you should all have been invited to; if for some reason you weren't, you can go to that URL and join, and you can post discussion questions there throughout the workshop. We'll do our best to monitor that channel and respond in the extended Q&A session we have later on. For those of you who haven't used Slack much, this is what it looks like in the web client: the channel we are using is called jupyter-meets-the-earth, which you'll see on the left; you type your chat messages at the bottom; and if you click on the push-pin icon, you'll see useful links and resources that get posted appear on the right-hand side.

Let us tell you a little about the motivation driving this project. Here we're taking very strong cues from Joe Hamman, Ryan Abernathey, and others, who used this framing in their design of the Pangeo project, and it's something we find a particularly valuable way to look at the problem: think about what drives progress in the sciences in general, though some of this is specific to the geosciences. The geosciences in principle ought to be a virtuous feedback cycle between the development of theoretical ideas and models, observations and data, and computation. In many areas we have ODEs and PDEs that we consider to be the fundamental physical driving mechanisms of geophysical processes: atmospheric processes, fluid processes, geological processes, described by fundamentally mathematical models. We know they're imperfect and approximate and leave processes out, but they still form one of the cornerstones of our description of the natural world. We combine those with observations and data that feed into them, and we try to incarnate those models via simulations and computational techniques that are becoming increasingly sophisticated.

The volumes of data we have are larger and larger, and the models are increasingly complex: today they are multi-scale, we pay much more attention to nonlinear terms, and we incorporate noisy stochastic terms. The size of the observations and data is becoming absolutely astronomical; that should be news to nobody here. As a quick example, the data set for CMIP6, the Coupled Model Intercomparison Project, in its current iteration is estimated to be on the order of 15 to 30 petabytes. That is a lot of data, and virtually everyone dealing with experimental data today is dealing with rates on the order of terabytes a day, and with data that is very complex as well: it's not just a single sensor anymore, and you typically need to integrate data of multiple kinds. To make sense of all of that, you need very complex computational machinery today. We're talking about exascale computing at the national labs and all the HPC centers in the world; a lot of folks are doing their work in the cloud, but that requires its own set of software engineering and DevOps skills; and machine learning is now widely used, which is basically its own industry with its own engineering complexities. So this virtuous cycle is one that has gotten gummed up, as Joe and Ryan and others framed it in their Pangeo discussions: the gears of the engine are starting to grind.

Part of it is complexity: there is a huge amount of complexity in integrating all of this. We're well past the stage where a single scientist could understand the theory and the physical models, gather or get access to some data, and run their own code on their workstation with a little Fortran they wrote themselves. That might have worked in the 70s, or even the 90s; it's completely infeasible today. So our framing is: how do we get these gears turning effectively again? If we want to focus on geoscience questions, we need to bring together teams, because it's not going to be possible as an individual, with expertise in the domain; expertise in data science, statistical methods, analysis questions, and the protocols of good statistical data analysis; data management, which has effectively become its own complex discipline of data engineering; and a huge amount of software engineering and tool building.

The Jupyter Meets the Earth project is, in part, an attempt to bring the software engineering aspect in as a first-class partner to this world. It was funded by the NSF through a grant proposal that we wrote together with the co-PIs on the team: Kevin Paul, Joe Hamman, Laurel Larsen, Lindsey Heagy, and myself. It was funded through the EarthCube program, and the idea was to argue for a real partnership between software engineering and the domain sciences. A lot of federally funded projects have followed a familiar pattern: either a very strong focus on cyberinfrastructure and computer science research, with, if I can be slightly provocative, a little lip service paid to the notion that the computer science research will engage with some domain topics; or, on the other hand, projects funded purely on a domain-emphasis question, with a wink-wink nudge-nudge that somebody, somehow, will write the software. Instead of either of those approaches, which we feel are a bit imbalanced, we chose a real partnership between the open-source software tools we're building and the domain scientists.

A lot of this comes from our experience, at least for myself, in how we've built Project Jupyter, a long-running open-source software project. But what I want to emphasize for a moment is that this project at this point is much more than software. Yes, there is a focus on code, and when people think about open source, the operative word is obviously "software," importantly software that should be extensible and reusable by others. But for a project to have a large impact in a community, and to live beyond the original intent of its first authors, it needs to consider a layered set of questions that go far beyond software. These layers, in a nod to the classic Maslow hierarchy we have here, include services and content presented with the software; standards and protocols, ways to interoperate and build an ecosystem; and a human community that needs to be managed.

I want to spend a couple of minutes highlighting some of these points from the perspective of Project Jupyter. Some of you may know Jupyter, or IPython, its predecessor, which began its life as a simple interactive environment for experimenting with Python code in data analysis and scientific computing workflows. The Jupyter Notebook, originally named the IPython Notebook, which many of you have probably used, is an environment that allows you to combine text, code, and the results of code execution, accessible through a web browser. But what has really brought Jupyter to a very large scientific community is the fact that, on top of the software you can download, the project provides content and services that people use. Binder is a service that lets you turn any Git repository, if it's properly prepared, into a collection of live, interactive notebooks with one click. nbviewer is a tool that lets you take a publicly available notebook and provide a web view of it to share with others, say a colleague who doesn't have these tools installed.
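To make the Binder mention concrete for readers: launching a repository on Binder is essentially a matter of constructing a URL. This is a minimal sketch; the repository name below is hypothetical, and the `mybinder.org/v2/gh/<owner>/<repo>/<ref>` pattern and `labpath` parameter reflect the public Binder convention at the time of writing.

```python
# Sketch: build a mybinder.org launch URL for a GitHub repository.
# The repo below is made up; Binder expects the repo to contain an
# environment spec (environment.yml, requirements.txt, etc.).
from urllib.parse import quote

def binder_url(owner, repo, ref="HEAD", notebook=None):
    """Return a mybinder.org launch URL for owner/repo at ref.

    If `notebook` is given, Binder opens that file directly via the
    `labpath` query parameter.
    """
    url = f"https://mybinder.org/v2/gh/{owner}/{repo}/{ref}"
    if notebook:
        url += "?labpath=" + quote(notebook, safe="")
    return url

print(binder_url("some-org", "demo-notebooks", "main", "intro.ipynb"))
```

Shared as a badge or link, that single URL is what turns a static repository into the "live, interactive notebooks" described above.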

You can also share a notebook on social media, or embed it as part of a website. And JupyterHub allows you to put all of these tools, accessible through a web browser, on shared infrastructure, whether that's a supercomputing center, a research cluster, or a cloud computing environment. The point is that these are not just things you download; they are services you can run, and many of them you can access for free. They have created a far larger impact than the original software we had written, larger even than we imagined at the beginning, because we realized that the entry point for many people is actually the content: it's what they want to read, what they want to access, what they want to share with others, rather than the tools themselves, as perhaps we tool makers tend to think.

And below the software, the software sits on core ideas. In the case of Jupyter, those ideas are fundamentally a computing protocol for doing this kind of interactive work: yes, in Python, which is what most of us use in scientific research, but also in languages like Julia and R, and ultimately in virtually any programming language. Today there are implementations of what is called the Jupyter protocol in over a hundred different programming languages, including things like C++. My point is that in Jupyter we took the time not only to write the software, but to take a step back and ask: what is the software doing, and what of it can we abstract and formalize into a standard that others can use for their own purposes, even if they don't use Python, our tool of choice? Making the time and taking the effort to engage with other communities to create that standard has been enormously valuable.
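To make the "protocol" point concrete: a Jupyter kernel and its clients exchange JSON messages with a fixed envelope (header, parent header, metadata, content). The sketch below builds an `execute_request`-shaped message as a plain dictionary; it follows the general shape of the Jupyter messaging spec, but treat the exact fields as illustrative rather than a complete implementation.

```python
# Sketch of the message envelope used by the Jupyter protocol.
# Real kernels exchange these as JSON over ZeroMQ sockets; here we
# just construct and serialize one message to show the structure.
import json
import uuid
from datetime import datetime, timezone

def execute_request(code):
    """Build an execute_request-style message (illustrative fields)."""
    return {
        "header": {
            "msg_id": str(uuid.uuid4()),
            "msg_type": "execute_request",
            "username": "demo",
            "session": str(uuid.uuid4()),
            "date": datetime.now(timezone.utc).isoformat(),
            "version": "5.3",          # protocol version, not Python's
        },
        "parent_header": {},            # empty for a fresh request
        "metadata": {},
        "content": {
            "code": code,               # the source to execute
            "silent": False,
        },
    }

msg = execute_request("1 + 1")
print(json.dumps(msg["content"]))
```

Because any language that can produce and consume this envelope can act as a kernel, the protocol, not the Python implementation, is what the hundred-plus language kernels share.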
There is now much greater interoperability across programming languages, regardless of which one you're using, and you can share content and tools back and forth across language communities.

And finally, as I mentioned, at the bottom of all of these tools are humans, and if you think about the role of open-source software in science, it's very important to think about these issues. Jupyter is a project where we've taken a lot of time, and we're currently finalizing a restructuring of our governance. Even though it began with IPython, which was originally me procrastinating on a Ph.D., today it's a huge project with many, many contributors. We have a formalized governance model that includes a body called the Steering Council, a formal process for institutional partners, and a 501(c)(3) called NumFOCUS that provides fiscal sponsorship for the project. This takes an enormous amount of time; it's not recognized as real work in academic settings, for the most part, and yet it is critical to maintain a healthy community that can grow, engage with different stakeholders, grow more diverse, and take on different use cases. It's effort that we're not well trained for and typically not recognized for, but it's critical to understand that it is necessary in the construction of a high-impact, long-lived scientific computing project based on open-source software.

And yes, we do build tools and software. JupyterLab is one tool that has been at the forefront of the work in Jupyter over the last few years, and I want to spend a couple of minutes flagging its extensibility, because that's part of what we want to engage this community about. JupyterLab is basically an evolution of the Jupyter Notebook interface that goes well beyond notebooks, to consider, for example, the fact that data should be a first-class citizen. This slide shows the JupyterLab interface viewing not a notebook but different data files: an image file, a CSV file. And importantly, extensibility means that if we have, say, a JSON file that actually encodes geospatial data, in this case the locations of museums in Washington, DC, and the file honors the GeoJSON schema, then it can be visualized with a plugin as a live, pannable map using Leaflet. The point is that in JupyterLab, the community can write its own plugins that treat data as a first-class citizen, in the way that is most appropriate to a particular data format or data modality. This kind of extensibility has been taken advantage of very successfully by a community of neuroscientists at Columbia University, in the group led by Aurel Lazar, and I want to show an extremely short video, if it works, where you'll see a JupyterLab interface that has a notebook on the left, but also a WebGL 3D view.
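The GeoJSON example above is easy to reproduce, since the file is just JSON following a small schema. A minimal sketch, with invented museum names and coordinates:

```python
# Sketch: a minimal GeoJSON FeatureCollection, built as plain Python
# data and serialized with the standard library. The museum names and
# coordinates are made up for illustration.
import json

def point_feature(name, lon, lat):
    """Wrap a single point of interest as a GeoJSON Feature."""
    return {
        "type": "Feature",
        # GeoJSON coordinate order is [longitude, latitude].
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {"name": name},
    }

museums = {
    "Example Museum A": (-77.026, 38.891),
    "Example Museum B": (-77.020, 38.888),
}

collection = {
    "type": "FeatureCollection",
    "features": [point_feature(n, lon, lat)
                 for n, (lon, lat) in museums.items()],
}

# Saved as e.g. museums.geojson, a file like this is what a
# Leaflet-based map plugin can render directly as a live map.
print(json.dumps(collection)[:60])
```

The plugin dispatches on the file's format, so any data honoring the schema gets the map view for free.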

The WebGL panel shows the fruit fly brain. Neuronal circuits in the fruit fly are simulated as electrical circuits, and those are viewed in the lower-right panel; the circuit simulations run on a GPU cluster, and genomics data about the fruit fly, pulled from a custom database, is visible in the panel on the right. This is still the same JupyterLab that you install, and you can see the notebook right in the middle, but that team built a custom interface, adding their own plugins and tools, effectively turning the default, generic JupyterLab interface into a custom interface for studying the data most prominent in their work. We want to use this as a motivation to inspire all of you: we're building some of these things for our own use cases, and we want all of you to think of JupyterLab as a tool that you can extend and mold to your own scientific needs.

Pangeo, and I realize I'm already running late, so I'm going to finish rather quickly, is the other leg of this project; the project is a collaboration between the Jupyter team and the Pangeo team. Pangeo is the union of Jupyter, for interactive computing, with the Dask system for high-level distributed computing, along with the xarray project, which takes something like traditional NumPy arrays and blends them with the netCDF data model, to build a platform where scientists can use these tools interactively to do very large-scale data analysis. Here is a quick example from a blog post by Scott Henderson: a scientist zooms into what looks like a small data set, a figure with some colorbars wiggling at the bottom, and the image zooms. That seems like no big deal, but what's happening is that the zoom requires running over 100 gigabytes of Landsat data covering the state of Washington through a big distributed computing cluster. By doing this on Pangeo, the scientist can just log in with a browser and do the zooming.
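What makes that kind of interactive zoom feasible is that the array is split into chunks, and a scheduler fans the per-chunk work out to workers and combines the partial results, which is essentially what Dask does underneath Pangeo. The toy sketch below imitates the idea with the standard library only, with a thread pool standing in for a real cluster and lists of numbers standing in for tiles; it is not Dask's API.

```python
# Toy sketch of chunked, scheduled computation, standard library only.
# A real Pangeo stack would use dask + xarray; here a thread pool
# stands in for the cluster and small lists stand in for image tiles.
from concurrent.futures import ThreadPoolExecutor

# Pretend each chunk is one tile of a much larger satellite mosaic.
chunks = [list(range(i, i + 10)) for i in range(0, 100, 10)]

def partial_sum_and_count(chunk):
    """Per-chunk work that a single worker would do."""
    return sum(chunk), len(chunk)

with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(partial_sum_and_count, chunks))

# Combine the partial results, as a scheduler would after a map step.
total = sum(s for s, _ in partials)
count = sum(c for _, c in partials)
print("mean =", total / count)   # prints mean = 49.5 (mean of 0..99)
```

The scientist only ever writes the equivalent of `.mean()`; the fan-out, scheduling, and recombination happen behind the scenes.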
Pangeo orchestrates that cluster and schedules the jobs automatically to make it happen, so the scientist can focus on their data exploration rather than effectively becoming an Amazon or Google cloud software engineer. So Pangeo is a project that, by joining these tools, tries to make interactive data analysis in the cloud, on analysis-ready data, a reality. We've left two links here to talks by Ryan Abernathey and Joe Hamman that tell you a lot more about Pangeo; we could spend the whole day talking only about Pangeo and the impact it has had, and Joe will talk a little more later about how you can get involved with it.

So in this project, as I said earlier, our perspective is to join research use cases in four specific areas, climate data analysis, specifically the CMIP6 data; cryosphere science; hydrology; and geophysics, to drive developments in the Pangeo and Jupyter ecosystems, especially around interactivity, data discovery, and infrastructure, both in the cloud and on HPC. And this is the team of scientists and developers working on the project. So today what we hope to do is give you an overview of these projects, present avenues for all of you to get involved, and have time to discuss with you on Slack and in the Q&A session, and hopefully to remain engaged afterwards, so that we understand your needs and questions better and get better ideas from all of you, so this project can be as impactful as possible, with these four areas being only the scientific focus of our team, and our intent being a much broader impact.

I'm already five minutes late, but we do have a little bit of slack in the schedule, and I'm going to try to end quickly. We're going to continue with Scott Henderson's overview of Pangeo, then a talk by Kevin Paul; we'll have a short break at 8:30, then a sequence of five-minute lightning talks, and we'll conclude with a longer session of community Q&A. We'll probably lose a few minutes of that, but we knew that would be the case. Just to remind those of you who may have come in a little later: if you're not on Slack, this is the URL to join the conversation, ask us questions, and so on. I'll stop here, and we'll continue with Scott Henderson right away; I'll stop my sharing so that Scott can hop on.

SCOTT HENDERSON: All right, just a second, everyone. Good morning, everyone; can people see my screen now?

AUDIENCE: Yes, looks good.

SCOTT HENDERSON: OK, great.

I'm just going to jump right into it to keep us on schedule. I'm Scott Henderson, a research scientist at the University of Washington; I work at the eScience Institute and in the Department of Earth and Space Sciences. In the next ten minutes I'm going to try to give you a bit more detail, jumping off what Fernando mentioned, about the Pangeo project, which I've been involved in for the last year or so.

If you're new to Pangeo, I highly recommend going to the website; this screenshot is taken straight from it, and you'll see right off the bat that Pangeo is a community platform for Big Data geoscience. If you read further down, the first statement, which I've copied verbatim here, is that Pangeo is first and foremost a community promoting open, reproducible, and scalable science. I really like that Maslow-style diagram Fernando showed, where community is at the bottom, foundational to all the other work happening in this project. There are a lot of venues, which we'll talk about later this afternoon, for getting involved in this community, and I've pointed arrows here at various online forums that are really central to bringing together scientists and software developers as part of this community effort.

Pangeo got started around early 2017 with EarthCube funding, and since then it has grown to involve a lot of other funding sources, bringing in scientists and software engineers from a bunch of different institutions. I got started working with this community through a NASA grant focused on developing capabilities for analyzing NASA earth observation data, which is starting to move to AWS for hosting.

Why is this effort happening now? As already stated, with growing archives and new technologies, there's a need for better, scalable tools for doing scientific computing with large data sets. For the case of NASA, the plot in the lower left shows the growth of NASA's archive over the years, with a pretty big step change in archive size due to some new satellites launching in the near future, one of them being NISAR, now postponed a bit, but slated for 2021. On the right you see the size of CMIP6 global climate model outputs. These data sets are becoming cumbersome to work with, and the agencies hosting them are starting to indicate the need to host them on central servers, potentially cloud providers, for improved access.

This move to cloud servers, and I'll focus on the NASA case for a minute here, is a big deal because it changes a lot of the typical workflow we're accustomed to in scientific computing. The schematic on the lower left is just a redrawing of the schematic we saw from Fernando, illustrating the envisioned architecture that the platform, or computing, aspect of Pangeo advocates. This platform is really centered on a JupyterHub system: a server running in the same data center where the large data sets are stored. In this case that's a cloud, but it could also be an HPC system. The idea is to give people interactive access to these large data sets without having to download them to their local computers: if we instead move the algorithms to the data, we improve on the current state of the art. On the right I've listed a few key aspects of this style of computing. Some benefits: we have instant access, and in the commercial cloud we often don't have to deal with queues; we can fire up as many computers as we want, on demand. We also democratize access, because people only need a web browser on their personal laptop to engage with these larger computing resources. Downloading, which is often a bottleneck for scientific workflows these days, is avoided. And we have scalable power and computing resources: we can plug GPUs into our workflow when we need them, and when we don't, we're not using them.

And by packaging everything up to run on these heterogeneous systems, such as commercial cloud providers, we tend to improve the reproducibility of workflows, because we're making the data sets accessible over networks and we're containerizing all the software that analyzes them. So that's the vision.

There are a lot of concerns around this vision, and I'm putting these points down here to spur some discussion later in the day. The cost model is unfamiliar to many scientists; people aren't used to cloud-based infrastructure, and there's a steep learning curve if you set up that infrastructure yourself. There's concern over commercial management of public data; these are things we've heard over the last year from scientists getting started with this project. And there's potential vendor lock-in: if you develop infrastructure that only runs on AWS, it's not very portable. But those advantages are really big, and I've taken a slide here from Chelle Gentemann's keynote at the ESIP meeting that happened just last week. There's a link to it here, and it's recorded; I highly recommend taking some time later in the day to look at some of the recordings from the ESIP meeting, which was really great. The ultimate goal of this rethinking of infrastructure is to reallocate scientists' time. In the traditional timeline at the top, 80 percent goes to figuring out where your data lives and doing all the work to organize it, with very little time at the end of the day for actually writing your paper. We really feel that with this cloud-based approach you can flip those time allocations.

The computing architecture, again, we saw from Fernando: the idea is to use Jupyter and to give people curated Python computing environments that facilitate distributed computing. There's been a large focus in the Pangeo project so far on Python in particular, using the libraries Dask and xarray to work with these large climate model data sets or large cubes of satellite imagery. Foundational to this architecture is storing the data in an appropriate format. We've been advocating a lot for the Zarr format, or cloud-optimized GeoTIFF; you need data sets stored in some sort of tiled fashion to facilitate distributed computing. And a lot of the work we've done so far has borrowed from Jupyter's guidelines on setting up JupyterHub on a Kubernetes system on cloud providers, which lets us run things on Google Cloud, AWS, Azure, different systems.

I'm going to skip over this slide; these are just some of the libraries we typically highlight in presentations. For this community I want to draw attention to the fact that the "platform" of Pangeo is really a collection of platforms, supported through cloud credits from providers like Google and Amazon. These are JupyterHubs that people can simply log into with a GitHub username and immediately have access to the Dask configuration for distributed computing. Again, as Fernando mentioned, this is the services component of what we're doing, and it has been hugely important for getting people up to speed on these software stacks and this style of computing. So we have a JupyterHub running on AWS and a JupyterHub running on Google Cloud, and we also have Binders set up on both of those systems, attached to Dask clusters for distributed computing. The best way to get familiar with how this works is to go to this website and try out some of the interactive examples: there's a tutorial for getting started, and then there are more sophisticated workflows of increasing complexity.
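The tiled-storage point is worth making concrete: formats like Zarr store an n-dimensional array as many small chunk objects with predictable keys, so a reader can fetch only the chunks that intersect the window it needs, and different workers can fetch different chunks in parallel. The sketch below computes Zarr-style chunk keys (e.g. "2.3") for a 2-D window using only the standard library; it mimics the naming convention rather than reading a real store.

```python
# Sketch: which chunks of a tiled 2-D array does a window touch?
# Zarr-style stores name 2-D chunks "row.col"; a reader only has to
# fetch the keys returned here, not the whole array.

def chunk_keys(window, chunk_shape):
    """Chunk keys intersecting `window` ((row0, row1), (col0, col1)),
    half-open index ranges, for chunks of `chunk_shape` (rows, cols)."""
    (r0, r1), (c0, c1) = window
    ch, cw = chunk_shape
    return [
        f"{i}.{j}"
        for i in range(r0 // ch, (r1 - 1) // ch + 1)
        for j in range(c0 // cw, (c1 - 1) // cw + 1)
    ]

# A 10000x10000 mosaic in 1000x1000 chunks: this zoom window touches
# only 4 of the 100 chunks, so ~4% of the data needs to be read.
print(chunk_keys(((1500, 2500), (3500, 4500)), (1000, 1000)))
```

This is why tiled layouts matter for both interactive zooming (read only what's on screen) and distributed computing (each worker takes a disjoint set of keys).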
The one other thing I want to touch on in this presentation is the role of hack weeks. At the University of Washington eScience Institute, we've been running week-long hack weeks over the past several years. These are really important for community training, for getting people up to speed on these Python tools; hack weeks are a welcoming environment designed to facilitate building a research community.

They are really intentional, well-designed events that get people working on developing software and contributing to open-source projects while creating connections in their research community. A typical hack week includes many community-building activities over the course of the week; hands-on tutorials, run on one of these Jupyter deployments we've put together; and project time to advance some research project. Just recently we hosted the ICESat-2 hack week, specific to a NASA ice-monitoring satellite that recently launched, as our first 100 percent virtual event, with over 80 participants; you'll see some familiar faces in the screenshot in the lower right. It was very successful, and an important aspect of that success was having a centralized JupyterHub environment that participants from all over the world could log into to share documents and work together. I'll point out that we've put a lot of effort over the last year into making these JupyterHubs deployable, so that other research groups and people putting on hack weeks can do this themselves. There's a schematic here of what the website looked like when people logged into this environment, and of how JupyterHub partitions machines among users when they log in. I put this slide in because part of the beauty of this project is that everything is out in the open from the ground up. There is a certain level of complexity in deploying and then maintaining these services on cloud providers, but we've been trying to lay it all out so that other groups can do it, and I'd love for people who are interested, and I suspect some are on this call, to get involved in deploying these systems yourselves. This is a nice blog post that Sebastian Alves, one of the co-Is on the project at the University of Washington, wrote on setting up this infrastructure for the ICESat-2 hack week.

I'll end with a couple of questions to, again, spur thoughts and discussion for later in the day, reflecting on the last year. We recently had to report to a NASA technology infusion workshop, and we were asked what the biggest challenge of this project has been so far. I put down two things in response. One is availability of data: this whole move to the cloud hinges on the data format, and the availability of data on the cloud is key to its success. The other is wariness over long-term costs: who supports these services when they're no longer covered by cloud credits? As for overcoming these challenges, we think hack weeks are really key to getting the community accustomed to cloud computing, but the long-term funding and support of these systems is still an unsolved problem. So thanks, everyone; those are my slides. Please check out the links in there in your spare time later today.

FERNANDO PÉREZ: Thank you so much, Scott, for that presentation. We have a few questions coming through on Slack; we encourage all of you to post your questions there, and we'll have a brief Q&A session right before the lightning talks. But we'll continue with Kevin Paul from NCAR next. Scott, thank you.

KEVIN PAUL: I think I need to be made a co-host.

FERNANDO PÉREZ: You should be good to go now.

KEVIN PAUL: OK, thank you very much. Where am I? There we go. All right, thank you, everybody. Thanks to the wonderful introductions by Fernando

and Scott, I think everybody is now fairly familiar with what Pangeo is, and with the fact that Pangeo has had a lot of success being deployed in the cloud. It's also been mentioned that Pangeo can be deployed on high-performance computing systems, such as NERSC and our supercomputing system at the National Center for Atmospheric Research, where I work with my colleague Anderson Banihirwe, who will give one of the lightning talks a little later.

Scott has already given a little introduction to this: deploying in the cloud has a lot of benefits, but you can also deploy Pangeo on HPC, and there are obvious differences in what that deployment looks like, stemming mostly from the fact that HPC has a number of limitations that don't necessarily exist in the cloud. Access is limited: it's very difficult to just spin up a server on an HPC system and convince your sysadmins to make it public to the internet. HPC is usually bare metal, so there's no virtualization, and usually not even containerization, although that's changing. HPC also limits resource access; this is connected to the bare-metal point, but it means the standard user just doesn't have sysadmin access, so you can't use things like Docker, which a lot of the Pangeo cloud stack is built on. And since you're not going to be using Docker or Kubernetes or anything like that, HPC uses job schedulers, which many people are familiar with, such as PBS, LSF, or Slurm, as the common way of launching large jobs and sharing resources.

But the goal of Pangeo on HPC is exactly the same as for Pangeo in the cloud, centered on the idea of a common user interface, which is Jupyter, built on an architecture in which JupyterHub spawns a JupyterLab for you and provides access to a canonical software stack via custom kernels. The goal, obviously, is to make the user experience the same; in fact, ideally, the user wouldn't even know whether they're running on an HPC platform or in the cloud. We're not quite there yet. There are differences, usually having to do with things like authentication, and sometimes there are functionality differences: some supercomputers don't provide direct internet access from the compute nodes, for example, which can limit what you can do. There are also differences in the software stack, by necessity: as one of Scott's slides pointed out, there's a difference between Dask-Jobqueue and Dask-Kubernetes, which is how you launch the Dask cluster that provides your parallelism in an HPC environment versus a Kubernetes-based cloud environment.

But instead of going through a bunch of bullet points, I think it's useful to try a demo. I say "try"; I hope this works. I'm going to take you to NCAR's JupyterHub. Here we have a selector page that lets you choose which of our systems you want to run on: the Cheyenne supercomputer or our data analysis and visualization cluster. I'm going to do this on Cheyenne. There's an authentication page, because we can't just open these things up to the internet, and it requires Duo, so there's a little delay. Then, as usual in HPC, you have to select exactly what kind of resources you want; this actually becomes a job submitted to the PBS scheduler. All I have to do is fill that in; you'll notice the rest of the form defaults to asking for resources to run one process, which is all I need to run JupyterLab. So now I've submitted a job to the queue, and it takes a little while to spawn and start my JupyterLab session running; usually it doesn't take very long. This is where everything could have fallen apart, but it didn't. Yay!
is what you typically run into this is the default uh in a lot of ways this is very close

It's similar to what you would see if you just launched JupyterLab on your laptop. I've got a notebook here that is a good demo of a fairly large dataset, and I'll show you that. You'll see I land in my home directory on the supercomputer. This example is a 100-member ensemble containing precipitation and temperature data; my thanks to Anderson Banihirwe for giving it to me. It has the usual boilerplate where you do some imports, and there's a little bit of setup up front, so it takes a moment. The next step is creating and launching a Dask cluster. Now, remember, I only asked for resources for one node and one process, but what I'm going to do is use dask-jobqueue to request resources from the PBS job scheduler to launch a bunch of Dask workers; that's what dask-jobqueue does for you. So I launch this, and you can see I've requested 72 workers. There are 36 cores on each Cheyenne compute node, so basically I'm asking for two compute nodes. It takes a little while for each of those jobs to make it through the queue, but you can see I've done it, and I even get a link here to the Dask dashboard. We have the Dask JupyterLab extension installed, so I can see things like the Dask progress window (nothing's happening yet, so it's basically empty), and I can also look at things like memory use, and spread this out to give myself a little more real estate to show you what I'm doing. You can even ask dask-jobqueue to show you how it requested the workers from the job scheduler. Then I can connect a client and start looking at my data. So it didn't take long to get to a point where I'm looking at real data. This is a Zarr store held on NCAR's storage platform, GLADE; it's about 1.7 terabytes.
You'll notice it didn't take very long to open, because all it was doing was reading metadata. You'll also notice it's in Zarr format which, as Scott mentioned, is ideal for the cloud, but turns out to be really nice for HPC environments with a parallel file system as well. What I'm doing is reading this into an xarray dataset. xarray can print out information about the dataset in an easy-to-read format like this: you can see it has precipitation variables and mean temperature variables, a 100-member ensemble, and a lot of time steps. You can look at some information about one of those variables, for example the mean temperature, and you can see that Dask has distributed this data across all of the workers: it's chunked it up, one chunk per ensemble member and 366 time steps per chunk. Now I can start using xarray to do some actual computations. This first computation sub-selects a small amount of the data; the full array is roughly half a terabyte for each of these variables. I can sub-select a particular time step and do some computation on it; obviously, by sub-selecting a time step, the amount of data I'm using is much less. All the operations are lazy until I actually do something with them, like compute or, in this case, plot. When I tell Dask to go, you can see the windows up here start working, and I end up with a plot of the standard deviation across the ensemble, which looks pretty reasonable. It didn't take very long, partly because we selected only one of the 13,500 time steps. We can also do something more significant and use the entire dataset: I'm going to take the mean temperature, compute a mean across time, and then compute the spread across the ensemble members.
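The mean-then-spread computation can be sketched with a small synthetic stand-in for the 1.7 TB store (which in the demo is opened with `xr.open_zarr`). The variable and dimension names here are illustrative assumptions, not the real dataset's names.

```python
# Synthetic stand-in for the demo's ensemble dataset; the real data would
# come from xr.open_zarr("/glade/...") instead. Names are illustrative.
import numpy as np
import xarray as xr

ds = xr.Dataset(
    {"t_mean": (("member", "time", "lat", "lon"),
                np.random.rand(4, 366, 10, 20))}
).chunk({"member": 1, "time": 366})   # one chunk per ensemble member

time_mean = ds["t_mean"].mean(dim="time")   # lazy: mean across time
spread = time_mean.std(dim="member")        # lazy: spread across the ensemble

result = spread.compute()   # only now does Dask read and compute
print(result.dims)          # ('lat', 'lon') -- time and member are gone
```

Because every step before `.compute()` is lazy, the same two lines scale from this toy array to the half-terabyte variable in the demo; only the chunked reads and reductions are executed, in parallel, when the result is requested.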

So I should get back something that depends only upon latitude and longitude. Again, this is a lazy computation, so it's not doing anything until I actually tell it to. Now you can see, over on the right, as it starts to churn along, it's moving fairly quickly: it's doing a lot of Zarr reads and computing the mean in time. I won't spend too much time on this, but it gives you an example of how long it takes to process half a terabyte with 72 workers, two nodes of our supercomputing cluster. And just to give you an idea of how much of the machine that used: it's done, and that was 72 cores out of roughly 150,000 cores in the supercomputer. Now I can immediately get that data and plot it. That's all I'm going to show; I'll go back to my talk now.

We are not the only players in the game. There are a lot of HPC centers devoted to Jupyter. This is a slide from colleagues at NERSC, Shreyas Cholia and Rollin Thomas, who've done some amazing work with Jupyter on the NERSC systems. The central image shows essentially the chooser page you saw at the beginning of the demo, except in this case it's customized to each user after authentication, and it only lets you choose things you actually have access to, which is pretty cool. The top right of the slide shows an nbviewer deployment they have local to their own disks, so that users on NERSC can share a statically rendered view of a notebook with colleagues, who can then download that notebook, along with the environment needed to run it, into their own home space on NERSC. That's just a handful of the things the NERSC folks have done; you can see a laundry list of amazing things down here. I'm quite jealous, and I'd like to get a lot of this implemented at NCAR as well.

What's next? There are a couple of things we can't do on HPC yet that you can do in the cloud, such as Binder; we're working on that. We're also hoping to build up some data discovery capabilities, using existing data search capabilities at NCAR and linking those in through a JupyterLab extension. Those are just a couple of things to think about. Thanks to Anderson for help with the demo, thanks to the folks at NERSC, and thanks to the NSF. That's it.

MODERATOR: Thank you so much, Kevin, for that presentation, and Anderson for the materials. We have a couple of questions, and one of them is directed to you, Kevin. Perhaps you could briefly show again, if you still have it open, the loading of data with xarray: show that part of the notebook and illustrate how xarray loads the data.

KEVIN: Sure. Obviously I couldn't do a lot here, I had very little time, but I can show you. This is the notebook where I've loaded the data. `xr` here is just xarray, and it has a function called `open_zarr` that lets me open up a Zarr data store. It's stored on my colleague's scratch space on our GLADE storage system; this is a parallel file system, GPFS. As you can see, it's about 1.7 terabytes. Just as with Zarr in the cloud, when you open up the dataset you're not actually reading any data except coordinate and metadata information, which doesn't take very long; that's why it took less than a second to open. xarray loads this into an xarray Dataset object and then lets you view information and metadata about each variable.

For example, I can get information about t_mean directly from this interface, which is fantastic, or information about precipitation. This is telling you how it's been chunked by Dask behind the scenes, so that when you do start reading, you're reading in parallel. Everything after this was basically lazy computation or sub-selection operations with xarray. I hope that answers the question.

MODERATOR: Sure. And several people asked whether this notebook could be made publicly available. Is that something you could post on GitHub, or that we can post as part of the materials when we wrap up?

KEVIN: I think so, yeah. Because this store is only available on GLADE, you can only run it on GLADE, but if you go up to the top, there's a version of this that's freely available on the cloud via Binder, and I can post that link for everybody.

MODERATOR: Thank you, Kevin. Another question that came up from more than one person, perhaps for Scott and Joe as well: the type of resources you showed could be cost-prohibitive, so is this something that is only accessible to universities and government entities?

SCOTT: I was just typing a quick response in the Slack channel. I would say no. The cost scales with your number of users, and one good aspect of the cloud deployments now, after a year or so of iterating, is that the baseline infrastructure costs are actually quite minimal. But it depends on how many users you have, because everything's dynamically scaling. Joe, did you want to add anything?

JOE: No, I think that's basically it. The nice thing about the cloud is you only pay for what you use, so you can scale your deployment up for a specific task, a workshop, or whatever computation you're running, and then scale it down. So there's actually a pretty good counter-argument to the claim that the cloud costs more: you don't pay for idle time on your machines.

KEVIN: I would second, or perhaps third, that. The whole platform is fairly easy to deploy, and the costs definitely do scale on cloud. The biggest cost for small groups would be storing data in the cloud; or maybe that's not the biggest, but it's a chunk. And there are efforts by institutions that actually own a lot of the data, such as NCAR and NASA, to host the data in the cloud themselves, so getting access to that data and using it in the cloud is usually very low cost, if not free. Hopefully that part of the costs is something most of our users won't have to incur.

MODERATOR: Ryan says he has a comment.

RYAN: I just want to mention a very low-key announcement: we recently received notice that we'll be getting some new EarthCube funding to provide data hosting for the EarthCube community. So in terms of who will pay for the cost of hosting data, I'm happy to say that going forward we have a strategy to provide that to the community. Our funding includes the hosting costs, and it also includes working with alternative storage providers like Wasabi and the Open Storage Network, which can provide cloud-style storage at a somewhat lower price point than, say, Amazon.

MODERATOR: Excellent news, and congratulations on the funding. That dovetails with another question: does NASA have any plans to make data available in certain formats as part of these arrangements, for any of you who are well plugged into Pangeo and NASA data availability?

That's probably one for Scott and Joe.

SCOTT: Yeah, so NASA is in a multi-year transition toward hosting some datasets on AWS, and no single storage format has been identified as the go-to format; everything's on the table right now. HDF, Zarr, and GeoTIFF are definitely some of the formats you'll be seeing NASA datasets in.

MODERATOR: Thank you. And this one is perhaps back to Kevin and others closer to the HPC side: is chunk size important on HPC when using Zarr, and can you comment on how the chunking decisions are made?

KEVIN: Anderson, do you want to field that one?

ANDERSON: Sure. In this particular case, the original dataset was in netCDF format, and when I chose the chunking, I was just looking at what I was going to do with the data. So I would say that actually matters. But that in itself can cause problems: at another time you may want to do some other operation that deals with dimensions that have been chunked, when you don't want them chunked. There's a new package from the Pangeo project called rechunker; I think Ryan can speak to it. I just know that it's meant to address these issues. Nowadays you have to spend a lot of time thinking about how you're going to chunk your data, but that's no guarantee things will work smoothly, because at some point you may have to re-chunk it, and re-chunking can be quite expensive depending on what you're trying to do. The rechunker project is trying to address that. I don't know if Ryan has a comment, because I'm not too familiar with exactly what rechunker does.

RYAN: Well, I'll just say that this is fundamental to large multi-dimensional array analysis; there's not really any way around it, as long as there's some correspondence between the physical proximity of data on disk and its impact on read performance. We've always had chunked data in one form or another, whether it was spread over many netCDF files or whatever. Using Zarr makes that totally explicit, which is probably good for thinking about your workflow, but it's definitely the case that some workflows will be optimized for certain chunk structures, and some will fail hard on the same chunk structure. So rather than trying to say there's one universal chunking scheme, we've moved toward the view that you should be able to rapidly, temporarily re-chunk your data into a scheme that fits your analysis best, and the rechunker package is a tool that tries to implement that.

MODERATOR: Thanks, Ryan. In order to keep on schedule, perhaps Anderson can pick up the mic again and take us into the lightning-talk section, where he'll open with a talk on intake.

ANDERSON: All right, give me a second. Can you see my screen? Okay, good. Hi, everyone, and thank you for the opportunity to speak. My name is Anderson Banihirwe; I work as a software engineer at NCAR, and today I'm going to talk about intake and intake-esm. I realize I only have five minutes, so I don't have any slides prepared; I'll show you a live demo of what these two tools do. intake is a Python data cataloging and data discovery tool, and intake-esm is an intake plugin designed specifically for Earth system model outputs. Let's switch gears and look at the demo. What I have here is a notebook with two use cases. The first use case is about sea surface temperature data provided by NOAA.

If you look at the way this data is hosted on the NOAA website, you basically have a bunch of directories by year and month. As you can see, it's quite a lot of directories, and it goes all the way back to 1981. If you want a specific file, you have to go to a specific directory and pick whichever file you want; in most cases you may have to download the data rather than just access it. intake lets us bypass a few of these pain points. One of them is that we can define a catalog: in this case I have a YAML file where I'm defining a few things. I have these parameters here, so I know the time range for the data (in this case I could change that back to '81), and I have this argument, the path, which, as you can see, is a pattern describing how the data is structured on the server. What I'm saying is: take whatever parameters the user provides, fill them into the pattern, and retrieve the data. Here I'm using a new feature in netCDF: you can now request a netCDF file over HTTP directly, which is what this is doing, because I didn't want to go through the OPeNDAP server. One more thing: I'm also telling intake to cache the data locally on first access, since the request can be expensive. Once that's set up, I can say, "give me the data corresponding to this year, this month, and this day," and what I get back is an xarray dataset. That on its own is not really fun, so we can tell intake to retrieve a bunch of those: in this case I'm defining a range, an entire year, and I'm using Dask to retrieve all of those files in parallel.
intake caches those as well, and what I get back is, again, a single xarray dataset. Once you have that dataset, you can do some interesting things: interactive visualization, or building dashboards, because, as you can see, the only thing you really have to provide is these parameters, so it's easy to turn into a dashboard.

That's enough about intake; now let's talk about intake-esm. intake doesn't force you to keep your catalog as a YAML file: you can define your own catalog format and then build a plugin on top of intake, which is what we did for Earth system model outputs, because these tend to be huge, so YAML is not really the right fit, and the hierarchy of how things are structured is different. The use case here is CMIP which, as most of you probably know, is a big international effort across many countries and institutions, and NCAR hosts a subset of that data. In this case my catalog is a JSON file, and in the JSON I have a pointer to a CSV file containing a table where each row corresponds to a single netCDF file and the metadata associated with it. When I read it, I get a dataframe: I read my CSV into a dataframe, and as you can see, there are close to 1.7 million items here. With intake-esm I can then start querying that catalog: here I'm saying, give me this variable, for this particular experiment_id, this particular time frequency, and a bunch of other criteria. What I get back is the same kind of object as the original, but containing only a subset of the data. Once I'm satisfied with that query, I can tell intake-esm to load the data into xarray objects. What I get is a dictionary of datasets: it takes the 78 files and groups them into compatible groups.

So we end up with only six xarray datasets, even though we had 78 netCDF files. Once we have that, we can just do regular science: in this case I'm computing the mean across time for all the datasets I retrieved, and as you can see, there's some activity going on here with Dask. Once I'm done computing the means, I can do the plotting. Basically, after the step where you tell intake-esm to load the data, intake-esm just gets out of the way; from that cell on, you do the regular things you do with xarray and Dask. Let me see if the plots show up in a second... hopefully soon... yeah, this is what I got. As you can see, the amount of code I had to write is really small, and throughout this notebook you don't see any paths or URLs, which helps with reproducibility and things like that: you can easily share this, and if someone were working in the cloud, they could point intake-esm at a catalog referencing data in the cloud and run exactly the same code. With that, I'll hand it over to the next speaker; hopefully I didn't go over five minutes.

MODERATOR: Thank you, Anderson; that was lovely, and I really enjoyed the demonstration of the Dask cluster on the right, watching the activity as the computation was running. Next we have Scott Peckham from CU Boulder, who will be talking about widgets and interactive interfaces. Scott, please go ahead; you have the floor.

SCOTT PECKHAM: Can you see my screen?

MODERATOR: Yes, we can, and we can hear you as well.

SCOTT PECKHAM: Okay, great. So I'm going to give a short talk about a Jupyter notebook that uses ipywidgets and ipyleaflet to create an interactive GUI and map for people to select datasets.
The project this is part of, with this logo here, is an EarthCube project called BALTO, which is an acronym for Brokered Alignment of Long-Tail Observations. Since Balto is the name of a famous sled dog of Iditarod fame, we thought it was cute to put him in, because he has a long tail also. But put that aside. These are my co-PIs on the BALTO project at the top, and there's a little table of contents in this notebook; it's on my GitHub repository, and I think there's a place to find that later. If I scroll down to where the action is, there's code behind this that I've written for both the GUI and some plotting, and I'm using primarily four packages: ipywidgets for the widgets, ipyleaflet for interactive maps in the notebook, Pydap for accessing a server that supports the OPeNDAP protocol, and matplotlib for the graphics. If I run this little section here, it starts up a tabbed GUI; hopefully you can see that. It's got five tabs. The first is for browsing data. We've got a default OPeNDAP URL in here, and I'll hit the Go button; it goes out, searches that URL, and gets a list of all the files it finds there. This is a test server that the OPeNDAP people use, with a wide variety of datasets. So you first choose a dataset, and I'll choose this one, and then, based on what it finds inside that file, it gives me a drop list of the available variables. I'll go down to SST, and everything updates to show me the units (degrees C), the dimensions (time, lat, lon), the shape of the array (which is kind of big in time, not very big in space), and the type (two-byte signed integer). So all the information it finds, it puts here, and any attributes associated with the variable, which would normally be part of a netCDF file, are put in a drop list for quick reference. If I had chosen a different file up above, it would be different variables and different attributes populating these lists.
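The tabbed-GUI pattern just described can be sketched with ipywidgets. The tab titles roughly follow the talk, though the fifth is a guess, and the panel contents here are placeholders rather than the real BALTO controls.

```python
# Minimal sketch of a BALTO-style tabbed GUI in ipywidgets. Tab titles
# roughly follow the talk (the fifth is a guess); panels are placeholders.
import ipywidgets as widgets

titles = ["Browse Data", "Spatial Extent", "Date Range",
          "Download Data", "Settings"]
panels = [
    widgets.VBox([widgets.Text(description="URL:"),
                  widgets.Button(description="Go")])
    for _ in titles
]

gui = widgets.Tab(children=panels)
for i, title in enumerate(titles):
    gui.set_title(i, title)

# In a notebook, displaying `gui` renders the tabbed interface; handlers
# attached with Button.on_click() would do the searching and downloading.
```

Keeping each tab's widgets in a container like `VBox` is what makes the panel layout easy to modify, which is the extensibility point made at the end of the talk.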

Now that I've chosen the sea surface temperature monthly-means dataset, I don't want to download the whole thing; I want to subset it. So I'll go to the spatial extent tab and, because I happen to be in Puerto Rico right now, at my house, I'm going to zoom into that (though I keep activating my dictionary for some reason). This is using ipyleaflet for the interactive map, and at the bottom here I've implemented some of the different basemaps that are part of it, so I can toggle between Esri WorldStreetMap, OpenStreetMap.Mapnik, and many others, but I'll stick with this one. There are also some tools you can optionally include in your ipyleaflet window besides the zoom, such as this full-screen option, which is kind of cool; when you're done looking around, you can go back to the smaller view. Now I'll zoom out just a bit to get more of the ocean around Puerto Rico, and the research question is: how has sea surface temperature been changing over the last hundred years around Puerto Rico, in the waters of the Caribbean? So now I'll go to the date range. This dataset goes back to 1854, but I'll show a smaller portion of it, so I'll change this to 1908 to get a nice 100 years. Then I go to the download tab and hit Download, and it's really fast, because the data has already been subsetted by space and time on the server before downloading. So instead of downloading the whole large dataset, I download just the little part I need. If I scroll down through these instructions, there are a few things I can print out about the dataset loaded into the BALTO GUI object, checking for no-data values and things like that, but in the interest of time I'll jump down to the plot, where I'm using balto_plot, another set of routines based on matplotlib.
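The download is fast because OPeNDAP lets the client request an index-range ("hyperslab") subset, so only the subset crosses the network. This helper is hypothetical, written just to make the constraint syntax concrete; in practice Pydap (or `xr.open_dataset` on the URL) handles this for you, and the server path and index ranges below are placeholders.

```python
# The fast download works because OPeNDAP accepts an index-range
# ("hyperslab") constraint, so only the subset crosses the network.
# This helper is hypothetical -- in practice Pydap or xarray builds
# and sends the constraint for you.
def dap_subset_url(base_url, var, time_range, lat_range, lon_range):
    """Build a constraint URL like base?sst[t0:t1][j0:j1][i0:i1]."""
    slabs = "".join(f"[{lo}:{hi}]"
                    for lo, hi in (time_range, lat_range, lon_range))
    return f"{base_url}?{var}{slabs}"

url = dap_subset_url(
    "https://test.opendap.org/data/sst.mnmean.nc",  # placeholder server path
    "sst",
    (648, 1847),   # illustrative monthly indices for ~1908 onward
    (70, 76),      # illustrative latitude indices near Puerto Rico
    (290, 300),    # illustrative longitude indices
)
print(url)
```

Since the server evaluates the constraint, the client only ever receives the small lat/lon/time box, which is why the hundred-year subset arrives almost instantly.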
These plot the sea surface temperatures within the region, for one of the corner pixels of the region I just selected, and sure enough, there seems to be a trend toward greater temperature over the last hundred years. These wiggles are the annual cycle, or you might suspect they're an annual cycle, since this is monthly data; to confirm that, we'll plot a subset, and sure enough, there are about 12 dots (plus signs) per oscillation, the 12 months of the year, before it goes to the next cycle. So that's the basic idea. It shows a kind of cool way to glue together ipyleaflet, ipywidgets, Pydap, and matplotlib to create a tool that can be easily modified. If you wanted to adapt this to do something else, to go to some other type of server, you can look at the code in balto_gui.py, see how the panels are set up and how the events are processed, and go from there. It's all open source, and that's it.

MODERATOR: Excellent, thank you so much, Scott. I really enjoyed your presentation at the previous EarthCube meeting, so I'm glad you were able to join us. Much appreciated.

SCOTT PECKHAM: No problem. Let's continue being in contact about these things.

MODERATOR: Next we have Adam Marge from UC Berkeley, who will be talking about applications in hydrology. You have the floor.

ADAM: Okay, let me share this screen. Do you see my screen and hear me talk?

MODERATOR: Perfect, yeah.

ADAM: Okay. Thank you for the opportunity to speak about the hydrological use case of the Pangeo project. I work on hydrological models and their uncertainty. In talking about this Pangeo use case, I'm going to start with three brief motivational statements: the good, the bad, and the ugly in hydrological data processing, particularly in intensively monitored watersheds.

highly instrumented worksheet for long term hierarchical data collection has been a priority and this funding comes from sf and other certain federal agencies so there is a concert to refer to monitored west ships where data is to support thyroid you can’t waste in diabetic understanding um here is a map of two of the major networks of biological data observation particularly intensively monitored which is the red one represents the critical zone of the visory dataset and the green one represents the long-term irish collection with ships and some other data sources from public uh universities and national laboratories so within each website there is uh many subsets of watershed within each single ceo or ether which it would have multiple switches where data has been collected and with subwoofers there are stations uh that will collect numerous technological and quantitative variables with these sensors and usually these sensors collect data at hourly and some hourly uh frequencies so um it’s uh maps and this will share a good opportunity for developing generalizable hierarchical principles and behaviors about these wealth shares and we can develop some theories or generating principles based on this data but the reality is such principles do not yet exist in higher orders and then to bad news we don’t have a synthetic understanding that emerged from this data that would support higher logical understanding or models and to come to the eyesight these datas are generated from this uh which share networks data be generated and they do not have common timestamps and they do not have they do have very big gaps or different gaps and they are really difficult to access so you see the ugly parts of data collection from these interestingly monitored issues in short data collected from this uh west is organized and naturally not ready to use to model development or scholars and then to the storage what we are proposing and working on is a punjabi use case for performing 
common hydrological data downloading and processing tasks. We intend to develop a Jupyter-based data processing scheme that will acquire the data and make it ready to use from these intensively monitored watersheds. The data processing scheme we employ includes primarily four stages. The first stage is data downloading and acquisition from these different sources and websites. The second stage is quality control and cleaning, where we clean and flag outliers as well as unrealistic values. The third stage is aggregation, where we aggregate the sub-hourly and hourly time steps to a daily time step. And the fourth stage is gap filling, where we fill in the missing values, with three approaches. The first is interpolation-based, where we fill short gaps in missing values, primarily less than a week in daily datasets or less than a day in hourly datasets. Then we go to regression to fill the longer missing values, and within regression we have two types: one is spatial regression, where we correlate different

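The interpolation stage of the gap-filling scheme described here can be sketched in a few lines of pandas. This is only an illustration under my own assumptions: the function name, the gap thresholds, and the toy series are mine, not the project's.

```python
import numpy as np
import pandas as pd

def fill_short_gaps(series, max_gap):
    """Linearly interpolate only gaps of up to `max_gap` consecutive steps.

    Longer gaps are left as NaN for a later regression-based stage.
    """
    filled = series.interpolate(limit=max_gap, limit_area="inside")
    # pandas' `limit` fills the first `max_gap` values of *every* gap,
    # so mask out any gap longer than the threshold entirely.
    is_na = series.isna()
    run_id = (is_na != is_na.shift()).cumsum()      # label runs of NaN/non-NaN
    gap_len = is_na.groupby(run_id).transform("sum")  # length of each NaN run
    filled[is_na & (gap_len > max_gap)] = np.nan
    return filled

# Hourly sensor record with a short (2-step) and a long (5-step) gap
idx = pd.date_range("2020-01-01", periods=12, freq="h")
data = pd.Series([1.0, 2.0, np.nan, np.nan, 5.0, 6.0,
                  np.nan, np.nan, np.nan, np.nan, np.nan, 12.0], index=idx)

out = fill_short_gaps(data, max_gap=3)
# The 2-step gap is interpolated (3.0, 4.0); the 5-step gap stays missing.
```

A real pipeline would apply this per station and per variable before handing the remaining long gaps to the spatial or temporal regression stage.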
stations, and the second is the climate catalog method, or climate-catalog regression approach, where we borrow and regress data in time. These are the data processing schemes we employ, in a very interactive manner, in Jupyter. The end product from this use case is well-organized data from 30 watersheds across the US, as I showed you earlier, and the data contains discharge, precipitation, snow water equivalent, soil moisture, soil temperature, and isotope data from these watersheds. The data record length ranges from two to five years for some variables up to 20 years for some of the primary variables. The data format we are releasing is an HDF format, and it will be hosted in the Pangeo cloud. Going forward, we are hoping to expand to multiple watersheds, hopefully across the US in more places and beyond, and we have developed a plan going forward so that we have an open, interactive platform where any researcher can contribute to this open, reproducible network and can use our Jupyter tool to clean and process data to a standard, organized format. With that, I will conclude my talk. Thank you so much, Adam, for that presentation. We'll move on to Erik Sundell, who's joining us from Sweden and talking about Jupyter Book. Erik, please go ahead. I don't know that we're hearing you, Erik. Nope, no audio; we can see your screen, but I can't hear any audio. Fernando, do you want to move to the next one while Erik tests his audio, if that's okay with you, Erik, since we're a little bit tight on time? Georgiana, would you mind hopping on to the JupyterHub presentation, and we'll try to debug with Erik separately? Sure, let me share my screen. You can share your screen, and we can hear you, Georgiana. Yay. Okay, so hi everyone, thanks for inviting me here. I'm Georgiana Dolocan, and I'm currently working as a JupyterHub and Binder Contributor in Residence, and today's presentation will be
about JupyterHub; we'll cover some basics about JupyterHub and a quick demo. So when you first hear the word Jupyter, the first thing that pops into mind is probably the notebook, with all the beautiful visuals, the code, and the text. But then when you say JupyterHub, it might be a bit confusing at first to know the difference between the two; I mean, for me it was at first, so this presentation will try to explain these differences a bit. Let's say we have a user that has a Jupyter notebook, and this user has some team members that want to use the Jupyter notebook too, to be able to work on the same dataset or just share the compute environment. For this we have JupyterHub, and JupyterHub is made out of three main components: the authenticator, to make sure that just the right person accesses the hub; the spawner, that creates each user's notebook server; and then the proxy, to know how to route each user to their own Jupyter notebook server. All of these components are configurable, and you can choose from some that are available from the community. For the authenticator you can have the first-use authenticator, the PAM authenticator, the native one, or you can log in with GitHub, Google, Bitbucket, and others. The same for the spawner: you have some options there. And for the proxy there are only

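For reference, the three pluggable components just described are selected in JupyterHub's jupyterhub_config.py. This is a hedged sketch, not from the talk, using classes that ship with JupyterHub itself; a real deployment might swap in community authenticators or spawners instead.

```python
# jupyterhub_config.py -- illustrative fragment only.
# `c` is the configuration object JupyterHub injects when it loads this file.

# Authenticator: decides who may log in (PAM here; GitHub/Google OAuth
# authenticators are available from the community).
c.JupyterHub.authenticator_class = "jupyterhub.auth.PAMAuthenticator"

# Spawner: starts each user's single-user notebook server.
c.JupyterHub.spawner_class = "jupyterhub.spawner.LocalProcessSpawner"

# Proxy: routes each user to their own server.
c.JupyterHub.proxy_class = "jupyterhub.proxy.ConfigurableHTTPProxy"

# Serve JupyterLab rather than the classic notebook interface.
c.Spawner.default_url = "/lab"
```

Because each component is just a configurable class, swapping, say, PAM authentication for GitHub OAuth is a one-line change plus the relevant credentials.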
two right now: configurable-http-proxy and the Traefik proxy. And all of these are available under the classic notebook interface or as JupyterLab. To make JupyterHub deployment easier, we have two helper projects: The Littlest JupyterHub and Zero to JupyterHub. The first one, The Littlest JupyterHub, as the name says, is mostly suited for small groups of people, because you have just one server, and you have the users and the hub running on just that server; this can be a bare-metal machine or just in the cloud. For Zero to JupyterHub we have multiple servers, again in the cloud, and all of this operates under Kubernetes, so that's super cool. For the demo, I want to show how easy The Littlest JupyterHub deployment is, and this is pre-recorded because it takes more than five minutes to actually install it. So this is deploying The Littlest JupyterHub on DigitalOcean; everything here is in the documentation. Right here I'm just choosing the data center, and then in the user data you just have to copy a small script from the documentation to run, and then everything gets installed. Here you just set the first admin that will be able to access the hub the first time. Okay, so we are not enabling backups, because this is just a demo. So the droplet builds, and after this you'll have an IP address there. When you first access the IP address, you will get a 404, and this means that The Littlest JupyterHub wasn't yet installed, but you can always SSH into that server and see the logs there if something went wrong, and you can actually see when it was ready. So here I'm just showing how the logs look: you just see all the steps here, and when everything is done you'll actually see a message saying so. Afterwards, you just copy the IP address and paste it in a browser, and you have the JupyterHub login. Again, logging in with
the first admin I just set. The Littlest JupyterHub uses the first-use authenticator, and this means that the password you set when you first log in will be the password associated with your account; you can always change the authenticator afterwards. Okay, so as an admin you can always add more users from the admin interface; you can add just regular users or other admin users, and these admin users will have the rights to install different packages, because all the admin users under The Littlest JupyterHub are also sudo users. So this is just accessing the terminal from an admin account, and here I'm just installing the numpy package; you just have to use the -E option to make sure that it gets installed into the right environment, which is the user environment. Okay, so once we have this installed, you just open a Python notebook and you can import that package, and hopefully there'll be no errors; in my demo there weren't. So that's basically it: The Littlest JupyterHub is very easy to install. Thank you very much for listening; I hope you liked it. Thank you so much for that lovely presentation, and I'm very impressed that you were able to do this off of an actively running video; that's very good timing, much appreciated. Erik, it seems like your audio is now back on, so we'll move on quickly to you, in the interest of keeping time and leaving at least a little bit of time for the Q&A. Thank you. This lightning talk is about Jupyter Book. Jupyter Book is a tool that allows you to create beautiful websites from notebooks and markdown files, and since we use them so often, this is a tool that can come in very useful. So Jupyter Book does something like this: it takes a collection of content in the form of notebooks and markdown files, and it converts it into

a book. A book can be a website, or it can be a PDF file, among other things, but we'll focus on creating websites. So you have a collection of content, and you create a website from it; Jupyter Book is the tool that does it for you. Getting started with Jupyter Book is something like this. It's a terminal tool, so first you need to install it; to install it, use the Python package manager called pip, so pip install jupyter-book, and you're done. To have a book, you can get a starting point with some boilerplate files, so you run the create command. If you run the create command, you get these files: a folder and some files in it. Let's look at the files. First you have a configuration file; this is where you set the title of the book and various other settings. Then you have a table of contents; this is where you define the structure of the book, so if you have multiple notebooks, in what order should you view them on the website, for example. Both of those files, the configuration file and the table of contents, use the file format YML, which stands for YAML, and YAML is something that is very useful to learn; it's like JSON, but more human-readable, so if you spend time learning it, it's something you won't regret. In the folder you also get some demonstrative content, markdown files and a notebook, which becomes content for the website. Now, so far you have just provided content; you have not yet got any HTML or anything like that. This is what the build command does, and by default you get HTML, a website. Running this build command gives you output in a build directory; the build directory will, for example, have index.html, and this is a file you can open on your local computer. Then it will look something like this in your browser; this is the standard output you get out of the box when Jupyter Book publishes a website. But it's on your computer right now, and of course you would like to publish this online on the web. For doing that, it's useful to have some Git
and GitHub knowledge, but the documentation for Jupyter Book is so great that you can get by without having any previous experience, I think. The documentation is available at, and to the left you can see here, Get Started: you go through the overview, building your book, and publishing it online. I have had such a great experience publishing books online using GitHub Pages and GitHub Actions. This is an example of a web address that you can get if you publish your book online with GitHub Pages, which is free, and GitHub Actions is tooling that allows any change you make in the Git repository to automatically build and update your book online. So when you have set this up according to the documentation, you don't have to run the Jupyter Book tool anymore, because it's run automatically for you, and that is really good because it enables one very useful feature: you can go to the configuration file and point to where your book is on GitHub, and if you do, you can get such a button here on your website. This button allows any user visiting your website to find their way to the GitHub repository where the book is defined and say, oh, you have a spelling error here, or, I suggest that you rephrase this part of the book like this, and if you accept the change, suddenly it's updated. Okay, so now, writing the book: you want something more than just a set of contents; you want to cross-reference, etc. So you have some features; for example, perhaps you want to hide code blocks of your notebooks and just show graphs, or certain sections. You can use metadata inside

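To make the two YAML files described a moment ago concrete, here is a hedged minimal sketch. The file names follow Jupyter Book's conventions, while the book title, author, repository URL, and page names are hypothetical; both files are shown in one block for brevity.

```yaml
# _config.yml -- book-wide settings
title: My Demo Book
author: Jane Doe
repository:
  url: https://github.com/example/my-demo-book   # enables the suggest-edit button

# _toc.yml -- defines the order pages appear in on the website
format: jb-book
root: intro                     # intro.md, the landing page
chapters:
  - file: notebooks/analysis    # a Jupyter notebook
  - file: conclusion            # a markdown file
```

In a real book these are two separate files in the book folder; rerunning the build command after editing either one regenerates the site.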
of the notebook, in this case a hide-code tag to hide code. While you write these books, as I said, you can do it using markdown and notebooks, but the markdown is a bit extended, with a flavor called MyST, and MyST provides two additional parts to markdown: roles and directives. These are like functions for markdown. To the left we see an example of how, for example, to cite a reference: if you define a reference, you can cite it and you get nice formatting. Directives are bigger functions, so you can do something like inserting a figure; as an example, here is how to insert a note, which renders like this on your website. So this is the markdown in MyST flavor, and here is the result of having such markup. One of the most important things about Jupyter Book, I think, is this ability: if you have a set of notebooks, perhaps one very big notebook that you just want to generate a figure from, and then you want to use that figure somewhere, this is what you can do. You can use the glue function to save an object from a notebook, and then you can insert it into markdown somewhere else in your book, and over here it's a directive, as we have just seen. Okay, that's about it. Jupyter Book can help you publish a website, and the documentation is just so good that I believe everybody can do this; it doesn't take five hours, it takes 30 minutes or one hour. Jupyter Book is created as part of the Executable Books project, and here are the team members of Executable Books; it has a rich community, and we are all welcome to join. I am one of the people happy to have joined this community of contributors. With that said, please go ahead and visit to learn more, and if you want these slides, you can find them here on GitHub. Chris Holdgraf is in this meeting and is one of the co-founders of Jupyter Book. Thank you. Thank you, Erik, much appreciated. And we will close the lightning talk section with Joe from Pangeo, whom we've already
referenced a few times. So Joe, the floor is yours. All right, I'm just pulling up my screen here. Right, I think you should be able to see things now. Audio and video all good. Great, thank you. Well, hi everyone, my name is Joe Hamman, and Lindsey and Fernando asked me to just kind of wrap things up with a short lightning talk on how to connect with the Pangeo project, and that's what I'm going to talk about, just briefly. I'll just say who I am; my contact information is here. I'm a scientist at NCAR, and I also work at a new nonprofit called CarbonPlan. I've been working on the Pangeo project for about three or four years now, and I also contribute to a bunch of open source projects. So, just remembering back to Scott's talk an hour ago or so, Pangeo is first and foremost a community of people working collaboratively, and I just want to highlight a few of the places where those interactions happen and where you might find Pangeo people around doing their thing. The first is online: almost all of the interactions and events that Pangeo coordinates or is part of happen online. GitHub is kind of the primary, central place where you'll find things. We've got a chat room on Gitter, and that's for kind of short, quick messages and coordination things. We have a Discourse forum where you can find and post questions about how to do things, or where we coordinate regular meetings and whatnot, and of course there's a Twitter account as well. I'll highlight a few of the kinds of meetings that we have on a regular basis. On Tuesday and Thursday mornings right now, we've been doing

what we call the COVID-19 coffee breaks. These are open to anyone; there's no agenda, it's just an opportunity to see another human. We talk about Pangeo, obviously, but also baking or baseball, whatever you want to talk about. So I just thought I'd mention that really quick. There's a weekly developer meeting; it's kind of a mix of scientists and software developers. It's on Wednesdays, and the timing alternates (I didn't put it on the slide) between an early time and a late time for different time zones, and we do a lot of coordinating on ongoing development in the open source scientific Python world. This is just a screen grab from a few weeks ago, and it's a good way to keep up with what the present-day activities are on the Pangeo project. I spend most of my time on this slide. We have this kind of thing that on the surface looks a little bureaucratic: these things called technical, or topical, working groups. It's not meant to be bureaucratic at all; it's meant to provide a space where we can have more topical discussions and go a little deeper. Right now we have four of these working groups. There's a data working group that talks about things like data formats, schemas, best practices, and performance, so if things like discussing NetCDF versus Zarr versus TileDB versus cloud-optimized GeoTIFF are in your wheelhouse, this is a good working group. Did we lose Joe? I think we might have; let's give him a few seconds. But it's 9:10 in our agenda and actually 9:45 in the real world, so we don't have a huge amount of slack; we have negative 35 minutes of slack right now. So unless Joe's internet returns happily within a few seconds, we may need to wrap up. We will post links to all the slides, folks. Okay, Joe actually dropped out, so his computer may have crashed. Lindsey, we were wrapping up, and I think you were gonna run what
we have left of the time with the Q&A session, so I might just hand it off to you, and unfortunately we may have had to cut Joe's talk short. No problem; we'll hopefully get Joe's slides, but at least you get a bit of a flavor of some of the places where you can interact with the Pangeo community. As part of the Q&A session, we'll also be inviting folks to engage on Discourse, to answer some big-picture questions that we have posed and that we were hoping to speak to but probably won't have the time to get into in depth. So what I'm going to do is share my screen here. All right, can folks still see that? Perfect, okay, excellent. So the first thing we're going to do is actually invite a couple of people who have posed some questions, either before the session or during the session, to ask their questions. I'm going to start off by inviting Phil Austin to jump in with his question and sort of call for some community input. Sure, can people hear me? Yes. All right, so just a little bit of background: I'm at the University of British Columbia, where I chair the atmospheric science program in a department that's broadly earth science, so we've got about 50 faculty and about 350 graduate students. I think we are actually kind of typical for university infrastructure in that we have a ton of pretty capable three-thousand-dollar desk-side Linux machines that you just accrue through budget grants, where you have to buy something to lock in your budget before it disappears, that kind of thing. So I believe that there's a missing middle, I'd call it. The Littlest JupyterHub does great; for bringing up a hub to teach a meeting, it's perfect. But you leave The Littlest JupyterHub and then you hit Zero to JupyterHub, and it's not a learning curve, it's really a learning wall. I mean, you need pitons to get up Kubernetes, I would say,

just going from my personal experience. So what I'm volunteering to do is just share my own journey, which is to provide this intermediate stage where you can use these desktop machines, or bare metal like The Littlest JupyterHub, but with a pathway that involves Docker. And then just one other thing, I think, for outreach and training: trying to figure out where graduate students are going, experience with Docker and knowing how to actually manage the cloud is something a graduate student probably needs, and I think having graduate students practice on something that's free, as opposed to practicing on something you pay for, is just a huge win. Thanks, Phil. Does anyone else want to chime in or have follow-up thoughts on Phil's comments here? I'll jump in really quickly. Phil, having deployed these things both as single systems and at cloud scale, I agree that there's a middle ground, and you sort of already mentioned Docker; things like Docker Swarm are just such nice, easy infrastructure to install on a number of managed machines. So if there were a model for installing JupyterHub on that kind of infrastructure, I agree there'd be a lot of use for that. Yeah, just to borrow something I wrote on this Jupyter Book ticket: there's nothing more eloquent than a working example, and being able to just do docker compose up and have something actually work, where you can also SSH in and figure out where the JupyterHub config is stored and all that kind of stuff, play around with a reverse proxy for those of us who are using that for the first time, and then look at these ways of spawning notebooks. For a certain type of person and a certain type of graduate student, it's just really important to be able to get in and watch the pieces actually move. So to continue the conversation: I know you posted on the Pangeo Discourse; if folks are interested in conversing with you on this, would that be the
best place to get started? Sure, I visit that pretty frequently, and there's also this Pangeo outreach group, which has been pretty quiescent, but I think I'll try to bootstrap something there, and I'll just start posting my own progress. For anything I learn, I'll actually put an executable book together, and the executable book itself will be a docker-compose GitHub repo, so you'll be able to run my executable books with a JupyterHub and a web server, and we'll just see how that goes. Excellent, thanks, Phil. Next up, we have a question from Lisa. Lisa, are you online? I think so; can you hear me? Yes. Great. Yeah, so as I said here, I'm a PhD student in the Energy and Resources Group at Berkeley and a master's student in the computer science department here as well. Before this, I worked at an environmental consulting firm and did a lot of data analysis and data work with CMIP5, so CMIP6 has seemed really interesting to me. I still contract for that company, so I'm basically predicting I'm going to have to give some pitch to them in the next year or two about how we're going to work with CMIP6, because we have nowhere near the resources to do that locally; we were pretty pushed to the limit even with CMIP5. So I'm sort of curious: I think we're mostly academics and government people here, but I was interested in the infrastructure, permissions, and costs for working with some of this potential CMIP6 infrastructure from a private-company setting. Just curious if people have done it or thought about what that might look like. Joe, is this something that you would have experience with? He might not actually be on the call. He probably would, if he were back. I think Joe is not back; I don't see him at all on the list, so he may have had an internet mishap. Scott did post some helpful links, though, on Slack. Yeah, go for it. The links are on Slack, replying to Lisa's
question, but for sure, private companies have been setting up the same infrastructure we've shown today for the cloud deployments. I think it comes down to whether or not you have personnel to dig into the details; as Phil said, it's a bit of a learning curve. And there are companies now as well that are starting to provide these services for a fee, catering specifically towards companies

rather than educational, government, or academic groups, so if you're doing work for a company, I would possibly recommend looking at some of those startups. Cool, yeah, that makes sense. I think overall, speaking from a graduate student perspective, the comment before this was really relevant, and the CMIP6 and other stuff is really relevant, so it's exciting to see as a student. Well, thanks, Scott and Lisa. It looks like Joe has just rejoined us; sorry we lost you there, but we did actually get a question that is quite relevant to your talk, asking about whether the working groups are open to the public. Yes; somehow Zoom totally hard-crashed on me, and I'm back now. And the answer is yes, they're all open to anyone. The website that I had up there, pangeo.io/meeting-notes, has information about joining any of the five or six regularly scheduled meetings that we have for the working groups. What I can do here, Joe, is stop screen sharing if you'd like to share those slides, if you still have them up, or if there are any other comments you wanted to get in before we lost you. Yeah, I don't know if I still have them; oh, I do have them up still, but I'm not a host anymore, so I think it's okay; I was on my second-to-last slide, it's not a big deal. Okay, were there any other questions for Joe on engaging with the community, joining in on working groups, anything anyone would like to bring up? I also wonder if Joe had any comments on Lisa's question, because I know he's worked with Google on making the CMIP6 datasets available, at least through Google Cloud, and I don't know if that intersects in any way with Lisa's concerns on access to those data and their usage. So Joe, the question that's on the screen right now. Yeah, so there's actually some movement there; there's actually going to be a
mirror of this data on AWS as well, so we're kind of seeing the data proliferate a bit, at least between Google and Amazon, and they're in public data buckets, so there are no egress charges for using them outside of the cloud, but you'll find that the performance of using them in the same cloud region will be the best. So there's nothing that limits this to being kind of government and academic; anybody can use it, and actually there are a few companies out there that are using these data resources from the cloud, just like Pangeo does; they have kind of their own private version of Pangeo running and accessing these data. So it should be no problem if you can sort out how to pay for it. Yeah, that makes sense. Were there any other questions that folks wanted to bring up? I'll pause for a moment. I'm curious about the size of data that you would typically work with, because CMIP6 is so big that you don't work with the whole dataset at once, but how big are the chunks you typically work with in your workflows? Lisa, do you... oh, that's for me, sorry. Yeah, so I think CMIP6 will probably be an interesting new challenge for that question. With CMIP5, at largest we were storing maybe six or seven terabytes on disk. We were working with the LOCA downscaled CMIP5 product, so it was a sixteenth of a degree, and I think we used 14 GCMs, so somewhere around there. So, as you mentioned, there are really two challenges: there's the storage challenge that we were facing, because we were sort of pushing the envelope on the data processing side of the company, and then there's also the processing side of things, and both of them have started to become

bottlenecks for my work. You know, the things that I'm asked to do are not particularly complicated; I can do them, but the company resources are just bottlenecking both the processing and the storage. So it kind of depends on what the client wants. I think CMIP6 will be a new challenge, because storing everything is not really an option anymore. Thank you. Any other questions? Okay, well, I know we're coming up on the hour, so what I would like to do is show a couple of questions that we've prepared for the community that I think would be interesting to really collect some community knowledge on, and see what we can do to potentially build upon ideas. What I've done is posted these on Discourse, so I'll read them out now, and we can think on these and hopefully continue some of the discussion on Discourse. Erik and I assembled these together, so perhaps I will let Erik read through this one, and we can alternate here. What does your interactive computing workflow look like today, and what do you envision it will be in five years? We hope to get input on your visions of how to improve the workflow and to understand better how your workflow looks today. The next question relates a lot back to Erik's talk: how would you like to publish and share your computational research, and where can improvements be made? So, thinking again both about how your current workflow is implemented and what, ideally, it would look like for you. And how do you stay up to date with the evolving open source ecosystem? In other words, how do you learn about the new tools that you may want to use, and how would you like to keep up to date on these tools? It's a lot of tools; how do you learn about them? And then, two, if there are other ideas for community projects: I think Phil prompted some nice questions of, you know, what can we be doing
sort of in between The Littlest JupyterHub and Zero to JupyterHub, but there are certainly other ideas out there, so feel free to post your questions, and we can look at getting in touch with the right communities to take action on those. So, as Erik posted in the link on Slack, we've got the Discourse post that we've created for this session, and we posed those questions there, so we would invite you to contribute your ideas. What we would like to do is write up a short blog post that's a bit of a summary of this meeting, and we would also really like to include ideas that folks have posted, so we'll work on synthesizing what you share with us. And with that, we just want to say thank you for joining us today. It's certainly been an interesting and enlightening session. I really appreciated all of the speakers who took time to prepare material and invested their time and effort in sharing their knowledge with all of us, so a big shout-out and thank you to all the speakers, and thanks to everyone for spending time with us today. I hope it has been a useful session for you, and we look forward to your input. Fernando, I don't know if there's anything else you would like to close with. As usual, you're muted on Zoom. I echo your thanks to all the speakers, and to the EarthCube team as well, who hosted us and provided coordination and support on Slack; we very much appreciate it. As soon as the recording of this meeting is ready, we will post it online on Discourse as well, and all of the slides and materials from the presentations that we have are also available, so we will post those links both on Discourse and on Slack. Thank you, everyone, and we'll try to wrap up now so that people can go on to their 10 a.m. meetings, which I'm sure everyone has, their next Zoom meeting to jump
into so i’ll stop the recording now um thank you everyone it was a pleasure and we thought it was useful for for folks i’ll stop recording

Google Cloud Summit Munich – Keynote

[MUSIC – DNCE, “CAKE BY THE OCEAN”] [MUSIC PLAYING] SPEAKER 1: Ladies and gentlemen, please welcome Michael Korbacher, director, Google Cloud Germany, Austria, and Switzerland [MUSIC PLAYING] MICHAEL KORBACHER: Herzlich willkommen Welcome to Munich’s first Google Cloud Summit The future of Cloud is now It is not a question of if, it’s a question of when, and when is now I have been a longtime believer in Cloud technologies I’d probably date that back to the late ’90s At that time, I was working for a company called Compaq Computers And some of you might remember that brand which was then acquired by Hewlett-Packard At that time, I was privileged I was based in the south of France, and my job was to launch what was called back then an ASP Competence Center ASP stood for Application Service Providers,

which was the early form of what we would call a software as a service, or SaaS, Competence Center today And I was privileged because that place was located somewhere between Nice and Cannes, somewhere in the hills, in a place called Sophia Antipolis And the French used to refer to Sophia Antipolis as the Silicon Valley of Europe While I was doing that, launching this competence center, there were some smart people thousands of miles to the west, in the real Silicon Valley, busy doing something else, because they, Sergey Brin and Larry Page, were busy launching Google Now, I do not want to compare efforts, and I do not want to compare results But I would say some of the ideas that we had around utility-based computing and servers on demand back then probably made it some way into the Cloud of today For sure, what I have taken forward is the passion for Cloud And I’m quite sure it’s this passion for Cloud that has brought me here today My name is Michael Korbacher, and I’m running Google’s Cloud business for Germany, Austria, and Switzerland Now, time has not stood still, and Google has progressed and we have progressed And if we look at what happened something like 19 years ago in a garage in the Silicon Valley, where people started pulling together infrastructure held together with LEGO blocks, we’re now looking at an infrastructure where we’re running numerous services that have, on a regular basis, more than a billion users Think about Gmail Think about YouTube Think about Search Think about Maps Think about Play, Android, and so on What an infrastructure it takes And all that based on an idea and a mission that I would call rather ambitious A mission that said, in a simple form, well, we’re going to take the internet We’re going to index the internet We let people search that index, and we give them relevant answers to their searches no matter where they are on the planet, ideally within milliseconds That was a big ambition when you think about it, to organize the
world’s information to make it universally accessible and useful That’s a very big ambition when you think about how the internet evolved and how the internet grew To make that happen, there were probably not a lot of off-the-shelf products that would make this mission come true Think about it What file system could actually manage the volume of data that you would create? What networking gear could actually handle all the bandwidth required? What product would I use to search this amount of unstructured data and give people replies within milliseconds? How do I scale? How do I scale out? And how do I improve my results using technologies like machine learning or artificial intelligence? So when looking at this, to get to where we are today, one of the requirements was to drive constant innovation, to explore a path that nobody had gone down before And that led to the development of things like the Google File System It led to BigQuery, for searching large amounts of unstructured data It led to developments such as Kubernetes and container-based technologies, for distributing workloads across servers, data centers, and Clouds And it led to making machine learning and artificial intelligence more accessible through frameworks like TensorFlow

Now the great news is, in the necessity of fulfilling our mission, we’ve created all this innovation, and we’ve given it back to the open source community And now we’re taking all this, the infrastructure that serves billions and the tools that run Google, and we’re making it available to companies in the form of Google Cloud You might think, hm, all right, that’s great I’m a manufacturer I’m a retailer I’m a logistics company, or I’m a bank Why do I care? Well, you should care, because if you are a traditional company, what set you apart in the past might have been, oh, I’ve got a big data center, and in that big data center I run some really big enterprise IT That might have been your competitive advantage in the past For some of you, this might be a slight burden today, because one of the key things that companies need to be aware of is that your treasure is maybe not your data center, but the data within that data center And how do you exploit it? How do you use it? How do you make better decisions from that data within your data center?
And that often requires the tools that are available in the public Cloud I don’t think your sense of urgency is to take a workload from A to B and then save 20% on the way, or you name it That does not necessarily imply a sense of urgency, especially if you think about companies doing quite well on average across the region Your sense of urgency is a different one You want to be prepared for the future And you want to be using the tools that your competition can use as well And your competition might be somebody that you have not been dealing with in the past That might be that startup company that takes their credit card, goes to the Google Cloud, registers on the portal, goes to the console, and within 90 seconds spins up a thousand-node Hadoop cluster They use it for 10 minutes, and then it takes them 60 seconds to shut it down That is your competition today At the end of the month, they will find an invoice on their credit card statement for 12 and 1/2 minutes of usage of Google Cloud This is the environment that we’re in today, and this is your sense of urgency, whether you’re a startup or whether you’re an established company You want to stay ahead of the competition You want to do that with the latest tools available to you You want to look at how you work together and be more collaborative Increasing productivity is everybody’s challenge You want to be, at the end of the day, more agile, more reactive to changes in the business, and ideally, you want to be proactive, really And you want to do all that while dealing with security threats out there on the internet, and ideally be more safe And this is where Google and the Cloud can help For us, this region, Germany, Austria, and Switzerland, is one of the key regions on the planet It’s a center of innovation It’s the home of many multinational companies that have global influence And this importance is reflected by the investments we’re making And one of
the investments we’ve announced three months ago is local computing power It is our data region in Frankfurt, which has dramatically reduced the latency and opens up new opportunities for application development and application deployment within the region And it helps companies to keep their applications and data close to their heart We’re completely committed to GDPR
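As an aside, the per-minute billing behind the thousand-node cluster anecdote a moment ago is simple arithmetic. A minimal Python sketch, where the hourly rate is invented purely for illustration (real Cloud pricing differs):

```python
# Sketch of per-minute cluster billing, as in the keynote's example of
# a 1,000-node Hadoop cluster billed for 12 and 1/2 minutes in total.
# The rate below is a made-up illustration, not a real Google Cloud price.

ASSUMED_RATE_PER_NODE_HOUR = 0.05  # USD per node-hour (hypothetical)

def cluster_cost(nodes: int, minutes: float, rate_per_node_hour: float) -> float:
    """Pay only for the minutes the cluster actually ran."""
    return nodes * (minutes / 60.0) * rate_per_node_hour

# 90 s to spin up + 10 min of work + 60 s to shut down = 12.5 minutes
print(round(cluster_cost(1000, 12.5, ASSUMED_RATE_PER_NODE_HOUR), 2))  # → 10.42
```

The point of the example is the billing model, not the numbers: a workload that once needed a standing cluster costs only what the minutes it ran cost.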

Who’s busy with GDPR at this point in time, the General Data Protection Regulation? Many So are we And we think this is a blessing So we’re taking what used to be a directive from the European Union, which was then interpreted slightly differently by all the member states– and then if you look at the states within Germany, they might have had their own preferences We’re taking that into a general regulation which can help us, and which will help us, and we’re completely committed to this new regulation Another important thing for Germany, which has an impact globally as well, is our partnership with SAP We’ve got SAP here as well, so I’m glad to have them here And it spans three areas There’s engineering, how we work together to help the industry There’s running SAP’s technology on our Google Cloud Platform And there’s our partnership around compliance, where SAP helps our joint customers by monitoring services and making sure that they are delivered with the service delivery and quality required Cloud is used by many companies across the globe Don’t worry, I’m not going to talk about every logo here You will know most of these logos And you will probably think, gee, some of those logos, they look rather conservative, some of those companies I didn’t know that they were looking at Cloud I can tell you, they are The Cloud is real The Cloud is there And the Cloud wouldn’t be real without our partners Please make sure, while you are here, to stop by the partners’ booths and visit them, because they are the ones that take your opportunities, that take your requirements, and then make them solutions on our Cloud platform I’m very happy to announce one of our most prominent examples on stage, a retailer One of the largest retailers in Germany and, needless to say, one of the largest retailers on the planet They serve businesses and they serve consumers And they have a clear vision of how they help their customers in their digital transformation Please welcome on stage the CIO and the
chief solution officer of Metro AG, Timo Salzsieder [MUSIC PLAYING] TIMO SALZSIEDER: So good morning, everyone Happy to be here Thanks for the invitation from Google Michael already mentioned what we are doing, and just to give you a bit of an overview, you see it on the slide here– because most of you should be familiar with Metro You know these blue boxes If you’ve been around the major cities in Germany, you might also know Real, which is one of our brands But not a lot of people actually know how big we really are So it’s a $36 billion business, globally We are serving 35 countries, and our customers are mainly out of the B2B space So we’re talking about hotels, restaurants, caterers, et cetera, and we serve them mostly with food but also with non-food products It’s a global business, as I said We have 150,000 employees worldwide Within IT, we are 2 and 1/2 thousand people with a large development unit, so we have subsidiaries in Romania, in India, in China, and obviously in Germany Düsseldorf is our headquarters, and we have recently opened an office in Berlin, also to attract talent in that area And the funny story in Berlin was that we are on a campus with a couple of other start-ups there, and one of the guys approached us and asked, how is your funding doing? They obviously did not know that we are quite big So we are doing well regarding our funding So what are the challenges? As Michael just mentioned, we have the issue that we are pretty strong when it comes to offline business So all our stores are going really well, but as you know, the food sector is kind of slow compared to other areas like clothing, books, consumer electronics, et cetera And therefore, we are obviously moving into the digital space now– the digital transformation, another buzzword used within our organization And due to the fact that we have these global operations, we obviously have a couple of massive challenges

So scalability is, for us, critical, because we are present in countries like India, and just imagine that you are running some campaigns on mobile devices, sending out push messages, et cetera, which obviously means that millions of people will approach our services So scalability is, for us, really important The other issue is performance, because we just need to make sure that our users are getting excellent performance out of our applications And security– you might have seen in the news that Metro also had some incidents regarding WannaCry in Russia and Ukraine We handled it quite easily, but in the end, we are obviously attractive to some of the hackers Therefore, security is one of our top priorities The other thing I just mentioned is the digital business And you see here a couple of screenshots of our actual offerings for the digital space, our classical web shops And we implement them basically completely on our own solutions That means we are not using any external software All of our offerings, whether it’s back end solutions or internal solutions, and also web solutions, are implemented with our own people And obviously, we have challenges like providing this service globally You see a nice screenshot here of one of the apps we are running in China On the bottom right, you see one of our flagship stores in France We have these self-standing devices, these little devices on the left, and there’s a kiosk system called the heavy article kiosk on the right side And you might wonder, why are we having a web shop within a store?
So the name says it already It’s a heavy article kiosk So if you want to order, for example, a beer barrel, which is quite heavy, you order it there directly You go to the cashier You identify yourself with the Metro card, and then there’s a nice gentleman from our staff, and this gentleman delivers the beer barrel directly to your car So this kind of service we are offering in several stores We are mainly driven, obviously, by innovation, including innovation in the internal services So you see logistics is one of the topics The football fans in the room might identify one of our customers at the top right So we are serving this football club, mostly with beer and currywurst, and sometimes actually also with champagne, if they play against Dortmund, mainly And obviously, we also try to serve our customers when it comes to the touch point with their customers So for example, within the restaurants, we will be providing technology and service there And the most innovative part is on the bottom left It’s a self checkout service Just imagine you’re having a scanning device, or it’s a mobile app You scan in the products, you put them into the basket, and then you push this cart into this little device It’s a scale, so it weighs what’s in your cart The camera is on top of it It takes roughly about 10 seconds We’re doing this currently in Poland 10 seconds, and you’re good to go No cashier, no payment, no credit card It’s all in there And that just gives you an impression of how innovative we actually are So what’s this partnership with Google all about?
And obviously, we are very excited about this, otherwise I would not be here on stage The scalability topic is one of the major issues Coming back to the example I mentioned earlier, we are having these huge campaigns running in these big countries, and we are also serving countries like India and Pakistan, just to give you an example And therefore, scalability is a massive topic for us We are very happy about the unmatched performance of the Google network, because we serve, as I said, customers globally, and security is also very well covered by Google And I know that Google had a pretty hard time with our legal people and with our governance people, and they did very well, because in the end we managed to finally sign a contract with Google and are now able to roll out Google for our digital business worldwide, using the Google Cloud Platform to make sure that our customers get excellent service out of our platform The other thing we are doing with Google is obviously engineering As I said, we have a huge development community within our own organization This is actually one of our teams in Düsseldorf And we are exchanging with Google on how methodology works For example, design thinking is one of the major topics we are currently investing in, but we are also leveraging the APIs provided by Google for our own applications So machine learning is relevant for us We’re doing a lot of natural language processing, voice recognition, and image recognition So we are pretty well along on this route, and therefore, using Google as a technology partner is really essential for us, also to drive our own applications And this is obviously also something that’s relevant for us, because innovation drives us massively And we now have use cases like you see here on the screen, where we just

provide to our customers services based on Google technology For example, there’s a chef Obviously, dirty hands cannot type on a computer or any tablet, so we offer them services based on Google Home, so that they can order Metro products directly out of the kitchen So these are the kinds of services we are providing So, very happy to be a partner of Google We enjoy the collaboration very much, and thank you for your attention And have a wonderful Google Cloud Summit today [APPLAUSE] MICHAEL KORBACHER: Thank you very much, Timo We are very excited about this partnership as well, and we are very excited to be working with you on your vision I think we’ve got to do something about that soccer club That just came to me I don’t think you’ve been serving a lot of champagne lately, you know? For those not familiar, I think it’s Cologne, right? I think they’re last in the league right now, and they have a little bit of catching up to do, so maybe there’s some initiative we have to think of there I’m excited to announce the next person on the agenda, who is from what we call the office of the CTO And that’s part of the partnership that we have with our clients Please let me welcome on stage Paul Strong [MUSIC PLAYING] PAUL STRONG: Guten Morgen, München Wie geht’s? Gut?
Fantastic That’s about the limit of my German these days It’s quite sad I was actually brought up in a little village called [GERMAN] on the [GERMAN] But unfortunately, I moved back to England when I was seven, and I had no one to speak German to So now I have this accent So firstly, welcome I think right now we live in quite possibly one of the most exciting times that I can remember in terms of the evolution of technology and how it is intersecting with business and with life I think we saw with the example of Metro that there is a real transformation taking place This is not just about technology allowing you to do things faster, cheaper, and more reliably than you could before This is not about incremental change Absolutely, you can use technology to make things cheaper, faster, and more reliable, but technology today is being used to do things differently, fundamentally differently And this is what digital transformation is all about This is not about business as usual This is not about government as usual This is about doing things differently It’s about thinking about things differently Every company in this world becomes a technology business It’s about writing software It’s about getting intimacy and insight into your customers and the world around you This is what the digital transformation is all about And it’s driven by a set of technology trends Many of them have been evolving over many years, but they’ve all come together at this point in time to enable true transformation and change If you think about Cloud– for me, Cloud is not a destination Cloud is a vehicle that takes us on this digital transformation journey For the first time, with Cloud, you can get technology, and particularly the technology that doesn’t differentiate you, as a commodity In the past, you had to buy software You had to buy the hardware to run it on You had to buy the people to run the software and run the hardware Now you can make it someone else’s problem so that you can
focus on what differentiates you, what makes you more competitive This is what Cloud allows With mobility, you can now engage with anyone, anywhere You can make good guesses as to what they’re doing in their lives You’re getting intimacy And from that intimacy, you can get insight and understanding, so that you can deliver better services, whether that’s monetizing ads or whether it’s allowing people to do amazing things You see governments who put applications on phones so that their customers, their citizens,

can take photographs of potholes in the road in Malaysia, and they automatically end up on a list of things to be done by the Ministry of Transport Mobility is changing everything Social is changing everything The interaction model, the way people engage with each other, is shifting and changing In some ways, we talk a lot about globalization, and we think about it in terms of industry, but there’s a globalization of friendships My daughter, well, one of my four daughters, is 25 She’s from England She met friends online She went partying with many of them, actually in Poland, and now she’s moved to Australia, where a whole bunch of them live The way people talk to each other, the way people engage, the way people share information, and the way trends are driven are fundamentally shifting and changing IoT gives that same connection that you see with social and mobile, but with machines instead of people I’m connected I’m a cyborg Did you know that? I happen to be diabetic, and I have a continuous glucose monitor plumbed into my arm That allows my wife, 5,000 miles away in the United States, to know that I am healthy And I can monitor my 8-year-old daughter, who is also diabetic And if my wife is asleep and my daughter goes low, I can give her a call and wake her up so that there isn’t an emergency This connectedness is allowing us to get more data and insight And all of that is big data The job of technologists is to turn that data into insight and understanding, and then act on it, to be able to engage, again, back through the internet of things or with mobile This intersection of all of these technologies is profoundly changing everything we do in life, and it says that every organization and every business today is a technology business Your success will be defined by how you use and consume technology, and your failure, or your epitaph on your gravestone, will be how you didn’t, if you don’t take advantage of it What we see happening with this technology is
that technology is being used to undermine the assumptions, the foundations, that you build business models on I’ll give you a couple of examples The first one is a fairly famous one Most of you will be familiar with the stories of ride sharing and things like Uber and Lyft What’s interesting here is the story of how it came about So taxis– you know, it’s an old, established business And there are certain barriers to entry to the marketplace If you want to be a taxi driver, you have to learn your way around cities, because driving with a map in your lap is kind of dangerous And you kind of want to know that you can trust the taxi driver So in London, they have a thing called the Knowledge, and you spend three years studying all the streets and the landmarks around London And it’s quite interesting, actually So these taxi drivers– all this memorization and recall actually turns them into mutants It turns out that when you memorize things so much, it actually makes a part of your brain bigger, physically bigger So the Darwin in me says that if we look at taxi drivers for long enough, they’ll end up with these enormous heads, just like those aliens that you see in 1950s science fiction movies The point is that these businesses and their underlying assumptions were predicated on teaching drivers and validating them through licensing Now, a mobile device has GPS, it has voice, it has maps, it can tell you where to go You can summon your ride, you can see where it is, and you can give feedback at the end, so that you get a better idea of the reputation of the driver than you ever, ever could before This is a fundamental shift This is technology disrupting established businesses If you think of the spare parts business and logistics, 3D printing means you don’t need warehouses full of spares anymore You can print them on demand with 3D printers and CNC machines at the edge This disruption is coming to all of us, where technology is being used and it asks,
is that assumption that your model is built on valid anymore in the face of technology? And when it isn’t, then there’s risk,

because the small and nimble, with nothing to lose, have access to the same technologies as the largest businesses in the world And that is because of Cloud So when we think about the journey going forward, the question is, what does it look like? What are the steps? We put it down into four key areas The first, optimize Get out of the business of delivering IT that doesn’t differentiate you, and make it someone else’s problem Make it Google’s problem, because what doesn’t differentiate you does differentiate us So that you can focus on collaboration, driving teams to work together to be more productive and more effective, and through that collaboration, drive innovation And then to accelerate your business, to be able to focus on the technology that enables insight and engagement, to actually drive new value and new services and to take them to market And then, in doing all of that, you need the right partner We believe we are that partner to help you take that digital transformation journey Let’s quickly talk about optimize and what’s happening in that space For us, Google Cloud is a full suite of products, going from the low level of offering compute, network, and storage all the way up to collaboration suites like G Suite When we talk to our customers, there’s a set of drivers that we see Reliability, security, and so forth It’s the classic list that IT wants out of infrastructure And what doesn’t differentiate you, infrastructure as a service, is a differentiator for us We believe that, and our businesses, all of those billion-user businesses, have driven us to deliver better, more scalable infrastructure In fact, we build our own servers If we sold servers, we’d probably be the world’s third or fourth largest server vendor, but we don’t sell any They’re all built and customized for Google so that we can deliver services that are more reliable, faster, and cheaper than we could in any other way And we were driven
to do that by the growth of our own business And the opportunity for you is to take advantage of that It’s also about managing things in a different way Site reliability engineering takes us to a place where, when we build out infrastructure in this new world, we think differently about availability It’s all about resilience and reliability, and it’s about baking that into everything that we do The apps in this new world assume the infrastructure breaks So you have to build and design them differently, and you have to manage them differently And we wrote the book on it, something we shared with the world And I encourage all of you to actually get the site reliability engineering book, have a read, and learn how Google has figured out how to deliver at extreme scale and with great reliability I always say one of the amazing things about Google that you can learn from is that we can afford to make the mistakes that you can’t We’ve been very lucky We have a very successful business And we’ve been able to experiment and learn the lessons that otherwise would be very expensive In terms of delivering performance, we have something rather unique Google actually has a navy I bet you didn’t know that, right?
We actually have ships And what those ships do is lay cables under the ocean One of our main differentiators is that by having our own physical network, with fibers that we own ourselves going across the globe, we are in a position to make your applications perform and be more reliable than on any other infrastructure You don’t have to worry about talking with telco providers for WAN connections, because we can give you very, very fat, fast, low-latency pipes across the world, so that when you build applications and run them on Google Cloud, they can talk to each other with very, very low latency And in fact, today, something like 40% of the world’s total internet traffic goes across Google’s own private network This is pretty insane By the way, when I joined Google, I was like, insane engineering Crazy numbers We build a lot of our own networking kit To allow our customers to take advantage of this, we also now have direct connection capabilities around the world, so that you can place your servers such that they don’t have the internet between them and your services that sit inside Google’s Cloud And then there is our set of regions We talked earlier about moving into Germany and Frankfurt, and we’re moving into other areas around the world, expanding very, very fast All of this is about giving you unrivaled performance and scale In fact, our investment in delivering all of this is somewhere in the region of $30 billion

in terms of a trailing three-year CapEx investment This is so that you don’t have to make that investment We invest in infrastructure that is scalable and resilient so that you don’t have to do it And we can deliver it in a way that is very cost effective, very green, and very sustainable We use technologies like machine learning and AI to actually drive down the energy consumption of our own data centers Everything you see in the services we offer outside, we use inside, to enable Google to scale and be reliable And this is reflected in our performance numbers You will generally see that Google is there or thereabouts in terms of providing the highest performing, as well as extremely reliable and cost effective, services Finally, I’d like to briefly talk about security, because ultimately, what we hear from our customers is, all of this is great, but without security, who really gives a damn about all of this? Right? It’s just table stakes At Google, it’s everything that we think about Our entire stack is optimized for security We customize– we build our own motherboards, we build our own chipsets, we build our own operating systems– so they can be efficient, but also so that we can get rid of all of the unnecessary components, so that people can’t attack them and they can’t be compromised So we end up with a very, very secure stack We even have our own silicon, our own chips, that we put in our devices to allow us to understand their entire history In doing all of this, that security also results in compliance So with Google Cloud, when we satisfy some criteria in terms of compliance, for example, HIPAA in the United States, it doesn’t apply to some subset of Google Cloud It applies to all of Google Cloud And so all of Google Cloud is essentially homogeneous It looks the same If it’s qualified for some criteria, then all of it is And as Michael said earlier, part of that compliance story is also around GDPR And, you know, when it comes into force in May next
year, we’re working very hard, both with our customers and internally, to make sure all of us are ready A couple of quick areas around security innovation The first is Identity Aware Proxy Now, why would you care about this? Well, Google’s kind of interesting When I would work at Google, when I joined Google, we actually don’t use VPNs to connect into Google infrastructure We actually assume we have a zero trust model, and what we want is fine grained access to our applications and services inside Google So what we do is we actually do everything through a browser It doesn’t matter whether we are on Google’s Wi-Fi or outside on the internet And this actually uses a service, called the Identity Aware Proxy, that allows us to get inside Google’s infrastructure and to work with all the applications and services we need to, but with very fine grained security And this we now offer as a service for our customers Secondly, another area of innovation is the data loss prevention API, which is really around focusing on managing data, making sure that only the right people can get at the right data, and getting rid of data, or getting it out of sight, for the people who shouldn’t be at it So what I’d like to do now is to have a demonstration, and we’ll talk about DLP And to do that, please join me in welcoming to the stage Lee Boonstra [MUSIC PLAYING] LEE BOONSTRA: Thank you, Bob Yes, like Paul already mentioned, now with the GDPR coming next year, we certainly need to be aware of what’s in our database, if it’s sensitive data, our personal data in our datasets As a developer, I’m always worried about that, so luckily, we now have the DLP API, data loss prevention API, which can help you with that And the way how it works is you pass in a text, and then as a second argument, you specify some fields where you want to scan form And these fields can be passport number, credit card number, phone number, and then you can scan your dataset, whether you’re connected to your data 
pool as a batch or as a call in real time, and you get back the data that is sensitive And you can even flag it You can even mask it To show that, I’d like to show you a demo And I thought I’d create a kind of real-life demo here, like a chat application And this is just a fictional company, right? It’s called Acme customer support

And they sell anvils, you know? Like these hammers that they sell in the World of Warcraft or Skyrim And what’s happening here is like in the chat box, Angela is complaining that the product doesn’t work and the employee asks, OK, well, what is your contact number? And then here’s the thing This is always what customers do When there is an input field, they just enter everything they have Like, oh, here, this is my phone number This is my name This is my last name Here’s my credit card number Here is my passport And then at that moment, yeah, now that’s in our dataset Now that can be in our database But luckily, we implemented the DLP API here So what I’m doing here is when I enter the chat, there it goes, you see, it makes a call And immediately, it starts flagging out the sensitive data So this data won’t be stored And this chat application is just a nice UI, but what you can see here is that the actual call that’s been made returns JSON And that’s what you– yeah This is what the API looks like, and this is what you can implement So that’s kind of cool Let’s have another example here And for this example, I’d like to make use of a camera that I have here on stage And I took here a passport, a German passport, and a German bank card And we’re going to take a picture of it and see if we, in real time, can mask the sensitive data Now in this demo, just to show you that it works with these field names, this array of fields that you pass in, I can show you the fields that we’re, in this case, scanning for So I’m looking here for credit card numbers, driving license numbers, faces, passport photo But also, specific for Germany, there is a passport field And we have it for all kinds of countries So for example, for the Netherlands, we have a BSN number, which is what we put in a Dutch passport Right?
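The inspect call Lee describes (text in, a list of fields to scan for, findings back as JSON) can be sketched as a request body. This is a minimal illustration assuming the shape of the Cloud DLP inspectContent API; the project ID, the sample text, and the build_inspect_request helper are hypothetical, not her actual demo code.

```python
# Sketch of a Cloud DLP inspect request: pass in a text plus the
# infoTypes (fields) to scan for, such as phone and passport numbers.
# build_inspect_request is a hypothetical helper; the dict layout
# follows the public inspectContent API shape.
def build_inspect_request(parent, text, info_types):
    """Return the request body for an inspectContent call."""
    return {
        "parent": parent,  # e.g. "projects/my-project" (hypothetical ID)
        "inspect_config": {
            "info_types": [{"name": name} for name in info_types],
            "include_quote": True,  # also return the matched text itself
        },
        "item": {"value": text},
    }

request = build_inspect_request(
    "projects/my-project",
    "My number is 555-0100 and here is my passport too",
    ["PHONE_NUMBER", "CREDIT_CARD_NUMBER", "GERMANY_PASSPORT"],
)
# With the google-cloud-dlp client you would then call something like
# dlp.inspect_content(request=request) and read the findings back as
# JSON, which is what the chat demo flags and masks
```

The same request shape works for a one-off real-time call, as in the chat demo, or for batch scans over a whole dataset.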
So now let’s take a picture and see what happens here There you go Now, and before you guys think, like, oh, this is probably all scripted, I’d like to welcome Paul back on stage and we can take a selfie here PAUL STRONG: I like that LEE BOONSTRA: Yes, I like that, too PAUL STRONG: A selfie I like being redacted on selfies Is this going to do it for me? LEE BOONSTRA: All right Hey PAUL STRONG: Yay LEE BOONSTRA: Thank you PAUL STRONG: Thank you very much, Lee [APPLAUSE] So I think it’s a very cool demo But what it really shows is the value As we externalize sets of services and complicated technologies like machine learning, you are able to make use of those, trivially and simply Some of the applications and services you see built in the demos can be built very rapidly, and you can get value out of them And this security that we talked about is in depth in everything we do, from the client, and Chrome, and Android, all the way through to GCP at the back end What I’d like to do now– I think you’ve already heard from Metro, and I think it’s great to hear the real stories of people’s transformation At this point, I’d like to welcome to the stage another customer to share some time with you, talking about how Google Cloud has helped them transform their organization and their business And to do that, welcome to the stage, from Deutsche Boerse, Thomas Curran [MUSIC PLAYING] THOMAS CURRAN: Paul Thank you Thank you So hello, good morning Deutsche Boerse Probably none of you really know much about it, but I’ll give you a few things One is if there were a Formula One of marketplaces, we’d be racing in it We do about 200 million contracts a month in marketplace transactions If we were a bank, we’d also be one of the big boys We have about 13.1 trillion in assets under management So we have the very fast stuff and the very big stuff, and in the middle, we have quite a lot of challenges, dealing with a very complex set of businesses Security was just mentioned
I think if anyone in the room can imagine themselves building an infrastructure that’s more secure than Google, good luck

I’m a Google Cloud junkie, and one of the reasons is because I believe it’s the most secure infrastructure you can buy anywhere That’s very important for our business because of the size, but it’s also very important because security today is not a one way street It’s something that you have to be able to count on over time And if you go and talk to Google and look at their security infrastructure, you’ll see why The best minds The best technology Compliance, we obviously have a lot of rules in our business, and implementing those rules and doing it over time with a trusted partner is very important But also, we understand flexibility is also an issue Over time, compliance changes because of new regulations and new things that happen in markets, new instruments, disasters, 2008, so compliance is a very big issue, and we need a partner to deal with that And by the way, two years ago, when I brought up Google Cloud, I think it’s fair to say people laughed me out of the room They said that Google doesn’t understand this business But they do Google is not the only Cloud provider that we use The reason for that is we also need to have flexibility in the way we deliver services And we develop software on multiple platforms, so Google gives us a lot of capabilities to move between those Cloud platforms, and in particular, what’s very important is some of the new things that were mentioned here, the compute platform, Kubernetes, in particular So we have these things that give us multi Cloud capabilities, and that’s very important for our business Why is that important? What are we doing? 
We’re building a global platform Deutsche Boerse is in 27 countries across the world, mostly in Europe, but also in the United States and Asia And to serve customers in those countries, we need a scalable and resilient platform to deliver our business services On top of that, we’re building new business, business that relies on a Cloud infrastructure for real time smart data and analytics Those analytics are built on very new technologies that we deliver in the Cloud, and we deliver them via an API platform And the ecosystem of trusted Fintech companies that we’ve been able to sign up and work with is growing And if you’re in Fintech, please talk to me about it All of that is new, and as we already heard today multiple times, the world is going digital Digitization is important And in our business, developing new digital business models and delivering new digital products is very, very important to growing the business Here’s one example It’s very concrete We have a big, giant pipeline of real time data coming out of our trading infrastructure, again, 200 million contracts a month Those are not just derivatives They’re cash, they’re foreign exchange, there’s reference data By the way, we also run the largest energy trading infrastructure in Europe We use different technology standards For instance, Pub/Sub from Google, we use PCAP, which is a network standard, we use CSV, we use XML, we use JSON We put all of those things in different formats into a data lake on Google Cloud Storage That’s very important because again, in the past, flexibility was not a given You had to structure information to store it in large infrastructures Today, that’s changed a lot That’s changed a lot because you have new technologies that let you then transform that information, big data, BigQuery, TensorFlow, machine learning To operate in Google Cloud, we use a very great invention from Google, by the way, called the Go language If any of you have ever tried it, I suggest
it We use it because Google learned, as we learned, that to manage large, complex infrastructures, you need speed and reliability So we’re building our bots in Go We’re then building data marts in Bigtable And then we’re providing those to customers via our digital business platform, which is, guess what? Also written in Go and has APIs that then customers, partners, academia, people we never met before, can now consume data from Deutsche Boerse group, in particular market data, from this public dataset

based on BigQuery We’re doing all that because, again, it’s not only that I’m a Google Cloud junkie We’re doing it because we see a growing and emerging new kind of computing infrastructure that we believe will deliver better results, will deliver better security, and over time, gives us the reliability we need to build a global infrastructure that will help our business grow Thank you, Google, for having us [APPLAUSE] Thank you PAUL STRONG: Thank you so much, Thomas THOMAS CURRAN: You’re welcome, thanks PAUL STRONG: Really appreciate it, and we really appreciate your partnership So this morning we’ve talked– had customer examples, we’ve talked a lot about digital transformation, the stages of the journey, the notion of optimize Get out of the business of the IT that doesn’t differentiate you Next, we’re actually going to explore collaboration, he says, if the slide will come up So optimize Get out of the business of delivering the IT that doesn’t differentiate, and focus on what does differentiate And then what you really need to do is to drive productivity and the environment that drives innovation That’s all about collaboration And to spend some time talking to you about that and sharing where we are going at Google, and how you can take advantage of our work, please welcome to the stage, Bill Hippenmeyer [MUSIC PLAYING] Thank you, sir BILL HIPPENMEYER: Good morning Great to be here So you might ask yourself, where does collaboration fit into the power of the Cloud? We could talk about AI, and we could talk about transformation services, infrastructure capabilities, and all the things that are involved in being able to take workloads from your systems and your environment, and take advantage of the leverage of the network that we provide in all those things Why collaboration? Where does collaboration fit in?
It is our belief at Google that collaboration is the beginning of this transformation process If you think about it, transformation is the end result of a big idea Where do big ideas come from? They come from the creative juices and forces that are happening within your organization Creativity is all about what? Creativity begins with engagement Engagement is when one, two, three, and more people gather together and they begin to synthesize new ideas One person has a concept, a problem, a solution, some thoughts about something, but then that’s immediately impacted by the people and organization around them And ideas come from everywhere They come from the top of the organization where we’re doing strategy and thinking out forward They come from the grass roots of the organization where in retail stores, in manufacturing plants, and across the business, people are coming up with new ideas, new approaches, and new ways to solve problems That has to be facilitated with an infrastructure In today’s world, I’m going to state the obvious You’re going to say, Bill, we get it Right?
Things are moving faster If you look at the innovation cycles in today’s world, if you examine the product lifecycle and the ability to bring new products to market, you no longer have 20 years to bring a new BMW to market You no longer have 20 years to reconstruct new life science approaches, or think about new transportation logistics solutions, or to rethink digital communications across the globe What you must do is you must shrink that lifecycle You must accelerate ideation And the only way to do that, we believe, is to be equipped with a proper set of tools to do so The Google Suite that we have constructed, and the tools that we would invite you to use, are built around this ability to gather people together in real time to collaborate to come up with the big idea, which is going to lead to the transformation that you’re looking to do So we have a set of tools that allow you to create, and to raise creation from an individual activity to a group activity where you’re working in real time We provide tools that allow you to connect Connect in real time, exchange ideas, and be able to riff off each other and synthesize new thoughts in new areas We do that with the ability to quickly search and access content in one place, held in Google Drive, and then of course we have to give you

the control, the security, the compliance, the DLP capabilities, to be able to protect that content so that you can be compliant with things like GDPR And when you do it, there’s an impact PWC adopted the Google Apps or G Suite platform several years ago They rolled it out worldwide, and they saw a dramatic impact in terms of their productivity If you think about the world that they live in, their consultants are working every day, both internally and externally, to be able to try to advise on financial issues, tax issues, and a variety of other consulting areas What they have found is that by being able to reduce the amount of travel they have through Google Hangouts and being able to connect real time, through collaborative document editing and inviting both the customer and the suppliers that they’re involved with into the conversation, they’re finding that they’re able to get things done faster And the ability to be able to connect in real time, those things have led to real savings In a study that they did, they came back and said that on average, they are saving nine hours per week per employee, and they have a lot of employees Those are real numbers And the financial people love the impact of that But why can’t we just use the tools we’ve been using for years to accomplish the same thing? Our argument and our platform says that the way of collaboration that our competitors are offering is really an old school set of methods They haven’t really changed, fundamentally, the way you work In fact, I often talk to C level executives that have adopted the competitor platform, and I’ve asked them this one question What are you doing different today now that you’ve adopted the latest from your competitor, and how has that transformed the way you work, and how has that accelerated your ideation platform?
And in every case, they say, well, Bill, that’s not what we were about We wanted to eliminate the impact of change management on our organization We didn’t really want to change anything We liked the way we were working And I’m like, really? Is that what you want to do? You want to just work the way you’ve been working for 20 years, where an individual creates a document, or a spreadsheet, or a presentation, and then they send it out in email as an attachment as a way to collaborate and be able to exchange ideas? If you were going to come up with the top 10 vacation list ideas in a matter of minutes, would you get into a document and list the top 10 and then send it out to 10 of your colleagues and wait for them to reply? No You would actively engage with them in real time, most likely in a conference room or in a Google Doc, and say, let’s come up with the top 10 quickly And when everybody is typing at the same time and they’re all interacting in a collaborative way, what happens is you begin to raise the ideas quickly You begin to collaborate and have visual inspection of what other people are doing And now what you can do is you can avoid the need to replicate what Greg has put down So Greg and I are collaborating Tom and I are collaborating Alexander and I are collaborating And I can quickly see they don’t like Cancun, Mexico They prefer Nice or Malaga, Spain, for a vacation That’s their preference They’re Europeans, and I don’t have that experience I do live in London now, which has been a really eye opening experience and I’m really enjoying it All right And then the most important thing that I really love about the Suite is the ability to connect in real time The reality is that we don’t have to wait for the Thursday afternoon meeting that’s scheduled for one hour because all meetings must go an hour That’s just the rule of meetings Right?
That’s not the case When I can have an idea or an inspiration and I can literally reach out to Joe and Karen and say, can you join me on a Hangout for five minutes and let’s walk through this concept and this idea And when I ping them and they come back to me and they say, yep, I’m available, and we get on a Hangout, in five or 10 minutes, we can move the idea forward in a way that a meeting on Thursday afternoon at 3 o’clock would not do So the ability to instantaneously reach out, click of a button, and connect in real time is a powerful capability And that’s what we’re doing And then lastly, and you’re going to hear a lot more about this, is the application of artificial intelligence to this environment AI, in its simplest form, is really about observing patterns and then being able to recognize the outcomes that are associated with those patterns, and then being able to do something that’s predictive and being able to actually look forward I think of it like this If I asked you to be a diving judge at the next Olympics, how qualified are you today? Probably not very well qualified I’m not But if you sat with an Olympic judging expert and watched 100 dives, you’d be a little bit better What if you watched 1,000 dives? 10,000 dives? 100,000 dives?

A million dives? We are literally taking samples from a billion people that are simultaneously using our products, and we’re able to look at what they’re doing and predict what’s next And one of the great examples of that is Explore in Sheets If you regularly get a dataset, maybe it’s point of sale data, and you’re trying to ask a simple question like, OK, Google Show me the highest profit product that was sold during the months of September, October, and November across the different states in Germany, and illustrate– raise to the top the most productive or the best margin product during that period You don’t want to have to write the formula that does that But Google has been watching the way experts analyze data And from thousands and millions of examples, they can actually– we can actually predict what’s next and offer some insight, immediately, when you open it up So all you have to do is hit the Explore button, and now I have a set of graphs, and charts, and pivots, that are already available to me to be able to examine And I can do that without having to do any work whatsoever, and then I can start to ask questions, and Google will actually write the formula for you So we’re doing a lot of work in AI, and that’s only one example of many, in terms of what we’re doing to apply artificial intelligence and machine learning to your work as you collaborate So let’s not just take my word for it What I’d like to do is invite a customer to come up and talk a little bit about what they’re doing in their application of G Suite So I’m happy to invite Alex Pollmann from Viessmann to come and share his experience [MUSIC PLAYING] ALEXANDER POLLMANN: So yeah, welcome Good morning from my side I’m really happy and proud to tell you that we have gone Google this year And we have gone G Suite It’s a digital transformation at Viessmann with G Suite Viessmann is a traditional family owned business, and we are producing heating boilers, cooling systems, and
also some industrial systems Our revenue is about 2.2 billion euros, and we have 12,000 employees all over the world So we started our digital transformation around 2015 because we saw that new business models came up, new competitors, also a company of Alphabet, Nest, which I also have on my slides They came up, and they are taking on our core business, and also our thoughts about new businesses So what we have done is we set out a strategy for our digital transformation And there was a project called the future workplace So the future workplace is more about– at the very beginning, it’s more about evaluation of a platform solution So we came from an IBM Notes infrastructure We have run IBM Notes for 22 years, and you know it is a big, growing infrastructure So our priorities also lay on usability Because if you want your users to actually use the software, you have to give them easy software Functionality was also an [INAUDIBLE] Also, the IT security All of you heard about the security layer of Google This is maybe also one of the reasons why we chose Google, because we think it is the most secure platform for us to use So the integration part, and also the costs, were also prioritized in the evaluation At the end, we decided to move to G Suite and we are really happy We now have a workplace that is easily accessible on any device, at any place, at any time Paul mentioned, our people should engage any time, any place They should easily get together They should create ideas We have now more flexibility, more security, more team collaboration, and also more transparency at the same costs So for our main achievements, we have started the global rollout for G Suite this year, in June And we have on average 1,500 people each month coming to G Suite We have migrated them And 15% of our users are using bring your own device services Just imagine if, like, 1,200 people spent 10 minutes each day in their private life with their work
environment That is, like, 200 hours a day your company will save And 1.7 million emails don’t need to be archived anymore
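Alex’s back-of-the-envelope figure checks out with quick arithmetic, using the numbers he quotes on stage:

```python
# 1,200 people each spending 10 private minutes a day in their work
# environment, converted to hours gained per day across the company
people = 1200
minutes_per_person = 10
hours_per_day = people * minutes_per_person / 60
print(hours_per_day)  # 200.0, matching the "200 hours a day" on stage
```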

In IBM Notes, we had just 100 megabytes of space left in your email Now the storage is unlimited It’s great And everyone uses two-factor authentication So everyone can reach his work environment from everywhere And he just needs to get his smartphone [INAUDIBLE] and say, yes, I am So 4,500 Google Meets are done each month So as you’ve heard, it’s just six months ago we started the rollout, and this number is increasing quickly Also, it’s, like, 4,500 If you think about those 4,500, and just one Google Meet in 100 replaces a business trip, as a company you will save around 45 to 50 business trips every month, and your people can engage everywhere, at any time So the last part is all about collaboration We don’t send any documents anymore It’s all, like, we share it, and collaborate in one document and all this stuff I really like it And I really would like to thank all of my team colleagues, and also our partner, [INAUDIBLE], for the great support we had during the migration But I’m pretty sure that this is just the beginning of a big change at our future workplace And I’m really looking forward to all the new cool stuff that will come from Google in the future Thank you, guys [APPLAUSE] BILL HIPPENMEYER: Thank you, Alex, for sharing I really appreciate that So we’ve talked about optimize, about leveraging the power of Google infrastructure, network, and data center services Hopefully you’ve got some insight into why collaboration is so important to us What I’d like to do next is take you to the next level, which is accelerate In acceleration, it’s really about learning and growing, and harnessing the power of big data, machine learning, and technologies like that, that will really help you be able to move faster, and to move better So with that, I’d like to introduce my friend and colleague, Greg DeMichillie, who’s going to talk a little bit about accelerate [MUSIC PLAYING] GREG DEMICHILLIE: Thanks, everyone I’m really excited to talk about this
section of the talk because, as Paul said, going into Cloud to just shift a little bit of CapEx to OpEx or save 5 or 10%, really misses the opportunity that it provides you The opportunity to not just save money, not just to affect your cost side of the business, but to drive new business And the best way to think about that is to think about it in terms of, how do you get value out of data? Think about it As companies, we have become digital pack rats We store ever increasing amounts of data By some estimates, we’ll have 163 zettabytes of data per year by 2025 I had to Google what a zettabyte is, by the way Just in case you don’t know, I believe it’s a trillion gigabytes And why is it that we’re doing that? The Cloud has made it possible To store a petabyte of data 10 years ago, you had to get a robotic tape archive You had to spend millions of dollars on special machinery Now, for a penny a gigabyte a month, you can store every bit of data from your factory, or your business, or your retail operation So the problem has shifted from, how do I store that data, to how do I find the needle of value in that haystack of data? Now the way this works for most of us today is big data Right? That’s the answer Big data And we know something about that at Google, because most of what is in big data today dates to research papers that Google published 10 years ago Big data is built on a foundation of work that we did within Google And we went through this pattern which is we build these big data systems, we run them in production, the data grows by a factor of 10, we have to re-architect the storage system, and the result is, if you look at most data big data projects within enterprises today, they’re not achieving the results that they promised And why is it? 
It’s because if you look at where you actually spend your dollars and your time, it’s not on getting insights You spend way too much time on building maintenance, logging, backup, disaster recovery, oh no, the data is 10 times as big, let me rebuild my sharding system Nobody enjoys doing that, yet we do it time and time again
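The storage economics Greg quoted a moment ago work out with simple arithmetic (using decimal units, where 1 PB = 1,000,000 GB), and his zettabyte definition checks out too:

```python
# A penny per gigabyte per month, applied to one petabyte of data
cents_per_gb_month = 1
petabyte_in_gb = 1_000_000
monthly_cost_dollars = petabyte_in_gb * cents_per_gb_month // 100
print(monthly_cost_dollars)  # 10000 -> about $10,000 a month, no tape robot

# A zettabyte really is "a trillion gigabytes":
# 10^21 bytes divided by 10^9 bytes per GB is 10^12 GB
zettabyte_in_gb = 10 ** 21 // 10 ** 9
```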

You spend too much time taking care of the machinery, and not enough time actually understanding the data And again, we know this is the case because we went through this at Google We went through this exact same pattern We built data systems, we thought they were great, the data got too big, we had to rebuild them, and we realized we had to have a better way We had to have a different way to think about big data So we set about building out an entirely new set of data and analytics products around the simple idea of reversing this ratio, of inverting the amount of time you spent taking care of the system and the amount of time you actually spent getting insights So across our entire data portfolio, you will see products built natively for the Cloud where all of the work of scaling, and infrastructure, and management, and even DBA, is taken care of for you And we’ve also gone one step further, because we also realized at Google, that traditional batch analytics aren’t good enough The days when you could have a weekly report of your business and call that acceptable are gone If you’re a retailer, you want to know what’s happening today, not what happened two weeks ago If you’re a manufacturer, you want to know that you’re having production problems today, not on a quarterly report when it’s too late to do anything about it So we built our systems around the notion that batch and streaming are the same thing That it should be as easy to get real time insights as it is to get a daily, or weekly, or quarterly business report And so when you look across our entire data portfolio, whether it’s BigQuery, which is a data and analytics warehouse in the Cloud– literally, upload a petabyte of data, write a SQL query Done No DBA, no sharding, nothing to set up Or Dataproc, which lets you take your existing Hadoop clusters, bring them in the Cloud, and get out of the business of managing the Hadoop infrastructure Or Pub/Sub and Dataflow, which you heard about from our customers, using
that to transform the way you ingest data in real time What all these products have in common is they get you out of the drudgery of maintaining the machinery And they let you focus on the fun part I know of no data analytics professional who got into it because they liked writing sharding systems So do the fun part, the understanding and analyzing of data And we see this having an impact on customers Spotify, when they came to Google, if you talk to their execs, they will tell you, the number one reason that they wanted to use Google was to give them a competitive advantage in data Think about it Every time you skip a song, or you play a song, or you search for a song in Spotify, that’s data that helps them predict the next perfect music for you Or in the banking industry, Lloyd’s, where they say they literally went from 96 hours to 30 minutes in terms of their ability to go from an event happening in the market to them being able to make a decision based on that Now, what’s the next step of this? Sure, we can build data warehouses You’ve heard mentioned several times this notion of machine learning and artificial intelligence And we really think that is the next step It’s the next evolution in how you do data analytics And to prove just how much we believe in it, our CEO, Sundar Pichai, has said that this is a core transformative way where we’re rethinking everything we’re doing And that is literally true at Google There is not a project or a product inside Google that isn’t being rethought in terms of machine learning and artificial intelligence If you think about the consumer side, every way that you interact with Google now has an AI or machine learning component, whether it’s your Android smartphone and the way it does speech recognition or auto correction Or how many of you have used Google Photos and typed birthday party clown and found a picture from three years ago of your child’s birthday party when you had a clown come in?
Or when you’ve used Gmail, as Bill talked about, where you’ve had an automatic reply generated All of these are built on the notion of artificial intelligence and machine learning Now it’s reasonable to ask, why? I mean, Google has smart programmers Couldn’t we have just done this ourselves without using ML? And I think it’s important to think about why ML was so transformative ML lets us solve problems that we don’t know how to solve Think about when you learned your first language Unless you have fairly unusual parents, I don’t think they handed you a grammar book and a dictionary But that’s what we do as programmers We teach our computers with grammar books and dictionaries You learned your first language by being

exposed to it, by example Your family had a dog Your parents said dog You saw the dog Literally, the chemistry in your brain changed, and you learned that that piece– that animal was a dog ML allows us to take that same technique Learn by example, not by rote, and apply it to business problems Think about your factories, your customer data, your sales data, where there is hidden value and connections in your data that you’re not taking advantage of ML and AI lets us do that So we’ve built a complete platform around AI and ML At the bottom is all that same infrastructure that you heard about earlier The network, the compute, the storage Because at the end of the day, ML is a real big data and compute problem On top of that, we build platforms, libraries, and tools, tools that let you take your business data and train your own models without having to worry about how to scale them, or how to deploy them, or how to manage them And at the top, we’ve taken those very same services that we offer consumers, services that help build systems that see, hear, speak, listen, and we’ve given those to you by APIs So you don’t even have to be able to spell ML in order to build systems that can see, hear, speak And that lets you match the tool for the job If you are an ML researcher, and you want to be on the cutting edge of defining how machine learning works, you can use our open source SDK TensorFlow If you’re a data scientist in your company, you can use Cloud ML Engine to build and train models using TensorFlow but without having to get into the details of how you build and scale the infrastructure And if you’re an application developer, you can use a simple API to build vision, speech, text, hearing, directly into your applications To give you an example of that, I want to show you an example of using those perception– those Vision APIs inside a real application So to do that, I’m going to have Lee come back up, and she is going to show you an application that actually knows 
how to see, hear, and speak Lee Lee? LEE BOONSTRA: Thank you, Greg Thank you Greg I’m actually very excited to show the Vision API to you because, yeah, I’m not a data scientist I’m an app developer So I can still play around with machine learning, and I don’t need to train my own models No, it’s pre-trained It’s pre-trained by Google, so I can just use it I can just make an API call, and I pass in an image, usually you upload an image into a Cloud Storage bucket, and then I make a Vision API call, and the Vision API can analyze the photo And it can see landmarks, it can see persons, it can see objects It gives you all that information back, OCR detection, it’s very nice So that is something I’d like to show you today I created, over the weekend, a simple app where I uploaded two pictures here I uploaded a photo of a landmark and a photo of a person So let’s have a look into the landmark picture first Now, I thought for this presentation today, my coworkers, they made this photo, and they say it’s Munich But yeah, I was actually wondering, is this indeed Munich, or did they just take a normal picture from another beautiful European city and put that in the slide deck? So what I did is I made a screenshot of it, and I uploaded it into a bucket And then I checked it with the Vision API The Vision API first gives me back color information OK, that can be handy if you’re a designer and maybe you want to dynamically build your page You have a dark background and you don’t want to have a dark photo on it, so color information could be handy But it also recognized it is indeed a landmark It recognized this is a city It gave me back a latitude and longitude coordinate And with that, I can plot it on a Google map, so that is what I did And yes, it’s indeed Munich, and to be very precise, it’s actually in the [INAUDIBLE], if I pronounce it correctly So, yeah, that’s kind of impressive I thought, that’s cool What else? 
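A minimal sketch of what such a landmark call can look like from Python, purely illustrative: the bucket path, the sample response values, and the `summarize_landmark` helper are assumptions of this sketch, not Google's published sample; only the response-shaping helper actually runs here.

```python
# Hypothetical helper: reduce a list of landmark annotations (plain dicts
# mirroring the Vision API's JSON shape) to the highest-scoring guess.
def summarize_landmark(annotations):
    if not annotations:
        return None
    best = max(annotations, key=lambda a: a.get("score", 0.0))
    loc = (best.get("locations") or [{}])[0].get("latLng", {})
    return {
        "name": best.get("description"),
        "score": best.get("score"),
        "lat": loc.get("latitude"),
        "lng": loc.get("longitude"),
    }

# The actual API call (assuming google-cloud-vision is installed and
# credentials are configured) would look roughly like:
#   from google.cloud import vision
#   client = vision.ImageAnnotatorClient()
#   image = vision.Image(source=vision.ImageSource(
#       image_uri="gs://my-demo-bucket/munich.jpg"))  # made-up path
#   annotations = client.landmark_detection(image=image).landmark_annotations

# Illustrative response, not a real API result:
sample = [{
    "description": "Neues Rathaus",
    "score": 0.86,
    "locations": [{"latLng": {"latitude": 48.137, "longitude": 11.575}}],
}]
print(summarize_landmark(sample))
```

With the latitude/longitude pulled out this way, dropping a marker on a Google Map, as in the demo, is a one-liner in the Maps client of your choice.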
It did recognize some labels of what it did see, and yeah, and I translated it It’s in English, but I connected it to the Translate API so that I get the labels now back in German And it tells me, yeah, it’s a tower It’s a [INAUDIBLE] I’m 86% sure it’s that That’s the confidence level
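That label-plus-translation chain might be sketched like this; the Translate call is shown only as a hedged comment (assuming the `google-cloud-translate` v2 client), and the label data below is invented for illustration:

```python
# Pair each English label from the Vision API with a translated string
# and its confidence, formatted the way the demo displays them.
def format_labels(labels, translations):
    return [
        "%s (%s, %d%%)" % (translated, label["description"],
                           round(label["score"] * 100))
        for label, translated in zip(labels, translations)
    ]

# Hedged sketch of the Translate step (assumes google-cloud-translate):
#   from google.cloud import translate_v2
#   client = translate_v2.Client()
#   results = client.translate([l["description"] for l in labels],
#                              target_language="de")
#   translations = [r["translatedText"] for r in results]

labels = [{"description": "tower", "score": 0.86},
          {"description": "city", "score": 0.85}]
translations = ["Turm", "Stadt"]  # made-up stand-ins for API output
print(format_labels(labels, translations))
```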

The confidence level says it’s 85% sure it’s a city So yeah, that’s probably correct And it also has the OCR detection, so it detected the welcome in Munich OK Now, let’s do the same example again but with a person And I picked a Dutch guy, but you guys might know him as well Right? Yeah I thought this was pretty impressive, that Vision API did, because the labels that it gave me back, it actually recognized this is a soccer player It’s 83% sure it is a soccer player And I thought that’s pretty impressive because I don’t see any soccer balls, or no footballs, here on this picture I just see some green grass, but that could be any other sport as well So, like, the Vision API did understand, like, yeah, that’s a soccer player It’s definitely a player, it has something to do with sport, I see grass, I see team sport, so that’s cool But what the Vision API did not do is it did not have facial recognition It did not tell me who this person actually is But we are Google, right? And we know what other pages on the internet make use of this photo And based on that, we can also get some entities And here, it says it’s actually Arjen Robben, and he’s from the Bayern Munich team And this picture was made in the Wembley Stadium and it had something to do with the European Cup That’s pretty cool Now– [APPLAUSE] Now obviously, this is just a simple demo, right? But you can imagine that when you want to use this in a real life use case– let’s say you have a hotel website and you paid your photographer to make pictures of these hotel rooms You don’t want other websites, other hotel websites, to steal your pictures, right? 
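The hotel use case described here maps onto the Vision API's web detection feature, which returns the pages on the internet that use a matching image. A sketch with a hypothetical filtering helper and made-up URLs; the real client call is only hinted at in the comment:

```python
# Given a web-detection result (plain dicts mirroring the API's JSON),
# list pages that use the image but are not on our own domains.
def external_matches(web_detection, own_domains):
    pages = web_detection.get("pagesWithMatchingImages", [])
    return [page["url"] for page in pages
            if not any(domain in page["url"] for domain in own_domains)]

# Hedged real-call sketch (assumes google-cloud-vision):
#   detection = client.web_detection(image=image).web_detection

sample = {"pagesWithMatchingImages": [
    {"url": "https://my-hotel.example/rooms"},      # our own site
    {"url": "https://copycat.example/nice-rooms"},  # someone else's
]}
print(external_matches(sample, ["my-hotel.example"]))
```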
So you could use the Vision API, for example, to detect what other websites are making use of your pictures Another example could be in real estate where you make a photo of a house, and that immediately sends you to the website of the house listing and says, yeah, this house is for sale Our customers are doing that Or maybe in finance– you have lots of documents with a lot of text in there Can be handwritten text, can be typed text, and we use the OCR detection to get all the text out of the documents So there are lots and lots of examples here This is why I’m so excited about it I’d like to hand it back to Greg Thank you GREG DEMICHILLIE: Thank you, Lee [APPLAUSE] Thank you, Lee It was really great to see so many different APIs brought together in one simple-to-use application Now that’s what we’re building at Google I think it’s really useful to hear what customers are doing with ML and AI And so I’m really pleased to bring on stage a customer who knows something about machine learning and how to turn it into business value She’s the co-founder of one of Europe’s most exciting new start-ups, dealing with customers as interesting as Porsche And she did all this while completing her bachelor’s degree in E-commerce Please join me in giving a warm welcome to Michelle Skodowski from BOTfriends Michelle [MUSIC PLAYING] MICHELLE SKODOWSKI: Thank you Good morning, everyone So I’d like to start with a little story There are many companies out there that people crave to work for And I believe that Porsche is maybe one among these As generations Y and Z are becoming older and ready to enter the business world, it becomes more and more natural to them to inform themselves about career opportunities, not only via website or email, but simply on social media such as writing on Facebook Messenger All right So that’s the jobs and career Facebook Messenger And what we can see here is that someone is asking questions regarding internships at Porsche All right, after checking the 
requirements, he unfortunately sees that he doesn’t really fit them, however, he asks for more details on dual studies All right, he gets led to an integrated [INAUDIBLE] and he can check on all the offers on dual studies So that is really, really easy He instantly receives all the answers he wants So the special thing here is that there is actually no real employee writing back But this cute robot here on the right, and it’s a Chatbot And he’s the first Chatbot in the automotive industry,

instantly answering questions regarding job offers, internships, dual studies, et cetera, by using Google’s natural language processing service, Dialogflow And just last week, we’ve been nominated, oh no, actually, we won the human resources Excellence Award in HR tech and data with this product So what we do [APPLAUSE] Thank you Thank you very much So what we do at BOTfriends is we create conversations over messaging platforms, but also voice assistants, with the help of artificial intelligence So our goals are to meet higher customer expectations on 24/7 availability, and also to establish messaging platforms as a new way of communication Since we spend 90% of our time on email and messaging platforms, and since the time we spend there has actually surpassed the time we spend on social media And we know that Facebook Messenger has reached over 1.3 billion users, so that’s quite impressive But now, why did we choose the Google Cloud Platform to develop such an innovative product? So obviously, one of the first reasons is, we already talked about this today, the Google Cloud Platform is super secure, and for companies such as Porsche, this is really, really crucial We are also using all the amazing Natural Language APIs provided by Google, such as the Translation API For companies such as Porsche, it is important to reach all their potential applicants, because we know that people with different backgrounds, cultures, perspectives, are definitely contributing to the company’s success and we can enable conversations with everyone, regardless of language and origin We are also able to run [INAUDIBLE] tests on the App Engine, which is so crucial for our MVP approach So by distributing the traffic on different Chatbots, we can see which one is performing the best, and which one is the one we should actually further develop And most importantly, we are able to scale up automatically with the [INAUDIBLE] environment So the Chatbot can handle endless conversations at the 
same time, and our developers are able to focus on what they do best Write code instead of being busy balancing the load All right, so after we launched that product, we’ve seen amazing results The great service Porsche provided has led to an increased engagement by over 100%, and a reduced workload by over 25% So employees at Porsche were actually able to focus on more complex requests After the launch, we are planning to use the PostgreSQL database to enable creating user profiles so we can offer subscriptions to job vacancies So Porsche fans will always stay up to date whenever their dream job comes up And last but not least, we want to use a sentiment analysis API by Google to enable human handovers So whenever a conversation escalates, let’s say the customer is not happy with the Chatbot or the conversation, is unsatisfied, we can just easily direct him to a real employee to escalate the conversation The last thing I have to say, if you want to hire the best people, provide them with a great service And with the Google Cloud Platform, we’ve been able to develop such an innovative product Thank you very much, and have a great day [APPLAUSE] GREG DEMICHILLIE: Thanks so much, Michelle Thank you so much That’s just such a great example, again, of tying together all these machine learning capabilities in a way that really makes a difference So that’s really just a snapshot of what we try to do to help you accelerate your business I could have talked about application development There’s so many different ways But I really do think, at the end of the day, we can help you get more value out of what really is the most valuable thing you have As someone said earlier, it’s the data in your data center that matters, not the data center itself And so we really think that we can help you by building out a set of products to let you find value out of that data So now I want to turn over to the next section of today, which is to talk about how you build a partnership 
relationship Building a relationship with a Cloud provider isn’t just a short term relationship It’s a long term one, and you want to have somebody who’s in there with you every step of the way So to talk about partnership and the role that we think we can play in that, please welcome Tom Grey from Google to talk about that [MUSIC PLAYING] TOM GREY: Thank you, Greg And thank you all for coming I’m really, really excited to be here I think this is an extremely exciting time And I’m here to talk to you a little bit about partnership

We believe that customers are looking for something new They’re looking for new technologies They’re looking for new business platforms, new business models They’re looking for a new place to do new things But they’re also looking for a new partner, a new partner with a new approach Over the course of this presentation, we’ve looked at the different aspects of digital transformation We’ve talked about how you optimize your business We talked about how you divest the parts of the IT organization, and technologies that aren’t differentiating to you We’ve talked about how to collaborate We’ve talked about how you use collaboration to accelerate and to drive innovation in your organization And we’ve talked about how you use data And you can use data to differentiate your business At Google, we believe we have one of the foremost platforms to allow you to do those things We believe that it’s at the cutting edge in terms of capability, in terms of reliability, in terms of security But we also believe that that is not enough We believe that to be successful, you also need a partner, a partner that understands the journey, but also a partner that understands the destination At Google, I’m proud that we have 18 years of innovation behind us, and I’m confident that we have many more ahead of us In that time, we’ve had the chance to build some of the leading platforms for mail, some of the leading platforms for search, for mobile Heck, we’ve even got to build self-driving cars We’ve had the chance to try some things that, really, many people only dream of And we’ve also had the chance to, frankly, make some mistakes that, really, few could afford to make And that’s what we want to bring to this partnership We want to bring and we want to share that learning We want to share that experience At Google, we have a saying Focus on the user, and all else will follow We want to build a platform that focuses on you We want to build a platform that’s flexible, a platform that’s open 
We want to build a platform that has easy to understand, customer friendly billing We want to build a platform that enables you to build and to run the applications you want to run, but also the applications that you need to run We want to put you in control of your IT To give you some examples of how we’re doing that, I want to talk about our sustained usage billing So sustained usage discounts is a scheme whereby the more you use of our Cloud, the less we charge you per unit We want to make billing simple And so with sustained usage discounts, those discounts are applied automatically You don’t have to do anything You just use the Cloud, and we automatically work out what you should be discounted, and we apply that discount on your behalf, without you needing to lift a finger We also want to make it flexible So we offer that sustained usage discount without any kind of commitment upfront on your part, whatsoever So it’s a simple, it’s a flexible, it’s a customer centric model for billing, we feel We’re also extremely committed to openness and to open standards Google as a company was born on the internet, and the internet was born of open standards, and it’s thrived as a result We are deeply, deeply committed to openness, and we’re deeply committed to allowing you to avoid vendor lock-in Vendor lock-in kills innovation Vendor lock-in kills innovation because it doesn’t allow the best technologies to win And lastly, people As I said before, we have some amazing people in Google, and some amazing people that really have the chance to build some really amazing things And we want to bring these people to you We want you to work together with these people to build the next generation of crazy cool stuff on the internet And you might say, that’s great Crazy, cool, I love it That’s kind of cool, that’s great But come on, Tom What about everything else?
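As a concrete illustration of how the automatic tiering in the sustained usage discounts described above can work, here is a sketch. The percentages follow the tiers Google published for Compute Engine sustained use discounts at the time (full rate for the first 25% of the month, then 80%, 60%, and 40% of the base rate), but treat those figures as an assumption of this sketch rather than current pricing:

```python
# Billable usage for one instance under tiered sustained use discounts.
# `fraction` is the fraction of the month the instance actually ran (0..1);
# the return value is the fraction of full-month list price billed.
TIERS = [(0.25, 1.0), (0.25, 0.8), (0.25, 0.6), (0.25, 0.4)]

def billable_fraction(fraction):
    billed, remaining = 0.0, fraction
    for width, rate in TIERS:
        used = min(remaining, width)
        billed += used * rate
        remaining -= used
        if remaining <= 0:
            break
    return billed

# Running the full month bills 70% of list price, i.e. a 30% discount,
# applied automatically and without any upfront commitment.
print(billable_fraction(1.0))
print(billable_fraction(0.5))
```

The point of the tier structure is that the discount deepens the longer the instance runs, so the customer never has to predict usage in advance to benefit.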

We understand we need to run that as well We need to make it easy to bring new stuff, but we also need to make it easy for you to bring your existing applications and your existing investments onto our Cloud as well Globally, CIOs are moving to the Cloud to save money, but they’re coming and they’re finding better security, better reliability, better flexibility We need to make it easy for you to bring those applications to our Cloud But we also want to make sure that when you bring those applications to the Cloud, we also make it easy for you to migrate and to also take advantage of the new services that you find there, services like the DLP, the data loss prevention API, services like Cloud Spanner We want to enable you to migrate and then to modernize And we know that we need to invest in the people, and the processes, and the tools to allow you to do that And one example of that is our support for Windows I hope here everybody would accept and agree that our platform is an amazing place to run Linux It’s an amazing place to run Open Source workloads There’s also a lot of Windows out there, right? 
We need to enable people to run that as well So we’ve made investments in things like supporting images for Windows Server Many different flavors of Windows Server, many different flavors of SQL Server as standard images We support hosting Active Directory, so that you can extend your own on-premises Active Directory domain controller into the Cloud For Windows developers, we want you to be able to use the same, familiar, powerful tools that you’re used to using So we now support Visual Studio You can develop your applications in Visual Studio, and you can easily deploy them to our Cloud The Cloud SDK supports over 100 different cmdlets for Windows PowerShell to enable you to use the tools that you’re familiar with to work in our environment And one of the areas we’ve made the biggest investment is in people We know we need to bring people to this partnership to help you to be successful People like our customer engineering and our solution architecture teams, experts who can work with you to build, and to design, and architect your applications to get the best out of our Cloud People like the amazing engineers who sit in our support organization First class Google engineers, who enjoy deep relationships with our software engineering and our site reliability engineering teams, so that when things go wrong, and things will occasionally go wrong, we can get you up and running as fast as we possibly can People like our Office of the CTO Seasoned industry professionals with deep technical expertise and a background in different verticals, who can work with customers who are doing some of the largest, most audacious migrations to the Cloud, migrations where deep technical expertise and domain expertise are vital to be successful People like our customer reliability engineers, taking the lessons we’ve learned with site reliability engineering, running those 1 billion user applications, keeping those up and running on the internet People who can work with you to bring 
the same level of resilience and security to your applications so that you can build applications that stay secure, that stay scalable, that stay available under the most extreme of conditions And lastly, Open Source We have a lot of experience with Open Source, and we can leverage that experience to help you with your Open Source packages, be that on our Cloud, or be that on-premises So to finish, I just wanted to talk through an example of this And a great example of this is Conrad Conrad’s a 90-year-old family-owned company And about 20 years ago, they made a very smart investment, and they started to invest in E-commerce And today, Conrad is one of the leading electronics suppliers both in physical stores, but also on the internet They are now working with Google to move more and more of those systems into our Cloud Their IoT platform runs on Google’s Cloud and integrates with Google Assistant They have moved 5,000 of their employees

to our G Suite platform to allow them to better collaborate, to allow them to be empowered And they save 90% of people’s work in the process, which is pretty cool But lastly, and most importantly of all, they’re working in collaboration with us to explore new business models, to explore new possibilities, and to test out the bleeding edge of what we are making available And I would urge you to go and check out their booth They have a very cool booth downstairs, so go say hi to the folks from Conrad and hear a little bit more about their IoT platform So in that, I hope I’ve given you a bit of a flavor of how we’re working in partnership with you, how we believe in openness, how we believe in bringing people and bringing our expertise I think it’s an extremely exciting time to be a technologist I’m excited I hope you’re all excited, too And with that, I’m going to hand back to Michael Thank you [APPLAUSE] MICHAEL KORBACHER: Thank you, Tom Thank you Yeah, so we spent the last hour and a half talking about four subjects, four very important subjects, on your journey in making the digital move Optimize Focus on the things that differentiate you Get rid of some of the things that you shouldn’t do Collaborate I think that’s pretty obvious We want to make sure we get more productive, work better together Engage and innovate and look at the things of the future Accelerate Look at things like complex data analytics, machine learning, and AI, and do that with a partner that cares The future of the Cloud is now It’s not if, it’s when, and when is now I want to leave you with an overview of what you can experience this afternoon There are going to be many sessions in four different tracks, and you’re free to choose There will be plenty of booths outside from our partners, as well as a start-up booth out there, as well as a booth that says ask an architect, where you’ll meet with some of our technical experts and you can talk through architectures Some of you have already 
started, I’ve been curious to see, sharing experiences on the hashtag Google Summit Munich And I want to leave you with a little assignment for the afternoon Take one of the opportunities you have in mind, one of the business opportunities you have in mind, and take this opportunity with you into the sessions that you’ll follow this afternoon And think about, how does this session and what I’m hearing, how does that apply to my opportunity? And make sure you walk out of here today with something that you’re going to do differently tomorrow that is going to help you in fulfilling this idea That said, I release you into the afternoon, and I look forward to seeing you again at our cocktail reception at 6:15 Thank you [APPLAUSE] [MUSIC PLAYING]

Honeywell and Foghorn Leverage Android for Smart Industrial Devices (Cloud Next '19)

[MUSIC PLAYING] ANTONY PASSEMARD: Thanks for being here I’m Antony Passemard I lead the Product Management team for Cloud IoT at Google And I’m joined by a few folks here– Sastry, the CTO from FogHorn, Ramya, VP of Product, and Scott Bracken from Honeywell So we’re going to talk a little bit about some of the stuff we’ve been doing with FogHorn and Honeywell So a bit of an agenda, a little overview of Google and how we see IoT, fairly high level and generic view of IoT for us And then FogHorn will come on stage and talk about their edge solution and their IoT Core integration, which is pretty cool You’ll see some of the platform, and they’re going to talk about a use case they’ve done with Honeywell And they actually have a nice demo here to show you what they’ve done, which is really impressive and I think adds a lot of value So we’ll have the demo and Q&A at the end, so I hope you enjoy So you’re probably here because you all know that IoT can and probably is already bringing value to your businesses And whether it’s to solve something specific or really being part of a broader digital transformation, it’s really central to a lot of creating differentiation for your businesses and across the industries We see that in agriculture, really increasing crop yield or lowering energy consumption in oil and gas, doing predictive maintenance or security in oil and gas We’ve seen some of those use cases We’re seeing logistics to do high value asset tracking or better driving behaviors for fleets, reducing risks for the drivers So we see a lot of those use cases So almost across every industry, we’re seeing IoT use cases What we’ve seen also– that’s a study that came out This is an IoT analytics company that does a bunch of competitive intelligence on IoT in different sectors And what they’re saying is in 2021, about 50% of companies will have IoT projects and will have seen value So it’s more than having a project It’s actually seeing value from IoT in 2021 That’s only 
two years from now But what’s staggering is the difference from 2018 to 2021 There’s a massive difference, a massive growth in the realization of value through IoT, which is the big change here I’m going to talk a little bit about real-time in some of the talk because we’re seeing that speed is really critical in IoT We’ve seen analytics done on bulk data in databases, and that’s great to build your models But what we’re seeing from customers is really the demand for real-time insights and real-time actionable insight, actually And what’s important, we’re seeing that, by 2025, a quarter of the data will be real-time in nature That’s quite a lot And of that quarter of data, 95% of it is actually IoT data That’s real time IoT data from the global datasphere, they call it It’s kind of a funky word but interesting So real-time nature, being able to take decisions right there on the spot, either locally or in the cloud, is super important for an IoT platform And it’s more important even because you want to get that accuracy to reduce costs, drive efficiency on a manufacturing line, to simplify your logistics and reduce risk, and make your work environment safer So that’s something we’ve seen in oil and gas, in particular, to track workers with their hats and whether they’re in zones they’re supposed to be in and things like that This is vision intelligence applied to a work environment for safety use cases That has to happen all in real-time You can’t wait for the next day to say, hey, you shouldn’t be in there The accident happens, it’s too late One thing that we’ve noticed– and I’ve been in the cloud IoT space for quite some time now, and I’ve seen it going through a lot of cycles of PoC and pilots and testing out stuff I think that’s changing Now I’m seeing the PoCs and the pilots go to production And that’s a big shift that we’re seeing We’re seeing that growth People in companies have been testing IoT for a while Now they realize what they can do with it, and now it’s 
going in production That’s the big shift we’re seeing And the problem is you really have to think about, OK, am I doing my IoT project in isolation? Am I using the right platform? Am I limited by the platform I’m using? Am I restricting my possibilities with the platform I chose? So choosing the provider of your IoT platform is actually really important, and you oftentimes forget about the delays of managing complex infrastructure when you put an IoT platform in place The specialized tech and learning curve you have to have for hardware communication, AI,

data ingestion, all of that, there are no real standard protocols So it’s pretty messy There are a lot of hidden costs that come with IoT deployment So having a platform that’s simple, flexible, and can really help you to get to value faster is critical for success We see IoT from Google’s [INAUDIBLE] But we see IoT as a big data challenge almost more than anything else, and big data in real-time And having that ability to gain real-time insight, as I said, will bring the competitiveness that you’re looking for in your business So in terms of platform, the Google Cloud IoT platform is a serverless platform for connectivity, processing, storage, and analysis Serverless means that you never have to worry about scaling up or down or pre-provisioning resources and overusing resources It’s just going to scale from one or a few devices to millions of devices very easily And that’s really a core differentiation of our platform from connectivity to analysis If we look a little bit deeper into the Cloud IoT platform itself, you’re going to see several components here The main one is Cloud IoT Core That’s something we launched GA last year in March, so about a year ago That’s been pretty successful This is a global front end for your devices, one URL globally And we use all the Google front ends– the GFE, we call them– to ingest data from anywhere in the world You don’t really have to think about, where is my device physically and which region does it have to connect to to have the lowest latency That will happen automatically This is served by the same front end that serves your Google search or your YouTube It’s the same front end So it’s really global, really scalable all across the globe Once the data flows into IoT Core, it flows into Pub/Sub Pub/Sub is this messaging bus, global again Also, realize you don’t have to worry about resources here Pub/Sub is available globally That means that if you have data ingested in Australia, in Asia, in North America, they all fall into a 
topic of choice You want to get that data, just query that topic and get the data out You don’t have to think about, oh, where is my device again It’s all global infrastructure, automatically scalable Pub/Sub stores data for seven days You can do snapshot replays and have several subscribers to it and get the data out The data can go into a Cloud Function, a serverless compute service to trigger actions, or do any kind of requests Or it can go to Cloud Data Fusion Cloud Data Fusion is the managed version of CDAP CDAP is an open source data processing pipeline you may know So that’s open source, and you can create your workflow with drag and drop little nodes, and then do your transformation along the way Pull the data out of Pub/Sub, do your transformation, and drop it in some database If you want to do longer processes like windowing analysis in real-time, you can use Dataflow Dataflow is also scalable on its own and has an Apache Beam open source equivalent All those are managed but have their open source equivalents Pretty cool And then you’ll land in Bigtable if you want to do time series If you want to do big data analytics, you have your BigQuery You have Cloud Machine Learning to do your training, and send those models back to the device because Cloud IoT Core is bi-directional And then you have some nice visualization tools So that’s what we call the Cloud IoT platform Putting that together is fairly simple Doesn’t require much code, if any code It’s all serverless And we’re pretty happy with some of the early customers that we’ve had You can see some of those customers at the spotlight session on Thursday We have some of the customers on stage One key element of our strategy with Google Cloud IoT Platform is we can’t do it on our own We have to have partners working with us And I’m really happy today to have FogHorn as one of our partners, if Sastry wants to join me FogHorn, we’ve been working with them for over a year at the Edge, in 
particular, and solving some of our Edge challenges and doing machine learning at the Edge So thank you, and I’ll leave it up to you, Sastry SASTRY MALLADI: Thank you, Antony Is my mic on? ANTONY PASSEMARD: Yeah, you’re on SASTRY MALLADI: Good evening or good afternoon, everyone I know I’m between you and food, so I’ll try to be as interesting as possible So we are a Google partner, as you mentioned, and we are an edge computing platform provider company We are a Valley-based company I’m going to give you a snapshot of who we are We’re based in Sunnyvale, right here in the valley We provide software that runs on small devices, edge devices, whether they’re existing PLCs, gateways, or embedded systems close to the machines, close to the assets And the key here is to do live data processing, analytics, and machine learning on the data that’s flowing through the edge to derive actionable insights
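The edge-to-cloud flow both speakers describe, insights published into Cloud IoT Core and landing on a Pub/Sub topic, might be consumed with a sketch like the one below. The subscription name, payload schema, and `parse_telemetry` helper are all assumptions of this sketch; the live client usage is left as a hedged comment.

```python
import json

# Decode one telemetry message (a bytes payload plus message attributes,
# the shape Cloud IoT Core hands to Pub/Sub) into a flat record.
def parse_telemetry(data, attributes):
    payload = json.loads(data.decode("utf-8"))
    return {
        "device_id": attributes.get("deviceId", "unknown"),
        "temperature_c": float(payload["temperature_c"]),
        "ts": payload.get("ts"),
    }

# Hedged real-client sketch (assumes google-cloud-pubsub is installed):
#   from google.cloud import pubsub_v1
#   subscriber = pubsub_v1.SubscriberClient()
#   path = subscriber.subscription_path("my-project", "telemetry-sub")
#   def callback(message):
#       record = parse_telemetry(message.data, dict(message.attributes))
#       # ... write to Bigtable / BigQuery, then:
#       message.ack()
#   subscriber.subscribe(path, callback=callback).result()

# Illustrative message, not real device output:
record = parse_telemetry(
    b'{"temperature_c": 21.5, "ts": "2019-04-09T10:00:00Z"}',
    {"deviceId": "pump-42"})
print(record)
```

Because Pub/Sub retains messages for seven days and supports multiple subscribers, the same topic can feed a Cloud Function, a Dataflow pipeline, and an archival writer side by side.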

We’ve got customers across the globe, hundreds of them We’ve got lots of analyst coverage, and so on Everyone that you see here, the logos, these are all our investors Honeywell is an investor Honeywell is actually going to talk today about how they’re leveraging our technology for some of the use cases as well You can see all of the big technology and industrial names there as well that all are investors Traditionally, we have been shipping our products on Gateways, PLCs, Raspberry Pis, things of that nature, where you have a flavor of Linux or Windows or a real-time operating system What we have recently also done is release the same product on hand-held devices, which is actually the core part of the demo and discussion today, on Android-based devices So the way the product works is– this is how it looks You have a tool here in the cloud, which you use to manage all of the edge devices The software runs on the edge device itself We have a very flexible microservices-based architecture In the traditional Linux world, we have used containers Obviously, in Android, there are no containers So we have developed all of this as an app So the notion is that you can have all kinds of data that’s fed into our system through a database that we have– the two core key components here are the analytics engine and the edge ML, the machine learning engine The idea being that as you’re ingesting live data from all different sources, you can apply analytics and machine learning on the data to derive the insights And then you can take closed loop control actions by either using our SDK There is an SDK using which you can build applications Or you can also visualize it and send that information into IoT Core and other places as well So maybe let’s step back a second A lot of you, I’m sure– many of you are familiar with edge computing But let’s step back for a second to say, why is edge actually important Right? 
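One way to make the "why edge" argument concrete, previewing the down-sampling point made next with the suction-pump signal, is to simulate a high-rate sensor stream whose short transient all but vanishes once the data is averaged down for cloud transport. The numbers here are invented for illustration:

```python
# A 1 kHz sensor stream: flat baseline with a 5 ms transient blip.
signal = [0.0] * 1000
for i in range(500, 505):
    signal[i] = 1.0  # the fault signature we care about

# Naive down-sampling to 10 Hz by averaging 100-sample windows,
# as one might do before shipping data to a central location.
downsampled = [sum(signal[i:i + 100]) / 100 for i in range(0, 1000, 100)]

# At full rate the blip is unmistakable; after averaging, its peak
# collapses from 1.0 to 0.05 and is easy to dismiss as noise.
print(max(signal), max(downsampled))
```

An analytics engine sitting next to the asset sees the raw 1 kHz stream and can fire on the blip immediately; anything downstream of the averaged feed cannot, which is the fidelity argument for processing at the edge.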
So in a typical industrial environment, whether it’s in oil and gas, manufacturing, transportation, smart cities, buildings, it doesn’t matter In all of these different sectors, you’ve got data coming in– lots of data, I might add– especially if you’re doing video, audio, and acoustic types of data And there is a high cost, latency, and security risk associated with directly connecting these machines into some kind of a central or cloud location What FogHorn really does is help solve that problem by introducing this edge device next to the asset– in many cases on the asset itself– where the data is ingested into our software running here And then you derive the insights and then send the results and insights back into the cloud for additional processing, for fine tuning your models, for fleet-wide visibility, and so on This obviously has the benefit of low latency because there is no latency involved between these two And then there is low cost of data transport And then security risk is also minimized when you’re communicating with that Another way to look at it: a lot of the times when you’re processing data– in this example right here, this is a sensor signal for one of the sensors You see here how the blips are in the signal? 
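The effect behind those blips can be reproduced numerically: a short spike that is obvious at the full sample rate disappears once the signal is block-averaged down to a lower rate, as a bandwidth-limited uplink to the cloud might do. A small sketch with synthetic data and an arbitrary threshold:

```python
# A transient blip that is visible at full rate disappears after down-sampling.
signal = [1.0] * 100
signal[40] = 8.0  # a short spike, e.g. a suction-pump pressure blip

def downsample(sig, factor):
    """Block-average the signal, as a low-bandwidth uplink might."""
    return [sum(sig[i:i + factor]) / factor for i in range(0, len(sig), factor)]

def count_blips(sig, threshold=4.0):
    """Count samples that exceed the detection threshold."""
    return sum(1 for v in sig if v > threshold)

print(count_blips(signal))                  # 1 -> detectable at the edge
print(count_blips(downsample(signal, 10)))  # 0 -> averaged away in the cloud copy
```

The spike of 8.0 averaged over a block of ten samples becomes 1.7, well under the threshold, which is the fidelity loss the speaker is pointing at.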
This is a suction pump When you typically send that information, down-sampled, to a cloud or data center environment, this signal actually looks like that You pretty much miss the blips entirely So the fidelity of deriving actionable insights at the edge is a lot higher, which is really why it’s so important, in addition to all of the three things that I talked about– security, cost, and, of course, latency This is a high level product solution, which is our traditional solution that we’ve been shipping on Linux It’s the same solution that I showed you on the first slide It has been put onto Android The idea here being that you ingest data from all these different types of sources and enrich it, because a lot of the times in the industrial world, the data quality is not good That layer fixes all of that, cleanses, normalizes, decodes it, and then does processing through these two layers that I talked about And then the information itself can then be published to a cloud environment We have pretty deep integration with IoT Core, as Antony mentioned, and I’ll mention in a second, in the next slide, how that is done as well And using the SDK, you can then programmatically subscribe to those insights and take closed loop actions So one concept that I want to bring into this picture is this notion of what is called edgification I know it’s a new word So what is edgification? A lot of the times when you build data science models, machine learning models, to run in a server environment, cloud environment, you may not necessarily pay attention to the amount of compute resources Or you may be actually working on batch data rather than working on live data And you may also not pay attention to the number of layers or the size of the model because those are not necessarily the constraints you worry about in a cloud environment But when you bring the same model to run on an edge

environment, in order to run efficiently and also perform at the rate that you would expect, you have to deal with those things And that process is what we call edgification How do you take your model that typically works off of batch data and make it run on real-time live data? How do you then optimize and reduce the number of layers and the weights and still retain the same level of accuracy? And finally, if we look at the anatomy of a machine learning model, it has really three things One is what we call pre-processing code, which is really feature extraction Let’s say you’re looking at a video or you’re looking at some sensor signals from the machines Before you can apply a machine learning model and algorithm, you’ve got to extract some features And that’s typically what we call pre-processing, and that is the most compute-intensive part of it Then you pass those features into your machine learning algorithm or the equation, which itself is not really that computationally intensive And then comes post-processing, which is how you then take the results of that equation and do something with them What we do as part of this edgification process is extract all of the pre-processing and the post-processing into our highly efficient complex event processing engine that we built from the ground up, which runs in a really, really small footprint, a few megabytes of memory, and runs really fast And then feed the machine learning part [INAUDIBLE] into your engine Of course, you also have the ability to, in fact, do the machine learning part in the CEP engine itself as we move forward Here are some examples where we took a video-based flare monitoring use case and showed how you can actually optimize it: take a model that is built here, edgify it You get almost a 10x or higher improvement, and yet much higher fidelity, much more accurate results Now, even when you build a machine learning model to derive some insights and predict some things and 
you deploy this in the edge environment or any environment for that matter, and it’s accurately predicting the results, it does not necessarily mean over time it’s going to continue to predict with the same level of accuracy Because of data drift, because of changes in machine behavior, the same model that once used to accurately predict the results may no longer do that So how do we address that? How do we fix that? To address that, we came up with this notion of automated closed loop machine learning So what does that really mean? As you see here in our architecture, once a model is deployed, we built this thing called a prediction model degradation detector What that module does is continuously measure several different factors– whether it’s an F1 score, whether it’s the data distribution– and try to see if the accuracy of the model that you’re predicting with is deviating from what it was before And when it does, it automatically sends that information to Google IoT Core here From Google IoT Core, as Antony explained, it’s a Pub/Sub system You can send that information to Dataflow, which can then be fed into BigQuery and the Cloud ML Engine, where you incrementally retrain your model with this incremental data that’s coming in And then push that model back onto the edge as well This is all automated Of course, you have an option to not automate it But this is done in an automated way until the accuracy comes back to what it was before And this is what we have been calling closed loop ML I know different terminologies have been used This is truly revolutionizing edge computing because you’re actually introducing the notion of AI, artificial intelligence, onto these edge devices: how you self-correct the models deployed in this environment in order for you to continue to predict with the same level of accuracy As you can see, our solution is a generic platform We have been 
deploying this across many, many use cases across many different verticals Our top verticals actually include manufacturing– a lot of big name customers– oil and gas, transportation, smart cities, buildings We have done other things as well, like mining, and so on What I want to do is just mention a couple of use cases quickly here, one in manufacturing, one in oil and gas, that actually involve different paradigms This is a factory that I can publicly reference It’s a GE factory They manufacture industrial-grade capacitors that look like this These are highly expensive capacitors that are used in power plants The material cost alone is several thousand dollars And the way the process works is, they take an aluminum foil and wind it through what is called a winding machine, a [INAUDIBLE] winding machine And the winding machine itself has hundreds of sensors that are continuously measuring several things, winder tension, width aspect ratio, and so on Once they wind it, they press it into packs and then insert them These are those packs And they run it through what is called a treat oven to take out all the moisture content And finally, they’ll fill oil into the capacitor to test the capacitor At this point, if the capacitor fails,

so it’s not working or seeing dielectric failures, it’s too late in the process And that’s exactly what was happening in the factory As I walked into the factory, I saw a lot of pink slips on the factory floor, costing them a lot of scrap, millions of dollars So the challenge posed to us was, how do we connect our edge solution directly to the sensors in this assembly line and in the machine, and identify exactly what was causing these dielectric failures– more importantly, before it is too late, so that the operator actually has a chance to go take the units offline and then fix them And that’s what we have been able to successfully do, reducing all of their pain points on the scrap The second use case I want to talk about is interesting This is an oil and gas use case Saudi Aramco is one of our customers, and there are many other customers as well We’ve actually publicly talked about this at the RC conference a couple of months ago As some of you may be familiar with, in oil and gas refineries, gas is being refined You’ve got hundreds of compressors And because of various reasons– excessive pressure, sometimes other reasons, compressor failures– the gas gets released through what are called flare stacks These are all flare stacks And then the gas gets burned When gas gets burned, you see a fire What’s worse, you can see smoke sometimes Fire itself is bad Smoke is even worse Dark smoke is even worse So how to identify? 
Up until this edge solution came up, typically, they installed video cameras Sometimes they don’t even do that And a human being has to monitor it 24/7, which is not really practical All they can do is see, oh, there is fire, there is smoke What we have done is install our software on a controller box attached to these flare stacks– many, many, many of them– and then directly take the video feed from this, combine it with the audio feed of the compressor, and identify and correlate that, one, there is in fact a flare or smoke, and also compute the volumetric flow of the gas that is burned, and things of that nature So human monitoring of this is eliminated But more importantly, the operator gets an alert when there is a real problem, and we get to the root cause of why the problem was occurring This is being widely deployed and is a very popular use case One last thing I want to say is that this particular session, the rest of the talk and the demo, as we talked about, is about our product that was released on handheld devices, mobile edge platforms We have noticed that there is obviously a lot of traction in this field because these are all battery-powered devices, which makes it even more important for our edge computing to consume less and less power, less and less CPU and memory And we ported our software onto these devices And we are deploying with Honeywell, initially working with a Honeywell device where we have done a couple of use cases, which we’ll demonstrate today as well With that, I’d like to invite Scott Bracken, Director of Advanced Technology from Honeywell, to speak about their technology SCOTT BRACKEN: Great, thank you Thank you, Sastry Hello, everyone My name is Scott Bracken, as Sastry mentioned, and I lead the Advanced Technology Group for the Productivity Products team within Honeywell And I’d like to introduce you to the Honeywell corporation a little bit to explain where the Productivity Products group sits 
within Honeywell It’s an exciting time to be an engineer at Honeywell because there’s a significant amount of growth being driven by technology in the four business segments that Honeywell operates in: aerospace, home and building technologies, safety and productivity solutions– which is the group I’m from, and I’ll explain a little more about that in a moment– and the performance materials group The one common thing that connects all of these business groups is the fact that connected solutions are what’s driving the growth in all of these business areas And therefore, the technology development within Honeywell is able to develop common platforms across these different businesses And as I mentioned, we’ve been growing significantly in the technology area, most specifically transforming the Honeywell corporation into a software industrial business The point of this slide is to point out that we have already invested heavily in software engineering talent within our corporation to build that capability Now diving a little bit deeper into the safety and productivity solutions group, you can see that there are a lot of areas within the industrial sector where technology can help our customers be more productive First, the connected worker Many wearable devices that an employee can wear in an industrial setting can help monitor for safety and help that employee be more productive In the supply chain and logistics area, there’s a tremendous amount of data that is important to our end customers for managing the efficiency of their operations In the distribution centers themselves, we have a lot of automation that we’ve introduced and provide to our customer base, again, for reducing their cost of operations and being more efficient And finally, we also have a very wide portfolio of sensors,

primarily for safety applications Pressure sensors and gas detection type sensors are quite prominent in that portfolio, but we have sensors that support all of the different IoT applications across this suite So once again, diving a little bit deeper into the Honeywell technology area, specifically in productivity products, what am I talking about in that realm? There are several different solutions, and many customers use this technology in different ways But we can provide common platforms across these use cases to serve those customers’ unique needs, yet with the power of having a very robust solution underneath Things such as vehicle-mounted computers and all types of asset tracking technologies In particular, RFID sensing is seeing a growth spurt in recent years This is not just limited to industrial settings but also spills over into retail quite prominently And then looking at the solutions on the right-hand side of this chart, scanning is really at the core of what we do in our productivity products group Product identification, primarily with barcodes, is what our scanning devices are centered around And then we provide value on top of that with other solutions, such as voice solutions to allow for hands-free operation of product identification, the mobile computers, which I’ll go into in a little more detail in a moment, and then, of course, different tagging that we can use for sensing the devices in any kind of inventory setting That all gets tied together, obviously, in a cloud environment for data management and inventory tracking purposes And most recently, in fact, in the last year, our flagship device was introduced We call it our Mobility Edge platform, and it’s a unified platform for mobile computing It’s an Android-based device, and it’s purpose-built for these industrial applications The ruggedized design makes the form factor very familiar to employees while also 
allowing for very heavy use during a workday As I mentioned, product identification and barcode scanning is the core operation of these devices, but that can take place thousands of times a day or thousands of times in a single shift So building a device that can handle that kind of workload and survive the entire shift is a very critical aspect of our customer requirements It can also withstand harsh environments, whether it’s impact or the elements, even moisture, say, a rainy environment We’ve used the Android operating system, obviously, for easy field deployment, giving us seamless operation with cloud connectivity And we built this device with multiple form factors, but underneath the hood is a single common hardware platform And I particularly like this slide myself because I’m surrounded by software engineers, but actually I’m a hardware engineer at my roots And so I have to show at least one slide with a processor module on it And we’re quite proud of this processor module that we introduced with the Mobility Edge platform last year because it is common throughout all the different form factors of the different mobile devices that we provide to our customer base And I’m particularly excited today to be on stage with Google and with FogHorn because now that we’ve deployed this edge device, this platform, we can build a tremendous amount of value in the software on top of that device And one of the first places we started is looking for this very thin layer that Sastry has already described, that allows us to build the AI and machine learning toolset and capability on top of these Mobility Edge devices And FogHorn’s Lightning offering meets all of those requirements for our needs across our entire customer set So now the last couple of slides, before I hand it off to the demo, describe the use cases that we are first investing in The first one is quite an interesting 
one to us because we have a rich history of designing a very sophisticated decoder tool to extract a barcode from an image and decode it for that customer’s application But at the edge of that application and during different use cases, many of the images we receive from our imager come in very blurry or very noisy It’s very common New operators that haven’t operated a scanner before don’t necessarily aim the device accurately

Or I’m sure many of you have been at a retail store when a particular packaging, maybe a shrink-wrapped packaging that has some ripple in it, is causing the scan to be difficult to read for that device And an operator naturally many times wants to go closer to that barcode to pick up that scan And in fact, they’re making it more blurry by going closer So again, there are several different environments where being able to decode a blurry barcode would be much more efficient for our customer base And what you see here is a data set We created a synthetic data set of 50,000 barcode images They are image pairs, clean images and blurry or damaged images And we use that for training our machine learning model, which we then deploy through the FogHorn system onto our Mobility Edge device And these are just some snippets of the images that were part of our training data set So that’s our first use case, and our second one is quite basic but very important to our customers These devices are owned by the customer They’re not owned by the employee They’re not their personal devices So the typical method for these being deployed is, at the beginning of a shift, the operator will take the device, use it throughout the shift, and then dock it for charging overnight The very simple requirement is that the device must operate continuously throughout that entire shift That shift can vary in time It can vary in operation and what modes are being used And so by applying the machine learning type of approach to this and working with the data scientists at FogHorn, we’re able to create a real-time operating model on the Mobility Edge device that can monitor and optimize to extend the life of that battery for that device during the shift So with that, I’d like to invite Ramya up from FogHorn, VP of Products, to take us through the demo RAMYA RAVICHANDAR: All right, I’m here to tell you that it’s real So this is the Honeywell device It’s a CT60, and our FogHorn 
Lightning Mobile is actually running on it So following up from Scott’s talk, what I will demo here are the two apps One is the battery insights app, and the other one is the barcode optimization But before I jump to the apps, what I do want to do– I am a products person– is go through our FogHorn manager here What you see on the left side here is our FogHorn manager portal So think of it as our remote management, configuration, and monitoring console that lets you see all the edge devices that have Lightning Mobile or the Lightning stack installed on them One of the things we do at FogHorn is be very focused on the OT persona We understand the users of our product are operators on the manufacturing floor, your refinery technician And so a lot of what we put in here is focused on that persona Especially in manufacturing, the idea is to create this one golden edge configuration that can work across massive volumes of devices And so this FogHorn manager portal lets you create that configuration Whether it’s to add a new sensor, define analytic expressions that do complex event processing, or come up with machine learning models, all of that is packaged up really nicely into a configuration and deployed onto the device And so that’s what we’ve done here So you can see the two solutions, which are the barcode enhancer and the battery insights Here are the two models that are part of this configuration And now, that’s what’s deployed on this device So let me pull up the FogHorn app You can see the FogHorn app here that’s installed on this device And immediately, we have the two solutions pop up, which are the barcode enhancer and the battery insights Now, Scott did talk about the goal of using battery insights, and I’ll talk a little bit about the metrics that show up here Both of these apps represent two classes of learning, so to speak In the battery insights app, it’s an adaptive learning model that’s going on on this device So it’s learning on the fly 
based on the pattern of usage How often is the battery charged? How long is the work shift? How does the user actually use it? What we are building is a very unique fingerprint of device usage that is specific to this unit So the metric here is saying, yeah, it’s OK for the shift This battery is going to last for the next 16 hours Oh, by the way, it also informs you that your work shift is around eight hours So this model is going to get better and better as the device gets used over time
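An adaptive battery projection of the kind described—learning this unit's discharge fingerprint on the fly and projecting hours remaining—can be sketched with a simple exponential moving average over observed drain rates. This is a hypothetical illustration, not FogHorn's actual model; the class name and smoothing factor are invented:

```python
# Sketch of an adaptive battery-life estimator: learn the device's discharge
# rate on the fly and project hours remaining. Hypothetical; not FogHorn's model.
class BatteryInsights:
    def __init__(self, alpha=0.2):
        self.alpha = alpha  # smoothing factor for the usage "fingerprint"
        self.rate = None    # learned drain rate, % per hour
        self.last = None    # (time_h, level_pct) of the previous sample

    def update(self, time_h, level_pct):
        """Fold a new (time, battery level) sample into the drain-rate estimate."""
        if self.last is not None:
            dt = time_h - self.last[0]
            drop = self.last[1] - level_pct
            if dt > 0 and drop >= 0:
                observed = drop / dt
                # Exponential moving average adapts to how this unit is used
                self.rate = observed if self.rate is None else (
                    self.alpha * observed + (1 - self.alpha) * self.rate)
        self.last = (time_h, level_pct)

    def hours_remaining(self, level_pct):
        """Project runtime left at the current battery level."""
        return None if not self.rate else level_pct / self.rate

est = BatteryInsights()
for hour, level in [(0, 100), (1, 95), (2, 90), (3, 85)]:  # steady 5%/hour use
    est.update(hour, level)
print(est.hours_remaining(80))  # 16.0 -> a 16-hour projection at 80% charge
```

More samples make the fingerprint more specific to the unit, which mirrors the "better and better over time" behavior described in the demo.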

Now, if we move on to the next one, which is the barcode enhancer, it happens behind the scenes So this is an operator He has the scanner He’s scanning the barcode And the goal is to have that first scan be successful And in the event that Scott talked about, where the image is grainy, what happens is it gets passed to the FogHorn stack, and we’re running this neural net model The difference here is that it’s a neural net model built using TensorFlow, running on TensorFlow Lite And the inferencing is happening on the device To help demonstrate how the barcodes actually look before and after the neural net model runs through them, I’ll go through a barcode viewer enhancer app here All right So on the top here, you see the before picture This is what the operator is actually seeing when they scan the barcode And once it passes through the Lightning Mobile stack here, this is the result There is definitely much more clarity, and more importantly, there’s no loss of productivity when the operator tries to do that first scan There’s no manual re-entry of the code So let me go through a couple more images This is an example of a full barcode sample, and this is the clarified one through our stack This is one of my favorites It’s a piece of the barcode I didn’t think I’d ever say I have a favorite barcode, but here it is There we go, and I think we’ll do that as the last one here So the point of this presentation is to say that we have Lightning Mobile, and we have the ability to now run on Honeywell devices with the Google Cloud technique of building AI models on their data science platform What does it do for us as industrial operators, as industrial users? It opens up this whole universe, this whole expanse of use cases Sastry talked a little bit about sensor fusion in the past, which is the ability to combine video, audio, and structured data What are the newer insights? 
For example, if you’re from oil and gas, and you have a technician walking around in your refinery, he could take an Android mobile device, point it at a valve, and really get insight about whether this valve should be on or off The question to ask oneself is, what is a use case today that’s tethering my operator to a location? Instead, with the use of Android devices, the operator is more mobile and therefore able to get newer insights [MUSIC PLAYING]

Video Surveillance Using Deep Learning With Few Data Samples

Hello, everyone My name is Nelly Fuu, and my topic is video surveillance using deep learning with few data samples Without further delay, let’s proceed to the introduction Nowadays, deep learning-based models have achieved human-level or beyond-human-level performance in computer vision However, deep learning-based models require a large amount of training data in order to reach good performance The large training data requirement introduces some challenges in video surveillance, such as detecting new objects with the limited data provided and handling changes in illumination levels It is a challenge for the surveillance system to collect enough data in a short period of time to learn to detect new object classes with good performance On the other hand, when the illumination level changes, how can the system still detect that the target object is from the same category rather than a new category? Currently, the most popular approach to train a model to detect new objects with few data samples is transfer learning However, direct transfer learning is not suitable for object detection because an overfitting problem may occur, especially when the data set is limited, as the object detector is required to learn optimal parameters for both localization and classification Therefore, the motivation of this project is to study and enhance an approach for few-shot object detection for video surveillance The main contribution of the project is to implement and enhance a few-shot object detector for video surveillance There are three objectives in this project The first objective is to achieve higher mean average precision through data augmentation The second objective is to achieve higher mean average precision through reweighting module modifications And the last objective is to achieve higher mean average precision through both data augmentation and reweighting module modifications The proposed study is based on the few-shot object detector 
via feature reweighting framework The framework has a lightweight network known as the reweighting module that generates class-specific reweighting coefficients to reweight the features from the feature extractor to detect objects The enhancements are attained by adding a data augmentation pipeline for the input to generate high-diversity data and also modifying the reweighting module to capture more reweighting information The performance of the detector is improved after the adjustments We can see that the improvement from data augmentation is higher than from the reweighting module modifications, but the combination of both adjustments further improves the performance of the detector The results of this project are applicable not only to video surveillance but also to other object detection fields where the training data is limited, such as the manufacturing industry and medical imaging Next I’ll proceed to Chapter 2, few-shot object detection via feature reweighting This is the main framework of the few-shot object detector In general, the framework consists of two main modules, the feature extractor and the meta-model, which is also known as the reweighting module The backbone of the feature extractor module is the YOLOv2

detector, which extracts base-class intermediate feature representations, whereas the meta-model is a lightweight network for generating reweighting coefficients that represent knowledge of the classes for detection The reweighting coefficients are used to perform channel-wise operations with the features extracted from the feature extractor through 1x1 convolutions Therefore there are in total N reweighted features, where each reweighted feature represents a class For example, the first reweighted feature represents the person class, the second reweighted feature represents the cat class, and the last reweighted feature represents the motorbike class Next, let’s look at how the prediction layer works under this framework The dimension of the prediction output is N times W times H times A times (5 plus C), where N is the number of reweighted features, which also represents the number of classes for detection W and H are the width and height of the receptive field of the reweighted features, and A represents the number of anchor boxes in each grid cell In each prediction vector, five values represent the bounding box parameters and the objectness score, and the last C values are the class scores Since each reweighted feature represents one class for detection, the value of C is equal to 1 In other words, there is only one class score predicted in each prediction vector For example, this pink prediction vector represents the first anchor box prediction at the first grid cell Amongst all the N reweighted features, softmax normalization is applied to the class scores across those N vectors If the class score from the person class feature is the highest, then the object in the first anchor box at the first grid cell is assigned the person class This also means that the softmax-normalized class score from each prediction vector represents the likelihood of the object belonging to the corresponding class In total, W times H times A prediction vectors are emitted for each reweighted feature The training is divided into two phases, phase one training and phase two training In phase one training, a large set of base-class images is used to train the feature extractor to extract base-class intermediate feature representations In phase two training, new-class images are supplied into the model to train it to detect new objects In phase two training, few-shot settings are implemented, where one, two, three, five, or ten images per class are provided for fine-tuning The N classes include the base classes and the new classes These are the t-SNE visualizations of

the reweighting coefficients From the visualizations we can observe that similar classes tend to cluster together For example, among the animal classes, sheep, horse, cow, cat, and dog are close to each other at the bottom right of the diagram Train, bus, and car cluster near the top of the diagram Therefore we can see that the reweighting module is learning some useful information to reweight the features from the feature extractor Next I will proceed to the system design Three techniques, feature reweighting, reweighting module modifications, and data augmentation, will be covered in this chapter The feature extractor module in this framework is the YOLOv2 network The last convolution layer is a dynamic convolution layer, which will be reweighted by the reweighting coefficients The final output of this network is W times H times the number of anchor boxes times the sum of the bounding box parameters, the objectness score, and the number of classes, where the number of anchor boxes is equal to five and the number of classes is equal to one The number of classes for detection is one because each reweighted feature represents one class Next is the reweighting module modification The original shape of the reweighting coefficients is 1x1, generated by a global max pooling layer at the end However, having a global max pooling layer at the end may lead to loss of class-specific information, especially in few-shot settings where the new-object data is limited Therefore, instead of generating 1x1 reweighting coefficients, the reweighting module is modified to generate reweighting coefficients that have the same size as the meta features from the feature extractor The reweighting is then done through pixel-wise multiplication This is the design of the modification The reweighting module is a lightweight network for generating reweighting coefficients The number of input channels for layer 0 is 4 because there are three channels for RGB and one channel for the mask of the object, for better object detection As we can see, the output size of the reweighting module is the same as the size of the meta features from the feature extractor The output from the reweighting module will be used to reweight the input features in the feature extractor Next is the data augmentation In general there are two types of data augmentation First, the basic augmentations, which are horizontal flip, scaling, rotation, shearing, translation, and contrast These augmentation operations are performed randomly according

to the values set in the settings table Data augmentation is applied to reduce the limited-data problem discussed in the challenges of few-shot learning in Chapter 1 The other type of augmentation is the combined augmentations A combined augmentation is an operation that performs several basic augmentations sequentially In this case the basic augmentations performed in combination are horizontal flipping, scaling, and translations Data augmentation is applied to diversify the novel data images for feature learning in Phase 2 training This slide shows the samples from the augmentation operations The original image is augmented with the operations of random horizontal flipping, random scaling, random rotations, random shearing, random translation, and lastly combined augmentations Next I will explain the dataset preparation for this project The PASCAL VOC dataset will be used for this project because of its adequate number of object classes for the detections and performance analysis The training and validation sets from VOC 2007 and 2012 are combined for training, while the testing set from VOC 2007 is used for testing and evaluations The base classes and novel classes are split according to the tables below Bird, bus, cow, motorbike, and sofa will be the novel classes for Phase 2 training, while the remaining classes will be the base classes for Phase 1 training In summary, the few-shot detector via feature reweighting is implemented and enhanced The improvements are attempted by applying data augmentation before Phase 2 training to diversify the data images, and by modifying the reweighting module to prevent excessive class-specific information loss Chapter 4 will cover the performance analysis of the enhancements Before that, several baselines are compared with the results in this project YOLO-joint is a YOLOv2 detector that trains on both base and novel classes together under the 1, 2, 3, 5, or 10-shot setting The other, YOLO-ft, is a YOLOv2 detector with two-
phase training but without the reweighting module and also without full convergence, in that the training iterations are the same as the iterations of the few-shot detector via feature reweighting YOLO-ft-full is a YOLOv2 detector with two-phase training and full convergence but without the reweighting module Before I proceed I would like to briefly explain the LSTD framework The LSTD framework combines both SSD and Faster R-CNN to achieve better transfer learning for object detection SSD is used to generate the multi-scale feature maps for region proposals, like the RPN of Faster R-CNN, then the classification of the object proposals will be performed by the Faster R-CNN head This framework introduced two regularizations, the background depression regularization to suppress the response from the

background in the feature maps, and also the transfer knowledge regularization to enhance knowledge transfer from source domain objects to target domain objects After this brief introduction of LSTD we can proceed to the remaining baselines LSTD-YOLO is an LSTD framework with YOLOv2 as the backbone network without full convergence, and LSTD-YOLO-full is the LSTD framework with YOLOv2 as the backbone trained to full convergence This table shows the baseline performance and the improvements The unit of the values in the tables is mean average precision 1, 2, 3, 5-shot and 10-shot mean the number of images provided for Phase 2 training We can observe that the improvements are consistent across shots, which means the enhancements made are effective Comparing the improvements of both the data augmentations and the modified reweighting module, you can see that the improvement of the data augmentations is higher than the improvement of the modified reweighting module However, the combination of both the data augmentations and the modified reweighting module further improves the performance of the few-shot detector compared to the improvements of the data augmentations or the modified reweighting module alone This table shows the average improvements of the enhancements made The calculation is done by the formula below, the improvements summed across all shot numbers and divided by the total number of shot settings The highest average improvement is the combination of data augmentations and the modified reweighting module, which is 2.22 mean average precision, followed by data augmentations, which is 1.6 mean average precision, and lastly the modified reweighting module, 1.24 mean average precision The following slides will show the detections before Phase 2 fine-tuning and after Phase 2 fine-tuning on novel objects Before Phase 2 fine-tuning, no bus and no cow are detected, but after fine-tuning, the bus and the cow objects are
detected Before fine-tuning the model misclassified a bird as a person, but after fine-tuning, both birds are detected Before fine-tuning only the person class is detected, but after fine-tuning both the persons and the motorbikes are detected Lastly, before fine-tuning part of the sofa is classified as chair, because chair has the highest similarity to sofa among the base classes, but after fine-tuning the sofa is detected Next, the discussion and future works will be covered The data augmentations improved the performance by an average of 1.6 mean average precision, whereas the modified reweighting module improved the performance by an average of 1.24 mean average precision However, the combination of both improved the performance by an

average of 2.22 mean average precision The reasons behind the improvements are the increased data diversity from the augmentation operations, and that the modified reweighting module is able to retain more class-specific information In future works, several improvement ideas will be attempted Firstly, improvements will be attempted on designing a more compact and meaningful loss function specifically for object detection feature learning The second idea is to apply other data augmentation techniques, as the data augmentation techniques applied in this project are relatively simple and the diversity of the augmented data is limited, therefore data augmentation with generative adversarial networks will be attempted Lastly, improvement will be sought by modifying the whole architecture for better novel class feature learning and fine-tuning With that, I have finished the project presentation Thank you for listening
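The pixel-wise reweighting described above can be sketched numerically; this is an illustrative toy in JavaScript, not the project's code, with tiny 2 x 2 arrays standing in for real feature maps:

```javascript
// Toy sketch of pixel-wise feature reweighting (illustrative shapes only):
// the modified reweighting module emits coefficients with the SAME spatial
// size as the meta features, so reweighting is an element-wise product,
// instead of one scalar per channel from a final global max pooling layer.
function reweight(features, coeffs) {
  // features, coeffs: [H][W] arrays of equal size
  return features.map((row, i) => row.map((v, j) => v * coeffs[i][j]));
}

// With 1 x 1 coefficients (the original design) every spatial position is
// scaled by the same number, so class-specific spatial detail is lost:
function reweightGlobal(features, scalar) {
  return features.map(row => row.map(v => v * scalar));
}

const meta = [[1, 2], [3, 4]];
const full = reweight(meta, [[0, 1], [1, 0]]); // keeps spatial structure
const flat = reweightGlobal(meta, 0.5);        // uniform scaling only
```

The contrast between the two functions is the motivation given for the modification: the element-wise form can emphasize some positions and suppress others, which a single pooled scalar cannot.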

Zero to App: Live coding a Firebase app in JavaScript, Kotlin, and Swift (Google I/O '17)

[MUSIC PLAYING] MIKE MCDONALD: Good morning, everyone and welcome to the second day of Google I/O. Hopefully you’ve met a lot of really great people, learned about the coolest new technology, and seen some great product launches But there’s one launch that you haven’t seen And I am so confident in this launch that onstage at Google I/O I’m announcing that I’m leaving Google to start a startup to build this app And in short, this idea, it’s so revolutionary, I want you, all of you, to share your story with your friends, with your family, with everyone on the internet You deserve to share your story Nobody has done this before, absolutely nobody So I’m confident that we’ll be the first to market And I’m so confident in fact, that I have already called all of my VC friends and we have meetings scheduled an hour from now But I think I was a little too confident We actually don’t have anything yet So I have this great idea but in order to get funding I need to show that it’s a viable concept I need an Android app, an iOS app, and a web app to actually prove that we can get users So I have a bunch of really smart people at Google and they all told me build an app that looks like this So have an application, your web app, your mobile app that talks to some database server And that database server proxies all of those app requests out to these storage systems, databases, and APIs That server is going to have all of my authentication logic, all of my business logic It will control who can read and write the stories and where those stories get sent But that sounds like just a ton of work, and I scheduled meetings for like 45 minutes from now So I can’t build this alone I’ve asked a couple of my friends to come out and help, Jenny, Frank, and Kat, can you guys help me?
FRANK VAN PUFFELEN: Love it We all have a friend like that, right, who comes up with this great app idea and then sort of gives you 30 minutes to build it because how hard is it to build an app, especially here live on stage So it’s a tricky, tricky situation We have to build three applications And we have, what did he say? 30 minutes So let’s not do this architecture What we’re going to be doing today is we’re going to be using Firebase to build these applications And that means that the application code talks directly to our powerful managed back end services This means that Firebase takes care of all the things like scalability and security so that we can focus on building the features that our users love It takes a bit of time to set everything up so while the team is prepping on their laptops I’m actually going to talk a bit about how we’re using Firebase and building this application So remember, we are building a story sharing app So we’re taking a picture on our phone or selecting a picture on a laptop and then we’re sharing it with the other users with the app First thing we need to do is we actually need to allow the users to sign in And we’re using Firebase authentication for that Second thing is that we need a place where we can store and share the photos So for that we’re going to be using cloud storage And finally, we need a way to store and synchronize metadata between the users And that’s the Firebase database We’re going to go through each of these features in turn I’m going to explain how we use it in the app and then we’re going to switch over to the code and see how it works in practice So you’re going to be experiencing firsthand how easy it is to build an app with Firebase First up, Firebase authentication It’s our secure serverless sign in solution And we’ve taken all the complexity of storing email addresses and passwords, server-side OAuth flows, and we’ve wrapped them in a single cross-platform client side API That means that your users can
sign in with Google, Facebook, Twitter, GitHub, or email and password And if you watched any of the sessions so far about Firebase you know that we now also support phone number authentication And that’s great because that means that even if your users don’t have an account they can just sign in from their phone, get a verification text message, and continue using the application To speed things up a bit today we’re going to be using Firebase UI, which is our open source library for authentication So it handles all of the complex flows that you might have like is this a new user or returning user, do they want to sign in with Facebook or Google? And for example, password resets, email verification, account linking
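As a rough sketch, the FirebaseUI web setup described here boils down to a plain config object plus one start call; the container id below is made up, and 'google.com' is the Google provider id string that the SDK also exposes as firebase.auth.GoogleAuthProvider.PROVIDER_ID:

```javascript
// FirebaseUI web sketch: the config itself is plain data, so it can be
// built before the firebase/firebaseui scripts are even loaded.
const uiConfig = {
  signInOptions: ['google.com'],   // only Google sign in for this demo
  signInFlow: 'popup',             // pop up instead of a full redirect
  callbacks: {
    // Runs when authentication succeeds; returning false tells
    // FirebaseUI not to redirect, so the app updates its own UI instead.
    signInSuccess: function () { return false; }
  }
};

// With the scripts loaded on the page, starting the flow would look
// roughly like this (hypothetical container id):
// const ui = new firebaseui.auth.AuthUI(firebase.auth());
// ui.start('#firebaseui-auth-container', uiConfig);
```

This is a minimal sketch, not the exact code from the demo; the point is that one small config drives all the provider, account-picker, and account-linking flows mentioned in the talk.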

Actually, it handles a lot It’s an open source library built by the identity experts at Google And it includes years of experience and best practices for sign in flows I think that’s pretty much what we need to know Can we switch to the split screen to do some coding? So to speed things up we’re going to be building all three applications at the same time Top left you see Mike’s screen And Mike is building the iOS version of our FireStories app At the bottom left, you see Kat’s screen And Kat is working on the web version And at the top right, you see Jenny’s screen But hold on, Jenny, that doesn’t look like Java code that I’m used to JEN TONG: No FRANK VAN PUFFELEN: Did you switch over to Kotlin last night? JEN TONG: Yes I switched it last night FRANK VAN PUFFELEN: Actually, that’s going to be great because it’s going to be so much less code that you need to write for this So they’ve all already built the basic framework for the application So they’ve added the layouts that they need to show the stories and added the inputs where the user can select or take a picture and the button where they can send it So if we now run the app we can access the device’s camera or the local file browser and we can take or select a picture But if we hit send, nothing happens That’s because we’re not using Firebase yet So in the bottom right, you see the Firebase console This is where you manage all of your Firebase projects We’ve already created a project that will serve as the back end for this application Now all these applications, whether it’s iOS, web, or Android are talking against the same back end services So they share the same list of users, the same file storage, and the same database And that’s great because that means that they’re sharing all their states We’ve taken the configuration data of this project and added it to each of the platforms So now the apps can find their back end services on the Google servers We’ve also already added the SDK to each of the apps So
we’ve done a pod install, a Gradle dependency, and we added the script includes So with that, I think we’re ready to start coding First thing we need to do is we take the features that we’re using about Firebase and import them from the SDK into our code We do this in one go for all three features that we’re using, auth, storage, and database You can see that the code looks exactly the same for every feature So we have a cross-feature consistency with Firebase That’s great because it allows you to start using new features quickly But if you actually glance from screen to screen you see that it is really also very similar between platforms And that is great because it means that you learn Firebase on one platform, then when you switch to another platform you take what you’ve learned and use it on the new platform also I think we’re now wiring up Firebase authentication So let’s get to that one First thing we need to do there is that we need to configure Firebase UI for the providers that we want to use So today we’re only using Google sign in because we only have a few minutes left We also, in the Firebase console, enable this provider Next, we’re going to wait until the user clicks the Sign in button, and when they do, we start the sign in flow Now this is a very short amount of code but it triggers all the complex flows that you can imagine So if the user signs in on an iOS device tomorrow but on an Android device the day after they still have the same stories And when they want to sign in with Facebook today but with Google tomorrow it handles account linking for you All of that behind this simple code What we need to do is listen for when the authentication state changes And when it does it’s either one of two things Either the authentication succeeds or the user did not sign in If the user signed in we hide the Sign in button, show the Sign out button, and we enable any other UI elements that require an authenticated user That’s really all we need to do So if we
now run the application we have Sign in working Let’s see who gets there first I think Kat is already signing in You can see that we get pop ups, we get to pick our accounts If we have multiple Google accounts we get an account picker All of that is handled for us with the minimal code that you just saw But just signing in on an app is not very interesting So let’s switch back to slides and see how we’re going to be using cloud storage If you’ve ever used Google Cloud Storage before you know that it’s our petabyte-scale storage

solution Firebase provides a cross-platform client-side SDK on top of cloud storage that allows you to upload files securely directly from your device We provide the security model on top of this so that you can ensure only authorized users have access to those files from their device But whenever you upload a new file through Firebase you also get a download URL This is an unguessable URL that provides read only access to that same file That is great for our FireStories app today because we can use that to share the image between all our users I actually think there’s nothing more we need to know about cloud storage So let’s switch back to the code and see how we make this work We’re back in the code Remember, we already imported the feature before So what we do now is we have a local file that the user took with a camera or selected from the file browser First thing we need to do is we need to figure out where we are going to store that file in cloud storage And it’s going to consist of two pieces So the first piece is the user’s UID So this is the identification of the authenticated user And by putting this in the path we actually make sure that the files from the various users end up in different locations That is great because that means that we can use Firebase’s server side security rules to ensure that only the authorized user has access to their files So that if Mike uploads a new story only he can change that picture The second part of the path is really just a unique filename because since we are writing to the same cloud storage location we want to make sure that we don’t overwrite files we uploaded previously Next step is that we start uploading the local file to that storage location So we take the storage reference that we just created and we tell Firebase to put the file to that location This is all we need to do to start the upload in the background Now think of all the things we did not have to do here We did not spin up any threads There were
no async adapters, no background tasks, no Grand Central dispatch All we did was tell Firebase to start uploading the file and it went to work All we have to do is wait for the upload to complete And when it does, one of two things can have happened Either the upload succeeded or it failed If it failed, we take the error message that we got from Firebase and we show it to the user And if the upload succeeded, we take the download URL of the file that was just uploaded and we display that on the local screen That is all we need So if we now run this app we’ll be able to upload files to cloud storage I see Kat is already selecting an image We don’t really see a lot yet If in the console we switch to the Storage tab in the Firebase console on the bottom right Kat, can you switch in the console? Can we go back to the split screen Thank you So, we have files here now These files were just uploaded but not a really impressive way of uploading a file But it is a very cute dog in there So clearly, we can now upload files to the cloud Not really what we wanted yet Nothing we can go to our VCs with So we’re going to be using the third Firebase feature to actually share the information between our users Let’s switch back to the presentation The Firebase database is our oldest feature And it’s still one of our most popular back end services We have hundreds of thousands of applications that rely on our database every day The Firebase database is a cloud hosted NoSQL database It’s really just a JSON tree And if you’ve ever modeled your data with JSON you know that it’s very flexible So this is the data model that we’re using today At the top level, you see that we have a node called stories And under that we have a child node for each individual story Then for each of those stories we keep the download URL Remember, that’s the unguessable but publicly readable URL And we keep the title that the user entered for that story We also keep the user ID of the user who created the 
story And just like before with storage, having the UID in the database allows us to secure access to the story with Firebase’s server side security rules So that is great because it means that when Kat enters a story only she can change the title of that story We call our database a real time database
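The stories tree just described can be sketched as plain JSON; the field and key names below are assumptions based on the talk, with a helper standing in for the push/childByAutoId behavior covered later:

```javascript
// Sketch of the database layout from the talk: a 'stories' node with one
// child per story, keyed by a unique push-style id, holding the download
// URL, the title, and the author's uid (field names are assumptions).
const db = {
  stories: {
    '-Kabc123': {                           // hypothetical push id
      url: 'https://example.com/dog.jpg',   // unguessable download URL
      title: 'A very cute dog',
      uid: 'user-42'                        // used by security rules
    }
  }
};

// Each new story gets a fresh unique key so writers never collide,
// mirroring what push() / childByAutoId do in the real SDK:
function addStory(tree, key, story) {
  tree.stories[key] = story;
  return tree;
}

addStory(db, '-Kdef456', {
  url: 'https://example.com/cat.jpg',
  title: 'Cat tax',
  uid: 'user-7'
});
```

Because the whole database is one JSON tree, sharing data between the three apps is just a matter of every client pointing at the same `stories` path.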

And we do that because of the way you read your data from it So with most databases you do something like SELECT star FROM stories And then you’d get the list of stories back and you display it on the screen With Firebase, you attach a listener or an observer to the story’s node And from that moment on, Firebase will tell you whenever something changes under the stories node So when Jen uploads a new story, Kat and Mike get informed of that instantly And when Jen responds to that story, Mike gets the update straight away This is real time synchronization And it’s really, really easy In fact, let’s not take my word for it Let’s switch back to the code and the final feature to the app So let’s see, we already imported the database So we’re going to go back to where we upload the file and we added it to the local screen So first, we’re going to remove that code We’re not going to display the story there any more because we are going to instead write the metadata of the story to the database So to do that we create a reference to the stories node in the database Since we point to that same node in each of the apps they’re going to be writing to the same location in the same database And that makes data sharing really easy But since they’re writing to the same location we also need to make sure we have a unique ID for the story that we’re creating So we’re calling push or childByAutoId for that With that, we’re ready to write the metadata So we take the reference that we just created and we write the values of our story to it So we write the download URL that everyone can display We write the text that the user entered We write their UID so that we can secure access to the story And finally, we also write the path in cloud storage because I think that might come in handy at some point but I’m not sure where yet I just have a feeling Now we can run the app again And if we do, and we pick a story, it will write that data to the database If you look at the bottom right, you
see that we’ve opened a database panel in the Firebase console And now we’re going to wait for the stories to pour in I see that Mike is already selecting an image Kat is working And you can see that the database lights up as the new stories come in That is great but there’s just one problem left Ooh, smile, you’re on camera There’s one problem left It doesn’t show on the local screens anymore So that’s the final step we need to take in our code We’re almost done here, folks So we’re going to attach a listener to the same node that we had before, to the stories node And we’re going to ask Firebase for the last 10 stories Now from that moment on, whenever a new story is added Firebase will fire the child added event And with that event we get a snapshot of the story So we take the value out of that snapshot That’s the value that we just wrote and we display it on the local screen again by calling display story That’s it This handles most situations We’re going to do one extra one because I’ve been talking about changing the story securely so much that I want to make sure that works too So in addition to handling child added we’re also going to listen for changes to the stories, which works with a child changed event So now whenever somebody changes a story we get a snapshot of the updated story We take the value and the key of that story and we update the display on the local screen I think we’re ready to run So now you can see that if we run the app, stories that we just created actually are already showing on each local screen So that child added event that we were talking about fires immediately for any existing children But now as they’re taking more pictures and writing more stories those stories show up on all screens within milliseconds This is how you build a multi-user story sharing application with Firebase In fact, I’m not sure I didn’t read the full spec of the requirements but I think we’re pretty much done here, Mike And it took what?
Like 15 minutes I think we have like 40 minutes left to actually go play in the ball pit What was it? MIKE MCDONALD: I hope they like it FRANK VAN PUFFELEN: What’s wrong, Jen? What’s wrong? JEN TONG: Wait a second, Frank I’m just getting a call It’s our VC Our VC says that there’s already a bunch of apps

that do basically this thing FRANK VAN PUFFELEN: No way JEN TONG: But you want what? Oh, my gosh Frank, we need to pivot like now Good thing we have like 20 minutes left So it turns out, if you go back to the slides, it turns out that there are a lot of apps out already that do real time communication, that do chat and pictures and text and stuff And it turns out, that’s so a few years ago and people are interested in new stuff now Specifically, people are tired of reading text and looking at their pictures Instead, they want to look at their pictures and look at other tiny pictures next to their big pictures So we need to emojify our app today But it’s OK Pivots are hard but we’ve been thinking ahead Our skunkworks engineering team has already thought of this and already developed a really cool emojification algorithm that will take us to the next round of funding But, like any big engineering change, we have a few challenges that we’re facing You know, we’re pivoting For example, we have somehow burned through a lot of our engineering resources already I think it might have something to do with the bouncy castle in the break room Anyway, we don’t have time to write it for all three platforms anymore We have to take our original emojification algorithm and write it once so we don’t have to port it to all the different platforms Another problem we have is our emojification algorithm is super great and it works really well but it consumes a lot of resources And although mobile devices are faster than ever before their battery life has been kind of flat over the last few years And we need to make sure that we save our users’ batteries so they can be taking tiny pictures of their big pictures all day And finally, security So we have our proprietary emojification algorithm but as you build out an application you end up with secrets that are part of your app, whether they be API keys or other stuff like that And if you put them into your app and deploy them to the
world some of your beloved but possibly nefarious users are going to find that information and might do bad things with it And we don’t want our emojification algorithm to leak out because that would ruin us But it’s even more OK because the same skunkworks engineering team that developed the emojification algorithm has also come up with a solution to the problem Turns out, Cloud Functions are the solution to the problem So Cloud Functions allow us to take our already written JavaScript emojification algorithm that we developed on Node.js and port it over and deploy it to the cloud where it solves all of our problems First of all we only have to write it once and then we can hook it up to all of our apps because Cloud Functions integrate wonderfully with the rest of the Firebase SDK And it also allows us to run on Google servers in the cloud, which as it turns out, are plugged into mains electricity And since it’s deployed to the cloud we don’t have to worry about people extracting information from our deployed apps So we have a solution, we have a plan, and we have some JavaScript The first thing to do to appease the VCs is of course to update the architecture slide, since this is how they think This is where we left off before We have our application on one side This is all the different applications for all the platforms that are using the Firebase SDK to communicate with the database and storage And this is how stuff gets synchronized between them So we’re going to shift the apps up a little bit and we’re going to plop Cloud Functions right here Cloud Functions, when integrated with Firebase, kind of act like another client It’s like a super client that’s there all the time and it can always watch for changes that happen, perform all those heavy lifting operations, and then push them back through Firebase to the database and to storage, where the clients will automatically get notified, just like as if another client did the update So I could talk about this all day but why
don’t I just show you some code Sound good? Switch back to the code So here we are And we have down in the lower corner, Kat is our JavaScript whiz and she has been the one who developed the emojification algorithm so she will take the helm today And as you can see, she started out with a few imports, like our emojification algorithm, but she’s ready to get coding So the first thing you’re going to do is you’re going to write out a function header This is where you actually wire it up to the Firebase database Here you can see that it has a path to Firebase and it is listening for writes on that location Because we’re integrating with an already existing app, this listens to the same location where our stories appear today
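A minimal sketch of a function like this, with a toy lookup table standing in for the real (secret!) emojification algorithm, and the 2017-era onWrite trigger shape shown in comments since it needs the deployed Firebase environment to run:

```javascript
// Toy stand-in for the proprietary emojification algorithm: map known
// words to emoji and drop everything else (table is made up).
const EMOJI = { dog: '🐶', cat: '🐱', pizza: '🍕' };

function emojify(text) {
  return text
    .split(/\s+/)
    .map(word => EMOJI[word.toLowerCase()] || '')
    .join('');
}

// Short-circuit guard: onWrite fires on EVERY write, including our own
// update, so skip stories that already carry an emojified title.
function needsEmojify(story) {
  return Boolean(story && story.title && !story.emojiTitle);
}

// Deployed trigger would look roughly like this (2017-era API shape):
// exports.emojifyStory = functions.database.ref('/stories/{storyId}')
//   .onWrite(event => {
//     const story = event.data.val();
//     if (!needsEmojify(story)) return null;        // avoid re-trigger loop
//     return event.data.ref.update({ emojiTitle: emojify(story.title) });
//   });
```

Without the guard, writing `emojiTitle` back would itself fire the trigger again, which is exactly the re-emojification loop the short circuit in the demo prevents.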

It’s going to behave just like an app so we’re going to have a snapshot handy So the next thing we’re going to do is we’re going to unpack the title with boring old text from the snapshot Then we’re going to do the magic We’re going to take that title and we’re going to pass it off to our emojification algorithm, which will take the title in and a wonderful stream of emoji will come out Finally, we’re going to take that emojified title and we’re going to stash it back into the database in a new field in each story that gets updated But we have to do one more thing before we can deploy it The behavior for triggering on functions is a little bit different than on the clients It’ll actually trigger for each write instead of on a slightly more specific child added event So we’re going to add a short circuit up at the top to make sure that we don’t actually accidentally re-emojify something that’s already been processed And with that, we’re ready to deploy So we’re going to go on, deploy that code up to the cloud And here’s the really cool part Our applications were already aware of this field when they displayed the stories So we don’t have to deploy the app So you don’t have to recompile the Android app We can just run it and all the new titles that get added are automatically going to get emojified And people can be super happy looking at all those little tiny pictures And that’s great because we’re saved now, right? We have our emoji in our app and we can buy that kombucha on tap KAT FANG: Hold on JEN TONG: Wait, what? KAT FANG: My friend, Dog, is calling And my friend Dog says we’ve been mislabeling dogs People are really bad at identifying dogs and call them sheep– JEN TONG: Oh, my god KAT FANG: –and bananas And what’s that? You’re afraid people will upload bad photos JEN TONG: Who would do bad things on the internet?
KAT FANG: Oh, cat photos Yeah, you can see the temptation All right Well, it looks like we have some more work cut out for us So let’s go back to the slides because I’ve got an idea We’ll be able to use machine learning We will be able to fix this problem And I bet we’ll be able to get the VCs on board if we can say we’re using machine learning How can we use it? Well, we’ll still have people upload their photos, same as before But instead of making them type in the title we’ll just figure out what’s in the image and perform our emojification on that And while we’re there, people seem to really enjoy not having to read In fact, I’m not really sure my friend Dog can read at all In fact I’m not sure my friend Dog can type So getting rid of the title is probably a great move We can also get rid of the complicated images Just get rid of those and keep the super simple beautiful emoji So our app will look something like this Just a giant stream of emoji It’ll be fantastic But we spent all of our time engineering on the emojification algorithm and we haven’t spent any time on machine learning Good thing we’re at Google I/O. And Cloud has some APIs that can help us with this So let’s just put a cloud on it Every Firebase project is also a cloud project That means we can use the same project to access the cloud APIs and the cloud SDKs In this case, let’s make use of the Vision API, which will allow us to look at an image, determine what’s in it, and based on that, we’ll be able to go ahead and perform our emojification algorithm and spit out a stream of emoji So where does this fit in our architecture diagram?
Here’s where Jen left us off We have Cloud Functions now, which act as a special client that can listen to Firebase Now we just need a little bit of room to fit in one more new icon So we’ll just shift everything over and, bam, we have room for our Cloud Vision API We’ll create a new cloud function, which runs in a secure environment So we can put in our project information, so we can talk directly to the Cloud Vision API, do that intense computation, and write the result back to the database from the Cloud Function So let’s code one last time The first thing we’re going to want to do here is actually hook up to the Vision API And once again, you’ll want to be focusing here in the bottom left We’re using our Firebase project credentials here, the same ID And we’re going to make use of the Cloud SDK, the Cloud Storage SDK as well So now that we’ve hooked up our API, we’re going to go ahead and define a new function It’s listening to the same place in the database because we want to pick up those same images

We’re still listening on write so we can fire off multiple functions from the same place But instead of grabbing the title, which we no longer have, we’re going to grab the image using the file path that Frank so kindly snuck in for us earlier Once we have the file path, we’ll be able to use the Storage SDK to grab a reference to that file And with this new reference we’ll be able to go ahead and pass that to the Vision API to detect labels and find out what is in our image Once we have the labels, the response from the Vision API, we want to filter over just the labels and then we can emojify those labels Once we have our emoji, the last thing we need to do is write it back to our database Here, instead of writing to the same location we’re going to write to a new location, emojis This allows us to differentiate between who can write and view stories versus who can look at the emojis And honestly, everyone deserves emojis so we’re going to open this up to the world So we’ll deploy our new function And any new images will now get emojified and labeled automatically We’ll update the clients so that they now listen to the emojis field instead of the stories field And we’ll start getting a stream of emoji, a nice, definitely comprehensible stream of emoji So what do you think, Mike? You think we’re going to be able to get that funding we need? MIKE MCDONALD: I don’t know, Kat I say let’s let our first couple of users take a crack at it and see what they think In the meantime, we’ll do a quick recap Could we switch back to slides, please At the beginning of the talk, I presented this problem How do we build an app in under an hour that we can present to our investors? 
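The filter-then-emojify step over the Vision API’s labels might look like the sketch below. The response shape (objects carrying a `description` and a `score`) mirrors the general form of label-detection results, but the confidence threshold and the label-to-emoji map are assumptions for illustration, not the talk’s actual code.

```javascript
// Hypothetical label-to-emoji table; real label vocabularies are far larger.
const LABEL_EMOJI = { dog: '🐶', banana: '🍌', sheep: '🐑' };

function labelsToEmoji(labels, minScore = 0.7) {
  return labels
    .filter(l => l.score >= minScore)     // keep only confident labels
    .map(l => LABEL_EMOJI[l.description]) // look up an emoji for each label
    .filter(Boolean)                      // drop labels we have no emoji for
    .join('');
}
```

Writing the result to a separate `emojis` location, as described above, is what lets the security rules grant everyone read access to emoji while keeping stories restricted.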
And Frank came on stage and showed us the easiest way to do that, to use Firebase, Google’s mobile platform for building your application for Android, iOS, and the web He used Firebase Authentication to securely sign in our users, Cloud Storage to upload and share those files, and the Realtime Database to synchronize file metadata across all of our clients Firebase lets you build your app incredibly quickly without having to worry about managing servers or infrastructure, writing your own authentication or authorization code, or dealing with database synchronization Then Jen and Kat showed us how to extend our app’s functionality using Cloud Functions and the Cloud Vision API These features supercharged our application and let us protect our proprietary emojification algorithms, which, even though all of you have seen them, are still secret Trust me And enhanced those algorithms using the power of machine learning And unfortunately, you all don’t work with Kat, Jen, and Frank, so we’ve provided a number of other tools in Firebase to give you that level of support when you have to go and build your application and pitch to your investors Firebase offers high-quality developer documentation in a number of languages, developer tools integrated with Android Studio and your other favorite IDEs, and high-quality, free technical support If you’re interested in diving deeper into any of these concepts with Firebase, there are a number of other talks available today and tomorrow over on stage seven in the main area And all of us and our team members will be available in sandbox H if you have any additional questions So let’s check in on those users again Let me give them a call Hey, what did you think of the app? Sorry, so you don’t think a stream of emoji is a particularly useful app Hey, that’s not great news, team What are you doing, Kat? KAT FANG: I’m coding MIKE MCDONALD: Can you speed that up? 
We have five minutes before our– KAT FANG: I’m done, I’m done MIKE MCDONALD: You’re done? How did you finish so fast? KAT FANG: I’m really quick MIKE MCDONALD: OK Well, let’s see what Kat did So we need to prove to our investors that this app has traction So I need everyone in the audience to go to,

it’s mosaic with a “j,” cause that makes sense, and start playing around with the app Can we switch back to screen number one, please? Let’s take a look at what Kat whipped up in that 30 seconds That was really impressive, Kat KAT FANG: Yeah MIKE MCDONALD: Can we get screen one, please? Or everyone, go to, sign in with your Google account, and what’s happening is, you take a photo, we use our proprietary emojification algorithm to generate the stream of emojis And then, I was talking to some of our investors earlier and they told me that mobile social viral gaming is really popular for some reason So I guess that’s what Kat did She created a social game where you build– thank you, all, by the way, for helping fill this in– a larger emoji because that’s what the world needs So hopefully, this goes well and we’ll be able to get a giant ball pit full of gummy bears at our startup Thank you all very much and enjoy the rest of I/O [APPLAUSE] [MUSIC PLAYING]

The Cognitive Era: Artificial Intelligence and Convolutional Neural Networks

– [Announcer] Ladies and gentlemen, please welcome former dean of the San Jose State College of Science and currently special advisor to the provost, Michael Parrish – Good evening, and thanks for coming to the third in our series of Industrial Revolution 4 talks Tonight’s subject is artificial intelligence and convolutional neural networks If there’s any area in this series that really emphasizes the necessity and importance of interdisciplinary cooperation, it’s this one, because in order to really understand and develop both neural networks and true artificial intelligence, you need collaboration between people from areas like neurobiology, psychology, obviously network design, and programming, and it’s something that we really emphasize at San Jose State and something that I think we truly value in our collaboration with these groups I would like to acknowledge our sponsors tonight We have ESDA and Team Hogan And I’m very thankful for them and their involvement What we’re gonna do next is, I’m going to introduce Sean O’Kane, who is going to, in turn, introduce the panel of experts This is an area that’s rapidly evolving in Silicon Valley and we have really an outstanding panel of people who have the expertise in this field Without any further ado, I will hand it over to Sean (applause) – Thank you, thank you, Michael Welcome, everyone How’s everyone doing? 
Good to see you Good, what a great crowd for a Monday Hmm My name is Sean O’Kane, I’m with Cadence Design Systems I’m the marketing director there, and also I’m the president of Big Kahuna Productions, it’s a video production group that works with many of the high-tech companies in the valley here So, once again, welcome to Jim Hogan’s Fourth Industrial Revolution: The Cognitive Era speaking series Tonight’s panel discussion is on artificial intelligence and convolutional neural networks So this series helps unlock really exciting possibilities and change the way we make decisions and interact with people to solve our biggest challenges So tonight the panel is a whole lot smarter than I am My job is to keep it very simple, keep it at a thousand-foot view They’re gonna take a technical deep dive and give you more of an understanding of how AI and neural networks can be applied in our lives The capabilities generally classified as artificial intelligence include successfully understanding human speech, autonomous cars, and interpreting complex data, including images and video So the term artificial intelligence is applied when a machine mimics cognitive functions that humans associate with other humans, such as learning and problem solving Now, if you think about convolutional neural networks, it’s like using multiple copies of the same neuron in different places It’s like writing a function once and using it multiple times in programming So the network is better able to model data when it learns how to do something once and then uses that in multiple places So we’re gonna talk about that as well So here’s some examples that I’d just like to share with you in some of the emerging areas, putting substance behind some of those billion-dollar projections So the first one, something near and dear to my heart, is advanced melanoma screening and detection So my early detection was my wife when she spotted 
something on my chest and she goes, “That doesn’t look right.” Timing is everything, and now I have a foot-long scar from here to here, and I get checked every six months So she was my early detection and I was pretty darn lucky So researchers at the University of Michigan are putting advanced image recognition

to work detecting melanoma, one of the most aggressive types of cancer, which is treatable at the early stages So today high-resolution imaging is so sophisticated that we’re relying on it for recognition for security systems, traffic recognition for our vehicles, and the autonomous vehicle of the future, and there’s a little pitch here using a Tensilica Vision processor, which Chris Rowen might mention or talk a little bit about during the panel discussion So this is a great opportunity to use neural network techniques to enhance computer vision applications with a very high level of accuracy So neural networks, I’m sticking with healthcare Neural networks for brain cancer detection A team of French researchers note that spotting invasive brain cancer cells during surgery is very, very difficult, in part because of the effects of lighting in the operating room, which is interesting So they found that using neural networks in conjunction with biomedical optics during operations allows them to detect the cancerous cells earlier and more easily and reduce residual cancer post-surgery Pretty good Here’s another example, and this will prime you for thinking about some of the questions you might want to ask, how this may apply to your life or career, when we bring the panelists up and we’ll have a Q and A Energy market price forecasting using neural networks So researchers in Spain and Portugal have applied artificial neural networks to the energy grid in an effort to predict price and usage fluctuations So a lot of different examples, from weather forecasting to disaster event detection, civil and mechanical engineering to sociology, psychology, and the humanities Many different disciplines this touches So AI and neural networks are paving the way to solve our biggest challenges through innovation, saving time, money, and lives So this is not a comprehensive list, but we will talk about many, many different areas, and many different disciplines 
have been added just this past year, and we’re just at the beginning of mainstream applications for deep learning So once again, I really thank you for joining us tonight Right now we’d like to welcome to the stage our host Mr. Jim Hogan, who has held senior engineering, marketing, and operational management positions at Cadence Design Systems, National Semiconductor, and Philips Semiconductor He serves on the board of advisors at San Jose State’s School of Engineering and is currently the managing partner of Vista Ventures, Mr. Jim Hogan, right here – Thanks, Sean – This gentleman is a Silicon Valley entrepreneur and technologist We’d like to welcome Mr. Chris Rowen (applause) He’s the co-founder and CEO of Sonics Incorporated, Mr. Drew Wingard (applause) He’s a registered patent attorney experienced with mobile hardware and software architecture, Mr. James Gambale Jr., right here And lastly, our last panelist is the president and CEO of OneSpin Solutions Please welcome Raik Brinkmann (applause) Thank you, Jim – Well, thanks for coming, everybody I appreciate you being here tonight So this panel grew out of a panel we actually did back in the middle of summer in Austin, Texas at a design automation conference We wanted to expose that audience to ideas around AI and machine learning, convolutional networks, and it surprised us, right? We got through the presentation, did the questions, and people just wouldn’t leave It was like 90 minutes So I know better No, people didn’t come to see me, that’s for sure They came because they have an interest, a curiosity about the subject matter And the subject matter is hard to get your head around because there’s a lot of noise going on, right? 
So it was back then that we conceived this Industrial Revolution 4.0 series that we’ve been running for the last few months here at San Jose State Thank you to the university for letting us do it So tonight what I’d like to do is not try to solve every question in the universe,

but at least expose you to these gentlemen, to get an idea of what practitioners in the art are doing now Maybe get your interest, you know, I’m sure there will be a lot of questions If a lot of you want to talk after the pitch, I’d be happy to do that as well So I’d like to move on and we’ll go from Chris, to Drew, to James, to Raik And there’s an order that’s implied in that because Chris has been a practitioner for a long time, he’s got his own venture firms, got his own company that he started recently that he may talk about Also, if you look at the history of (mumbling), one of the first solutions out there that was dealing with a lot of the image technology So, without further ado, Mr. Rowen, would you please start – All right, thank you very much So I’m gonna take a perspective here of thinking about some of the broad impacts of this class of technology on key applications and even on the structure of the technology that we’re creating Drew, I think, is gonna do a little bit more of an interesting exploration down into the underlying technology So our two talks in particular are sort of a pair, and hopefully, together we’ll give you something about how it works and something about what its technical and business impact will be So a place that I think is really interesting to start is to think about cameras, to think about image sensors And in fact, we know that the rate of shipment of new image sensors is really spectacular, and we can therefore, assuming, say, a three-year lifetime for an image sensor in the field, look at the population of cameras versus the population of people Now, you might say, well, what does that matter? Well, it matters because the old, conventional view of what cameras were for is that they were to capture pictures that people looked at So if you get a lot more cameras than you have people, it begs the question, who’s looking at the pictures? What are those pictures for? 
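Chris’s cameras-versus-people point can be made concrete with a quick back-of-the-envelope calculation. Every number below is an assumption chosen for illustration, not a figure from the talk, but the result lands at a scale no network or storage system could carry.

```javascript
// Rough aggregate pixel rate if every camera in the field were always on.
// All constants here are illustrative assumptions.
const cameras = 1e10;        // somewhat more cameras than people on Earth
const pixelsPerFrame = 8e6;  // a typical high-resolution sensor (~8 MP)
const framesPerSecond = 60;  // always-on, at a high frame rate

const pixelsPerSecond = cameras * pixelsPerFrame * framesPerSecond;
console.log(pixelsPerSecond.toExponential(1)); // prints "4.8e+18"
```

Even with these conservative guesses, the aggregate rate sits within an order of magnitude of 10 to the 19th pixels per second, which is why filtering close to the sensor matters so much.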
And we see that we’ve just hit a magic crossover point where we really now have significantly more cameras than people So if all our cameras were on and all of our people were looking at the output of those cameras 24 hours a day, seven days a week, 365 days a year, they could not keep up So we really have to think about, what happens when we have to do something to filter all that down to find the useful stuff? And that’s really an important theme here It’s also worth noting that of all data that we capture from the natural world, virtually all of it is pixels Motion sensors are important, microphones are important, humidity and temperature sensors are important, but all of them are very low data rate, or very low volume compared to image sensors And so 99% of all new data that we’re capturing is pixels And so it has a profound effect on what is the shape of computing, what is the shape of storage, what is the shape of networks And just to give you a little bit of a challenge here, if you take the assumption that these are high-resolution cameras and that they’re on all the time, that means that you’re capturing something like 10 to the 19th pixels per second, and that’s far more than any network we can build or any storage medium we can build Another way to look at it is to say, okay, well, what does that mean economically? And I recently looked on Amazon for the cheapest camera The cheapest complete camera system I could buy, which turns out to be $11.99 for this nice, little, high-resolution security camera with power supply, and electronics, and network interconnection, and even infrared lighting – [Jim] And free shipping? – Free shipping with Prime And it sort of allows you to start thinking about, well, if that’s what the camera costs, what does everything else have to cost to make that make sense? And I think what we find is that we are going to see

not just what we do with images, but because images are so important, everything about the cost structure, the technical structure of networks, and computing, and storage will be shaped more than any other single force by the needs of cameras And one thing at the very heart of what we’re gonna do with it is apply these more and more intelligent algorithms in order to interpret the data on behalf of the humans because either we need to get the cameras to be smarter or we have to make a lot more babies So we can ask the question, where does computing happen? And we have, in fact, lots of choices If we think about some set of monitoring cameras out in the real world, we could say, oh, well, we’re gonna capture those images and we’re going to do the processing, possibly some sort of neural network that will find interesting events taking place right in the camera Or we could have that computing taking place somewhere upstream in a local network It may be happening across the wireless and fiber and DSL channels in some cloud edge server 
Or it may be happening all the way up in the cloud And there are very important trade-offs taking place between these two ends of the spectrum, because the cloud is very flexible You have infinite amounts of computing capacity theoretically available, you have lots of storage, it’s a very convenient place to merge together the data from many different sources But it’s actually very expensive, and if you add up the costs of storing there, transporting all those pixels up there, it’s prohibitively expensive to consider taking all the pixels and putting them in the cloud, at least from all these $12 cameras And so, in fact, what we’re gonna find is, we have to make some very smart choices, and these are not just technical choices, not just economic choices, but these are social choices as well as to what happens where So consider the question of system response Generally, making the decisions close to the camera is a good thing If you’re driving your car, you do not want to be waiting for AT&T to decide that you can stop The scope of data analysis, on the other hand, gets much better as you move towards the cloud Protection of privacy is generally best if you don’t let most of the pixels out, if only the necessary information is shared up into the cloud And the costs are dramatically affected When you look at what the cost of a unit of computing is if you do it in a specialized system-on-chip device close to the camera, versus doing it on some GPU or CPU in the cloud, you’re talking about a couple of orders of magnitude of cost difference and power difference and carbon footprint difference and everything else So one of the things that happens is that this move toward neural networks has made everybody in the hardware business extremely excited It really is a fundamental technical discontinuity in computer architecture because we’ve taken a whole wide variety of different kinds of algorithms that need a very wide variety of different computing models and 
replaced them with a computing model that basically says, if you can do a lot of multiplies and adds in parallel, you win And so a whole new class of computer architectures is emerging And in this little graphic here, I look at many of the choices plotted on a horizontal axis, which is computing capacity per chip, measured in billions of multiplies per second, and a vertical axis, which is energy efficiency, in billions of multiplies per second per watt And what you find is, if you track out what’s happening, everybody is moving up and to the right We see the data center GPUs from NVIDIA, who have been extremely successful and extremely dedicated to this segment over the last few years You have new kinds of architectures from new players like Google, who really understand this problem well, introducing new platforms And then you have the evolution of highly efficient architectures like vision DSPs

to be even more efficient in this specialized class, so that the state-of-the-art solutions are dancing around a trillion multiplies per watt, and that’s more than two orders of magnitude more efficient than our general-purpose x86 processors And that two orders of magnitude is a pretty big deal, and it’s really gonna change the landscape And we’re gonna see this kind of computing, perhaps as special purpose subsystems, entering into almost every kind of computing platform, from small chips down in security cameras up through the largest cloud servers, and certainly occupying your cellphone in a significant way There are even people working on some really interesting low-precision analog methods to do the same thing And so my particular interest over the last couple of years has been on startups, both because I’ve done startups, but because I think it is one of the fundamental engines of innovation, not just in technology, but in business models and use models and impact on society And there are a lot of AI startups In fact, it’s a basic intelligence test for startups If you have any way that you could possibly call your company an AI startup, you should, and everybody did regardless of what they’re actually doing, other than they probably are processing a bunch of data in some way And so there’s a lot of sorting out to do because the best working definition of an AI startup is one founded since 2014 But within that– – [Jim] Close, 2015 – But within that, you find there actually are five or 10% of them that are doing something serious with neural networks And we can break those down in lots of different ways and it becomes a very interesting way to understand where innovation is really taking place And so some really basic statistics here Two-thirds of these startups are in the cloud One-third doing embedded systems Half of them are doing vision Only a small fraction are doing new chips, but the 15 or 20 companies doing new chips around the world, and a 
lot of them actually here in, of all places, Silicon Valley, represent a big step function in terms of chip startup activity compared to what we’ve seen if you look back just three or four years So that’s kind of encouraging It’s also really interesting to look at it geographically This is an area, for example, where China is quite active but by no means dominant There are, by my measures, significantly more deep learning startups in Israel than in China today There are more in the UK than in China today And there are far more in the US So this isn’t one of those things that’s fulfilling some of the worries that we’ve long had There’s lots of good work happening in China, but also in a lot of places around the world And so it is a truly global phenomenon, and it’s one where this kind of technology disruption really is putting us in a position to expect a lot of deep innovation – Thanks, Chris Yeah, to give you an idea, you know, as Chris was saying, there’s a lot of startups going right now, we’ll get to this in the question period So many have tried to put AI on, what, 5% or something like that actually do it, and you probably need to test whether they actually do So in terms of innovation, at least from my perspective, those are the ones I want to try to find and foster, and we’ll talk a little bit more about that in the question-and-answer So Drew, I’d love to hear your thoughts, my young friend – Well, thank you very much I am, as Chris mentioned, gonna try to dovetail on his presentation a bit and take it a little bit more into the technology, only because I think it’s interesting, and secondly, I think maybe it will help you better understand some of the trade-offs that are going on out there So there’s a pretty famous diagram that, if you start to get into this space, you will no doubt see It was published by someone at the Asimov Institute, and it tries to summarize all the different commonly described topologies for neural networks And it turns out 
there’s a really large number of them I won’t go into this I don’t understand them all

I’m not even sure I understand what all their colored dots mean But what I think you can see pretty quickly is, these all have a relatively similar shape You see these circles and they’re connected by these lines, right? There’s a network effect that’s built into them There’s a couple really interesting things about this One of them is, for a lot of these topologies, we know much more about how they actually work than why they work And so when you hear terms like data scientist being thrown around, that has a lot to do with the people who have the intuition to say, if I’m trying to solve a given problem with a neural network, which one of these topologies should I pick? And please understand, there are a large number of variations in exactly how you apply one of these The number of layers between the yellow and the purple and the green can vary quite a lot based upon the dataset So the one up here that I’m highlighting is the one that we call the convolutional neural network, which is kind of one of the focuses of the panel today, largely because that one has been shown to do some very interesting things, especially with image data like what Chris just talked about So if we take a look at that one in a little more depth, what we found out is there is a lot of math in making one of these work, and that’s kind of interesting because people who’ve got a background in computers are used to things like decisions, the if statement in most of our programming languages, and these don’t really look that way It really is a lot more math So if we take a look at one of those circles here, that represents a neuron, because we are talking about cognitive computing, which is kind of biology-inspired, and so we’re trying to mimic, in some way, some fashion, what we think goes on in our brains And really the mathematical operation that happens inside of those is fundamentally a weighted sum 
You take a look at the layer before you and you take the values that are output from that layer, you multiply them by a set of weights, and each one of the weights is specific to each one of the arcs You come up with a sum, that’s the closest thing we get to an equation here, probably the closest thing we get to an equation in the entire series of lectures, which I think is a good thing – [Jim] It’s a great? – Exactly, the big sigma? – [Jim] It’s a fraternity? – And then depending upon how you’re doing this, you take that sum and you come up with an output value, and that’s the activation function, and people have lots of different ideas about how you do it, but typically they end up being nonlinear So as we look at the ones that we use for things like image processing like Chris talked about, they’re much deeper than this, and the size of each of the layers is much larger They end up being not just one-dimensional layers, like each one of the columns there, they end up being multidimensional layers And so the total number of nodes it takes to get one output is very large, and if you think about each one of the inputs ideally being connected to each one of the guys in the prior layer, you find that the total number of weights is massive And so if you think about that mathematical operation, the weighted sum, it is a set of multiplies followed by a set of adds That’s the famous operation that has characterized digital signal processors for many years, the multiply accumulate 
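Drew’s weighted-sum-plus-activation description maps onto just a few lines of code. This is a sketch: the ReLU choice of nonlinearity and the particular numbers are assumptions for illustration, not anything specific from the talk.

```javascript
// One common nonlinear activation function (an assumed choice).
const relu = x => Math.max(0, x);

// One artificial neuron: a weighted sum over the previous layer's
// outputs, plus a bias, pushed through a nonlinear activation.
function neuron(inputs, weights, bias = 0) {
  let sum = bias;
  for (let i = 0; i < inputs.length; i++) {
    sum += inputs[i] * weights[i]; // one multiply-accumulate per arc
  }
  return relu(sum);                // the activation function
}
```

Each loop iteration is exactly the multiply accumulate Drew names: a full layer is just this repeated across every neuron, which is why hardware that does many multiply-adds in parallel wins.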
And when Chris talks about trillions of multiply accumulates per watt, that’s what we need These networks can absorb as many of those as you can throw at them Now there’s a bunch of optimizations people try to apply because the energy and the amount of hardware it takes to process one of these networks can get really large very quickly So one of the things they play with is how precise the arithmetic is We started off using general-purpose computers, and we used floating point numbers with a large number of bits of precision Then people started doing analysis and found they could use much, much less There’s actually plenty of research work that tries to use one-bit values We also recognize that by the time you’re done going through the process that’s called training of the network, a lot of times, a lot of the arcs have a very, very small weight on them You can replace those with zero, and I know how to multiply by zero, it’s a really easy thing And so you can build structures and take advantage of the fact that the networks, while ideally they’re fully connected, don’t necessarily need to be, and so they can be sparse Then as Chris mentioned, the sum of products function is relatively expensive in the digital domain, but if your precision doesn’t have to be very high, it can be implemented much less expensively in the analog domain, it turns out
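The pruning idea above, dropping near-zero weights so their multiplies never happen at all, can be sketched like this. The threshold value and the [index, value] storage format are assumptions for illustration; real sparse formats are more elaborate.

```javascript
// Keep only [index, value] pairs whose magnitude matters; everything
// below the threshold is treated as an exact zero and dropped.
function pruneWeights(weights, threshold = 0.01) {
  return weights
    .map((w, i) => [i, w])
    .filter(([, w]) => Math.abs(w) >= threshold);
}

// A dot product over the pruned representation: only the surviving
// nonzero weights cost a multiply accumulate.
function sparseDot(inputs, sparseWeights) {
  let sum = 0;
  for (const [i, w] of sparseWeights) {
    sum += inputs[i] * w;
  }
  return sum;
}
```

With heavily pruned layers, most of the multiplies simply disappear, which is where the hardware savings Drew describes come from.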

So there are plenty of people that are looking at interesting approaches there – Well, and the currency is power, for example – That’s right If you’re trying to get a certain amount of work done on a certain budget, sometimes the analog domain can be the right way to go So through all that, people have tried a whole lot of different ways to go Maybe the most famous early work here was the chips that IBM built, including the very famous TrueNorth design, where they were really trying to more closely mimic the behavior of the brain And they built a chip with our government’s funding that implemented a million neurons and 256 million synapse connections between neurons The most commonly deployed technology in this space right now is general-purpose graphics processing units like the ones that NVIDIA just came out with I think their most recent one is the Volta V100, which has a set of dedicated cores that are now added in to make this type of multiply accumulate more efficient And this one is quite good at both the step of actually coming up with answers to neural network problems, which we call inference, and the step of trying to decide exactly what weights to put in our neural network, which we call inference – Training – Training, thank you Good Lord Chris also mentioned this interesting design that Google did The first design they did, which I think they talked about last year They called it a tensor processing unit, or TPU It is really a machine that is really, really, really focused on doing matrix multiplies Most of the area on the chip is a big parallel multiply accumulate array They can do 64,000 multiply accumulates per cycle, which is a pretty darn big number But because of some of the limitations in the design, it was much more focused on inference and wasn’t very good at training And another design I know a bit about is from a startup company in Campbell called Wave, who took a data flow processor architecture that they had and adapted it to this space So 
they’ve got a design that has 16,000 eight-bit processing elements all connected in a hierarchical network, coupled with a massive amount of memory bandwidth If you look at these four designs, they could not be more different And what that says to me is it’s a very rich space right now for the exploration of architectures But there’s a bunch of implementation limitations So first of all, as Chris notes, the type of system you might want to build, and therefore the type of chips you might want to apply, may depend upon where in the network you’re trying to do the computing Another big question is, what types of services would you want to provide with your neural network Right now, inference is everything Being able to process those images right now with a network that somebody else trained is the way it needs to go But as we look at the expectations that people have for where things are going to go, it appears that there are a number of interesting applications where being able to update our weights in real time based upon what we see at this specific node becomes much more interesting, and so it looks like architectures that support some amount of continuing updating would be more important These weights end up being a big barrier So we have problems with where we keep the weights If we keep them on chip, then we can’t have very many of them because we run out of storage space If we try to keep them all off chip, well then we run into energy problems and just fundamental bandwidth problems of getting them onto and off the chip in time And so what some people do is apply techniques that we use in general-purpose software all the time, like you take a series of computations and you unroll the loops so that you can batch things up That’s a good way of reducing the amount of bandwidth you need for weights, but that increases the overall latency And then there’s this overall question in this space of how do you manage the overall communications The
communications that you need are dependent upon the structure of the network you’re building, and if people keep coming up with clever new ways of doing this, if data scientists keep coming up with new network topologies, new ways of hooking things up, then you need more flexibility in the communication system than you might have immediately imagined – Thanks, Drew Well, this is gonna be the transition We just talked about the possible architectures, the importance of weight information in models, for example The fact that you have sparse matrices that need a lot of compute power And we’re gonna shift a little bit, and purposefully, we have James going next James is a good friend of mine

and somebody who’s taught me a lot about this Just so happens he’s a lawyer, and don’t hold that against him please I asked James to kind of give us some thoughts about the problems we’re gonna have, not just from a technical standpoint, but also sort of, what are we walking into? What kind of hornet’s nest are we gonna see once we get to the other side? And then Raik’s gonna follow up on that and say, yeah, well how do we manage to keep track of everything James is talking about and ensure that we don’t run a car into some school bus Sorry, I had to come up with an image that wasn’t too bad But anyway, so, James, kick her off – So I wanted to talk a little bit about the value proposition for a cognitive science department here at SJSU And I think it’s important to think about the roles that a multidisciplinary department can play in contributing to thinking about this space So if you think about it, it’s not limited to these four silos There’s many, many different departments here at the university that will contribute in this space Engineering, of course, we’ve seen the technical, the highly detailed nature of these hardware devices that are designed to do neural net processing, and the complexity involved is pretty significant It’s important to think about, well how do non-engineers contribute in this space? What’s the role of philosophy, and psychology, and business, and the other departments in a cognitive science program? I decided to address that problem by giving two examples and talking to two different faculty members here at the university who are involved in thinking about problems that actually turn out to have relevance in this space The first person I talked to is Dr. Evan Palmer He’s doing some very interesting research here involving cognitive psychology and gamification I talked to Evan, and we actually agreed that this research could have application to neural network training We’ll talk about that for a moment And then I talked to Dr.
Daniel Susser from the philosophy department And we had a very interesting discussion about accountability of machine learning decisions and who should be responsible, in issues like accountability versus transparency And the point of all of this is, there are a lot of folks here at the university – Well, how do you depose machines? Sorry – Yeah That may happen some time – I know – So, let’s go through this Let’s talk a little bit about some work that’s being done here at the psychology department Gamification is a way of using game-like elements in non-game situations to make people perform certain tasks better For example, people who are looking at X-rays at the airport to scan baggage, or technicians who are looking at radiological images to try and detect disease What Dr. Palmer is doing is he’s applying game-like elements to determine whether or not these tasks can be performed better in some way And so it actually occurred to me, by talking to Dr. Palmer, that this is not that different from what is going on with training systems of neural nets, most notably Google’s AlphaGo, which has a training scheme involving multiple neural nets There we go, now I’m back – It’s alive Yes So, anyway, this is not meant to be a detailed, substantive example, but clearly, there is a role for non-engineers in departments like a psychology department to play in the cognitive science department,

and I think this is a great example where research that’s not aimed directly at the details of the hardware or the details of the network topology, perhaps, could potentially be applied to make neural nets, or help neural nets, think more like humans, which really has turned out to be the key to making some of these systems of neural networks like AlphaGo perform more like humans and perform their tasks much, much better than if we were just thinking about the math alone Another way that I think the non-engineering departments can contribute is thinking about issues of ethics and policy and technical policy And so I talked to Dr. Daniel Susser from the Department of Philosophy I had recently seen an article in the Wall Street Journal, maybe other folks have seen it, published last month It was written by an early pioneer in the neural net space, Curt Levey He was with a startup in the ’90s, HNC Software, that did a bunch of very early machine learning systems that are fundamental to fraud detection And the gist of the article was that we need to think about how we’re going to explain decisions that are made by these emerging machine learning systems Why do machine learning systems reach a particular decision? You get into these issues of opacity Not only how do we explain, but is it even possible to understand the decisions that these networks are making by analyzing the weights alone? Or even going further, the network topology, the code, the training data development, the methodologies, these all may affect the decisions So in the Levey article in the Wall Street Journal, the issue was accountability versus transparency When you start thinking about the policy around analyzing the decisions that are made by these neural nets in the case where something might go wrong, is transparency enough, or rather, do we need to have other standards, mechanisms, or technical solutions to provide for accountability?
Accountability, perhaps, may mitigate the insufficiency of transparency You may not be able to understand precisely why a neural net is making a particular decision by looking at the weights or the topology or the code, but maybe by developing certain factors, for example explainability, to ensure that non-technical reasons can be given for why an artificial intelligence model reached a particular decision, or developing confidence measures that communicate the certainty that a given decision is accurate Procedural regularity means the artificial intelligence system’s decision-making process is applied in the same manner every time And then thinking about responsibility, to ensure that when something goes wrong, there’s appropriate means for those adversely affected to have recourse And so this is really going to be a very important area, and having active involvement from the philosophy department, many, many folks across the spectrum here at the university, is going to be very important I just wanted to highlight some of the thoughts in the conversation with Dr. Susser that I think are very, very interesting with respect to the non-technical analysis of some of these emerging issues One issue is, do technical solutions that provide accountability eliminate the need for transparency? In other words, can we come up with a way to explain the decisions and eliminate the need to look inside the network? A related kind of corollary is, is it even possible for technical solutions to keep pace with the increasing complexity of these machine-learning systems? The devices that Drew was describing are only gonna get more powerful, and the networks are gonna become more complicated, and the weights and the data are going to continue to increase
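[Editor's note: the confidence measures mentioned above, communicating how certain a model is about a decision and declining to decide when certainty is low, can be sketched in a few lines. This is an illustrative sketch only, not anything proposed in the talk; the function names and the 0.9 threshold are invented for the example.]

```python
import math

def softmax(logits):
    """Convert a network's raw output scores into a probability distribution."""
    exps = [math.exp(x - max(logits)) for x in logits]  # shift by max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def decide(logits, labels, threshold=0.9):
    """Return a label only when the model's confidence clears the threshold;
    otherwise abstain so a human or fallback system can take over."""
    probs = softmax(logits)
    conf = max(probs)
    if conf < threshold:
        return ("abstain", conf)
    return (labels[probs.index(conf)], conf)

labels = ["stop sign", "speed limit 45"]
print(decide([4.0, 0.5], labels))  # well-separated scores: confident decision
print(decide([1.2, 1.0], labels))  # near-tie: abstain rather than guess
```

[A caveat: raw softmax confidence is known to be poorly calibrated for deep networks, so a system that actually relied on a threshold like this would normally add an explicit calibration step first.]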

So what does that mean for our ability to explain the decisions that these systems are arriving at? Some important questions posed by Dr. Susser Who are the machine-learning algorithms being explained to? Do we need someone like Drew and Chris to understand exactly what the explanations mean? Or can that even be communicated to policymakers and decision-makers effectively? Which aspects of automated decision making are being audited in the explanations? Is it simply a function of the weights and the different variables, as suggested in the Levey article? Is there some kind of bias in the training set that is affecting the decision making? Who’s responsible for problems? These are all complex questions We’re not gonna answer them here today, and again, I think you can kind of see what’s involved here and why you need a robust university with a multidisciplinary team to address some really, really challenging questions – Thanks, James So this is a great transition Raik’s company is involved in formal verification Formal verification is a method that ensures that we have coverage It’s a Munich-based company, no accident there that they’re very successful, and where they’re successful in particular is the autonomous car world, the safety world So it would be great to hear from Raik and hear his thoughts about where we’re going on ensuring and auditing this Please, Raik – Thanks What I want to do is bring the discussion to a particular context and try to take convolutional neural networks and their prime application, which is vision and decision-making, plus some safety-critical aspects in autonomous vehicles, and try to understand and look at a few questions of what it means, as we have heard before We’re doing a good job in understanding how these artificial neural networks work We can explain the weights and the mechanics behind it We can do a great job on verifying the mechanics of it, at least for small-scale parts of the designs, and we have some issues
maybe verifying the connectivity and all the interconnects, but that’s something that we can solve But that’s not explaining how things work It doesn’t explain why Verification is basically, for us, a matter of providing a convincing argument why things don’t go wrong for a wide range of scenarios So it’s not just a question of whether it works for the cases you have studied Whether a training set is good enough, whether your test set is performing well That’s only one aspect of it Even for this it’s hard to say why it works There is more to it There are other questions that you need to look at if you bring these systems into a safety-critical context One is, for example, they’re pretty hard to train but they’re easy to break There’s quite a few examples where you can easily manipulate images, and you will see that the network will come to some really interesting conclusions, and I put one here from a study that was published in IEEE, where people have put some markers on some stop signs, and it would actually lead the neural network to decide that this is a 45-mile-per-hour speed limit sign instead of a stop sign, which is not really something you want to have in practice, I believe The second point that you may look into is– – [Jim] Although I know people that practice that – It’s hard to defend That means when we look at machine-learning applications compared to more standard ways of engineering, if you look at how people built cars and things like that before, you could go back to the engineer and ask him, why do you believe your system works? Explain it to me What did you consider? What can go wrong? You cannot ask a neural network to do that It’s not explaining itself That means it’s not inter– – [Jim] Just like Trump – Yes, it’s a bit like it He may be a neural network in some way (speaking with strong accent) – Maybe some neural networks, sorry to interrupt – [Raik] I wouldn’t say it

Sorry to mess you up – At least quite a few of them So the causality of the models is the big question here So you cannot actually audit the decision trail You cannot say why a certain decision was made by the network Today it’s not possible, and there is no theory behind that that we can use in order to answer this question Also, because of the complexity, bugs in the operating process are also difficult to find That’s something that we can work on, but it’s not solving the whole problem And there’s the notion, what I picked up from the literature, the recent studies, of uncertainty It’s about you don’t know what you don’t know It’s the fact that the training takes place on decisions you make It’s the training data that you put in, and it will only be answering questions similar to the training data that you have considered You can make a risk analysis on a training set You may know the probability distribution for a certain number of parameters But you don’t know that for all the parameters, because you might not know all the parameters that are relevant in the system So there’s an uncertain risk here You cannot actually measure that, and you cannot take precautions other than trying to understand the environment as closely as you can and trying to model all the different ways the system might be used or the situations it might be exposed to There was one study that I saw that said, hey, you should actually make a collection of all the different scenarios that cars get into, including a car driving on a motorway close to the sea and a tsunami hitting the coast There was one video of a poor guy who actually survived the tsunami in Japan, making really interesting and very successful decisions on how he was maneuvering his car He survived because he was actually assessing the situation, making the best of it, and could get away There was no way an AV would do that if it hadn’t been exposed to the situation before Obviously, this is maybe something
that you say, okay, this is really reckless Who’s telling your system engineers what cases to consider? How do you find out? And last but not least, it is a really complex problem, because there are so many interdependent sub-systems that it’s pretty hard to get the connections between all of them and make a full verification of this So it is said that no one knows what they’re defending against Some of those are real examples, this was the first case Is it actually theoretically possible, or hopeless? So we’re really in the early days of understanding the impact of this technology And what it means is, if you want to apply it, you need to at least find some good practices for how we tackle this And there are some interesting ideas of how we can do that The main point I wanted to make was that the uncertainty is in the training data: the training data forms the requirements This is data science It’s not the engineering science that we have done so far So bugs are the ones that kill you It’s the things that were omitted in the training set that actually will cause fatalities in the end I believe that we can make these work on the things that we know But the things that we don’t know are the problem One way to do it is to insist on models that can be interpreted by people That goes into the question of accountability So you need to find ways of doing it; as a practitioner you may want to exclude features from your training set where you don’t know how they relate to the outcome Just don’t do it, because if it’s in some unknown way related to what you want to do, you may not want to do it, because you don’t control it There are failures of these networks that you might be able to detect, or when your network is less confident in its predictions, you may want to actually put some precautions into the system You can also look at the cycle that I put up here on the right: how car manufacturers want to learn from the experience in the field Verification becomes a concern for the whole cycle It’s no longer just designing the system, verifying the system, deploying the system

Because the data is feeding back into the system from the field, you need to have the verification in the loop the whole time So you need to re-verify the whole model You need to recertify what it’s doing every time you deploy it, otherwise you run the risk of data spoofing, as I mentioned earlier, and other things So last but not least, there’s the systems verification question, and you need to have interdisciplinary views on this It’s not sufficient to just ask the machine guy to come up with the verification And we need to employ a lot of non-machine methodologies in order to control this So as one of the last points here, someone made a comment that AI verification is more complex and interdisciplinary than chip verification, and that’s a complex one If you’re sitting in this industry, you know what this means It’s really a very, very difficult thing to do – Yeah, thanks Raik Hopefully what we gave you so far tonight was an overview, from the hardware to the sort of social issues and verification issues And then, what’s a method by which we can actually design these things Now I think a common misconception is that big data is often said in the same sentence as neural nets and machine learning And we had a pretty nice discussion right before we came on about the training data People kind of know that data scientists have to know what they want, because just turning a convolutional network loose on a big data set, who knows if you get a result at all So Chris, why don’t I start with you, asking you what do you think about that comment by me?
– It’s an interesting perspective, because it’s clear that having a lot of data, big data, is one of the prerequisites for applying this class of algorithms In fact, a good way to think about all of neural networks and deep learning is that it is a statistical method, which is applicable in problems that have gotten so complicated that we don’t know of any conventional algorithm to do a good job on them And so we really use this learning method to get a statist, a good statistical guess, the best guess we possibly can get, based on all of these examples, about what the right answer is or the right behavior So you need big data to do it, but the mere existence of masses of data isn’t very useful, because the real purpose of the big data is to use those examples to train the system And that means you not only need to have big data, but you have to know what the right answer is for all of those examples And so there’s a higher, there’s a higher standard, or an additional need besides just having terabytes of data You have to have useful data – Useful data, or apply it usefully Yeah, so a quick comment on what Chris just said In semiconductor manufacturing it’s a continuous process, so more akin to the pharmaceutical industry than building a car So it’s statistically driven, and we used to operate in the three-sigma range, and in terms of distribution we thought we were geniuses Now with the smaller and smaller geometries, the variance is so high that I have to be in the seven-sigma range So back to statistics for a second, we don’t have enough time to run the statistics So what the machine learning’s allowed us to do is reduce that time and hopefully find that variance So along those lines, I’m the investor, Chris is as well, and I’m the investor tonight And so one of my companies just exited Friday, and it was all about getting to the seven-sigma distributions, and we did that through AI techniques So when I hear these guys talk, you can just imagine that I’m thinking company, company,
company, right? Drew, what do you think about the training data? How hard is it? How important? For example, Wave, one of the companies you’ve illustrated, getting training data for those guys must be really hard – Okay, well, a company like Wave is trying to sell the computing solution part, so they don’t need the, it’s their customers who need the training data But as Chris mentioned, the data by itself isn’t valuable until, as he said, the answers have been tagged Now for a lot of data sets that is often called labeling And to label a data set is actually an awful lot of work And so one place where people might apply their

gamification strategies that James was talking about is in trying to incentivize people to go through data sets, as humans, and label them So that’s like in some of the famous image data sets: that’s a cat, that’s a dog, this is a stop sign – [Jim] That’s cancer, that’s not – Right And so the quality of that labeling is incredibly important to the success of the training process The training process itself is complex The math that I showed was the inference math, but the mathematics and the communication associated with actual training involves trying to calculate, okay, I ran this training data set through my network and I didn’t get the answers I wanted Now what do I do? And then there’s a whole lot of different kinds of math and communications that go on in trying to come up with what’s the best way to update my weights so that the next time I push something through it has a reasonable chance of getting better – Or you can multiply by zero So now let’s see how comfortable James is with this question So we have this training data, it’s got value, commercial value, right? Actually, the training data is IP I don’t want to give my IP up to anybody Right?
– Yeah, I mean that’s really kind of the central issue I think that’s posed by the Levey article that was in the Wall Street Journal I mean he’s suggesting that we can circumvent that issue and not disclose the training data, not disclose the weights, because we can develop technical measures of accountability And the philosophical question, and the thing for the technical ethicists and the lawyers to straighten out, is accountability versus transparency Is accountability alone enough, or do we have to open up the black box and disclose its contents in order to develop technical policies around how accountability is going to be developed for these – Yeah, yeah, for me it’s standard It’s all business, all over that kind of thing And so yeah, I can see a commercial offering at some point – I wanted to chime in with just one additional thought associated with training databases And that is, not only are they valuable, but they potentially capture bias of one sort or another And it’s pretty hard to look at, say, a million photos and say, is there something wrong with this statistical distribution here?
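[Editor's note: the point about it being hard to eyeball a million photos for a skewed distribution is exactly the kind of check that is easy to automate at the label level. A rough sketch follows; the function name, the threshold ratio, and the toy data are all invented for illustration.]

```python
from collections import Counter

def flag_imbalance(labels, ratio=2.0):
    """Flag classes whose count deviates from the mean class count
    by more than `ratio` in either direction, a crude check for
    over- or under-representation in a labeled data set."""
    counts = Counter(labels)
    mean = len(labels) / len(counts)
    flagged = {}
    for cls, n in counts.items():
        if n > ratio * mean:
            flagged[cls] = "over-represented"
        elif n < mean / ratio:
            flagged[cls] = "under-represented"
    return flagged

# A toy labeled set: one breed dominates, another is rare
dataset = ["cocker spaniel"] * 90 + ["bulldog"] * 30 + ["norwegian elkhound"] * 5
print(flag_imbalance(dataset))
```

[Of course, this only catches imbalance in the recorded labels; the subtler biases the panel worries about, such as skew within the images themselves, would not show up in a label count.]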
Am I over represented, you know– – Cocker spaniels – Cocker spaniels over Norwegian Elkhounds – Bulldogs – Well, it’s kind of hard to tell just by glancing at it And in more profound ways we actually need to think about the robustness of our systems as being bias free, whatever that means But it means something, in order to be able to use these systems effectively – So you think, that’s a great lead-in for Raik, actually – Well, I want to come back to the question about the data and where it comes from, and that’s actually the gold of the next machine age It will actually be probably one reason why people want to push computing to the edge Because if you control the data and you actually do the computations and you learn from it and you control the whole flow and it’s part of your IP, it’s part of your system, you’re not gonna, you don’t need to give it up It doesn’t go to the cloud No one else needs to know what it’s doing, so it can protect your IP If you can get away with accountability, then this is gonna be a really good thing for people doing that But if it needs to be transparent, this is gonna be a challenge, and it goes the other way around If you have these trade-offs that you have heard of about the data quantity and the storage that you need, how do you put any form of transparency in place if you have so much data? It’s probably not going to work So it actually makes it even worse – Let’s see if I, yeah, there’s some Hey, can I invite Sean to come back up? We’re about 50 minutes into the hour and I’d like to get Sean to help me out, invite questions from the audience if possible – Yeah, so we have one mic set up in the back here Don’t shout your questions from your chair Just feel free to line up and ask a question

I have one for James I don’t know if this was discussed earlier, but what about computers as inventors? In other words, what are the legal and policy implications of a computer actually being an inventor? – Yeah, that’s, I don’t think we have thought on that yet I haven’t researched any case on that specifically, but I think it would require modification of the US Code; only people can be inventors at this point So– – I’ll go ahead– – I think to the extent that a piece of AI came up with an idea, and that piece of AI was developed by a person, it would likely be the case that the person would be considered the– – The inventor? – Yeah – That makes sense I mean, otherwise we’d sort of have to say, well, my pencil invented it, I didn’t – Well, maybe cyborgs Any questions from, it looks like Graham’s ready – Graham, do you have a question? – [Graham] No, Sam can go first – Oh, Sam – [Sam] This question is for Chris – Well, you need to turn that on – The mic out in the audience is not working, fellas – There’s a switch right on the end Could you help him out there? – Tech support’s coming There it is Nice ring to it, thanks – So this question is for Chris What do you think is the role of open source processors like RISC-V in AI?
– That’s a good question Two thoughts One, I think there is this open source movement which is touching many, many different areas of intellectual development I think it is most profound in software, and most likely to be successful there, because you can replace a piece of open source software relatively easily if you find it isn’t everything it was meant to be Open source so far has played a very minor role in hardware, partly because it’s really the case that things like processors are such essential technology that people want a robust architecture with great support And heretofore, there have not been open source solutions that have really good support Now RISC-V does seem to be a departure, in that it appears that there may be enough of a critical mass in the ecosystem to have it survive On the other hand, the RISC-V architecture is more like the Intel x86 at the bottom of the chart than any of the other architectures that are shown there And it’s these new architectures, which are extremely parallel machines, that are going to be the ones that do essentially all of the evaluation for inference and do all of the training Conventional RISC architectures are really going to be supervisors on the side So I think that open source is becoming relevant, but there’s nothing about this revolution that particularly favors Intel versus Arm versus RISC-V architectures They’re really next to irrelevant to this computational revolution – A couple things on RISC-V The RISC-V event was last week And WD’s gonna use it in their systems It makes sense from WD’s standpoint, because they gotta integrate a whole lot of companies So it cleans up their integration path, but as Chris says, it’s for control The computational engine is this big, massively parallel processing unit There’s a reason NVIDIA’s good at this, because they’ve been moving a lot of pixels for a long time So we’re gonna see something, I believe, out of application processors
that are working in tandem with control stuff So that might be a good segue Drew, what do you think? – Well, I mean, as Chris said, I think the reason that open source, when open source works, it works because you get a network effect, a different kind of network But you get a network effect You get a critical mass of people who are willing

to contribute and care enough about the result that they themselves, or their company, allows them to invest some time for the benefit of the community And so the whole is worth more than the sum of the parts So that’s been a big challenge for hardware-type structures, because typically, semiconductor companies don’t like to let their engineers give their time away to such tasks People try to keep things to themselves And when you’re pushing really hard on numerical metrics, like the number of things you can do per second, people get all excited about what they believe to be their competitive advantages In this space, as Chris pointed out and I tried to elaborate a bit in my slides, it really is all about a specific branch of math And I see a very, very rich set of places where universities could play, putting together ideas for technologies that potentially could be non-differentiating open technologies that people could use as accelerators in some of these systems I think that’s a great area for academic research – James, what do you think about open source here? – I think– – Or on the hardware, I think – Yeah, on the hardware side, I think RISC-V could play an important role as a supporting technology I tend to agree with Drew’s comments on the IP issues I think to the extent that hardware companies have technologies that confer a particular advantage in the neural net computation and training arms race, they’re gonna wanna keep those close to their vests They’re gonna wanna file patents around these things, either as a protective measure or a source of licensing revenue, and that’s problematic in the context of a RISC-V-type model – Yeah, I mean we didn’t disclose, but you can Google James and you will see that he worked for Qualcomm down in San Diego, so he’s well versed in this Raik, what do you think? Did open source give you a good headache, buddy?
– Open source hardware, I think not really, because as we’ve discussed, the applications for open source hardware are actually in those upper-layer companies, so it’s not really the same environment as if you had this open source software And the other reason that I would like to add is that potentially there’s a difference between software and soft cores It’s not really soft, what we want to do in hardware IP And there’s also the cost of the hardware itself So it’s not like software, where you have the engineering cost once and then you have a community and it’s distributed and copied as many times as you want There is a substantial cost in the hardware itself, and that makes up a big part of the cost for the system that you open source It’s not gonna come for free, even though it’s open – Yeah, free is never free A great example of that is, check out the market cap of Red Hat It was free support for Linux, and they’re like a $30 billion company, for God’s sakes How’d that happen? So we got some more questions? – Hi, I had a question on the social side It seems like with all of these sensors that are available, I can capture all of my data, all of my experiences as a human being from birth onwards, including my biometric data And it seems to me it’s almost like a privacy issue Shouldn’t I own all of the training data that can come from me as a human being? And shouldn’t it be only as an opt-in that you can collect any kind of training data about me?
And maybe there’s a business model for managing people’s personal data I don’t know, but it seems to me on a social level, people should be able to have complete control over their data, or at least complete visibility into what’s happening with their data – I think that’s a really important distinction between visibility over and control over And I think it’s always useful in these cases when we’re talking about automatic data collection to substitute human data collection So in society, in the absence of technology, do you have complete control over what everybody thinks of you?

Other people are observing you and forming opinions about you and you don’t control that Especially if you go into public places, people are going to be sensing you, learning about you, training their brains about your behavior And it seems implausible that we could say that we own everything that comes out of an interaction with other humans However, I mean clearly this is data collection on a massive scale and data collection with the possibility of reuse for purposes that we never had in mind But we need some practical ways to draw the line I mean today I think the law generally is if you go do something on the public street, then by actively going out in public, you’re giving a kind of permission for other people in that public space to sense you But what you do behind closed doors remains yours So we have kind of this very crude separation But we’d better ask the lawyer here – I think that to the extent that you’re interacting on a public street, as Chris used in his example, and you’re being sensed, you’re not gonna be able to control the use of what others learn about your interactions with others or things in the public space – But what about in the workplace? Do I have control over the biometric data that I’m emitting in my workplace? 
– Yeah, that’s a more complex question that would probably be governed by contract I mean really, you’d have to look at the contractual agreements between you and your employer I mean that could be something that’s covered in an employment agreement It might be an issue for an employment agreement in the future And the disposition of that data and the ownership of that data is all something that would have to be straightened out via contract You have to contrast that private setting with all of us in a public setting And I think the law is pretty well developed with respect to privacy issues, or your lack of any clear right to privacy, in a public setting In a workplace or a private setting, it depends – The only example I can give you where there’s process or policy in place is on health records So one of the companies I’m involved with has a contract with the Veterans Administration and we’ve automated an aggregation of their health data Which is useful because it’s in disparate places and different formats So we get all that VA data into a big data set We start looking at trends for the veterans Now what you wanna do is ensure, and what the law requires by the way– there are laws, HIPAA laws on this– is that we don’t ever identify the patient themselves We can talk about behavior, trends, things like that So we gotta really abstract the data So I would argue that at least in the case of the Veterans Administration and health records, there’s thought and law there Now the first place it’s gonna migrate to is the rest of us So I think health is probably the first place – I think it’s really useful to also recognize that different parts of the world are wrestling with this very differently Europe is on a trend towards much greater protection of privacy, much greater assumption that people retain rights And the other end of the spectrum may be China Where surveillance is– – [Sean] Everywhere – Everywhere And the idea that somebody’s going to claim 
that their behavior on a public street is private information, I don’t think is gonna play terribly, terribly well – I think it might be important to think about who’s collecting the data too Think about your interactions in your vehicle and you’ve got Waymo driving around, sensing I mean is your behavior in your vehicle immune from capture by those companies that wanna understand driving behavior to feed their autonomous vehicle systems? I don’t think so I mean I think that data’s out there for everybody to collect, but distinguish that with perhaps the government trying to collect information about the way you interact in your car So I think who’s collecting the data and what the purpose is– these are all very intertwined questions – Well the scientist in me, let’s talk cars for a second

So he was saying give me the data set, give me all the assertions that I can plug into my verification system And if I have everybody’s driving behavior, then I have a big database that I can utilize to ensure I save lives So who makes that decision? That’s above my pay grade But it seems to me, yeah I’m not particularly happy with them knowing where I am all the time, although I never go anywhere because I’m an old guy You just sit home, and Friday night’s Denny’s, so I’m pretty predictable You know, the special’s $6.49 for anybody over 55 But with that said, I’m kind of okay with them taking that data and plugging it into an autonomous thing Raik, what do you think? – Well I think as a European and a proponent of privacy, I think it’s one of the most important cornerstones of democracy and freedom And you shouldn’t give it up too lightly And in this particular case, going back to Graham’s question, I think you may want to have some control over where the data goes that you produce So there is the first stage where you talk about the cars collecting data about you And there may be some reason for doing that in order to provide some service to you that you actually want Now leaving the government aside, and the Chinese government particularly, it is then the question, where does that data go from there and who is collecting data from multiple sources And Jim, your health insurance might still go up because people are collecting data from other sources, making connections which actually identify you as Jim Hogan, even though your record was anonymous, because there was nothing personal in it There’s enough study out there that makes clear that from enough data you can still go back to the source and identify individuals And you will have all this data reconstructed from multiple sources – It looks like they’re piling on I mean that second scoop of ice cream might show up on your health record – The third sure 
does But okay, who uses Waze Anybody? Yeah man, if you live in Santa Cruz you do And so that’s a great example of everybody using your data Now it was really great before everybody was using it But okay, that was kind of a useful experiment, but I don’t know, it’s lost its utility I think So another whole lecture – [Sean] We should take another question right here – Hi, I’m Edmond It seems as if most AI projects will run on a new type of chip, so it’s silicon Silicon is hard and as of now we have a big wave of consolidation in the chip industry So my expectation is that in five years there will be two electronic chip companies left So my question is to those who are in the startup industry, what’s the point of having a chip startup in that kind of environment? – So I mean if you saw Chris’ data, actually we’re seeing an expansion in the number of small chip companies because of these opportunities The trend 15 years ago was absolutely in the direction you’re talking about We had everyone trying to chase this one socket in that phone thing that was going into our pockets and it required huge investments Hundreds of millions of dollars and thousand-plus person design teams, all chasing this unicorn of an opportunity For better and worse, that market changed a lot Large semiconductor companies started scattering those design teams into different application areas And some interesting things happened So we have this set of emerging technologies Today we were talking a lot about the neural network space There’s been some fascinating work done in trying to accelerate things like cryptocurrency, blockchain activities There’s a lot of small companies, incredibly economically focused companies in that space And so actually there are, from my perspective, a lot more small semi companies today than there were five years ago – Yeah I mean let’s stay on the cryptocurrency thought for a second, right So I got a company, it’s an ASIC company, builds ASICs for people

And they got a pipeline of 20 Bitcoin-like companies They’re bringing their FPGA algorithms over to build in an ASIC because they can do it 100 times faster And that 100 times faster means they have that much more revenue they can produce So economically, it’s worthwhile So we’re seeing a lot of little guys, I mean a lot of small volume processors – I think there are really three trends that are woven together First of all I actually agree with the basic premise of the question I don’t think that there is something about neural networks that says, oh yes, semiconductor consolidation is gonna be reversed Because that’s much more about manufacturing scale and distribution channels, which are not fundamentally changed by neural network technology And so I think there are a lot of startups I think some of those startups will fail I think some of those startups will be acquired A very small number of them may survive over the long run, but we’re not gonna see 17 of them as major semiconductor companies 10 years from now Impossible But I think there’s also the trend that says that a lot of semiconductor innovation is now taking place in systems companies, not in semiconductor companies And it really reflects the fact that due to integration and due to the cleverness of software methods and higher level algorithms, like deep learning, system companies often have much more of the really important rare insights compared to chip companies The chip companies are good at producing chips for the problems that are already very well understood But I think we’re enjoying a period of very rapid change in how you solve some of these really big problems And that advantages the system companies And some of them, like Google most recently, will choose to leverage that know-how in their own silicon designs So it’s true, we do have this consolidation taking place in the chip companies, but in the background you have this re-vertical integration taking place in lots of other places Often 
with very particular applications in mind And I think it is in fact this phenomenon where more and more of the real value add in systems is happening in higher layers, in software Partly because of Moore’s law deceleration Partly because there’s just so many interesting problems out there to solve that can be solved at the software level And so just as a very personal dimension of it, I’ve been a semiconductor guy since I was working on 1K dynamic RAMs A very long time ago And I really reached the point as an investor, I don’t invest in semiconductor companies because that’s not where the action is I really believe with my heart and my wallet that the innovation opportunities, the pace at which you can do things, is much higher in the software domain So I am starting a new company in speech processing applications It’s very technically intensive, it cares a lot about high compute, and it’s never, I hope, going to do chips – Okay let’s just look at the Google example Are they actually going to be a merchant semiconductor supplier? They’re gonna consume all that, in all likelihood, inside Well maybe they will put a chip in an Android device or something But with that said, it’s unlikely that they’ll end up being a big volume semiconductor guy So the systems folks are realizing that they get a lot more cycles and a lot less energy by getting to an ASIC So all this is true This is a constant debate – [Sean] Jim, this was an observation from Twitter from Catherine and she feels like there’s a disconnect between Silicon Valley and the humanities because she basically says, tech people think we just sit quietly reading a physical book No, a lot of what we do is construct and analyze narrative So that was just her observation, but would that be a topic of social engineering? – Well I’m gonna relay this to my panel I’ll pass it to the person to my immediate right – Thanks a lot for that question What was the question? 
– [Sean] It wasn’t a question It was an observation that there’s a disconnect between the technical community and the humanities

– I see So in some sense that’s true, because we’re still trying to understand what we do here on the technical side As I mentioned earlier, it’s like we don’t even know why it works How are we supposed to explain that to someone else, and then understand the consequences of what we’re trying to do? I think that’s a bit early This discussion will come, I think, and there will be discussions about the implications of what the technology will do The question really is when is the point to have this discussion What’s the thing we need to find out And I think right now, we are still trying to understand the consequences, the potential ones, of what we’re doing and obviously we need to be careful with what we’re trying to release – I am struck with the fact that the premise of the question is sort of that we are just specialists in Silicon Valley But in fact, we are humans first and specialists second We have families, we have all of the usual problems The question of our ultimate demise weighs on all of us, especially as we get older And so we’re living the human experience first, but like so many people in so many places in so many parts of the world, the best way for us to survive, feed our families, do something interesting, is to specialize in a particular way And we specialize around electronics and software here in the valley Just as brain surgeons specialize, and farmers eking out an existence in some demanding ecosystem also are very highly tuned specialists for that But we are, I would claim, humans first and specialists second – Yeah, just to pick up on that Thanks for giving me some cover while I came up with an answer But the way I see it is, rarely can I fix anything I mean I’m just, you know, I’m a math guy But I often can make things better And that’s the specialization side I can bring our skills and our knowledge as a tool for everybody That’s our value proposition, I believe – [Sean] Question here – Hi, I’m an old guy and I’m looking at the neural 
networks and think, those look like analog circuits and all the math is like FastSPICE Which is something that most of the digital guys I see in the panel don’t really deal with So is EDA ready for this? – Actually I think, thanks for bringing it up People that know me say you turn everything into a SPICE problem, Jim But I kind of see it as matrices So matrix solvers, et cetera, et cetera, et cetera Analog’s great because you just cycle You only use power when you cycle So you put it on the edge where you have a battery maybe and you can get 10 years worth of battery life out of it So I think the analog– me personally, that’s one of the things I’m gonna work on I’m interested in that, as well as all of the other things, but the analog solution area is a good one You’re bringing up a great point – I think it’s still the case, with lots of people having tried to make it otherwise, analog is still more art than science And that’s a little bit like the whole neural network problem today where as we say, we don’t know why the things work So people come up with an insight of oh gosh, if I add another layer of this shape here maybe I’ll get better results Let me throw a bunch of data at it and back propagate and train and my gosh, it’s worse And so it really is that domain (speaking over each other) Pardon me? 
– It’s a good fit – Yeah – Two drops of black magic together – So I don’t think it’s an EDA problem yet It could become one, but I don’t think it’s gonna be a problem yet – I’m gonna take issue with the premise of the question because I don’t think the interesting dichotomy is between analog and digital I think the interesting dichotomy is between programmable and non-programmable Because non-programmable fundamentally assumes knowledge has stabilized, we are able to freeze this problem in a way that we are either gonna go build the digital circuit or go build the analog circuit that does this one thing And that we’re willing not to really change it in important ways for the next two or three years that it takes to get that, either analog or digital chip, out there I think that if the large arc of technical history

tells us anything, it says that software wins That of course everything’s running on some hardware deep down inside, but the layers of innovation and the layers of value add that are accumulating in software are so rich and so deep that we really need to be ready to leverage those things And yes, of course, the non-programmable system is always more efficient than the programmable system But the non-programmable system is really a bet against learning And history says, people are gonna go on being creative So I bet on programmability every time Programmable analog, great Programmable digital, great But just not non-programmable – There’s a fair number of startups going into the analog space in machine learning, so that’s a relevant point But I think the question earlier connects to the systems question, and if you look at the most profitable systems companies in Silicon Valley, they actually do have some hardware business inside It’s actually a good thing to have it So I don’t fully agree with the software-only thing when it comes to making a valuable business– – Now the panel’s getting fun – I’m not saying that there isn’t a role for hardware, but you think about the software experience first and that’s where the innovation, I think, has predominantly been driving value for the last couple of decades – I think it’s actually not even software anymore It’s the idea, the experience that you provide to the user And it doesn’t really matter if it’s hardware or software It’s only a means to an end So if you look at it from this perspective, maybe we can say that it doesn’t really matter how you implement it, software is maybe the most flexible way to do it and get to the money quickly If you make it profitable, you may want to add some hardware But the user actually doesn’t care as long as it works – I’m really glad we invited you Because we would have never gotten that out of any of us And so yeah, let’s think about Chris’ chart The edge, the cloud Over here, 
maybe some analog Not much programmability If you can afford some programmability, okay But if it’s gonna be low energy and you want it to last a long time, you know software takes a lot of energy And so there might be a place I don’t disagree with you that software, this stack of software, has allowed us to have a platform for multi-generations without changing the hardware, which is really cool – But Jim, changing a million devices costs a little bit of money So if you got it wrong and you deployed it, then having some programmability is incredibly valuable – Oh now they’re piling on Bring it on, it’s great – Well and I think there’s ample evidence that to some extent you can have it both ways I mean you can have things that are very highly energy efficient and which are highly programmable And most of the really significant hardware innovations have taken place in things that are pretty programmable That’s the only reason why we’re so fixated on processors along the way – Actually I don’t even care, as long as it meets the parameters of the system problem And so you know, darn it, we’re out of time – [Sean] Yeah, we’re running a little over, but could you just do one question here This gentleman right here, he’s been waiting – Hello, so I wanted to know your views regarding artificial intelligence for securing networks and infrastructure There’s a couple of startups, one of which I actually work for, but there’s all this kind of debate going on My second question is securing your artificial intelligence systems – So I mean I think in defending networks, pattern matching is fundamental And that’s where we know artificial neural networks have a leg up, because you can feed a lot of example attack vectors at them, and then they will recognize the known ones very, very quickly How good are they going to be at predicting new ones? 
I think there’s a lot of interesting work going on there now to figure it out – Yeah it’s an immune system problem, right? You inoculate yourself with a known virus and the CNNs will work well Are they gonna figure out new viruses? – I mean there are a half a dozen security companies using pretty serious deep learning on my list And so I take it as a proof by example that it must work to some degree Though how you get enough examples– I mean the number of examples you can get, relative to say what you can get in areas like vision or speech, has got to be a lot lower And that makes it much tougher to generalize if you don’t have a whole lot of data – Kind of back to the training question in a way
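The "known attack vectors" idea the panel describes can be sketched as a minimal anomaly detector: learn the distribution of normal behavior from examples, then flag measurements that deviate sharply. This is an illustrative toy (hypothetical packets-per-second values, plain z-scores), not any particular product's method:

```python
import statistics

# Hypothetical "normal" feature values observed during training,
# e.g. packets-per-second measurements on a healthy network.
normal_traffic = [100, 104, 98, 101, 99, 103, 97, 102, 100, 96]

mu = statistics.mean(normal_traffic)       # 100.0
sigma = statistics.stdev(normal_traffic)   # ~2.58

def is_anomalous(value, threshold=3.0):
    """Flag a measurement whose z-score exceeds the threshold."""
    return abs(value - mu) / sigma > threshold

print(is_anomalous(101))  # in-distribution, like the known examples -> False
print(is_anomalous(500))  # far outside anything seen in training -> True
```

As the panel notes, a detector like this only recognizes deviations from what it was trained on; whether it catches genuinely novel attack patterns depends entirely on how representative the training data is.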

Well, did we get to your second question? – [Man] Not yet – Okay, not yet Somebody remember the second question – Yeah How do you secure these artificial intelligence systems? At some level, it’s a piece of software running on a piece of hardware The issues are not fundamentally different from any other software running on a piece of hardware But there are of course some additional areas of exposure One of them is, as I think Drew alluded to, and Raik alluded to, spoofing these systems is actually a big issue And so there are ways of getting at them through their back door in new ways and getting them to do things that weren’t expected by giving them inputs that they never expected And so I think that robustness of the problem definition is leaving lots and lots of wide open doors that do represent kind of serious security and robustness questions – Yeah, yeah I mean, at least in the few projects I’m involved with, it’s finding the behavior that is mal-behavior, the breach So that is a big challenge So we’ll see where it goes If it’s okay I’d like to conclude if I can – Yes Yes you can and the next event will be in the new year February 23, right? 
– [Jim] Yeah and while– – [Sean] 21st – 21st and while we’re on the subject, as I tried to explain in prior ones, we set this up eight or nine months ago and I’ve been fortunate enough to have this august panel volunteer their time and join us And we’ve taken a little bit of criticism, a lot of criticism, for not having females on our panels Obviously there are females that can help us in this understanding and I can guarantee you that next year as we work on the agenda and the panel discussions, we’ll have representation for sure Hopefully you understand that the planning of this took a while And while I’m here, I’d like to thank my august panel You guys did fabulous Thank you so much for giving us of your time (applauding) I really appreciate the support of San Jose State University I hope and wish for them as practitioners that you all can develop the cognitive science curriculum here and let’s see some students graduating with master’s degrees and undergraduate degrees in cognitive science That would be our reward and I would really appreciate it And I do appreciate the time that the university has given me this year I’m a, what do they call it, a distinguished alumnus I wasn’t when I was here So there’s some kids going to school right now that are probably where Jim Hogan was And if it weren’t for San Jose State, I certainly wouldn’t be here today So I wanna thank the university for not kicking me out when they could And for really, really making a difference in my life One of the people that’s not here today is Emily She’s got pneumonia for God’s sakes And Emily was the real driving force behind this Anybody that’s ever tried to manage me knows it’s been pretty hard, and God bless her, she managed to get all this thing done So God bless Emily, I hope she’s well soon Anything else? – No, I just, one more time for Chris Rowen, Drew Wingard, James Gambale, and Raik Brinkmann And the big kahuna, Mr. 
Jim Hogan right here (applauding) – Thanks for hanging out – [Sean] Yes, we’ll see you on the 21st of February Thank you so much – Thank you so much

Cloud OnAir: Getting Started with Google Cloud IoT

INDRANIL CHAKRABORTY: Hello, everyone Welcome to “Cloud OnAir,” live webinars from Google Cloud We are hosting webinars every Tuesday My name is Indranil Chakraborty I’m the product manager for Cloud IoT Core And we also have– SAMRAT BAIRARIA: Hi My name is Samrat Bairaria I’m a technical program manager at Google Cloud, focusing on IoT Core INDRANIL CHAKRABORTY: And what Samrat and I will be talking about is the Google Cloud IoT solution and Cloud IoT Core You can ask any questions any time on the platform And we have Googlers on standby to answer them So let’s get started As a starting point, we need to clarify what we mean by IoT The term is overloaded, as you know But let’s stop and think about it a bit Imagine a city with no congestion at all Can IoT do that? Imagine an airport with no delays Would IoT be able to play a role in that as well? Imagine the perfect energy consumption every day and every place How would IoT make that happen? In health care, as well, can you imagine medicine that would be tailored to each person? Again, can IoT help with that as well? Like we see with games, but now with physical toys– these are the use cases for IoT, connecting the physical world to the cloud, and being able to get a comprehensive view of an ecosystem, embedded with those fantastic outcomes IoT is all [INAUDIBLE] Yes But the reality is that IoT is everywhere, and is the key to a better outcome for everyone Now, it is estimated that 8.4 billion devices were connected just in 2017, which is more than 30% higher than the previous year We are in a fast accelerating part of an exponential curve A single jet engine from GE generates about 500 gigabytes of data per flight Two jet engines per plane means 1 terabyte of data How many flights per day? You see where I’m going with this, right? 
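The scaling argument above can be worked out directly. The per-flight figures come from the talk; the flights-per-day count is a hypothetical round number added only for illustration:

```python
# Figures from the talk: a GE jet engine produces about 500 GB per flight,
# and a plane has two engines.
GB_PER_ENGINE_PER_FLIGHT = 500
ENGINES_PER_PLANE = 2
FLIGHTS_PER_DAY = 100_000  # hypothetical global figure, for illustration

gb_per_flight = GB_PER_ENGINE_PER_FLIGHT * ENGINES_PER_PLANE
tb_per_day = gb_per_flight * FLIGHTS_PER_DAY / 1000

print(gb_per_flight / 1000, "TB per flight")  # 1.0 TB per flight
print(tb_per_day / 1000, "PB per day")        # 100.0 PB per day
```

Even with a conservative flight count, fleet-wide telemetry lands in the hundreds of petabytes per day, which is the point the speaker is driving at.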
Pratt and Whitney’s geared turbofan, fitted with 5,000 sensors, generates up to 10 gigabytes of data per second A single twin engine aircraft with an average 12 hour flight time could produce up to 844 terabytes of data To put it into perspective, it was estimated that Facebook accumulated around 600 terabytes of data every day in 2014 But with more than 7,000 engines, Pratt and Whitney could potentially generate zettabytes of data once all the engines are in the field Devices are much faster, and they’re constantly generating a large amount of data And we are at a time where we are generating more data than ever before And not only are our devices being connected and generating an extremely large volume of data, it is happening in virtually every industry If we look at manufacturing or industrial, we are seeing impacts to every aspect of business, from connected devices on the shop floor, to connected products in the field A paper machine– you know, the one which makes diapers, for example– generates about 4 terabytes of data per machine annually Or perhaps a gas or a steam turbine generates a ton of data annually as well In fact, we have partners who are already delivering solutions for manufacturing built on the Google Cloud Platform, which we will talk about as well We are also seeing IoT solutions in health care, whether it is personalized medicine or asset tracking in hospitals A company called Dexcom, which continuously monitors glucose today, is sending 10K compressed [INAUDIBLE], which is expected to grow to about 1 terabyte of data per month This is compressed size The extracted data will likely be 10x more than what it sends from the glucose monitor If we look at the transportation industry, for example, we’re seeing significant changes there as well We’re seeing applications in commercial vehicles, passenger vehicles, and their suppliers For example, we’re seeing the use of telematics to improve customer service by responding to 
early signs of failure in commercial vehicles, and avoiding unexpected breakdowns and towing This use of telematics and maps is, again, generating a ton of data, which can then be processed

to get meaningful insights from the field If we look at the consumer products industry, we’re seeing a number of use cases there as well, for example marketing personalization Recently, we worked with a large toy manufacturer focused on reinventing their business, driving innovation and disruption So you get the point When you think about these large volumes, billions of devices connecting to the cloud or through the internet, it results in a massive amount of data from all these devices, which is much larger than what humans have been generating so far So this is IoT data And Google has been dealing with this kind of data for a long time In fact, Google is one of those companies which has seven products with more than one billion 30-day active users each And all of these devices and users are generating a ton of data, in petabytes or in exabytes, every day And what happens when you collect data at that massive scale? Well, you become a master of big data And you become a data driven company This is our strength This is what Google brings to customers that nobody else can Mastering big data has some benefits We can use the data to get key insights, we can build machine learning models, and so on and so forth It allows you to supercharge innovation, and bring new answers to hard problems We have been working on this for a decade And we have used our knowledge to help our customers and our developers in solving these big data and machine learning problems when you collect data at this massive scale We have worked in the energy sector with our data center optimization We have also worked in many other fields as well When we are talking about IoT, we don’t just focus on one IoT product, or a couple of IoT products We really want to drive the conversation to a business outcome enabled by Google Cloud IoT– analyzing big data in stream or batch, and then applying machine learning capabilities It is together, Cloud IoT as a solution, and under the Cloud IoT umbrella, 
that customers will be supercharged So what does it mean when we say a Cloud IoT solution? Well, as we started talking to our customers, partners, and developers, three challenges appeared And Google, as a company, has been dealing with these challenges for a long time First is securely connecting things Security applies to device connectivity, communication, and identity Nobody wants their devices to be hacked And you’ve heard cases in the past where this has happened And you know the consequences So how do you make sure that all these devices– hundreds of thousands, millions, or billions of devices which are out in the field– have the right security patches, have the right version of their firmware, and are very unlikely to be hacked and compromised? And how do you make sure that all those devices can securely connect to Google Cloud or to other cloud platforms? Second is scale We’re talking about unprecedented data As I just mentioned, engines generate gigabytes of data per day And so when you think about that level of scale, how do you make sure that you can ingest the data at scale from millions or billions of these connected devices? And that you can store all this data in a cost effective way And that you can analyze and process it in a meaningful way as well And third is actionable insights Once you have this data, what do you do with it? At the end of the day, why does a customer need IoT if they cannot get any meaningful insights? 
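For the "securely connecting things" challenge, Cloud IoT Core's documented pattern is an MQTT bridge where the client ID is the device's full resource path and the password is a short-lived JWT signed with the device's private key. The sketch below builds those two identifiers with the standard library only; the project, registry, and device names are hypothetical placeholders, and the private-key signing step is deliberately stubbed out:

```python
import base64
import json
import time

# Hypothetical placeholders -- substitute your own project/registry/device.
project, region = "my-iot-project", "us-central1"
registry, device = "my-registry", "my-device"

# Cloud IoT Core expects the MQTT client ID to be the device resource path.
client_id = (f"projects/{project}/locations/{region}"
             f"/registries/{registry}/devices/{device}")

def b64url(obj):
    """base64url-encode a dict as compact JSON, without padding."""
    raw = json.dumps(obj, separators=(",", ":")).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()

# The MQTT password is a JWT whose audience is the project ID.
# A real device appends a signature over header.claims made with its
# RS256/ES256 private key; that step is omitted to stay dependency-free.
now = int(time.time())
claims = {"iat": now, "exp": now + 3600, "aud": project}
unsigned_jwt = b64url({"alg": "RS256", "typ": "JWT"}) + "." + b64url(claims)

print(client_id)
print(unsigned_jwt)
```

In a full client, the signed token would be passed as the MQTT password when connecting to `mqtt.googleapis.com:8883`, and telemetry would be published to the `/devices/<device-id>/events` topic.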
And this is, again, where we have been working with our consumer products and a number of products to understand, to analyze data to get really meaningful insights which we think we can share with the audience at large So, as I mentioned, we have been working for a decade on large scale infrastructure and machine learning And we have built our IoT stack on the same technology and tools that we use at Google, for Google Search, for at least two billion Android devices that connect to Google every day And we have an end-to-end solution to derive intelligence in the moment with Google Cloud IoT Here is an architecture of devices

that can connect to Google Cloud via Cloud IoT Core, which is a new service which I will just mention And you can use our downstream services, such as Pub/Sub, Cloud Functions, DataFlow, BigQuery, machine learning, and other services for storage, processing, analysis, and using machine learning for predictive insights as well The great thing about our Google Cloud IoT solution is that you can seamlessly move IoT data across Google services And you can decide which services suit you best for your business use case, and then use them for specific use cases You can ingest data with Cloud IoT Core You can distribute data with Cloud Pub/Sub, apply data transformation with Cloud DataFlow, and then store data with BigQuery, or Cloud Storage, or Bigtable You can perform ad hoc analysis with Google BigQuery, visualize data using Data Studio or Datalab, and then derive intelligence using cloud machine learning So let’s start with IoT Core What is Cloud IoT Core? So Cloud IoT Core is a new service which we announced last year And we just announced general availability of this service It’s a fully managed service that allows you to easily and securely connect your IoT devices to Google Cloud, and then manage and ingest data from millions of global [INAUDIBLE] And we think you would benefit from this service as well So a couple of key benefits of Cloud IoT Core As you know, Google is probably the only non-ISP company which has fiber optic deployments across the Atlantic and across the Pacific Ocean And we have a number of points of presence throughout the globe What this allows us to do is to provide a single global endpoint for your devices to connect to Cloud IoT Core Whether your device, or your vehicle, or your asset is in the US, or it’s somewhere in Asia, or somewhere in Africa, you just connect your device to Google Cloud IoT Core And leveraging our global networking infrastructure, we will connect it to the closest point of presence And from
there, we’ll connect it to our closest data center So you don’t have to worry about picking which particular server you need to connect to You just connect to one global endpoint for Google Cloud IoT Core And we take it from there Cloud IoT Core has two key components One is what we call the Protocol Bridge And second is the Device Manager What’s great about the Protocol Bridge is, we realized that most industrial devices are using standard protocols such as MQTT, and even HTTP in many cases So with Cloud IoT Core, we have native support for MQTT 3.1 and above And what’s great about this is, you can connect your devices without worrying about instantiating a resource, or an infrastructure, or a VM Whether you’re connecting one device, or you’re connecting 10,000, or you’re connecting millions of devices as you get ready for deployment, you just connect your device over MQTT or HTTP, and we internally scale for you So we have automatic load balancing built in as part of the Cloud IoT Core managed service And since Cloud IoT Core connects with and publishes its data to Cloud Pub/Sub, you can remotely access all your device data from a central location And you can monitor all your devices or your factories from a central location that way The second component of Cloud IoT Core is the Device Manager And this is where we allow users and developers to remotely manage, control, and monitor their devices So, first of all, we really care about security And so we have a provision to individually authenticate each IoT device We support asymmetric key-based authentication, where the public key is stored with Cloud IoT Core, and the private key is stored with the device And we use the standard JWT token, signed with the private key And that’s what we use for individually authenticating each device You can use our Device Manager on Cloud IoT Core to update the configuration of the devices, and even control the devices You can, for example, send firmware updates,

or new configuration, or new settings remotely from the cloud to the device Your application just needs to call an API And Cloud IoT Core makes sure that those new settings get propagated to all the devices as they come online We also have added role-level access to different groups of devices So, for example, you might want to grant only View-level access to certain groups of users Or you might want to grant just Add or Delete level access So there are a variety of roles which you can grant by using Cloud IAM on Google Cloud IoT Core And you might have heard this term, digital twin And this is something which is used a lot in the industry Cloud IoT Core essentially offers you a digital twin, which is a logical representation of your device on Google Cloud We also have a rich set of APIs and a UI console for device deployment and monitoring as well So this is, again, a fully managed service with a rich set of APIs which allows you to securely connect, manage, and ingest data at scale from your IoT devices Recently we added a new set of features as well with our recent beta and GA releases You can bring your own certificate for additional security And our users can now bring their own device key, which is signed by the certificate authority, for device authentication Logical device representation, I just talked about, which is essentially a digital twin on Google Cloud IoT Core And we’ve also added support for HTTP in addition to MQTT as well Now, we understand that many of you, whether you’re a developer, or a device partner, or a customer, before you go to full-scale deployment, you would want to build some prototypes, some proofs of concept to understand the ROI, and to really understand the impact of IoT So in our case, we have a simple UI, User Interface, for management of devices, and monitoring those devices as well So here’s a screenshot of our Google Cloud console, where if you scroll down on the left menu, and you go to IoT Core, you can start sort of
adding your devices, and then monitoring them as well So it’s a very simple UI which can really get you started With IoT Core we have a notion of a registry, which essentially is a collection of devices within a specific region So you can start by creating a device registry And then the next step, once you create a device registry– for example, here we have created a registry called Weather Station– once you’ve created a registry, you can add devices to it And as you add devices, you would specify the public key for authentication And there are a bunch of other settings which you can add to it as well Once you’ve added and registered a device, when the physical device connects to Google Cloud IoT Core, you would be able to see the status of those devices on the same UI console And with this UI console you’re also able to not just monitor the incoming data, but you’re also able to send configuration or updates to the device from the UI as well So we’ve made sure that we not only can handle massive scale, but we also have a simple UI for you to get started quickly for your initial proof of concept or prototypes So once you’ve used Cloud IoT Core for connecting your devices in a secure way, and you manage them at scale remotely, you can use Pub/Sub for ingestion and distribution of data The great thing about Pub/Sub is that it has durable message persistence, which means Pub/Sub stores the data which is published to it for seven days So what happens here is, as the devices connect to IoT Core, and as they publish data, as they start transmitting data, IoT Core then publishes data from all those devices in a registry to one Pub/Sub topic And Pub/Sub then keeps the data for seven days So even if you lose some connectivity, you still have persistence of that data And as Pub/Sub is a global service, you can use Pub/Sub to have a central monitoring system for all your globally

dispersed devices In addition, Pub/Sub can be a great tool for distributing data to other downstream services, such as Cloud Functions, DataFlow, or even your own ETL pipeline So you can build simple business logic Let’s take an example Say you just want to monitor the incoming temperature data from your temperature sensor, and every time it goes above 80 degrees Fahrenheit you want to turn on the fan What you can do here is write a simple Cloud Function, which subscribes to Pub/Sub And as it gets those temperature readings from the device that’s connected to Cloud IoT Core, your logic will kick in And then you can send a command back through Cloud IoT Core to turn on the fan DataFlow can be used for more complex event processing, similar to how you use Cloud Functions And here you can have a window And you can say, I want to compute the median temperature over a 10 minute window or a 10 second window This is great for complex data processing as well And what Pub/Sub allows you to do is it allows you to abstract and separate your upstream devices from your downstream application So you can continue to make changes on your downstream application without needing to update the library or the firmware of your devices So Cloud IoT Core in conjunction with Pub/Sub provides you a great set of services for securely connecting your devices at scale, ingesting data at scale, and then distributing this data to downstream services for storage, processing, and analysis So that was Cloud IoT Core and how you can build a solution with Google Cloud for IoT And now I’m going to hand it off to Samrat, who is going to give you a demo SAMRAT BAIRARIA: Thank you, Indranil So the demo which we are about to show you simulates an asset tracking company which can have multiple devices or sensors around the globe I can do it We can just go back quickly So we wanted to showcase how we can build an end-to-end application using Google Cloud IoT Core and other Google Cloud Services,
like DataFlow, BigQuery, App Engine, and visualize data on a dashboard in real time I wanted to quickly go over how the back end is set up IoT Core is designed specifically to securely connect, manage, and ingest data from globally dispersed devices We have set up Pub/Sub topics and device registries to connect the devices Once connected, they send the telemetry data via MQTT to the IoT Core endpoint As raw data is arriving into Pub/Sub topics, we then use DataFlow, which subscribes to the Pub/Sub topics set up earlier Cloud DataFlow is a fully managed service for transforming and enriching streaming data This could be real time or batch data The DataFlow pipeline then dumps the data into BigQuery, which is our data warehouse BigQuery also auto scales And setup is easy All we need to do is specify our data schema to get started So from there, we also have some Cloud Functions set up, which also check the value of the data against Cloud SQL If a certain threshold is crossed, we can then trigger the Cloud Function routine based on the alert required Current threshold setups are defined for weather, temperature, and geofence We are also using Cloud Datastore to store geofencing data And in the end, we’re tying all of this information together using App Engine In particular, App Engine Flex And we are also using the Google Maps API to visualize the data points on the dashboard We are also calling some shipping APIs to know the shipping routes a certain ship might take Now, switching over to the demo So as you can see here, we have built a dashboard where we can see all devices and sensors a typical asset tracking company might have We have the following items on screen We have ships, which are transferring goods around the globe And then we have large containers, which are carrying goods from dockyards into warehouses And then we have trucks out here, which are delivery trucks transferring goods from the warehouses to a customer, or to a certain asset on the ground One thing
to note here is, no matter where the devices are located in the world, you do not need to change which endpoint the data is sent to
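To make that single-endpoint model concrete, here is a minimal sketch of the identifiers a device would assemble before connecting to the Cloud IoT Core MQTT bridge. The project, region, registry, and device names are placeholders, and the actual signing of the JWT (RS256 or ES256 with the device’s private key) would need a library such as PyJWT, so only the claim set is built here:

```python
import time

# Single global MQTT endpoint for Cloud IoT Core; the device never
# has to pick a regional server.
MQTT_BRIDGE_HOST = "mqtt.googleapis.com"
MQTT_BRIDGE_PORT = 8883  # MQTT over TLS

def mqtt_client_id(project, region, registry, device):
    """Cloud IoT Core expects the full device resource path as the
    MQTT client ID (all names below are illustrative placeholders)."""
    return (f"projects/{project}/locations/{region}"
            f"/registries/{registry}/devices/{device}")

def jwt_claims(project, lifetime_s=3600):
    """Claim set the device signs with its private key; IoT Core
    verifies the signature against the registered public key."""
    iat = int(time.time())
    return {"iat": iat, "exp": iat + lifetime_s, "aud": project}

client_id = mqtt_client_id("my-project", "us-central1",
                           "weather-station", "device-001")
# The signed token (e.g. jwt.encode(claims, key, algorithm="RS256"))
# is then presented as the MQTT password when connecting.
```

This is a sketch of the connection handshake inputs, not a full client; a real device would pair these identifiers with an MQTT library and TLS.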

So you can have devices in the US, in Australia, or even in China You, the developer, do not need to specify an endpoint where the data has to go Google front end will take care of routing the packet to the closest location Let’s drill down a little bit deeper on what kind of information the devices are sending us So in this case, we have a ship And the ships have a bunch of sensors on them And if you can see, this is a passenger fleet out here We have a passenger fleet ship out here And we’re calling shipping APIs to see what is the flag of the ship? And then we have a bunch of sensors on the ship itself, which is streaming data using IoT Core We can see temperature readings We can see the course the ship is taking, as well as the wind speed the ship is encountering All this information is being streamed through IoT Core in real time And we can visualize this information on the dashboard running our App Engine Let’s take another example here of a truck which is en route or which is on a delivery route to a customer So for this truck, again, we have a bunch of sensors on this truck as it’s traveling from point A to point B. We want to track, what is the equipment status? Is it full, or has data been unloaded? 
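Readings like these typically travel as small JSON telemetry messages; once IoT Core publishes them to Pub/Sub, subscribers receive the payload base64-encoded. A sketch with hypothetical field names (not the demo’s actual schema):

```python
import base64
import json

# Hypothetical truck telemetry reading; field names are illustrative.
reading = {
    "device_id": "truck-017",
    "equipment_full": True,
    "door_open": False,
    "cargo_temp_c": 3.2,
    "battery_pct": 87,
}

# The device publishes the JSON body over MQTT to IoT Core ...
payload = json.dumps(reading).encode("utf-8")

# ... and a Pub/Sub subscriber (e.g. a Cloud Function) receives the
# message data base64-encoded, so it decodes before applying any
# threshold or alerting logic.
wire = base64.b64encode(payload)
decoded = json.loads(base64.b64decode(wire))
assert decoded == reading
```

The round trip above is the shape of the hand-off between the device side and the downstream logic described in the demo.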
We want to make sure of the door status This is for security purposes, to see if someone opened the cargo door during transit We also have ambient temperature within the cargo hold And we also have the device battery levels for each device that is on board As the truck is making its journey from point A to point B, it’s streaming data to us in real time And we can visualize this on screen We can also interact with the devices on these trucks For example, we want to set the threshold of what is the perfect temperature the cargo hold should have So let’s say in this case– let’s take it a little extreme Let’s say we want it to be one degree Celsius Once you set that, IoT Core communicates back to the device and tells it, this is a new threshold it needs to act against We also use Cloud Functions in the back end to trigger if certain thresholds are crossed So if you can see on screen, as the threshold has been crossed, alerts are being generated on screen You can also see the history of all the data that this device has been streaming All this information is stored in BigQuery We are passing the data from the device, to IoT Core, to DataFlow, and then to BigQuery And then we are querying it using the App Engine application You can also interact with other APIs like the Google Maps API, and then make this whole application work Let’s take an example out here where we want to set up a geofence And if a device crosses the geofence we want some kind of an alert As you see, if the device is not in a particular geofence, we have a notification on screen So we wanted to show the robustness of IoT Core, and how it interacts with the various subsystems underneath, and how fast this whole application works You can also see historical alerts that the devices have been generating and sending through IoT Core So like Indranil said earlier, it doesn’t matter if you have a few hundred devices, or thousands of devices, or a million devices We
scale automatically for you You can seamlessly interact with other Google products, like the Google Maps API and Datastore, based on data streaming into IoT Core Hopefully this small demo was able to demonstrate the robustness of IoT Core, and how you can use it for your own business logic, or for your own personal projects Thank you for your time, and over to you, Indranil INDRANIL CHAKRABORTY: Thank you, Samrat Let’s go back to the slides So, Samrat gave you a demo of asset tracking– and what was great in this demo was you could see in real time the movement of the assets And whether it’s in the US or a different location, we can still provide the same real time tracking regardless of the location And it uses IoT Core, Google Geo APIs, and a bunch of other services where all of this

comes together for a complete solution So the point I want to make here is, Google Cloud IoT is really a platform for end-to-end IoT data processing So for ingestion, connection, and management of your IoT device data, we have IoT Core and Cloud Pub/Sub We also have Android Things, which I’m going to talk about in a bit For processing, cleaning, and storing this massive amount of data in a cost effective way, we have DataFlow, Cloud Functions, BigTable, Spanner, and even GCS And, finally, to analyze and visualize, and then predict outcomes, we have BigQuery, Cloud Datalab, Machine Learning Engine, Data Studio And you can even use Cloud Functions and DataFlow So Google Cloud really has all the services which you need to build an end-to-end solution, and build some compelling applications, just the way Samrat showed you with real time asset tracking I want to touch upon Android Things, which is our version of an operating system for IoT devices Android Things essentially is Android for IoT And what’s great about Android Things is it gives you three key benefits One is, since it’s built on Android, it’s highly secure It’s a secure-boot operating system And if you have a gateway or a device which runs Android Things, it’s very hard for that device to be compromised Second is, it’s fully managed by Google When you have millions or billions of IoT devices globally dispersed, you want to make sure that all those devices have the right firmware version, have the right security patches, and have the same version across all the devices And just the way we do it with our Android mobile phones and Chromebooks, Android Things also ensures that all your fleet of devices will have the same version of the operating system and security patches as well So that’s great for a large enterprise, and even a medium sized enterprise customer And finally, Android Things comes with support for some of our Google services, such as TensorFlow and machine learning So you
can imagine if you have a camera which runs Android Things and you have a cloud machine learning model, or a TensorFlow model to detect faces of employees of your company, training of the model can happen in the cloud The inference can still happen locally on the camera, using Android Things and its set of APIs So it really makes it easy for you to build compelling applications And a related benefit of Android Things is, since it’s Android, you are tapping into the Android application developer ecosystem as well So it’s really easy to build applications for edge compute as well So with Android Things, you get secure manageability, secure boot, and APIs for machine learning And it works seamlessly with Google Cloud IoT Core We believe we have a pretty compelling solution for not just the cloud aspect, but also the device aspect, so you can build an end-to-end solution for your business specific use cases We also understand that in order for our platform to be more useful, we can’t just do everything on our own So we’ve been working hard to build our partner ecosystem, both on the device side, as well as on the application side So we have a number of device partners, such as Intel, Sierra Wireless, NXP, Arm, Marvell There are a large number of device partners who have IoT devices which work seamlessly with Google Cloud IoT Core So if you’re using any of our partner devices, it’ll work seamlessly with Google Cloud IoT Core We also have a number of SIs and application partners as part of the ecosystem who can help you build your specific application for your IoT use cases, companies such as Mnubo, Agosto, Losant, SOTEC All of these partners can help you build specific IoT applications using Google Cloud Platform and Cloud IoT Core So we are very, very excited about our growing partner ecosystem And this is just a snapshot We have a lot more on the website as well
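The geofence alert shown earlier in the asset-tracking demo can be approximated with a simple point-in-radius test; this is a generic sketch with made-up coordinates, not the demo’s actual implementation:

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius, km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = (math.sin(dp / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def outside_geofence(lat, lon, center_lat, center_lon, radius_km):
    """True when a reported position has left the circular fence."""
    return haversine_km(lat, lon, center_lat, center_lon) > radius_km

# A truck reporting roughly 5.5 km from a 2 km fence around a
# warehouse would raise an alert.
outside_geofence(37.45, -122.15, 37.42, -122.10, 2.0)  # → True
```

In the demo’s architecture this kind of check would live in a Cloud Function subscribed to the telemetry Pub/Sub topic, with fence definitions held in Datastore.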

So why Cloud IoT? I just want to summarize the reasons why we think you, as a developer, as a partner, or as a customer, should use Google Cloud IoT for your IoT application One, Cloud IoT Core lets developers and enterprises easily connect millions of devices, which are globally dispersed, through a single global protocol endpoint, with less hassle, and no worries about scale We take care of automatic scaling And that’s great for applications and developers so that you can focus on your business application and not worry about the underlying infrastructure Second is, as part of the Google Cloud IoT platform, you’re not just collecting your device data, but connecting your global device network to an intelligent cloud Google Cloud machine learning and big data innovations help you to make sense of the IoT data You can perform ad hoc analysis, as I mentioned, with BigQuery You can visualize data with Data Studio And you can derive meaningful, intelligent insights using Cloud Machine Learning And third is, Cloud IoT Core is a serverless architecture It’s a fully managed service So you don’t have to instantiate a separate instance of Cloud IoT Core You just connect your devices to Cloud IoT Core, and off you go with your application So whether you start with a few devices as you embark on your journey with a proof of concept, to when you’re ready for deployment and you want to deploy millions, or hundreds of thousands, or billions of devices, it’s the same application It’s the same service which you just connect to without worrying about the underlying infrastructure Fourth is, given our global network, we can really offer minimal latency And especially in the case of IoT, every millisecond, every second matters And latency is critical And Google ensures that your device data is delivered with the lowest latency using our global networking infrastructure The highest quality of private networks connects our regional locations and data
centers to more than 100 global network points of presence close to your devices and your users This means that the device will always connect to the closest point of presence and benefit from our global network backbone with the lowest latency And finally, Cloud IoT Core works seamlessly with millions of Android Things devices, and devices from leading hardware makers and manufacturers, such as Intel, NXP, Microchip, Marvell, Sierra Wireless, and others And the Android Things device operating system is updated and patched by Google As data is generated at the sensor, it seamlessly moves into Google Cloud for processing, analyzing, and integration So that was an overview and a demo of building IoT solutions on Google Cloud Thank you for your time And stay tuned for live Q&A We will be back in less than a minute Thank you everyone for the questions

that we received from the audience Samrat and I are going to walk you through and try to answer as many as we can So let’s get started So the first question we had was, what are some of the advantages of MQTT over HTTP? So there are a couple of advantages First of all, what we find is that in the industry, many of the devices are already using MQTT And MQTT, as you know, is a relatively new format which was introduced in, I think, towards the end of the 1990s The key benefits are, with MQTT, it’s built over TCP So you can use TLS and achieve the same level of security as you do with HTTP In addition, with MQTT, the payload size, the bandwidth consumption is much lower You can send binary data over MQTT And that saves you a lot in bandwidth So that is one of the key reasons why many of those devices use MQTT SAMRAT BAIRARIA: And you don’t need much storage space with MQTT So you can have a smaller footprint for a device INDRANIL CHAKRABORTY: That’s right You don’t need much storage space And then second is MQTT has this topic structure which really works well for devices, as devices publish data on a particular topic, and you have applications subscribing to those topics to collect data And you can also use it for device to device communication as well So I think those are the two key benefits of MQTT over HTTP SAMRAT BAIRARIA: [INAUDIBLE] INDRANIL CHAKRABORTY: Second question What formats are supported for payload? So the way we look at IoT Core is, we’ve taken a set of data agnostic and format agnostic approach At the same time, our recommendation is, you can use JSON as a format You can also use Protobuf If bandwidth is critical for you, and you want to save every bit, you can use Protobuf as well to send binary data Third question is, do I have to create keys for devices on the cloud? You want to take that? 
SAMRAT BAIRARIA: Yeah Well, you can use keys, or you can also use certificates We have a bring your own certificate mechanism where you can generate your own certificate Give us the public certificate on IoT Core And you can have your device assigned with your private certificate So you can choose whatever path you want to take You can create your keys on Cloud IoT Core You can do it using Open SSL on your machine Or you can get your own certificates INDRANIL CHAKRABORTY: Yeah And I think, if I understand it right, you really want to create virtual devices on cloud for the purpose of testing And the short answer is, yes Even if you are creating virtual devices to connect to Cloud IoT Core, you have to use private keys and public keys, a private-public key combination to connect to Cloud IoT Core Otherwise it will reject connection And on our website, there are very clear instructions and samples which will help you to generate keys, whether it’s for a virtual device, or even for a physical device Then the final question is, how can I use multiple MQTT topics? 
So there are two ways to address this question One is, if you’re trying to use multiple topics on the device side, today we have two parent level topics One for sending telemetry, which is telemetry events, and one for the device to publish its own state information And for telemetry, the topic is slash devices, slash device ID, slash events But you can create subfolders So if you want to send temperature on a separate topic, you can say slash events, slash temperature, or slash events, slash pressure And interestingly enough, just now in our GA release we also have a new feature which will allow you to map these different MQTT topics to different Pub/Sub topics on Google Cloud So you can map the temperature topic to a temperature Pub/Sub topic, to [INAUDIBLE] the topic, and then so on and so forth So you can really use multiple topics, both on the device side, as well as on the cloud side Great So thanks everyone, again, for the time And hopefully this was useful And we’d love to hear your feedback Stay tuned for the next session, End-to-End Machine Learning with TensorFlow on the Google Cloud Platform, live from our Kirkland office And thanks again SAMRAT BAIRARIA: Thank you everyone
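The topic layout described in that last answer can be sketched as follows; the subfolder names are just examples:

```python
def telemetry_topic(device_id, subfolder=None):
    """MQTT telemetry topic for Cloud IoT Core: /devices/{id}/events,
    optionally with a subfolder that can be mapped to its own
    Pub/Sub topic on the cloud side."""
    base = f"/devices/{device_id}/events"
    return f"{base}/{subfolder}" if subfolder else base

def state_topic(device_id):
    """Topic a device uses to publish its own state information."""
    return f"/devices/{device_id}/state"

telemetry_topic("dev-1")                 # → /devices/dev-1/events
telemetry_topic("dev-1", "temperature")  # → /devices/dev-1/events/temperature
```

Splitting telemetry into subfolders like this is what enables the per-stream Pub/Sub routing mentioned in the GA release.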

Recorded Webinar: Will camera technology combining ANPR become a standard ITS data sensor

Good afternoon everyone I will be beginning the webinar Thank you for joining us for our second ITS webinar, in regards to will camera technology combining ANPR become a standard ITS data sensor I am Francis [INAUDIBLE], and Jurgen Patterson will be providing this presentation, but we’d like to thank you guys again for joining us And without further ado let me turn it over to Jurgen Thank you, Francis, and hello everybody I can see we’ve got some friends on from mainly North America this time, but earlier on we were joined by friends from Asia and Europe A quick overview of what we’ll be discussing today I am going to change a little bit When we started putting this together, when I sent out the notifications, we were talking mainly about ANPR As I did more research and became more aware of what was happening, I think it just really needs to be how camera technologies can be used as a standard ITS data sensor, because, bear in mind, ANPR is just one of the elements that can actually identify different vehicles, etc So today there’s not that many slides There’s only about 15-odd slides, which is a lot fewer than last time, but those slides are pretty busy Anybody who knows me knows I don’t read those slides, so if you have any questions please just drop an email or drop a message over to Francis I’ll give him something to do for a change, and he can ask the questions as they come up Let’s keep this a little interactive and a little light-hearted and move forward Now, am I right in assuming that the presentation itself is being displayed on the website? And I also believe the questions and answers from the previous one are also up now as well, is that right, Francis? That’s correct, you should be able to access the first webinar on our website, and this webinar today will also be on our website as
well Just a little word on that We’ve already had quite a few questions come up, which I’ll be going through as we move through the webinar The answers to those questions won’t be available until the new year, I’m afraid Obviously we’re winding down for the Christmas period, or the holiday period I should say, and it probably won’t be done before the new year Anyway, without further ado, let’s get started What I’m going to go through today is a bit of an overview of the current technology being used, and anybody that saw my previous presentation knows a little bit about that I am going to go into a little bit more detail today, and really give an insight into where cameras have traditionally been used in ITS, and look to the future of where we’re probably going to head, the area that I’m looking at in terms of this is what may happen I think we also need to identify how a sustainable system could be implemented within ITS, and the reason I say that is because there are other changes to technology that must also happen in line with what we’re discussing here As a result I’ll go through the issues that camera technologies will face I’ll look at the short term future, and it is really short term, we’re looking at probably 12 months or 24 months, and probably less than that in reality at this point And then we’ll go through the challenges again and the conclusion at the end A quick overview of what ITS means ITS is intelligent transportation solutions, and I’ll put this up on every one of my slides to keep focused on this But that really is everything to do with every mode of transport, so that’s traffic, transit, ferries, airports, autonomous vehicles, mapping technologies or street mapping technologies at the very least, and pretty much walking, cycling, and anything else you can think of It is really the quintessential product of how to get from A to B So moving forward, I’m going to have a look at the current technologies now, and I’m going to go into this in a
little bit of detail, because I think it’s important to understand where they’re failing, in a way that’s perhaps not providing the level of detail we need from an ITS point of view I’ve separated them into general and specific, and to be clear, these are not precise categories By general I’m relating to those that can identify vehicles but not a specific vehicle, and specific is obviously the opposite, where they can identify specific vehicles The first ones we’re going to go through are the general ones, and loops, as we discussed in the last webinar, are probably the most prevalent Well, they are the most prevalent by a long, long way The two examples you see in front of you are both at signalized intersections with loops as detectors, and one of the questions we had at the last webinar was how many loops do you need to actually create an inductive coil And the answer to that is

The answer is actually two. I did look into that, and you can see the inductive coil on the right-hand side as it would normally be delivered. They can be anything which is a vaguely closed shape, so circular, square, or even triangular; as long as it eventually forms a loop, it will work.

The next technology I want to discuss is magnetic pucks, and to be clear, these are all images from the Sensys website, although there are other magnetic pucks on the marketplace as well; the Sensys product is probably the most prevalent of them. Essentially they are the same technology as loops, just in a smaller package. The advantage of the magnetic puck is that it can communicate via a wireless link back to a relay system, which then reports back to a back-end system; but essentially speaking they are the same, and as a result they suffer from the same issues. The main issue is that they're in a very hostile environment: you're digging a hole or a channel in a road, you're taking all the grip out of that channel or hole, you're embedding your device or your cable in there, and then you're effectively putting some kind of sealant on top. The problem with that is, if it's anything above two inches deep, it's likely to be ground off as you repave your road; your grinding machines will not differentiate between your device and the real tarmac.

The other one which is very commonly used is radar. You can see radar being used for anything from speed enforcement, the right-hand picture, to pure data collection, the middle picture; I've even seen them used in parking management systems. One of the advantages of radar, which we haven't spoken about in this particular presentation, is that it can cover multiple lanes, or even both directions of a freeway, simultaneously; but it has to be mounted pretty high. If it's mounted lower, as you can see from the subtle graphic on the left, any high-sided vehicle will shadow, or cause shadowing of, any vehicles running beside it, and you just won't get a true reflection of what's happening on that road system.

Oh, we have a question; please keep those questions coming. This question says: you've separated loops and magnetic pucks, but aren't they essentially the same thing? They are, they're very similar. Really the only differences are that one involves digging a fairly small hole in the road versus digging a fairly large channel, and one has a wireless device while the other has a roadside receiver. But yes, they are essentially the same thing.

So, looking at the specific vehicle identification technologies: the first one I'm going to go through is Bluetooth, though it wasn't the first to be developed. Bluetooth detection was actually developed to test the data coming from GPS floating vehicle data; it was developed by the University of Maryland for the I-95 Corridor Coalition. The idea behind it was to detect anything with a Bluetooth device; at the bottom of the slide it says Bluetooth signals come from cell phones, PDAs, laptops, GPS units, and car radios. The fact of the matter is, that shows how old the technology is, because nobody uses PDAs any more, and I've got to be honest, I've never known a Bluetooth signal to come from a car radio. But signals come from a variety of devices, and cars themselves usually have Bluetooth now. As a vehicle goes past one receiver, the receiver records the time; as it goes past the second, which in this graphic is a couple of miles away, the time is recorded again, and from the distance travelled and the time you can identify the speed. It's very, very simple technology. Now, the big issues with that are: when you have a bus going down the road with 50 kids in it, or 50 adults for that matter, all of whom have one or two cell phones, you get a mismatch with what's really happening; and if you have somebody walking beside the freeway, or beside a frontage road close to the freeway, you'll pick up their signal as well, and again you have all these outliers which have to be removed.

So, looking at floating vehicle data now: the idea behind floating vehicle data is that when you have tracking devices on vehicles, every minute, or every three minutes, or however long that interval is, the device sends back a signal saying "I am here."

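The two-receiver Bluetooth travel-time idea described above can be sketched in a few lines. This is a minimal illustration, not any real deployment's code: the device IDs, the two-mile spacing, and the speed thresholds used to drop outliers (the walker, the bus full of phones) are all invented for the example.

```python
# Sketch of two-receiver Bluetooth travel-time measurement.
# The 2-mile spacing and speed bounds are illustrative assumptions.

SEGMENT_MILES = 2.0          # distance between receiver A and receiver B
MIN_MPH, MAX_MPH = 10, 100   # plausible vehicle speeds; outside = outlier

def travel_times(detections_a, detections_b):
    """Match (anonymised) device IDs seen at both sites.

    Each argument maps a device ID to its detection time in hours;
    returns a dict of device ID -> travel time in hours.
    """
    times = {}
    for dev, t_a in detections_a.items():
        t_b = detections_b.get(dev)
        if t_b is not None and t_b > t_a:
            times[dev] = t_b - t_a
    return times

def segment_speed_mph(detections_a, detections_b):
    """Median segment speed after dropping implausible outliers,
    such as pedestrians or stopped vehicles."""
    speeds = sorted(
        SEGMENT_MILES / dt
        for dt in travel_times(detections_a, detections_b).values()
        if MIN_MPH <= SEGMENT_MILES / dt <= MAX_MPH
    )
    if not speeds:
        return None
    return speeds[len(speeds) // 2]   # median is robust to remaining noise

a = {"dev1": 8.00, "dev2": 8.01, "walker": 8.00}
b = {"dev1": 8.04, "dev2": 8.05, "walker": 8.90}  # walker takes ~54 minutes
print(segment_speed_mph(a, b))  # ~50 mph: the walker is filtered out
```

The median rather than the mean is used deliberately: a handful of unmatched or mismatched phones then has little effect on the reported speed.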
By identifying where you are at one interval and where you are at the next, you can effectively create a daisy chain of what a vehicle is doing along a road network, and you can quickly plot it onto a map. As you see in this example, when the vehicle is travelling quickly the points are spaced quite far apart, and when it's driving slowly they get closer together. It's relatively simple technology, and it's relatively effective in all fairness; it's just reasonably expensive.

The next one is cellular floating vehicle data. What the graphic is trying to show is that the hexagonal elements are effectively cell towers, and a cell might span something like a thousand yards. Every time a phone goes from one cell tower to another, you have a crossover point, and that crossover point identifies that this cell phone is moving from here to here; by pulling those all together you can identify the passage of the vehicle. The issue you have, unfortunately, is when there is a frontage road beside the freeway, and I've just used the graphic from the previous slide to demonstrate this: you can see that it goes through most of exactly the same cell points as the other road would, and therefore you can't really differentiate which road you're on. So effectively it is aggregating that data together, and it's not particularly efficient because of that. There are two other examples I want to give in terms of its limitations. In an urban environment, a city block may span a thousand meters, as I said, so a cell could actually cover three or four or five roads, which clearly doesn't give you the kind of information you really want. Another example would be a mountain pass: the road could meander through the same cell areas multiple times and give you a false reading. But to be clear, all of these technologies are very effective; they have limitations, but then, to be blunt, every current ITS sensor has a limitation of some description or other.

The last one I want to cover, before we start looking at the camera specifically and why the camera is different to all of these, is the toll tag reader. Toll tags are used throughout the world; this one, which is actually from the E-ZPass website, shows a great graphic of what happens. A transponder is fixed to the windscreen; as the vehicle goes under the gantry it identifies the vehicle, and there's a traffic monitoring camera above that, just in case the reader fails to read the transponder. The point is that, whilst these are very effective, they are generally only used on toll roads, and as a result you really can't use them anywhere else. Now, there are places, such as New York City, which have put toll tag readers up in various areas to try and get a better indication of what's happening, but that's unusual, I'll put it that way.

A question here: this question states, for floating vehicle data you haven't mentioned mobile applications or crowdsourced information; do these not play a part? That's a really good question, and both of those do play a part. Certainly the larger organizations, Google, Amazon, are collecting data through mobile applications, and doing it very, very well, and most of the Google Maps views you see will use that data to attempt to show what's really happening on the road network. The biggest issue is you can't just buy that data and then use it; you have to use the whole Google map if you want it, and that really is a limitation of what they're providing. When we're talking about ITS, we really want to be in control of our own destiny, and therefore in control of the data we receive, and able to manipulate that data as we need to; that's what you can't do, which is why that wasn't included. The crowdsourced one is an interesting one; probably the most prevalent and most well-known crowdsourced application is Waze, which was bought by Google, I think last year, could have been this year, I can't honestly remember. Basically it allows a set of users to identify what's happening on the road network by pressing a button on their mobile phone: OK, there's an obstruction in the road here, there's a slowdown here, there's congestion here, or whatever; and by pulling all that data together you can effectively start building up a fairly precise overview of what's happening on the road network. I can certainly see that crowdsourced technologies will play a really big part as we move forward. Very good question; keep those questions coming, please.

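The frontage-road problem with cellular floating vehicle data can be made concrete with a small sketch. The cell IDs, the mapping of cells to roads, and the handover times below are all invented for illustration; the point is simply that a handover chain only narrows the trajectory down to the set of roads that pass through the same cells.

```python
# Sketch of cellular FVD: a phone's cell-handover sequence gives only
# a coarse trajectory. Cell IDs and road mappings are invented.

handovers = [("cellA", 0), ("cellB", 90), ("cellC", 200)]  # (cell id, seconds)

# Both the freeway and its frontage road pass through the same cells,
# which is exactly the ambiguity discussed above.
roads_through = {
    "cellA": {"freeway", "frontage"},
    "cellB": {"freeway", "frontage"},
    "cellC": {"freeway", "frontage"},
}

def candidate_roads(chain):
    """Return the roads consistent with every cell in the handover chain."""
    candidates = None
    for cell, _t in chain:
        roads = roads_through[cell]
        candidates = roads if candidates is None else candidates & roads
    return candidates or set()

print(candidate_roads(handovers))  # both roads remain: we can't tell them apart

# Only a cell that one road passes through alone disambiguates the trip:
roads_through["cellD"] = {"freeway"}
print(candidate_roads(handovers + [("cellD", 300)]))  # {'freeway'}
```

This is why the talk describes the data as "aggregated": without a distinguishing cell, parallel roads are indistinguishable and their traffic gets lumped together.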
So, looking at the traditional ITS camera applications: what I've tried to do here is map out the process from where we started to where we're going. The first ITS camera applications were mainly around enforcement: speed enforcement and red light enforcement; and as the packages became smaller we started to see them appearing in police cars, primarily to look for vehicles of interest, shall we say.

Once that was working reasonably well, we then went to the revenue collection element. The top picture here is the traditional toll booth, where you drive up, you pay your money, you move on; ANPR and camera applications were used just to identify any vehicles that failed to stop and pay, and with relatively slow-moving vehicles in a controlled environment that wasn't particularly hard. Having said that, that was probably, at the time, the limit of those cameras' capability. As processing speeds increased and ANPR software applications improved, we were able to put them into open road tolling environments, and that's the second picture here, which shows a tolling environment where you don't have to stop. On this particular one you do have to slow down, I think the average speeds are between 40 and 60 miles an hour, but you slow through, a deduction is taken from your electronic wallet, and you proceed as if there was nothing in your way; again that's all governed by transponders and toll tag readers, with camera applications backing those up in the event of a failure to read. The bottom one, which is probably the most exciting, is the hot lane kind of application that many toll environments are now starting to deliver; this one at the bottom is, I believe, part of I-95 around the Washington area. What hot lanes effectively try to do is charge you for the distance you travel, the time of day you travel, and potentially even the congestion on that piece of road at any one time. So, as an example, you could increase the price based on how much congestion there is, to dissuade people from using the lane if it's already at capacity, and I think that's something you're going to see a lot more of as we move forward.

The next one is infrastructure management, and to be clear, infrastructure management and revenue collection systems came in pretty much in parallel with each other. The first picture is a tunnel environment, the second is a bridge environment, and the third is signalized intersections, and each of those uses cameras to huge effect. In the tunnel environment you're using cameras for speed, you're using them to identify over-height vehicles, and, one of the major elements of camera technology here, you're also using them to identify hot spots on vehicles. That's really a result of some of the issues that occurred in Switzerland, with quite a few people unfortunately losing their lives because of a fire in a tunnel, and likewise the Channel Tunnel has also had a fire; so all vehicles are now monitored before they enter the tunnel, and if there are any hot spots they can pull that vehicle off and just check whether it's OK or not. Signalized intersections, at the bottom, use cameras extensively now to identify the turning movements of vehicles; as well as the loops themselves, you can use both.

The next area is congestion management, or I'm going to call it congestion mitigation, because I think this is where ITS really starts to play a big part. The top picture is something you see quite prevalently in Britain: speed-over-distance cameras. You set up cameras to identify not only spot speeds but also speed over a particular distance, so there may be another one of these gantries a mile down the road, and another a mile on, and another after that; what it does is keep people honest, for want of a better way of phrasing it. It effectively prevents people from going through one camera slowly, speeding up, and then going through the next one slowly: if over the course of the distance you've exceeded the speed limit, you will get fined. Now, it's not about fining, and I want to be clear about that: it is about managing the speed of vehicles, and the flow of vehicles, on the road.

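The speed-over-distance idea above reduces to a single division, which is worth seeing explicitly. The one-mile gantry spacing, the 60 mph limit, and the enforcement tolerance below are illustrative assumptions, not values from any real scheme.

```python
# Average-speed ("speed over distance") check: two gantries a known
# distance apart timestamp each plate; the average over the whole
# stretch is what gets enforced. All constants are illustrative.

GANTRY_SPACING_MILES = 1.0
LIMIT_MPH = 60.0

def average_speed_mph(elapsed_seconds):
    """Average speed over the gantry spacing for a given transit time."""
    return GANTRY_SPACING_MILES / (elapsed_seconds / 3600.0)

def violates_limit(elapsed_seconds, tolerance_mph=2.0):
    # Slowing down while passing each gantry doesn't help: only the
    # total time over the distance matters.
    return average_speed_mph(elapsed_seconds) > LIMIT_MPH + tolerance_mph

print(average_speed_mph(45))   # 1 mile in 45 s is an 80 mph average
print(violates_limit(45))      # flagged
print(violates_limit(60))      # 60 mph average: within the limit
```

This is exactly why the technique "keeps people honest": spot-speed cameras can be gamed by braking at each site, but the elapsed time between sites cannot.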
The bottom picture is a variable speed control, also in Britain: the gantry has identified that the average speed you should be going is 40. Francis and I were discussing this earlier, and we've all cheated: you tend to go through these at 40, then speed up to 60 if you can, then go through the next one at 40. The idea of the speed-over-distance system is to prevent exactly that from happening, so I can really see a need for variable speed controls and speed-over-distance enforcement operating in harmony with each other. I think that will happen a lot; I already know it's happening, and it will happen in a big way as we move forward. The middle picture, I think, is also really quite interesting: it shows a lane control system. The idea is that in the morning you get a flow of vehicles in one direction, and in the evening a flow of vehicles in the other direction, so you have quite a lot of spare capacity in the lanes going the "wrong" direction; the idea is to use those lanes to move vehicles in the busy direction. It's really as simple as that.

The next one is safety management, and I think this is really where ITS is going to show some real improvements in what's happening on our road networks. The first one is, for want of a better way of phrasing it, debris in road. Now, normally debris in road would relate to a tire, a dead animal, a fridge fallen off the back of a truck, a box; not a landslide, obviously, a landslide is a big debris-in-road element, but it amounts to the same thing. The fact of the matter is, it doesn't matter what it is: if you have a lane obstruction, you can have cameras that automatically identify, not necessarily what the obstruction is, but that there is an obstruction, and flag it to an operator; the operator will then start putting mitigation plans in place, and in this case they actually closed the road. The second one is obviously over-height vehicles, and it's staggering how many over-height vehicles hit bridges and overpasses every year; they still hit them. The fact of the matter is, it's a problem that can be fixed pretty easily through camera technology; it just needs to happen, and it will happen as we move forward. I was quite lucky to have worked on one system looking at debris in road, with, I think, 160-odd cameras implemented over a several-mile stretch, just to identify what was really happening on that road network. The last one is equally important, and probably more important as we move forward: automatic incident detection. The idea is that if you have a body of cameras, they'll be able to identify when slowdowns occur, and even when an incident has actually occurred; and the same camera that's used for debris in road will be used for incident detection. It's pretty straightforward stuff.

Then we look at the future. In future systems we have drone technology, and I can certainly see that drone technology will be used for ITS applications. The second one is street mapping and street views, and the third is autonomous vehicles. With one exception, these are not likely in the short term, but I'm pretty sure they will happen.

Oh, a question here: why haven't you included applications such as congestion charging, freight charging, or weigh-in-motion? Again, thank you very much for the questions. The reason I didn't include congestion charging is because it's really quite similar to a tolling environment; congestion charging schemes really are tolling environments, for want of a better way of phrasing it. Certainly the London congestion charging system is a tolling system between the hours of, I think, 7 and 7, though I could be wrong about that. Likewise Singapore's is pretty much the same, albeit a bit different: it takes more the model of hot lane charging, so you can be charged by the time of day, at a different rate, and I think you can even be charged based on the vehicle classification as well, but again I could be wrong about that. That's why I haven't included those. Freight charging, again, is a tolling system, just identifying the vehicle type as opposed to the specific vehicle; those are being trialled around the world at the moment, and the technology used is essentially the same as over-height vehicle detection. Weigh-in-motion is the other one, and the reason that wasn't included is because, while cameras play a part, weigh-in-motion requires scales as the primary data source and cameras only as a secondary one. But they're all very good points, and cameras play a part in each and every one of those.

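The hot-lane charging model described above — price by distance, time of day, and current congestion — can be sketched as a tiny pricing function. Every rate, peak window, and occupancy threshold here is invented purely to illustrate the shape of the calculation.

```python
# Hypothetical sketch of hot-lane pricing: the charge depends on distance
# travelled, time of day, and current congestion. All numbers invented.

def toll_usd(miles, hour, occupancy):
    """occupancy: fraction of lane capacity in use (0.0 - 1.0)."""
    base_per_mile = 0.25
    if 7 <= hour < 10 or 16 <= hour < 19:   # peak periods cost more
        base_per_mile *= 2
    if occupancy > 0.8:                      # near capacity: dissuade entry
        base_per_mile *= 3
    return round(miles * base_per_mile, 2)

print(toll_usd(10, 12, 0.3))  # off-peak, quiet lane: 2.5
print(toll_usd(10, 8, 0.9))   # peak and congested: 15.0
```

The congestion multiplier is the interesting design choice: raising the price as occupancy approaches capacity is what keeps the managed lane free-flowing.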
So, a quick look at the future of cameras, and potentially ANPR applications. Certainly speed, occupancy, volume, and vehicle classification can be done through a camera; so can wrong-way running detection, debris in road, and a whole host of other applications as well. The reason I said at the beginning that it wasn't just about ANPR is because, in researching this a bit more, I've come across two companies who, because of ANPR issues in some geographies, have actually moved away from ANPR and are identifying specific vehicles using camera technologies without identifying the number plate. One of them does it using hash codes instead: it converts the number plate with 128-bit encryption and delivers a string, which is huge, but which can be matched without revealing the specific vehicle; it's quite clever. The second one identifies the vehicle type and elements of that vehicle in order to recognize the vehicle specifically. It's really quite interesting, a different approach, but one which I think will prevent some of the privacy-type questions that have been raised. Forgive me, it's been a long day.

For the new systems, I really want to start with the tolling environment. I believe there's an opportunity here to change tolling environments and have transponderless tolling. When I first put an idea together for this, I was told by one organization that I was mad; and then shortly thereafter, I think within 24 hours, I actually found an organization already doing this on a big scale, covering an entire country. The background is that a lot of tolling systems have different transponder requirements and transponder networks; I think there are 15 different tolling environments in North America, and when I went to Southeast Asia there were actually a lot more than that in a much more compressed area. The problem is, if a trucking company wanted to travel all across North America, it would need 15 different transponders, each with its own electronic wallet and so on, and that just isn't efficient. When you consider that one of the environments I've been investigating actually has a 99.2 percent read rate of all number plates over a multi-year period, across all weather conditions including rain, snow, fog, hail, et cetera, that is equally as good as, if not better than, its transponder reads. So it can be done, and I certainly see that, as we move forward, it probably will be done.

Some of the other areas are obviously security. In the bottom right-hand corner you can see what we call in North America an Amber Alert; this is a child abduction, or a child gone missing. You also have Silver Alerts, when an older person has gone missing or wandered off; and the idea is that, using cameras with ANPR or other identification properties, you can identify where the vehicle is and effectively track it to a point where it can be stopped, or at least monitored. Then there are the safety issues. The bottom left image is what I really want to talk about: what you see here is a freeway environment from a dash cam, and what's actually happening is a vehicle travelling along the freeway in the wrong direction. With the kind of technology we're talking about here, and with some automation, what could happen is that all the way down this freeway you could have signals telling vehicles to stop, and the dot-matrix signs telling vehicles to pull off and get out of the way.

Then we come to the control room operations. This is a typical picture of a traffic management center: basically you have an operator manning anything up to six screens, and somewhere in that traffic management center you have a video wall of however many images you can display. The point is that, in reality, the operator doesn't really look up at the wall; they're very busy doing whatever they need to do in front of them, and even if they do look up, the fact of the matter is they're seeing perhaps 16 different images when they may have 300 different cameras out on the network, and they can't possibly monitor them all.

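Before moving on from the control room: the hash-code approach mentioned a moment ago can be sketched in a few lines. The talk only says the plate is converted to a huge 128-bit string; the construction below, a keyed hash (HMAC) truncated to 128 bits, is my assumption of how such a scheme might plausibly work, and the key is invented. A keyed hash is used because the space of valid plates is small enough that a plain hash could be brute-forced back to the plate.

```python
# Sketch of privacy-preserving plate matching: each camera converts the
# recognised plate into a fixed token, so the same vehicle can be
# re-identified across sites without the plate itself being stored.
# The HMAC construction and key are illustrative assumptions.

import hashlib
import hmac

SYSTEM_KEY = b"rotate-me-regularly"   # shared secret; rotating it limits tracking

def plate_token(plate: str) -> str:
    normalised = plate.upper().encode()          # "ab12 cde" == "AB12 CDE"
    digest = hmac.new(SYSTEM_KEY, normalised, hashlib.sha256)
    return digest.hexdigest()[:32]               # 32 hex chars = 128 bits

# Two cameras seeing the same vehicle produce the same token...
assert plate_token("ab12 cde") == plate_token("AB12 CDE")
# ...but the token alone does not reveal the plate.
print(plate_token("AB12 CDE"))
```

Matching tokens is enough for travel-time and origin-destination work, which is why this approach sidesteps many of the privacy objections raised against full ANPR.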
So the idea is to have some operator flagging, some automated analysis, and from that to be able to bring what's going on to the operator as they need it.

We have a question: could you please describe a little bit more about operator flagging? Yeah, sure; I should have gone into a little more detail, so thank you for reminding me. The idea is that if you do some pre-processing on the data you're getting from any sensor, whether it's a camera or anything else, it can automatically flag an issue which is occurring to the operator. So for instance, for wrong-way running, you could have a red screen pop up identifying that there is a problem on a particular road, and what that problem is. I will just say a little more about the automation as well, because the idea is that not only will you flag to the operator, but your system will be intelligent enough to bring up the right images at the right time, and even potentially bring up a decision support system to identify what the courses of action should be.

Some of the limitations of the current systems: a lot of these advanced traffic management systems, the ATMS systems, are fairly proprietary, and they're not used to having this collection of data coming at them; they're used to collecting data in the traditional ways, which doesn't necessarily have the quality or quantity of data you'd otherwise see from what we're proposing here. As I said, there are limits and challenges to face: despite the fact that ITS stands for intelligent transportation solutions, there is limited intelligence in there, and some of that could be down to the inability to trust the data, or the inability to trust decision support systems. But I have to say that is changing with the introduction of integrated corridor management; you see a lot more intelligence being injected into ITS, and that's really good to see, because it will be baby steps, but it will eventually get there. Once you have that, you can start to data-mine, and you can start to throw up some management telematics; but it has to be said that some of these issues are currently preventing us from managing the environment as we need to.

OK, we have a question on this slide: could you please describe what you mean by management telematics? Oh yes, that's a fair question; I shouldn't have used that word, in all fairness. I'm really talking about dashboards. The idea is to provide a level of intelligence at a glance, so through a dashboard you can identify what's going on in the network; it's really one screen giving a quick overview, which can then reveal the answers you want to drill into. Thank you for asking that. The second question received here is: what do you mean by time-stop data analysis? Surely everyone is monitoring in real time, but how does this differ? That's a good question; perhaps I should say a little more about that, as I probably skipped over it, so forgive me. Back in the late 2000s I was responsible for looking at real-time data; at that time everything was purely historic, and there was a huge move to start looking at real-time data and getting systems to accept it. But real-time data is effectively historic: there is latency involved, so by the time you get it, it's always a little bit old, maybe only a few seconds, but it's late. The idea behind time-stop analysis is that you collect that data, you continuously store it, and you continually reflect it against the data you've already got. So for instance, if you're looking at, shall we say, the congestion build-up during the morning rush hour on a particular stretch of road, you may start looking at it at, say, 6:00, 6:05, 6:10, 6:15; it could be in 15-minute increments, or it could even be in four-second increments or less. The idea is you build up this profile, and then from the historic data and the real-time data you can start to identify what the prediction is going to be, and that's really what I mean.

So, in order to get there, how can you actually build this sustainable system? We really need to move forward on this now. We need a sensor able to identify every passing vehicle, or at least the majority of passing vehicles, because by doing that we'll get to the microscopic model we discussed in the last webinar; and by having this microscopic model we can identify the beginning points and the end points of every vehicle's journey, or at least most vehicles, on the road.

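The time-stop profiling just described, storing counts in fixed increments and reflecting the live value against the accumulated profile, can be sketched very simply. The 15-minute bins, the counts, and the 20 percent alert threshold are all invented for illustration.

```python
# Sketch of "time-stop" profiling: store counts in 15-minute bins, keep a
# running historical profile per bin, and compare the live value against
# it to anticipate the congestion build-up. All numbers invented.

from collections import defaultdict

history = defaultdict(list)   # bin index -> past vehicle counts

def bin_index(hour, minute):
    return hour * 4 + minute // 15   # 15-minute increments

def record(hour, minute, count):
    history[bin_index(hour, minute)].append(count)

def expected(hour, minute):
    counts = history[bin_index(hour, minute)]
    return sum(counts) / len(counts) if counts else None

# A week of 06:00-06:15 counts on this stretch of road:
for c in [110, 120, 115, 125, 130]:
    record(6, 0, c)

# This morning's live count is well above the profile for that bin, so
# the rush-hour build-up is likely starting earlier than usual.
live = 160
print(expected(6, 0))               # 120.0
print(live > 1.2 * expected(6, 0))  # True -> flag it
```

Shrinking the bin width (the talk mentions increments down to a few seconds) trades storage for a sharper picture of how each day's curve deviates from the profile.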
every of every vehicle or most vehicles on the road now we don’t need every vehicle but we do need a majority and once we do that we’ll collect data from morals need to create data from more locations to get there that will give us an understanding what has happened doric what is happening the real time and what will happen predicted by using this and as I said before by by using this in that in a profile curve if you like to identify how it’s actually changing how that event is changing once we get to that point we’ll only be able to analyze and to better manage that data and if we get there and I think we will I really you know I think we have to with the congestion issues what it currently facing I think yeah we’ll get from up this passive passive monitoring state to a real active management but I want to be clear it’s all about the data and more data we have the more the more quality of data we have and the bigger the more we understand that data by I don’t answer identifying the cause and effect on the better we’ll be able to really take this forward so some of the major issues issues facing camera technology there’s obviously privacy and let’s talk a little bit about privacy because this is a this is a really big hot topic you know as an example Germany doesn’t allow any NPR application because of privacy laws in Germany that’s a it they do have speed cameras using ampl and in fact I actually applied to German government to I hope to get a test case set up so that we could test anpr applications as a research project nothing more than research project where the data would be destroyed at the end of it after much tuna throwing it became apparent that actually the German law wasn’t prohibitive it wasn’t defined and because it wasn’t a find a reluctant to say yes but you know here which matches the same thing but for the same token you’ve got other countries such as Singapore I believe I’m like saying this but has has identified that dash cams are must be used 
in vehicles it's no longer an option. You've got places like Spain that have just allowed it to be used, and Poland, I think, has also passed legislation; I don't believe it's mandatory there, but it's certainly recommended to have them. So you have this situation where some countries prohibit it and some countries say you actually have to have it. But let's look at this logically for a second: anybody with a mobile phone is being tracked. There are significantly more mobile phones in this world than there are cars. As far as I know there aren't any 13-year-olds out there driving cars, but I don't know a 13-year-old that doesn't have a phone. The fact of the matter is, if you have a phone, you're being tracked. If you go down two miles of freeway and pass one camera, frankly, it's a surprise; if you go down two miles of freeway with a mobile phone, you're being tracked through at least three cells. So this privacy issue has to be put into perspective. Yes, there is an issue there, and yes, it needs managing, but it shouldn't be the be-all and end-all of ITS.

Communication networks are also very important. As we get higher resolutions, faster frame rates, and better-quality cameras, in all fairness the bandwidth requirements for cameras are going to go up, so there has to be some kind of mitigation strategy to bring that bandwidth back down to a level which is acceptable. That's going to be one of the areas we have to face.

Camera management comes with it too. With more cameras in the world, we're going to have a situation where you can't possibly monitor 600,000 cameras manually, so there has to be some automatic processing, not only undertaken but then relied upon by the operators, to actually flag up incidents. One of the things I worked on years ago was identifying which cameras were unavailable, not in use, failed, or required maintenance. We did that just by looking for static on those cameras, as an example; it can be done very easily through software.

Obviously the cost of ANPR software will reduce significantly. Of the applications out there at the moment, I've been quoted as much as two and a half thousand per license, but that will come down; there's quite a lot of competition now, and prices are going to drop considerably in the next few years. And of course we have the new technological applications: as new technology comes in, that's going to have a significant impact on this field as we move forward as well. And at that point, a question did pop up
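Before we take it, a quick aside for the technically minded: the static-based camera health check I mentioned can be sketched in a few lines. To be clear, this is my own minimal illustration rather than the actual software we used; the names (`is_camera_stale`, `mean_abs_diff`) are hypothetical, and it covers only one failure mode, a frozen or dead feed, using plain pixel differences.

```python
def mean_abs_diff(frame_a, frame_b):
    """Average absolute per-pixel difference between two frames
    (each frame is a flat list of grayscale pixel values)."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b)) / len(frame_a)

def is_camera_stale(frames, diff_threshold=1.0):
    """Flag a feed whose consecutive frames barely change.

    A healthy view of live traffic shows some pixel movement over a
    few seconds; a frozen, failed, or disconnected feed does not.
    """
    return all(
        mean_abs_diff(a, b) < diff_threshold
        for a, b in zip(frames, frames[1:])
    )

# A frozen feed repeats the same frame; a live one keeps changing.
frozen = [[128] * 16 for _ in range(5)]
live = [[(i * 37 + p) % 256 for p in range(16)] for i in range(5)]
print(is_camera_stale(frozen), is_camera_stale(live))  # → True False
```

In practice you would run this over real frame captures and probably also flag all-noise feeds, but the principle is the same: compare frames in software and flag the outliers for maintenance.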

on the last one: it's really not clear what you mean by "new technological applications"; please could you provide your thoughts? OK, I've got it. I'll go through what this means on the next slide; I think it becomes more apparent as we move forward.

Here's a peek into what my perception of the future is going to be, and it is my perception. An ITS camera will require a lighting array, will obviously require the camera itself, but will also require some kind of processing capability on board: software on a processing board. What I'm thinking about for that is things like speed, occupancy and volume, vehicle classification, and debris on the road. Those could all be done at the front end, and these are all areas where you don't specifically need ANPR. In most free-flow applications you're going to have to have a communications network on there as well, and the reason I say that is because, as we move forward, if we can actually do some of the processing on camera, it means we don't have to send back every image; we only have to send back the data we need. That's very important, and it will bring the bandwidth down significantly. Now, if we do that, what we'll effectively have is the iPhone of cameras, or I should say the smartphone of cameras (let's keep off the branding here, so forgive me for that). The idea behind the smartphone is that you have all these applications and you run the ones you want, when you need to run them, and in my vision the same will be true for cameras.

A brief question actually popped up: do you need an IR lighting array if you have adequate high dynamic range? Good question. The jury is still out on this; my position is that I think you do. As I say, if you have a high enough dynamic range you probably don't need it, but I would err on the side of caution and put it in anyway, and rely upon it. The problem with high dynamic range is that it's generally the better sensors that have it, and that increases the cost, which is probably not a good thing for the majority of applications. So if it were me, I'd keep the high dynamic range within a sensible central sector (I'm moving my hand around massively, but it's not really helping the webinar), have it in a sensible range, use a relatively low-price sensor, and put in a lighting array; at that point you can guarantee to see everything. The problem with a lighting array, and with high dynamic range too for that matter, is that in nighttime conditions the lighting array will only reach so far, and the same is true of high dynamic range. But again, if we are sequencing, that will also help with some of that. Very good question, and please keep those questions coming.

So what does this mean for camera applications? Obviously we need to increase resolutions, increase frame rates, and reduce bandwidth, and we'll do that, as I said, by doing some of the pre-processing at the front end, plus buffering and sequencing. By buffering, what I mean is that when there isn't the bandwidth available, the idea is to buffer the data and send it when bandwidth becomes available. And sequencing is effectively exactly what HDR does: three-shot HDR in effect takes three frames and then uses the best one. What I suggest with sequencing is that it will also alter the iris control, the white balance, or the frame speed, just to try to get the best possible image, so that all data is captured as close to a hundred percent as possible. And then, to answer the question that came up before: with HDR/WDR, the camera can see in almost every condition. To be clear here, current camera technology can see about ten times better than the human eye, and as such cameras can see in rain, fog, hail, and snow a lot better than we can; that's been proven. As a second point, there is one application I investigated using cameras with back-end ANPR enabled, and over a three-year period we've been able to glean 99.2 percent of all reads.

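As a quick illustration of that buffer-and-send idea, and again this is my own sketch with hypothetical names (`LinkBuffer`, `push`, `drain`), not any particular vendor's implementation: readings queue locally while the uplink is congested, then drain oldest-first once capacity returns.

```python
from collections import deque

class LinkBuffer:
    """Hold camera-derived records while the uplink is saturated,
    then forward them oldest-first when bandwidth frees up."""

    def __init__(self, capacity=1000):
        # Bounded queue: if the outage outlasts capacity, the
        # oldest records are dropped rather than newer ones.
        self.queue = deque(maxlen=capacity)
        self.sent = []  # stands in for the real uplink transmit call

    def push(self, record, link_free):
        # link_free would come from real bandwidth monitoring.
        if link_free:
            self.drain()            # clear any backlog first, in order
            self.sent.append(record)
        else:
            self.queue.append(record)

    def drain(self):
        while self.queue:
            self.sent.append(self.queue.popleft())

buf = LinkBuffer()
buf.push({"plate": "ABC123", "t": 1}, link_free=False)  # congested: held
buf.push({"plate": "XYZ789", "t": 2}, link_free=True)   # free: backlog, then new
print([r["t"] for r in buf.sent])  # → [1, 2]
```

The point is simply that nothing is lost during short congestion windows; only the delivery is delayed, which is acceptable for the data uses we're discussing here.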
That 99.2 percent is pretty good; I think we'll go with that one. It's a multi-use application, and it gives us the ability to collect all the data needed for that microscopic model we're talking about, which I think is absolutely essential to move on to the next element, and it gives us the opportunity again to do the triggering, the automation, and thereafter the prediction.

I think we have a quick question here: could you please define what you mean by operator triggering? Yeah, I think we touched on this a little before, on the slide about the traffic management center. Probably I should use the same word they use there, flagging; triggering and flagging are effectively the same. The idea is that you have some automation that automatically brings up, on the operator's screens, a message saying: OK, we've got an issue here, or a potential issue here; you need to look into this. And as we move further down this not-so-well-trodden path, you'll then get the automated decision support system saying: OK, this is the issue and this is the action we recommend you take. There are several issues with this approach; I've already gone through some of them and I'm not going to go through them all again, but the decision support systems are going to be critical, and you can already see some of them starting to make an impact with the integrated corridor management type systems, which rely on decision support, scenario development, and event triggering to assist the operator in managing the quality and quantity of data they're getting.

We'll certainly have to have changes to the ATMS to accommodate the data that could be made available from this technology, and frankly from any other new technologies that are going to come to light. At this moment, ATMS can't accept predictive data; they do accept real-time data now. So we've gone from only accepting historic data to accepting real-time data, and we really have to start looking at how we accept predictive data as well, because that's where we'll get a better understanding of what's really happening on our network of roads. We're in a situation where the countless camera locations we currently have may not be right; they're effectively there to identify congestion hotspots or accident hotspots, and what we really need to do is identify the build-up, where it starts from and when. After that, those camera locations may still be beneficial, but there may have to be other ones which actually bring that together.

One of the last things: we have to learn to trust the data we get from this. I've got two examples, and I'm going to concentrate on one. When computers first came to light and were first used in anger, one of the first things they were used for was predicting the result of an election based on exit-poll information. What it did at first was identify a landslide victory for candidate A or B (I can't remember which), such a landslide that they believed the data was wrong, and it never got aired, only for them to find out, when the counting had finished, that the system was actually uncannily accurate. If we go down this path, we have to start to trust that data, and the operators in traffic management centers have to trust that data as well.

Privacy we've already addressed, but the last item here is change management. We have to get past this whole privacy perception, and we have to get past the idea that ITS is about revenue collection and fining people who are driving too fast or whatever. It has to go from enforcement to management, which is really important, and it has to go from enforcement to safety, which I think is equally important. This is more about managing our networks than fining people for driving too fast; if we manage our networks efficiently and effectively, those people will actually be slowed down automatically anyway. And I think it's really important to look at this holistically, as opposed to how we have been doing it up to now.

So, to answer the all-important question: will cameras become a standard ITS sensor? I think they could, and I think they should. I don't think there's anything out there at the moment which will actually give us the level or quality of data we're looking at, and I think it's absolutely imperative to start getting that data so we can really start to manage the network in real time. Will it require back-end system changes? Without doubt. Will it all be done in a day? No. Will it be done in one go? No; it will take months and potentially even years to get there. But I passionately believe that we will have

cameras being used as data sensors on the road in the very short-term future, and I'm talking probably anything from three to six months before we actually start seeing some real differences here. If we do, we will truly have gone from passive monitoring to active management, and if we get there, then you'll start seeing some real improvements in integrated management systems, and that will then evolve into smart transportation, smart cities, and active traffic management. So, in a nutshell, I think it really will happen, and I think it really needs to happen. Well, that's the end of my slides, so if there are any other questions, I'll take those now.

We do have one that popped up: are there existing standards within ITS and ANPR for cameras, software, and networks, and are they mature or maturing? That's a really good question, and I'm not sure I have the answer; I may have to come back with some research on this, I'll be honest. Standards in ITS are pretty limited. There are a couple, such as the TMDD (the Traffic Management Data Dictionary) and NTCIP, neither of which applies to ANPR, or at least not fully, at this moment in time. I don't know of any standards which apply specifically to ANPR, and in all fairness it's evolving so rapidly that I don't believe standards would apply very well to ANPR applications. Having said that, what ANPR applications give you is essentially a number plate, that's all. Providing you can identify the vehicle number plate accurately, it gives you a plate and it gives you a confidence factor; generally, if the confidence factor is too low, you effectively ignore that plate read, take a picture, and have it verified manually. But the fact of the matter is, as I said, there are other applications coming to market now which identify a hash code with 128-bit encryption, and there are also others identifying the vehicle through particular marks on the vehicle, and so on. So I don't believe there really are standards at the moment. Would they help? I think at the moment they probably wouldn't; I think at the moment they would probably limit some of the technological innovation that's actually happening out there. So it probably wouldn't be the first place I'd look for standardization, unless we really look to the future and try to standardize what the future should look like; at that point, I think you'd see a huge influx of new players. Does that answer the question? OK, I've got confirmation, good.

So, going back to the earlier questions about traffic management systems, and in regards to change: do you think they will change to accommodate this new data source, and if so, why? I think, in all fairness, they're not going to have a choice. Some of those changes are already being made to accommodate integrated corridor management, and there are at least two organizations I know of which are developing new systems to accommodate that: the decision support, the automation, and the event triggering we're talking about. So I think any organization that doesn't move forward on these kinds of innovations will be left behind, and I also think that with these innovations you're going to get other companies, who haven't been in this field, developing systems far superior to what's currently out there. One of the problems with the systems we currently have is that a lot of them are pretty proprietary and pretty old; they've effectively been tailored to meet the requirements and reports of their time, and as a result their back-end data elements and databases perhaps aren't as organized as they need to be within the big-data concepts we're discussing. So frankly, I think yes, they're going to have to change, and whilst it won't happen overnight, I think you'll start seeing changes to those systems happening pretty quickly.

Thank you. And we do have another question here: you mentioned, in regards to the privacy issues, that different regions kind of have different

rules, and obviously it looks like they will require different approaches; any idea of how they can go about accomplishing this? Yeah, I think it's as hard as it sounds, and yes, there will be different approaches in different regions, but there really are different approaches in different regions already. For instance, the Middle East requires different technology to other areas, and Germany, with the privacy needs it has identified, requires different technology; in fact, one of the technologies I discussed was developed specifically to accommodate Germany. That said, all it provides back to you is effectively an encrypted license plate, which from a systems point of view isn't very different from a non-encrypted license plate, as opposed to the plate letters themselves. So will we be able to accommodate the variations? Yeah, I think we will; I don't think it's as big an issue as we think it is. There will need to be some changes there, but then again, any system which handles license plates at the moment has to accommodate all the different license plates of the world anyway, so the ability to do that has to be there in the first place.

Anything else, Francis? I think that's it right now; we're just waiting to see if there are any more questions. How are we doing for time at the moment? About four minutes under an hour. Perfect, so give me a quick minute to see if there are any more questions, and while we're doing that, let me wish everybody a happy holiday. If any questions do pop up in the next four minutes, we'll answer those; if they don't, then please feel free to drop myself or my sponsor an email, or give us a call for that matter, and I will happily answer any questions you may have. Other than that, anything else, Francis? Looks like it's quiet right now. OK, well, I just want to thank everybody for attending, and happy holidays. Thank you all, and happy holidays to everyone.