Responsible AI with TensorFlow (TF Dev Summit '20)

[MUSIC PLAYING] CATHERINE XU: Hi, I’m Kat from the TensorFlow team, and I’m here to talk to you about responsible AI with TensorFlow. I’ll focus on fairness for the first half of the presentation, and then my colleague, Miguel, will end with privacy considerations. Today, I’m here to talk about three things. The first is an overview of ML fairness. Why is it important? Why should we care? And how does it affect an ML system exactly? Next, we’ll walk through what I’ll call throughout the presentation a fairness workflow. Surprisingly, this isn’t too different from what you’re already familiar with, for example, a debugging or a model evaluation workflow. We’ll see how fairness considerations can fit into each of the discrete steps. Finally, we’ll introduce tools in the TensorFlow ecosystem, such as Fairness Indicators, that can be used in the fairness workflow. Fairness Indicators is a suite of tools that enables easy evaluation of commonly used fairness metrics for classifiers. Fairness Indicators also integrates well with remediation libraries in order to mitigate any bias found, and it provides a structure to help in your deployment decision with features such as model comparison. We must acknowledge that humans are at the center of technology design, in addition to being impacted by it, and humans have not always made product design decisions that are in line with the needs of everyone. Here’s one example. Quick, Draw!
was developed through the Google AI Experiments program, where people drew little pictures of shoes to train a model to recognize them. Most people drew shoes that look like the one on the top right, so as more people interacted with the game, the model stopped being able to recognize shoes like the shoe on the bottom. This is a social issue first, which is then amplified by fundamental properties of ML: aggregation and using existing patterns to make decisions. Minor repercussions in a faulty shoe classification product, perhaps, but let’s look at another example that can have more serious consequences. Perspective API was released in 2017 to protect voices in online conversations by detecting and scoring toxic speech. After its initial release, users experimenting with the web interface found something interesting. A user tested two clearly non-toxic sentences that were essentially the same, but with the identity term changed from straight to gay. Only the sentence using gay was perceived by the system as likely to be toxic, with a classification score of 0.86. This behavior not only constitutes a representational harm. When used in practice, such as in a content moderation system, it can also lead to the systematic silencing of voices from certain groups. How did this happen?
For most of you using TensorFlow, a typical machine learning workflow will look something like this. Human bias can enter into the system at any point in the ML pipeline, from data collection and handling to model training to deployment. In both of the cases mentioned above, bias primarily resulted from a lack of diverse training data: in the first case, diverse shoe forms, and in the second case, examples of comments containing gay that were not toxic. However, the causes and effects of bias are rarely isolated. It is important to evaluate for bias at each step. You define the problem the machine learning system will solve. You collect your data and prepare it, oftentimes checking, analyzing, and validating it. You build your model and train it on the data you just prepared. And if you’re applying ML to a real-world use case, you’ll deploy it. And finally, you’ll iterate and improve your model, as we’ll see throughout the next few slides. The first question is, how can we do this? The answer, as I mentioned before, isn’t that different from a general model quality workflow. The next few slides will highlight the touch points where fairness considerations are especially important. Let’s dive in. How do you define success in your model?
Consider what your metrics and fairness-specific metrics are actually measuring and how they relate to areas of product risk and failure. Similarly, the data sets you choose to evaluate on should be carefully selected and representative of the target population of your model or product in order for the metrics to be meaningful. Even if your model is performing well at this stage, it’s important to recognize that your work isn’t done. Good overall performance may obscure poor performance on certain groups of data. Going back to an earlier example, accuracy of classification for all shoes was high, but accuracy for women’s shoes was unacceptably low. To address this, we’ll go one level deeper. By slicing your data and evaluating performance for each slice, you will be able to get a better sense of whether your model is performing equitably for a diverse set of user characteristics. Based on your product use case and audience, what groups are most at risk? And how might these groups be represented in your data, in terms of both identity attributes and proxy attributes? Now you’ve evaluated your model. Are there slices that are performing significantly worse than overall or worse than other slices? How do we get intuition as to why these mistakes are happening? As we discussed, there are many possible sources of bias in a model, from the underlying training data to the model and even in the evaluation mechanism itself. Once the possible sources of bias have been identified, data and model remediation methods can be applied to mitigate the bias. Finally, we will make a deployment decision. How does this model compare to the current model? This is a highly iterative process. It’s important to monitor changes as they are pushed to a production setting or to iterate on evaluating and remediating models that aren’t meeting the deployment threshold.
This may seem complicated, but there is a suite of tools in the TensorFlow ecosystem that makes it easier to regularly evaluate and remediate for fairness concerns. Fairness Indicators is a tool available via TFX, TensorBoard, Colab, and standalone model-agnostic evaluation that helps automate various steps of the workflow. This is an image of what the UI looks like, as well as a code snippet detailing how it can be included in the configuration. Fairness Indicators offers a suite of commonly used fairness metrics, such as false positive rate and false negative rate, that come out of the box for developers to use for model evaluation. In order to ensure responsible and informed use, the toolkit comes with six case studies that show how Fairness Indicators can be applied across use cases, problem domains, and stages of the workflow. By offering visuals by slice of data, as well as confidence intervals, Fairness Indicators helps you figure out which slices are underperforming with significance. Most importantly, Fairness Indicators works well with other tools in the TensorFlow ecosystem, leveraging their unique capabilities to create an end-to-end experience. Fairness Indicators data points can easily be loaded into the What-If Tool for a deeper analysis, allowing users to test counterfactual
use cases and examine problematic data points in detail. This data can also be loaded into TensorFlow Data Validation to identify the effects of data distribution on model performance. This Dev Summit, we’re launching new capabilities to expand the Fairness Indicators workflow with remediation, easier deployments, and more. We’ll first focus on what we can do to improve once we’ve identified potential sources of bias in our model. As we’ve alluded to previously, technical approaches to remediation come in two different flavors: data-based and model-based. Data-based remediation involves collecting data, generating data, re-weighting, and rebalancing in order to make sure your data set is more representative of the underlying distribution. However, it isn’t always possible to get or to generate more data, and that’s why we have also investigated model-based approaches. One of these approaches is adversarial training, in which you penalize the extent to which a sensitive attribute can be predicted by the model, thus reducing the extent to which the sensitive attribute affects the outcome of the model. Another methodology is demographic-agnostic remediation, an early research method in which the demographic attributes don’t need to be specified in advance. And finally, there is constraint-based optimization, which we will go into in more detail over the next few slides in a case study that we have released. Remediation, like evaluation, must be used with care. We aim to provide both the tools and the technical guidance to encourage teams to use this technology responsibly. CelebA is a large-scale face attributes data set with more than 200,000 celebrity images, each with 40 binary attribute annotations, such as smiling, age, and headwear. I want to take a moment to recognize that binary attributes do not accurately reflect the full diversity of real attributes and are highly contingent on the annotations and annotators. In this case, we are using the data set to test a smile detection classifier and
how it works for various age groups, characterized as young and not young. I also recognize that this is not the full possible span of ages, but bear with me for this example. We trained an unconstrained (and you’ll find out what unconstrained means) tf.keras.Sequential model and evaluated and visualized it using Fairness Indicators. As you can see, not young has a significantly higher false positive rate. Well, what does this mean in practice? Imagine that you’re at a birthday party and you’re using this new smile detection camera that takes a photo whenever everyone in the photo frame is smiling. However, you notice that in every photo, your grandma isn’t smiling, because the camera falsely detected her smiles when they weren’t actually there. This doesn’t seem like a good product experience. Can we do something about this? TensorFlow Constrained Optimization is a technique released by the Glass Box research team here at Google. And here, we incorporate it into our case study. TF Constrained Optimization works by first defining the subsets of interest. For example, here, we look at the not young group, represented by groups_tensor less than 1. Next, we set the constraints on this group, such that the false positive rate of this group is less than or equal to 5%. And then we define the optimizer and train. As you can see here, the constrained sequential model performs much better. We ensured that we picked a constraint where the overall rate is equalized for the unconstrained and constrained model, such that we know that we’re actually improving the model, as opposed to merely shifting the decision threshold. And this applies to accuracy as well, making sure that the accuracy and AUC have not gone down. But as you can see, the not young FPR has decreased by over 50%, which is a huge improvement. You can also see that the false positive rate for young has actually gone up, and that shows that there are often trade-offs in these decisions. If you want to find out more about this case study, please see the demos that we will post online to the TF site.
Finally, we want to figure out how to compare our models across different decision thresholds so that we can help you in your deployment decision and make sure that you’re launching the right model. Model Comparison is a feature that we launched so that you can compare models side by side. In this example, which is the same example that we used before, we’re comparing the CNN and SVM models for the same smile detection task. Model Comparison allows us to see that CNN outperforms SVM (in this case, has a lower false positive rate) across these different groups. And we can also do this comparison across multiple thresholds. You can also see the tabular data and see that CNN outperforms SVM at all of these thresholds. In addition to remediation and Model Comparison, we also launched Jupyter notebook support, as well as a Fairness Lineage with ML Metadata Demo Colab, which traces the root cause of fairness disparities using stored run artifacts, helping us detect which parts of the workflow might have
contributed to the fairness disparity. Fairness Indicators is still early, and we’re releasing it here today so we can work with you to understand how it works for your needs and how we can partner together to build a stronger suite of tools to support various questions and concerns. Learn more about Fairness Indicators here at our tensorflow.org landing page. Email us if you have any questions. And the Bitly link is actually our GitHub page and not our tf.org landing page, but check it out if you’re interested in our code or case studies. This is just the beginning. There are a lot of unanswered questions. For example, we didn’t quite address, where do I get relevant features from if I want to slice my data by those features? And how do I get them in a privacy-preserving way? I’m going to pass it on to Miguel to discuss privacy tooling in TensorFlow in more detail. Thank you.
MIGUEL GUEVARA: Thank you, Kat. So today, I’m going to talk to you about machine learning and privacy. Before I start, let me give you some context. We are in the early days of machine learning and privacy. The field at the intersection of machine learning and privacy has existed for a couple of years, and companies across the world are deploying models to be used by regular users. Hundreds, if not thousands, of machine learning models are deployed to production every day. Yet, we have not ironed the privacy issues out with these deployments. For this, we need you, and we’ve got your back in TensorFlow. Let’s walk through some of those privacy concerns and ways in which you can mitigate them. First of all, as you all probably know, data is a key component of any machine learning model. Data is at the core of everything that’s needed to train a machine learning model. However, I think one of the pertinent questions that we should ask ourselves is, what are the primary considerations when we’re building a machine learning system?
We can start by looking at the very basics. We’re generally collecting data from an end device. Let’s say it’s a cell phone. The first privacy question that comes up is, who can see the information in the device? As a second step, we need to send that information to the server. And there are two questions there. While the data is in transit to the server, who has access to the network? And third, who can see the information in the server once it’s been collected? Is this only reserved for admins, or can regular [INAUDIBLE] also access that data? And then finally, when we deploy a model to the device, there’s a question as to who can see the data that was used to train the model. In a nutshell, if I were to summarize these concerns, I think that I can summarize them with those black boxes. The first concern is, how can we minimize data exposure? The second one is, how can we make sure that we’re only collecting what we actually need? The third one is, how do we make sure that the collection is only ephemeral, for the purposes that we actually need? Fourth, when we’re releasing it to the world, are we releasing it only in aggregate? And are the models that we’re releasing memorizing or not?
One of the biggest motivations for privacy is some ongoing research that some of my colleagues have done here at Google. A couple of years ago, they released this paper where they show how neural networks can be vulnerable to unintended memorization attacks. So for instance, let’s imagine that we are training a machine learning model to predict the next word. Generally, we need text to train that machine learning model. But imagine that that data or that corpus of text has, or potentially has, sensitive information, such as social security numbers, credit card numbers, or others. What the paper describes is a method with which we can probe how likely the model is to actually memorize some data. I really recommend that you read it, and I think that one of the interesting aspects is that we’re still in the very early days of this field. The research that I showed you is very good for neural networks, but there are ongoing questions around classification models. We’re currently exploring more attacks against machine learning models that can be more generalizable and used by developers like you, and we hope to update you on that soon. So how can you get started? What are the steps that you can take to do machine learning in a privacy-preserving way?
Well, one of the techniques that we use is differential privacy, and I’ll walk you through what that means. You can look at the image there and imagine that that image is the collection of data that we’ve collected from a user. Now let’s zoom into one specific corner, that blue square that you see down there. So assume that we’re training on the individual data that I’m zooming in on. If we train without privacy, we’ll train with that piece of data as it is. However, we can be clever about the way that we train a model. And what we could do, for instance, is just flip each bit with a 25% probability. One of the biggest concerns that people have with this approach is that it naturally introduces some noise, and people have questions as to, what’s the performance of the resulting model? Well, I think one of the interesting things from this image is that even after flipping 25% of the bits, the image is still there. And that’s kind of the big idea around differential privacy, which is what powers TensorFlow Privacy. As I said, differential privacy is the notion of privacy that protects the presence or absence of a user in a data set, and it allows us to train models in a privacy-preserving way. We released, last year, TensorFlow Privacy, which you can check out at our GitHub repository, github.com/tensorflow/privacy. However, I want to talk to you also about some trade-offs. Training with privacy might reduce the accuracy of the models and increase training time, sometimes exponentially. Furthermore, and I think more worryingly and tied to Kat’s talk, if a model is already biased, differential privacy might make things even worse, as in even more biased. However, I do want to encourage you to try to use differential privacy, because it’s one of the few ways that we have to do privacy in ML. The second one is federated learning. So, as a refresher, TensorFlow Federated is an approach to machine learning where a shared global model is trained across many participating clients that
keep their training data locally. It allows you to train a model without ever collecting the raw data, therefore reducing some privacy concerns. And of course, you can also check it out at our GitHub repository. This is kind of what I was mentioning about TensorFlow Federated learning. The idea is that devices generate a lot of data all the time: phones, IoT devices, et cetera. Traditional ML requires us to centralize all of that data in a server and then train the models. One of the really cool aspects about federated learning is that each device runs the training locally only, and the outputs are aggregated to create improved models, allowing the orchestrator not to see any private user data. Federated learning, as a recap, allows you to train models without ever collecting the raw data. So if you remember the first slide that I showed, it really protects the data at the very edge.
In terms of next steps, we would really want you to reach out to tf-privacy. We would love to partner with you to build responsible AI cases with privacy. As I said earlier, we’re in this together. We are still learning. The research is ongoing. And we want to learn more from your use cases. I hope that from the paper that I showed you, you have the sense that keeping user data private is super important. But I think most important is that this is not trivial. The trade-offs in machine learning and privacy are real, and we need to work together to find what the right balance is. [MUSIC PLAYING]
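
As a rough illustration of the federated learning idea Miguel describes (each device trains locally, and only the outputs are aggregated), here is a minimal sketch in plain Python. It is not the TensorFlow Federated API. The toy linear model, the synthetic client data, and every function name below are assumptions made up for this example. The aggregation step weights each client's parameters by its number of examples, which is the usual federated averaging convention.

```python
import random


def local_update(weights, data, lr=0.1, epochs=5):
    """Hypothetical client step: gradient descent on one client's own data.

    The raw (x, y) pairs never leave this function; only the updated
    parameters are returned to the caller.
    """
    w, b = weights
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y
            grad_w += 2.0 * err * x / len(data)
            grad_b += 2.0 * err / len(data)
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_average(client_weights, client_sizes):
    """Hypothetical server step: average parameters, weighted by data size."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)


def make_client_data(rng, n):
    """Synthetic private data for one client: y = 3x + 1 plus small noise."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, 3.0 * x + 1.0 + rng.gauss(0.0, 0.05)) for x in xs]


def run_federated_training(num_rounds=50, seed=0):
    rng = random.Random(seed)
    # Three simulated devices, each holding a different amount of data.
    clients = [make_client_data(rng, n) for n in (20, 35, 50)]
    global_weights = (0.0, 0.0)
    for _ in range(num_rounds):
        # Each client starts from the shared global model and trains locally.
        updates = [local_update(global_weights, data) for data in clients]
        # The server only ever sees the parameters, not the examples.
        global_weights = federated_average(updates, [len(d) for d in clients])
    return global_weights


final_w, final_b = run_federated_training()
print(f"learned w={final_w:.2f}, b={final_b:.2f}")
```

In this sketch the server recovers parameters close to the underlying slope and intercept without ever touching a client's examples. Note that sharing model updates alone does not give a formal privacy guarantee, which is one reason the talk presents federated learning alongside differential privacy.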