Android Jetpack: Easy background processing with WorkManager (Google I/O '18)

[MUSIC PLAYING] SUMIR KATARIA: Hi, everyone. My name is Sumir Kataria, and I’m an engineer on the Android team. I work on Architecture Components. And today, I want to talk to you about a new library we have called WorkManager, and about background processing in general on Android.

So let’s talk about background processing in 2018. What are we trying to do these days? Just this morning I was trying to send a picture of my lovely wife and my beautiful son to the rest of my family. So that’s an example of background processing. We’re also sending logs, syncing data, processing that data. All of this work is being done in the background.

And on Android, there are a lot of different ways you can do this work. Here are a lot of them. You can run tasks on threads and executors, or use JobScheduler, AlarmManager, AsyncTasks, et cetera. Which one should you use? And when should you use it? Meanwhile, Android also has a lot of battery optimizations that we’ve introduced over the last few years. For example, we introduced doze mode in Marshmallow. If you’ve been following Android P, we’ve had app standby buckets. In Oreo, we restricted background apps and background services. So all of these things have to be taken care of as a developer. And finally, we always have to worry about backwards compatibility. So if you want to reach 90% of Android devices, you want to at least have a minSdk of KitKat.

So given all of this, what tools do you use and when do you use them? The trick behind this is that you have to look at the types of background work that you’re doing. I like to split this up into two axes. The vertical axis here is the timing of the work. Does the work need to be done right when it’s specified? Or can it wait for a little bit, so that if your device enters doze mode, you can still do it after that? Also, on the horizontal axis here, how important is the work? Does the work only need to be done while your app’s in the foreground? Or does it absolutely need to be done at some point?
So for example, if you’re taking a bitmap, and you decide that you want to extract a color from it and update your UI with it, that’s an example of foreground-only work. You don’t care about it once the user hits home or back. That work is irrelevant. Meanwhile, if you’re sending logs, you always want that to happen. That’s an example of guaranteed execution.

So for things that are best-effort, you really want to use things like ThreadPools, RxJava, or coroutines. For things that require exact timing and guaranteed execution, you want to use a foreground service. So an example of this would be that the user hits a button, and you want to process a transaction, and update the UI and the state of the app based on that. That really needs a foreground service. That needs to happen right then. Your app cannot be killed by the system while that’s happening.

This fourth category is very interesting. You want guaranteed execution, but you’re OK if it happens later; doze mode can kick in. And there’s a variety of ways to solve it. On newer APIs, you’ll use JobScheduler. If you want to go a little further back, you can use Firebase JobDispatcher to do that. And if you don’t have Google Play Services, you’ll probably end up using AlarmManager and BroadcastReceivers. And if you want to target all of those things, well, you’ll use some mix of these four things. And that’s a lot of APIs, a lot of work to be done. WorkManager falls here. It’s guaranteed execution that’s deferrable.

So WorkManager, let’s talk a little bit about its features. I just mentioned guaranteed execution. It’s also constraint-aware. So if I want to upload that photo that I talked about, I only want to do it when I have a network. That’s the constraint. It’s also respectful of the system background restrictions. So if your app is in doze mode, it won’t execute. It won’t wake up your phone just to do this work. It’s backwards compatible, with or without Google Play Services. The API is queryable. So if you have enqueued some
work, you can actually check, what is its state? Is it running right now? Has it succeeded or failed? These are things that you can find out with WorkManager. It’s also chainable, so you can create graphs of work. You can have Work A depending on Work B and C, which in turn depend on Work D. Also, WorkManager is opportunistic. This means that we will try to execute that work in your process as soon as the constraints are met, without actually needing JobScheduler to intervene or call you and wake you up. It doesn’t wait for JobScheduler to batch your work if your process is up and running already. So let’s talk about a little bit of the basics

and talk through the code. So I just described the example: I want to upload that photo. So how would I do that using WorkManager? Let’s talk about the core classes. There’s a Worker class. This is the class that does the work. OK. This is where you will write most of your business logic. And there’s a WorkRequest class, which comes in two flavors: OneTimeWorkRequest for things that just need to be done once, and PeriodicWorkRequest for recurring work. And these will both take a Worker. I’ll show you just now.

So here’s my UploadPhotoWorker. It extends the Worker class, and it overrides the doWork method. This is the method that will run in the background. We take care of running it on a background thread; you don’t need to put it on one yourself. So you simply do your work. In this case, we upload the photo synchronously, and we return a result. In this case, let’s say we succeeded. And the WorkerResult here has three values: success and failure, which are fairly obvious; and retry, which says, I encountered a transient error. Let’s say that the device lost network connection in the middle, so retry me after a little bit with some backoff.

So now that I have this, I can create a OneTimeWorkRequest using the UploadPhotoWorker, and then I can enqueue it using WorkManager.getInstance().enqueue. Soon after this is enqueued, it’ll start running. You’ll upload your photo. But I just talked about this: what if you lose connectivity in the middle of this, or even before it starts? What if you’ve never had connectivity? You actually want to use constraints in this case. So an example of the constraint you want to use here: you make a Constraints.Builder, and you say, setRequiredNetworkType to be CONNECTED. So you need a connected network connection. You build it, and you also set the constraints on the request that you just built.
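As a rough illustration of the behavior described above, here is a minimal, self-contained Java model. It is not the WorkManager API: `ConstraintSketch`, `Request`, `Queue`, and `runEligible` are hypothetical names invented for this sketch, and it assumes a single "network required" constraint and no backoff timing. It only shows the rule that constrained work waits until its constraint is met, and that a retry result keeps the work pending.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class ConstraintSketch {
    /** The three outcomes a worker can report, as described in the talk. */
    public enum Result { SUCCESS, FAILURE, RETRY }

    /** Hypothetical request: a name, a network constraint flag, and the work itself. */
    public static final class Request {
        final String name;
        final boolean requiresNetwork;
        final Supplier<Result> work;

        public Request(String name, boolean requiresNetwork, Supplier<Result> work) {
            this.name = name;
            this.requiresNetwork = requiresNetwork;
            this.work = work;
        }
    }

    /** Hypothetical in-memory queue standing in for the real scheduler. */
    public static final class Queue {
        private final List<Request> pending = new ArrayList<>();
        public final List<String> finished = new ArrayList<>();

        public void enqueue(Request request) {
            pending.add(request);
        }

        public int pendingCount() {
            return pending.size();
        }

        /** Runs every pending request whose constraint is met right now. */
        public void runEligible(boolean networkConnected) {
            List<Request> stillPending = new ArrayList<>();
            for (Request request : pending) {
                if (request.requiresNetwork && !networkConnected) {
                    stillPending.add(request); // constraint not met: keep waiting
                    continue;
                }
                Result result = request.work.get();
                if (result == Result.RETRY) {
                    stillPending.add(request); // transient error: try again later
                } else {
                    finished.add(request.name + ":" + result);
                }
            }
            pending.clear();
            pending.addAll(stillPending);
        }
    }
}
```

In this model, an upload enqueued with `requiresNetwork = true` stays pending across any number of `runEligible(false)` calls and only runs once `runEligible(true)` is called.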
So by simply doing this, and then enqueuing it, you make sure that this work only runs when your network is connected.

So let’s say you want to observe this work now that we’ve done it. I want to show a spinner while this work is executing, and then I want to hide the spinner when it’s done. How would I do that? As I said, I’ll enqueue this request. And then I can say, getStatusById on WorkManager using the request’s ID. Each request has an ID. And this returns a LiveData of WorkStatus. If you guys remember Architecture Components, LiveData is a lifecycle-aware observable. So now you can just hook into that observable, and you can say, when that work is finished, hide that progress bar.

So what is this WorkStatus object that you were looking at in the LiveData? It has an ID. This is the same ID as the request. And it has a State. The State is the current state of execution. There are six values here: enqueued, running, succeeded, failed, blocked, and canceled. And we’ll talk about the last two later.

So let’s move a step up in concepts here. Let’s talk about chaining work. I promised that you can actually make directed acyclic graphs of work. How would you do that?
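The spinner rule described above can be written down as a small state check. The following is a sketch in plain Java, not the library's own classes: `WorkStateSketch` and `shouldShowSpinner` are hypothetical names, and the enum simply mirrors the six states listed in the talk (the library names the fifth one BLOCKED).

```java
public class WorkStateSketch {
    /** The six execution states named in the talk. */
    public enum State {
        ENQUEUED, RUNNING, SUCCEEDED, FAILED, BLOCKED, CANCELLED;

        /** Terminal states: the work will not run any further. */
        public boolean isFinished() {
            return this == SUCCEEDED || this == FAILED || this == CANCELLED;
        }
    }

    /** Assumed UI rule for this sketch: show the spinner until the work is finished. */
    public static boolean shouldShowSpinner(State state) {
        return !state.isFinished();
    }
}
```

An observer on the status would call something like `shouldShowSpinner(status)` each time the state changes and toggle the progress bar accordingly.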
Let’s say this is my problem now: I’m uploading a video. It’s a huge video. I want to compress it first, then upload it. These are both eligible for background work because they’re time-intensive things. So let’s say I have two Workers, CompressPhotoWorker and UploadPhotoWorker. They’re both defined to do the things that I just said. So you can make WorkRequests from them. And you can say workManager.beginWith(compressWork), then uploadWork, and enqueue it. This ensures that compressWork executes first. And once it’s successful, then uploadWork goes.

And if you were to write this out, because that was a very fluent way of writing it, what happens behind the scenes here is that beginWith returns a WorkContinuation. And a WorkContinuation has a method called then that also returns a WorkContinuation, a different one. So you’re using that to create that fluent API. You can actually use these WorkContinuations and pass them around if you want, et cetera.

So let’s say that I’m uploading multiple photos. I take lots of photos. No one takes only just one photo of their child. So how would I upload all of these in parallel? Well, let’s say I’ve got a WorkRequest for each of them. I can literally just say, enqueue, and put all of them there. It’s a [INAUDIBLE]. So you can pass more than one thing there. And these are all eligible for running in parallel. They may not actually run in parallel, depending on your device, and the executor being used, and all of that, but they could be.

So let’s choose an even more complex example. Now you want to filter your photos. So you want to apply some kind of,

I don’t know, a grayscale filter or a sepia filter to them, then you want to compress them, then you want to upload them. How would you do this? WorkManager makes it very simple. First you say, beginWith. You do all the filter work in parallel. After those have all completed successfully, then you do your compression work. And after that has completed successfully, then you do your uploadWork. And don’t forget to enqueue at the end.

So we’ve talked about all of that, but there’s a key concept that I want to cover that’s very much related to chaining. This is inputs and outputs. So let’s talk about this problem that I have here. It’s a MapReduce, and really a good way of explaining a MapReduce is to give an example. I love reading. I’ve loved reading Sherlock Holmes novels since I was a kid. And the other day, I was thinking, well, Arthur Conan Doyle has a very specific way of writing. What are the top 10 words he uses in his books? Well, how would I figure that out? I would go through each book. I would count the occurrence of each word, and then I would combine all of that data and sort it so that I would find the top 10 of those. This is a distributed problem that we could call a MapReduce.

And for inputs and outputs, the common unit of operation here is Data. Data is a simple class that’s a key-value map under the hood. The keys are strings. The values are primitives and strings, and the array versions of each. So this is kind of like a Bundle or a Parcelable, but it’s its own thing. It’s serializable by WorkManager, and we limit it to 10 kilobytes in size. And I’ll go more into that part later. So how do we create a Data?
So in Kotlin, you can make a map very easily. In this case, we’re mapping the key file_name to the value a_study_in_scarlet.txt. That’s the novel that I’m going to look at. And I’ll convert it to a WorkData. So this is a Data object. And once I create my workRequest builder, I can set the InputData on it. So this is the InputData of that map, and I pass it along. Inside my worker, I can actually retrieve this InputData by just calling the getInputData method. And from that I can get the string for the file name. And now I have the fileName, and I can say, count all the word occurrences in this fileName. That’s some method that I’ve written somewhere else, and I can return my success.

But you don’t want to do just that. You actually want to also have outputs, right? Now you’ve done all this work, it should do something. There should be an output for it. So let’s say that data that we have returns a map of words to their occurrences. We can convert that map to a WorkData, and we can call a method called setOutputData that sets this data. So getInputData, setOutputData.

So the key observation that you need to know here is that the worker’s outputs become the inputs for its children. So what happens is the findTop10Words worker, which goes next, its inputData is coming from the previous worker. In this case, you can pass the data all the way through, find the top 10 words, and return out. So the data flow for one book becomes like this: I’ll count all the word occurrences in that book. I’ll pass it to the findTop10Words worker. Its inputData will be whatever I pass through. And it will do the sorting or whatever it needs to do.

But here is a really tricky thing: what happens when you have multiple books? What’s the input for the findTop10Words worker? You’re passing multiple pieces of data, but I’ve only been able to get one inputData. What happens to the rest of them, or how do they combine?
For this, you want to look at InputMergers. InputMerger is a class that combines data from multiple sources into one Data object. And we provide two implementations out of the box: OverwritingInputMerger, which is the default, and ArrayCreatingInputMerger. You can also create your own, but let’s talk about these two.

First, OverwritingInputMerger. We have two data objects here, each with their own keys and values. What does OverwritingInputMerger do? It first takes the first piece of data and it just puts everything in a new data object. So it’s an exact copy of this. Then it takes the second piece of data and it copies it over, overwriting anything that has the same key. So in this case, the name Alice becomes Bob, and the age of 30 becomes Three Days. Note that it changed type: a number became a string here. The scores key was new, so it just got added.
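The overwriting behavior just described amounts to a last-writer-wins map merge. Below is a minimal Java model of that idea using plain maps. It is not the library's implementation, and `OverwritingMergeSketch` is a hypothetical name. The keys follow the Alice and Bob example from the talk; the score values used with it are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class OverwritingMergeSketch {
    /**
     * Copies each input into one result map, in order.
     * A later input overwrites any earlier value stored under the same key,
     * even if the value type differs.
     */
    public static Map<String, Object> merge(List<Map<String, Object>> inputs) {
        Map<String, Object> merged = new LinkedHashMap<>();
        for (Map<String, Object> input : inputs) {
            merged.putAll(input);
        }
        return merged;
    }
}
```

Merging `{name=Alice, age=30}` followed by `{name=Bob, age="Three Days", scores=...}` leaves `name` as Bob and `age` as a string, while `scores` is simply added. Swapping the input order flips the result for the colliding keys.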

Note that if we did this in reverse, instead of Bob, you would have Alice as the final output. So this is something a little tricky. You want to make sure that OverwritingInputMerger is the right tool for the job. But it is very simple.

What about ArrayCreatingInputMerger? This is the one that actually takes care of those collision cases. So in this case, let’s go key by key. The name becomes an array of Alice and Bob. Color becomes a singleton array of blue, because it’s only defined in one of them. Scores: notice that there is one integer and one array of integers. These combine, and they just concatenate together. Order is not specified, but all the values come through. What happens for age? There’s an integer, and there’s a string. This is an exception. We do expect it to be the same basic value type.

So let’s go back to that example I was telling you about, Sherlock Holmes. Implicitly, there is an InputMerger before the stage, so we combine all of that data. And which InputMerger do we want to use? We don’t want to throw away any of this calculation that we’ve done. So we actually want to use an ArrayCreatingInputMerger, which preserves all of the data and gets it through. So how do we do that? Well, we just say setInputMerger on the request builder of the findTop10Words, so it merges data using an ArrayCreatingInputMerger. So you say, beginWith the countWords workers, then do the findTop10Words worker, and then enqueue.

So for example, if the first book had 10 instances of the word Sherlock, 5 of Watson, and 30 of elementary, and the second one had 12, 15, and 5, you would get arrays like this: Sherlock would be 10, 12. Watson would be 5, 15. Elementary would be 30, 5. So in your findTop10Words worker, you would sum all of that up, sort them, find the top 10, and that’s your output. And I just said that’s your output, right?
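To make the array-creating merge and the reduce step concrete, here is a small Java model under stated assumptions. It uses plain maps and lists instead of the library's Data class, `ArrayMergeSketch`, `merge`, and `topWords` are hypothetical names, and it keeps values in input order even though the talk says the real merger does not specify order.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ArrayMergeSketch {
    /** Collects the values for each key into one list; mixed value types are rejected. */
    public static Map<String, List<Object>> merge(List<Map<String, Object>> inputs) {
        Map<String, List<Object>> merged = new LinkedHashMap<>();
        for (Map<String, Object> input : inputs) {
            for (Map.Entry<String, Object> entry : input.entrySet()) {
                List<Object> bucket =
                        merged.computeIfAbsent(entry.getKey(), k -> new ArrayList<>());
                // A single value is treated as a one-element list, so it can be
                // concatenated with list values stored under the same key.
                List<?> incoming = (entry.getValue() instanceof List)
                        ? (List<?>) entry.getValue()
                        : List.of(entry.getValue());
                for (Object value : incoming) {
                    if (!bucket.isEmpty() && bucket.get(0).getClass() != value.getClass()) {
                        throw new IllegalArgumentException(
                                "Mixed value types for key " + entry.getKey());
                    }
                    bucket.add(value);
                }
            }
        }
        return merged;
    }

    /** Sums the merged integer counts per word and returns the n most frequent words. */
    public static List<String> topWords(Map<String, List<Object>> merged, int n) {
        Map<String, Integer> totals = new LinkedHashMap<>();
        merged.forEach((word, counts) ->
                totals.put(word, counts.stream().mapToInt(c -> (Integer) c).sum()));
        return totals.entrySet().stream()
                .sorted((a, b) -> Integer.compare(b.getValue(), a.getValue()))
                .limit(n)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```

With the two books from the talk, the merged counts are Sherlock [10, 12], Watson [5, 15], and elementary [30, 5], which sum to 22, 20, and 35, so elementary ranks first in this model.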
So you can actually observe the output in your work status using that LiveData. You can actually get that output data. That’s super useful, because you can display it in your UI.

How do you cancel work? So I just decided to send up a picture, but I’m like, wait a sec, this is not the picture I meant to send up. How do you cancel that upload? Very simple: you just say, cancelWorkById. But do note that cancellation is best effort. The work may have already finished. These are all asynchronous things. They may be happening in the background. Before you have had a chance to do that cancelWork, it may already be running or finished. So it’s best effort.

OK, so let’s talk a little bit more about tags. Tags are solving this problem: the IDs that I just told you about are auto-generated. They’re not human readable. They’re actually UUIDs under the hood, and you can’t really understand them. They’re not useful for debugging. If you log them, they’re not going to make sense. What kind of work was running? I don’t know. It’s just some big number. I don’t know what that is. Tags solve this issue. Tags are a readable way to identify your work. Tags are strings specified by you, the developer, and each work request can have zero or more tags. You can query and cancel work by tag.

Let’s look at an example. I used to work on the G+ team here, and the G+ app supports multi-login. So you can have multiple users logged in at the same time. And each of those users could be doing several kinds of background work. You could be getting favorites. You could be getting preferences. So if you have three users logged in on your phone, and they’re doing two kinds of work, you have six things happening. How do you identify what you’re looking at at any given time?
Well, you can use tags. So for example, in this workRequest builder, you can add tags to say this is user1, and this is the get_favorites operation. So now you can actually identify that work. And if you wanted to look at the statuses, you could say, give me all of the work for user1. And this will return a list of work statuses as a LiveData, because each tag can correspond to more than one workRequest. So this is a list of work statuses. Similarly, you can also cancel all work by tag. Cancellation is best effort, again. But you can cancel all of one particular kind of work, in this case.

Tags are also useful for a couple of other reasons. Tags namespace your type of work, as I just told you. You can have tags for the kinds of operations you’re doing: get favorites, get preferences, et cetera. But they also namespace libraries and modules. So if you’re a library owner or a module owner,

you should always tag your work so you can get it later. Let’s say that you have a library, and you move to a new version of that library in your app. Maybe you want to cancel all the work you had. You can cancel all work by your tag. So always use tags when you are using a library. And WorkStatus also has tags available in it. So if you’re ever looking at a WorkStatus, you can get the tags for that work and see what you yourself called it in the past when you enqueued it.

One more thing I wanted to talk about is unique work. Unique work solves a few different problems. But one of the common ones that almost every app has is syncing. You want to sync when you first launch the app. You want to sync maybe every 12 to 24 hours to get the freshest data. And you may also want to sync when your language changes. Maybe you have a version of your data in a different language, so you want to sync at that point, too. So you’re doing all this syncing, but you really only want one sync active at a time. You don’t want four sync operations running. Which one is the right one? Which one wins? You don’t know. You just want one. Unique work can solve this. A chain of work can be given a unique name. You can enqueue, query, and cancel using that name, and there can only be one chain of work with that name.

Let’s take a look at that sync example. So I say, beginUniqueWork with my name, let’s say sync, in this case. And that next argument is what I call the existing work policy. So if there is work with this name, sync, what should I do with it?
In this case, I say, keep it. I want to keep the existing work, and ignore what I’m doing right now. The next argument is actually your workRequest, in this case, a syncRequest, and then you enqueue it. So if there’s work with the name sync already in flight, it will keep that. If there isn’t, it will enqueue this and execute it. So this is how you dedupe your syncs.

So here at Google, we love chat apps. And maybe you’re updating your chat status. So you want to say, I’m bored. And then 10 seconds later, I’m watching TV. Then, I’m bored again. OK, I’m going to sleep. And you’re in a bad network connection state. You have bad Wi-Fi, and maybe the first thing hasn’t gone through when you type your second chat status update. And really the second one should win, and the third one should win over that. So you want to make sure that the last one wins. How would you solve this? Here’s a simple function. You don’t even need to read the rest of it. It’s the last line that I want you to care about, which is beginUniqueWork. Your name is update_status, and you choose the REPLACE option. REPLACE cancels and deletes any existing in-flight operations of that name. So the last one always does win. In this case, if you have two update chat status calls, the last one will win.

And finally, I love music. I love the Foo Fighters. I was building a playlist the other day with all their songs. There’s a lot of songs. There’s like 150 or 200 songs. And I was doing all of this: I was adding a song. I was shuffling two songs around. I was moving something to the bottom of the list. I was deleting a song because I had it already somewhere else. These are all things that I want to do using WorkManager, but how would I do that?
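The KEEP and REPLACE behavior described above can be modelled with a map from unique name to whatever is currently in flight. This is only a sketch of the policy semantics, not the library's scheduler: `UniqueWorkSketch`, `enqueueUnique`, and `finish` are hypothetical names, and a string payload stands in for a real work request.

```java
import java.util.HashMap;
import java.util.Map;

public class UniqueWorkSketch {
    /** The two policies covered so far in the talk. */
    public enum Policy { KEEP, REPLACE }

    // Unique name -> payload of the work currently in flight under that name.
    private final Map<String, String> inFlight = new HashMap<>();

    /** Applies the policy and returns the payload that is in flight afterwards. */
    public String enqueueUnique(String name, Policy policy, String payload) {
        if (policy == Policy.KEEP) {
            inFlight.putIfAbsent(name, payload); // existing work wins, new work is ignored
        } else {
            inFlight.put(name, payload);         // existing work is dropped, new work wins
        }
        return inFlight.get(name);
    }

    /** Marks the work under this name as done, so the name is free again. */
    public void finish(String name) {
        inFlight.remove(name);
    }
}
```

In this model, two `sync` requests with KEEP leave the first one in flight, while two `update_status` requests with REPLACE leave only the last one.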
These things all have to execute in order. And since the order is important, we provided the ability to use the APPEND existing work policy, which says, do this work at the end of the list of update_playlist operations. So append this to the end of this thing, and everything else must successfully execute before this executes. So you can add operations to the end. So ExistingWorkPolicy, as a summary: there are three types, KEEP, REPLACE, and APPEND.

A few notes about PeriodicWork. It works very similarly to everything you’ve seen so far. Just a couple of notes on it: the minimum period length is the same as JobScheduler’s. It is 15 minutes. It is still subject to doze mode and OS background restrictions, just like any of the other work we’ve talked about. It can’t be chained, and it can’t have initial delays. And we think that that just sort of makes good API sense. It’s much more reasonable to think of it in those terms.

All right, so we’ve talked a lot about code. Let’s talk about how it all works under the hood. So you’ve got some work. You enqueue it. We store it in our database. What happens after that? Well, if the work is eligible for execution, we just send it to the executor right away. By the way, this executor, you can actually specify it,

but we do provide one that’s the default. But let’s say that your process gets killed. Well, what happens then? How does it get woken up, and how does this work run again? If you’re on API 23+, we send it to JobScheduler as well. And JobScheduler invokes an IPC, and wakes up your process. It goes to the same executor, and that’s where it runs. If it’s an older device, and you use Firebase JobDispatcher and our optional dependency, we can send it to Firebase JobDispatcher. Same thing: it invokes an IPC, and runs it on that executor. What if you don’t have that, or you’re not using a Google Play services device? So you’re using something else. We have a custom AlarmManager and BroadcastReceivers implementation. And the same thing: it uses an IPC, wakes up your app when the time is right, and runs the job.

A couple of implementation details: JobScheduler, and Firebase JobDispatcher through Google Play services, provide a central load balancing mechanism for execution. So if every app on your device is trying to run jobs, they’ll load balance them. They’ll make sure that you’re not running too much work on your device and burning up your battery. The AlarmManager implementation that we have, unfortunately, can’t do that, because it’s only aware of your own app. Of course, newer concepts, like content URI triggers, idle, doze mode, et cetera, are only available at the API levels that they were introduced at. So those methods will be marked with RequiresApi and the appropriate API level. We take care of obtaining wake locks when necessary. This is especially true for the AlarmManager implementation. Don’t take wake locks in your workers. You don’t need to do that. We take care of it for you.

Finally, let’s talk a little bit about testing. You want to test this app. We provide a testing library. It has a synchronous executor. Use WorkManager as normal to enqueue your requests. And we provide a class called TestDriver, which executes enqueued work that has constraints. So we can just pretend that
the constraint has been met. Periodic and initial delay triggers are coming soon. We don’t have them yet. So if you wanted to look at the code for it, you can initializeTestWorkManager. You can get the TestDriver. Create and enqueue your work as you normally would, with a constraint, in this case. And we can tell the TestDriver, hey, all constraints are met for this work. Your work executes at that point, synchronously, and you can verify the state of your app and make sure that everything is right.

I also want to talk a little bit about best practices before I end here. It’s very important to know when to use WorkManager. WorkManager is for tasks that can survive process death. It can even wake up your app and your app’s process to do the work. So for example, it’s OK when you want to use it to upload media to a server. It’s also OK when you want to parse data and store it in your database. It’s not OK for that example I gave earlier, where you’re extracting the palette color from an image and updating an image view with it, because that’s foreground-only work. It’s also not OK when you’re parsing data and just updating the contents of a view. Because you could switch screens. You could go in the background. It’s not work that needs to use WorkManager. It doesn’t need to survive process death. Also, it’s not OK to process payment transactions in it if they care about timing right then. So if you click buy, and you want to update the state of the app, that really needs something else. That last one needs a foreground service. The other two may just use thread pools or Rx.

Also, WorkManager is not your data store. Instances of Data are limited to 10 kilobytes each when serialized. So Data is really meant for light, intermediate transportation of information. You can put file URIs or keys to other databases in there if you want. You can put simple information to update your UI. If you want to use a full data store, I would recommend using Room. Yigit would be very happy that I’m saying this. It’s an
awesome database.

Finally, be opportunistic with your work. So here’s a filter, compress, upload example again. The reason that these are not just one big job is because they all have different constraints. So they can execute at different times. Let’s say I’m getting on an airplane, and I’m uploading a bunch of images and running this chain of work. Well, I go into airplane mode. Maybe I don’t have network for the next 12 hours because I’m flying across the world. Well, the other work can still execute, and it should. So if you architect like this, you can do that. This also, by the way, makes your code a little bit more testable, because you can write a test for filtering that isn’t conflated

with compression and upload.

All right, and I want to talk about a few next steps for you. If you need to reach us and talk to us about WorkManager, we are in the Android Sandbox, just behind us, I think, over here. There’s more information on the official developer website about WorkManager. These are all the Gradle dependencies. The first one’s a required one. The second one is if you use Firebase JobDispatcher; also include that. There’s a testing library, and of course, we have Kotlin extensions as well. WorkManager is part of the Architecture Components in Android Jetpack. And we have a bunch of talks here tomorrow: Navigation Controller, 8:30 AM. Hope you guys make it there. And thanks for being part of this talk. We look forward to hearing back from you soon. Thank you. [MUSIC PLAYING]