Simon Thulbourn: Docker containers for testing and previewing BBC News

Hi, my name's Simon Thulbourn. I'm a software engineer at BBC News; I've been there for a couple of years now. I've been working on lots of things around the BBC, around AWS and how we're moving there from central infrastructure in our own data centres. We didn't really think too much about how we'd move before we actually moved, so we're still building all the tools and services. I want to share some of the ways we're doing it and how we're using Docker. Obviously we use a CI system of some sort; the BBC is pretty embedded with Jenkins, so we built a system around how to use Docker with it. We also built some other little bits of tooling, and tried setting up a platform as a service.

A little bit of background for people who haven't worked at the BBC: we have a whole ton of servers across two data centres, and everything is a shared resource. All your RPMs, across however many projects we have (iPlayer, News, Sport, Radio and so on), get installed on each of these hundred-odd PHP servers. Then we have another forty or so for the JVM, and a handful for data storage of some sort. Those are mirrored, and it's kind of OK, but it's a bit ropey. We do a release every two weeks onto live, and sometimes it goes horribly, comically wrong.

So in the last year, like I said, we've started deploying to AWS for frequent, push-button deployments, which is really great for us developers because we're now the ones completely in control of everything. But the BBC has entered this brave new world without some of the tools, like monitoring, logging and search, so we started building them. We recently invested a lot of time in building a CI setup that's not centrally provisioned and horrible, using Jenkins and Docker. We're using Jenkins
purely because that's how the BBC has always done it, between Jenkins and Hudson, so we kept with something people in our teams are used to. Traditionally we've used a centrally provisioned version of Hudson which has, I did a count this morning, 4,568 jobs on it, and that's just for the PHP apps, not getting into all the JVM-style stuff. So it's quite depressing, and it's static: if you want a new version of Ruby, that's not happening.

We're not the first people to do this, not by a long shot. Lots of people have done different versions of setting up Docker with Jenkins, or Jenkins inside Docker, or whatever variation of that. We went through some of the use cases people have put forward but decided they weren't quite right for us, and we didn't really want something as complex as eBay's system, where it's Docker inside Docker on Apache Mesos. Pretty awesome, but maybe in the future when we have time.

So we've got a pretty scalable and simple infrastructure: one Jenkins master, deployed directly onto an instance on AWS, then a practically unlimited number of slaves, which kind of works, and a Docker registry to put all of our build images in. Since we're hosting all of this ourselves, we get the flexibility that comes with it. Our slaves use auto-scaling groups, so they're able to scale based on whatever metric we define: if we decide we need more instances based on CPU usage, we spin up another, and then another, and if we have long-running jobs, we spin up an instance isolated just for that job.
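To make the scaling idea concrete, here is a minimal sketch using the AWS CLI. The group name, the one-slave-per-five-jobs ratio, and both function names are invented for illustration; in practice this would more likely be a CloudWatch alarm attached to a scaling policy rather than a script.

```shell
# Hypothetical helper: one extra slave per five queued jobs.
desired_capacity() {
    echo $(( $1 / 5 + 1 ))
}

# Apply it to a slave auto-scaling group (group name is made up).
scale_slaves() {
    aws autoscaling set-desired-capacity \
        --auto-scaling-group-name jenkins-slaves \
        --desired-capacity "$(desired_capacity "$1")"
}
```

So with twelve jobs queued, scale_slaves 12 would ask the group for three slaves.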

Out of the box, Jenkins supports build slaves, but they don't auto-register, which sucks. Since we don't know how many slaves we're going to have, we don't know their names or any details about them; they just come into existence at some point. Ultimately we got around this with the Swarm plugin (sorry, I couldn't find a good swarm GIF), which does auto-register, which is a lot better.

BBC News has a lot of people doing a lot of things. It's not just PHP anymore, and not just a JVM-based language; we've got different things running all the time and all sorts of environments to maintain. This is where Docker comes in. Instead of the traditional way you're perhaps expecting, of running everything inside containers including Jenkins and the Jenkins agents, we're letting the jobs call directly out to Docker, because using a container to run our jobs inside is a lot simpler, and I don't really want to be the guy maintaining Jenkins forever. Now we can have that big hooray-go-team moment where teams are responsible for their own infrastructure and their own build environments.

So the workflow is the traditional Docker one: build and push, then the slave pulls the image and runs it until it terminates. That has some issues for us to work around. Container updates: what do we do when a development team updates their own build environment? Get people to clean up after themselves? Use cron and wipe images off every hour or so? But you can't always do that: docker rmi gets rejected if the image is actually in use by a running job. Or we hope the machine scales down and we just don't deal with it, which is sucky.

We have another issue with build artifacts and the very specific way the BBC packages RPMs: we have to run a container on Jenkins in privileged mode if we're building anything akin to an RPM, which isn't ideal. This is because we're using Red Hat's Mock project, I don't know if you know it or not, which essentially uses namespaced processes within a chroot, which is kind of like Docker in Docker but without Docker, so not far from it.
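A sketch of that per-job flow, with an invented registry host and image name: the team's Dockerfile defines the build environment, the job publishes it, and the slave runs the actual build in a throwaway container.

```shell
# Hypothetical CI job step; registry host, image name and the
# "make test" build command are invented for illustration.
run_ci_job() {
    image="registry.ci.internal:5000/news/build-env:latest"

    # Build and publish the team-owned build environment
    # (defined by a Dockerfile in the team's repo).
    docker build -t "$image" . &&
    docker push "$image" &&

    # On the slave: pull the environment, run the build inside a
    # throwaway container, and let it terminate when done.
    docker pull "$image" &&
    docker run --rm "$image" make test
}
```

The --rm flag is what gives the "runs until it terminates" behaviour: the container is deleted as soon as the build finishes, though the image itself sticks around until something cleans it up.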
We also have our own build chain, which is unique to the BBC: a load of Python scripts that we build as command-line tools. Each build environment has to inherit from this, or we create an overlay, which is something we haven't solved yet. It's not perfect, but it's a hundred times better than where we were this time a month ago: we now have customised environments, we can handle a lot of load, and we manage everything ourselves.

There's another part of our Docker strategy. This is a GitHub link, so please go check it out: a contractor for BBC News built this tool, Spurious; he's sitting in the back there somewhere, near the camera. So what is it? It allows developers who are using AWS to develop completely locally: you don't have to use AWS resources to use AWS resources. Beyond the initial setup and pulling down of images, we can work completely offline. We're faking a lot of the AWS services: we've got the ability to fake SQS, DynamoDB, S3 and ElastiCache. That has issues, like what do we do with the state, but we don't really care: it's not production, it's not leaving the machine, so trash it. And if we do care, you can use volumes.

We've used this quite successfully in a couple of projects at the BBC. We used it in the elections coverage, which you may have seen a few months ago for the local and European elections, and you'll see it again next week for the Scottish referendum. We also use it for another project called Kaleidoscope, which is an image-based website for BBC News.
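The trick behind this kind of local development is mostly just pointing the SDK at a local endpoint instead of Amazon. A minimal sketch with the AWS CLI; the wrapper function, the ports, and the region are all arbitrary examples, and fakes generally don't check credentials.

```shell
# Hypothetical wrapper: run an AWS CLI command against a local stand-in
# service instead of the real AWS endpoint.
local_aws() {
    endpoint=$1; shift
    AWS_ACCESS_KEY_ID=dummy \
    AWS_SECRET_ACCESS_KEY=dummy \
    AWS_DEFAULT_REGION=eu-west-1 \
        aws --endpoint-url "$endpoint" "$@"
}

# Example usage (ports depend on whatever the fakes listen on):
#   local_aws http://localhost:4568 sqs list-queues
#   local_aws http://localhost:8000 dynamodb list-tables
#   local_aws http://localhost:4567 s3 ls
```

Because the endpoint is the only thing that changes, the same application code can run against the fakes locally and the real services in production.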

I'll give you a quick overview of what Kaleidoscope is, other than just an image-based website, which is every website. An image-based website, to us, is a screenshot of a BBC News website in a specific language, which we then break into small clickable parts representing a story block: little fragments that make it easier to download for the user. The BBC has this remit where we have to supply news in 28 different languages, and we do this image-based thing because some of the devices people are using, say for Arabic-script languages, can't display the required fonts, so we have to take screenshots and display those instead. And there are 10 editions we have to do this for. So we can take a seemingly simple architecture and replace the AWS-specific components, in this case SQS, DynamoDB and S3, because we don't want to spin up all of these resources ten times for the different editions. So now we're saving licence fee payers a bit of money by not running all of that.

We also have a platform as a service, which is a bit weird because we're running two. We jumped on the idea because we have a typical sandbox VM for development, and it seemed like a good idea to move that away from being a virtual machine and host it multiple times within our cluster. This came about because we saw Docker come up repeatedly on things like Hacker News, and it became fairly popular, as we all know, since we're all here. We realised we could replace our crusty single-use virtual machines with hundreds of containers, and it helps us defeat the problems we have with our centrally provisioned infrastructure, where we can't do certain things that we would like to do. For instance, we want to be able to test and preview the things we're working on, and not wait until the very last moment for an error. (Sorry, I couldn't find the spoof version of this on YouTube; for those that don't know, this is the BBC test card from the 1980s or so.)

Our usual workflow is: build, merge, run unit tests, install that
to integration, and run some more Cucumber tests. There's a problem with that: we have to merge and have everything in our master branch in order to test it correctly on the environment, which is a huge problem for us, because we want to be able to see failures quickly, and when you have 50 developers pushing every day it becomes a massive, massive problem. With all of our developers going flat out, we're probably using four to five branches at any given point, with a queue of about 75 things waiting: experiments or features to be checked by a stakeholder. So we decided we should find a way of sharing this with people: we want to be able to show our product manager, or whoever, the new feature we've just added, as they required it. And so we came up with the idea of running a platform as a service.

We had two options: we could build something, which seemed lovely, or we could just install an open-source one, which everyone could do, I mean, there are quite a few. So we did both. I wanted to build it; I'm a developer, it's silly, but there you go. Creating a platform-as-a-service system, I came up with at least five different architectures. It turns out building one of these things is quite hard, so if you've done it, well done. Eventually I settled on creating a very thin layer around fleet's experimental HTTP API, if you've seen it at all, which changes every week or two at the moment. We submit a basic fleet unit file, and it does all the heavy lifting of cloning, building or pulling, and then running it. Then it sits behind an HAProxy reverse proxy, and we have a nice subdomain to get to our feature.
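What that thin layer effectively does can be sketched with a templated unit per branch; the unit name, registry host and subdomain scheme are all invented here, and the real wrapper drives fleet's HTTP API rather than the fleetctl CLI.

```shell
# Hypothetical per-branch preview unit; every name below is invented.
BRANCH=my-feature
cat > "/tmp/preview-${BRANCH}.service" <<EOF
[Unit]
Description=Preview environment for branch ${BRANCH}

[Service]
ExecStartPre=/usr/bin/docker pull registry.ci.internal:5000/news/app:${BRANCH}
# -P publishes the app's ports; HAProxy then routes
# ${BRANCH}.preview.example.internal to this container.
ExecStart=/usr/bin/docker run --rm -P --name preview-${BRANCH} \
    registry.ci.internal:5000/news/app:${BRANCH}
ExecStop=/usr/bin/docker stop preview-${BRANCH}
EOF

# On a host with fleet access, the wrapper would then run:
#   fleetctl submit /tmp/preview-${BRANCH}.service
#   fleetctl start preview-${BRANCH}.service
```

One unit per branch is what makes "run 20 different branches side by side" cheap: the branch name is the only variable.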

But there are a lot of ways you can build a Docker container and run one, and how do you manage all the different variants? Whilst for our sandbox we know exactly what's going to happen and what we need to do, we need to make it very generic for development services. If we want to run dashboards and such in containers, which we are doing in different ways, we don't want special cases; we just want to run make build, or make run, or fig up and fig destroy, or whatever, and move the complexity elsewhere.

We also investigated using open-source tools: Flynn, Deis and Dokku. We evaluated these quite quickly as they popped up, one by one, checking out their individual use cases. Dokku and Flynn are really cool, but they're limited to Heroku buildpacks, and no one wants to maintain buildpacks, because you don't want to do that every time you have a new service. So that left Deis. Deis is actually really cool: it does a lot of the functionality you want, you can build from a Dockerfile, and to deploy you just do a quick git push. The downside is that there's no real simple route to running the same thing multiple times with slight variations: if you want to run 20 different branches, you can't do that without removing the app markers locally, so removing the git remote and whatever else it leaves behind, which is a bit of a limitation. And Deis kind of likes running your stuff alongside its stuff: it runs a logging container, it runs a controller and whatever else, and sometimes, if your app container takes a while to start, it will fall down and you have to restart everything, and your application doesn't get registered in the reverse proxy. That has probably been fixed since I wrote these slides.

So, where's all this going? BBC News now has these two clusters running Docker containers, one for the tool we've made and one using Deis. Both are capable of building and showing branches, and they're equally performant and about as stable
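The Deis flow just described, as a sketch; the app name and helper are illustrative, and this assumes a Deis (v1-era) cluster with its git-push deployment model, where deis create registers the app and adds a "deis" git remote.

```shell
# Hypothetical deploy of one branch as its own Deis app.
deploy_preview() {
    branch=$1
    git checkout "$branch" &&
    deis create "preview-${branch}" &&   # registers the app, adds a git remote
    git push deis "${branch}:master"     # Deis builds from the repo's Dockerfile
}
```

The branch:master push syntax is what lets a non-master branch be deployed; cleaning up afterwards is the part the talk flags as awkward, since the app and remote have to be removed by hand.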
as each other. Neither is perfect, but we obviously want to whittle this down to one, or maybe move to something completely different, like Helios by Spotify, which looks pretty useful. And then hopefully we'll get to a point where other people in BBC News can use this, so it's not just developers, it's not just me, it's stakeholders too. Maybe we can cover the whole flow, so when we push to GitHub it automatically picks up the webhook, and we just extend our toolsets. That's actually the end of my slides. Does anyone have any relevant questions?

Q: I was just wondering how you're actually faking the AWS services. I imagine ElastiCache is just a local Redis?

A: Some of it, for us, is a local memcached, because that's what we're used to in the BBC, but there are quite a few gems out there. I mean, Amazon themselves provide the fake DynamoDB, there are a lot of RubyGems to fake S3 and SQS, and there's also a Java project for SQS, which we tried running for a little while.

Q: How do you provide these Jenkins instances to your teams? You said you're running this Jenkins infrastructure; can teams provision their Jenkins infrastructure themselves, using one of the two PaaS solutions you have in-house?

A: We're essentially running everything on virtual machines on AWS instances. The development teams themselves get to write a Dockerfile and just leave it inside their repo, and we can either build that for them or they can build it and push it to the registry. So it's not so much a case of having to provision a different Jenkins system for everyone; it's a case of our Jenkins system being for BBC News, which we're no longer sharing with Sport, iPlayer and whoever else. We've got one just for BBC News, and a Docker container should be enough for each

individual development team within News. So it's less about provisioning a slave per team, and more about them provisioning what they want inside that container.

Anyone else? No? I'm way ahead of schedule, so I'm getting off easy. OK, thank you.