VSLive! Keynote: Abel Wang on Microsoft’s DevOps Journey (Visual Studio Magazine)


"Notice I didn’t say ‘continuously delivery of code,’" he said, "because what does that give us? Piles and piles of code does us no good whatsoever. And notice that I didn’t say ‘continuous delivery of features,’ because we could be delivering feature after feature, sprint after sprint, but if it’s not what the users need or want, we’re just wasting our time. We have to make sure that we are continually delivering value. That’s what it’s all about."

That and speed, apparently. Wang showed attendees a short film clip that compared the time it took a Formula One pit crew to service a race car at the 1950 Indianapolis 500 — change the tires, gas up, and clean the windshield — with a similar pit stop during a 2013 race in Melbourne, Australia. The 50s crew took 67 seconds; the modern crew about 5 seconds. (The sound didn’t work, but the audience got the point.)

"Someone pointed out that, back in the day, they had to put gas in the car, and in the modern example, they didn’t," Wang said. "How do you explain that in the DevOps world? I said, it works perfectly in the DevOps world, because putting fuel in the car was a giant bottle neck. Guess what they did? They shifted left. Technology advanced to the point where they could build engines that are not only much more powerful, but also require much less fuel…."

"We have bottlenecks in the world of software, too," he added, "and one of the biggest is testing. Sure, we can build out these fancy CI/CD pipelines that deploy our code superfast into production. But how do we maintain quality, because we no longer have the luxury of spending weeks, if not months — if not years — doing end-to-end functional testing. So, we shifted left, where a lot of the responsibility for quality now with the developer. Instead of trying to tack on quality at the end, we build quality in from the start … by shifting left, just like the race car."

"We had a boxed product call TFS [Team Foundation Server]," Wang explained, "and every three to four years we would come out with a new version, and we thought that was great. And back in 2005, it may have been good enough. But about seven years ago we started realizing that we were getting out-innovated by our competitors. They were moving at a much faster pace, and we quickly saw that, if we did not change the way we worked, we would become obsolete."

The application would have to be re-architected from a boxed product (a CD that had to be installed on physical iron) to an app that would run in the cloud, Wang said. Even more challenging, the group itself would have to be reorganized, and familiar roles would have to be redefined or eliminated.

The new team structure recognizes only two roles: program manager and engineer. The program manager is roughly the equivalent of a product owner in the Scrum process. Everyone else is an engineer, with no distinction between developers and testers. The teams themselves were also restructured. They had operated in segregated layers: UI developers worked on the UI layer, for example, while database people worked on the database layer. The restructured teams now own the entire feature set from beginning to end, including the UI layer, the data layer, and the database itself, as well as installation, deployment, and quality. Even the workspace was reconfigured: individual offices were replaced by team rooms, where everyone works together, including the program managers.

"It was incredibly painful," he said. "We suffered a lot of attrition from all sides — management, developers, testers — because the new way of looking at things and doing things was very different from the way we did things before. And we all know no one really likes change. But if there’s one constant in our industry it’s change."

"Each team is autonomous in the sense that we get to decide what’s in the backlog," Wang explained. "We get to decide what the priority is — our program manager does. We decide which process we want to use, too, so we have some teams that are very strict Scrum users; other teams that are small-A agile; and other teams that use Kanban."

"When we first started, we all had this autonomy and we weren’t quite aligned, and everyone started using their own JavaScript framework," Wang said. "So that had to change pretty quickly. And can you imagine what it would be like trying to deploy and coordinate with 50 teams having different sprint lengths? I can tell you, because we tried it. It was a disaster. With our three-week sprints, we are in lockstep."

"VSTS services the entire globe, so deployments take a while," he said. "Because of that, we deploy through these different rings before it ends up all the way into production. It takes about a week, so we have an overlap between when we’re deploying and when the next sprint starts."

To manage these overlaps, the group sets aside a couple of days for sprint planning, and then sends an email to every member of every team explaining the plan for that sprint. At the end of the sprint, a second email that includes a video of what was accomplished goes out to the teams.

After every third sprint, all the related feature teams get together for what Wang calls a "scrum of scrums" to talk about what they did and what they’re going to do. Every six months, the teams get together to check their progress against the long-term plan (18 months) and make sure they’re on the right track. At that time, they re-evaluate and re-prioritize and make a new long-term plan.
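
The cadence described in the talk (three-week sprints, a deployment of about a week, and a scrum of scrums after every third sprint) can be laid out on a calendar with a few lines of Python. This is a rough sketch only: the function names are invented for the example, the one-week deployment is treated as exact, and the deployment is assumed to begin the moment a sprint ends, none of which the talk specifies.

```python
from datetime import date, timedelta
from typing import Tuple

SPRINT_LENGTH = timedelta(weeks=3)    # sprint length given in the talk
DEPLOY_DURATION = timedelta(weeks=1)  # "about a week" in the talk; assumed exact here


def sprint_window(first_start: date, n: int) -> Tuple[date, date]:
    """Return (start, end) of sprint n (1-based); the end date is exclusive."""
    start = first_start + (n - 1) * SPRINT_LENGTH
    return start, start + SPRINT_LENGTH


def has_scrum_of_scrums(n: int) -> bool:
    """True if a scrum of scrums follows sprint n (after every third sprint)."""
    return n % 3 == 0


def deployment_overlap_days(first_start: date, n: int) -> int:
    """Days of sprint n+1 during which sprint n's build is still rolling out.

    Assumes the rollout starts on the day sprint n ends.
    """
    _, end = sprint_window(first_start, n)
    deploy_end = end + DEPLOY_DURATION
    next_start, next_end = sprint_window(first_start, n + 1)
    overlap = min(deploy_end, next_end) - max(end, next_start)
    return max(overlap.days, 0)


if __name__ == "__main__":
    kickoff = date(2024, 1, 1)  # hypothetical start date
    for sprint in range(1, 7):
        start, end = sprint_window(kickoff, sprint)
        note = " + scrum of scrums" if has_scrum_of_scrums(sprint) else ""
        print(f"sprint {sprint}: {start} to {end}{note}")
```

Under these assumptions, each rollout occupies the first week of the following sprint, which is the overlap Wang refers to.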

"Each individual feature team is, of course, responsible for the work they need to do within a sprint," Wang said. "They are also responsible for the three-week plans. And by virtue of that, they kind of have to know and figure out what they’re going to be doing every six months. Upper management, on the other hand, is looking at the big picture. They’re looking at our 18 month scenarios. They’re also looking at the six-month picture. But they’re not looking into our plans or our backlog sprints. No micro-managing, which would cause developers to lose their minds."

"Someone once asked me, Why three weeks?" Wang said. "Is that some magical number that Microsoft came up with? The answer is, no. We tried four weeks, and that was a really long time and didn’t seem very agile to us. We tried one week, and that was a disaster. We tried two, and I think if we were a slightly smaller organization, it would have been perfect, but there’s just too much overhead when you have 500 developers working together, trying to have this aligned autonomy. Three weeks just felt great."