Chalk Talk: ITIL 2011 and Best Practices on Operations for “The Cloud”

May 23, 2013 11:47 AM

Introduction

Cloud, or “The Cloud”, is everywhere: on billboards, on TV, and even our parents are asking about it. So what is it? Many papers, institutions and vendors go into great detail about their view of cloud, but we want to keep it simple here. The answer to “what is cloud?” really depends on who is asking the question: a cloud consumer or a cloud provider. For a cloud consumer, the NIST Cloud Computing Reference Architecture paper [1] gives a great view of what cloud means (it does cover some aspects of a cloud provider, but more from a technical standpoint). In short, it’s about access to IT infrastructure, IT development environments or software in a pay-as-you-use model, which can be hosted in your datacenter, in a public datacenter or in a combination of both.

The word “hosted” is emphasized here because it is the real crux of what the cloud provider is providing. If you are an infrastructure cloud provider, like Amazon Web Services (AWS), then you’re hosting resources such as virtual machines. If you’re a software cloud provider, like Cisco Security Cloud Operations, you are hosting full-blown applications. As well as hosting a resource or application, you’re also managing it on behalf of potentially multiple users or tenants, and ensuring you deliver an acceptable level of service for it. So a cloud provider is responsible for providing a hosted, managed service. For the remainder of this paper we will focus on how a provider can apply best practices to hosting and managing services using the IT Infrastructure Library (ITIL), which can be applied to both internal IT organizations and telecommunication service providers.

We will use the term service provider for both internal IT organizations and telecommunication service providers, as this is how it is often used in ITIL literature. So if you’re a service provider who has fully implemented IT Service Management (ITSM), provides five nines availability and can apportion both fixed and variable costs to your users or tenants, then stop reading, contact us and we will go into business with you! Nobody? Well, that’s probably true of every service provider out there today. Telecommunications companies typically have a better track record in monetising their networks and datacenters, so they are normally more mature when it comes to ITSM. But take a look at market trading companies or real-time IT shops, such as the ones that support Formula 1, and you will see highly efficient operations that really understand how to manage and host services.
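To put “five nines” in perspective, the arithmetic behind an availability target can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical and a 365-day year is assumed.

```python
# Assumption: a non-leap, 365-day year.
MINUTES_PER_YEAR = 365 * 24 * 60


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the maximum yearly downtime, in minutes, for an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


# Compare three, four and five nines.
for target in (99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% availability allows about {minutes:.2f} minutes of downtime per year")
```

Under these assumptions, five nines leaves a little over five minutes of downtime per year, which is why so few providers can honestly claim it.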

Could these successful service providers deal with a cloud infrastructure that flexes hour by hour? One hour there are 5,000 virtual machines, another hour 10,000? Probably not, because the key to good operations has been stability. When you introduce cloud, you potentially have to cope with a much more dynamic infrastructure or application. For example, in Cisco ScanSafe, application features are rolled out to over 3,000 hosts every two weeks, and at Google it’s faster than that. So how do you support a cloud application where constant change is the only way to stay ahead of the market?

ITIL is part of the answer. It’s not the full answer and it’s not always the best answer, but it’s a lot better than nothing. ITIL is a good starting point for building a stable operations organization: it helps build stable processes and, most importantly, it helps build standard services that can be deployed and managed time and time again in a consistent manner. It won’t help you with service agility, and it won’t help you with continuous service delivery. For those things, which you need in order to be successful as a cloud provider, you will need to borrow from COBIT, Agile, Scrum or Kanban. Even so, ITIL is an important part of being a cloud provider.

ITIL 2011

Most IT organizations have heard of or implemented ITIL in some form or other, and many telecommunications service providers have started using ITIL. The TeleManagement Forum (TMF), which publishes Frameworx, a telecom management framework, and the ITU-T have also aligned with ITIL. The TMF eTOM, the process part of the framework, is a descriptive catalog of processes for the service provider from the top down, and ITU-T TMN was built on the requirements to manage infrastructure equipment from the bottom up. ITIL is not a descriptive standard like eTOM and ITU-T TMN; it is very prescriptive when it comes to how you manage change, release and so on, and it provides best practices for IT management. ITIL version 1 was developed in the 1980s by the UK government’s Central Computer and Telecommunications Agency (CCTA), which later became part of the Office of Government Commerce (OGC), and was mainly used by government agencies. From 2001 to 2006, ITIL became the cornerstone of IT service management by introducing the service support and service delivery disciplines as part of ITIL version 2. ITIL version 3 was introduced in 2007 as a natural progression, bringing in the IT service life cycle. Its focus is much more on IT being a service, an entity in its own right that provides value to its users, rather than a series of components: the whole IT service is considered greater than the sum of its individual parts.

Figure 1 shows the Cisco Services life cycle, PPDIOO, and its alignment to the ITIL v3 life cycle phases. ITIL is concerned with consumable IT services, whereas Cisco Services is concerned with services offered around projects and delivered against an agreed SOW or contract, but the relationship is shown to set some context. The Cisco Services life cycle is a three-phase effort that maps to ITIL’s five-phase life cycle. Cisco Services is reducing some 400+ services into 10 business services that map to the Plan-Build-Manage Cisco life cycle. More on this can be found on the Cisco CEC page, along with a video on demand explaining the simplicity of the Cisco life cycle and how it is relevant to customer priorities and desired business outcomes.

Figure 1: ITIL v3 and comparison of ITIL v3 phases with Cisco phases

From a cloud perspective, Service Strategy revolves around defining which services you want to offer (infrastructure, platform or software), defining who your customers and partners will be, and managing the portfolio of existing services (for example, deciding when to retire redundant or unprofitable services).

Service Design

Service Design is about defining the standards for things like operating systems, and designing the manageability of the service, the capacity model and so on. This is where things begin to diverge depending on the type of services you are offering. ITIL is focussed on operations, so if you’re delivering infrastructure services, maybe this is all you need. But if your service is software based, then typically a lot of interaction with the development team is required. Often what you want to build in operations are standard design patterns for high-I/O workloads, secure workloads, physical big data workloads and so on, rather than trying to build bespoke solutions every time. As you begin to offer more self-service capabilities, end users or tenants need to be able to select these patterns themselves, which means all aspects of the service design domain need to be automated.

Service Transition

Service Transition is about delivering your service into a production environment. If you’re building infrastructure, then maybe you can simply progress through the release, change, and asset and configuration management processes. If your infrastructure is purely there to support a software service, then the ITIL design processes need an agile or waterfall process to connect to, because you will have a development team somewhere building and QA’ing your software. ITIL cares only about the finished software, so if you’re looking to deliver features rapidly and continuously, you really have to consider the velocity you need to achieve in and out of service design to get your service out there. This is one of the biggest issues with ITIL in a cloud environment: balancing stability of process with the application velocity needed in a cloud SaaS.

Service Operations

Other than the more dynamic nature of cloud services, Service Operations is the same as normal IT operations. A higher level of automation is expected to compensate for the more dynamic nature of the service in cloud, and request fulfillment should be automated to support more self-service, but incident management and other processes still need to be stable and effective.

Continual Service Improvement

Finally, Continual Service Improvement should be a goal for any cloud provider, as that is the only way IT organizations will compete with external providers such as Amazon, and telecommunications providers will compete with other telecommunications providers. ITIL 2011 has attempted to address some of the issues with previous versions, as well as the more dynamic, self-service nature of cloud, but it doesn’t really go far enough. Table 1 shows the major changes in ITIL 2011.

Table 1: Major Changes in ITIL 2011

Service Strategy: The following are the major changes in ITIL 2011:

  • Strategy Management for IT Services is introduced, along with the service strategy manager role. Strategic assessment and development of service strategy are removed, and the service charter and service model are introduced.
  • A dedicated demand management process has been introduced as part of service strategy.
  • Business strategy drives IT strategy, and IT strategy supports business strategy. Hence, ITIL introduced Business Relationship Management as a process in this phase. It is carried out by the Business Relationship Manager (BRM), a new role whose job is managing the customer portfolio and customer outcomes. The BRM performs customer satisfaction surveys and problem management.

The addition of demand management means that managing customer demand for services, and the capacity to support these requests, should become more manageable. Given the dynamic nature of virtualisation and cloud, this helps strengthen the strategy part of ITIL.

Service Design: The following are the major changes in ITIL 2011:

  • A new process, design coordination, is added to the service design phase, and the service design manager is responsible for it.
  • Event collection and correlation rules should be designed to aid in the detection of capacity and availability issues.
  • Service Level Management has been completely reworked following the introduction of the design coordination process activities, and is now mainly responsible for gathering service requirements as well as monitoring and reporting on service levels.

The service design changes mean that the coordination of complex service design, manageability and SLA definitions is more comprehensive, supporting the more dynamic services seen with increased virtualisation or a move to cloud.

Service Transition: The following are the major changes in ITIL 2011:

  • Change management is revised to recognize that significant changes require authorization at several levels. Two sub-processes are created: assessment of change proposals, and minor change deployment.
  • A change evaluation process has been added to address major changes.
  • Additional interfaces between service validation and testing and project management are added to ensure that project management is aware of the planning information.

Some additional interfaces between testing and project management have been added, but the need to deliver more continuous integration is not really addressed here.

Service Operations: The following are the major changes in ITIL 2011:

  • Improved guidance on 1st and 2nd level correlation of events, trends and patterns.
  • Guidelines on incident prioritization, closure and evaluation to identify any new problems.
  • Request fulfillment is expanded from a simple change or incident into a process with five sub-processes: 1. Request Fulfillment Support, 2. Request Logging and Categorization, 3. Request Model Execution, 4. Request Monitoring and Escalation, 5. Request Closure and Evaluation.
  • The interface between access management and event management is enhanced to show that correlation rules for access should be designed to prevent unauthorized access.
  • A new sub-process, proactive problem identification, is added to show the importance of problem management.

The need to address request fulfillment in a more structured manner will enable providers to offer more self-service, and enhanced event and problem management will support escalation management in a more dynamic service environment.
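To show how a structured request fulfillment process lends itself to self-service automation, the request life cycle can be sketched as a simple state machine. This is a minimal sketch rather than anything ITIL prescribes: the class names, the request ID and the "new-vm" category are hypothetical, and the first sub-process (fulfillment support) is assumed to be supporting tooling rather than a stage a request passes through.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Stage(Enum):
    """Lifecycle stages, loosely mirroring sub-processes 2 to 5 in the list above."""
    LOGGED = 1       # request logging and categorization
    EXECUTING = 2    # request model execution
    MONITORED = 3    # request monitoring
    CLOSED = 4       # request closure and evaluation


@dataclass
class ServiceRequest:
    request_id: str
    category: str  # hypothetical category, e.g. "new-vm"
    stage: Stage = Stage.LOGGED
    history: List[Stage] = field(default_factory=lambda: [Stage.LOGGED])

    def advance(self) -> Stage:
        """Move to the next stage; a closed request cannot advance further."""
        if self.stage is Stage.CLOSED:
            raise ValueError(f"request {self.request_id} is already closed")
        self.stage = Stage(self.stage.value + 1)
        self.history.append(self.stage)
        return self.stage


# Walk one self-service request through every stage.
request = ServiceRequest(request_id="REQ-0001", category="new-vm")
while request.stage is not Stage.CLOSED:
    request.advance()
print([stage.name for stage in request.history])
```

Because every request follows the same explicit stages, each transition becomes a natural hook for automation, for example triggering provisioning on model execution or raising an escalation while the request is being monitored.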

Continual Service Improvement (CSI): The following are the major changes in ITIL 2011:

  • No major changes, except that a process evaluation programme has been added in 2011.

N/A

Conclusion

For the last few years many service providers and IT organizations have been deploying IT services and using ITIL as a best practice resource to manage those services. As the same organizations implement virtualization and begin to consider cloud in all of its guises, they are looking for direction for best practices to implement a more dynamic operations model. ITIL 2011 recognized certain aspects of Virtualisation/Cloud and the more dynamic nature of the services they offer as a fundamental trend that will play a vital role in the service provider’s strategy; however, ITIL 2011 does not go far enough to address dynamic services or cloud.

ITIL 2011 aligns its definition of cloud with the NIST document [1] without adding any significant value. In doing so, ITIL has lost an opportunity to lead in this area rather than follow. Hopefully, ITIL will add cloud best practices as guidance in future revisions. Organizations like the Cloud Security Alliance (CSA) have really stepped into the service governance hole that ITIL 2011 hasn’t filled, so it remains to be seen whether ITIL will continue to be the de facto IT governance framework in years to come.

Cisco Services has been deploying cloud services with many partners, and has written best practices for cloud along the ITIL service life cycle phases. It is beyond the scope of this article to detail all the cloud best practices, but if the reader is interested in building and automating technical cloud services, the Cisco Press book “Cloud Computing: Automating the Virtualized Data Center”, written by the authors of this paper, goes into more detail. The NIST [1], CSA [2] and TMF [3] websites also provide a host of very helpful background material.

References:

[1]  http://collaborate.nist.gov/twiki-cloud-computing/pub/CloudComputing/ReferenceArchitectureTaxonomy/NIST_SP_500-292_-_090611.pdf

[2]  https://cloudsecurityalliance.org/research/grc-stack/

[3]  http://www.tmforum.org/TMForumFrameworx/1911/home.html

About the Authors:

Malcolm Orr is the lead operations architect for Cisco’s Security Cloud & Operations business unit.

Venkata (Josh) Josyula is a Distinguished Services Engineer (DSE) in the Cisco Services Technology Group (CSTG).

Cloud Computing: Automating the Virtualized Data Center

By Venkata Josyula, Malcolm Orr, Greg Page.

Series: Networking Technology.

Published: Nov 29, 2011

Copyright 2012

ISBN-10: 1-58720-434-7

ISBN-13: 978-1-58720-434-0

Published by Cisco Press.

This article is featured in the June 2013 issue of the Cisco TS Newsletter.
