
What is IT Service Management (ITSM)?

A Complete Guide for IT Directors

 


Introduction

 


IT Service Management (ITSM) is a strategic approach to designing, delivering, managing, and continually improving the IT services your organisation relies on. For IT Directors, ITSM is less about “running a service desk” and more about building an operating model where people, processes, and platforms work together to reduce risk, improve user experience, and make service performance measurable.

This guide covers the foundations of ITSM (including the essential ITIL concepts of Service Request, Incident, Change, and Problem), how tiered support models work in practice (Tier 1/2/3), and the organisational realities that often determine success or failure - especially adoption, communication, and the quality of information captured during support.

It also explores what to look for in ITSM platforms, how to right-size ITSM to your organisation’s scale, where AI and automation genuinely help, and how integrations (monitoring, identity, HR, collaboration, delivery tools) improve outcomes.

This article is written from BDQ’s practical experience helping organisations improve service management and work management operating models, adoption, and platform implementations/migrations across different sizes and maturity levels.

We've helped organisations using anything from shared mailboxes and spreadsheets through to enterprise ITSM systems that are not delivering the expected results. Sometimes an existing desk needs a few tweaks - an improved service catalogue, a new integration, maybe some automation for common use cases. At other times, the service management operating model needs to be reviewed as a whole, with the technology then configured to support it.

We hope that you find this guide useful, and if you have any questions or comments, please let us know.


Understanding ITSM: Beyond the Help Desk


What ITSM Really Means

ITSM is the practice of managing IT as a set of services that enable business outcomes. It’s the difference between:

  • Reactive IT support: fixing problems as they arise
  • Service-led IT operations: delivering consistent, measurable service outcomes - while continually improving

A practical ITSM view for IT leaders is:

  • People: roles, skills, ownership, behaviours
  • Process: repeatable ways of handling demand and risk
  • Platform: tooling that enables consistency, visibility, automation, and reporting

ITSM works when all three align.

 
Why IT Directors Adopt ITSM

Most IT Directors invest in ITSM to achieve some combination of:

  • Better risk control (especially around change and major incidents)
  • Greater transparency (what’s happening, where work is stuck, why performance varies)
  • Improved user experience (predictable outcomes and communications)
  • Scalable operations (growth without linear headcount increases)
  • Better governance and auditability
 
The Business Case for ITSM

A well-run ITSM approach typically delivers value in three areas:

Operational efficiency

  • Less duplicated triage and “ticket ping-pong”
  • Faster routing and resolution through better information and knowledge reuse
  • Repeatable handling of common work like onboarding and access requests

Risk and governance

  • Clearer ownership and audit trails for changes and approvals
  • Stronger control of major incident response and communications
  • Better prioritisation of work that impacts critical services

User experience and trust

  • A clearer “front door” for IT and predictable communications
  • Reduced frustration from repeated questions and unclear ownership
  • Confidence that issues are being managed consistently

 

The most common leadership mistake is treating ITSM as a tool rollout. ITSM succeeds when it’s implemented as an operating model change, supported by technology.

Core ITIL Concepts Every IT Director Should Know


ITIL provides a widely used foundation for ITSM. While ITIL 4 contains many practices, IT Directors typically focus on a few essential ones.

Service Request Management

Definition: handling requests for standard services or information.

Director-level focus:

  • Define a clear service catalogue (what you offer and what “good” looks like)
  • Automate repeatable requests and approvals where possible
  • Use request data to understand demand trends and capacity pressure

Example: joiner onboarding as a standard service request that triggers access, equipment, and approvals across teams.
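
As a rough illustration of the joiner example, the sketch below expands one onboarding request into per-team tasks. The team names, task titles, and approval flags are hypothetical placeholders rather than a recommended catalogue, and a real platform would model this in its own workflow engine.

```python
from dataclasses import dataclass

# Hypothetical template: (owning team, task title, needs approval).
# These entries are illustrative assumptions, not a prescribed catalogue.
JOINER_TEMPLATE = [
    ("IT Service Desk", "Create user account and mailbox", False),
    ("IT Service Desk", "Order and configure laptop", True),
    ("Security", "Assign role-based access groups", True),
    ("Facilities", "Issue building pass", False),
]


@dataclass
class Task:
    team: str
    title: str
    needs_approval: bool
    status: str = "open"


def expand_joiner_request(employee: str, start_date: str) -> list[Task]:
    """Turn one joiner request into one task per template entry."""
    return [
        Task(team, f"{title} for {employee} (starts {start_date})", needs_approval)
        for team, title, needs_approval in JOINER_TEMPLATE
    ]


tasks = expand_joiner_request("A. Example", "2025-01-06")
awaiting_approval = [t for t in tasks if t.needs_approval]
```

Keeping the template as data means the service owner can change what a joiner request triggers without touching the fan-out logic.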

 
Incident Management

Definition: restoring normal service operation as quickly as possible when disruption occurs.

Director-level focus:

  • Optimise for fast restoration and clear communications
  • Define major incident criteria and the roles/cadence to manage them
  • Reduce time wasted on unclear ownership and slow escalations
 
Change Management (Change Enablement)

Definition: controlling changes to minimise disruption and risk.

Note: ITIL 4 often refers to this as Change Enablement. Many organisations still say “Change Management” day-to-day - what matters is an approach that enables delivery while managing risk.

Director-level focus:

  • Use risk-based governance (not all changes need the same controls)
  • Make standard changes fast and safe
  • Ensure high-risk changes are visible, assessed, and communicated
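
One way to picture risk-based governance is as a small routing rule. The sketch below is a minimal example under assumed criteria: the three fields and the wording of each approval path are hypothetical, and your own risk model will likely use different inputs.

```python
from dataclasses import dataclass


@dataclass
class ChangeRequest:
    summary: str
    pre_approved: bool               # matches a documented standard change
    affects_critical_service: bool
    has_tested_rollback: bool


def approval_path(change: ChangeRequest) -> str:
    """Pick a level of control based on risk (illustrative criteria only)."""
    if change.pre_approved:
        return "standard: auto-approve and log"
    if change.affects_critical_service or not change.has_tested_rollback:
        return "high-risk: full assessment, review and stakeholder comms"
    return "normal: peer review and service owner approval"


patching = ChangeRequest("Monthly workstation patching", True, False, True)
firewall = ChangeRequest("Core firewall rule change", False, True, True)
print(approval_path(patching))
print(approval_path(firewall))
```

The point of the rule is that standard changes skip heavyweight review entirely, while anything touching a critical service or lacking a tested rollback gets full visibility.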
 
Problem Management

Definition: identifying and removing the underlying causes of incidents.

Director-level focus:

  • Separate “restore service” from “remove root cause”
  • Use trend analysis to identify systemic issues
  • Make sure problem work results in visible, preventative change (not just documentation)
 
How These Concepts Fit Together
  • Service request → fulfilled via defined workflow
  • Incident → restore service quickly
  • Recurring incidents → trigger problem management
  • Problem → root cause + permanent fix via change
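
The "recurring incidents trigger problem management" step can be approximated with a simple count. This is a sketch only: the `service` and `category` field names and the threshold of three are assumptions, and in practice you would also apply a time window.

```python
from collections import Counter


def problem_candidates(incidents, threshold=3):
    """Return (service, category) pairs seen at least `threshold` times."""
    counts = Counter((i["service"], i["category"]) for i in incidents)
    return sorted(key for key, n in counts.items() if n >= threshold)


incidents = [
    {"service": "VPN", "category": "connectivity"},
    {"service": "VPN", "category": "connectivity"},
    {"service": "VPN", "category": "connectivity"},
    {"service": "Email", "category": "mailbox full"},
]
candidates = problem_candidates(incidents)
print(candidates)
```

Each candidate would then become a problem record, with the permanent fix delivered through the change process.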

Organisational Structure and Service Delivery


The Tiered Support Model

Most organisations operate some variation of a three-tier support structure.

 

Tier 1 (First Line)
The first point of contact for users.

Responsibilities:

  • Logging and categorising issues
  • Fulfilling standard requests
  • Basic troubleshooting
  • Clear communication with users

The effectiveness of Tier 1 depends heavily on:

  • Quality of knowledge and tooling
  • Clear escalation criteria
  • Confidence and training

Tier 2 (Second Line)

Specialists who handle more complex issues.

Responsibilities:

  • Deeper technical troubleshooting
  • Documentation and knowledge contribution
  • Supporting and mentoring Tier 1

Common challenges:

  • Knowledge gaps between tiers
  • Poorly documented escalations
  • Repeated investigation of the same issues

Tier 3 (Third Line)
Experts and engineers.

Responsibilities:

  • Root cause analysis
  • Complex fixes and system changes
  • Vendor escalation
  • Architectural decisions

Tier 3 should focus on reducing future demand, not acting as a permanent escalation queue.

 
Communication Challenges and Solutions

The Tier-to-Tier Handoff Problem

Common issues:

  • lost context during escalation
  • unclear ownership
  • poor documentation and repeated questions
  • user frustration when they have to explain the issue multiple times

Practical solutions:

  1. Standardised escalation templates (minimum required information by category)
  2. Clear escalation criteria (when to escalate; what triggers it)
  3. Warm handoffs or swarming for complex issues (reduce serial delays)
  4. Knowledge capture expectations (repeatable fixes must become reusable)
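
A standardised escalation template can be enforced with a simple completeness check before a ticket is handed to the next tier. The field names and categories below are hypothetical; the idea is only that the minimum required information varies by category.

```python
# Hypothetical minimum fields per category; "default" applies to any
# category without its own entry.
REQUIRED_FIELDS = {
    "default": ["user_impact", "symptoms", "diagnostics_performed"],
    "network": ["user_impact", "symptoms", "diagnostics_performed",
                "site", "device_name"],
}


def missing_fields(ticket: dict) -> list[str]:
    """List required fields that are absent or blank for this ticket."""
    required = REQUIRED_FIELDS.get(ticket.get("category"), REQUIRED_FIELDS["default"])
    return [f for f in required if not str(ticket.get(f) or "").strip()]


ticket = {
    "category": "network",
    "user_impact": "20 users offline",
    "symptoms": "No connectivity on floor 2",
    "diagnostics_performed": " ",
    "site": "Leeds",
}
gaps = missing_fields(ticket)
print(gaps)
```

An empty result means the escalation meets the template; anything else goes back to the sender before Tier 2 spends time on it.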

 

Information Optimisation: Capture What Enables the Next Action

A useful operating principle:

Capture only what’s needed to enable the next step - and make it easy to capture the right information.

Tier 1 (essential):

  • user impact and urgency
  • clear problem statement/symptoms
  • basic diagnostics performed
  • correct category/service selection

Tier 2 (technical):

  • logs and errors
  • reproduction steps, environment context
  • technical diagnostics, workaround details

Tier 3 (root cause / long-term):

  • architecture implications
  • permanent fix recommendations
  • change approach and risk assessment
 
When Tier 3 Also Delivers Project Work: Managing “Run vs Change”

In many organisations, Tier 3 is expected to:

  1. Run the service (escalations, major incidents, operational change)
  2. Change the service (project work, improvements, migrations)

If this is not explicitly designed, you typically see:

  • BAU queues grow when projects dominate
  • projects slip when BAU spikes
  • nobody can plan because priorities shift daily
  • teams become “busy” without being predictable

This structure is not automatically wrong - but it must be managed deliberately. Effective patterns include:

Pattern A: Ring-fenced operational coverage (often the best “minimum change” option)

  • put a duty engineer on rota / assign on-call ownership
  • reserve capacity for unplanned demand
  • define what interrupts project work (e.g., P1/P2, security)

Pattern B: A single prioritisation mechanism for shared resources

Even when tiers are split-managed, decisions must be centralised:

  • one triage/prioritisation policy
  • weekly (or twice-weekly) Run vs Change review for conflicting priorities
  • explicit override rules and communications expectations

Pattern C: Swarming + shift-left to reduce Tier 3 interruptions

  • improve Tier 1/2 capability via knowledge, scripts, automation, training
  • swarm early on complex incidents to reduce handoff delay
  • trigger knowledge/process improvement when escalations repeat

Pattern D: Service-aligned teams (“you build it, you run it”)

Effective for mature organisations if supported by:

  • on-call/duty rotation
  • capacity planning expectations
  • service ownership and governance

Leadership rule of thumb:
If Tier 3 interruptions are constant and both BAU and project delivery are unpredictable, you likely need to rebalance capacity, adjust escalation flow (shift-left), or restructure ownership so both BAU and change become schedulable.

 
OLAs in a Tiered Model: Why They Matter

OLAs (Operational Level Agreements) are internal agreements between teams that make external SLAs achievable. In tiered models, OLAs reduce friction by defining what each tier commits to during handoffs.

Practical OLA components:

  • time to acknowledge/accept an escalation
  • update cadence expectations (especially for high-impact issues)
  • minimum escalation data requirements (what must be captured before acceptance)
  • rules for returning/rejecting escalations (and what “good” looks like)
  • knowledge contribution expectations for repeatable fixes

Keep OLAs lightweight initially. Start with the small set that removes the most friction (often acceptance time + escalation quality + update cadence), then mature over time.
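
To make "time to acknowledge" concrete, the sketch below checks escalations against an acknowledgement target and reports the share that met it. The targets per priority are invented for illustration and are not recommended values.

```python
from datetime import datetime, timedelta

# Hypothetical OLA targets: minutes allowed to acknowledge an escalation.
ACK_TARGET_MINUTES = {"P1": 15, "P2": 60, "P3": 240}


def ack_within_ola(priority, escalated_at, acknowledged_at):
    """True if the receiving tier acknowledged within its target."""
    target = timedelta(minutes=ACK_TARGET_MINUTES[priority])
    return acknowledged_at - escalated_at <= target


def ola_compliance(escalations):
    """Share of escalations acknowledged within target, or None if empty."""
    if not escalations:
        return None
    met = sum(
        ack_within_ola(e["priority"], e["escalated_at"], e["acknowledged_at"])
        for e in escalations
    )
    return met / len(escalations)


escalations = [
    {"priority": "P1",
     "escalated_at": datetime(2025, 3, 3, 9, 0),
     "acknowledged_at": datetime(2025, 3, 3, 9, 10)},
    {"priority": "P2",
     "escalated_at": datetime(2025, 3, 3, 9, 0),
     "acknowledged_at": datetime(2025, 3, 3, 10, 30)},
]
compliance = ola_compliance(escalations)
print(compliance)
```

Starting with a single measurable commitment like this is usually enough to surface where handoffs stall.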

The Adoption Challenge: Making ITSM Stick


Why ITSM Initiatives Fail

Most ITSM failures are not technical. Common causes include:

  1. treating ITSM as a technology project
  2. ignoring how people actually work
  3. over-engineering workflows early
  4. lack of executive sponsorship
  5. poor communication and training
 
Building Adoption Deliberately

Successful adoption strategies:

  • Start with visible pain points
  • Deliver quick, meaningful improvements
  • Involve teams in design decisions
  • Make the new way easier than the old
 
Culture Matters

Many IT teams operate in a “hero culture”, where individuals are rewarded for fixing crises.

ITSM shifts the focus to:

  • Prevention over reaction
  • Shared knowledge over individual expertise
  • Team outcomes over individual heroics

Leadership reinforcement is critical for this shift.

 

Make the new way easier than the old

If email and informal messages remain the path of least resistance, you’ll never get consistent adoption. We have seen situations where a carefully crafted service catalogue was available, yet email remained the most popular request channel by 60-70%.

Right-Sizing ITSM for Your Organisation


 

Small Organisations

  • Fewer roles, more overlap
  • Lightweight processes
  • Focus on automation and clarity

Mid-Market Organisations

  • Clear tier separation
  • Core ITIL practices
  • Department-specific SLAs
  • Integration becomes important

Large Enterprises

  • Formal governance
  • Multiple service desks or domains
  • Advanced reporting and analytics
  • Strong change and risk controls

Universal Principles

Regardless of size:

  • clear service definitions
  • measurable outcomes
  • visible ownership
  • continuous improvement
  • user-centric design
 
Right-sizing principle: start simple and earn complexity

A simpler process that is consistently adopted will outperform a complex process that looks good in workshops but is too heavy to use in practice. Early in your ITSM journey, aim for a “minimum viable process” that creates clarity and measurable outcomes - then iterate based on real usage, data, and feedback.

Adoption beats elegance. If the process is too complex to follow on a busy day, it won’t survive contact with reality, and people will start to complain that there are problems with the tools.

Essential System Features and Capabilities


When evaluating ITSM platforms (and the wider service/work management ecosystem), focus on fit to your operating model - not just feature lists.

 
Must-have capabilities
  1. Service portal and catalogue

  • clear service catalogue with requestable services
  • consistent communications and status updates
  • multi-channel intake (portal/email/chat) with consistent workflow outcomes

  2. Workflow and automation

  • conditional routing and approvals
  • SLA support (clocks, pause rules, escalation policies)
  • major incident patterns and stakeholder communications support

  3. Knowledge management

  • strong search and ownership
  • feedback loops and review cycles
  • knowledge suggestions during ticket handling

  4. Reporting and visibility

  • executive dashboards and operational dashboards
  • trend analysis: volume drivers, backlog health, service performance
  • service-level reporting (not only team-level)

  5. Integration capabilities

  • SSO/identity provisioning
  • monitoring/alert integration (context-rich incidents)
  • collaboration tools (without losing audit trail)
  • HR triggers for joiner/mover/leaver workflows
  • engineering/delivery tool integration where needed

  6. Security and audit

  • role-based access control and audit logs
  • suitable data retention/permissions for your governance requirements
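
When comparing platforms on SLA support, it helps to be clear what a clock with pause rules actually computes. The sketch below is a simplified model: the status names are assumptions, and it ignores business hours and calendars, which real platforms normally handle.

```python
from datetime import datetime, timedelta

# Assumption: statuses during which the SLA clock is paused.
PAUSED_STATUSES = {"waiting_for_customer"}


def sla_elapsed(history, now):
    """Sum the time spent in non-paused statuses.

    history is a time-ordered list of (timestamp, status) transitions.
    """
    elapsed = timedelta()
    intervals = zip(history, history[1:] + [(now, None)])
    for (start, status), (end, _next_status) in intervals:
        if status not in PAUSED_STATUSES:
            elapsed += end - start
    return elapsed


history = [
    (datetime(2025, 3, 3, 9, 0), "open"),
    (datetime(2025, 3, 3, 9, 30), "waiting_for_customer"),
    (datetime(2025, 3, 3, 11, 30), "in_progress"),
]
now = datetime(2025, 3, 3, 12, 0)
elapsed = sla_elapsed(history, now)
breached = elapsed > timedelta(hours=4)  # hypothetical 4-hour target
print(elapsed, breached)
```

In this example three hours have passed on the wall clock, but only one hour counts against the SLA because two hours were spent waiting on the customer.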
 
SLAs vs OLAs: don’t skip the internal agreements
  • SLAs are commitments to users/customers
  • OLAs define internal responsibilities that make SLAs realistic

If you define SLAs without OLAs, you often end up with “invisible delays” where teams aren’t aligned on internal expectations.

 
Platform selection: match tools to your operating model

To use a non-IT phrase, “Don’t put the cart before the horse”.

A practical framing:

  • If your priority is service governance and auditability, emphasise structured workflows, approvals, traceability, and service reporting.
  • If your priority is cross-functional work visibility, emphasise adoption, templates, workload visibility, and lightweight collaboration.

Many organisations end up with one of these patterns:

  1. one platform for service + work management, or
  2. best-of-breed connected platforms with clear integration points and ownership

What matters most is selecting products which support the operating model of your business, and which fit how your teams work. Sometimes decisions get made because of price, or because “we only want one platform”, or “we know XYZ product”.

However, if your business requirements are not met by the products within a one-platform solution, you may be in for a lot of pain or customisation. And if the cheaper solution causes inefficiencies for your most expensive resource (people), any licence savings will generally go up in smoke. Familiarity with an existing product is a good reason to choose it, provided it does what the business requires. If it does not, you simply end up with a product that doesn’t do what you need, with the scant consolation that you are familiar with it.

To repeat the phrase from the beginning - “Don’t put the cart before the horse”.

AI and Automation in Modern ITSM


AI works best when it supports good process rather than replacing it.

Where AI Adds Value
  • Ticket categorisation and routing suggestions
  • Drafting user communications and summaries
  • Knowledge article suggestions and gap identification
  • Virtual agents for high-volume, low-risk requests
 
Automation Opportunities
  • Joiner/mover/leaver workflows
  • Standard access requests
  • Pre-approved standard changes
  • Incident enrichment from monitoring and asset data
 
A Sensible AI Approach
  1. Establish clean data and taxonomies
  2. Pilot a small number of use cases
  3. Measure outcomes and adoption
  4. Scale with governance in place

Integration Strategy: Connecting Your Service Ecosystem


Modern ITSM does not operate in isolation.

Common integration points include:

  • Monitoring and alerting systems
  • Identity and access management
  • Collaboration tools
  • HR systems for onboarding/offboarding
  • Development and delivery platforms

Well-designed integrations reduce manual work, improve context, and shorten resolution times.

Implementation Roadmap


Pre-implementation

  • current state assessment and pain points
  • service definitions and priority model
  • tool selection aligned to operating model
  • governance and adoption plan

Implementation (phased)

Phase 1: Foundation

  • incident/request handling
  • basic portal/catalogue
  • initial reporting and training

Phase 2: Expansion

  • change governance and approval workflows
  • knowledge base and shift-left enablement
  • broader adoption

Phase 3: Optimisation

  • automation and deeper integrations
  • continuous improvement cadence
  • targeted AI use cases with governance
 
Success factors for IT Directors
  • stay engaged with weekly steering
  • communicate progress and wins
  • measure outcomes and adjust
  • keep the initial scope realistic

Measuring Success


Why metrics matter

Metrics are not just a scoreboard. They should change how you operate.

A useful leadership principle:

Metrics should support decisions - not just reporting.
 
Leadership-level KPIs

Common ITSM leadership measures include:

  • service performance and availability trends
  • incident volume drivers and major incident frequency
  • backlog health (age and trend, not just counts)
  • SLA compliance (by service, not only by team)
  • user satisfaction and complaint drivers
  • change success rate and change failure impact
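
For backlog health, age matters more than the raw count. A minimal sketch of age bucketing follows; the bucket boundaries are arbitrary assumptions, and comparing successive snapshots of the result would give the trend.

```python
from datetime import date


def backlog_age_buckets(opened_dates, today):
    """Count open tickets by age band (band boundaries are illustrative)."""
    buckets = {"0-7 days": 0, "8-30 days": 0, "over 30 days": 0}
    for opened in opened_dates:
        age = (today - opened).days
        if age <= 7:
            buckets["0-7 days"] += 1
        elif age <= 30:
            buckets["8-30 days"] += 1
        else:
            buckets["over 30 days"] += 1
    return buckets


today = date(2025, 3, 31)
opened = [date(2025, 3, 30), date(2025, 3, 25), date(2025, 3, 20), date(2025, 2, 1)]
snapshot = backlog_age_buckets(opened, today)
print(snapshot)
```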
 
Internal performance metrics: where OLAs add real value

OLAs are extremely useful in tiered models because they help identify where work is getting stuck internally, even when the SLA breach is only visible at the end.

Track OLAs and flow metrics such as:

  • OLA compliance by tier/team (acknowledge / start work / update cadence targets)
  • Queue time vs work time (waiting vs actively worked)
  • Handoff rate / reassignment rate (“ping-pong” across teams)
  • Escalation acceptance rate (how often escalations are accepted first time)
  • Aging work by assignment group (oldest items and trend)
  • Time to first meaningful update (especially for high-impact issues)

Used well, these metrics pinpoint bottlenecks and support capacity and process decisions.
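
Two of the flow metrics above, queue time vs work time and reassignment rate, can be derived from a ticket's assignment history. The sketch below assumes a hypothetical event format of (timestamp, team, state), with state either "queued" or "in_progress"; your platform's audit log will look different.

```python
from datetime import datetime, timedelta


def flow_metrics(events, closed_at):
    """Split a ticket's lifetime into waiting and working time.

    events is a time-ordered list of (timestamp, team, state) tuples.
    """
    queue_time = timedelta()
    work_time = timedelta()
    teams = []
    intervals = zip(events, events[1:] + [(closed_at, None, None)])
    for (start, team, state), (end, _team, _state) in intervals:
        if state == "queued":
            queue_time += end - start
        else:
            work_time += end - start
        if not teams or teams[-1] != team:
            teams.append(team)  # record each change of owning team
    return {
        "queue_time": queue_time,
        "work_time": work_time,
        "reassignments": max(len(teams) - 1, 0),
    }


events = [
    (datetime(2025, 3, 3, 9, 0), "Tier 1", "queued"),
    (datetime(2025, 3, 3, 9, 20), "Tier 1", "in_progress"),
    (datetime(2025, 3, 3, 9, 50), "Tier 2", "queued"),
    (datetime(2025, 3, 3, 13, 50), "Tier 2", "in_progress"),
]
metrics = flow_metrics(events, closed_at=datetime(2025, 3, 3, 14, 30))
print(metrics)
```

Here the ticket was actively worked for 70 minutes but waited 4 hours 20 minutes, most of it in the Tier 2 queue, which is the kind of signal that points to a handoff or capacity problem rather than a skills problem.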

Practical Examples: metrics that drive decisions


Example 1: Reducing Tier 1 → Tier 2 escalation churn

  • Metric signal: high reassignment and frequent “returned” escalations
  • Decision: tighten escalation templates, improve Tier 1 scripts/knowledge, adjust required fields by category
  • Expected outcome: fewer handoffs, faster time-to-start-work, higher first-contact resolution over time

 

Example 2: Making Tier 3 “Run vs Change” schedulable

  • Metric signal: Tier 3 queue aging spikes during project peaks; time-to-start-work becomes unpredictable
  • Decision: ring-fence operational coverage (duty engineer), set interruption rules, establish a single prioritisation cadence
  • Expected outcome: reduced context switching, more predictable BAU response and project delivery

 

Example 3: Prioritising automation investments

  • Metric signal: a few request types dominate volume and consume a large share of agent time
  • Decision: automate request + approval workflows and integrate with IAM/HR where appropriate
  • Expected outcome: faster fulfilment, reduced manual work, capacity freed for higher-value tasks
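
The metric signal in this example is essentially a Pareto analysis of request volume. A minimal sketch, assuming you can export a list of request type labels (the labels and the 80% coverage figure here are made up):

```python
from collections import Counter


def top_request_types(request_types, coverage=0.8):
    """Return the most common types that together reach `coverage` of volume."""
    counts = Counter(request_types)
    total = sum(counts.values())
    selected, running = [], 0
    for req_type, n in counts.most_common():
        selected.append((req_type, n))
        running += n
        if running / total >= coverage:
            break
    return selected


requests = (
    ["password reset"] * 50
    + ["access request"] * 35
    + ["new laptop"] * 10
    + ["other"] * 5
)
automation_targets = top_request_types(requests)
print(automation_targets)
```

Volume alone is only part of the picture; weighting each type by average handling time would give a better view of where agent time actually goes.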

 

Example 4: Fixing cross-team dependency delays

  • Metric signal: SLA breaches correlate with time spent waiting on another team or approval step
  • Decision: define OLAs, clarify “ready for fulfilment” criteria, automate chasers and requests for missing information
  • Expected outcome: reduced waiting time and fewer preventable breaches

 

Example 5: Improving major incident stakeholder confidence

  • Metric signal: complaints about poor comms even when resolution is fast
  • Decision: define comms cadence, implement major incident roles, standardise updates
  • Expected outcome: fewer escalations to leadership and better stakeholder trust

Conclusion: Your ITSM Journey


ITSM is not a destination - it’s continuous improvement. For IT Directors, success comes from balancing governance with pragmatism:

  • start with the minimum viable processes
  • prioritise adoption and clarity
  • measure what matters (including OLAs and flow)
  • invest in automation and integration where it removes friction
  • build predictable “run vs change” operating rhythms so both BAU and project delivery become schedulable

 

Next steps for IT Directors
  1. Assess your current operating model (people/process/platform)
  2. Define what “good” looks like for your services and risk profile
  3. Right-size governance and process complexity for your organisation
  4. Pick a platform strategy that matches your operating model
  5. Implement in phases and measure outcomes

This guide is part of BDQ’s Knowledge Hub, aimed at IT and service leaders looking to modernise internal service delivery, improve employee experience, and connect service management with the way work actually gets done.

 

How BDQ Can Help

 

BDQ helps organisations improve Service Management and Work Management in a way that is practical, measurable, and adopted.

We support organisations to:

  • Improve adoption and ways of working using their existing platforms

  • Assess maturity and define a realistic operating model

  • Select the most appropriate service and work management tools, or shortlist options across multiple vendors

  • Implement, integrate, and migrate platforms with minimal disruption

 


 

Next step:

Book an ITSM & Work Management assessment or platform shortlisting workshop and leave with a clear roadmap across people, process, and platform.