Over the last four years, UK incumbent operator BT has completely rebuilt large parts of its operations support system (OSS) estate using open-source software, reducing cost and increasing the agility of its operations. This report outlines how BT went about this and what benefits it brought.
A transformation in four years and five projects
When BT began its OSS transformation, it was running around 350 different IT systems to support its network operations, many of them 15–20 years old. Each business unit (global, consumer, fixed, enterprise, etc.) had accumulated its own systems over the years. There was significant scope for rationalization, and even today, after a 40% reduction, BT’s OSS stack has around 180 different systems.
Responsibility for OSS lies with BT Networks, an internal unit that builds and operates BT’s networks, data centers, platforms, and OSS systems. BT Networks serves the customer-facing units of the company: consumer, enterprise, and global (for multinationals). BT’s Vivek Murthy, director of Hybrid Cloud and OSS, led the design, build, and operation of the next-generation OSS that manages networks and data centers across BT’s UK and global infrastructure.
Under Murthy’s leadership, BT has built a talented team of around 300 engineers, designers, and architects across India and the UK. The team has created several new software applications that have improved the way BT’s networks are designed, built, and operated. For some complex areas, such as 3D visualization of real-time inventory, it also worked with small development teams via the Topcoder crowdsourcing platform. A profile of BT’s Next Gen Network Management Visualization 3D Render Challenge is available online. Other organizations that have turned to Topcoder for software development include Adobe, Microsoft, and NASA.
Before deciding to build its own applications, BT scoured the market for suitable commercial OSS offerings, and many of BT’s OSS are still supplied by vendors. However, the general thrust of BT’s OSS refresh is that by building its own software using open-source components, it has been able to create solutions that are more closely aligned with its business processes and workflows. Moreover, it has done this more economically than if it had continued buying commercial software. BT also studied the Linux Foundation’s ONAP project at one point but decided that it was simpler to architect its own solutions leveraging discrete, smaller-scale open-source projects.
The OSS transformation included five key projects:
- Workflow management tool for the end-to-end plan and build process
- Unified inventory
- Catalog for service and resource management
- Service activation engine
- Fault management system for service assurance
Plan and build
Like most large telcos, BT had multiple systems for planning and managing the buildout of its networks. This disjointed approach slowed the evolution of its networks and lengthened its time to market. BT needed more intuitive solutions that could handle end-to-end planning. It wanted a solution that could facilitate desk-based planning, allowing engineers to understand the state of network resources while working remotely. This would enable them to easily identify the need for additional switches and routers in a local exchange (central office) or data center.
The new workflow systems leverage open-source workflow and decision automation platforms such as Camunda and jBPM. By standardizing the network planning process and increasingly using TOSCA templates for planning rules, BT was able to achieve a high level of automation. With the click of a mouse, an engineer can start a workflow that includes the physical capacity build and cabling, for example.
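To make the idea of an automated plan-and-build workflow more concrete, the sketch below shows a decision point followed by a fixed sequence of build tasks. It is an illustration only and does not use the Camunda or jBPM APIs: the `PlanRequest` type, the step functions, and the 80% utilization threshold are all hypothetical names and values chosen for the example, not details of BT’s system.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Assumed planning rule, for illustration only: build more capacity
# once more than 80% of the ports at a site are in use.
UTILISATION_THRESHOLD = 0.8


@dataclass
class PlanRequest:
    """Hypothetical input to the workflow: one site and its port usage."""
    site: str
    ports_total: int
    ports_used: int
    log: List[str] = field(default_factory=list)


def needs_capacity(req: PlanRequest) -> bool:
    """Decision point that would otherwise be a manual judgement."""
    return req.ports_used / req.ports_total > UTILISATION_THRESHOLD


def order_switch(req: PlanRequest) -> None:
    req.log.append(f"order switch for {req.site}")


def schedule_cabling(req: PlanRequest) -> None:
    req.log.append(f"schedule cabling at {req.site}")


def update_inventory(req: PlanRequest) -> None:
    req.log.append(f"update inventory for {req.site}")


# The build tasks run in this order once the decision point says "build".
BUILD_STEPS: List[Callable[[PlanRequest], None]] = [
    order_switch,
    schedule_cabling,
    update_inventory,
]


def run_plan_and_build(req: PlanRequest) -> List[str]:
    """Evaluate the rule, then run every build step in sequence."""
    if not needs_capacity(req):
        req.log.append("no build required")
        return req.log
    for step in BUILD_STEPS:
        step(req)
    return req.log
```

In a real workflow engine the steps and the rule would be defined as process and decision models rather than Python functions, but the shape is the same: one trigger, an automated decision, and a repeatable chain of tasks.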
BT’s network delivery director, Martin Wood, who worked closely with Murthy and his team, believes that reimagining processes and using open-source tooling has yielded rich dividends. BT used to have many hundreds of people involved in these plan-and-build processes. Tasks involved many steps and human decision points. With the new software, the headcount involved has been reduced by 60%, and the remaining staff increasingly focus on more sophisticated, DevOps-like work. The more automated approach has reduced the time for capacity planning by up to 50% and increased the accuracy of designs.
Unified inventory
We often hear from operators that inventory is a major pain point for their operations and planning. Inventory data resides in isolated silos and is often inaccurate and out of date. To address this challenge, BT developed a unified solution: the Service and Resource Inventory Management System (SRIMS).
The team used an open-source NoSQL graph database for the solution, as opposed to the traditional relational database approach. Graph databases are good at modeling complex environments with multiple relationships between entities. As such, they are well suited to telecom networks, where queries that follow long chains of relationships (from a port to a service to a customer, for example) can typically be answered much faster than in a relational database.
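The following sketch illustrates why a graph model suits inventory data. It uses a small in-memory graph in plain Python rather than a real graph database, and every name in it (the `InventoryGraph` class, the `HAS_PORT`, `CARRIES`, and `SERVES` relationship labels, and the sample devices and customers) is an assumption made for the example, not BT’s schema.

```python
from collections import defaultdict, deque
from typing import Dict, List, Set, Tuple


class InventoryGraph:
    """Toy inventory graph: typed nodes joined by labelled, directed edges."""

    def __init__(self) -> None:
        self._edges: Dict[str, Set[Tuple[str, str]]] = defaultdict(set)
        self._kind: Dict[str, str] = {}

    def add_node(self, name: str, kind: str) -> None:
        self._kind[name] = kind

    def relate(self, src: str, relation: str, dst: str) -> None:
        self._edges[src].add((relation, dst))

    def neighbours(self, node: str, relation: str) -> List[str]:
        """Nodes one hop from `node` over edges labelled `relation`."""
        return sorted(d for r, d in self._edges[node] if r == relation)

    def reachable(self, start: str, kind: str) -> List[str]:
        """Breadth-first search for every node of `kind` below `start`."""
        seen, queue, found = {start}, deque([start]), []
        while queue:
            node = queue.popleft()
            for _, nxt in self._edges[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
                    if self._kind.get(nxt) == kind:
                        found.append(nxt)
        return sorted(found)


def free_ports(graph: InventoryGraph, router: str) -> List[str]:
    """Ports on `router` that carry no service."""
    return [p for p in graph.neighbours(router, "HAS_PORT")
            if not graph.neighbours(p, "CARRIES")]


def build_example() -> InventoryGraph:
    """Hypothetical sample data: one router, three ports, two services."""
    g = InventoryGraph()
    g.add_node("router-1", "router")
    for port in ("port-1", "port-2", "port-3"):
        g.add_node(port, "port")
        g.relate("router-1", "HAS_PORT", port)
    for svc, port, cust in (("svc-vpn", "port-1", "acme"),
                            ("svc-bb", "port-2", "jones")):
        g.add_node(svc, "service")
        g.add_node(cust, "customer")
        g.relate(port, "CARRIES", svc)
        g.relate(svc, "SERVES", cust)
    return g
```

With this sample data, `free_ports(build_example(), "router-1")` returns the single unused port, and `reachable("port-1", "customer")` walks from a physical port through the service layer to the affected customer. In a relational schema the same question would usually need a join per hop.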
With SRIMS, BT has been able to consolidate physical, logical, and service inventory into a single source of truth. The time to onboard new network capabilities has been significantly reduced, and with a real-time view of network capacity, resources can be quickly reallocated, thereby reducing the time to roll out new services.
SRIMS has a 3D visualization tool that offers engineers a service view, customer view, product view, and physical view, all in real time. Network planners no longer need to rely on spreadsheets to find out how many ports on the back of a router are free. This has significantly simplified the design and planning process. The 3D visualization tool was developed at very low cost in collaboration with the Topcoder team, as previously discussed. After the initial competition, the winner was hired to scale up the solution to meet BT’s needs.
Thus far, BT has consolidated multiple inventory systems into SRIMS, with the remaining systems planned for migration later this year and next. BT has saved millions of pounds with SRIMS by retiring multiple homegrown and commercial inventory systems. Paying for enterprise support is still an important part of the techno-commercial model with open-source projects, but the costs are a fraction of what BT used to pay for its legacy commercial inventory systems.
Catalog, activation, and assurance
The other major OSS projects were for service catalog, activation, and assurance. They have a similar rationale of retiring multiple commercial and homegrown solutions and replacing them with newly developed applications based on open-source components.
BT had multiple catalogs that it has unified in a single catalog for service and resource management. The catalog is based on YANG data models that describe configuration information for network devices and services. TOSCA templates add modeling constructs to the catalog in support of orchestration.
For service activation, BT moved from multiple, siloed systems to what it calls its open-source-based, next-generation activation engine (NGAE), a cross-domain activation tool that can provision complex services over hybrid networks: fixed, wireless, IP/MPLS, SDN, and NFV, for example.
For service assurance, BT developed a Camunda-based fault management system to manage its incident and change-management processes. It collects network telemetry during the day and works alongside another homegrown, open-source-based network performance management tool that recommends which customers should be moved to new VLANs to avoid congestion. It then triggers a workflow in the plan-and-build system to move those customers over the course of a few seconds. Operations teams were initially nervous about letting the system make these decisions autonomously, but after a year of trials it has proved its accuracy and won their confidence. BT thinks the system could be improved further if network vendors would give it access to all the telemetry it would like. BT is still a year away from being able to use rich telemetry to move fully from reactive fault fixing to proactive and preemptive measures.
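The closed loop described above (measure, recommend, act) can be sketched as a simple greedy rule. This is an illustration under stated assumptions rather than BT’s algorithm: the `CustomerLoad` type, the `recommend_moves` function, and the 85% congestion threshold are hypothetical, and a production tool would work on far richer telemetry.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Assumed threshold, for illustration: a VLAN counts as congested when
# its measured load exceeds 85% of its capacity.
CONGESTION_THRESHOLD = 0.85


@dataclass(frozen=True)
class CustomerLoad:
    """Hypothetical telemetry sample: one customer's traffic on a VLAN."""
    customer: str
    vlan: str
    mbps: float


def recommend_moves(
    loads: List[CustomerLoad],
    capacity_mbps: Dict[str, float],
) -> List[Tuple[str, str, str]]:
    """Return (customer, from_vlan, to_vlan) moves that relieve congestion.

    Greedy rule: look at the heaviest customers first and move one only
    if its VLAN is congested and another VLAN can absorb it without
    becoming congested itself.
    """
    usage = {vlan: 0.0 for vlan in capacity_mbps}
    for load in loads:
        usage[load.vlan] += load.mbps

    moves: List[Tuple[str, str, str]] = []
    for load in sorted(loads, key=lambda l: l.mbps, reverse=True):
        src = load.vlan
        if usage[src] / capacity_mbps[src] <= CONGESTION_THRESHOLD:
            continue  # source VLAN is not congested, or no longer is
        candidates = [
            v for v in capacity_mbps
            if v != src
            and (usage[v] + load.mbps) / capacity_mbps[v] <= CONGESTION_THRESHOLD
        ]
        if not candidates:
            continue  # nowhere safe to move this customer
        # Choose the target that ends up least utilised after the move.
        dst = min(candidates,
                  key=lambda v: (usage[v] + load.mbps) / capacity_mbps[v])
        usage[src] -= load.mbps
        usage[dst] += load.mbps
        moves.append((load.customer, src, dst))
    return moves
```

Each recommended move would then be handed to the workflow engine as a change request, which is the step operations teams initially wanted to approve by hand.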
A reduced reliance on commercial OSS software
Overall, BT’s strategy is to simplify its OSS architecture, allowing the network infrastructure to be consumed through a set of well-defined APIs (infrastructure as code) so that BT’s customer-facing units can easily design and sell innovative services. Simplification does not mean throwing away all the existing systems and building everything again from scratch using open source. BT continues to use some commercial OSS applications. ServiceNow is increasingly finding traction for ticketing, incident, and change management. Other examples include Ciena Blue Planet (used for service orchestration), SevOne (network management), and EMC Smarts (alarm correlation and monitoring). However, the transformation has enabled BT to shut down many legacy commercial and homegrown applications in areas such as inventory and capacity build.
BT has found that commercial OSS products work well for business processes that are generic across the industry, such as ordering spare parts or raising a purchase order. However, BT couldn’t find off-the-shelf solutions for network inventory, planning, and activation that aligned well with its own processes. These processes are what allow one network operator to differentiate itself from another; hence BT took the decision to develop its own systems using open-source software components. These systems were built in conjunction with partners, and BT relies on enterprise support for its open-source software to ensure reliability and get additional features that aren’t part of the community edition. Overall, however, BT has reduced its reliance on software vendors for OSS, reducing cost and enhancing the agility of its operations through more tailored solutions.
BT Group Update (September 2020)
James Crawshaw, Principal Analyst, Telco IT & Operations