Problem Impact Analysis
Event Occurrence: 29 June 2022 13:40
Description of the service. Why is it important, and why do we care that it had a problem? Include environmental details which are relevant to the outage.
Campus VoIP phone system
Describe what happened and the problem we want to fix. When and how was it reported? Actions taken to restore service. Duration of outage and time that service was restored. Also describe any workarounds which may have been used during the outage.
During a power outage scheduled by Facilities Services in the MBS complex, the UPS batteries were drained, causing the network equipment in that facility to go down. When power was restored, the network equipment recovered except for one fiber optic circuit to the switch that provides network connectivity to the VoIP network's DHCP server. Because the DHCP server was unreachable, VoIP phones could not renew their DHCP leases. As each lease expired, the phone dropped its IP address and was effectively offline. The issue was identified and resolved at approximately 14:20.
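The lease behavior described above can be sketched as follows. This is a simplified illustration, not a model of the actual phones or DHCP server: the lease length and the unreachable window are hypothetical values, and the function names are invented for this example. It assumes clients use the default RFC 2131 timers (renewal at 50% of the lease, rebinding at 87.5%) and lose their address only if the server is unreachable from the first renewal attempt through lease expiry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LeaseTimers:
    """Offsets in seconds from the moment the lease was granted or last renewed."""
    t1_renew: float   # client starts renewing with the server that issued the lease
    t2_rebind: float  # client starts broadcasting to any DHCP server
    expiry: float     # client must stop using the address


def lease_timers(lease_seconds: float) -> LeaseTimers:
    # RFC 2131 defaults: T1 = 0.5 * lease, T2 = 0.875 * lease.
    return LeaseTimers(0.5 * lease_seconds, 0.875 * lease_seconds, float(lease_seconds))


def loses_address(lease_seconds: float, unreachable_from: float, unreachable_until: float) -> bool:
    """Simplified model: the address is lost only if the server is unreachable
    for the whole interval from the first renewal attempt (T1) to lease expiry."""
    timers = lease_timers(lease_seconds)
    return unreachable_from <= timers.t1_renew and timers.expiry <= unreachable_until


# Hypothetical values: a 1-hour lease, server unreachable from 20 to 80 minutes
# after the lease was granted.
print(lease_timers(3600))
print(loses_address(3600, unreachable_from=1200, unreachable_until=4800))
```

Under this model, a phone whose lease was granted shortly before the DHCP server became unreachable stays online longest, which would explain phones dropping off gradually rather than all at once.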
Describe how the service should operate to provide an expected service level to the customer.
The VoIP DHCP service should be available 24/7, except during scheduled maintenance.
Detailed and technical description of the problem. Include the events which caused the outage to include failures in hardware, software, environment, processes and procedures.
The cause was a faulty fiber optic module installed in the core switch in MBS. This module provides service to the access switch in the facility.
Describe actions to take to prevent future occurrences of this outage and improve service provisioning.
The faulty fiber optic module was replaced, and the connectivity of the access switch is being reengineered to provide redundant connectivity to the campus network core in an effort to prevent future outages of this kind.
Schedule for the implementation of proposed actions.
29 June: Faulty fiber optic module located and power-cycled, restoring connectivity.
Batteries in MBS UPS refreshed in order to extend uptime.
30 June: Faulty fiber optic module replaced.
Describe follow up to ensure actions are implemented and the expectation of improvements have been met.
Circuit redundancy to be implemented as soon as possible to provide redundant connectivity to the access switch.