OnBase Outage
Problem Impact Analysis
Event Occurrence: July 21, 2025 | 4:00 PM – 7:45 PM AKDT
Background
OnBase is the university’s enterprise content management system supporting critical document management workflows across admissions, finance, human resources, and student services. The system relies on an Oracle database hosted on sw-onbase-dbl.alaska.edu, with key dependencies on NetApp-managed storage. OnBase must remain highly available, with its database and file systems functioning without interruption.
Breakdown of the Problem
- At 4:00 PM, a Grid Control alert was received indicating a database archiver error:
  ORA-00257: Archiver error. Connect AS SYSDBA only until resolved.
- Attempts to access the OnBase database server failed due to connection denials.
- PAWS began investigating and identified that the Oracle Fast Recovery Area (FRA) was not mounted.
- NetApp storage appeared to be available, but the mount was not active on the database server.
- A Zoom bridge was initiated for real-time collaboration among the PAWS team.
- A critical support request was submitted to NetApp.
- The team discovered that a recent configuration change to automount parameters—related to another environment—had inadvertently impacted the OnBase server’s ability to mount the FRA storage.
- Once the correct automount configuration was restored, the storage mounted successfully.
- The Oracle database and listener were restarted, and service was fully restored by 7:45 PM.
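The failure mode in the timeline above (storage reachable, but the FRA not mounted on the database server) can be detected with a simple mount check. The following is a minimal sketch, not the check PAWS uses. It assumes the FRA is an NFS mount and that the host exposes a Linux-style mount table. The path /u01/fra and both function names are hypothetical.

```python
from typing import Dict, List


def active_mounts(mounts_text: str) -> Dict[str, str]:
    """Map each mount point to its filesystem type.

    Expects /proc/mounts-style lines: device, mount point, fstype, options...
    Later lines overwrite earlier ones, so an NFS mount stacked on top of an
    autofs trigger at the same path is reported as NFS.
    """
    mounts: Dict[str, str] = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            mounts[fields[1]] = fields[2]
    return mounts


def missing_mounts(required: List[str], mounts_text: str,
                   fs_prefix: str = "nfs") -> List[str]:
    """Return the required paths that are not mounted with the expected type."""
    mounts = active_mounts(mounts_text)
    missing = []
    for path in required:
        # Normalize a trailing slash so "/u01/fra/" matches "/u01/fra".
        fstype = mounts.get(path.rstrip("/") or "/")
        if fstype is None or not fstype.startswith(fs_prefix):
            missing.append(path)
    return missing


# Hypothetical FRA mount point; the real path is not given in this report.
REQUIRED_MOUNTS = ["/u01/fra"]
```

On the database server, the check could be run as `missing_mounts(REQUIRED_MOUNTS, open("/proc/mounts").read())` and wired into existing alerting. Because automounted file systems may be unmounted while idle, a production check would likely need to access the path before inspecting the mount table.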
Target State / Goal
OnBase services should be available 24/7 except during planned maintenance. All critical system dependencies, including mounted storage, must be stable, monitored, and protected from unrelated configuration changes.
Root Cause Analysis
An unintended change to NetApp automount parameters, made during configuration updates for another project, caused the OnBase server’s FRA storage to fail to mount. As a result, the Oracle database could not archive redo logs, raised ORA-00257, and became unavailable to normal connections. The change bypassed impact assessment and change management procedures. A contributing factor is that no one on the PAWS team is formally trained in NetApp storage; because of turnover and limited resources, the team has not yet been able to start NetApp training.
Develop Countermeasures
- Document all critical storage dependencies for OnBase and other database systems.
- Implement pre-change validation for NetApp configuration updates.
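One way the two countermeasures above could fit together is a lookup that lists every documented system affected by a proposed automount change before it is applied. The following is a rough sketch under stated assumptions: the record layout, the map name auto.onbase, the mount point /u01/fra, and the function name are all hypothetical and do not reflect the actual CMDB schema or checklist.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class StorageDependency:
    """One documented storage dependency (hypothetical record layout)."""
    system: str         # service that depends on the mount, e.g. "OnBase"
    host: str           # server where the storage is mounted
    mount_point: str    # path that must be mounted on that host
    automount_map: str  # automount map or config entry that provides it


def impacted_systems(changed_maps: Iterable[str],
                     dependencies: Iterable[StorageDependency]) -> List[str]:
    """Return the sorted names of systems whose automount map is being changed."""
    changed = set(changed_maps)
    return sorted({d.system for d in dependencies if d.automount_map in changed})


# Example inventory; only the host name comes from this report.
DEPENDENCIES = [
    StorageDependency("OnBase", "sw-onbase-dbl.alaska.edu",
                      "/u01/fra", "auto.onbase"),
]
```

A change request that touches a map would call `impacted_systems(["auto.onbase"], DEPENDENCIES)` and require sign-off from the owner of each system returned. An empty result only means that no documented dependency matched, so the check is only as good as the dependency inventory.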
Implementation of Countermeasures
- July 23, 2025 – Storage dependencies for OnBase documented in CMDB.
- July 25, 2025 – NetApp change validation checklist updated with dependency checks.
- August 5, 2025 – Refresher on change control procedures delivered to PAWS.
- Future training: We are looking into sending PAWS members to NetApp training, based on member availability.
Follow Up / Review
- August 15, 2025 – Review implementation of countermeasures and validate improvements.
- September 1, 2025 – Update SOPs to include pre-change dependency impact assessments.