Cleaning up! How Northumbrian Water deployed big data to identify potential flood risks

Northumbrian Water's Adrian Holmes-Morris lifts the lid on Project SNIPeR, a big data project designed to identify potential floods before they happen

When Northumbrian Water needed to put together a system to help it better handle potential flooding incidents, business intelligence team leader Adrian Holmes-Morris found that the organisation already had pretty much all of the information it needed in-house, but that it simply wasn't being used as effectively as it could be.

"We have to provide some stats to Ofwat on performance. One of the key areas for Northumbrian Water that we needed to improve was in the pollution and flooding areas, where we scored quite low," says Holmes-Morris.

The problem, he continues, was that the information was often not being correlated and communicated quickly enough to head off potential flooding incidents before they happened. "In the initial discussion, we needed to ask, 'what data sets do we need to look at to give us the visibility we need?'

"We identified four main areas: our data loggers, attached to our CSOs - combined sewer overflows. Then there was our spatial data, our geographic information system, showing our sewer network. That needed to be combined with rainfall data, which we get from the Met Office via a service called Rain Radar, which they issue to us every five minutes.

"We amalgamate that into 15-minute reads, which we then associate with the data-logger information. And, finally, there's the asset details associated with the network," says Holmes-Morris.

"That combination enables us to manage and identify where we have issues within our network; where we have potential levels of overflow, and we can then set up alerts against them and identify where things are reaching a critical point and can then do something about it. That's essentially where this comes in," he adds.

The result of that analysis was Project SNIPeR - the Sewer Network Information Performance Reporting project, an early warning system that can highlight emerging blockages before spills occur and graphically highlight areas where flooding, or sewage spills, could be imminent.
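
The rainfall step Holmes-Morris describes, rolling five-minute Rain Radar readings up into 15-minute reads and associating them with data-logger readings, might look something like the sketch below. It is illustrative only: the function names, the choice to sum rainfall within each window, and labelling each window by its start time are all assumptions, not details of Northumbrian Water's implementation.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def bucket_start(ts, minutes=15):
    """Floor a timestamp to the start of its window (hypothetical helper)."""
    return ts - timedelta(
        minutes=ts.minute % minutes,
        seconds=ts.second,
        microseconds=ts.microsecond,
    )


def aggregate_rainfall(readings, minutes=15):
    """Sum (timestamp, rainfall_mm) readings into fixed windows.

    Summing is an assumption; a mean or a maximum would be equally plausible.
    """
    totals = defaultdict(float)
    for ts, rainfall_mm in readings:
        totals[bucket_start(ts, minutes)] += rainfall_mm
    return dict(totals)


def join_with_logger(rain_by_window, logger_reads, minutes=15):
    """Attach the matching window's rainfall to each (timestamp, level_m) read.

    Reads with no rainfall data for their window get None.
    """
    return [
        (ts, level_m, rain_by_window.get(bucket_start(ts, minutes)))
        for ts, level_m in logger_reads
    ]


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    rain = [
        (datetime(2015, 11, 1, 9, 0), 0.5),
        (datetime(2015, 11, 1, 9, 5), 1.0),
        (datetime(2015, 11, 1, 9, 10), 0.25),
    ]
    logger = [(datetime(2015, 11, 1, 9, 0), 0.8)]
    print(join_with_logger(aggregate_rainfall(rain), logger))
```

Keying both data sets on the same window start keeps the join a simple dictionary lookup, which is one straightforward way to line up feeds that arrive at different intervals.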

The big data project was based around Oracle Business Intelligence and the Oracle R Enterprise statistical analysis environment. While Oracle BI had fairly recently been introduced at Northumbrian, it had only been used on the financial side, and this was the first time that it had been deployed to support operations.

"The first thing was to get the various subject-matter experts together and to find out the process as to how they saw the data working together," says Holmes-Morris. That encompassed various business people from across the organisation, as well as analysts within the BI function, experienced in reporting tools. "Their knowledge then enabled us to identify where we could join all this information together and achieve what we wanted to do."

The company's IT and burgeoning data science teams used Oracle R, Oracle's implementation of the highly regarded, open-source statistical language, to determine "alert" statuses at the CSO locations on Northumbrian's sewer network.
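
The article does not say how the "alert" statuses are calculated. As a rough illustration of the idea only, the sketch below classifies a single CSO reading by comparing its level with a spill level, and treats a high level in dry weather as a possible blockage. The function name, the status labels and both thresholds are hypothetical, and it is written in Python rather than R purely for illustration.

```python
def alert_status(level_m, spill_level_m, rainfall_mm,
                 warn_fraction=0.75, dry_threshold_mm=0.25):
    """Return an illustrative alert status for one CSO reading.

    All thresholds are made-up defaults, not values used by SNIPeR.
    """
    if level_m >= spill_level_m:
        # At or above the level where the overflow would spill.
        return "critical"
    if level_m >= warn_fraction * spill_level_m:
        # A high level with little or no rain may point to a blockage
        # rather than storm flow (an assumption of this sketch).
        if rainfall_mm < dry_threshold_mm:
            return "possible blockage"
        return "warning"
    return "normal"


if __name__ == "__main__":
    # Made-up reading: level at 80% of spill level during dry weather.
    print(alert_status(level_m=0.8, spill_level_m=1.0, rainfall_mm=0.0))
```

In practice the thresholds would presumably be tuned per CSO using the asset and historical data the team describes.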

"Graphical representation of the performance of individual CSOs was achieved by utilising Java plug-ins external to the Oracle BI application that provided a time series view. This gave greater flexibility in the ability of the presentation, and clearer understanding of the data provision in view," says Holmes-Morris.

He continues: "And, by giving the user the capability to query the timeframe and location of interest, specific incidents could be investigated and historical correlations made to activity within the network. 'Drillable' elements within the dashboard then enabled quick access to the critical data underlying the views presented to the user."

Implementation was an iterative process that required Holmes-Morris's team to get the structure right before it could be put in front of staff as a finished product. Then, the team worked on the presentation layer that would ultimately enable ordinary users to interrogate the system - not just those proficient in the arcane language of R.

"We've got spatial representation of that information, tabular, and analytics that can provide alerts, show why something's happening and configure the information a user wants to draw out of the system."

Demonstrating, perhaps, how Apple and Android mobile phones and tablets have affected expectations of corporate IT, users also demanded "pinch to zoom" capabilities to enable them to quickly zero in on spots they wished to highlight. A third-party tool was identified and incorporated into the application to provide this.

One key lesson learned during the development, according to Holmes-Morris, was the use of "indicative mocked-up dashboards to kick off sprints. This was a valuable accelerator and engaged users from day one.

"Another was the highlighting of data cleansing activities after dashboard delivery. The project made the right decision not to clean data before the project, but to identify gaps after delivery, which provided more motivation for the business to correct source data," says Holmes-Morris.

"Rather than taking for granted that it's there and that what they're reading is right, people were looking more closely and asking whether the data in front of them is right," he adds.

Next steps include predictive modelling and forecasting further ahead, tied into asset longevity: identifying CSOs that are reaching scheduled maintenance windows, examining what has happened to them, and comparing that with other CSOs. In that way, the company could move to a predictive maintenance model, proactively fixing faults before they occur.

Northumbrian Water is one of the finalists in this year's BCS/Computing UK IT Industry Awards, which will be held on 18 November 2015.