Transport as a data issue. An interview with TfL CDO Lauren Sager Weinstein

Efficiencies in London's transit systems will be made by joining more of the dots

London faces some formidable challenges when it comes to transport. Populations and public expectations are growing faster than the infrastructure can expand. Put crudely, this means squeezing more people into and out of the capital's trains, underground, roads, bike lanes and taxis more quickly, more efficiently, and as comfortably as possible.

It was always thus, of course. What has changed are the tools available to the planners and operators of London's transport system to analyse how the system is being used and to take action accordingly.

Chief Data Officer Lauren Sager Weinstein started working at TfL in 2002. Illustrative of how long it takes to increase capacity by rolling out new infrastructure, her first job was to help make the business case for a new east-west line called Crossrail. A decade and a half later, Crossrail is due to open this year.

The key to serving more passengers more efficiently is understanding their movements. When Sager Weinstein arrived at TfL, this was done primarily using paper surveys, which were slow and expensive. As well as speeding passage through the gates, the introduction of Oyster cards on the Tube in 2003 and later on the buses (a development in which she played a big part) provided an opportunity to bring modern computer analytics to bear on the problem.

"The first thing we did was to look at travel patterns on the underground because you touch [in] and touch out so that gave us a whole set of new patterns that you could see on the network," she said. "And then we extended this to understand bus travel."

Bus passengers only touch in, however, meaning that the gaps in their journeys need to be filled in algorithmically.

"We were able to use a combination of knowing where the bus location was and knowing where the passenger taps into the network to fill in and make assumptions about where people exited buses," she told Computing at a DataIQ event last week.
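The article does not say which method TfL uses, but one commonly described heuristic for this kind of gap-filling is "trip chaining": assume the passenger got off at the stop on the boarded route that lies closest to wherever they next tapped in. The following is a minimal sketch of that idea only. The function name, the stop names, the flat x/y coordinates and the `max_walk` cut-off are all hypothetical and chosen for illustration.

```python
from math import hypot


def infer_alighting_stop(route_stops, boarding_stop, next_tap_xy, stop_xy, max_walk=0.75):
    """Guess where a bus passenger got off (illustrative heuristic, not TfL's method).

    route_stops   -- stops in the order the bus visits them
    boarding_stop -- stop where the passenger touched in
    next_tap_xy   -- (x, y) location of the passenger's next touch-in
    stop_xy       -- mapping of stop name to (x, y) location
    max_walk      -- hypothetical limit on the walk to the next touch-in
    """
    # Only stops after the boarding stop are candidates for alighting.
    start = route_stops.index(boarding_stop)
    downstream = route_stops[start + 1:]
    if not downstream:
        return None

    def distance_to_next_tap(stop):
        sx, sy = stop_xy[stop]
        return hypot(sx - next_tap_xy[0], sy - next_tap_xy[1])

    # Pick the downstream stop nearest to the next touch-in.
    best = min(downstream, key=distance_to_next_tap)

    # If even the nearest stop is too far away, decline to guess.
    return best if distance_to_next_tap(best) <= max_walk else None


# Made-up example: four stops in a straight line.
stop_xy = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (2.0, 0.0), "D": (3.0, 0.0)}
route = ["A", "B", "C", "D"]

guess = infer_alighting_stop(route, "A", (2.1, 0.2), stop_xy)
print(guess)
```

In this made-up example the passenger boards at A and next touches in close to C, so the sketch infers C as the exit. When the next touch-in is nowhere near the route, it returns `None` rather than forcing an answer.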

'Don't get too esoteric'

Advances in processing power and the tools available have greatly expanded the range of analyses that Sager Weinstein's team is capable of these days. For example, they are able to differentiate between different types of users of the system - tourists, general users and commuters. Some commuters have cars, others don't. Each group has different needs, and data is used to try to provide for them as well as possible within the transport system as a whole.

Data is useful for commercial purposes too, such as in the placing of retail stores or poster locations in stations.

"We can use our smart algorithms using clustering analysis to look at patterns and that helps us understand our stations. We use that for providing signage and thinking about where retail stores in the stations should be located, and thinking about the right retail units," she said. "So we use all this data to understand travel and to give information back to our customers."

As a further benefit, passengers can now often be reimbursed proactively when there is a major hold-up in the system.

"When things go wrong we look at customer patterns and we refund individual customers that have suffered a significant disruption. My team will identify those who were significantly affected and we will send funds automatically so they don't have to go to the trouble of claiming it back."
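How the team decides who was "significantly affected" is not spelled out. One plausible reading is that each completed journey is compared with the time that trip would normally take, and cards whose journeys overran by more than some threshold are flagged. The sketch below assumes exactly that. The `Journey` record, the `expected_minutes` lookup and the 15-minute default are hypothetical placeholders, not TfL's actual data model or refund policy.

```python
from dataclasses import dataclass


@dataclass
class Journey:
    card_id: str
    origin: str
    destination: str
    minutes_taken: float


def find_delayed_cards(journeys, expected_minutes, threshold_minutes=15.0):
    """Return the sorted card IDs whose journeys overran by at least the threshold.

    expected_minutes maps (origin, destination) to a typical journey time.
    threshold_minutes is an illustrative placeholder, not a real policy value.
    """
    delayed = set()
    for journey in journeys:
        expected = expected_minutes.get((journey.origin, journey.destination))
        if expected is None:
            # No baseline for this station pair, so we cannot judge the delay.
            continue
        if journey.minutes_taken - expected >= threshold_minutes:
            delayed.add(journey.card_id)
    return sorted(delayed)


# Made-up station pairs and timings.
expected_minutes = {("X", "Y"): 20.0, ("Y", "Z"): 10.0}
journeys = [
    Journey("card-1", "X", "Y", 41.0),  # 21 minutes over the usual time
    Journey("card-2", "X", "Y", 24.0),  # 4 minutes over
    Journey("card-3", "Y", "Z", 25.0),  # 15 minutes over
    Journey("card-4", "Z", "X", 90.0),  # no baseline available
]

to_refund = find_delayed_cards(journeys, expected_minutes)
print(to_refund)
```

The resulting list of card IDs could then feed an automatic refund step, which is the part that spares passengers from having to claim.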

Reporting to TfL's chief technology officer, Sager Weinstein manages a team of around 70 systems programmers, developers of data tools, data scientists and product managers - people who "own the data product and they work on and define it so it can be useful for the organisation".

One of the main things that she has learned is to appreciate the practical limitations of working in a complex multifaceted environment. There's no point in creating a clever algorithm to optimise train schedules when no-one's going to alter the timetable on the say-so of a data scientist. Instead, the secret is often forming good relationships with other groups.

"We need to be very close to the operational teams to make sure that we don't get too esoteric," she said.

So the efforts of Sager Weinstein's team need to be focused on everyday issues. They need to be practical problem solvers, but they do so with cutting-edge technologies. Current endeavours include predictive maintenance, advanced timetabling and using machine learning to optimise operations.

Efficiency with privacy - the next challenge

London already has one of the most integrated transport systems of any big city. Future efficiencies will mostly be made by joining more of the dots, understanding more precisely how passengers travel once they're within the system. A recent TfL experiment involved tracking Tube users' smartphone Wi-Fi signals to find out where they went within the underground system. While the data was anonymised, this scheme drew the attention of privacy groups and others concerned about possible Big Brother aspects.

However, Sager Weinstein said that everything was done properly, that the experiment was well advertised in advance, and that the scheme had the blessing of, and won praise from, the ICO and technology groups.

"We kept the data held in a particular area. We had very tight access control to it and we didn't combine it with other data sets. We take a very strongly protective view of our data. It is sensitive data so we need to treat it as such," she insisted.

Striking a balance between data-driven operational efficiency and the increasing wariness of the public about use of personal data for tracking is likely to be a big part of the next chapter in the story of London's transport evolution.