
Winning the numbers game

Can data science make railways more reliable?

  • Big data tools hold the key to improved reliability
  • New predictive insights combat delays
  • Multiple use cases for main line and metro operators

Eliminating delays is a priority for rail operators – and our customers devote huge amounts of time and money to ensuring that everything runs like clockwork.

Yet even the best-run railways can still be brought to a standstill by equipment faults, train breakdowns and engineering works that overrun. A single hold-up can trigger reactionary delays across an entire network, potentially affecting dozens of trains and inconveniencing thousands of passengers.

Digital technology can be frustrating in this respect. On one hand, it’s helping to make railways much more reliable. On the other, sifting through the mountains of data generated by digital systems can be a real challenge – particularly if you’re trying to diagnose a specific problem. 

This isn’t just a problem for railways, but for all industrial sectors. Only a fraction of the data generated by today’s estimated four billion Internet of Things (IoT) assets is used to make useful predictions. Most data is discarded or forgotten once it has served its initial operational purpose.
 
Yet operational data – both real-time and archival – is a potential goldmine for rail operators looking to boost the reliability and efficiency of their networks.

For several years, data scientists at Thales have been studying how big data tools could help customers with predictive maintenance. The solution they have come up with – TIRIS™ – is now used to monitor tens of thousands of assets.

“We use algorithms to transform multiple data inputs into an output that delivers a benefit – and in the case of TIRIS™, that benefit is a prediction,” says Nathan Marlor, Technical Solution Architect and product owner of TIRIS™ at Thales. “An example is when we monitor trackside devices such as point machines: we take all of the data, as well as things like weather data, and run it through an algorithm. The algorithm can tell you when the asset is likely to fail.”
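
In code, that pipeline resembles a standard supervised-learning workflow. The sketch below is a minimal illustration of the idea, not the actual TIRIS™ model: the feature names (swing time, motor current, temperature), the synthetic data and the assumed failure mode are all invented for the example.

```python
# Illustrative sketch only: turning point-machine telemetry plus
# weather data into a failure prediction. The features, data and
# failure pattern are assumptions, not the TIRIS(TM) model.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 5_000

# Stand-ins for the kinds of inputs described: how long the points
# take to swing (s), motor current drawn (A), ambient temperature (°C).
swing_time = rng.normal(4.0, 0.5, n)
motor_current = rng.normal(6.0, 1.0, n)
temperature = rng.normal(10.0, 8.0, n)

# Hypothetical failure mode: slow swings in cold weather fail more often.
logit = 2.0 * (swing_time - 4.5) - 0.1 * temperature
failed_soon = rng.random(n) < 1.0 / (1.0 + np.exp(-logit))

X = np.column_stack([swing_time, motor_current, temperature])
X_train, X_test, y_train, y_test = train_test_split(
    X, failed_soon, test_size=0.2, random_state=0)

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
print(f"Held-out accuracy: {model.score(X_test, y_test):.2f}")
```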
 

A key point about TIRIS™ is that it can be used to obtain insights into almost any type of asset, including trains. And it’s helping customers to tackle some of their most complex challenges.

One of these is improving the reliability of metro systems that use communications-based train control (CBTC). Trains and trackside systems on CBTC networks are in constant communication. The volume of data generated is huge, running into billions of packets.

“Monitoring the performance of the trains and trackside systems is an interesting challenge because the data volumes are so high,” says Marlor. “But there’s a limit to how much analysis you can do with conventional spreadsheets – an hour’s worth of data is about the maximum. What our customers need is to be able to look at trends over much longer periods of time – months rather than hours. And that takes special tools.”
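
One standard way to get past that spreadsheet limit is to roll the raw log up into daily summaries, so months of behaviour collapse into a few hundred rows that can be trended. The sketch below illustrates this general pattern only; the packet-loss metric, sampling rate and time ranges are assumptions, not details of TIRIS™.

```python
# Illustrative sketch: rolling a high-rate CBTC message log up into
# daily health metrics so months-long trends become visible. The
# column names and data are assumptions made for the example.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Simulate one summary record per minute over 90 days for one train --
# already far more rows than a spreadsheet comfortably handles.
idx = pd.date_range("2024-01-01", periods=90 * 24 * 60, freq="min")
log = pd.DataFrame(
    {"packet_loss_pct": rng.gamma(1.0, 0.2, len(idx))}, index=idx)

# Collapse to one row per day: typical and worst-case packet loss.
daily = log["packet_loss_pct"].resample("D").agg(["mean", "max"])

# A seven-day rolling mean exposes slow drift that an hour of raw
# data could never show.
daily["trend"] = daily["mean"].rolling(7).mean()
print(daily.tail())
```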

First and foremost, CBTC data is used to ensure safe operations. But it also contains a wealth of clues that can be used to identify technical problems long before they cause delays.

One example is predicting problems with antenna equipment. If an antenna becomes faulty, the train will stop automatically – potentially triggering lengthy delays. The secret is learning to recognise the earliest signs of trouble, so the antenna can be maintained before a service-affecting failure occurs.

To achieve this, data from CBTC logs is uploaded to TIRIS™, where algorithms sift through millions of records to pinpoint the trains that need attention – achieving in minutes something that would otherwise take days or even weeks.
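
A simple way to picture that sifting is to score every train’s error rate against the fleet baseline and surface the outliers. The sketch below shows one generic approach to this; the field names, error rates and three-sigma threshold are assumptions made for illustration.

```python
# Illustrative sketch: scoring each train's communication error rate
# against the fleet baseline to shortlist candidates for antenna
# inspection. Field names, rates and the threshold are assumptions.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
trains = [f"T{i:03d}" for i in range(40)]
records = pd.DataFrame({
    "train": rng.choice(trains, 100_000),
    "comms_error": rng.random(100_000) < 0.01,  # ~1% baseline error rate
})

# Seed one degrading antenna: train T007 errors ten times more often.
mask = records["train"] == "T007"
records.loc[mask, "comms_error"] = rng.random(mask.sum()) < 0.10

# Compare each train's error rate with the rest of the fleet.
rate = records.groupby("train")["comms_error"].mean()
z = (rate - rate.mean()) / rate.std()

# Trains several standard deviations above the fleet average are
# flagged for maintenance before a service-affecting failure.
print(z[z > 3].sort_values(ascending=False))
```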

Another example of how TIRIS™ can help metro operators is by measuring the stopping performance of trains. On networks with platform screen doors, trains need to stop at exactly the right spot. If a train stops short by a few centimetres, for example, it then has to creep forward so that the train doors and the platform doors align precisely.

“It’s not a safety issue,” says Marlor. “But if you want to get more trains through, you need to make sure they’re aligning perfectly every time they stop, or you’re adding several seconds while they realign themselves. If that’s happening consistently across the network, it soon adds up.”

The challenge for operators is that they don’t always know this is happening unless they put somebody at the end of the platform. TIRIS™ solves that problem. “It means they’re able to see exactly how each train is performing,” Marlor says. “It’s those kinds of insights that we’re able to deliver now.”
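
As a rough sketch of the analysis involved, the example below summarises logged stop offsets per train and estimates the dwell-time cost of re-alignment. The ±10 cm tolerance and the five-second penalty are assumed figures, not values from TIRIS™.

```python
# Illustrative sketch: turning logged stop offsets (metres from the
# ideal berthing point) into per-train alignment statistics. The
# tolerance and the five-second re-alignment penalty are assumptions.
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
stops = pd.DataFrame({
    "train": rng.choice([f"T{i:02d}" for i in range(10)], 20_000),
    "offset_m": rng.normal(0.0, 0.05, 20_000),
})

# Seed one train that consistently stops 15 cm short.
stops.loc[stops["train"] == "T03", "offset_m"] -= 0.15

# A stop outside +/-10 cm forces the train to creep forward, assumed
# here to cost roughly five seconds of extra dwell time.
stops["realign"] = stops["offset_m"].abs() > 0.10
summary = stops.groupby("train").agg(
    mean_offset_m=("offset_m", "mean"),
    realign_rate=("realign", "mean"),
)
summary["extra_s_per_100_stops"] = summary["realign_rate"] * 100 * 5
print(summary.sort_values("realign_rate", ascending=False))
```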

Big data insights obtained from TIRIS™ are also helping to reduce disruption when metros are re-signalled. CBTC systems rely on seamless radio coverage, which must be thoroughly tested before any trains can be run. This is a time-consuming job that is usually done during engineering shutdowns.

Shadow-mode monitoring, which uses TIRIS™ to analyse radio data gathered from trains in revenue service, speeds up the testing process. “It means you can be confident that the signal strength is above the desired threshold because you’ve been repeating that test hundreds of times a day in the weeks leading up to formal testing,” says Marlor.
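
In outline, that shadow-mode check amounts to aggregating signal-strength samples per track section and measuring how often each section clears the required threshold. The sketch below illustrates the idea; the −85 dBm threshold, the section model and the pass-rate criterion are all assumptions.

```python
# Illustrative sketch: shadow-mode analysis of signal-strength samples
# logged by trains in revenue service, aggregated per track section.
# The -85 dBm threshold and the section layout are assumptions.
import numpy as np
import pandas as pd

THRESHOLD_DBM = -85  # assumed minimum acceptable signal strength

rng = np.random.default_rng(3)
samples = pd.DataFrame({
    "section": rng.integers(0, 50, 200_000),      # track section id
    "rssi_dbm": rng.normal(-70.0, 8.0, 200_000),  # measured strength
})

# For every section, count the samples collected and the share that
# clear the threshold -- hundreds of repeat "tests" per day, for free.
coverage = samples.groupby("section")["rssi_dbm"].agg(
    n_samples="count",
    pass_rate=lambda s: (s >= THRESHOLD_DBM).mean(),
)

# Sections failing in more than 1% of samples are investigated before
# the formal commissioning tests, when track access is expensive.
print(coverage[coverage["pass_rate"] < 0.99])
```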

These are just a few examples of how big data tools are helping to solve tough technical challenges and deliver predictive insights. The same techniques can also be used with point machines, track circuits and interlockings – underlining the practical ways that data science is helping to make railways more reliable. 

How do you build a new predictive capability?

Data science plays an increasingly important part in providing predictions and diagnostics for rail operators. So how’s it done? Nathan Marlor, Technical Solution Architect and product owner of TIRIS™, explains how Thales sets about creating algorithms for specific tasks.

There are four players when you’re looking to develop a new predictive capability. The first is the software engineer. That’s the person who is going to make the big data available, store it in the best way, make sure that it’s cleansed and that it’s available for analysis. 

Then you have the data scientist who’s looking for the needle in the haystack. That will be the person pulling out all of the anomalies that are in that data and looking for any outliers.

Next you need to apply some context, so you need a domain expert. Their role is to say which anomalies are of genuine interest and which are simply expected behaviours. 

All of this gets wrapped up when we bring in the customer. Something may be a genuine anomaly. But is identifying it actually beneficial – could we do something about it? Does it have a business case behind it?
 
So between the software engineer, the data scientist, the domain expert and the customer, we triage the data and the findings, then decide if we’re going to take it forward to implement in the product.