Big data comes to transport

Telematics is now almost universal among vehicle operators, which means that a vast amount of data is being generated every day. Each data point is meaningless on its own, but could the data as a whole be mined for really useful information to improve road safety, minimise traffic disruption and reduce energy consumption, asks Toby Clark

The sheer amount of data being produced by vehicles and trailers is staggering: for example, telematics provider Michelin Connected Fleet covers more than 700,000 assets (of all types), which between them report around 2.5 billion positions a month. Incidentally, this is the brand, just launched in the UK, which comprises the former companies Masternaut in Europe, NexTraq in North America, and Sascar in South America.

“But that’s just positions,” says chief technology officer Laurent Castellani. “We are also capturing 300 million frames a day. We take the information from the vehicle, we aggregate that into a small packet of data – the frame – and we send that, secure and encrypted, to the platform.

“We push data between every 30 seconds and every two minutes, to make sure we don’t increase our telecoms costs. But the sampling of the data is happening with higher frequency: normally between one and five times a second, but we could go up to 400 times a second if needed.”
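
To put those rates in perspective, the short Python sketch below works out how many raw samples build up between two pushes. It uses only the figures quoted above; the function name is illustrative, and how the samples are actually condensed into a frame is not described here.

```python
def samples_per_push(sample_hz: float, push_interval_s: float) -> int:
    """Number of raw samples taken between two consecutive pushes."""
    return round(sample_hz * push_interval_s)


# Figures quoted above: 1-5 samples a second normally, up to 400 if needed,
# with data pushed every 30 seconds to two minutes (120 seconds).
for hz in (1, 5, 400):
    for interval_s in (30, 120):
        count = samples_per_push(hz, interval_s)
        print(f"{hz:>3} Hz, push every {interval_s:>3} s: {count:>6} samples")
```

At the top rate, a single two-minute window would cover 48,000 samples, which suggests why aggregating on the vehicle before sending helps to contain telecoms costs.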

Another tyre-related telematics company is Bridgestone Mobility Solutions, which owns Webfleet. At its recent mobility conference, Bridgestone’s vice president of data solutions & innovation, Raghunath Banerjee, said: “Connected vehicle data is vastly underutilised,” but the firm is trying to change that.

Banerjee is working on using live data to help traffic authorities direct vehicles away from congestion caused by incidents, and for longer-term studies and policymaking — to improve road maintenance, among other things. Banerjee says: “With our huge presence across Europe, we can map out the quality of the road for most cities and highways, and provide an early warning of road degradation.” Parking information (vehicle type, latitude, longitude, start and end time) is being used to inform parking provision, too.

Bridgestone Mobility Solutions is now releasing anonymised vehicle data through the HERE Marketplace. This includes latitude, longitude, time, heading, and vehicle type, averaging 15 billion data points per month. Other criteria such as temperature, fuel level, electric car battery usage and acceleration are also available.

Bridgestone says that data privacy is its ‘highest priority’, and that it has “implemented several layers and methods to anonymise and aggregate the data, including density-based spatial clustering of applications with noise” (DBSCAN) – a clustering algorithm that groups densely packed data points together and labels isolated ones as noise – “and in compliance with GDPR guidelines”.
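
For readers curious about the technique, here is a minimal, textbook-style sketch of density-based spatial clustering of applications with noise (DBSCAN) in plain Python. It is not Bridgestone's implementation, and how the firm applies the algorithm within its anonymisation process is not described here. The coordinates, `eps` and `min_pts` values are made up for illustration, and the points are treated as flat x/y positions; real latitude and longitude data would need a proper geographic distance.

```python
from math import hypot

NOISE = -1  # label given to points that belong to no dense cluster


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE marks isolated points."""
    labels = [None] * len(points)

    def neighbours(i):
        # All points within eps of point i (including point i itself).
        xi, yi = points[i]
        return [j for j, (xj, yj) in enumerate(points)
                if hypot(xi - xj, yi - yj) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already visited
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nearby = neighbours(j)
            if len(nearby) >= min_pts:
                queue.extend(nearby)  # core point: keep growing the cluster
    return labels


# Hypothetical positions: two busy spots and one isolated vehicle.
positions = [(0, 0), (0, 1), (1, 0), (1, 1),
             (10, 10), (10, 11), (11, 10),
             (50, 50)]
print(dbscan(positions, eps=1.5, min_pts=3))
```

In this example the two busy spots come out as separate clusters and the lone position is labelled as noise.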

The Data For Road Safety initiative (DFRS) was established by European transport ministers in 2017, to make use of this type of information. It says that “significantly improving road safety… requires the involvement of car manufacturers, traffic information service providers, automotive suppliers and public authorities”, and that its ultimate goal is “a collaborative safety-related traffic information ecosystem”.

Manufacturers contribute specific types of live data, including:

  • Event type (such as ABS application)
  • Event ID (a randomised identifying number)
  • Longitude and latitude
  • Heading/direction of travel
  • Time stamp.
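
As a rough illustration, the fields listed above could be held in a record like the one sketched below. This is not the DFRS message format: the class name, field names, units and the way the random ID is generated are all assumptions made for the example.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class SafetyEvent:
    """Hypothetical record holding the fields listed above."""
    event_type: str      # e.g. "abs_application" (label is an assumption)
    event_id: str        # randomised ID, not derived from the vehicle
    longitude: float     # decimal degrees (assumed unit)
    latitude: float      # decimal degrees (assumed unit)
    heading_deg: float   # direction of travel, 0-360 degrees (assumed unit)
    timestamp: datetime  # time of the event, stored in UTC


def new_event(event_type, longitude, latitude, heading_deg, timestamp=None):
    """Build an event with a fresh random ID and a normalised heading."""
    return SafetyEvent(
        event_type=event_type,
        event_id=secrets.token_hex(8),  # 16 random hex characters
        longitude=longitude,
        latitude=latitude,
        heading_deg=heading_deg % 360,
        timestamp=timestamp or datetime.now(timezone.utc),
    )


event = new_event("abs_application", longitude=-0.1276, latitude=51.5072,
                  heading_deg=370.0)
print(event)
```

The point of the sketch is that nothing in the record identifies a vehicle or driver: the ID is generated at random for each event rather than being tied to a registration or device number.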


One of DFRS’s initial goals is to deliver safety-related traffic information (SRTI) messages. These are sent to, for example, the dashboard of the vehicle to warn the driver about specific types of hazard, including:

  • Temporary slippery road
  • Animal, people, obstacles, debris on the road
  • Unprotected accident
  • Reduced visibility.


The SRTI message contains no data that can be directly linked to an individual, and DFRS is adamant that the data it uses has been properly anonymised.

The sheer amount of data is growing all the time, says Laurent Castellani: “There are two dimensions to the data: first of all the depth, so we’re capturing more and more information, sampling more physical data like temperature, pressure, acceleration, and on/off information like whether a door is locked, or a trailer is coupled.

“We’re also increasing the sampling rate, because technology is allowing us to increase the frequency with which we capture data — which is allowing us again to be more precise with whatever kind of prediction we make.”


So is the type of data going to change? The next big thing is vehicle-to-everything (V2X) communications, which include vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) exchanges. Castellani suggests that this will mainly be V2I: “Vehicle-to-vehicle [V2V] is going to be more complicated, because the mechanisms to exchange data with another vehicle will require some protocol which is far from the case today.

“What I’m seeing is more interaction between the vehicle and its static ecosystem: lights, roads, cities… We already have technology that allows us to capture information on the surroundings of the vehicle.”

So are telematics firms accumulating a vast amount of customer data? Laurent Castellani is adamant: “We are not sharing any of our customers’ data. By contract, the data we capture belongs to the customer, and stays with the customer. It’s a maximum of three months, and some customers are deleting the data after one month – they don’t want to keep the data.”

Castellani says: “We only store purely anonymised data, used to calculate, improve and refine our algorithms – predictive models that give the customer a call to action.” He gives the example of predicting fuel consumption on a particular route, “considering the weather, the load and information about the road. We are creating insight – an algorithm we can use to help the customer to make a decision.”

He adds: “We are working on tyres, because we are part of the Michelin group. The data we capture is helping us build some models of predictive maintenance around tyres — when should I do something about them? If you do a data science exercise on this, you realise that you need to compare pressure, temperature, mileage and other factors, but you can be quite accurate in your predictions. This is really helping customers, because instead of stopping every month to check all of my tyres, we can stop every six months, because we have the algorithm telling us to check this tyre because there’s probably something going on.”

Should transport firms get their own data scientists? Not really, says Castellani: “I make a distinction between a business intelligence [BI] person who can run queries [on a database] and cross-reference data to get business insight, and a real data scientist who can create a mathematical model.

“We have customers that are very knowledgeable about digesting data, because they already have platforms, tools, IT teams… Some are using a transport management system [TMS] leveraging a fleet management system to allow them to make business decisions. Can I transport an item from Frankfurt to Madrid? I need all sorts of fleet information which is not in my TMS – drivers’ hours, for instance – to allow me to make that decision.

“There are also smaller customers that don’t want data: they want insights and alerts they can use instantly.”
