By Stan Young, Michael Fontaine, Jeff Gonder, Michael Pack and Shawn Turner, Contributing Authors
In April, the authors of this story sat on a panel titled “The Mobility Data Revolution — Status and Concerns” at the ITS America Conference and Expo in Phoenix. Below, written this summer, each of the five panelists summarizes the highlights of that discussion.
How Did We Get Here? (by Stan Young, NREL)
The modern era for transportation mobility data began in approximately 2008. Before then, most traffic data came either from roadway sensors or from “eyes in the sky” traffic reports on local radio stations. I refer to this period as “Turning on the lights” for roadway traffic.
In the beginning, the lights were not bright, but it was still a major step forward.
It was the beginning of the era of vehicle probe data, a term indicating that the data came from vehicles in the traffic stream rather than from fixed sensors.
Probe data began with fleet telematics covering a small percentage (less than 1%) of the vehicles on the roadway.
The data originally came from long-haul trucks whose satellite communications reported location and speed perhaps once every 30 to 60 minutes. This was sufficient to provide travel times and speeds on the interstate network during periods of peak use.
Probe data grew from there as smartphones with integrated GPS systems proliferated. Smartphones initially communicated on 2G/3G networks, and soon fleet logistics applications emerged for various types of local and regional delivery—not just long-haul trucking.
Not only did the number of probe vehicles grow, but so did the reporting frequency: no longer once every 30 to 60 minutes, but on the order of once every one to five minutes.
In 2008, the Eastern Transportation Coalition deployed the first version of its Transportation Data Marketplace, known then as the Vehicle Probe Project. At the time, the Maryland State Highway Administration was in the midst of rolling out travel times on changeable message signs across its critical roadways statewide.
The following paraphrases a conversation with a Maryland ITS data system technical lead at the time:
“This new data source is puzzling. It is showing a slowdown in western Maryland on a rural two-lane road. How can that be? ... [distinct pause] ... Wait, I recall there may be a work zone in the area. ... [another pause] ... Yes, I checked the construction and maintenance system, and there is a repaving operation right where the probe data is showing a slowdown. Wow, this really works!”
Maryland SHA integrated probe data into the project and rolled out travel times on signs statewide at less than half the originally projected cost and two years ahead of schedule.
More recently, 4G and 5G telecom networks have expanded, smartphones have become ubiquitous, and most new cars have some form of digital connectivity.
The availability of probe-based data has grown from a small fraction of the fleet to, by some estimates, nearly one-third of vehicles on the roadway. Tools like Google Maps and Waze now deliver the majority of traveler-facing traffic information, supplanting legacy 511 systems.
Impacts on Agencies (by Michael Fontaine, VTRC)
When private sector probe speed and travel time data began to be used by transportation agencies around 15 years ago, these data sets were subjected to extensive independent validation to see whether the information could be trusted.
Since that time, private sector data has become a cornerstone of the operations and planning programs in many states.
The data sets have provided agencies with much broader spatial awareness of roadway conditions than was previously available when departments of transportation (DOTs) relied on infrastructure-based sensors, helping to improve real-time situational awareness as well as project planning and prioritization.
The types of available private sector mobility data have significantly expanded to include crowdsourced volume data, origin-destination data and waypoint data, among many others. These new data sets have created some new questions and challenges for agencies, including:
- What is “ground truth”? — There is no trusted benchmark against which to compare some of the new data sets, like origin-destination data. The community is working to define alternatives to traditional blind validation studies in order to establish trust in the data.
- Workforce and IT challenges — Many agencies lack the internal skill sets and proper IT tools to work with extremely large private sector datasets. Agencies need to adjust by hiring data scientists, improving IT infrastructure and working with consultants and academic partners to take advantage of these new data sets.
- Instability in data and providers — Agencies rely on longitudinal data to make project planning and investment decisions, and changes in algorithms or data providers can cause disruptions that compromise those analyses. Agencies need to work with vendors to ensure they are making consistent “apples to apples” comparisons over multiple years.
These new private sector data sets offer tremendous opportunity to improve agency analyses, but the proper use cases and applications need to be defined to ensure that a good return on investment is achieved.
Energy Applications (by Jeff Gonder, NREL)
Beyond its uses for traffic situational awareness and planning, high-fidelity probe data can enable enhanced transportation system energy and emissions analyses.
Stakeholders with these interests often sit apart from more traditional transportation operations and planning stakeholders, whether in sister agencies or in different departments of the same agency. This creates a potential opportunity to pool resources for purchasing, processing, and analyzing large data sets that serve multiple purposes.
Subtopics of potential interest to broader stakeholders may include quantifying transportation energy, greenhouse gas emissions, and/or criteria emissions under present-day and different future vehicle market penetration scenarios.
Further interests may include planning for electric vehicle charging infrastructure needs.
Answering such questions may require combining processed probe data with specialized modeling and simulation tools. Examples of publicly available tools for such analyses include the Future Automotive Systems Technology Simulator (FASTSim), the Route Energy Prediction Model (RouteE), and the modeling suite of Electric Vehicle Charging Infrastructure Analysis Tools (EVI-X).
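To make that combination concrete, the sketch below shows, in highly simplified form, how a processed probe speed trace might feed a rough tractive-energy estimate using a basic road-load equation. It is an illustration only, not the FASTSim or RouteE interface, and the vehicle parameters and one-second speed trace are hypothetical.

```python
# A rough sketch: estimate tractive energy for a 1 Hz probe speed trace
# using a simplified road-load equation. Not the FASTSim/RouteE API;
# the vehicle parameters and the speed trace below are hypothetical.
RHO_AIR = 1.2    # air density, kg/m^3
GRAVITY = 9.81   # m/s^2

def tractive_energy_kwh(speeds_mps, mass_kg=1800.0, cd=0.30,
                        frontal_area_m2=2.3, crr=0.009, dt_s=1.0):
    """Integrate positive tractive power over a speed trace (m/s)."""
    energy_j = 0.0
    for v0, v1 in zip(speeds_mps, speeds_mps[1:]):
        v = 0.5 * (v0 + v1)            # average speed over the time step
        accel = (v1 - v0) / dt_s       # average acceleration
        power_w = (mass_kg * accel * v                              # inertia
                   + 0.5 * RHO_AIR * cd * frontal_area_m2 * v ** 3  # aero drag
                   + crr * mass_kg * GRAVITY * v)                   # rolling resistance
        energy_j += max(power_w, 0.0) * dt_s   # count only propulsion (no regen)
    return energy_j / 3.6e6                    # joules -> kWh

# Hypothetical trace: accelerate to 15 m/s, cruise briefly, then stop.
trace = [0, 3, 6, 9, 12, 15, 15, 15, 15, 12, 8, 4, 0]
print(f"~{tractive_energy_kwh(trace):.3f} kWh of tractive energy")
```

In practice, tools like FASTSim and RouteE replace this hand-rolled physics with far more detailed vehicle and powertrain models, but the input, a speed profile derived from probe data, is conceptually the same.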
Private sector probe data does have limitations for energy applications, which may necessitate using complementary sources and approaches to produce robust analyses. For instance, analyses may require contextual information that can only be collected through a travel survey (such as details on trip purpose, co-travelers, and the alternative travel modes considered).
Establishing ground truth for energy-related applications additionally requires data from dedicated studies or experiments that use vehicle instrumentation and/or other measurements, such as roadside emissions sensors.
Current Challenges (by Michael Pack, University of Maryland CATT Lab)
More connected vehicles and devices (probes) have entered service every year since the mid-2000s. The logical conclusion is that more and better data in the form of speeds, incidents, waypoints, origins and destinations, etc. should be available on a greater number of roadways—thus further lessening agency reliance on sensors.
However, data quantity and quality may not continue to grow along the trajectory one might expect. The location data landscape is currently in chaos.
Data providers are changing (and in some cases disappearing) every couple of months, often at the mercy of upstream suppliers and constant changes in relationships.
Some of this instability results from poor business decisions by data suppliers, together with federal and state governments continuing to undervalue data.
The transportation industry does not hesitate to spend billions on infrastructure and other tangible assets, but routinely fails to justify spending a few hundred thousand dollars on the data needed for foundational decision-making and operations.
Without strong guidance and support for leveraging federal funds on data, agencies will continue to struggle to capitalize on the power of data and analytics.
This has ripple effects within the industry: because governments so drastically undervalue data, vendors struggle to set reasonable pricing that sustains growth and encourages innovation.
However, concerns over privacy and misuse of data may be the largest threat to our ability to monitor traffic, respond to emergencies, and make better-informed transportation investment decisions.
There is growing pressure in Congress to better regulate data vendors, which will ultimately impact the transportation industry and its ability to use these data sources.
Some OEMs and mobile device manufacturers have already made it more difficult to access location data, which has further disrupted the industry and led to some providers synthesizing, modeling, and even faking data.
It is in the best interest of federal, state and local governments to be proactive in supporting legislation that enables the continued access and use of these data for transportation operations and planning purposes. The loss of these data would be catastrophic.
Because the data landscape is constantly shifting, validation of data products (like volumes, origin-destination data, etc.) is imperative. These products are particularly sensitive to change and can become invalid, and potentially worthless, as soon as the underlying data sources change.
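One way to operationalize that imperative is a standing comparison against whatever ground truth the agency does control, such as permanent count stations for volume products. The sketch below, with entirely hypothetical station IDs and counts, computes a mean absolute percentage error (MAPE) and flags locations that drift past a threshold; rerunning such a check whenever a vendor's underlying sources change is one way to catch silent degradation.

```python
# A minimal sketch of revalidating vendor volume estimates against permanent
# count stations. All station IDs, counts, and the 15% threshold are hypothetical.

# Daily volumes from agency continuous count stations (treated as ground truth).
station_counts = {"CCS-101": 41250, "CCS-102": 18400, "CCS-103": 7320}

# Vendor-estimated volumes for the same locations and day.
vendor_estimates = {"CCS-101": 39800, "CCS-102": 21900, "CCS-103": 7450}

def absolute_percent_error(truth, estimate):
    return abs(estimate - truth) / truth * 100.0

errors = {sid: absolute_percent_error(station_counts[sid], vendor_estimates[sid])
          for sid in station_counts}
mape = sum(errors.values()) / len(errors)

print(f"MAPE across stations: {mape:.1f}%")
for sid, err in errors.items():
    flag = "REVIEW" if err > 15.0 else "ok"
    print(f"  {sid}: {err:.1f}% {flag}")
```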
What’s Ahead? (by Shawn Turner, TTI)
What further opportunities and threats are on the horizon in the coming years, and how can we best prepare—or hopefully even influence—our future?
The mobility data revolution will broaden to include several areas beyond basic mobility.
The past decade has seen basic mobility data (e.g., speeds, travel times, trip patterns) reach maturity, due to widespread data sampling from smartphones and other mobile devices.
In the coming decade, widespread data sampling from hundreds of sensors on connected vehicles will enable much richer and more diverse data, such as:
- Pavement roughness and friction/traction data from wheel sensors.
- Roadway asset data (from traffic control devices, work zones, etc.) via onboard video and lidar sensors.
- Driver behavior and response to static road infrastructure and dynamic onboard messages, like speed reduction, lane positioning, or trip rerouting.
Dashboard fatigue and information overload are becoming more common, so we can expect to see many more analytic software tools and platforms that use artificial intelligence (AI) to customize the information provided to different users based on their saved profiles.
AI will also be used to suggest recommended actions, just as Siri or Google Assistant try to help us better manage our busy personal lives.
Finally, mobility data analytics tools will need to become more integrated into road authorities’ decision-making tools and workflows, to avoid having data spread across many disparate applications. In most state DOTs, data integration will involve conflating third-party mobility data to the DOTs’ linear referencing system.
Ideally, this geospatial conflation can be done once, allowing systems with different location referencing methods to easily share data back and forth.
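As a minimal sketch of that conflation step, the example below (using the shapely package) snaps a probe waypoint to the nearest route in a toy linear referencing system and returns the route ID, the measure along the route, and the offset from it. The route IDs, geometries, and coordinates are hypothetical, and a production workflow would also handle map projections, direction of travel, segment endpoints, and match-quality checks.

```python
# A minimal conflation sketch: snap a probe waypoint to the nearest route in
# a toy linear referencing system (LRS). Route IDs, geometries, and coordinates
# are hypothetical; coordinates are planar (e.g., state plane meters).
from shapely.geometry import LineString, Point

lrs_routes = {
    "US-50_EB": LineString([(0, 0), (5000, 0), (10000, 500)]),
    "MD-97_NB": LineString([(2000, -3000), (2000, 4000)]),
}

def conflate_point(x, y, routes):
    """Return (route_id, measure_along_route_m, offset_from_route_m)."""
    pt = Point(x, y)
    route_id, geom = min(routes.items(), key=lambda kv: kv[1].distance(pt))
    measure = geom.project(pt)   # distance along the route centerline
    offset = geom.distance(pt)   # perpendicular offset from the centerline
    return route_id, measure, offset

# Example: midpoint of a vendor-defined segment (hypothetical coordinates).
route_id, measure, offset = conflate_point(4000, 40, lrs_routes)
print(f"{route_id}: measure {measure:.0f} m, offset {offset:.0f} m")
```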
Stan Young, PE, PhD, serves as the Mobility Innovations and Equity team leader at the National Renewable Energy Laboratory (NREL) and is the Chief Data Officer for the Eastern Transportation Coalition.
Mike Fontaine is the Associate Director for the Safety, Operations, and Traffic Engineering team at the Virginia Transportation Research Council (VTRC), the research division of the Virginia Department of Transportation (VDOT).
Jeff Gonder leads the Mobility, Behavior, and Advanced Powertrains Group in the Center for Integrated Mobility Sciences at NREL.
Michael Pack is the founder and director of the CATT Laboratory.
Shawn Turner is a Senior Research Engineer at Texas A&M Transportation Institute.