To plan and execute transportation projects, agencies will need to leverage data in new and sophisticated ways. Using artificial intelligence (AI), machine learning, predictive modeling, data analytics and other advanced data science tools, transportation agencies can develop innovative solutions that help them more easily visualize and understand data, model and predict complex interactions, improve reporting, and optimize investments and infrastructure funding.
While the prospect of building out technology and processes to support data science programs may sound daunting, purpose-built hardware and prepackaged data science software stacks can get data science teams and remote workers up and running within days. These future-proof solutions are secure by design and built to accommodate ever-growing data sets and high-speed processing requirements.
Using these tools, transportation organizations will have a solid foundation for moving forward with data science programs that give them the insights they need to turn their vision — and their investments — into solutions that will serve communities and improve quality of life now and for generations to come.
Getting up to speed on data analytics
Transportation organizations face the following challenges when collecting data and performing advanced analytics.
Massive data sets and insufficient processing power.
To plan transportation and infrastructure projects, organizations must be able to manipulate (sometimes in parallel) massive data sets coming from geographic information systems (GIS), video imaging, IoT sensors, public records and other sources. However, many of the device solutions that transportation agencies use for traditional planning and operations do not have sufficient bandwidth or computing power to quickly and reliably perform data analytics processes.
Complexity of setting up and maintaining data science tools.
Getting multiple versions of different data science tools to work together properly and securely delays implementation of data science solutions. The risk of delays, integration errors and security vulnerabilities is compounded when IT teams lack experience with these tools.
Getting the whole picture.
Complex graphics and 3D models require sharp images and sufficient size to be understood. The typical office desktop monitor or printer doesn’t clearly capture the detail required for advanced analytics. In addition, monitors may be too small to display images side by side, and printers
Cybersecurity risks.
As data science teams work with third-party data sets; collaborate with multiple entities; and collect data from a plethora of IoT sensors, geographic information systems, video cameras and other endpoints, the threat of breaches, ransomware and other attacks increases. Even if organizations have multiple layers of defense, device users can unwittingly introduce malware and other threats by doing something as simple as opening an email or clicking a link.
Talent shortages.
The median salary for data scientists was $164,500 in 2020, and qualified practitioners are in short supply. To attract and retain these workers, and to make the most of their time, organizations need modern, high-performance devices that allow them to model, analyze and act on data whether at home, in the field or in an office.
Roadmap for collecting and analyzing data
The following actions will help transportation organizations move in the right direction as they begin collecting and analyzing data.
Begin with a clear research goal.
Clarify high-level goals for data analysis. For example, the goal could be to improve mass transit routes and service; reduce traffic jams or roadway emissions; respond proactively to real-time road conditions; better maintain bridges and other assets; or protect wildlife habitats and natural resources in transit corridors.
Focus on an operational definition.
Translating the conceptual definition of what the project team wants to study into an operational definition of the data it will collect and what it will actually measure is a critical step for turning abstract ideas into measurable observations.
Flesh out a plan to collect, process and analyze the correct data.
This detailed process includes identifying trustworthy data sources as well as a data storage location (e.g., a data warehouse or data lake) that meets requirements for security, compliance and data availability. It also includes planning how users will prepare data to ensure its completeness, accuracy, integrity, consistency, validity and timeliness; how they will process data; and how they will analyze and interpret data to understand and demonstrate a particular result or decision. Finally, it includes determining how users will present the data (e.g., in a spreadsheet, table or graph).
“As a CIO, it was important to help leadership both in the political sphere and on the civil servant side understand what the ongoing costs would be and consider how we would pay for them.”
Center for Digital Government Senior Fellow Otto Doll
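The data preparation step in the plan above (checking completeness, validity and timeliness before analysis) can be illustrated with a minimal Python sketch. The record layout, the field names (`sensor_id`, `timestamp`, `vehicle_count`) and the 24-hour freshness window are hypothetical assumptions chosen for illustration, not requirements taken from any agency or standard.

```python
from datetime import datetime, timedelta

# Hypothetical schema for a roadside traffic sensor reading (an assumption).
REQUIRED_FIELDS = ("sensor_id", "timestamp", "vehicle_count")


def check_record(record, now, max_age=timedelta(hours=24)):
    """Return a list of data quality problems found in one sensor record."""
    problems = []

    # Completeness: every required field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            problems.append(f"missing {field}")

    # Validity: a vehicle count, when present, must be a non-negative integer.
    count = record.get("vehicle_count")
    if count not in (None, "") and (not isinstance(count, int) or count < 0):
        problems.append("invalid vehicle_count")

    # Timeliness: the reading must be no older than max_age.
    ts = record.get("timestamp")
    if isinstance(ts, datetime) and now - ts > max_age:
        problems.append("stale timestamp")

    return problems


def quality_report(records, now):
    """Count how often each problem appears across a batch of records."""
    counts = {}
    for record in records:
        for problem in check_record(record, now):
            counts[problem] = counts.get(problem, 0) + 1
    return counts
```

A team might run `quality_report` on each incoming batch and hold back any batch whose problem counts exceed an agreed threshold. Accuracy, integrity and consistency checks usually need a reference data set or cross-record rules, so they are left out of this sketch.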
Leverage existing data sets.
The data.gov website is a federal repository for open data and has cataloged more than 9,500 data sets on transportation alone. Organizations can use this and similar tools to research and compare their initiatives to similar projects that other departments of transportation or transit agencies have done. Data sets frequently include information about what the agency did, how it did it, the results it found and lessons learned or best practices.
Understand data standards.
As agencies embark on new types of projects, it’s important to understand existing and emerging data standards that will impact their application and technology choices. Understanding (and adopting) government and industry data standards will help agencies simplify implementations and be better positioned for future innovation. The data.gov site as well as organizations such as the American National Standards Institute (ANSI) are good sources of information and guidance on data standards.
Conduct a technology readiness assessment.
The Federal Highway Administration (FHWA) Exploratory Advanced Research (EAR) program’s technology readiness level assessment is a popular and efficient tool for evaluating the maturity of highway research projects. The tool can help transportation agencies evaluate the maturity of a technology and better understand next steps for moving forward.
Keep learning.
Data analytics, its application in transportation agencies and the technology used to support analytics are all advancing at a rapid pace. To keep up, it’s best to regularly devote time to learning more, whether through government websites, industry associations, peers, conferences or vendors.
Foundational devices and solutions: Sustainability is key
Having the right foundational devices helps ensure organizations have the processing power, graphic displays, printing capabilities and cybersecurity protection they need to perform modern data analytics tasks. When implementing or augmenting a device program to support data analytics, sustainability and cybersecurity are top considerations.
Future-proof investments.
AI, machine learning, automation and other data-intensive tasks require devices that can rapidly process large (in some cases, massive) volumes of data. To make data-driven decisions as transportation projects are funded and to move with agility now and in the future, organizations must be prepared to handle these large data sets. When analyzing solution costs, it’s worth the time and research to compare costs for upgrades that could handle double or triple the program’s currently projected data volumes. Investing in devices with that extra memory and processing power now will help extend the life of the device as data and complexity grow over time.
Budget for long-term cost of ownership.
Planning and budgeting for device costs over time is another important step that helps ensure sustainability. While transportation organizations will likely use IIJA and other funding for device purchases and one-time capital expenses, they’ll need to consider how they fund long-term costs such as device management, maintenance, operating system updates/upgrades and application patches.
Build in security from day one.
As an overall best practice, organizations should always ask vendors about their supply chain and how they protect it. When considering workstations, laptops and other devices, organizations should also ask about built-in security features. Advanced workstations and laptops have built-in deep-learning tools that look for behavioral anomalies and issue alerts when they detect a potential threat. They also include processes such as micro virtualization, which creates a virtual environment when a user clicks a web link. The link is opened in isolation from the network. If malware exists, it’s activated and contained within the virtual space, and then the space disappears.
Choose purpose-built devices and solutions.
Organizations will need the following devices and technology solutions to support a modern data analytics program.
Workstations and laptops designed for data science and complex computations. These devices are designed for high performance and reliability so data scientists and other data workers can quickly perform multiple data-intensive tasks simultaneously (e.g., when training data models or creating data visualizations). These devices also have built-in tools that detect and help thwart potential threats, as well as a data science stack manager that contains tightly integrated tools for performing computations, manipulating data, creating and sharing visualizations, and more.
Curved, high-resolution monitors for 3D design and modeling. Curved, high-resolution (e.g., 4K) displays allow data scientists and other data-focused workers to visualize data more clearly. Curving a large monitor reduces eye strain by putting more content within the viewer’s natural field of view. A 4K monitor (3840 x 2160 pixels) has four times as many pixels as a standard Full HD monitor (1920 x 1080 pixels), so images are sharper and more detailed.
Advanced printers. Understanding, revising and commenting on models, graphics, and the overall organization or workflow of a complex system is sometimes easier on paper. Data science programs should include large-format, high-resolution printers for times when digital images need to be transferred to paper or 3D printing for rapid prototyping, collaboration and other needs.
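The sizing and budgeting guidance in this section (comparing upgrades that handle double or triple the projected data volumes, and planning for recurring costs beyond the purchase price) can be made concrete with a small Python sketch. All function names, prices, growth rates and service lives below are hypothetical placeholders for illustration. They are not vendor figures or figures from this paper.

```python
def years_of_headroom(headroom_multiple, annual_growth):
    """Whole years before compounding data growth exceeds provisioned capacity.

    headroom_multiple: capacity relative to today's volume (2 = double).
    annual_growth: fractional yearly growth in data volume (0.5 = 50%).
    """
    if annual_growth <= 0:
        raise ValueError("annual_growth must be positive")
    years = 0
    volume = 1.0  # today's volume, normalized to 1
    while volume * (1 + annual_growth) <= headroom_multiple:
        volume *= 1 + annual_growth
        years += 1
    return years


def total_cost_of_ownership(purchase_price, annual_recurring, years):
    """Purchase price plus recurring costs (management, maintenance, updates)."""
    return purchase_price + annual_recurring * years


def annualized_cost(purchase_price, annual_recurring, service_years):
    """Total cost of ownership spread over the device's service life."""
    total = total_cost_of_ownership(purchase_price, annual_recurring, service_years)
    return total / service_years


# Hypothetical comparison: a base device retired after 3 years versus an
# upgraded device that stays useful for 5 years.
base = annualized_cost(purchase_price=3000, annual_recurring=500, service_years=3)
upgraded = annualized_cost(purchase_price=4000, annual_recurring=500, service_years=5)
print(f"base: {base:.0f} per year, upgraded: {upgraded:.0f} per year")
```

Under these made-up numbers the upgraded device costs more up front but less per year of service. Whether that holds for a real purchase depends on actual quotes, actual data growth and how long the device stays in use.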