Yashar Behzadi explores how synthetic data can help companies develop or acquire the necessary data to power AI applications more cheaply and quickly
Distracted driving has been a problem for as long as vehicles have been on roads. However, the meteoric rise of technology and the ever-growing distractions of modern life have driven the issue to centre stage—and it’s no wonder why. In the US alone, more than 35,000 fatalities occurred in 2020 as a result of motor vehicle traffic crashes. In addition, more than 3,000 fatal crashes were caused by distracted driving in 2019.
Increased connectivity is a double-edged sword: while it helps the world be more informed and productive, phones and infotainment systems are typically the primary culprits in drawing drivers’ attention away from the road, forming unsafe driving habits, and causing accidents.
Driver safety mandates are here
Advanced driver assistance systems (ADAS) and increasingly autonomous technology can dramatically reduce collision rates resulting from distracted driving, and there is a great desire to get these systems on the road as soon as possible. Take the European Union (EU), for example, which has made distracted driving a top priority. Beginning in 2022, all new cars entering the EU market must be equipped with advanced safety systems. Among the mandatory safety measures are distraction recognition and alert systems on trucks and buses to warn when vulnerable road users, such as pedestrians or cyclists, are in close proximity. The European Commission expects that the proposed measures will help save over 25,000 lives and avoid at least 140,000 serious injuries by 2038.
Many industry experts agree that ADAS and related technology can significantly improve vehicle safety. However, car manufacturers will face significant roadblocks in developing and implementing the technology needed to meet the EU’s new safety regulations.
Current barriers to ADAS development
Automakers and autonomous vehicle (AV) manufacturers use real-world data to train, test and validate driver safety monitoring systems. Developing these safety-critical perception systems requires enormous amounts of data, and a dizzying array of scenarios must be engineered to reflect real-world driving. These systems must also be designed to operate successfully in different environments, from highly congested cities to rural areas.
Current methods require manufacturers to build and deploy cars loaded with sensors and cameras and to drive them for hundreds, if not thousands, of miles before they are confident that the necessary data has been obtained. Once the data is collected, labelling it is a gargantuan task that requires carefully extracting the specific events of interest and annotating them.
Synthetic data enters as a game-changer
While a handful of companies may be able to afford the process of producing and testing millions of vehicles in various geographical environments, most OEMs do not have sufficient resources or vehicles with the capability to provide such datasets. Besides being expensive and time-consuming, this process makes it difficult, if not impossible, to obtain sufficient examples of diverse sets of drivers across a wide variety of situations. For those reasons, synthetic data and simulation will become an essential element in the development of driver safety systems of the future.
With the rise of synthetic data, also known as computer-generated data, companies of all sizes can develop or acquire the necessary data to power AI applications at a fraction of the time and cost of externally acquiring and hand-labelling training data. For safety-critical applications like autonomous driving, synthetic data fills in the gaps of real-world data by modelling roadway environments, complete with people, traffic lights, empty parking spaces, and more. As a result, manufacturers will be able to mimic driver behaviour in virtual car environments to test and iterate their models across a broader set of settings and situations without having to build and deploy fleets of vehicles.
Enabling the development of AVs and driver safety systems
The emerging technology is already an essential component of autonomous driving and computer vision AI systems. Synthetic data combines techniques from the movie and gaming industries (simulation, CGI) with generative neural networks (GANs, VAEs), allowing car manufacturers to engineer realistic datasets and simulated environments at scale without driving in the real world or counting on luck.
Synthetic data enables manufacturers to focus on specific objects of interest, for example, pedestrians. Carmakers can simulate millions of examples of pedestrians in a matter of hours, a project that would typically take several months to complete with real-world collection. These simulations could include examples under different lighting conditions, object locations, and degradations. Random noise can also be injected to simulate dirty cameras, fog, and other visual obstructions. In this way, manufacturers could use synthetic data in a complementary fashion to real data: long-tail events identified in the real data could be used as a starting point to create thousands of variations around that event.
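To make the idea of injected degradations concrete, here is a minimal sketch in Python using only the standard library. It is an illustration under stated assumptions, not a description of any vendor's pipeline: the function names `degrade` and `variations`, the haze value, and the parameter ranges are all hypothetical, and a tiny grid of grayscale values stands in for a rendered camera frame.

```python
import random

FOG_COLOUR = 0.8  # hypothetical light-grey haze value (assumption)


def degrade(image, fog=0.0, noise_sigma=0.0, brightness=1.0, rng=None):
    """Return a degraded copy of a grayscale image (rows of floats in [0, 1])."""
    if rng is None:
        rng = random.Random()
    out = []
    for row in image:
        new_row = []
        for pixel in row:
            value = pixel * brightness                      # lighting change
            value = (1.0 - fog) * value + fog * FOG_COLOUR  # blend toward haze
            value += rng.gauss(0.0, noise_sigma)            # random sensor noise
            new_row.append(min(1.0, max(0.0, value)))       # clamp to valid range
        out.append(new_row)
    return out


def variations(image, count, seed=0):
    """Create `count` randomly degraded variants of one starting frame."""
    rng = random.Random(seed)
    results = []
    for _ in range(count):
        # Parameter ranges are illustrative assumptions, not calibrated values.
        params = {
            "fog": rng.uniform(0.0, 0.6),
            "noise_sigma": rng.uniform(0.0, 0.1),
            "brightness": rng.uniform(0.3, 1.2),
        }
        results.append((params, degrade(image, rng=rng, **params)))
    return results


# A tiny stand-in for a frame containing a long-tail event.
base = [[0.1, 0.5, 0.9], [0.2, 0.6, 1.0]]
samples = variations(base, count=1000)
```

Each entry in `samples` pairs the degraded frame with the parameters that produced it, so the conditions behind every variation are known without manual annotation.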
Synthetic data will play an increasingly important role in helping manufacturers meet the demands of in-cabin driver safety monitoring systems without tapping into the data of real-world drivers. With increasing concerns around privacy, using synthetic data can improve drivers’ safety without compromising their privacy. Synthetic data can help automotive manufacturers build robust computer vision systems and give them an edge in monitoring driver behaviour. Automotive manufacturers can generate thousands of unique identities on demand using synthetic data, with granular control of emotion, gaze angle, head pose, accessories, environments, and camera systems. Since the data is generated, the image data comes with an expanded set of pixel-perfect labels, including facial landmarks, gaze angle, depth maps, segmentation, surface normals, and facial meshes. As a result, manufacturers will be able to build more robust models to monitor large-scale motions, such as a driver taking their hands off the wheel, and smaller-scale motions, such as eye gaze.
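The point that generated data arrives already labelled can be sketched in a few lines of Python. This is a hedged illustration only: the `DriverSample` fields, the category lists, and the angle ranges are hypothetical choices made for the example, and a real generator would pass these parameters to a renderer rather than simply storing them.

```python
import random
from dataclasses import dataclass, asdict

# Illustrative categories; a real system would define its own (assumption).
EMOTIONS = ["neutral", "happy", "angry", "drowsy"]
ACCESSORIES = ["none", "glasses", "sunglasses", "face mask", "hat"]


@dataclass(frozen=True)
class DriverSample:
    """Parameters that would drive one rendered in-cabin image."""
    identity_id: int
    emotion: str
    accessory: str
    gaze_yaw_deg: float
    gaze_pitch_deg: float
    head_yaw_deg: float
    hands_on_wheel: bool


def sample_driver(identity_id, rng):
    """Draw one synthetic driver configuration from hypothetical ranges."""
    return DriverSample(
        identity_id=identity_id,
        emotion=rng.choice(EMOTIONS),
        accessory=rng.choice(ACCESSORIES),
        gaze_yaw_deg=rng.uniform(-60.0, 60.0),
        gaze_pitch_deg=rng.uniform(-30.0, 30.0),
        head_yaw_deg=rng.uniform(-45.0, 45.0),
        hands_on_wheel=rng.random() < 0.8,
    )


def build_dataset(count, seed=0):
    """Generate `count` configurations reproducibly from a seed."""
    rng = random.Random(seed)
    return [sample_driver(i, rng) for i in range(count)]


dataset = build_dataset(5000)
# The labels are the generation parameters themselves: no hand annotation.
labels = [asdict(sample) for sample in dataset]
```

Because every label is simply a parameter that was chosen before rendering, the ground truth is exact by construction, and no real driver's face or behaviour is involved.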
While it may not be evident yet, the new EU regulations are ushering in a new era of driver safety that will require automotive manufacturers to redefine transportation at its core. Synthetic data will be instrumental in developing driver safety systems and autonomous technology in the mobility space, providing manufacturers with a cost-effective solution that is infinitely scalable and more effective than real-world data. Necessary for dataset augmentation and safety assurance, vehicle simulations are used daily by AV companies. The massive demand for high-quality synthetic data will continue to push the envelope to meet the demands of the modern connected vehicle.