What is a Digital Twin?
“Digital Twin” is one of those evocative terms that anybody with some familiarity with computers and the digital world immediately understands at an intuitive level, but that only a few of us can define precisely. The term has a growing list of definitions across domains, as the scope of the Digital Twin concept continues to evolve and expand. Stated simply, a digital twin is a virtual copy of a physical object, where sensors placed on the latter are used to inform and update the former. NASA is credited with pioneering the concept when it faced the problem of managing spacecraft that it could not physically access. Mirroring the physical system with a virtual one, fed with all the information available on the real one, made it possible to predict potential problems before they presented themselves and permitted a proactive approach to problem-solving where lag and physical distance prevent a reactive one.
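The loop described above (sensor readings update the virtual copy, which in turn anticipates problems) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `DigitalTwin` class, its `limit` parameter, and the naive linear extrapolation are hypothetical and do not describe NASA's or any vendor's actual approach.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalTwin:
    """Hypothetical virtual copy of one monitored quantity of a physical asset."""
    limit: float                                  # assumed value at which a problem occurs
    history: list = field(default_factory=list)   # (time, value) sensor readings

    def update(self, t: float, value: float) -> None:
        """Feed a new sensor reading from the physical object into the twin."""
        self.history.append((t, value))

    def predicted_limit_time(self):
        """Extrapolate the last two readings linearly.

        Returns the time at which the limit would be reached, or None if
        there are too few readings or the value is not increasing.
        """
        if len(self.history) < 2:
            return None
        (t0, v0), (t1, v1) = self.history[-2:]
        if t1 == t0:
            return None
        slope = (v1 - v0) / (t1 - t0)
        if slope <= 0:
            return None
        return t1 + (self.limit - v1) / slope


if __name__ == "__main__":
    twin = DigitalTwin(limit=100.0)
    twin.update(0.0, 70.0)
    twin.update(10.0, 80.0)
    print("limit expected at t =", twin.predicted_limit_time())
```

A real twin would replace the linear extrapolation with a physical or statistical model of the asset. The point of the sketch is only the division of roles: the sensors inform, the twin anticipates.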
From single objects to complex systems
Today, many industries use such a virtual information construct in product lifecycle management: a digital twin of each product, from conception through design and development to the end of its lifecycle. Virtual copies become more powerful as more information is collected, and this opens up endless possibilities to improve products and customer experiences. But what already appears as an exciting reality is just the beginning. Can we apply the concept to much larger and more complex systems? What about a whole city?
A city where sensors are ubiquitously deployed, capture all fundamental parameters of all its elements (infrastructure, buildings, assets, and individuals), and feed them back in quasi-real time to improve (and eventually optimize) operations of all kinds… does it sound like a familiar concept? Smart city, anyone? In fact, considering that the above is a possible definition of a smart city, all that is missing to make a digital twin is putting all the information together into a coherent whole. This may seem like a merely formal question; however, we need to consider it carefully.
If all we want to do is “represent” the city as a whole, this is indeed a formal question. If we want to solve problems at the city scale while considering all the potential trade-offs among its different elements, it is a completely different question, as the computational burden could make it infeasible. However, the relevant question may be another one: is now-casting, the norm in the context of smart cities, the right time horizon for answering such city-scale questions? We said that digital twins are used in product lifecycle management, including the design phase, with the goal of finding, for example, the shape that maximizes an item’s efficiency. At the city scale, this resembles a planning exercise (be it of infrastructure or of policies), which is typically done with a different temporal horizon in mind (years or even decades). So maybe, for this kind of complex question, if we move our attention from now-casting toward forecasting (one could see it as a trade-off between the complexity of the question and the time allowed to get an answer), the concept of the digital twin is indeed helpful. In fact, in transportation planning there is an emerging modeling technique that implements a similar concept: agent-based simulation.
Agent-based simulations as digital twins
Agent-based simulations have been around for decades in other disciplines and have been applied to transportation over the last 20 years. Several examples exist, mostly at the academic level, such as MATSim and SimMobility. The general practice in transportation modeling is still to represent urban travel from the top down, as flows between areas acting as producers and attractors. Agent-based simulation goes beyond this paradigm and is a natural way to combine the advantages of activity-based transportation modeling with those of dynamic assignment.
The approach can capture emergent phenomena resulting from the interaction of particles, which other approaches simply fail to predict. Concretely, it means that if we want to model a complex system, such as a transportation system and its users at the city scale, we can represent all its actors individually and let them act in a simulation according to rules that fundamentally describe their behavior. Through this mechanism, the simulation mimics a plausible behavior of the system as a whole and helps answer the planning questions at hand.
“Fundamentally” is a key word in the previous description. It means capturing the essence of behavior; in the case of transportation, human behavior. In a Digital Twin we also want to capture the behavior of the actual object, but without making any compromise. Some definitions go as far as saying that it mirrors the object “from the micro atomic level to the macro geometrical level, and any information that could be obtained from inspecting a physical manufactured product can be obtained from its Digital Twin”. Such definitions could seem to make it impossible to reconcile the two concepts: nobody pretends to model individuals “from the micro atomic level”, after all. However, in transportation planning models, individuals actually are “the atomic level” (i.e. individuals are the smallest particles of the system). Additionally, one could claim that “any relevant information that could be obtained from inspecting the actual behavior of an individual could be obtained from its agent counterpart”. For an individual traveler, relevant information should include position, mode of transport, and purpose of the trip, among others, and these are typically known for each individual agent. Clearly, the meaning of “relevant” depends on the application, and it is somewhat arbitrary to state that adding this word is enough to make the two concepts discussed here coincide. Nevertheless, the reasoning is useful to show that one concept can be mapped onto the other under certain conditions.
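To make the mechanism discussed in this section more tangible, here is a deliberately tiny sketch of an agent-based simulation: individual travelers follow one simple rule, and the split of traffic between two routes emerges from their interaction. It is not how MATSim or SimMobility work internally. The two-route network, the linear congestion rule, the replanning share, and all numbers are assumptions chosen for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Agent:
    """One traveler. `route` is the index of the route the agent currently uses."""
    route: int


def travel_times(counts, free_flow, capacity):
    """Assumed congestion rule: time grows linearly with users per unit of capacity."""
    return [ff * (1.0 + n / cap) for ff, n, cap in zip(free_flow, counts, capacity)]


def count_users(agents, n_routes):
    """Count how many agents currently use each route."""
    counts = [0] * n_routes
    for agent in agents:
        counts[agent.route] += 1
    return counts


def simulate(n_agents=1000, days=200, replan_share=0.02, seed=0):
    """Day-to-day simulation of two parallel routes (all numbers are illustrative)."""
    rng = random.Random(seed)
    free_flow = [10.0, 15.0]  # free-flow travel time in minutes
    capacity = [500, 500]     # number of users at which travel time doubles
    agents = [Agent(route=0) for _ in range(n_agents)]  # everybody starts on route 0

    for _ in range(days):
        times = travel_times(count_users(agents, 2), free_flow, capacity)
        fastest = times.index(min(times))
        # Behavioral rule: each day a small random share of agents
        # reconsiders and moves to the route that was fastest today.
        for agent in agents:
            if rng.random() < replan_share:
                agent.route = fastest

    counts = count_users(agents, 2)
    return counts, travel_times(counts, free_flow, capacity)


if __name__ == "__main__":
    counts, times = simulate()
    print("users per route:", counts)
    print("travel times (min):", [round(t, 1) for t in times])
```

Nobody tells the agents how to split, yet under these assumed parameters the population should hover around the point where both routes take about the same time (700 and 300 users). Since each agent is an explicit object, its position and chosen route can be inspected at any moment, which is the sense in which the “relevant” information is available from the agent counterpart.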
Into the Smart City: challenges and opportunities for agent-based simulations
Smart Cities, with their ubiquitous sensors, could help further narrow the gap between digital twins and agent-based models. However, they also bring new challenges. In the smart city, we expect sensors to help recognize certain situations and eventually trigger adaptation strategies in real time. One of the simplest examples is traffic lights that adapt to actual traffic conditions. More generally, supply elements with local intelligence will make choices according to a given strategy and to what they observe with their sensors. Therefore, for the first time, both sides, supply and demand, will have to adapt to each other.
Although some of these smart triggers are not necessarily novel per se, deploying them everywhere in the city has the potential to fundamentally change the behavior of the city as a whole. If we want to gain insight into such a scenario, agent-based simulation seems like a good option. At this point, though, there will be emergence on both sides, and jointly: a complex adaptive system interacting with another complex adaptive system, forming a larger system of its own. This raises several questions, in terms of the existence of an equilibrium and of computational burden, that still need to be answered.
A smart city also provides new opportunities for agent-based simulation. While in a smart city we collect data using sensors (as in digital twins), agent-based simulations of transportation tend to employ travel diaries and census data. This kind of data is habitually used for transportation models and was the only option at the time agent-based approaches started being applied to transportation. It is also a very natural match, as one wants to model single individuals. However, such data has a shortcoming: it typically comes from (relatively) small samples.
Passively collected data will not be attached to an individual identity in most cases, either because this information is not collected (for example, in traffic counts) or because of privacy concerns (for example, with smart transit cards), and we do not know much about its representativeness. Such data, though, comes in massive amounts (Big Data) and guarantees better spatial and temporal coverage. In other words, instead of a snapshot of the travel behavior of individuals in a certain region (what travel diaries provide), and for a limited sample at that, we would have a continuously updated picture of large chunks of the population over the whole region. The typical way to exploit this data bonanza is to use machine learning or other artificial intelligence approaches.
Based on a set of observations that contain inputs and outputs (the training set), such algorithms try to learn general rules that map inputs to outputs. They are therefore predictive. An important limitation is that they might not be able to predict situations that have not been observed before, i.e. situations the algorithm has not been trained for. Behavioral models, and agent-based models in particular, are better suited to dealing with situations that differ from what has been observed in the past, as they capture what fundamentally drives a certain behavior. However, they are not properly predictive. Instead, they produce plausible output scenarios given certain inputs.
The availability of Big Data has spurred a proliferation of purely data-driven approaches in transportation, but it is also starting to inspire new solutions for transportation models, for example coupling machine learning techniques with agent-based simulations. One possible way to do this is to use machine learning to predict traffic flows at the local level (for example, on a single link) and to use agent-based models to enforce known properties and constraints of transportation systems and human behavior on such predictions. Clearly, the ultimate goal of such attempts is to create tools that combine the good properties of both methodologies and avoid their limitations. Here, this would mean creating more efficient, less computationally heavy ways to reproduce traffic patterns, as with machine learning methods, while producing behaviorally sound city-scale scenarios.
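As a very rough illustration of the coupling idea, the sketch below takes link-level flow predictions and forces them to respect two known properties of a road junction: flows lie between zero and link capacity, and what flows in must flow out. This simple constraint step is only a stand-in for the role an agent-based model would play. The function `enforce_constraints`, the junction layout, and the numbers are assumptions, and the “predictions” are hard-coded placeholders rather than the output of an actual machine learning model.

```python
def enforce_constraints(pred_in, pred_out, cap_in, cap_out):
    """Adjust predicted flows (vehicles/hour) on the links of one junction.

    Step 1: clip every predicted flow to the range [0, capacity].
    Step 2: scale the larger side down so that total inflow equals
            total outflow (flow conservation at the junction).
    """
    flows_in = [min(max(f, 0.0), c) for f, c in zip(pred_in, cap_in)]
    flows_out = [min(max(f, 0.0), c) for f, c in zip(pred_out, cap_out)]

    total_in, total_out = sum(flows_in), sum(flows_out)
    target = min(total_in, total_out)  # scaling down never exceeds capacity

    if total_in > 0:
        flows_in = [f * target / total_in for f in flows_in]
    if total_out > 0:
        flows_out = [f * target / total_out for f in flows_out]
    return flows_in, flows_out


if __name__ == "__main__":
    # Placeholder values standing in for per-link machine learning predictions.
    predicted_in = [900.0, 400.0]
    predicted_out = [1500.0, -50.0]   # over capacity and negative: not physical
    flows_in, flows_out = enforce_constraints(
        predicted_in, predicted_out,
        cap_in=[1000.0, 1000.0], cap_out=[1200.0, 800.0],
    )
    print("inflows :", [round(f, 1) for f in flows_in])
    print("outflows:", [round(f, 1) for f in flows_out])
```

Scaling the larger side down is just one of several possible ways to restore conservation. An agent-based model would go further by also making the corrected flows consistent with plausible individual trips.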
A peek into the future
As the adage says, it is difficult to make predictions, especially about the future. This is particularly true in an era where technological change is happening at an unprecedented pace. However, if what we experienced in the last few decades is to any extent an anticipation of the future, we can expect at least this much: more data and more computing power. Does it mean that the computers and data of tomorrow will allow us to build agent-based simulations that are perfect digital twins? Would we even want that?
We have seen that in the transportation system going to the atomic level means going to the level of the individual decision maker, and also that in the near future the decision-maker category will include both humans and machines. We can assume that digital twins of the machines and their algorithms will be relatively easy to obtain, whereas modeling human behavior is where the model can still be perfected. Several factors will determine whether this is possible. Whatever approach we might want to use, everything starts with the data. Ideally, we would have more detailed and more abundant individual data to better capture behavior. It is already scary enough to know what big tech knows about us by putting together all the data it collects, and it is no coincidence that its subsidiaries are in the simulation business; besides, programs targeting the reading of the human brain, no less, already exist. Nevertheless, let’s assume that we will keep a certain degree of privacy (and humanity) even in a more distant future, and that we won’t have the information necessary to perfectly reproduce the reasoning of individuals (never mind the computer power at this point). We will have access to far more, and more precise, data not attached to individual information (counts on the network for all modes and visitor numbers at different facilities in the city, among others). That will help build a sharper picture of the city pulse but will not, alone, yield better behavioral models.
The development of artificial intelligence could help further. In machine learning applied to transportation, what we generally see is a change of paradigm, in the sense that we no longer rely on a model based on a theory. However, we still let an algorithm find a model that helps us reproduce the observations of a certain phenomenon. But what if we try to apply such a principle to create the agents’ intelligence? If the intelligence of the agents becomes an additional layer between input and output in a machine learning process, we would create a real hybrid between machine learning and agent-based modeling. The algorithm would not find the laws governing the relationship between input and output, but would create models at the individual level that reproduce such output. Such models would be behavioral in a way, but not based on the kind of theory they are currently based on, or on any specific theory for that matter. As we throw different training sets at them (sensor observations under certain circumstances together with the systemic outcome), their intelligence would be trained and would become more and more general.
Agent-based models are currently largely based on econometric models. If one wants to address a specific problem that the original model does not account for, the model needs to be amended and possibly calibrated again. Typically, this leads to long calibration processes that in turn produce different populations, each with characteristics suitable for some specific problems. Training the population in a way that directly modifies the intelligence of the agents should account more naturally for the learning over time that individuals exposed to different situations typically perform. A population could thus be continuously updated and calibrated as the data comes in. This would allow solving more general sets of problems and would further reduce the need for specific prior behavioral information (i.e. the behavior of individuals in certain specific situations, as opposed to more general behavioral norms). An interesting consequence is that we would have less need to know exactly what each individual has done in order to increase the realism of the behavioral model. We will certainly still need to make sure that choices are consistent with certain constraints and patterns, but this could be done as discussed earlier regarding traffic flow.
For all this to come to fruition, it is easy to imagine that computer power will play a crucial role. The end of Moore’s law is not encouraging, the further development of cloud computing will help, and quantum computers are still a big unknown. In this context, we might need to go back to the kind of coding parsimony that was common currency until commercial strategies, more than actual needs, bloated computer programs and made continuous hardware upgrades a necessity.
But if all the pieces of this vision fall into place, a deep integration of machine learning techniques into agent-based modeling will be possible and, at the same time, agent-based simulations will come much closer to being perfect digital twins. GeoTwin is working to be at the forefront of making this vision a reality.