
Policy modelling requires a multi-scale, multi-criteria and diverse-framing approach

Lessons from a session at SocSimFest 2023

By Gary Polhill and Juliette Rouchier

Bruce Edmonds organized a stimulating session at the SocSimFest held 15-16 March 2023, entitled “How to do wrong using Social Simulation – as a result of arrogance, laziness or ill intent”. One of the presentations (Rouchier 2023) covered the modelling used to justify lockdowns in various countries. This talk concentrated on the harms lockdowns caused and suggested that they were unnecessary; a discourse that has had little presence in the media and takes an alternative view to the idea that a scientific consensus exists in real time and could lead to the best decision. There was some ‘vigorous’ debate afterwards, but here we expand on an important point that came out of that debate: modelling the effects of Covid to inform policy on managing the disease requires much more than epidemiological modelling. We might speculate, then, whether in general, modelling for policy intervention means ensuring greater coverage of the wider system than might be deemed strictly necessary for the immediate policy question in hand. Though such speculation has apparent consequences for model complicatedness that go beyond Sun et al.’s (2016) ‘Medawar zone’ for empirical ABM, there is an interpretation of this requirement for extended coverage that is also compatible with preferences for simpler models.

Going beyond the immediate case of Covid would require the identification of commonalities in the processes of decision making that could be extrapolated to other situations. We are less interested in that here than in making the case that simulation for policy analysis in the context of Covid entails greater coverage of the system than might be expected given the immediate questions in hand. The expertise of Rouchier means our focus is primarily on the experience of Covid in France. Generalisation of this principle of wider coverage beyond the Covid case is a matter of conjecture that we propose making.

Handling Covid: an evaluation that is still in progress

Whether governments were right or wrong to implement lockdowns of varying severity is a matter that will be for historians to debate. During that time various researchers developed models, including agent-based models, that were used to advise policymakers on handling an emergency situation predicated on higher rates of mortality and hospitalisation.[1] Assessing the effectiveness of the lockdowns empirically would require us to be able to collect data from parallel universes in which they were not implemented. The fact that we cannot do this leaves us, as Rouchier pointed out, either comparing outcomes with models’ predictions – which is problematic if the models are not trusted – or comparing outcomes across countries with different lockdown policies – which has so far been inconclusive, and would in any case be problematic because of differences in culture and geography from one nation to another. Such comparison will nevertheless prove the most fruitful in time, although the differences in implementation among countries will doubtless induce long discussions about the most important factors to consider when defining relevant Non-Pharmaceutical Interventions (NPIs).[2]

The effects of the lockdowns themselves on people’s mental and physical health, child development, and on the economy and working practices, are also the subject of emerging data post-lockdown. Some of these consequences have been severe – not least for the individuals concerned. Though not germane to the central argument of this brief document, it is worth noting that the same issue with unobservable parallel universes means that scientific rather than historical assessment of whether these outcomes are better or worse than any outcomes for those individuals and society at large in the absence of lockdowns is also impossible.

For our purposes, the most significant aspect of this second point is that the discussion has arisen after the epidemic emergency. First, it is noteworthy that these matters could perfectly well have been considered in models during the crisis. Indeed, contrasting the positive effects (saving lives or saving a public service) with the negative effects (children’s withdrawal from education,[3] increasing psychological distress, not to mention domestic abuse – Usta et al. 2021) is typically what cost-benefit analysis, based on multi-criteria modelling, is supposed to elicit (Roy, 1996). In modelling for public policy decision-making, it is particularly clear that there is no universally ‘superior’ or ‘optimum’ indicator for comparing options, but rather several indicators with which to evaluate the diverse alternative policies. A discussion about the best decision for a population has to be based on the best description of the possible policies and their evaluations according to the chosen indicators (Pluchinotta et al., 2022). This means that a hierarchy of values has to be made explicit to justify the hierarchy of the most important indicators. During the Covid crisis one question that could have been asked (and arguably should have been) is: which is the most vulnerable population to protect? Is it older people, because of the disease, or young people, because of potential threats to their future chances in life?
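
To make the multi-criteria point concrete, the following sketch (in Python) shows how a simple weighted-sum comparison makes the dependence of a policy ranking on the chosen hierarchy of values explicit. It is purely illustrative: the policies, indicators, scores and weights are all invented for the example, not drawn from any actual assessment.

```python
# Purely illustrative multi-criteria comparison: all policies, indicators,
# scores and weights below are invented for the example.

# Scores for each policy option on each indicator (higher = better outcome).
policies = {
    "strict lockdown":    {"deaths_averted": 0.9, "mental_health": 0.2, "education": 0.1, "economy": 0.2},
    "targeted shielding": {"deaths_averted": 0.6, "mental_health": 0.6, "education": 0.8, "economy": 0.6},
    "no intervention":    {"deaths_averted": 0.1, "mental_health": 0.7, "education": 0.9, "economy": 0.5},
}

# Two different hierarchies of values, made explicit as weights summing to 1.
weightings = {
    "protect the old":   {"deaths_averted": 0.7, "mental_health": 0.1, "education": 0.05, "economy": 0.15},
    "protect the young": {"deaths_averted": 0.2, "mental_health": 0.3, "education": 0.35, "economy": 0.15},
}

def weighted_score(scores, weights):
    """Simple weighted-sum aggregation of the indicator scores."""
    return sum(weights[k] * scores[k] for k in weights)

for value_hierarchy, weights in weightings.items():
    ranking = sorted(policies, key=lambda p: weighted_score(policies[p], weights), reverse=True)
    print(f"{value_hierarchy}: {ranking}")
```

Even in this toy setting, the two value hierarchies produce different rankings of the same options, which is exactly the kind of trade-off that should be surfaced and debated explicitly rather than left implicit in a single-indicator model.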

Second, it is clear that this answer could vary over time with new information and the dynamics of Covid variants. For example, as soon as Omicron was announced by South Africa’s doctors, it was said to be less dangerous than earlier variants.[4] In that sense, the dynamic balancing of priorities in this historical period is very typical of what could also be central to other public discussions where the whole population faces a highly uncertain future and knowledge evolves rapidly. But it is difficult to know in advance which indicators should be considered, since some signals can be very weak at one point in time, yet be confirmed as highly relevant later on – essentially the problem of omitted-variable bias.

The discussion about risks to mental health was already lively in 2020: some psychologists were quick to point out the risks for people with existing mental health issues or for women living with violent husbands;[5] while the discussion about effects on children started early in 2020 (Singh et al., 2020). However, this issue only started to be considered publicly by the French government a year and a half later. One interpretation of this time lag is that the signal seemed too weak to non-specialists early on, when specialists had already seen the disturbing signs.

In science, we have no definitive rule for deciding when a presently weak signal will later turn out to be truly significant. Rather, it is ‘society’ as a whole that decides on the value of different indicators (sometimes only with the wisdom of hindsight), and scientists should provide knowledge on these. This goes back to classical questions about hierarchies of values and the diverse stakes people hold, questions that recur perennially in decision science.

Modelling for policy making: tension between complexity and elegance?

Edmonds (2022) presented a paper at SSC 2022 outlining four ‘levels’ of rigour needed when conducting social simulation exercises, reserving the highest level for using agent-based models to inform public policy. Page limitations for conference submissions meant he was unable to articulate in the paper as full a list of the stipulations for rigour in the fourth level as he was for the other three. However, Rouchier’s talk at the SocSimFest brought into sharp focus that at least one of those stipulations is that models of public policy should always have broader coverage of the system than is strictly necessary for the immediate question in hand. This has the strange-seeming consequence that exclusively epidemiological models are inadequate to the task of modelling how a contagious illness should be controlled. For any control measure that is proposed, such a stipulation entails that the model be capable of exploring not only the effect on disease spread, but also potential wider effects of relevance to societal matters generally in the domain of other government departments, such as energy, the environment, business, justice, transportation, welfare, agriculture, immigration, and international relations.

The conjecture that, for any modelling challenge in complex or wicked systems, thorough policy analysis entails broader system coverage than the immediate problem in hand (KIDS-like – see Edmonds & Moss 2005) is controversial for those who like simple, elegant, uncomplicated models (KISS-like). Worse than that, while Sun et al. (2016), for example, acknowledge that the Medawar zone for empirical models is at a higher level of complicatedness than for theoretical models, the coverage implied by this conjecture is broader still. The level of complicatedness implied will also be controversial even for those who don’t mind complex, complicated models with large numbers of parameters. It suggests that we might need to model ‘everything’, or that policy models are then too complicated for us to understand, and that, as a consequence, using simulations to analyse policy scenarios is perhaps inappropriate. The following considers each of these objections in turn with a view to developing a more nuanced analysis of the implications of such a conjecture.

Modelling ‘everything’ is the easiest implication to reject: modelling ‘more things’ does not entail it. Modelling, say, the international relations implications of a proposed national policy on managing a global pandemic does not mean one is modelling the lifecycle of extremophile bacteria, or ocean-atmosphere interactions arising from climate change, or the influence of in-home displays on domestic energy consumption, to choose a few random examples of the myriad things that are not modelled. It is not even clear what modelling ‘everything’ really means – phenomena in social and environmental systems can be modelled at diverse levels of detail, at scales from molecular to global. Fundamentally, it is not even clear that we have anything like a perception of ‘everything’, and hence no basis for representing ‘everything’ in a model. Further, the Borges argument[6] applies: a model that was the same as reality would be useless to study, since it would then be wiser to study reality directly. Neither universal agreement nor objective criteria[7] exist for the ‘correct’ level of complexity and complication at which to model phenomena, but failing to engage with a broader perspective on the systemic effects of phenomena leaves one open to the kind of excoriating criticism exemplified by Keen’s (2021) attack on economists’ analysis of climate change.

At the other end of the scale, doing no modelling at all is also a mistake. As Polhill and Edmonds (2023) argue, leaving simulation models out of policy analysis essentially makes the implicit assumption that human cognition is adequate to the task of deciding on appropriate courses of action when facing a complex situation. There is no reason (besides hubris) to believe that this is necessarily the case, and plenty of evidence that it is not. Not least of such evidence is that many of the difficult decisions we now face around such things as managing climate change and biodiversity have been forced upon us by poor decision-making in the past.

Cognitive constraints and multiple modellers

This necessity to consider many dimensions of social life within models that are ‘close enough’ to reality to convince decision-makers induces a risk of ‘over’-complexity. Its main drawback is the building of models that are too complicated for us to understand. This is a valid concern in the sense that building an artificial system that, though simpler than the real world, is still beyond human comprehension hardly seems a worthwhile activity. The other concern is the knowledge needed by the modeller: how can one person imagine an integrative model which includes (for example) employment, transportation, food, schools, the international economy, and any other issue that is needed for a serious analysis of the consequences of policy decisions?

Options that still entail broader coverage but not a single overcomplicated integrated model are: (1) a step-by-step increase in the complexity of the model within a community of practitioners; (2) the confrontation of different simple models embodying different hypotheses and questions; (3) the superposition and integration of simple models into one, through serious work on the convergence of ontologies (with a nod to Voinov and Shugart’s (2013) warnings).

  1. To illustrate this first approach, let us stay with the case of the epidemic model. One can start with an epidemiological simulation, calibrated to show that if we tell people to stay at home then hospitalizations will be cut by enough that health services will not be overwhelmed. But then we are worried that this might have a negative impact on the economy. So we bring in modelling components that simulate all four combinations of person/business-to-person/business transactions, and this shows that if we pay businesses to keep employees on their books, we have a chance of rebooting the economy after the pandemic is over.[8] But then we are concerned that businesses might lie about who their employees are, that office-workers who can continue to work at home are privileged over those with other kinds of job, that those with a child-caring role in their households are disadvantaged in their ability to work at home if the schools are closed, and that the mental health of those who live alone is disproportionately impacted through cutting off their only means of social intercourse. And so more modelling components are brought in. In a social context, this incremental addition of the components of a complicated model may keep it comprehensible to the team of modellers (a minimal code sketch of this kind of incremental composition is given after this list).

    If the policy maker really wants to increase her capacity to understand her possible actions with models, she would also have to make sure to invite several researchers for each modelled aspect, as no single social science is free of controversy, and the discussion of consequences should draw on contradictory theories. If a complex model has to be built, it can indeed propose different hypotheses about behaviours, the functioning of the economy, and the health risks associated with different types of encounter.[9] It is then more of a modelling ‘framework’, with several options for running various specific models under different implementation choices. One advantage of modelling that holds even where the Borges argument applies is that testing out different hypotheses is harmless for humans (unlike empirical experiments) and can produce possible futures, seen as trajectories that can then be evaluated in real time with relevant indicators. With a serious group of modellers and statisticians providing contradictory views, not only can the model be useful for developing prospective views, but the evaluation of hypotheses could also be done rapidly.

  2. The CoVprehension Collective (2020) showed another approach, more fluid in its organisation. The idea is “one question, one model”, and the constraint is to produce a pedagogic result in which a simple phenomenon is illustrated. Different modellers could build one or several models on simple issues, so as to explain a single phenomenon or paradox, or to expose a tautological claim. In the process, the CoVprehension team formed shifting sub-teams, each coalescing around one specific issue and presenting its hypotheses and results in a very simple manner. This protocol was purely oriented towards explanation for the public, but the idea would be to organise a similar dynamic for policy makers. The system is cheap (it was self-organised by researchers and engineers, with no funding beyond their salaries) and it sustained lively discussions with different points of view. Questions ranged from the differences between possible NPIs – with an algorithmic description of these NPIs making the understanding of the processes more precise – to an explanation of why French supermarkets ran out of toilet paper. Twenty questions were answered in two months, indicating that such a working dynamic is feasible in real time and provides useful and interesting inputs to discussion.

  3. To avoid too complicated a model, a fusion of both approaches could also be conceived: the addition of dimensions to a large central model could first be tested through simple models, the main explanatory process identified, and this process then reproduced within the theoretical framework of the large model. This would combine the production of a diversity of points of view and models with the aggregation of those points of view into one large model. The fact that the model should be large is important, as ‘size matters’ in diffusion models (e.g. Gotts & Polhill 2010), and thus simple, small models would benefit from this work as well.
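
As a purely illustrative sketch of approach (1) above, the Python fragment below composes a crude epidemic core with an economic component added later, so that indicators relevant to different concerns are reported side by side. All structures, parameters and names are invented for the example; they are not taken from any of the models discussed here.

```python
# Toy sketch of approach (1): an epidemic core to which further components
# are added step by step. All parameters and structures are invented.

class EpidemicCore:
    """Very crude SIR-like dynamics on an aggregate population."""
    def __init__(self, population=1_000_000, infected=100, beta=0.3, gamma=0.1):
        self.s, self.i, self.r = population - infected, infected, 0
        self.beta, self.gamma = beta, gamma

    def step(self, contact_reduction=0.0):
        n = self.s + self.i + self.r
        new_infections = (1 - contact_reduction) * self.beta * self.s * self.i / n
        recoveries = self.gamma * self.i
        self.s -= new_infections
        self.i += new_infections - recoveries
        self.r += recoveries

class EconomyComponent:
    """A later addition: daily output falls while a lockdown is in force."""
    def __init__(self, baseline_output=100.0, lockdown_penalty=0.25):
        self.baseline_output = baseline_output
        self.lockdown_penalty = lockdown_penalty

    def step(self, lockdown):
        return self.baseline_output * ((1 - self.lockdown_penalty) if lockdown else 1.0)

# Compose the components and report indicators relevant to different concerns.
epi, econ = EpidemicCore(), EconomyComponent()
for day in range(180):
    lockdown = epi.i > 50_000                      # a crude policy trigger
    epi.step(contact_reduction=0.6 if lockdown else 0.0)
    output = econ.step(lockdown)
    if day % 30 == 0:
        print(f"day {day:3d}: infected={epi.i:9.0f}  output={output:5.1f}  lockdown={lockdown}")
```

Each further concern raised above (school closures, mental health, the informal sector) would enter as another component with its own indicator, so that the trade-offs stay visible in the outputs rather than being collapsed into a single epidemiological measure.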

As some modellers like complex models (and can think with the help of these models) while others rely on simple stories to increase their understanding of the world, only an open community of diverse specialists and modellers, KISS as well as KIDS, engaged in such a collective step-by-step elaboration, could resolve the central problem that ‘too complicated to understand’ is a relative, rather than absolute, assessment. One very important prerequisite of such collaboration is genuine ‘horizontality’ in the community: each participant is listened to seriously whatever their background, which can be an issue in interdisciplinary work, especially when it involves people at mixed career stages. Be that as it may, the central conjecture remains: agent-based modelling for policy analysis should be expected to involve even more complicated (assemblages of) models than empirical agent-based modelling.

Endnotes

[1] This point is the one that is the most disputed ex post in France, where lockdowns were justified (as in other countries) to “protect hospitals”. In France, the idea was not to avoid deaths of older people (90% of deaths were people older than 60, this demographic being 20% of the population), but to avoid hospitals being overwhelmed with Covid cases taking the place of others. In France, the official data regarding hospital activity state that Covid cases represented 2% of hospitalizations and 5% of Intensive Care Unit (ICU) utilizations. Further, hospitals halved their workload from March to May 2020 because almost all surgery was blocked to keep ICUs free. (In October-December 2020, although the epidemic was more significant at that time, the same decision was not taken.) Arguably, 2% of 50% is not an increase that should destroy a functioning system – https://www.atih.sante.fr/sites/default/files/public/content/4144/aah_2020_analyse_covid.pdf – page 2. Fixing dysfunction in the UK’s National Health Service has been a long-standing, and somewhat tedious, political and academic debate for years, even before Covid (e.g. Smith 2007; Mannion & Braithwaite 2012; Pope & Burnes 2013; Edwards & Palmer 2019).

[2] An interesting difference that French people heard about was that in the UK, people could wander on the beaches during lockdowns, whereas in France it was forbidden to go to any natural area – indeed, it was forbidden to go further than one kilometre from home. In fact, in the UK the lockdown restrictions were a ‘devolved matter’, with slightly different policies in each of the UK’s four member nations, though very similar legislation. In England, Section 6 paragraph (1) of The Health Protection (Coronavirus, Restrictions) (England) Regulations 2020 stated that “no person may leave the place where they are living without reasonable excuse”, with paragraph (2) covering examples of “reasonable excuses” including exercise, obtaining basic necessities, and accessing public services. Similar wording was used by the other devolved nations. None of the regulations stipulated any maximum distance from a person’s residence at which these activities had to take place – interpretation of the UK’s law is based on the behaviour of the ‘reasonable person’ (the so-called ‘man on the Clapham omnibus’ – see Łazowski 2021). However, differing interpretations between the citizenry and the constabulary of what ‘reasonable people’ would do led to fixed penalty notices being issued for taking exercise more than five miles (eight kilometres) from home – e.g. https://www.theguardian.com/uk-news/2021/jan/09/covid-derbyshire-police-to-review-lockdown-fines-after-walkers-given-200-penalties. In Scotland, though the Statutory Instrument makes no mention of any distance, people were ‘given guidance’ not to travel more than five miles from home for leisure and recreation, and were still advised to stay “within their local area” after this restriction was lifted (see https://www.gov.scot/news/travel-restrictions-lifted/).

[3] A problem which seems to be present in various countries: https://www.unesco.org/en/articles/new-academic-year-begins-unesco-warns-only-one-third-students-will-return-school
https://www.kff.org/other/report/kff-cnn-mental-health-in-america-survey/
https://eu.usatoday.com/in-depth/news/health/2023/05/15/school-avoidance-becomes-crisis-after-covid/11127563002/#:~:text=School%20avoidant%20behavior%2C%20also%20called,since%20the%20COVID%2D19%20pandemic
https://www.bbc.com/news/health-65954131

[4] https://www.cityam.com/omicron-mild-compared-to-delta-south-african-doctors-say/

[5] https://www.terrafemina.com/article/coronavirus-un-psy-alerte-sur-les-risques-du-confinement-pour-la-sante-mentale_a353002/1

[6] In 1946, in a short story later collected in El hacedor, Borges described a country where the art of map-making was so exacting in its need for detail that the whole country ended up covered by the ideal map. This leads to obvious troubles and the disappearance of the science of geography in that country.

[7] See Brewer et al. (2016) if the Akaike Information Criterion is leaping to your mind at this assertion.

[8] Although this assumption might not be stated that way anymore, as the hypothesis that many parts of the economy would suffer hugely started to prove true even before the end of the crisis: a problem that had been anticipated by only a few prominent economists (e.g. Boyer, 2020). This failure mainly shows that the description most economists make of the economy is too simplistic – as is often objected – to be able to anticipate massive disruptions. Everywhere in the world the informal sector was almost completely stopped, as people could neither work in their jobs nor meet for informal market exchange, which caused misery for a huge part of the world’s population, among them the most vulnerable (ILO, 2020).

[9] A real issue that became obvious is that nosocomial infections are (still) extremely significant in hospitals, with the proportion of Covid-19 infections acquired in hospital during the first wave of the epidemic estimated at 20 to 40% (Abbas et al. 2021).

Acknowledgements

GP’s work is supported by the Scottish Government Rural and Environment Science and Analytical Services Division (project reference JHI-C5-1).

References

Abbas, M., Nunes, T. R., Martischang, R., Zingg, W., Iten, A., Pittet, D. & Harbarth, S. (2021) Nosocomial transmission and outbreaks of coronavirus disease 2019: the need to protect both patients and healthcare workers. Antimicrobial Resistance & Infection Control 10, 7. doi:10.1186/s13756-020-00875-7

Boyer, R. (2020) Les capitalismes à l’épreuve de la pandémie, La découverte, Paris.

Brewer, M., Butler, A. & Cooksley, S. L. (2016) The relative performance of AIC, AICC and BIC in the presence of unobserved heterogeneity. Methods in Ecology and Evolution 7 (6), 679-692. doi:10.1111/2041-210X.12541

the CoVprehension Collective (2020) Understanding the current COVID-19 epidemic: one question, one model. Review of Artificial Societies and Social Simulation, 30th April 2020. https://rofasss.org/2020/04/30/covprehension/

Edmonds, B. (2022) Rigour for agent-based modellers. Presentation to the Social Simulation Conference 2022, Milan, Italy. https://cfpm.org/rigour/

Edmonds, B. & Moss, S. (2005) From KISS to KIDS – an ‘anti-simplistic’ modelling approach. Lecture Notes in Artificial Intelligence 3415, pp. 130-144. doi:10.1007/978-3-540-32243-6_11

Edwards, N. & Palmer, B. (2019) A preliminary workforce plan for the NHS. British Medical Journal 365 (8203), I4144. doi:10.1136/bmj.l4144

Gotts, N. M. & Polhill, J. G. (2010) Size matters: large-scale replications of experiments with FEARLUS. Advances in Complex Systems 13 (04), 453-467. doi:10.1142/S0219525910002670

ILO (2020) Impact of lockdown measures on the informal economy. ILO Brief. https://www.ilo.org/global/topics/employment-promotion/informal-economy/publications/WCMS_743523/lang--en/index.htm

Keen, S. (2021) The appallingly bad neoclassical economics of climate change. Globalizations 18 (7), 1149-1177. doi:10.1080/14747731.2020.1807856

Łazowski, A. (2021) Legal adventures of the man on the Clapham omnibus. In Urbanik, J. & Bodnar, A. (eds.) Περιμένοντας τους Bαρβάρους. Law in a Time of Constitutional Crisis: Studies Offered to Mirosław Wyrzykowski. C. H. Beck, Munich, Germany, pp. 415-426. doi:10.5771/9783748931232-415

Mannion, R. & Braithwaite, J. (2012) Unintended consequences of performance measurement in healthcare: 20 salutary lessons from the English National Health Service. Internal Medicine Journal 42 (5), 569-574. doi:10.1111/j.1445-5994.2012.02766.x

Pluchinotta, I., Daniell, K. A. & Tsoukiàs, A. (2022) Supporting decision making within the policy cycle: techniques and tools. In Howlett, M. (ed.) Handbook of Policy Tools. Routledge, London, pp. 235-244. doi:10.4324/9781003163954-24

Polhill, J. G. & Edmonds, B. (2023) Cognition and hypocognition: Discursive and simulation-supported decision-making within complex systems. Futures 148, 103121. doi:10.1016/j.futures.2023.103121

Pope, R. & Burnes, B. (2013) A model of organisational dysfunction in the NHS. Journal of Health Organization and Management 27 (6), 676-697. doi:10.1108/JHOM-10-2012-0207

Rouchier, J. (2023) Presentation to SocSimFest 23 during the session ‘How to do wrong using Social Simulation – as a result of arrogance, laziness or ill intent’. https://cfpm.org/slides/JR-Epi+newTINA.pdf

Roy, B. (1996) Multicriteria Methodology for Decision Aiding. Kluwer Academic Publishers.

Singh, S., Roy, D., Sinha, K., Parveen, S., Sharma, G. & Joshi, G. (2020) Impact of COVID-19 and lockdown on mental health of children and adolescents: A narrative review with recommendations. Psychiatry Research 293, 113429. doi:10.1016/j.psychres.2020.113429

Smith, I. (2007). Breaking the dysfunctional dynamics. In: Building a World-Class NHS. Palgrave Macmillan, London, pp. 132-177. doi:10.1057/9780230589704_5

Sun, Z., Lorscheid, I., Millington, J. D., Lauf, S., Magliocca, N. R., Groeneveld, J., Balbi, S., Nolzen, H., Müller, B., Schulze, J. & Buchmann, C. M. (2016) Simple or complicated agent-based models? A complicated issue. Environmental Modelling & Software 86, 56-67. doi:10.1016/j.envsoft.2016.09.006

Usta, J., Murr, H. & El-Jarrah, R. (2021) COVID-19 lockdown and the increased violence against women: understanding domestic violence during a pandemic. Violence and Gender 8 (3), 133-139. doi:10.1089/vio.2020.0069

Voinov, A. & Shugart, H. H. (2013) ‘Integronsters’, integral and integrated modeling. Environmental Modelling & Software 39, 149-158. doi:10.1016/j.envsoft.2012.05.014


Polhill, G. and Rouchier, J. (2023) Policy modelling requires a multi-scale, multi-criteria and diverse-framing approach. Review of Artificial Societies and Social Simulation, 31 Jul 2023. https://rofasss.org/2023/07/31/policy-modelling-necessitates-multi-scale-multi-criteria-and-a-diversity-of-framing


© The authors under the Creative Commons’ Attribution-NoDerivs (CC BY-ND) Licence (v4.0)

The Poverty of Suggestivism – the dangers of “suggests that” modelling

By Bruce Edmonds

Vagueness and refutation

A model[1] is basically composed of two parts (Zeigler 1976, Wartofsky 1979):

  1. A set of entities (such as mathematical equations, logical rules, computer code etc.) which can be used to make some inferences as to the consequences of that set (usually in conjunction with some data and parameter values)
  2. A mapping from this set to what it aims to represent – what the bits mean
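
As a minimal illustration of this two-part definition (purely schematic, and not taken from the cited works), a model can be represented in code as a pair: some inference machinery (1) plus an explicit mapping from its variables to what they are supposed to represent (2). The example content below is invented.

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class Model:
    # Part (1): entities from which inferences can be drawn (here, an update rule).
    infer: Callable[[Dict[str, float]], Dict[str, float]]
    # Part (2): the mapping saying what each model variable is meant to represent.
    # Without this part, (1) alone does not represent anything.
    mapping: Dict[str, str]

# An invented toy example: logistic growth of 'x', with an explicit mapping.
toy = Model(
    infer=lambda state: {"x": state["x"] + 0.1 * state["x"] * (1 - state["x"])},
    mapping={"x": "proportion of the population holding opinion A"},
)

state = {"x": 0.05}
for _ in range(20):
    state = toy.infer(state)
print(state["x"], "->", toy.mapping["x"])
```

The problems discussed below arise when this second part is left vague, or is quietly re-invented each time the model is applied to a new case.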

Whilst a lot of attention has been paid to the internal rigour of the set of entities and the inferences that are made from them (1), the mapping to what that represents (2) has often been left as implicit or incompletely described – sometimes only indicated by the labels given to its parts. The result is a model that vaguely relates to its target, suggesting its properties analogically. There is not a well-defined way that the model is to be applied to anything observed, but a new map is invented each time it is used to think about a particular case. I call this way of modelling “Suggestivism”, because the model “suggests” things about what is being modelled.

This is partly a recapitulation of Popper’s critique of vague theories in his book “The Poverty of Historicism” (1957). He characterised such theories as “irrefutable”, because whatever the facts, these theories could be made to fit them. Irrefutability is an indicator of a lack of precise mapping to reality – such vagueness makes refutation very hard. However, it is only an indicator; there may be other reasons than vagueness for it not being possible to test a theory – it is their disconnection from well-defined empirical reference that is the issue here.

Some might go as far as suggesting that any model or theory that is not refutable is “unscientific”, but this goes too far, implying a very restricted definition of what ‘science’ is. We need analogies to think about what we are doing and to gain insight into what we are studying, e.g. (Hartmann 1997) – for humans they are unavoidable, ‘baked’ into the way language works (Lakoff 1987). A model might make a set of ideas clear and help map out the consequences of a set of assumptions/structures/processes. Many of these suggestivist models relate to a set of ideas and it is the ideas that relate to what is observed (albeit informally) (Edmonds 2001). However, such models do not capture anything reliable about what they refer to, and in that sense are not part of the set of established statements and theories that is at the core of science (Arnold 2014).

The dangers of suggestivist modelling

As above, there are valid uses of abstract or theoretical modelling where this is explicitly acknowledged and where no conclusions about observed phenomena are made. So what are the dangers of suggestivist modelling – why am I making such a fuss about it?

Firstly, people often seem to confuse a model used as an analogy – a way of thinking about stuff – with a model that tells us reliably about what we are studying. Thus they give undue weight to the analyses of abstract models that are, in fact, just thought experiments. Making models is a very intimate way of theorising – one spends an extended period of time interacting with one’s model: developing, checking, analysing etc. The result is a particularly strong version of “Kuhnian Spectacles” (Kuhn 1962), causing us to see the world through our model for weeks afterwards. Under this strong influence it is natural to confuse what we can reliably infer about the world with how we are currently perceiving/thinking about it. Good scientists should then pause and wait for this effect to wear off so that they can effectively critique what they have done, its limitations and its implications. However, in the rush to get their work out, modellers often do not do this, resulting in a sloppy set of suggestive interpretations of their modelling.

Secondly, empirical modelling is hard. It is far easier (and, frankly, more fun) to play with non-empirical models. A scientific culture that treats suggestivist modelling as substantial progress and significantly rewards modellers who do it will effectively divert a lot of modelling effort in this direction. Chattoe-Brown (2018) displayed evidence of this in his survey of opinion dynamics models – abstract, suggestivist models got far more reward (in terms of citations) than those that tried to relate to empirical data in a direct manner. Abstract modelling has a role in science, but if it is easier and more rewarding then the field will become unbalanced. It may give the impression of progress but not deliver on this impression. In a more mature science, researchers working on measurement methods (steps from observation to models) and collecting good data are as important as the theorists (Moss 1998).

Thirdly, it is hard to judge suggestivist models. Given that their connection to the modelling target is vague, there cannot be any decisive test of their success. Good modellers should declare the exact purpose of their model, e.g. that it is analogical or merely explores the consequences of theory (Edmonds et al. 2019), but then accept the consequences of this choice – namely, that it excludes drawing conclusions about the observed world. If the model is a theoretical exploration then the comprehensiveness and scope of the exploration and the applicability of the model can be judged, but if the model is analogical or illustrative then this is harder. Whilst one model may suggest X, another may suggest the opposite. It is quite easy to fix a model to get the outcomes one wants. Clearly, if a model makes startling suggestions – illustrating totally new ideas or providing a counter-example to widely held assumptions – then this helps science by widening the pool of theories or hypotheses that are considered. However, most suggestivist modelling does not do this.

Fourthly, their sheer flexibility of application causes problems – if one works hard enough one can invent mappings to a wide range of cases; the limits are only those of our imagination. In effect, having a vague mapping from model to what it models adds huge flexibility, in a similar way to having a large number of free (non-empirical) parameters. This flexibility gives an impression of generality, and many desire simple and general models for complex phenomena. However, this is illusory, because a different mapping is needed for each case to make the model apply. Given the above (1)+(2) definition of a model, this means that it is, in fact, a different model for each case – what a model refers to is part of the model. The same flexibility makes such models impossible to refute, since one can just adjust the mapping to save them. The apparent generality and lack of refutation mean that such models hang around in the literature, due to their surface attractiveness.

Finally, these kinds of model are hugely influential beyond the community of modellers, reaching the wider public including policy actors. Narratives that start in abstract models make their way out and can be very influential (Vranckx 1999). Despite the lack of rigorous mapping from model to reality, suggestivist models look impressive and scientific. For example, very abstract models from the Neo-Classical ‘Chicago School’ of economists supported narratives about the optimal efficiency of markets, leading to a reluctance to regulate them (Krugman 2009). A lack of regulation seemed to be one of the factors behind the 2007/8 economic crash (Baily et al. 2008). Modellers may understand that other modellers get over-enthusiastic and over-interpret their models, but others may not. It is the duty of modellers to give an accurate impression of the reliability of any modelling results and not to over-hype them.

How to recognise a suggestivist model

It can be hard to untangle how empirically vague a model is, because many descriptions of modelling work do not focus on making the mapping to what it represents precise. The reasons for this are various, for example: the modeller might be conflating reality and what is in the model in their mind; the researcher may be new to modelling and not have really decided what the purpose of their model is; the modeller might be over-keen to establish the importance of their work and so is hyping the motivation and conclusions; they might simply not have got around to thinking enough about the relationship between their model and what it might represent; or they might not have bothered to make the relationship explicit in their description. Whatever the reason, the reader of any description of such work is often left with an archaeological problem: trying to unearth what the relationship might be, based on indirect clues only. The only way to know for certain is to take a case one knows about and try to apply the model to it, but this is a time-consuming process and relies upon having a case with suitable data available. However, there are some indicators, albeit fallible ones, including the following.

  • A relatively simple model is interpreted as explaining a wide range of observed, complex phenomena
  • No data from an observed case study is compared to data from the model (often no data is brought in at all, merely abstract observations) – despite this, conclusions about some observed phenomena are made
  • The purpose of the model is not explicitly declared
  • The language of the paper seems to conflate talking about the model with what is being modelled
  • In the paper there are sudden abstraction ‘jumps’ between the motivation and the description of the model and back again to the interpretation of the results in terms of that motivation. The abstraction jumps involved are large and justified by some a priori theory or modelling precedents rather than evidence.

How to avoid suggestivist modelling

How to avoid the dangers of suggestivist modelling should be clear from the above discussion, but I will make them explicit here.

  • Be clear about the model’s purpose – that is, what the model aims to achieve – which indicates how it should be judged by others (Edmonds et al. 2019)
  • Do not make any conclusions about the real world if you have not related the model to any data
  • Do not make any policy conclusions – things that might affect other people’s lives – without at least some independent validation of the model outcomes
  • Document how a model relates (or should relate) to data, the nature of that data and maybe even the process whereby that data should be obtained (Achter et al 2019)
  • Be as explicit as possible about what kinds of phenomena the model applies to – the limits of its scope
  • Keep the language about the model and what is being modelled distinct – for any statement it should be clear whether it is talking about the model or what it models (Edmonds 2020)
  • Highlight any bold assumptions in the specification of the model or describe what empirical foundation there is for them – be honest about these

Conclusion

Models can serve many different purposes (Epstein 2008). This is fine as long as the purpose of a model is always made clear, and model results are not interpreted further than its established purpose allows. Research which gives the impression that analogical, illustrative or theoretical modelling can tell us anything reliable about observed complex phenomena is not only sloppy science, but can have a deleterious impact – giving an impression of progress whilst diverting attention from empirically reliable work. Like a bad investment: if it looks too good and too easy to be true, it probably isn’t.

Notes

[1] We often use the word “model” in a lazy way to indicate (1) rather than (1)+(2) in this definition, but a set of entities without any meaning or mapping to anything else is not a model, as it does not represent anything. For example, a random set of equations or program instructions does not make a model.

Acknowledgements

Bruce Edmonds is supported as part of the ESRC-funded, UK part of the “ToRealSim” project, grant number ES/S015159/1.

References

Achter, S., Borit, M., Chattoe-Brown, E., Palaretti, C. & Siebers, P.-O. (2019) Cherchez Le RAT: A Proposed Plan for Augmenting Rigour and Transparency of Data Use in ABM. Review of Artificial Societies and Social Simulation, 4th June 2019. https://rofasss.org/2019/06/04/rat/

Arnold, E. (2014). What’s wrong with social simulations?. The Monist, 97(3), 359-377. DOI:10.5840/monist201497323

Baily, M. N., Litan, R. E., & Johnson, M. S. (2008). The origins of the financial crisis. Fixing Finance Series – Paper 3, The Brookings Institution. https://www.brookings.edu/wp-content/uploads/2016/06/11_origins_crisis_baily_litan.pdf

Chattoe-Brown, E. (2018) What is the earliest example of a social science simulation (that is nonetheless arguably an ABM) and shows real and simulated data in the same figure or table? Review of Artificial Societies and Social Simulation, 11th June 2018. https://rofasss.org/2018/06/11/ecb/

Edmonds, B. (2001) The Use of Models – making MABS actually work. In. Moss, S. and Davidsson, P. (eds.), Multi Agent Based Simulation, Lecture Notes in Artificial Intelligence, 1979:15-32. http://cfpm.org/cpmrep74.html

Edmonds, B. (2020) Basic Modelling Hygiene – keep descriptions about models and what they model clearly distinct. Review of Artificial Societies and Social Simulation, 22nd May 2020. https://rofasss.org/2020/05/22/modelling-hygiene/

Edmonds, B., le Page, C., Bithell, M., Chattoe-Brown, E., Grimm, V., Meyer, R., Montañola-Sales, C., Ormerod, P., Root H. & Squazzoni. F. (2019) Different Modelling Purposes. Journal of Artificial Societies and Social Simulation, 22(3):6. http://jasss.soc.surrey.ac.uk/22/3/6.html.

Epstein, J. M. (2008). Why model? Journal of Artificial Societies and Social Simulation, 11(4), 12. https://jasss.soc.surrey.ac.uk/11/4/12.html

Hartmann, S. (1997): Modelling and the Aims of Science. In: Weingartner, P. et al (ed.) : The Role of Pragmatics in Contemporary Philosophy: Contributions of the Austrian Ludwig Wittgenstein Society. Vol. 5. Wien und Kirchberg: Digi-Buch. pp. 380-385. https://epub.ub.uni-muenchen.de/25393/

Krugman, P. (2009) How Did Economists Get It So Wrong? New York Times, Sept. 2nd 2009. https://www.nytimes.com/2009/09/06/magazine/06Economic-t.html

Kuhn, T.S. (1962) The Structure of Scientific Revolutions. Chicago: University of Chicago Press.

Lakoff, G. (1987) Women, fire, and dangerous things. University of Chicago Press, Chicago.

Morgan, M. S., & Morrison, M. (1999). Models as mediators. Cambridge: Cambridge University Press.

Moss, S. (1998) Social Simulation Models and Reality: Three Approaches. Centre for Policy Modelling  Discussion Paper: CPM-98-35, http://cfpm.org/cpmrep35.html

Popper, K. (1957). The poverty of historicism. Routledge.

Vranckx, An. (1999) Science, Fiction & the Appeal of Complexity. In Aerts, Diederik, Serge Gutwirth, Sonja Smets, and Luk Van Langehove, (eds.) Science, Technology, and Social Change: The Orange Book of “Einstein Meets Magritte.” Brussels: Vrije Universiteit Brussel; Dordrecht: Kluwer., pp. 283–301.

Wartofsky, M. W. (1979). The model muddle: Proposals for an immodest realism. In Models (pp. 1-11). Springer, Dordrecht.

Zeigler, B. P. (1976). Theory of Modeling and Simulation. Wiley Interscience, New York.


Edmonds, B. (2022) The Poverty of Suggestivism – the dangers of "suggests that" modelling. Review of Artificial Societies and Social Simulation, 28th Feb 2022. https://rofasss.org/2022/02/28/poverty-suggestivism


© The authors under the Creative Commons’ Attribution-NoDerivs (CC BY-ND) Licence (v4.0)

The Systematic Comparison of Agent-Based Policy Models – It’s time we got our act together!

By Mike Bithell and Bruce Edmonds

Model Intercomparison

The recent Covid crisis has led to a surge of new model development and a renewed interest in the use of models as policy tools. While this is in some senses welcome, the sudden appearance of many new models presents a problem in terms of their assessment, the appropriateness of their application and reconciling any differences in outcome. Even if they appear similar, their underlying assumptions may differ, their initial data might not be the same, policy options may be applied in different ways, stochastic effects explored to a varying extent, and model outputs presented in any number of different forms. As a result, it can be unclear which aspects of the variation in output between models result from mechanistic, parameter or data differences. Any comparison between models is made tricky by differences in experimental design and selection of output measures.

If we wish to do better, we suggest that a more formal approach to making comparisons between models would be helpful. However, it appears that this is not commonly undertaken in most fields in a systematic and persistent way, except in the field of climate change and closely related fields such as pollution transport or economic impact modelling (although efforts are underway to extend such systematic comparison to ecosystem models – Wei et al., 2014; Tittensor et al., 2018). Examining the way in which this is done for climate models may therefore prove instructive.

Model Intercomparison Projects (MIP) in the Climate Community

Formal intercomparison of atmospheric models goes back at least to 1989 (Gates et al., 1999) with the first Atmospheric Model Intercomparison Project (AMIP), initiated by the World Climate Research Programme. By 1999 this had contributions from all significant atmospheric modelling groups, providing standardised time-series of over 30 model variables for one particular historical decade of simulation, with a standard experimental setup. Comparisons of model mean values with available data helped to reveal overall model strengths and weaknesses: no single model was best at simulating all aspects of the atmosphere, with accuracy varying greatly between simulations. The model outputs also formed a reference base for further intercomparison experiments, including targets for model improvement and reduction of systematic errors, as well as a starting point for improved experimental design, software and data management standards, and protocols for communication and model intercomparison. This led to AMIP II and, subsequently, to a series of Coupled Model Intercomparison Projects (CMIP) beginning with CMIP I in 1996. The latest iteration (CMIP 6) is a collection of 23 separate model intercomparison experiments covering atmosphere, ocean, land surface, geo-engineering, and the paleoclimate. This collection is aimed at the upcoming 2021 IPCC process (AR6). Participating projects go through an endorsement process for inclusion (a process agreed with modelling groups), based on 10 criteria designed to ensure some degree of coherence between the various models – a further 18 MIPs are also listed as currently active (https://www.wcrp-climate.org/wgcm-cmip/wgcm-cmip6). Groups contribute to a central set of common experiments covering the period 1850 to the near-present. An overview of the whole process can be found in Eyring et al. (2016).

The current structure includes a set of three overarching questions covering the dynamics of the earth system, model systematic biases, and understanding possible future change under uncertainty. Individual MIPs may build on this to address one or more of a set of 7 “grand science challenges” associated with the climate. Modelling groups agree to provide outputs in a standard form, obtained from a specified set of experiments under the same design, and to provide standardised documentation to go with their models. Originally (up to CMIP 5), outputs were then added to a central public repository for further analysis; however, the output grew so large under CMIP 6 that the data is now held dispersed across repositories maintained by separate groups.

Other Examples

Two further, more recent examples of collective model development may also be helpful to consider.

Firstly, an informal network collating models across more than 50 research groups has already been generated as a result of the COVID crisis – the Covid Forecast Hub (https://covid19forecasthub.org). This is run by a small number of research groups collaborating with the US Centers for Disease Control and Prevention and is strongly focussed on the epidemiology. Participants are encouraged to submit weekly forecasts, and these are integrated into a data repository and can be visualized on the website – viewers can look at forward projections, along with associated confidence intervals and model evaluation scores, including those for an ensemble of all models. The focus on forecasts in this case arises out of the strong policy drivers for the current crisis, but the main point is that it is possible to immediately view measures of model performance and to compare the different model types: one clear message that rapidly becomes apparent is that many of the forward projections have 95% (and sometimes even 50%) confidence intervals for incident deaths that more than span the full range of the past historic data. The benefit of comparing many different models in this case is apparent, as many of the historic single-model projections diverge strongly from the data (and the models most in error are not consistently the same ones over time), although the ensemble mean tends to be better.
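
As a toy illustration of the ensemble idea (entirely invented numbers, and not the Covid Forecast Hub’s actual scoring methodology), the sketch below combines several point forecasts into an ensemble mean and compares the errors:

```python
# Toy illustration of combining forecasts into an ensemble (invented numbers;
# not the Covid Forecast Hub's actual scoring methodology).
forecasts = {                      # point forecasts of next week's incident deaths
    "model_A": 1200.0,
    "model_B": 450.0,
    "model_C": 800.0,
    "model_D": 2100.0,
}
observed = 900.0                   # what actually happened (invented)

ensemble_mean = sum(forecasts.values()) / len(forecasts)

errors = {name: abs(value - observed) for name, value in forecasts.items()}
errors["ensemble"] = abs(ensemble_mean - observed)

for name, err in sorted(errors.items(), key=lambda kv: kv[1]):
    print(f"{name:10s} absolute error: {err:7.1f}")
```

On any given week an individual model may happen to beat the ensemble, but because the worst-performing models change over time, the ensemble tends to be the safer summary – which is the pattern reported above.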

As a second example, one could consider the Psychological Science Accelerator (PSA: Moshontz et al 2018, https://psysciacc.org/). This is a collaborative network set up with the aim of addressing the “replication crisis” in psychology: many previously published results in psychology have proved problematic to replicate as a result of small or non-representative sampling or use of experimental designs that do not generalize well or have not been used consistently either within or across studies. The PSA seeks to ensure accumulation of reliable and generalizable evidence in psychological science, based on principles of inclusion, decentralization, openness, transparency and rigour. The existence of this network has, for example, enabled the reinvestigation of previous experiments but with much larger and less nationally biased samples (e.g. Jones et al 2021).

The Benefits of the Intercomparison Exercises and Collaborative Model Building

More specifically, long-term intercomparison projects help to do the following.

  • Build on past effort. Rather than modellers re-inventing the wheel (or building a new framework) with each new model project, libraries of well-tested and documented models, with data archives, including code and experimental design, would allow researchers to more efficiently work on new problems, building on previous coding effort
  • Aid replication. Focussed long term intercomparison projects centred on model results with consistent standardised data formats would allow new versions of code to be quickly tested against historical archives to check whether expected results could be recovered and where differences might arise, particularly if different modelling languages were being used
  • Help to formalize. While informal code archives can help to illustrate the methods or theoretical foundations of a model, intercomparison projects help to understand which kinds of formal model might be good for particular applications, and which can be expected to produce helpful results for given desired output measures
  • Build credibility. A continuously updated set of model implementations and assessment of their areas of competence and lack thereof (as compared with available datasets) would help to demonstrate the usefulness (or otherwise) of ABM as a way to represent social systems
  • Influence Policy (where appropriate). Formal international policy organisations such as the IPCC or the more recently formed IPBES are effective partly through an underpinning of well tested and consistently updated models. As yet it is difficult to see whether such a body would be appropriate or effective for social systems, as we lack the background of demonstrable accumulated and well tested model results.

Lessons for ABM?

What might we be able to learn from the above, if we attempted to use a similar process to compare ABM policy models?

In the first place, the projects started small and grew over time: it would not be necessary, for example, to cover all possible ABM applications at the outset. On the other hand, the latest CMIP iterations include a wide range of different types of model covering many different aspects of the earth system, so that the breadth of possible model types need not be seen as a barrier.

Secondly, the climate inter-comparison project has been persistent for some 30 years – over this time many models have come and gone, but the history of inter-comparisons allows for an overview of how well these models have performed over time. Data from the original AMIP I models is still available on request, supporting assessments concerning long-term model improvement.

Thirdly, although climate models are complex – implementing a variety of different mechanisms in different ways – they can still be compared by use of standardised outputs, and at least some (although not necessarily all) have been capable of direct comparison with empirical data.

Finally, an agreed experimental design and a public archive for documentation and output that is stable over time are needed; these should be established via a collective agreement among the modelling groups involved, so as to ensure long-term buy-in from the community as a whole and a consistent basis for long-term model development, building on past experience.

The need for aligning or reproducing ABMs has long been recognised within the community (Axtell et al. 1996; Edmonds & Hales 2003), but this has mostly been done on a one-to-one basis for verifying the specification of models against their implementation, although Hales et al. (2003) discuss a range of possibilities. However, this is far from a situation where many different models of basically the same phenomena are systematically compared – that would be a larger-scale collaboration lasting over a longer time span.

The community has already established a standardised form of documentation in the ODD protocol. Sharing of model code is also becoming routine, and can easily be achieved through COMSES, GitHub or similar. The sharing of data in a long-term archive may require more investigation. As a starting point, COVID-19 provides an ideal opportunity for setting up such a model inter-comparison project – multiple groups already have running examples, and a shared set of outputs and experiments should be straightforward to agree on. This would potentially form a basis for forward-looking experiments designed to assist with possible future pandemic problems, and a basis on which to build further features into the existing disease-focussed modelling, such as the effects of economic, social and psychological issues.

Additional Challenges for ABMs of Social Phenomena

Nobody supposes that modelling social phenomena is going to have the same set of challenges that climate change models face. Some of the differences include:

  • The availability of good data. Social science is bedevilled by a paucity of the right kind of data. Although an increasing amount of relevant data is being produced, there are commercial, ethical and data protection barriers to accessing it and the data rarely concerns the same set of actors or events.
  • The understanding of micro-level behaviour. Whilst the micro-level understanding of our atmosphere is very well established, that of the behaviour of the most important actors (humans) is not. However, it may be that better data might partially substitute for a generic behavioural model of decision-making.
  • Agreement upon the goals of modelling. Although there will always be considerable variation in terms of what is wanted from a model of any particular social phenomenon, a common core of agreed objectives will help focus any comparison and give confidence via ensembles of projections. Although the MIPs and the Covid Forecast Hub are focussed on prediction, it may be that empirical explanation is more important in other areas.
  • The available resources. ABM projects tend to be add-ons to larger endeavours and based around short-term grant funding. The funding for big ABM projects is yet to be established, not having the equivalent of weather forecasting to piggy-back on.
  • Persistence of modelling teams/projects. ABM tends to be quite short-term with each project developing a new model for a new project. This has made it hard to keep good modelling teams together.
  • Deep uncertainty. Whilst the set of possible factors and processes involved in a climate change model are well established, which social mechanisms need to be involved in any model of any particular social phenomena is unknown. For this reason, there is deep disagreement about the assumptions to be made in such models, as well as sharp divergence in outcome due to changes brought about by a particular mechanism but not included in a model. Whilst uncertainty in known mechanisms can be quantified, assessing the impact of those due to such deep uncertainty is much harder.
  • The sensitivity of the political context. Even in the case of Climate Change, where the assumptions made are relatively well understood and done on objective bases, the modelling exercise and its outcomes can be politically contested. In other areas, where the representation of people’s behaviour might be key to model outcomes, this will need even more care (Adoha & Edmonds 2017).
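
As a toy illustration of the ‘ensembles of projections’ point above – the data structure and numbers are invented for the example – several models’ projections of the same quantity could be summarised by a per-date median and interquartile range:

```python
# Illustrative only: combining several models' projections into an ensemble summary.
# The input structure (model name -> {date: projected value}) is an assumption.
from statistics import median, quantiles

def ensemble_summary(projections):
    """Per-date median and interquartile range across the supplied model projections."""
    dates = sorted(set().union(*(p.keys() for p in projections.values())))
    summary = {}
    for d in dates:
        values = [p[d] for p in projections.values() if d in p]
        if len(values) >= 2:
            q1, _, q3 = quantiles(values, n=4)  # quartile cut points
        else:
            q1 = q3 = values[0]
        summary[d] = {"median": median(values), "iqr": (q1, q3)}
    return summary

# Example with made-up projections from three hypothetical models:
print(ensemble_summary({
    "model_A": {"2020-04-01": 100, "2020-04-02": 130},
    "model_B": {"2020-04-01": 90, "2020-04-02": 150},
    "model_C": {"2020-04-01": 120, "2020-04-02": 110},
}))
```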

However, some of these problems were solved in the case of Climate Change as a result of the CMIP exercises and the reports that ultimately flowed from them. Over time the development of the models also allowed for a broadening and updating of modelling goals, starting from a relatively narrow initial set of experiments. Ensuring the persistence of individual modelling teams is easier in the context of an internationally recognised comparison project, because resources may be easier to obtain and there is a consistent central focus. The modelling projects became longer-term as individual researchers could establish a career doing just climate change modelling and the importance of the work was increasingly recognised. An ABM model comparison project might help solve some of these problems as the importance of its work becomes established.

Towards an Initial Proposal

The topic chosen for this project should be one where (a) there is enough public interest to justify the effort, and (b) a number of models with a similar purpose are already being developed. At the current stage this suggests dynamic models of COVID spread, but there are other possibilities, including transport models (where people go and whom they meet) or criminological models (where and when crimes happen).

Whichever ensemble of models is focussed upon, these models should be compared on a core of standardised settings and outputs (a minimal sketch of such a core specification follows the list below), with the same:

  • Start and end dates (though not necessarily the same temporal granularity)
  • Set of regions or cases covered
  • Population data (though possibly enhanced with extra data and maybe scaled population sizes)
  • Initial conditions for the population
  • Core of agreed output measures (with others optionally reported as well)
  • Core set of test cases, with agreed data sets, against which model agreement is checked
  • Standard reporting format (though with a discussion section for further/other observations)
  • Requirement to be well documented, with open-access code
  • Minimum number of runs with different random seeds
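
A minimal sketch of how such a core specification might be written down and enforced is given below; the field names, dates and the run_model() hook are hypothetical, and each team would wrap its own model behind that hook.

```python
# Illustrative only: a shared core specification every participating model would run against.
# The field names, values and the run_model() callback are hypothetical.
from dataclasses import dataclass

@dataclass
class CoreSpec:
    start_date: str = "2020-03-01"
    end_date: str = "2020-09-01"
    regions: tuple = ("region_1", "region_2")
    population_file: str = "shared_population.csv"
    output_measures: tuple = ("new_infections", "hospitalisations", "deaths")
    seeds: tuple = tuple(range(10))  # the agreed minimum number of replicate runs

def run_comparison(spec, run_model):
    """Run one team's model once per agreed seed and collect the agreed output measures."""
    results = []
    for seed in spec.seeds:
        output = run_model(spec, seed)  # supplied by each team; returns {measure: time series}
        missing = [m for m in spec.output_measures if m not in output]
        if missing:
            raise ValueError(f"Run with seed {seed} is missing agreed measures: {missing}")
        results.append({"seed": seed, **output})
    return results
```

A team would then only need to provide a run_model(spec, seed) function returning the agreed measures for its results to be directly comparable with everyone else’s.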

Any modeller or team (commercial, government or academic) with a suitable model and a willingness to adhere to the rules would be welcome to participate, and these teams would collectively decide the rules, direct the development of the exercise and write any reports on the comparisons. Other interested stakeholder groups could be involved – including professional/academic associations, NGOs and government departments – but in a consultative role providing wider critique; it is important that the terms of the exercise and the reports from it be independent of any particular interest or authority.

Conclusion

We call upon those who think ABMs have the potential to usefully inform policy decisions to work together, in order that the transparency and rigour of our modelling matches our ambition. Whilst model comparison exercises of the kind described are important for any simulation work, particular care needs to be taken when the outcomes can affect people’s lives.

References

Aodha, L. & Edmonds, B. (2017) Some pitfalls to beware when applying models to issues of policy relevance. In Edmonds, B. & Meyer, R. (eds.) Simulating Social Complexity – a handbook, 2nd edition. Springer, 801-822. (A version is at http://cfpm.org/discussionpapers/236)

Axtell, R., Axelrod, R., Epstein, J. M., & Cohen, M. D. (1996). Aligning simulation models: A case study and results. Computational & Mathematical Organization Theory, 1(2), 123-141. https://link.springer.com/article/10.1007%2FBF01299065

Edmonds, B., & Hales, D. (2003). Replication, replication and replication: Some hard lessons from model alignment. Journal of Artificial Societies and Social Simulation, 6(4), 11. http://jasss.soc.surrey.ac.uk/6/4/11.html

Eyring, V., Bony, S., Meehl, G. A., Senior, C. A., Stevens, B., Stouffer, R. J., & Taylor, K. E. (2016). Overview of the Coupled Model Intercomparison Project Phase 6 (CMIP6) experimental design and organization. Geoscientific Model Development, 9(5), 1937–1958. https://doi.org/10.5194/gmd-9-1937-2016

Gates, W. L., Boyle, J. S., Covey, C., Dease, C. G., Doutriaux, C. M., Drach, R. S., Fiorino, M., Gleckler, P. J., Hnilo, J. J., Marlais, S. M., Phillips, T. J., Potter, G. L., Santer, B. D., Sperber, K. R., Taylor, K. E., & Williams, D. N. (1999). An Overview of the Results of the Atmospheric Model Intercomparison Project (AMIP I). Bulletin of the American Meteorological Society, 80(1), 29–55. https://doi.org/10.1175/1520-0477(1999)080<0029:AOOTRO>2.0.CO;2

Hales, D., Rouchier, J., & Edmonds, B. (2003). Model-to-model analysis. Journal of Artificial Societies and Social Simulation, 6(4), 5. http://jasss.soc.surrey.ac.uk/6/4/5.html

Jones, B. C., DeBruine, L. M., Flake, J. K., et al. (2021). To which world regions does the valence–dominance model of social perception apply? Nature Human Behaviour, 5, 159–169. https://doi.org/10.1038/s41562-020-01007-2

Moshontz, H., and 85 others (2018). The Psychological Science Accelerator: Advancing Psychology Through a Distributed Collaborative Network. Advances in Methods and Practices in Psychological Science, 1(4), 501–515. https://doi.org/10.1177/2515245918797607

Tittensor, D. P., Eddy, T. D., Lotze, H. K., Galbraith, E. D., Cheung, W., Barange, M., Blanchard, J. L., Bopp, L., Bryndum-Buchholz, A., Büchner, M., Bulman, C., Carozza, D. A., Christensen, V., Coll, M., Dunne, J. P., Fernandes, J. A., Fulton, E. A., Hobday, A. J., Huber, V., … Walker, N. D. (2018). A protocol for the intercomparison of marine fishery and ecosystem models: Fish-MIP v1.0. Geoscientific Model Development, 11(4), 1421–1442. https://doi.org/10.5194/gmd-11-1421-2018

Wei, Y., Liu, S., Huntzinger, D. N., Michalak, A. M., Viovy, N., Post, W. M., Schwalm, C. R., Schaefer, K., Jacobson, A. R., Lu, C., Tian, H., Ricciuto, D. M., Cook, R. B., Mao, J., & Shi, X. (2014). The North American Carbon Program Multi-Scale Synthesis and Terrestrial Model Intercomparison Project – Part 2: Environmental driver data. Geoscientific Model Development, 7(6), 2875–2893. https://doi.org/10.5194/gmd-7-2875-2014


Bithell, M. and Edmonds, B. (2021) The Systematic Comparison of Agent-Based Policy Models – It’s time we got our act together! Review of Artificial Societies and Social Simulation, 11th May 2021. https://rofasss.org/2021/05/11/SystComp/


 

Cherchez Le RAT: A Proposed Plan for Augmenting Rigour and Transparency of Data Use in ABM

By Sebastian Achter, Melania Borit, Edmund Chattoe-Brown, Christiane Palaretti and Peer-Olaf Siebers

The initiative presented below arose from a Lorentz Center workshop on Integrating Qualitative and Quantitative Evidence using Social Simulation (8-12 April 2019, Leiden, the Netherlands). At the beginning of this workshop, the attendees divided themselves into teams aiming to work on specific challenges within the broad domain of the workshop topic. Our team took up the challenge of looking at “Rigour, Transparency, and Reuse”. The aim that emerged from our initial discussions was to create a framework for augmenting rigour and transparency (RAT) of data use in ABM when designing, analysing and publishing such models.

One element of the framework that the group worked on was a roadmap of the modelling process in ABM, with particular reference to the use of different kinds of data. This roadmap was used to generate the second element of the framework: a protocol consisting of a set of questions which, if answered by the modeller, would ensure that the published model was as rigorous and transparent in terms of data use as it needs to be for the reader to understand and reproduce it.

The group (which had diverse modelling approaches and spanned a number of disciplines) recognised the challenges of this approach and much of the week was spent examining cases and defining terms so that the approach did not assume one particular kind of theory, one particular aim of modelling, and so on. To this end, we intend that the framework should be thoroughly tested against real research to ensure its general applicability and ease of use.

The team was also very keen not to “reinvent the wheel”, but to try to develop the RAT approach (in connection with data use) to augment and “join up” existing protocols or documentation standards for specific parts of the modelling process. For example, the ODD protocol (Grimm et al. 2010) and its variants are generally accepted as the established way of documenting ABMs, but do not request rigorous documentation or justification of the data used in the modelling process.

The plan to move forward with the development of the framework is organised around three journal articles and associated dissemination activities:

  • A literature review of best (data use) documentation and practice in other disciplines and research methods (e.g. PRISMA – Preferred Reporting Items for Systematic Reviews and Meta-Analyses)
  • A literature review of available documentation tools in ABM (e.g. ODD and its variants, DOE, the “Info” pane of NetLogo, EABSS)
  • An initial statement of the goals of RAT, the roadmap, the protocol and the process of testing these resources for usability and effectiveness
  • A presentation, poster, and round table at SSC 2019 (Mainz)

We would appreciate suggestions for items that should be included in the literature reviews; “beta testers” and critical readers for the roadmap and protocol (from as many disciplines and modelling approaches as possible); reactions (whether positive or negative) to the initiative itself (including joining it!); and participation in the various activities we plan at Mainz. If you are interested in any of these roles, please email Melania Borit (melania.borit@uit.no).

Acknowledgements

Chattoe-Brown’s contribution to this research is funded by the project “Towards Realistic Computational Models Of Social Influence Dynamics” (ES/S015159/1) funded by ESRC via ORA Round 5 (PI: Professor Bruce Edmonds, Centre for Policy Modelling, Manchester Metropolitan University: https://gtr.ukri.org/projects?ref=ES%2FS015159%2F1).

References

Grimm, V., Berger, U., DeAngelis, D. L., Polhill, J. G., Giske, J. and Railsback, S. F. (2010) ‘The ODD Protocol: A Review and First Update’, Ecological Modelling, 221(23):2760–2768. doi:10.1016/j.ecolmodel.2010.08.019


Achter, S., Borit, M., Chattoe-Brown, E., Palaretti, C. and Siebers, P.-O.(2019) Cherchez Le RAT: A Proposed Plan for Augmenting Rigour and Transparency of Data Use in ABM. Review of Artificial Societies and Social Simulation, 4th June 2019. https://rofasss.org/2019/06/04/rat/


© The authors under the Creative Commons’ Attribution-NoDerivs (CC BY-ND) Licence (v4.0)