Tag Archives: rigour

The Poverty of Suggestivism – the dangers of “suggests that” modelling

By Bruce Edmonds

Vagueness and refutation

A model[1] is basically composed of two parts (Zeigler 1976, Wartofsky 1979):

  1. A set of entities (such as mathematical equations, logical rules, computer code etc.) which can be used to make some inferences as to the consequences of that set (usually in conjunction with some data and parameter values)
  2. A mapping from this set to what it aims to represent – what the bits mean

Whilst a lot of attention has been paid to the internal rigour of the set of entities and the inferences that are made from them (1), the mapping to what that represents (2) has often been left as implicit or incompletely described – sometimes only indicated by the labels given to its parts. The result is a model that vaguely relates to its target, suggesting its properties analogically. There is not well-defined way that a model is to be applied to anything observed, but a new map is invented each time it is used to think about a particular case. I call this way of modelling “Suggestivism”, because the model “suggests” things about what is being modelled.

This is partly a recapitulation of Popper’s critique of vague theories in his book “The Poverty of Historicism” (1957). He characterised such theories as “irrefutable”, because whatever the facts, these theories could be made to fit them. Irrefutability is an indicator of a lack of precise mapping to reality – such vagueness makes refutation very hard. However, it is only an indicator; there may be other reasons than vagueness for it not being possible to test a theory – it is their disconnection from well-defined empirical reference that is the issue here.

Some might go as far as suggesting that any model or theory that is not refutable is “unscientific”, but this goes too far, implying a very restricted definition of what ‘science’ is. We need analogies to think about what we are doing and to gain insight into what we are studying, e.g. (Hartman 1997) – for humans they are unavoidable, ‘baked’ into the way language works (Lakoff 1987). A model might make a set of ideas clear and help map out the consequences of a set of assumptions/structures/processes. Many of these suggestivist models relate to a set of ideas and it is the ideas that relate to what is observed (albeit informally) (Edmonds 2001). However, such models do not capture anything reliable about what they refer to, and in that sense are not part of the set of the established statements and theories that is at the core of science  (Arnold 2014).

The dangers of suggestivist modelling

As above, there are valid uses of abstract or theoretical modelling where this is explicitly acknowledged and where no conclusions about observed phenomena are made. So what are the dangers of suggestivist modelling – why am I making such a fuss about it?

Firstly, that people often seem to confuse a model as an analogy – a way of thinking about stuff – and a model that tells us reliably about what we are studying. Thus they give undue weight to the analyses of abstract models that are, in fact, just thought experiments. Making models is a very intimate way of theorising – one spends an extended period of time interacting with one’s model: developing, checking, analysing etc. The result is a particularly strong version of “Kuhnian Spectacles” (Kuhn 1962) causing us to see the world though our model for weeks after. Under this strong influence it is natural to confuse what we can reliably infer about the world and how we are currently perceiving/thinking about it. Good scientists should then pause and wait for this effect to wear off so that they can effectively critique what they have done, its limitations and what its implications are. However, often in the rush to get their work out, modellers often do not do this, resulting in a sloppy set of suggestive interpretations of their modelling.

Secondly, empirical modelling is hard. It is far easier (and, frankly, more fun) to play with non-empirical models. A scientific culture that treats suggestivist modelling as substantial progress and significantly rewards modellers that do it, will effectively divert a lot of modelling effort in this direction. Chattoe-Brown (2018) displayed evidence of this in his survey of opinion dynamics models – abstract, suggestivist modelling got far more reward (in terms of citations) than those that tried to relate their model to empirical data in a direct manner. Abstract modelling has a role in science, but if it is easier and more rewarding then the field will become unbalanced. It may give the impression of progress but not deliver on this impression. In a more mature science, researchers working on measurement methods (steps from observation to models) and collecting good data are as important as the theorists (Moss 1998).

Thirdly, it is hard to judge suggestivist models. Given their connection to the modelling target is vague there cannot be any decisive test of its success. Good modellers should declare the exact purpose of their model, e.g. that is analogical or merely exploring the consequences of theory (Edmonds et al. 2019), but then accept the consequences of this choice – namely, that it excludes  making conclusions about the observed world. If it is for a theoretical exploration then the comprehensiveness of the exploration, the scope of the exploration and the applicability of the model can be judged, but if the model is analogical or illustrative then this is harder. Whilst one model may suggest X, another may suggest the opposite. It is quite easy to fix a model to get the outcomes one wants. Clearly, if a model makes startling suggestions – illustrating totally new ideas or making a counter-example to widely held assumptions – then this helps science by widening the pool of theories or hypotheses that are considered. However most suggestivist modelling does not do this.

Fourthly, their sheer flexibility of as to application causes problems – if one works hard enough one can invent mappings to a wide range of cases, the limits are only those of our imagination. In effect, having a vague mapping from model to what it models adds in huge flexibility in a similar way to having a large number of free (non-empirical) parameters. This flexibility gives an impression of generality, and many desire simple and general models for complex phenomena. However, this is illusory because a different mapping is needed for each case, to make it apply. Given the above (1)+(2) definition of a model this means that, in fact, it is a different model for each case – what a model refers to, is part of the model. The same flexibility makes such models impossible to refute, since one can just adjust the mapping to save them. The apparent generality and lack of refutation means that such models hang around in the literature, due to their surface attractiveness.

Finally, these kinds of model are hugely influential beyond the community of modellers to the wider public including policy actors. Narratives that start in abstract models make their way out and can be very influential (Vranckx 1999). Despite the lack of rigorous mapping from model to reality, suggestivist models look impressive, look scientific. For example, very abstract models from the Neo-Classical ‘Chicago School’ of economists supported narratives about the optimal efficiency of markets, leading to a reluctance to regulate them (Krugman 2009). A lack of regulation seemed to be one of the factors behind the 2007/8 economic crash (Baily et al 2008). Modellers may understand that other modellers get over-enthusiastic and over-interpret their models, but others may not. It is the duty of modellers to give an accurate impression of the reliability of any modelling results and not to over-hype them.

How to recognise a suggestivist model

It can be hard to detangle how empirically vague a model is, because many descriptions about modelling work do not focus on making the mapping to what it represents precise. The reasons for this are various, for example: the modeller might be conflating reality and what is in the model in their minds, the researcher is new to modelling and has not really decided what the purpose of their model is, the modeller might be over-keen to establish the importance of their work and so is hyping the motivation and conclusions, they might simply not got around to thinking enough about the relationship between their model and what it might represent, or they might not have bothered to make the relationship explicit in their description. Whatever the reason the reader of any description of such work is often left with an archaeological problem: trying to unearth what the relationship might be, based on indirect clues only. The only way to know for certain is to take a case one knows about and try and apply the model to it, but this is a time consuming process and relies upon having a case with suitable data available. However, there are some indicators, albeit fallible ones, including the following.

  • A relatively simple model is interpreted as explaining a wide range of observed, complex phenomena
  • No data from an observed case study is compared to data from the model (often no data is brought in at all, merely abstract observations) – despite this, conclusions about some observed phenomena are made
  • The purpose of the model is not explicitly declared
  • The language of the paper seems to conflate talking about the model with what is being modelled
  • In the paper there is sudden abstraction ‘jump’ between the motivation and the description of the model and back again to the interpretation of the results in terms of that motivation. The abstraction jumps involved are large and justified by some a priori theory or modelling precedents rather than evidence.

How to avoid suggestivist modelling

How to avoid the dangers of suggestivist modelling should be clear from the above discussion, but I will make them explicit here.

  • Be clear about the model purpose – that is what kind of thing the model aims to achieve which indicates how it should be judged by others (Edmonds et al 2019)
  • Do not make any conclusions about the real world if you have not related the model to any data
  • Do not make any policy conclusions – things that might affect other people’s lives – without at least some independent validation of the model outcomes
  • Document how a model relates (or should relate) to data, the nature of that data and maybe even the process whereby that data should be obtained (Achter et al 2019)
  • Be explicit as possible about what kinds of phenomena the model applies to – the limits of its scope
  • Keep the language about the model and what is being modelled distinct – for any statement it should be clear whether it is talking about the model or what it models (Edmonds 2020)
  • Highlight any bold assumptions in the specification of the model or describe what empirical foundation there is for them – be honest about these

Conclusion

Models can serve many different purposes (Epstein 2008). This is fine as long as the purpose of models are always made clear, and model results are not interpreted further than their established purpose allows. Research which gives the impression that analogical, illustrative or theoretical modelling can tell us anything reliable about observed complex phenomena is not only sloppy science, but can have a deleterious impact – giving an impression of progress whilst diverting attention from empirically reliable work. Like a bad investment: if it looks too good and too easy to be true, it is probably isn’t.

Notes

[1] We often use the word “model” in a lazy way to indicate (1) rather than (1)+(2) in this definition, but a set of entities without any meaning or mapping to anything else is not a model, as it does not represent anything. For example, a random set of equations or program instructions does not make a model.

Acknowledgements

Bruce Edmonds is supported as part of the ESRC-funded, UK part of the “ToRealSim” project, grant number ES/S015159/1.

References

Achter, S., Borit, M., Chattoe-Brown, E., Palaretti, C. & Siebers, P.-O. (2019) Cherchez Le RAT: A Proposed Plan for Augmenting Rigour and Transparency of Data Use in ABM. Review of Artificial Societies and Social Simulation, 4th June 2019. https://rofasss.org/2019/06/04/rat/

Arnold, E. (2014). What’s wrong with social simulations?. The Monist, 97(3), 359-377. DOI:10.5840/monist201497323

Baily, M. N., Litan, R. E., & Johnson, M. S. (2008). The origins of the financial crisis. Fixing Finance Series – Paper 3, The Brookings Institution. https://www.brookings.edu/wp-content/uploads/2016/06/11_origins_crisis_baily_litan.pdf

Chattoe-Brown, E. (2018) What is the earliest example of a social science simulation (that is nonetheless arguably an ABM) and shows real and simulated data in the same figure or table? Review of Artificial Societies and Social Simulation, 11th June 2018. https://rofasss.org/2018/06/11/ecb/

Edmonds, B. (2001) The Use of Models – making MABS actually work. In. Moss, S. and Davidsson, P. (eds.), Multi Agent Based Simulation, Lecture Notes in Artificial Intelligence, 1979:15-32. http://cfpm.org/cpmrep74.html

Edmonds, B. (2020) Basic Modelling Hygiene – keep descriptions about models and what they model clearly distinct. Review of Artificial Societies and Social Simulation, 22nd May 2020. https://rofasss.org/2020/05/22/modelling-hygiene/

Edmonds, B., le Page, C., Bithell, M., Chattoe-Brown, E., Grimm, V., Meyer, R., Montañola-Sales, C., Ormerod, P., Root H. & Squazzoni. F. (2019) Different Modelling Purposes. Journal of Artificial Societies and Social Simulation, 22(3):6. http://jasss.soc.surrey.ac.uk/22/3/6.html.

Epstein, J. M. (2008). Why model?. Journal of artificial societies and social simulation, 11(4), 12. https://jasss.soc.surrey.ac.uk/11/4/12.html

Hartmann, S. (1997): Modelling and the Aims of Science. In: Weingartner, P. et al (ed.) : The Role of Pragmatics in Contemporary Philosophy: Contributions of the Austrian Ludwig Wittgenstein Society. Vol. 5. Wien und Kirchberg: Digi-Buch. pp. 380-385. https://epub.ub.uni-muenchen.de/25393/

Krugman, P. (2009) How Did Economists Get It So Wrong? New York Times, Sept. 2nd 2009. https://www.nytimes.com/2009/09/06/magazine/06Economic-t.html

Kuhn, T.S. (1962) The Structure of Scientific Revolutions. Chicago: University of Chicago Press.

Lakoff, G. (1987) Women, fire, and dangerous things. University of Chicago Press, Chicago.

Morgan, M. S., & Morrison, M. (1999). Models as mediators. Cambridge: Cambridge University Press.

Moss, S. (1998) Social Simulation Models and Reality: Three Approaches. Centre for Policy Modelling  Discussion Paper: CPM-98-35, http://cfpm.org/cpmrep35.html

Popper, K. (1957). The poverty of historicism. Routledge.

Vranckx, An. (1999) Science, Fiction & the Appeal of Complexity. In Aerts, Diederik, Serge Gutwirth, Sonja Smets, and Luk Van Langehove, (eds.) Science, Technology, and Social Change: The Orange Book of “Einstein Meets Magritte.” Brussels: Vrije Universiteit Brussel; Dordrecht: Kluwer., pp. 283–301.

Wartofsky, M. W. (1979). The model muddle: Proposals for an immodest realism. In Models (pp. 1-11). Springer, Dordrecht.

Zeigler, B. P. (1976). Theory of Modeling and Simulation. Wiley Interscience, New York.


Edmonds, B. (2022) The Poverty of Suggestivism – the dangers of "suggests that" modelling. Review of Artificial Societies and Social Simulation, 28th Feb 2022. https://rofasss.org/2022/02/28/poverty-suggestivism


 

The Systematic Comparison of Agent-Based Policy Models – It’s time we got our act together!

By Mike Bithell and Bruce Edmonds

Model Intercomparison

The recent Covid crisis has led to a surge of new model development and a renewed interest in the use of models as policy tools. While this is in some senses welcome, the sudden appearance of many new models presents a problem in terms of their assessment, the appropriateness of their application and reconciling any differences in outcome. Even if they appear similar, their underlying assumptions may differ, their initial data might not be the same, policy options may be applied in different ways, stochastic effects explored to a varying extent, and model outputs presented in any number of different forms. As a result, it can be unclear what aspects of variations in output between models are results of mechanistic, parameter or data differences. Any comparison between models is made tricky by differences in experimental design and selection of output measures.

If we wish to do better, we suggest that a more formal approach to making comparisons between models would be helpful. However, it appears that this is not commonly undertaken most fields in a systematic and persistent way, except for the field of climate change, and closely related fields such as pollution transport or economic impact modelling (although efforts are underway to extend such systematic comparison to ecosystem models –  Wei et al., 2014, Tittensor et al., 2018⁠). Examining the way in which this is done for climate models may therefore prove instructive.

Model Intercomparison Projects (MIP) in the Climate Community

Formal intercomparison of atmospheric models goes back at least to 1989 (Gates et al., 1999)⁠ with the first atmospheric model inter-comparison project (AMIP), initiated by the World Climate Research Programme. By 1999 this had contributions from all significant atmospheric modelling groups, providing standardised time-series of over 30 model variables for one particular historical decade of simulation, with a standard experimental setup. Comparisons of model mean values with available data helped to reveal overall model strengths and weaknesses: no single model was best at simulation of all aspects of the atmosphere, with accuracy varying greatly between simulations. The model outputs also formed a reference base for further inter-comparison experiments including targets for model improvement and reduction of systematic errors, as well as a starting point for improved experimental design, software and data management standards and protocols for communication and model intercomparison. This led to AMIPII and, subsequently, to a series of Climate model inter-comparison projects (CMIP) beginning with CMIP I in 1996. The latest iteration (CMIP 6) is a collection of 23 separate model intercomparison experiments covering atmosphere, ocean, land surface, geo-engineering, and the paleoclimate. This collection is aimed at the upcoming 2021 IPCC process (AR6). Participating projects go through an endorsement process for inclusion, (a process agreed with modelling groups), based on 10 criteria designed to ensure some degree of coherence between the various models – a further 18 MIPS are also listed as currently active (https://www.wcrp-climate.org/wgcm-cmip/wgcm-cmip6). Groups contribute to a central set of common experiments covering the period 1850 to the near-present. An overview of the whole process can be found in (Eyring et al., 2016).

The current structure includes a set of three overarching questions covering the dynamics of the earth system, model systematic biases and understanding possible future change under uncertainty. Individual MIPS may build on this to address one or more of a set of 7 “grand science challenges” associated with the climate. Modelling groups agree to provide outputs in a standard form, obtained from a specified set of experiments under the same design, and to provide standardised documentation to go with their models. Originally (up to CMIP 5), outputs were then added to a central public repository for further analysis, however the output grew so large under CMIP6 that now the data is held dispersed over repositories maintained by separate groups.

Other Examples

Two further more recent examples of collective model  development may also be helpful to consider.

Firstly, an informal network collating models across more than 50 research groups has already been generated as a result of the COVID crisis –  the Covid Forecast Hub (https://covid19forecasthub.org). This is run by a small number of research groups collaborating with the US Centre for Disease Control and is strongly focussed on the epidemiology. Participants are encouraged to submit weekly forecasts, and these are integrated into a data repository and can be vizualized on the website – viewers can look at forward projections, along with associated confidence intervals and model evaluation scores, including those for an ensemble of all models. The focus on forecasts in this case arises out of the strong policy drivers for the current crisis, but the main point is that it is possible to immediately view measures of model performance and to compare the different model types: one clear message that rapidly becomes apparent is that many of the forward projections have 95% (and at some times, even 50%) confidence intervals for incident deaths that more than span the full range of the past historic data. The benefit of comparing many different models in this case is apparent, as many of the historic single-model projections diverge strongly from the data (and the models most in error are not consistently the same ones over time), although the ensemble mean tends to be better.

As a second example, one could consider the Psychological Science Accelerator (PSA: Moshontz et al 2018, https://psysciacc.org/). This is a collaborative network set up with the aim of addressing the “replication crisis” in psychology: many previously published results in psychology have proved problematic to replicate as a result of small or non-representative sampling or use of experimental designs that do not generalize well or have not been used consistently either within or across studies. The PSA seeks to ensure accumulation of reliable and generalizable evidence in psychological science, based on principles of inclusion, decentralization, openness, transparency and rigour. The existence of this network has, for example, enabled the reinvestigation of previous  experiments but with much larger and less nationally biased samples (e.g. Jones et al 2021).

The Benefits of the Intercomparison Exercises and Collaborative Model Building

More specifically, long-term intercomparison projects help to do the following.

  • Build on past effort. Rather than modellers re-inventing the wheel (or building a new framework) with each new model project, libraries of well-tested and documented models, with data archives, including code and experimental design, would allow researchers to more efficiently work on new problems, building on previous coding effort
  • Aid replication. Focussed long term intercomparison projects centred on model results with consistent standardised data formats would allow new versions of code to be quickly tested against historical archives to check whether expected results could be recovered and where differences might arise, particularly if different modelling languages were being used
  • Help to formalize. While informal code archives can help to illustrate the methods or theoretical foundations of a model, intercomparison projects help to understand which kinds of formal model might be good for particular applications, and which can be expected to produce helpful results for given desired output measures
  • Build credibility. A continuously updated set of model implementations and assessment of their areas of competence and lack thereof (as compared with available datasets) would help to demonstrate the usefulness (or otherwise) of ABM as a way to represent social systems
  • Influence Policy (where appropriate). Formal international policy organisations such as the IPCC or the more recently formed IPBES are effective partly through an underpinning of well tested and consistently updated models. As yet it is difficult to see whether such a body would be appropriate or effective for social systems, as we lack the background of demonstrable accumulated and well tested model results.

Lessons for ABM?

What might we be able to learn from the above, if we attempted to use a similar process to compare ABM policy models?

In the first place, the projects started small and grew over time: it would not be necessary, for example, to cover all possible ABM applications at the outset. On the other hand, the latest CMIP iterations include a wide range of different types of model covering many different aspects of the earth system, so that the breadth of possible model types need not be seen as a barrier.

Secondly, the climate inter-comparison project has been persistent for some 30 years – over this time many models have come and gone, but the history of inter-comparisons allows for an overview of how well these models have performed over time – data from the original AMIP I models is still available on request, supporting assessments concerning  long-term model improvement.

Thirdly, although climate models are complex – implementing a variety of different mechanisms in different ways – they can still be compared by use of standardised outputs, and at least some (although not necessarily all) have been capable of direct comparison with empirical data.

Finally, an agreed experimental design and public archive for documentation and output that is stable over time is needed; this needs to be done via a collective agreement among the modelling groups involved so as to ensure a long-term buy-in from the community as a whole, so that there is a consistent basis for long-term model development, building on past experience.

The need for aligning or reproducing ABMs has long been recognised within the community (Axtell et al. 1996; Edmonds & Hales 2003), but on a one-one basis for verifying the specification of models against their implementation, although (Hales et al. 2003) discusses a range of possibilities. However, this is far from a situation where many different models of basically the same phenomena are systematically compared – this would be a larger scale collaboration lasting over a longer time span.

The community has already established a standardised form of documentation in the ODD protocol. Sharing of model code is also becoming routine, and can be easily achieved through COMSES, Github or similar. The sharing of data in a long-term archive may require more investigation. As a starting project COVID-19 provides an ideal opportunity for setting up such a model inter-comparison project – multiple groups already have running examples, and a shared set of outputs and experiments should be straightforward to agree on. This would potentially form a basis for forward looking experiments designed to assist with possible future pandemic problems, and a basis on which to build further features into the existing disease-focussed modelling, such as the effects of economic, social and psychological issues.

Additional Challenges for ABMs of Social Phenomena

Nobody supposes that modelling social phenomena is going to have the same set of challenges that climate change models face. Some of the differences include:

  • The availability of good data. Social science is bedevilled by a paucity of the right kind of data. Although an increasing amount of relevant data is being produced, there are commercial, ethical and data protection barriers to accessing it and the data rarely concerns the same set of actors or events.
  • The understanding of micro-level behaviour. Whilst the micro-level understanding of our atmosphere is very well established, those of the behaviour of the most important actors (humans) is not. However, it may be that better data might partially substitute for a generic behavioural model of decision-making.
  • Agreement upon the goals of modelling. Although there will always be considerable variation in terms of what is wanted from a model of any particular social phenomena, a common core of agreed objectives will help focus any comparison and give confidence via ensembles of projections. Although the MIPs and Covid Forecast Hub are focussed on prediction, it may be that empirical explanation may be more important in other areas.
  • The available resources. ABM projects tend to be add-ons to larger endeavours and based around short-term grant funding. The funding for big ABM projects is yet to be established, not having the equivalent of weather forecasting to piggy-back on.
  • Persistence of modelling teams/projects. ABM tends to be quite short-term with each project developing a new model for a new project. This has made it hard to keep good modelling teams together.
  • Deep uncertainty. Whilst the set of possible factors and processes involved in a climate change model are well established, which social mechanisms need to be involved in any model of any particular social phenomena is unknown. For this reason, there is deep disagreement about the assumptions to be made in such models, as well as sharp divergence in outcome due to changes brought about by a particular mechanism but not included in a model. Whilst uncertainty in known mechanisms can be quantified, assessing the impact of those due to such deep uncertainty is much harder.
  • The sensitivity of the political context. Even in the case of Climate Change, where the assumptions made are relatively well understood and done on objective bases, the modelling exercise and its outcomes can be politically contested. In other areas, where the representation of people’s behaviour might be key to model outcomes, this will need even more care (Adoha & Edmonds 2017).

However, some of these problems were solved in the case of Climate Change as a result of the CMIP exercises and the reports they ultimately resulted in. Over time the development of the models also allowed for a broadening and updating of modelling goals, starting from a relatively narrow initial set of experiments. Ensuring the persistence of individual modelling teams is easier in the context of an internationally recognised comparison project, because resources may be easier to obtain, and there is a consistent central focus. The modelling projects became longer-term as individual researchers could establish a career doing just climate change modelling and importance of the work increasingly recognised. An ABM modelling comparison project might help solve some of these problems as the importance of its work is established.

Towards an Initial Proposal

The topic chosen for this project should be something where there: (a) is enough public interest to justify the effort, (b) there are a number of models with a similar purpose in mind being developed.  At the current stage, this suggests dynamic models of COVID spread, but there are other possibilities, including: transport models (where people go and who they meet) or criminological models (where and when crimes happen).

Whichever ensemble of models is focussed upon, these models should be compared on a core of standard, with the same:

  • Start and end dates (but not necessarily the same temporal granularity)
  • Covering the same set of regions or cases
  • Using the same population data (though possibly enhanced with extra data and maybe scaled population sizes)
  • With the same initial conditions in terms of the population
  • Outputting a core of agreed measures (but maybe others as well)
  • Checked against their agreement against a core set of cases, with agreed data sets
  • Reported on in a standard format (though with a discussion section for further/other observations)
  • well documented and with code that is open access
  • Run a minimum of times with different random seeds

Any modeller/team that had a suitable model and was willing to adhere to the rules would be welcome to participate (commercial, government or academic) and these teams would collectively decide the rules, development and write any reports on the comparisons. Other interested stakeholder groups could be involved including professional/academic associations, NGOs and government departments but in a consultative role providing wider critique – it is important that the terms and reports from the exercise be independent or any particular interest or authority.

Conclusion

We call upon those who think ABMs have the potential to usefully inform policy decisions to work together, in order that the transparency and rigour of our modelling matches our ambition. Whilst model comparison exercises of the kind described are important for any simulation work, particular care needs to be taken when the outcomes can affect people’s lives.

References

Aodha, L. & Edmonds, B. (2017) Some pitfalls to beware when applying models to issues of policy relevance. In Edmonds, B. & Meyer, R. (eds.) Simulating Social Complexity – a handbook, 2nd edition. Springer, 801-822. (A version is at http://cfpm.org/discussionpapers/236)

Axtell, R., Axelrod, R., Epstein, J. M., & Cohen, M. D. (1996). Aligning simulation models: A case study and results. Computational & Mathematical Organization Theory, 1(2), 123-141. https://link.springer.com/article/10.1007%2FBF01299065

Edmonds, B., & Hales, D. (2003). Replication, replication and replication: Some hard lessons from model alignment. Journal of Artificial Societies and Social Simulation, 6(4), 11. http://jasss.soc.surrey.ac.uk/6/4/11.html

Eyring, V., Bony, S., Meehl, G. A., Senior, C. A., Stevens, B., Stouffer, R. J., & Taylor, K. E. (2016). Overview of the Coupled Model Intercomparison Project Phase 6 (CMIP6) experimental design and organization. Geoscientific Model Development, 9(5), 1937–1958. https://doi.org/10.5194/gmd-9-1937-2016

Gates, W. L., Boyle, J. S., Covey, C., Dease, C. G., Doutriaux, C. M., Drach, R. S., Fiorino, M., Gleckler, P. J., Hnilo, J. J., Marlais, S. M., Phillips, T. J., Potter, G. L., Santer, B. D., Sperber, K. R., Taylor, K. E., & Williams, D. N. (1999). An Overview of the Results of the Atmospheric Model Intercomparison Project (AMIP I). In Bulletin of the American Meteorological Society (Vol. 80, Issue 1, pp. 29–55). American Meteorological Society. https://doi.org/10.1175/1520-0477(1999)080<0029:AOOTRO>2.0.CO;2

Hales, D., Rouchier, J., & Edmonds, B. (2003). Model-to-model analysis. Journal of Artificial Societies and Social Simulation, 6(4), 5. http://jasss.soc.surrey.ac.uk/6/4/5.html

Jones, B.C., DeBruine, L.M., Flake, J.K. et al. To which world regions does the valence–dominance model of social perception apply?. Nat Hum Behav 5, 159–169 (2021). https://doi.org/10.1038/s41562-020-01007-2

Moshontz, H. + 85 others (2018) The Psychological Science Accelerator: Advancing Psychology Through a Distributed Collaborative Network ,  1(4) 501-515. https://doi.org/10.1177/2515245918797607

Tittensor, D. P., Eddy, T. D., Lotze, H. K., Galbraith, E. D., Cheung, W., Barange, M., Blanchard, J. L., Bopp, L., Bryndum-Buchholz, A., Büchner, M., Bulman, C., Carozza, D. A., Christensen, V., Coll, M., Dunne, J. P., Fernandes, J. A., Fulton, E. A., Hobday, A. J., Huber, V., … Walker, N. D. (2018). A protocol for the intercomparison of marine fishery and ecosystem models: Fish-MIP v1.0. Geoscientific Model Development, 11(4), 1421–1442. https://doi.org/10.5194/gmd-11-1421-2018

Wei, Y., Liu, S., Huntzinger, D. N., Michalak, A. M., Viovy, N., Post, W. M., Schwalm, C. R., Schaefer, K., Jacobson, A. R., Lu, C., Tian, H., Ricciuto, D. M., Cook, R. B., Mao, J., & Shi, X. (2014). The north american carbon program multi-scale synthesis and terrestrial model intercomparison project – Part 2: Environmental driver data. Geoscientific Model Development, 7(6), 2875–2893. https://doi.org/10.5194/gmd-7-2875-2014


Bithell, M. and Edmonds, B. (2020) The Systematic Comparison of Agent-Based Policy Models - It’s time we got our act together!. Review of Artificial Societies and Social Simulation, 11th May 2021. https://rofasss.org/2021/05/11/SystComp/


 

Cherchez Le RAT: A Proposed Plan for Augmenting Rigour and Transparency of Data Use in ABM

By Sebastian Achter, Melania Borit, Edmund Chattoe-Brown, Christiane Palaretti and Peer-Olaf Siebers

The initiative presented below arose from a Lorentz Center workshop on Integrating Qualitative and Quantitative Evidence using Social Simulation (8-12 April 2019, Leiden, the Netherlands). At the beginning of this workshop, the attenders divided themselves into teams aiming to work on specific challenges within the broad domain of the workshop topic. Our team took up the challenge of looking at “Rigour, Transparency, and Reuse”. The aim that emerged from our initial discussions was to create a framework for augmenting rigour and transparency (RAT) of data use in ABM when both designing, analysing and publishing such models.

One element of the framework that the group worked on was a roadmap of the modelling process in ABM, with particular reference to the use of different kinds of data. This roadmap was used to generate the second element of the framework: A protocol consisting of a set of questions, which, if answered by the modeller, would ensure that the published model was as rigorous and transparent in terms of data use, as it needs to be in order for the reader to understand and reproduce it.

The group (which had diverse modelling approaches and spanned a number of disciplines) recognised the challenges of this approach and much of the week was spent examining cases and defining terms so that the approach did not assume one particular kind of theory, one particular aim of modelling, and so on. To this end, we intend that the framework should be thoroughly tested against real research to ensure its general applicability and ease of use.

The team was also very keen not to “reinvent the wheel”, but to try develop the RAT approach (in connection with data use) to augment and “join up” existing protocols or documentation standards for specific parts of the modelling process. For example, the ODD protocol (Grimm et al. 2010) and its variants are generally accepted as the established way of documenting ABM but do not request rigorous documentation/justification of the data used for the modelling process.

The plan to move forward with the development of the framework is organised around three journal articles and associated dissemination activities:

  • A literature review of best (data use) documentation and practice in other disciplines and research methods (e.g. PRISMA – Preferred Reporting Items for Systematic Reviews and Meta-Analyses)
  • A literature review of available documentation tools in ABM (e.g. ODD and its variants, DOE, the “Info” pane of NetLogo, EABSS)
  • An initial statement of the goals of RAT, the roadmap, the protocol and the process of testing these resources for usability and effectiveness
  • A presentation, poster, and round table at SSC 2019 (Mainz)

We would appreciate suggestions for items that should be included in the literature reviews, “beta testers” and critical readers for the roadmap and protocol (from as many disciplines and modelling approaches as possible), reactions (whether positive or negative) to the initiative itself (including joining it!) and participation in the various activities we plan at Mainz. If you are interested in any of these roles, please email Melania Borit (melania.borit@uit.no).

Acknowledgements

Chattoe-Brown’s contribution to this research is funded by the project “Towards Realistic Computational Models Of Social Influence Dynamics” (ES/S015159/1) funded by ESRC via ORA Round 5 (PI: Professor Bruce Edmonds, Centre for Policy Modelling, Manchester Metropolitan University: https://gtr.ukri.org/projects?ref=ES%2FS015159%2F1).

References

Grimm, V., Berger, U., DeAngelis, D. L., Polhill, J. G., Giske, J. and Railsback, S. F. (2010) ‘The ODD Protocol: A Review and First Update’, Ecological Modelling, 221(23):2760–2768. doi:10.1016/j.ecolmodel.2010.08.019


Achter, S., Borit, M., Chattoe-Brown, E., Palaretti, C. and Siebers, P.-O.(2019) Cherchez Le RAT: A Proposed Plan for Augmenting Rigour and Transparency of Data Use in ABM. Review of Artificial Societies and Social Simulation, 4th June 2019. https://rofasss.org/2019/06/04/rat/