Tag Archives: Computational models

Nigel Gilbert

March 4, 2025 thesubmissionauthor Leave a comment

By Corinna Elsenbroich & Petra Ahrweiler

The first piece on winners of the European Social Simulation Association’s Rosaria Conte Outstanding Contribution Award for Social Simulation.

“Gilbert, a former sociologist of science, has been one of the chief links in Britain between computer scientists and sociologists of science” [1, p. 294]

Nigel has always been and still is a sociologist – not only of science, but also of technology, innovation, methods and many other subfields of sociology with important contributions in theory, empirical research and sociological methods.

He has pioneered a range of sociological areas such as Sociology of Scientific Knowledge, Secondary Analysis of Government Datasets, Access to Social Security Information, Social Simulation, and Complexity Methods of Policy Evaluation.

Collins is right, however, that Nigel is one of the chief links between sociologists and computer scientists in the UK and beyond. This earned him to be elected as the first practising social scientist elected as a Fellow of the Royal Academy of Engineering (1999). As the principal founding father of agent-based modelling as a method for the social sciences in Europe, he initiated, promoted and institutionalised a completely novel way of doing social sciences through the Centre for Research in Social Simulation (CRESS) at the University of Surrey, the Journal of Artificial Societies and Social Simulation (JASSS), founded Sociological Research Online (1993) and Social Research Update. Nigel has 100s of publications on all aspects of social simulation and seminal books like: Simulating societies: the computer simulation of social phenomena (Gilbert & Doran 1994), Artificial Societies: The Computer Simulation of Social Phenomena (Gilbert & Conte 1995), Simulation for the Social Scientist (Gilbert &Troitzsch 2005), and Agent-based Models (Gilbert 2019). His entrepreneurial spirit and acumen resulted in over 25 large project grants (across the UK and Europe), often in close collaboration with policy and decision makers to ensure real life impact, a simulation platform on innovation networks called SKIN, and a spin off company CECAN Ltd, training practitioners in complexity methods and bringing their use to policy evaluation projects.

Nigel is a properly interdisciplinary person, turning to the sociology of scientific knowledge in his PhD under Michael Mulkay after graduating in Engineering from Cambridge’s Emmanuel College. He joined the Sociology Department at the University of Surrey in 1976 where he became professor of sociology in 1991. Nigel was appointed Commander of the Order of the British Empire (CBE) in 2016 for contributions to engineering and social sciences.

He was the second president of the European Social Simulation Association ESSA, the originator of the SIMSOC mailing list, launched and edited the Journal of Artificial Societies and Social Simulation from 1998-2014 and he was the first holder of the Rosaria Conte Outstanding Contribution Award for Social Simulation in 2016, an unanimous decision by the ESSA Management Committee.

Despite all of this, all these achievements and successes, Nigel is the most approachable, humble and kindest person you will ever meet. In any peril he is the person that will bring you a step forward when you need a helping hand. On asking him, after getting a CBE etc. what is the recognition that makes him most happy, he said, with the unique Nigel Gilbert twinkle in his eye, “my Rosaria Conte Award”.

References

Collins, H. (1995). Science studies and machine intelligence. In Handbook of Science and Technology Studies, Revised Edition (pp. 286-301). SAGE Publications, Inc., https://doi.org/10.4135/9781412990127

Gilbert, N., & Doran, R. (Eds.). (1994). Simulating societies: the computer simulation of social phenomena. Routledge.

Gilbert, N. & Conte, R. (1995) Artificial Societies: the computer simulation of social life. Routeledge. https://library.oapen.org/handle/20.500.12657/24305

Gilbert, N. (2019). Agent-based models. Sage Publications.

Gilbert, N., & Troitzsch, K. (2005). Simulation for the social scientist. Open University Press; 2^nd edition.

Elsenbroich, C. & Ahrweiler, P. (2025) Nigel Gilbert. Review of Artificial Societies and Social Simulation, 3 Mar 2025. https://rofasss.org/2025/04/03/nigel-gilbert

© The authors under the Creative Commons’ Attribution-NoDerivs (CC BY-ND) Licence (v4.0)

Content

The inevitable “layering” of models to extend the reach of our understanding

February 9, 2023 thesubmissionauthor Leave a comment

By Bruce Edmonds

“Just as physical tools and machines extend our physical abilities, models extend our mental abilities, enabling us to understand and control systems beyond our direct intellectual reach” (Calder & al. 2018)

Motivation

There is a modelling norm that one should be able to completely understand one’s own model. Whilst acknowledging there is a trade-off between a model’s representational adequacy and its simplicity of formulation, this tradition assumes there will be a “sweet spot” where the model is just tractable but also good enough to be usefully informative about the target of modelling – in the words attributed to Einstein, “Everything should be made as simple as possible, but no simpler”¹. But what do we do about all the phenomena where to get an adequate model² one has to settle for a complex one (where by “complex” I mean a model that we do not completely understand)? Despite the tradition in Physics to the contrary, it would be an incredibly strong assumption that there are no such phenomena, i.e. that an adequate simple model is always possible (Edmonds 2013).

There are three options in these difficult cases.

Do not model the phenomena at all until we can find an adequate model we can fully understand. Given the complexity of much around us this would mean to not model these for the foreseeable future and maybe never.
Accept inadequate simpler models and simply hope that these are somehow approximately right³. This option would allow us to get answers but with no idea whether they were at all reliable. There are many cases of overly simplistic models leading policy astray (Adoha & Edmonds 2017; Thompson 2022), so this is dangerous if such models influence decisions with real consequences.
Use models that are good for our purpose but that we only partially understand. This is the option examined in this paper.

When the purpose is empirical the last option is equivalent to preferring empirical grounding over model simplicity (Edmonds & Moss 2005).

Partially Understood Models

In practice this argument has already been won – we do not completely understand many computer simulations that we use and rely on. For example, due to the chaotic nature of the dynamics of the weather, forecasting models are run multiple times with slightly randomised inputs and the “ensemble” of forecasts inspected to get an idea of the range of different outcomes that could result (some of which might be qualitatively different from the others)⁴. Working out the outcomes in each case requires the computational tracking of a huge numbers of entities in a way that is far beyond what the human mind can do⁵. In fact, the whole of “Complexity Science” can be seen as different ways to get some understanding of systems for which there is no analytic solution⁶.

Of course, this raises the question of what is meant by “understand” a model, for this is not something that is formally defined. This could involve many things, including the following.

That the micro-level – the individual calculations or actions done by the model each time step – is understood. This is equivalent to understanding each line of the computer code.
That some of the macro-level outcomes that result from the computation of the whole model is understood in terms of partial theories or “rules of thumb”.
That all the relevant macro-level outcomes can be determined to a high degree of accuracy without simulating the model (e.g. by a mathematical model).

Clearly, level (1) is necessary for most modelling purposes in order to know the model is behaving as intended. The specification of this micro-level is usually how such models are made, so if this differs from what was intended then this would be a bug. Thus this level would be expected of most models⁷. However, this does not necessarily mean that this is at the finest level of detail possible – for example, we usually do not bother about how random number generators work, but simply rely on its operation, but in this case we have very good level (3) of understanding for these sub-routines.

At the other extreme, a level (3) understanding is quite rare outside the realm of physics. In a sense, having this level of understanding makes the model redundant, so would probably not be the case for most working models (those used regularly)⁸. As discussed above, there will be many kinds of phenomena for which this level of understanding is not feasible.

Clearly, what many modelers find useful is a combination of levels (1) & (2) – that is, the detailed, micro-level steps that the model takes are well understood and the outcomes understood well enough for the intended task. For example, when using a model to establish a complex explanation⁹ (of some observed pattern in data using certain mechanisms or structures) then one might understand the implementation of the candidate mechanisms and verify that the outcomes fit the target pattern for a range of parameters, but not completely understand the detail of the causation involved. There might well be some understanding, for example how robust this is to minor variations in the initial conditions or the working of the mechanisms involved (e.g. by adding some noise to the processes). A complete understanding might not be accessible but this does not stop an explanation being established (although a better understanding is an obvious goal for future research or avenue for critiques of the explanation).

Of course, any lack of a complete, formal understanding leaves some room for error. The argument here is not deriding the desirability of formal understanding, but is against prioritising that over model adequacy. Also the lack of a formal, level (3), understanding of a model does not mean we cannot take more pragmatic routes to checking it. For example: performing a series of well-designed simulation experiments that intend to potentially refute the stated conclusions, systematically comparing to other models, doing a thorough sensitivity analysis and independently reproducing models can help ensure their reliability. These can be compared with engineering methods – one may not have a proof that a certain bridge design is solid over all possible dynamics, but practical measures and partial modelling can ensure that any risk is so low as to be negligible. If we had to wait until bridge designs were proven beyond doubt, we would simply have to do without them.

Layering Models to Leverage some Understanding

As a modeller, if I do not understand something my instinct is to model it. This instinct does not change if what I do not understand is, itself, a model. The result is a model of the original model – a meta-model. This is, in fact, common practice. I may select certain statistics summarising the outcomes and put these on a graph; I might analyse the networks that have emerged during model runs; I may use maths to approximate or capture some aspect of the dynamics; I might cluster and visualise the outcomes using Machine Learning techniques; I might make a simpler version of the original and compare them. All of these might give me insights into the behaviour of the original model. Many of these are so normal we do not think of this as meta-modelling. Indeed, empirically-based models are already, in a sense, meta-models, since the data that they represent are themselves a kind of descriptive model of reality (gained via measurement processes).

This meta-modelling strategy can be iterated to produce meta-meta-models etc. resulting in “layers” of models, with each layer modelling some aspect of the one “below” until one reaches the data and then what the data measures. Each layer should be able to be compared and checked with the layer “below”, and analysed by the layer “above”.

An extended example of such layering was built during the SCID (Social Complexity of Immigration and Diversity) project¹⁰ and illustrated in Figure 1. In this a complicated simulation (Model 1) was built to incorporate some available data and what was known concerning the social and behavioural processes that lead people to bother to vote (or not). This simulation was used as a counter-example to show how assumptions about the chaining effect of interventions might be misplaced (Fieldhouse et al. 2016). A much simpler simulation was then built by theoretical physicists (Model 2), so that it produced the same selected outcomes over time and aa range of parameter values. This allowed us to show that some of the features in the original (such as dynamic networks) were essential to get the observed dynamics in it (Lafuerza et al. 2016a). This simpler model was in turn modelled by an even simpler model (Model 3) that was amenable to an analytic model (Model 4) that allowed us to obtain some results concerning the origin of a region of bistability in the dynamics (Lafuerza et al. 2016b).

Layering fig 1

Figure 1. The Layering of models that were developed in part of the SCID project

Although there are dangers in such layering – each layer could introduce a new weakness – there are also methodological advantages, including the following. (A) Each model in the chain (except model 4) is compared and checked against both the layer below and that above. Such multiple model comparisons are excellent for revealing hidden assumptions and unanticipated effects. (B) Whilst previously what might have happened was a “heroic” leap of abstraction from evidence and understanding straight to Model 3 or 4, here abstraction happens over a series of more modest steps, each of which is more amenable to checking and analysis. When you stage abstraction the introduced assumptions are more obvious and easier to analyse.

One can imagine such “layering” developing in many directions to leverage useful (but indirect) understanding, for example the following.

Using an AI algorithm to learn patterns in some data (e.g. medical data for disease diagnosis) but then modelling its working to obtain some human-accessible understanding of how it is doing it.
Using a machine learning model to automatically identify the different “phase spaces” in model results where qualitatively different model behaviour is exhibited, so one can then try to simplify the model within each phase.
Automatically identifying the processes and structures that are common to a given set of models to facilitate the construction of a more general, ‘umbrella’ model that approximates all the outcomes that would have resulted from the set, but within a narrower range of conditions.

As the quote at the top implies, we are used to settling for partial control of what machines do because it allows us to extend our physical abilities in useful ways. Each time we make their control more indirect, we need to check that this is safe and adequate for purpose. In the cars we drive there are ever more layers of electronic control between us and the physical reality it drives through which we adjust to – we are currently adjusting to more self-drive abilities. Of course, the testing and monitoring of these systems is very important but that will not stop the introduction of layers that will make them safer and more pleasant to drive.

The same is true of our modelling, which we will need to apply in ever more layers in order to leverage useful understanding which would not be accessible otherwise. Yes, we will need to use practical methods to test their fitness for purpose and reliability, and this might include the complete verification of some components (where this is feasible), but we cannot constrain ourselves to only models we completely understand.

Concluding Discussion

If the above seems obvious, then why am I bothering to write this? I think for a few reasons. Firstly, to answer the presumption that understanding one’s model must have priority over all other considerations (such as empirical adequacy) so that sometimes we must accept and use partially understood models. Secondly, to point out that such layering has benefits as well as difficulties – especially if it can stage abstraction into more verifiable steps and thus avoid huge leaps to simple but empirically-isolated models. Thirdly, because such layering will become increasingly common and necessary.

In order to extend our mental reach further, we will need to develop increasingly complicated and layered modelling. To do this we will need to accept that our understanding is leveraged via partially understood models, but also to develop the practical methods to ensure their adequacy for purpose.

Notes

[1] These are a compressed version of his actual words during a 1933 lecture, which were: “It can scarcely be denied that the supreme goal of all theory is to make the irreducible basic elements as simple and as few as possible without having to surrender the adequate representation of a single datum of experience.” (Robinson 2018)
[2] Adequate for whatever our purpose for it is (Edmonds & al. 2019).
[3]The weasel words I once heard from a Mathematician excusing an analytic model he knew to be simplistic were: that, although he knew it was wrong, it was useful for “capturing core dynamics” (though how he knew that they were not completely wrong eludes me).
[4] For an introduction to this approach read the European Centre for Medium-Range Weather Forecasts’ fact sheet on “Ensemble weather forecasting” at: https://www.ecmwf.int/en/about/media-centre/focus/2017/fact-sheet-ensemble-weather-forecasting
[5] In principle, a person could do all the calculations involved in a forecast but only with the aid of exterior tools such as pencil and paper to keep track of it all so it is arguable whether the person doing the individual calculations has an “understanding” of the complete picture. Lewis Fry Richardson, who pioneered the idea of numerical forecasting of weather in the 1920s, did a 1-day forecast by hand to illustrate his method (Lynch 2008), but this does not change the argument.
[6] An analytic solution is when one can obtain a closed-form equation that characterises all the outcomes by manipulating the mathematical symbols in a proof. If one has to numerically calculate outcomes for different initial conditions and parameters this is a computational solution.
[7] For purely predictive models, whose purpose is only to anticipate an unknown value to a useful level of accuracy, this is not strictly necessary. For example, how some AI/Machine learning models work may not clear at the micro-level, but as long as it works (successfully predicts) this does not matter – even if its predictive ability is due to a bug.
[8] Models may still be useful in this case, for example to check the assumptions made in the matching mathematical or other understanding.
[9] For more on this use see (Edmonds et al. 2019).
[10] For more about this project see http://cfpm.org/scid

Acknowledgements

Bruce Edmonds is supported as part of the ESRC-funded, UK part of the “ToRealSim” project, 2019-2023, grant number ES/S015159/1 and was supported as part of the EPSRC-funded “SCID” project 2010-2016, grant number EP/H02171X/1.

References

Calder, M., Craig, C., Culley, D., de Cani, R., Donnelly, C.A., Douglas, R., Edmonds, B., Gascoigne, J., Gilbert, N. Hargrove, C., Hinds, D., Lane, D.C., Mitchell, D., Pavey, G., Robertson, D., Rosewell, B., Sherwin, S., Walport, M. and Wilson, A. (2018) Computational modelling for decision-making: where, why, what, who and how. Royal Society Open Science, DOI:10.1098/rsos.172096.

Edmonds, B. (2013) Complexity and Context-dependency. Foundations of Science, 18(4):745-755. DOI:10.1007/s10699-012-9303-x

Edmonds, B. and Moss, S. (2005) From KISS to KIDS – an ‘anti-simplistic’ modelling approach. In P. Davidsson et al. (Eds.): Multi Agent Based Simulation 2004. Springer, Lecture Notes in Artificial Intelligence, 3415:130–144. DOI:10.1007/978-3-540-32243-6_11

Edmonds, B., le Page, C., Bithell, M., Chattoe-Brown, E., Grimm, V., Meyer, R., Montañola-Sales, C., Ormerod, P., Root H. & Squazzoni. F. (2019) Different Modelling Purposes. Journal of Artificial Societies and Social Simulation, 22(3):6. DOI:10.18564/jasss.3993

Fieldhouse, E., Lessard-Phillips, L. & Edmonds, B. (2016) Cascade or echo chamber? A complex agent-based simulation of voter turnout. Party Politics. 22(2):241-256. DOI:10.1177/1354068815605671

Lafuerza, LF, Dyson, L, Edmonds, B & McKane, AJ (2016a) Simplification and analysis of a model of social interaction in voting, European Physical Journal B, 89:159. DOI:10.1140/epjb/e2016-70062-2

Lafuerza L.F., Dyson L., Edmonds B., & McKane A.J. (2016b) Staged Models for Interdisciplinary Research. PLoS ONE, 11(6): e0157261. DOI:10.1371/journal.pone.0157261

Lynch, P. (2008). The origins of computer weather prediction and climate modeling. Journal of Computational Physics, 227(7), 3431-3444. DOI:10.1016/j.jcp.2007.02.034

Robinson, A. (2018) Did Einstein really say that? Nature, 557, 30. DOI:10.1038/d41586-018-05004-4

Thompson, E. (2022) Escape from Model Land. Basic Books. ISBN-13: 9781529364873

Edmonds, B. (2023) The inevitable “layering” of models to extend the reach of our understanding. Review of Artificial Societies and Social Simulation, 9 Feb 2023. https://rofasss.org/2023/02/09/layering

Meta, Model Reproduction Reports, review

RofASSS to encourage reproduction reports and reviews of old papers&books

February 8, 2023 someeditorialcontent 1 Comment

Reproducing simulation models is essential for verifying them and critiquing them. This involves a lot more work than one would think (Axtell & al. 1996) and can reveal surprising flaws, even in the simplest of models (e.g. Edmonds & Hales 2003). Such reproduction is especially vital if the model outcomes are likely to affect people’s lives (Chattoe-Brown & al. 2021).

Whilst substantial pieces of work – where there is extensive analysis or extension – can be submitted to JASSS/CMOT, some such reports might be much simpler and not justify a full journal paper. Thus RofASSS has decided to encourage researchers to submit reports of reproductions here – however simple or complicated.

Similarly, JASSS, CMOT etc. do publish book reviews, but these tend to be of recent books. Although new books are of obvious interest to those at the cutting edge of research, it often happens that important papers & books are forgotten or overlooked. At RofASSS we would like to encourage reviews of any relevant book or paper, however old.

References

Axtell, R., Axelrod, R., Epstein, J. M., & Cohen, M. D. (1996). Aligning simulation models: A case study and results. Computational & Mathematical Organization Theory, 1, 123-141. DOI: 10.1007/BF01299065

Edmonds, B., & Hales, D. (2003). Replication, replication and replication: Some hard lessons from model alignment. Journal of Artificial Societies and Social Simulation, 6(4), 11. https://jasss.soc.surrey.ac.uk/6/4/11.html

Chattoe-Brown, E. Gilbert, N., Robertson, D. A. & Watts, C. (2021) Reproduction as a Means of Evaluating Policy Models: A Case Study of a COVID-19 Simulation. medRxiv 2021.01.29.21250743; DOI: 10.1101/2021.01.29.21250743

Content

If you want to be cited, calibrate your agent-based model: A Reply to Chattoe-Brown

March 9, 2022 thesubmissionauthor 1 Comment

By Marijn A. Keijzer

This is a reply to a previous comment, (Chattoe-Brown 2022).

The social simulation literature has called on its proponents to enhance the quality and realism of their contributions through systematic validation and calibration (Flache et al., 2017). Model validation typically refers to assessments of how well the predictions of their agent-based models (ABMs) map onto empirically observed patterns or relationships. Calibration, on the other hand, is the process of enhancing the realism of the model by parametrizing it based on empirical data (Boero & Squazzoni, 2005). We would expect that presenting a validated or calibrated model serves as a signal of model quality, and would thus be a desirable characteristic of a paper describing an ABM.

In a recent contribution to RofASSS, Edmund Chattoe-Brown provocatively argued that model validation does not bear fruit for researchers interested in boosting their citations. In a sample of articles from JASSS published on opinion dynamics he observed that “the sample clearly divides into non-validated research with more citations and validated research with fewer” (Chattoe-Brown, 2022). Well-aware of the bias and limitations of the sample at hand, Chattoe-Brown calls on refutation of his hypothesis. An analysis of the corpus of articles in Web of Science, presented here, could serve that goal.

To test whether there exists an effect of model calibration and/or validation on the citation counts of papers, I compare citation counts of a larger number of original research articles on agent-based models published in the literature. I extracted 11,807 entries from Web of Science by searching for items that contained the phrases “agent-based model”, “agent-based simulation” or “agent-based computational model” in its abstract.^[1] I then labeled all items that mention “validate” in its abstract as validated ABMs and those that mention “calibrate” as calibrated ABMs. This measure if rather crude, of course, as descriptions containing phrases like “we calibrated our model” or “others should calibrate our model” are both labeled as calibrated models. However, if mentioning that future research should calibrate or validate the model is not related to citations counts (which I would argue it indeed is not), then this inaccuracy does not introduce bias.

The shares of entries that mention calibration or validation are somewhat small. Overall, just 5.62% of entries mention validation, 3.21% report a calibrated model and 0.65% fall in both categories. The large sample size, however, will still enable the execution of proper statistical analysis and hypothesis testing.

How are mentions of calibration and validation in the abstract related to citation counts at face value? Bivariate analyses show only minor differences, as revealed in Figure 1. In fact, the distribution of citations for validated and non-validated ABMs (panel A) is remarkably similar. Wilcoxon tests with continuity correction—the nonparametric version of the simple t test—corroborate their similarity (W = 3,749,512, p = 0.555). The differences in citations between calibrated and non-calibrated models appear, albeit still small, more pronounced. Calibrated ABMs are cited slightly more often (panel B), as also supported by a bivariate test (W = 1,910,772, p < 0.001).

Figure 1. Distributions of number of citations of all the entries in the dataset for validated (panel A) and calibrated (panel B) ABMs and their averages with standard errors over years (panels C and D)

Age of the paper might be a more important determinant of citation counts, as panels C and D of Figure 1 suggest. Clearly, the age of a paper should be important here, because older papers have had much more opportunity to get cited. In particular, papers younger than 10 years seem to not have matured enough for its citation rates to catch up to older articles. When comparing the citation counts of purely theoretical models with calibrated and validated versions, this covariate should not be missed, because the latter two are typically much younger. In other words, the positive relationship between model calibration/validation and citation counts could be hidden in the bivariate analysis, as model calibration and validation are recent trends in ABM research.

I run a Poisson regression on the number of citations as explained by whether they are validated and calibrated (simultaneously) and whether they are both. The age of the paper is taken into account, as well as the number of references that the paper uses itself (controlling for reciprocity and literature embeddedness, one might say). Finally, the fields in which the papers have been published, as registered by Web of Science, have been added to account for potential differences between fields that explains both citation counts and conventions about model calibration and validation.

Table 1 presents the results from the four models with just the main effects of validation and calibration (model 1), the interaction of validation and calibration (model 2) and the full model with control variables (model 3).

Table 1. Poisson regression on the number of citations

	# Citations
	(1)	(2)	(3)

Validated	-0.217^***	-0.298^***	-0.094^***
	(0.012)	(0.014)	(0.014)
Calibrated	0.171^***	0.064^***	0.076^***
	(0.014)	(0.016)	(0.016)
Validated x Calibrated		0.575^***	0.244^***
		(0.034)	(0.034)
Age			0.154^***
			(0.0005)
Cited references			0.013^***
			(0.0001)
Field included	No	No	Yes
Constant	2.553^***	2.556^***	0.337^**
	(0.003)	(0.003)	(0.164)

Observations	11,807	11,807	11,807
AIC	451,560	451,291	301,639

Note:	^p<0.1; ^p<0.05; ^**p<0.01

The results from the analyses clearly suggest a negative effect of model validation and a positive effect of model calibration on the likelihood of being cited. The hypothesis that was so “badly in need of refutation” (Chattoe-Brown, 2022) will remain unrefuted for now. The effect does turn positive, however, when the abstract makes mention of calibration as well. In both the controlled (model 3) and uncontrolled (model 2) analyses, combining the effects of validation and calibration yields a positive coefficient overall.^[2]

The controls in model 3 substantially affect the estimates from the three main factors of interest, while remaining in expected directions themselves. The age of a paper indeed helps its citation count, and so does the number of papers the item cites itself. The fields, furthermore, take away from the main effects somewhat, too, but not to a problematic degree. In an additional analysis, I have looked at the relationship between the fields and whether they are more likely to publish calibrated or validated models and found no substantial relationships. Citation counts will differ between fields, however. The papers in our sample are more often cited in, for example, hematology, emergency medicine and thermodynamics. The ABMs in the sample coming from toxicology, dermatology and religion are on the unlucky side of the equation, receiving less citations on average. Finally, I have also looked at papers published in JASSS specifically, due to the interest of Chattoe-Brown and the nature of this outlet. Surprisingly, the same analyses run on the subsample of these papers (N=376) showed a negative relationship between citation counts and model calibration/validation. Does the JASSS readership reveal its taste for artificial societies?

In sum, I find support for the hypothesis of Chattoe-Brown (2022) on the negative relationship between model validation and citations counts for papers presenting ABMs. If you want to be cited, you should not validate your ABM. Calibrated ABMs, on the other hand, are more likely to receive citations. What is more, ABMs that were both calibrated and validated are most the most successful papers in the sample. All conclusions were drawn considering (i.e. controlling for) the effects of age of the paper, the number of papers the paper cited itself, and (citation conventions in) the field in which it was published.

While the patterns explored in this and Chattoe-Brown’s recent contribution are interesting, or even puzzling, they should not distract from the goal of moving towards realistic agent-based simulations of social systems. In my opinion, models that combine rigorous theory with strong empirical foundations are instrumental to the creation of meaningful and purposeful agent-based models. Perhaps the results presented here should just be taken as another sign that citation counts are a weak signal of academic merit at best.

Data, code and supplementary analyses

All data and code used for this analysis, as well as the results from the supplementary analyses described in the text, are available here: https://osf.io/x9r7j/

Notes

^[1] Note that the hyphen between “agent” and “based” does not affect the retrieved corpus. Both contributions that mention “agent based” and “agent-based” were retrieved.

^[2] A small caveat to the analysis of the interaction effect is that the marginal improvement of model 2 upon model 1 is rather small (AIC difference of 269). This is likely (partially) due to the small number of papers that mention both calibration and validation (N=77).

Acknowledgements

Marijn Keijzer acknowledges IAST funding from the French National Research Agency (ANR) under the Investments for the Future (Investissements d’Avenir) program, grant ANR-17-EURE-0010.

References

Boero, R., & Squazzoni, F. (2005). Does empirical embeddedness matter? Methodological issues on agent-based models for analytical social science. Journal of Artificial Societies and Social Simulation, 8(4), 1–31. https://www.jasss.org/8/4/6.html

Chattoe-Brown, E. (2022) If You Want To Be Cited, Don’t Validate Your Agent-Based Model: A Tentative Hypothesis Badly In Need of Refutation. Review of Artificial Societies and Social Simulation, 1st Feb 2022. https://rofasss.org/2022/02/01/citing-od-models

Flache, A., Mäs, M., Feliciani, T., Chattoe-Brown, E., Deffuant, G., Huet, S., & Lorenz, J. (2017). Models of social influence: towards the next frontiers. Journal of Artificial Societies and Social Simulation, 20(4). https://doi.org/10.18564/jasss.3521

Keijzer, M. (2022) If you want to be cited, calibrate your agent-based model: Reply to Chattoe-Brown. Review of Artificial Societies and Social Simulation, 9th Mar 2022. https://rofasss.org/2022/03/09/Keijzer-reply-to-Chattoe-Brown

© The authors under the Creative Commons’ Attribution-NoDerivs (CC BY-ND) Licence (v4.0)

Content

If You Want To Be Cited, Don’t Validate Your Agent-Based Model: A Tentative Hypothesis Badly In Need of Refutation

February 1, 2022 thesubmissionauthor 1 Comment

By Edmund Chattoe-Brown

As part of a previous research project, I collected a sample of the Opinion Dynamics (hereafter OD) models published in JASSS that were most highly cited in JASSS. The idea here was to understand what styles of OD research were most influential in the journal. In the top 50 on 19.10.21 there were eight such articles. Five were self-contained modelling exercises (Hegselmann and Krause 2002, 58 citations, Deffuant et al. 2002, 35 citations, Salzarulo 2006, 13 citations, Deffuant 2006, 13 citations and Urbig et al. 2008, 9 citations), two were overviews of OD modelling (Flache et al. 2017, 13 citations and Sobkowicz 2009, 10 citations) and one included an OD example in an article mainly discussing the merits of cellular automata modelling (Hegselmann and Flache 1998, 12 citations). In order to get in to the top 50 on that date you had to achieve at least 7 citations. In parallel, I have been trying to identify Agent-Based Models that are validated (undergo direct comparison of real and equivalent simulated data). Based on an earlier bibliography (Chattoe-Brown 2020) which I extended to the end of 2021 for JASSS and articles which were described as validated in the highly cited articles listed above, I managed to construct a small and unsystematic sample of validated OD models. (Part of the problem with a systematic sample is that validated models are not readily searchable as a distinct category and there are too many OD models overall to make reading them all feasible. Also, I suspect, validated models just remain rare in line with the larger scale findings of Dutton and Starbuck (1971, p. 130, table 1) and discouragingly, much more recently, Angus and Hassani-Mahmooei (2015, section 4.5, figure 9). Obviously, since part of the sample was selected by total number of citations, one cannot make a comparison on that basis, so instead I have used the best possible alternative (given the limitations of the sample) and compared articles on citations per year. The problem here is that attempting validated modelling is relatively new while older articles inevitably accumulate citations however slowly. But what I was trying to discover was whether new validated models could be cited at a much higher annual rate without reaching the top 50 (or whether, conversely, older articles could have a high enough total citations to get into the top 50 without having a particularly impressive annual citation rate.) One would hope that, ultimately, validated models would tend to receive more citations than those that were not validated (but see the rather disconcerting related findings of Serra-Garcia and Gneezy 2021). Table 1 shows the results sorted by citations per year.

Article	Status	Number of JASSS Citations[1]	Number of Years[2]	Citations Per Year
Bernardes et al. 2002	Validated	1	20	0.05
Bernardes et al. 2001	Validated	2	21	0.096
Fortunato and Castellano 2007	Validated	2	15	0.13
Caruso and Castorina 2005	Validated	4	17	0.24
Chattoe-Brown 2014	Validated	2	8	0.25
Brousmiche et al. 2016	Validated	2	6	0.33
Hegselmann and Flache 1998	Non-Validated	12	24	0.5
Urbig et al. 2008	Non-Validated	9	14	0.64
Sobkowicz 2009	Non-Validated	10	13	0.77
Deffuant 2006	Non-Validated	13	16	0.81
Salzarulo 2006	Non-Validated	13	16	0.81
Duggins 2017	Validated	5	5	1
Deffuant et al. 2002	Non-Validated	35	20	1.75
Flache et al. 2017	Non-Validated	13	5	2.6
Hegselmann and Krause 2002	Non-Validated	58	20	2.9

Table 1. Annual Citation Rates for OD Articles Highly Cited in JASSS (Systematic Sample) and Validated OD Articles in or Cited in JASSS (Unsystematic Sample)

With the notable (and potentially encouraging) exception of Duggins (2017), the most recent validated OD model I have been able to discover in JASSS, the sample clearly divides into non-validated research with more citations and validated research with fewer. The position of Duggins (2017) might suggest greater recent interest in validated OD models. Unfortunately, however, qualitative analysis of the citations suggests that these are not cited as validated models per se (and thus as a potential improvement over non-validated models) but merely as part of general classes of OD model (like those involving social networks or repulsion – moving away from highly discrepant opinions). This tendency to cite validated models without acknowledging that they are validated (and what the implications of that might be) is widespread in the articles I looked at.

Obviously, there is plenty wrong with this analysis. Even looking at citations per annum we are arguably still partially sampling on the dependent variable (articles selected for being widely cited prove to be widely cited!) and the sample of validated OD models is unsystematic (though in fairness the challenges of producing a systematic sample are significant.[3]) But the aim here is to make a distinctive use of RoFASSS as a rapid mode of permanent publication and to think differently about science. If I tried to publish this in a peer reviewed journal, the amount of labour required to satisfy reviewers about the research design would probably be prohibitive (even if it were possible). As a result, the case to answer about this apparent (and perhaps undesirable) pattern in data might never see the light of day.

But by publishing quickly in RoFASSS without the filter of peer review I actively want my hypothesis to be rejected or replaced by research based on a better design (and such research may be motivated precisely by my presenting this interesting pattern with all its imperfections). When it comes to scientific progress, the chance to be clearly wrong now could be more useful than the opportunity to be vaguely right at some unknown point in the future.

Acknowledgements

This analysis was funded by the project “Towards Realistic Computational Models Of Social Influence Dynamics” (ES/S015159/1) funded by ESRC via ORA Round 5 (PI: Professor Bruce Edmonds, Centre for Policy Modelling, Manchester Metropolitan University: https://gtr.ukri.org/projects?ref=ES%2FS015159%2F1).

Notes

[1] Note that the validated OD models had their citations counted manually while the high total citation articles had them counted automatically. This may introduce some comparison error but there is no reason to think that either count will be terribly inaccurate.

[2] Including the year of publication and the current year (2021).

[3] Note, however, that there are some checks and balances on sample quality. Highly successful validated OD models would have shown up independently in the top 50. There is thus an upper bound to the impact of the articles I might have missed in manually constructing my “version 1” bibliography. The unsystematic review of 47 articles by Sobkowicz (2009) also checks independently on the absence of validated OD models in JASSS to that date and confirms the rarity of such articles generally. Only four of the articles that he surveys are significantly empirical.

References

Angus, Simon D. and Hassani-Mahmooei, Behrooz (2015) ‘“Anarchy” Reigns: A Quantitative Analysis of Agent-Based Modelling Publication Practices in JASSS, 2001-2012’, Journal of Artificial Societies and Social Simulation, 18(4), October, article 16, <http://jasss.soc.surrey.ac.uk/18/4/16.html>. doi:10.18564/jasss.2952

Bernardes, A. T., Costa, U. M. S., Araujo, A. D. and Stauffer, D. (2001) ‘Damage Spreading, Coarsening Dynamics and Distribution of Political Votes in Sznajd Model on Square Lattice’, International Journal of Modern Physics C: Computational Physics and Physical Computation, 12(2), February, pp. 159-168. doi:10.1140/e10051-002-0013-y

Bernardes, A. T., Stauffer, D. and Kertész, J. (2002) ‘Election Results and the Sznajd Model on Barabasi Network’, The European Physical Journal B: Condensed Matter and Complex Systems, 25(1), January, pp. 123-127. doi:10.1142/S0129183101001584

Brousmiche, Kei-Leo, Kant, Jean-Daniel, Sabouret, Nicolas and Prenot-Guinard, François (2016) ‘From Beliefs to Attitudes: Polias, A Model of Attitude Dynamics Based on Cognitive Modelling and Field Data’, Journal of Artificial Societies and Social Simulation, 19(4), October, article 2, <https://www.jasss.org/19/4/2.html>. doi:10.18564/jasss.3161

Caruso, Filippo and Castorina, Paolo (2005) ‘Opinion Dynamics and Decision of Vote in Bipolar Political Systems’, arXiv > Physics > Physics and Society, 26 March, version 2. doi:10.1142/S0129183105008059

Chattoe-Brown, Edmund (2014) ‘Using Agent Based Modelling to Integrate Data on Attitude Change’, Sociological Research Online, 19(1), February, article 16, <https://www.socresonline.org.uk/19/1/16.html>. doi:0.5153/sro.3315

Chattoe-Brown Edmund (2020) ‘A Bibliography of ABM Research Explicitly Comparing Real and Simulated Data for Validation: Version 1’, CPM Report CPM-20-216, 12 June, <http://cfpm.org/discussionpapers/256>

Deffuant, Guillaume (2006) ‘Comparing Extremism Propagation Patterns in Continuous Opinion Models’, Journal of Artificial Societies and Social Simulation, 9(3), June, article 8, <https://www.jasss.org/9/3/8.html>.

Deffuant, Guillaume, Amblard, Frédéric, Weisbuch, Gérard and Faure, Thierry (2002) ‘How Can Extremism Prevail? A Study Based on the Relative Agreement Interaction Model’, Journal of Artificial Societies and Social Simulation, 5(4), October, article 1, <https://www.jasss.org/5/4/1.html>.

Duggins, Peter (2017) ‘A Psychologically-Motivated Model of Opinion Change with Applications to American Politics’, Journal of Artificial Societies and Social Simulation, 20(1), January, article 13, <http://jasss.soc.surrey.ac.uk/20/1/13.html>. doi:10.18564/jasss.3316

Dutton, John M. and Starbuck, William H. (1971) ‘Computer Simulation Models of Human Behavior: A History of an Intellectual Technology’, IEEE Transactions on Systems, Man, and Cybernetics, SMC-1(2), April, pp. 128-171. doi:10.1109/TSMC.1971.4308269

Flache, Andreas, Mäs, Michael, Feliciani, Thomas, Chattoe-Brown, Edmund, Deffuant, Guillaume, Huet, Sylvie and Lorenz, Jan (2017) ‘Models of Social Influence: Towards the Next Frontiers’, Journal of Artificial Societies and Social Simulation, 20(4), October, article 2, <http://jasss.soc.surrey.ac.uk/20/4/2.html>. doi:10.18564/jasss.3521

Fortunato, Santo and Castellano, Claudio (2007) ‘Scaling and Universality in Proportional Elections’, Physical Review Letters, 99(13), 28 September, article 138701. doi:10.1103/PhysRevLett.99.138701

Hegselmann, Rainer and Flache, Andreas (1998) ‘Understanding Complex Social Dynamics: A Plea For Cellular Automata Based Modelling’, Journal of Artificial Societies and Social Simulation, 1(3), June, article 1, <https://www.jasss.org/1/3/1.html>.

Hegselmann, Rainer and Krause, Ulrich (2002) ‘Opinion Dynamics and Bounded Confidence Models, Analysis, and Simulation’, Journal of Artificial Societies and Social Simulation, 5(3), June, article 2, <http://jasss.soc.surrey.ac.uk/5/3/2.html>.

Salzarulo, Laurent (2006) ‘A Continuous Opinion Dynamics Model Based on the Principle of Meta-Contrast’, Journal of Artificial Societies and Social Simulation, 9(1), January, article 13, <http://jasss.soc.surrey.ac.uk/9/1/13.html>.

Serra-Garcia, Marta and Gneezy, Uri (2021) ‘Nonreplicable Publications are Cited More Than Replicable Ones’, Science Advances, 7, 21 May, article eabd1705. doi:10.1126/sciadv.abd1705

Sobkowicz, Pawel (2009) ‘Modelling Opinion Formation with Physics Tools: Call for Closer Link with Reality’, Journal of Artificial Societies and Social Simulation, 12(1), January, article 11, <http://jasss.soc.surrey.ac.uk/12/1/11.html>.

Urbig, Diemo, Lorenz, Jan and Herzberg, Heiko (2008) ‘Opinion Dynamics: The Effect of the Number of Peers Met at Once’, Journal of Artificial Societies and Social Simulation, 11(2), March, article 4, <http://jasss.soc.surrey.ac.uk/11/2/4.html>.

© The authors under the Creative Commons’ Attribution-NoDerivs (CC BY-ND) Licence (v4.0)

Content

Reply to Frank Dignum

November 10, 2021 thesubmissionauthor Leave a comment

By Edmund Chattoe-Brown

This is a reply to Frank Dignum’s reply (about Edmund Chattoe-Brown’s review of Frank’s book)

As my academic career continues, I have become more and more interested in the way that people justify their modelling choices, for example, almost every Agent-Based Modeller makes approving noises about validation (in the sense of comparing real and simulated data) but only a handful actually try to do it (Chattoe-Brown 2020). Thus I think two specific statements that Frank makes in his response should be considered carefully:

“… we do not claim that we have the best or only way of developing an Agent-Based Model (ABM) for crises.” Firstly, negative claims (“This is not a banana”) are not generally helpful in argument. Secondly, readers want to know (or should want to know) what is being claimed and, importantly, how they would decide if it is true “objectively”. Given how many models sprang up under COVID it is clear that what is described here cannot be the only way to do it but the question is how do we know you did it “better?” This was also my point about institutionalisation. For me, the big lesson from COVID was how much the automatic response of the ABM community seems to be to go in all directions and build yet more models in a tearing hurry rather than synthesise them, challenge them or test them empirically. I foresee a problem both with this response and our possible unwillingness to be self-aware about it. Governments will not want a million “interesting” models to choose from but one where they have externally checkable reasons to trust it and that involves us changing our mindset (to be more like climate modellers for example, Bithell & Edmonds 2020). For example, colleagues and I developed a comparison methodology that allowed for the practical difficulties of direct replication (Chattoe-Brown et al. 2021).
The second quotation which amplifies this point is: “But we do think it is an extensive foundation from which others can start, either picking up some bits and pieces, deviating from it in specific ways or extending it in specific ways.” Again, here one has to ask the right question for progress in modelling. On what scientific grounds should people do this? On what grounds should someone reuse this model rather than start their own? Why isn’t the Dignum et al. model built on another “market leader” to set a good example? (My point about programming languages was purely practical not scientific. Frank is right that the model is no less valid because the programming language was changed but a version that is now unsupported seems less useful as a basis for the kind of further development advocated here.)

I am not totally sure I have understood Frank’s point about data so I don’t want to press it but my concern was that, generally, the book did not seem to “tap into” relevant empirical research (and this is a wider problem that models mostly talk about other models). It is true that parameter values can be adjusted arbitrarily in sensitivity analysis but that does not get us any closer to empirically justified parameter values (which would then allow us to attempt validation by the “generative methodology”). Surely it is better to build a model that says something about the data that exists (however imperfect or approximate) than to rely on future data collection or educated guesses. I don’t really have the space to enumerate the times the book said “we did this for simplicity”, “we assumed that” etc. but the cumulative effect is quite noticeable. Again, we need to be aware of the models which use real data in whatever aspects and “take forward” those inputs so they become modelling standards. This has to be a collective and not an individualistic enterprise.

References

Bithell, M. and Edmonds, B. (2020) The Systematic Comparison of Agent-Based Policy Models – It’s time we got our act together!. Review of Artificial Societies and Social Simulation, 11th May 2021. https://rofasss.org/2021/05/11/SystComp/

Chattoe-Brown, E. (2020) A Bibliography of ABM Research Explicitly Comparing Real and Simulated Data for Validation. Review of Artificial Societies and Social Simulation, 12th June 2020. https://rofasss.org/2020/06/12/abm-validation-bib/

Chattoe-Brown, E. (2021) A review of “Social Simulation for a Crisis: Results and Lessons from Simulating the COVID-19 Crisis”. Journal of Artificial Society and Social Simulation. 24(4). https://www.jasss.org/24/4/reviews/1.html

Chattoe-Brown, E., Gilbert, N., Robertson, D. A., & Watts, C. J. (2021). Reproduction as a Means of Evaluating Policy Models: A Case Study of a COVID-19 Simulation. medRxiv 2021.01.29.21250743; DOI: https://doi.org/10.1101/2021.01.29.21250743

Dignum, F. (2020) Response to the review of Edmund Chattoe-Brown of the book “Social Simulations for a Crisis”. Review of Artificial Societies and Social Simulation, 4th Nov 2021. https://rofasss.org/2021/11/04/dignum-review-response/

Dignum, F. (Ed.) (2021) Social Simulation for a Crisis: Results and Lessons from Simulating the COVID-19 Crisis. Springer. DOI:10.1007/978-3-030-76397-8

Chattoe-Brown, E. (2021) Reply to Frank Dignum. Review of Artificial Societies and Social Simulation, 10th November 2021. https://rofasss.org/2021/11/10/reply-to-dignum/

© The authors under the Creative Commons’ Attribution-NoDerivs (CC BY-ND) Licence (v4.0)

Content

Response to the review of Edmund Chattoe-Brown of the book “Social Simulations for a Crisis”

November 4, 2021 thesubmissionauthor Leave a comment

By Frank Dignum

This is a reply to a review in JASSS (Chattoe-Brown 2021) of (Dignum 2021).

Before responding to some of the specific concerns of Edmund I would like to thank him for the thorough review. I am especially happy with his conclusion that the book is solid enough to make it a valuable contribution to scientific progress in modelling crises. That was the main aim of the book and it seems that is achieved. I want to reiterate what we already remarked in the book; we do not claim that we have the best or only way of developing an Agent-Based Model (ABM) for crises. Nor do we claim that our simulations were without limitations. But we do think it is an extensive foundation from which others can start, either picking up some bits and pieces, deviating from it in specific ways or extending it in specific ways.

The concerns that are expressed by Edmund are certainly valid. I agree with some of them, but will nuance some others. First of all the concern about the fact that we seem to abandon the NetLogo implementation and move to Repast. This fact does not make the ABM itself any less valid! In itself it is also an important finding. It is not possible to scale such a complex model in NetLogo beyond around two thousand agents. This is not just a limitation of our particular implementation, but a more general limitation of the platform. It leads to the important challenge to get more computer scientists involved to develop platforms for social simulations that both support the modelers adequately and provide efficient and scalable implementations.

That the sheer size of the model and the results make it difficult to trace back the importance and validity of every factor on the results is completely true. We have tried our best to highlight the most important aspects every time. But, this leaves questions as to whether we make the right selection of highlighted aspects. As an illustration to this, we have been busy for two months to justify our results of the simulations of the effectiveness of the track and tracing apps. We basically concluded that we need much better integrated analysis tools in the simulation platform. NetLogo is geared towards creating one simulation scenario, running the simulation and analyzing the results based on a few parameters. This is no longer sufficient when we have a model with which we can create many scenarios and have many parameters that influence a result. We used R now to interpret the flood of data that was produced with every scenario. But, R is not really the most user friendly tool and also not specifically meant for analyzing the data from social simulations.

Let me jump to the third concern of Edmund and link it to the analysis of the results as well. While we tried to justify the results of our simulation on the effectiveness of the track and tracing app we compared our simulation with an epidemiological based model. This is described in chapter 12 of the book. Here we encountered the difference in assumed number of contacts per day a person has with other persons. One can take the results, as quoted by Edmund as well, of 8 or 13 from empirical work and use them in the model. However, the dispute is not about the number of contacts a person has per day, but what counts as a contact! For the COVID-19 simulations standing next to a person in the queue in a supermarket for five minutes can count as a contact, while such a contact is not a meaningful contact in the cited literature. Thus, we see that what we take as empirically validated numbers might not at all be the right ones for our purpose. We have tried to justify all the values of parameters and outcomes in the context for which the simulations were created. We have also done quite some sensitivity analyses, which we did not all report on just to keep the volume of the book to a reasonable size. Although we think we did a proper job in justifying all results, that does not mean that one can have different opinions on the value that some parameters should have. It would be very good to check the influence on the results of changes in these parameters. This would also progress scientific insights in the usefulness of complex models like the one we made!

I really think that an ABM crisis response should be institutional. That does not mean that one institution determines the best ABM, but rather that the ABM that is put forward by that institution is the result of a continuous debate among scientists working on ABM’s for that type of crisis. For us, one of the more important outcomes of the ASSOCC project is that we really need much better tools to support the types of simulations that are needed for a crisis situation. However, it is very difficult to develop these tools as a single group. A lot of the effort needed is not publishable and thus not valued in an academic environment. I really think that the efforts that have been put in platforms such as NetLogo and Repast are laudable. They have been made possible by some generous grants and institutional support. We argue that this continuous support is also needed in order to be well equipped for a next crisis. But we do not argue that an institution would by definition have the last word in which is the best ABM. In an ideal case it would accumulate all academic efforts as is done in the climate models, but even more restricted models would still be better than just having a thousand individuals all claiming to have a useable ABM while governments have to react quickly to a crisis.

The final concern of Edmund is about the empirical scale of our simulations. This is completely true! Given the scale and details of what we can incorporate we can only simulate some phenomena and certainly not everything around the COVID-19 crisis. We tried to be clear about this limitation. We had discussions about the Unity interface concerning this as well. It is in principle not very difficult to show people walking in the street, taking a car or a bus, etc. However, we decided to show a more abstract representation just to make clear that our model is not a complete model of a small town functioning in all aspects. We have very carefully chosen which scenarios we can realistically simulate and give some insights in reality from. Maybe we should also have discussed more explicitly all the scenarios that we did not run with the reasons why they would be difficult or unrealistic in our ABM. One never likes to discuss all the limitations of one’s labor, but it definitely can be very insightful. I have made up for this a little bit by submitting an to a special issue on predictions with ABM in which I explain in more detail, which should be the considerations to use a particular ABM to try to predict some state of affairs. Anyone interested to learn more about this can contact me.

To conclude this response to the review, I again express my gratitude for the good and thorough work done. The concerns that were raised are all very valuable to concern. What I tried to do in this response is to highlight that these concerns should be taken as a call to arms to put effort in social simulation platforms that give better support for creating simulations for a crisis.

References

Dignum, F. (Ed.) (2021) Social Simulation for a Crisis: Results and Lessons from Simulating the COVID-19 Crisis. Springer. DOI:10.1007/978-3-030-76397-8

Dignum, F. (2020) Response to the review of Edmund Chattoe-Brown of the book “Social Simulations for a Crisis”. Review of Artificial Societies and Social Simulation, 4th Nov 2021. https://rofasss.org/2021/11/04/dignum-review-response/

© The authors under the Creative Commons’ Attribution-NoDerivs (CC BY-ND) Licence (v4.0)

Content

Does It Take Two (And A Creaky Search Engine) To Make An Outstation? Hunting Highly Cited Opinion Dynamics Articles in the Journal of Artificial Societies and Social Simulation (JASSS)

August 19, 2021 thesubmissionauthor Leave a comment

By Edmund Chattoe-Brown

In an important article, Squazzoni and Casnici (2013) raise the issue of how social simulation (as manifested in the Journal of Artificial Societies and Social Simulation – hereafter JASSS – the journal that has probably published the most of this kind of research for longest) cites and is cited in the wider scientific community. They discuss this in terms of social simulation being a potential “outstation” of social science (but better integrated into physical science and computing). This short note considers the same argument in reverse. As an important site of social simulation research, is it the case that JASSS is effectively representing research done more widely across the sciences?

The method used to investigate this was extremely simple (and could thus easily be extended and replicated). On 28.08.21, using the search term “opinion dynamics” in “all fields”, all sources from Web of Science (www.webofknowledge.com, hereafter WOS) that were flagged as “highly cited” were selected as a sample. For each article (only articles turned out to be highly cited), the title was searched in JASSS and the number of hits recorded. Common sense was applied in this search process to maximise the chances of success. So if a title had two sub clauses, these were searched jointly as quotations (to avoid the “hits” being very sensitive to the reproduction of punctuation linking clauses.) In addition, the title of the journal in which the article appeared was searched to give a wider sense of how well the relevant journal is known is JASSS.

However, now we come to the issue of the creaky search engine (as well as other limitations of quick and dirty searches). Obviously searching for the exact title will not find variants of that title with spelling mistakes or attempts to standardise spelling (i. e. changing behavior to behaviour). Further, it turns out that the Google search engine (which JASSS uses) does not promise the consistency that often seems to be assumed for it (http://jdebp.uk/FGA/google-result-counts-are-a-meaningless-metric.html). For example, when I searched for “SIAM Review” I mostly got 77 hits, rather often 37 hits and very rarely 0 or 1 hits. (PDFs are available for three of these outcomes from the author but the fourth could not be reproduced to be recorded in the time available.) This result occurred when another search took place seconds after the first so it is not, for example, a result of substantive changes to the content of JASSS. To deal with this problem I tried to confirm the presence of a particular article by searching jointly for all its co-authors. Mostly this approach gave a similar result (but where it does not it is noted in the table below). In addition, wherever there were a relatively large number of hits for a specific search, some of these were usually not the ones intended. (For example no hit on the term “global challenges” actually turned out to be for the journal Global Challenges.) In addition, JASSS often gives an oddly inconsistent number of hits for a specific article: It may appear as PDF and HTML as well as in multiple indices or may occur just once. (This discouraged attempts to go from hits to the specific number of unique articles citing these WOS sources. As it turns out, this additional detail would have added little to the headline result.)

The term “opinion dynamics” was chosen somewhat arbitrarily (for reasons connected with other research) and it is not claimed that this term is even close to a definitive way of capturing any models connected with opinion/attitude change. Nonetheless, it is clear that the number of hits and the type of articles reported on WOS (which is curated and quality controlled) are sufficient (and sufficiently relevant) for this to be a serviceable search term to identify a solid field of research in JASSS (and elsewhere). I shall return to this issue.

The results, shown in the table below are striking on several counts. (All these sources are fully cited in the references at the end of this article.) Most noticeably, JASSS is barely citing a significant number of articles that are very widely cited elsewhere. Because these are highly cited in WOS this cannot be because they are too new or too inaccessible. The second point is the huge discrepancy in citation for the one article on the WOS list that appears in JASSS itself (Flache et al. 2017). Thirdly, although some of these articles appear in journals that JASSS otherwise does not cite (like Global Challenges and Dynamic Games and Applications) others appear in journals that are known to JASSS and generally cited (like SIAM Review).

Reference	WOS Citations	Article Title Hits in JASSS	Journal Title Hits in JASSS
Acemoglu and Ozdaglar (2011)	301	0 (1 based on joint authors)	2
Motsch and Tadmor (2014)	214	0	77
Van Der Linden et al. (2017)	191	0	6 (but none for the journal)
Acemoğlu et al. (2013)	186	1	2 (but 1 article)
Proskurnikov et al. (2016)	165	0	9
Dong et al. (2017)	147	0	48 (but rather few for the journal)
Jia et al. (2015)	118	0	77
Dong et al. (2018)	117	0 (1 based on joint authors)	48 (but rather few for the journal)
Flache et al. (2017)	86	58 (17 based on joint authors)	N/A
Urena et al. (2019)	72	0	6
Bu et al. (2020)	56	0	5
Zhang et al. (2020)	55	0	33 (but only some of these are for the journal)
Xiong et al. (2020)	28	0	1
Carrillo et al. (2020)	13	0	0

One possible interpretation of this result is simply that none of the most highly cited articles in WOS featuring the term “opinion dynamics” happen to be more than incidentally relevant to the scientific interests of JASSS. On consideration, however, this seems a rather improbable coincidence. Firstly, these articles were chosen exactly because they are highly cited so we would have to explain how they could be perceived as so useful generally but specifically not in JASSS. Secondly, the same term (“opinion dynamics”) consistently generates 254 hits in JASSS, suggesting that the problem isn’t a lack of overlap in terminology or research interests.

This situation, however, creates a problem for more conclusive explanation. The state of affairs here is not that these articles are being cited and then rejected on scientific grounds given the interests of JASSS (thus providing arguments I could examine). It is that they are barely being cited at all. Unfortunately, it is almost impossible to establish why something is not happening. Perhaps JASSS authors are not aware of these articles to begin with. Perhaps they are aware but do not see the wider scientific value of critiquing them or attempting to engage with their irrelevance in print.

But, given that the problem is non citation, my concern can be made more persuasive (perhaps as persuasive as it can be given problems of convincingly explaining an absence) by investigating the articles themselves. (My thanks are due to Bruce Edmonds for encouraging me to strengthen the argument in this way.) There are definitely some recurring patterns in this sample. Firstly, a significant proportion of the articles are highly mathematical and, therefore (as Agent-Based Modelling often criticises) rely on extreme simplifying assumptions and toy examples. Even here, however, it is not self-evident that such articles should not be cited in JASSS merely because they are mathematical. JASSS has itself published relatively mathematical articles and, if an article contains a mathematical model that could be “agentised” (thus relaxing its extreme assumptions) which is no less empirical than similar models in JASSS (or has particularly interesting behaviours) then it is hard to see why this should not be discussed by at least a few JASSS authors. A clear example of this is provided by Acemoğlu et al. (2013) which argues that existing opinion dynamics models fail to produce the ongoing fluctuations of opinion observed in real data (see, for example, Figures 1-3 in Chattoe-Brown 2014 which also raises concerns about the face validity of popular social simulations of opinion dynamics). In fact, the assumptions of this model could easily be questioned (and real data involves turning points and not just fluctuations) but the point is that JASSS articles are not citing it and rejecting it based on argument but simply not citing it. A model capable of generating ongoing opinion fluctuations (however imperfect) is simply too important to the current state of opinion dynamics research in social simulation not to be considered at all. Another (though less conclusive) example is Motsch and Tadmor (2014) which presents a model suggesting (counter intuitively) that interaction based on heterophily can better achieve consensus than interaction based on homophily. Of course one can reject such an assumption on empirical grounds but JASSS is not currently doing that (and in fact the term heterophily is unknown in the journal except for the title of a cited article.)

Secondly, there are also a number of articles which, while not providing important results seem no less plausible or novel than typical OD articles that are published in JASSS. For example, Jia et al. (2015) add self-appraisal and social power to a standard OD model. Between debates, agents amend the efficacy they believe that they and others have in terms of swaying the outcome and take that into account going forward. Proskurnikov et al. (2016) present the results of a model in which agents can have negative ties with each other (as well as the more usual positive ones) and thus consider the coevolution of positive/negative sentiments and influence (describing what they call hostile camps i. e. groups with positive ties to each other and negative ties to other groups). This is distinct from the common repulsive effect in OD models where agents do not like the opinions of others (rather than disliking the others themselves.)

Finally, both Dong et al. (2017) and Zhang et al. (2020) reach for the idea (through modelling) that experts and leaders in OD models may not just be randomly scattered through the population as types but may exist because of formal organisations or accidents of social structure: This particular agent is either deliberately appointed to have more influence or happens to have it because of their network position.

On a completely different tack, two articles (Dong et al. 2018 and Acemoglu and Ozdaglar 2011) are literature reviews or syntheses on relevant topics and it is hard to see how such broad ranging articles could have so little value to OD research in JASSS.

It will be admitted that some of the articles in the sample are hard to evaluate with certainty. Mathematical approaches often seem to be more interested in generating mathematics than in justifying its likely value. This is particularly problematic when combined with a suggestion that the product of the research may be instrumental algorithms (designed to get things done) rather than descriptive ones (designed to understand social behaviour). An example of this is several articles which talk about achieving consensus without really explaining whether this is a technical goal (for example in a neural network) or a social phenomenon and, if the latter, whether this places constraints on what it legitimate: You can reach consensus by debate but not by shooting dissenters!

But as well as specific ideas in specific models, this sample of articles also suggest a different emphasis from those currently found within JASSS OD research. For example, there is much more interest in deliberately achieving consensus (and the corresponding hazards of manipulation or misinformation impeding that.) Reading these articles collectively gives a sense that JASSS OD models are very much liberal democratic: Agents honestly express their views (or at most are somewhat reticent to protect themselves.) They decently expect the will of the people to prevail. They do not lie strategically to sway the influential, spread rumours to discredit the opinions of opponents or flood the debate with bots. Again, this darker vision is no more right a priori than the liberal democratic one but JASSS should at least be engaging with articles modelling (or providing data on – see Van Der Linden et al. 2017) such phenomena in an OD context. (Although misinformation is mentioned in some OD articles in JASSS it does not seem to be modelled. There also seems to be another surprising glitch in the search engine which considers the term “fake news” to be a hit for misinformation!) This also puts a new slant on an ongoing challenge in OD research, identifying a plausible relationship between fact and opinion. Is misinformation a different field of research (on the grounds that opinions can never be factually wrong) or is it possible for the misinformed to develop mis-opinions? (Those that they would change if what they knew changed.) Is it really the case that Brexiteers, for example, are completely indifferent to the economic consequences which will reveal themselves or did they simply have mistaken beliefs about how high those costs might turn out to be which will cause them to regret their decision at some later stage?

Thus to sum up, while some of the articles in the sample can be dismissed as either irrelevant to JASSS or having a potential relevance that is hard to establish, the majority cannot reasonably be regarded in this way (and a few are clearly important to the existing state of OD research.) While we cannot explain why these articles are not in fact cited, we can thus call into question one possible (Panglossian) explanation for the observed pattern (that they are not cited because they have nothing to contribute).

Apart from the striking nature of the result and its obvious implication (if social simulators want to be cited more widely they need to make sure they are also citing the work of others appropriately) this study has two wider (related) implications for practice.

Firstly, systematic literature reviewing (see, for example, Hansen et al. 2019 – not published in JASSS) needs to be better enforced in social simulation: “Systematic literature review” gets just 7 hits in JASSS. It is not enough to cite just what you happen to have read or models that resemble your own, you need to be citing what the community might otherwise not be aware of or what challenges your own model assumptions. (Although, in my judgement, key assumptions of Acemoğlu et al. 2013 are implausible I don’t think that I could justify non subjectively that they are any more implausible than those of those of the Zaller-Deffuant model – Malarz et al. 2011 – given the huge awareness discrepancy which the two models manifest in social simulation.)

Secondly, we need to rethink the nature of literature reviewing as part of progressive research. I have used “opinion dynamics” here not because it is the perfect term to identify all models of opinion and attitude change but because it throws up enough hits to show that this term is widely used in social simulation. Because I have clearly stated my search term, others can critique it and extend my analysis using other relevant terms like “opinion change” or “consensus formation”. A literature review that is just a bunch of arbitrary stuff cannot be critiqued or improved systematically (rather than nit-picked for specific omissions – as reviewers often do – and even then the critique can’t tell what should have been included if there are no clearly stated search criteria.) It should not be possible for JASSS (and the social simulation community it represents) simply to disregard articles as potentially important in their implications for OD as Acemoğlu et al. (2013). Even if this article turned out to be completely wrong-headed, we need to have enough awareness of it to be able to say why before setting it aside. (Interestingly, the one citation it does receive in JASSS can be summarised as “there are some other model broadly like this” with no detailed discussion at all – and thus no clear statement of how the model presented in the citing article adds to previous models – but uninformative citation is a separate problem.)

Acknowledgements

This article as part of “Towards Realistic Computational Models of Social Influence Dynamics” a project funded through ESRC (ES/S015159/1) by ORA Round 5.

References

Acemoğlu, Daron and Ozdaglar, Asuman (2011) ‘Opinion Dynamics and Learning in Social Networks’, Dynamic Games and Applications, 1(1), March, pp. 3-49. doi:10.1007/s13235-010-0004-1

Acemoğlu, Daron, Como, Giacomo, Fagnani, Fabio and Ozdaglar, Asuman (2013) ‘Opinion Fluctuations and Disagreement in Social Networks’, Mathematics of Operations Research, 38(1), February, pp. 1-27. doi:10.1287/moor.1120.0570

Bu, Zhan, Li, Hui-Jia, Zhang, Chengcui, Cao, Jie, Li, Aihua and Shi, Yong (2020) ‘Graph K-Means Based on Leader Identification, Dynamic Game, and Opinion Dynamics’, IEEE Transactions on Knowledge and Data Engineering, 32(7), July, pp. 1348-1361. doi:10.1109/TKDE.2019.2903712

Carrillo, J. A., Gvalani, R. S., Pavliotis, G. A. and Schlichting, A. (2020) ‘Long-Time Behaviour and Phase Transitions for the Mckean–Vlasov Equation on the Torus’, Archive for Rational Mechanics and Analysis, 235(1), January, pp. 635-690. doi:10.1007/s00205-019-01430-4

Chattoe-Brown, Edmund (2014) ‘Using Agent Based Modelling to Integrate Data on Attitude Change’, Sociological Research Online, 19(1), February, article 16, <http://www.socresonline.org.uk/19/1/16.html>. doi:10.5153/sro.3315

Dong, Yucheng, Ding, Zhaogang, Martínez, Luis and Herrera, Francisco (2017) ‘Managing Consensus Based on Leadership in Opinion Dynamics’, Information Sciences, 397-398, August, pp. 187-205. doi:10.1016/j.ins.2017.02.052

Dong, Yucheng, Zhan, Min, Kou, Gang, Ding, Zhaogang and Liang, Haiming (2018) ‘A Survey on the Fusion Process in Opinion Dynamics’, Information Fusion, 43, September, pp. 57-65. doi:10.1016/j.inffus.2017.11.009

Hansen, Paula, Liu, Xin and Morrison, Gregory M. (2019) ‘Agent-Based Modelling and Socio-Technical Energy Transitions: A Systematic Literature Review’, Energy Research and Social Science, 49, March, pp. 41-52. doi:10.1016/j.erss.2018.10.021

Jia, Peng, MirTabatabaei, Anahita, Friedkin, Noah E. and Bullo, Francesco (2015) ‘Opinion Dynamics and the Evolution of Social Power in Influence Networks’, SIAM Review, 57(3), pp. 367-397. doi:10.1137/130913250

Malarz, Krzysztof, Gronek, Piotr and Kulakowski, Krzysztof (2011) ‘Zaller-Deffuant Model of Mass Opinion’, Journal of Artificial Societies and Social Simulation, 14(1), 2, <https://www.jasss.org/14/1/2.html>. doi:10.18564/jasss.1719

Motsch, Sebastien and Tadmor, Eitan (2014) ‘Heterophilious Dynamics Enhances Consensus’, SIAM Review, 56(4), pp. 577-621. doi:10.1137/120901866

Proskurnikov, Anton V., Matveev, Alexey S. and Cao, Ming (2016) ‘Opinion Dynamics in Social Networks With Hostile Camps: Consensus vs. Polarization’, IEEE Transactions on Automatic Control, 61(6), June, pp. 1524-1536. doi:10.1109/TAC.2015.2471655

Squazzoni, Flaminio and Casnici, Niccolò (2013) ‘Is Social Simulation a Social Science Outstation? A Bibliometric Analysis of the Impact of JASSS’, Journal of Artificial Societies and Social Simulation, 16(1), 10, <http://jasss.soc.surrey.ac.uk/16/1/10.html>. doi:10.18564/jasss.2192

Ureña, Raquel, Chiclana, Francisco, Melançon, Guy and Herrera-Viedma, Enrique (2019) ‘A Social Network Based Approach for Consensus Achievement in Multiperson Decision Making’, Information Fusion, 47, May, pp. 72-87. doi:10.1016/j.inffus.2018.07.006

Van Der Linden, Sander, Leiserowitz, Anthony, Rosenthal, Seth and Maibach, Edward (2017) ‘Inoculating the Public against Misinformation about Climate Change’, Global Challenges, 1(2), 27 February, article 1600008. doi:10.1002/gch2.201600008

Xiong, Fei, Wang, Ximeng, Pan, Shirui, Yang, Hong, Wang, Haishuai and Zhang, Chengqi (2020) ‘Social Recommendation With Evolutionary Opinion Dynamics’, IEEE Transactions on Systems, Man, and Cybernetics: Systems, 50(10), October, pp. 3804-3816. doi:10.1109/TSMC.2018.2854000

Zhang, Zhen, Gao, Yuan and Li, Zhuolin (2020) ‘Consensus Reaching for Social Network Group Decision Making by Considering Leadership and Bounded Confidence’, Knowledge-Based Systems, 204, 27 September, article 106240. doi:10.1016/j.knosys.2020.106240

Chattoe-Brown, E. (2021) Does It Take Two (And A Creaky Search Engine) To Make An Outstation? Hunting Highly Cited Opinion Dynamics Articles in the Journal of Artificial Societies and Social Simulation (JASSS). Review of Artificial Societies and Social Simulation, 19th August 2021. https://rofasss.org/2021/08/19/outstation/

Content

The Systematic Comparison of Agent-Based Policy Models – It’s time we got our act together!

May 11, 2021 thesubmissionauthor 6 Comments

By Mike Bithell and Bruce Edmonds

Model Intercomparison

The recent Covid crisis has led to a surge of new model development and a renewed interest in the use of models as policy tools. While this is in some senses welcome, the sudden appearance of many new models presents a problem in terms of their assessment, the appropriateness of their application and reconciling any differences in outcome. Even if they appear similar, their underlying assumptions may differ, their initial data might not be the same, policy options may be applied in different ways, stochastic effects explored to a varying extent, and model outputs presented in any number of different forms. As a result, it can be unclear what aspects of variations in output between models are results of mechanistic, parameter or data differences. Any comparison between models is made tricky by differences in experimental design and selection of output measures.

If we wish to do better, we suggest that a more formal approach to making comparisons between models would be helpful. However, it appears that this is not commonly undertaken most fields in a systematic and persistent way, except for the field of climate change, and closely related fields such as pollution transport or economic impact modelling (although efforts are underway to extend such systematic comparison to ecosystem models – Wei et al., 2014, Tittensor et al., 2018⁠). Examining the way in which this is done for climate models may therefore prove instructive.

Model Intercomparison Projects (MIP) in the Climate Community

Formal intercomparison of atmospheric models goes back at least to 1989 (Gates et al., 1999)⁠ with the first atmospheric model inter-comparison project (AMIP), initiated by the World Climate Research Programme. By 1999 this had contributions from all significant atmospheric modelling groups, providing standardised time-series of over 30 model variables for one particular historical decade of simulation, with a standard experimental setup. Comparisons of model mean values with available data helped to reveal overall model strengths and weaknesses: no single model was best at simulation of all aspects of the atmosphere, with accuracy varying greatly between simulations. The model outputs also formed a reference base for further inter-comparison experiments including targets for model improvement and reduction of systematic errors, as well as a starting point for improved experimental design, software and data management standards and protocols for communication and model intercomparison. This led to AMIPII and, subsequently, to a series of Climate model inter-comparison projects (CMIP) beginning with CMIP I in 1996. The latest iteration (CMIP 6) is a collection of 23 separate model intercomparison experiments covering atmosphere, ocean, land surface, geo-engineering, and the paleoclimate. This collection is aimed at the upcoming 2021 IPCC process (AR6). Participating projects go through an endorsement process for inclusion, (a process agreed with modelling groups), based on 10 criteria designed to ensure some degree of coherence between the various models – a further 18 MIPS are also listed as currently active (https://www.wcrp-climate.org/wgcm-cmip/wgcm-cmip6). Groups contribute to a central set of common experiments covering the period 1850 to the near-present. An overview of the whole process can be found in (Eyring et al., 2016).

The current structure includes a set of three overarching questions covering the dynamics of the earth system, model systematic biases and understanding possible future change under uncertainty. Individual MIPS may build on this to address one or more of a set of 7 “grand science challenges” associated with the climate. Modelling groups agree to provide outputs in a standard form, obtained from a specified set of experiments under the same design, and to provide standardised documentation to go with their models. Originally (up to CMIP 5), outputs were then added to a central public repository for further analysis, however the output grew so large under CMIP6 that now the data is held dispersed over repositories maintained by separate groups.

Other Examples

Two further more recent examples of collective model development may also be helpful to consider.

Firstly, an informal network collating models across more than 50 research groups has already been generated as a result of the COVID crisis – the Covid Forecast Hub (https://covid19forecasthub.org). This is run by a small number of research groups collaborating with the US Centre for Disease Control and is strongly focussed on the epidemiology. Participants are encouraged to submit weekly forecasts, and these are integrated into a data repository and can be vizualized on the website – viewers can look at forward projections, along with associated confidence intervals and model evaluation scores, including those for an ensemble of all models. The focus on forecasts in this case arises out of the strong policy drivers for the current crisis, but the main point is that it is possible to immediately view measures of model performance and to compare the different model types: one clear message that rapidly becomes apparent is that many of the forward projections have 95% (and at some times, even 50%) confidence intervals for incident deaths that more than span the full range of the past historic data. The benefit of comparing many different models in this case is apparent, as many of the historic single-model projections diverge strongly from the data (and the models most in error are not consistently the same ones over time), although the ensemble mean tends to be better.

As a second example, one could consider the Psychological Science Accelerator (PSA: Moshontz et al 2018, https://psysciacc.org/). This is a collaborative network set up with the aim of addressing the “replication crisis” in psychology: many previously published results in psychology have proved problematic to replicate as a result of small or non-representative sampling or use of experimental designs that do not generalize well or have not been used consistently either within or across studies. The PSA seeks to ensure accumulation of reliable and generalizable evidence in psychological science, based on principles of inclusion, decentralization, openness, transparency and rigour. The existence of this network has, for example, enabled the reinvestigation of previous experiments but with much larger and less nationally biased samples (e.g. Jones et al 2021).

The Benefits of the Intercomparison Exercises and Collaborative Model Building

More specifically, long-term intercomparison projects help to do the following.

Build on past effort. Rather than modellers re-inventing the wheel (or building a new framework) with each new model project, libraries of well-tested and documented models, with data archives, including code and experimental design, would allow researchers to more efficiently work on new problems, building on previous coding effort
Aid replication. Focussed long term intercomparison projects centred on model results with consistent standardised data formats would allow new versions of code to be quickly tested against historical archives to check whether expected results could be recovered and where differences might arise, particularly if different modelling languages were being used
Help to formalize. While informal code archives can help to illustrate the methods or theoretical foundations of a model, intercomparison projects help to understand which kinds of formal model might be good for particular applications, and which can be expected to produce helpful results for given desired output measures
Build credibility. A continuously updated set of model implementations and assessment of their areas of competence and lack thereof (as compared with available datasets) would help to demonstrate the usefulness (or otherwise) of ABM as a way to represent social systems
Influence Policy (where appropriate). Formal international policy organisations such as the IPCC or the more recently formed IPBES are effective partly through an underpinning of well tested and consistently updated models. As yet it is difficult to see whether such a body would be appropriate or effective for social systems, as we lack the background of demonstrable accumulated and well tested model results.

Lessons for ABM?

What might we be able to learn from the above, if we attempted to use a similar process to compare ABM policy models?

In the first place, the projects started small and grew over time: it would not be necessary, for example, to cover all possible ABM applications at the outset. On the other hand, the latest CMIP iterations include a wide range of different types of model covering many different aspects of the earth system, so that the breadth of possible model types need not be seen as a barrier.

Secondly, the climate inter-comparison project has been persistent for some 30 years – over this time many models have come and gone, but the history of inter-comparisons allows for an overview of how well these models have performed over time – data from the original AMIP I models is still available on request, supporting assessments concerning long-term model improvement.

Thirdly, although climate models are complex – implementing a variety of different mechanisms in different ways – they can still be compared by use of standardised outputs, and at least some (although not necessarily all) have been capable of direct comparison with empirical data.

Finally, an agreed experimental design and public archive for documentation and output that is stable over time is needed; this needs to be done via a collective agreement among the modelling groups involved so as to ensure a long-term buy-in from the community as a whole, so that there is a consistent basis for long-term model development, building on past experience.

The need for aligning or reproducing ABMs has long been recognised within the community (Axtell et al. 1996; Edmonds & Hales 2003), but on a one-one basis for verifying the specification of models against their implementation, although (Hales et al. 2003) discusses a range of possibilities. However, this is far from a situation where many different models of basically the same phenomena are systematically compared – this would be a larger scale collaboration lasting over a longer time span.

The community has already established a standardised form of documentation in the ODD protocol. Sharing of model code is also becoming routine, and can be easily achieved through COMSES, Github or similar. The sharing of data in a long-term archive may require more investigation. As a starting project COVID-19 provides an ideal opportunity for setting up such a model inter-comparison project – multiple groups already have running examples, and a shared set of outputs and experiments should be straightforward to agree on. This would potentially form a basis for forward looking experiments designed to assist with possible future pandemic problems, and a basis on which to build further features into the existing disease-focussed modelling, such as the effects of economic, social and psychological issues.

Additional Challenges for ABMs of Social Phenomena

Nobody supposes that modelling social phenomena is going to have the same set of challenges that climate change models face. Some of the differences include:

The availability of good data. Social science is bedevilled by a paucity of the right kind of data. Although an increasing amount of relevant data is being produced, there are commercial, ethical and data protection barriers to accessing it and the data rarely concerns the same set of actors or events.
The understanding of micro-level behaviour. Whilst the micro-level understanding of our atmosphere is very well established, those of the behaviour of the most important actors (humans) is not. However, it may be that better data might partially substitute for a generic behavioural model of decision-making.
Agreement upon the goals of modelling. Although there will always be considerable variation in terms of what is wanted from a model of any particular social phenomena, a common core of agreed objectives will help focus any comparison and give confidence via ensembles of projections. Although the MIPs and Covid Forecast Hub are focussed on prediction, it may be that empirical explanation may be more important in other areas.
The available resources. ABM projects tend to be add-ons to larger endeavours and based around short-term grant funding. The funding for big ABM projects is yet to be established, not having the equivalent of weather forecasting to piggy-back on.
Persistence of modelling teams/projects. ABM tends to be quite short-term with each project developing a new model for a new project. This has made it hard to keep good modelling teams together.
Deep uncertainty. Whilst the set of possible factors and processes involved in a climate change model are well established, which social mechanisms need to be involved in any model of any particular social phenomena is unknown. For this reason, there is deep disagreement about the assumptions to be made in such models, as well as sharp divergence in outcome due to changes brought about by a particular mechanism but not included in a model. Whilst uncertainty in known mechanisms can be quantified, assessing the impact of those due to such deep uncertainty is much harder.
The sensitivity of the political context. Even in the case of Climate Change, where the assumptions made are relatively well understood and done on objective bases, the modelling exercise and its outcomes can be politically contested. In other areas, where the representation of people’s behaviour might be key to model outcomes, this will need even more care (Adoha & Edmonds 2017).

However, some of these problems were solved in the case of Climate Change as a result of the CMIP exercises and the reports they ultimately resulted in. Over time the development of the models also allowed for a broadening and updating of modelling goals, starting from a relatively narrow initial set of experiments. Ensuring the persistence of individual modelling teams is easier in the context of an internationally recognised comparison project, because resources may be easier to obtain, and there is a consistent central focus. The modelling projects became longer-term as individual researchers could establish a career doing just climate change modelling and importance of the work increasingly recognised. An ABM modelling comparison project might help solve some of these problems as the importance of its work is established.

Towards an Initial Proposal

The topic chosen for this project should be something where there: (a) is enough public interest to justify the effort, (b) there are a number of models with a similar purpose in mind being developed. At the current stage, this suggests dynamic models of COVID spread, but there are other possibilities, including: transport models (where people go and who they meet) or criminological models (where and when crimes happen).

Whichever ensemble of models is focussed upon, these models should be compared on a core of standard, with the same:

Start and end dates (but not necessarily the same temporal granularity)
Covering the same set of regions or cases
Using the same population data (though possibly enhanced with extra data and maybe scaled population sizes)
With the same initial conditions in terms of the population
Outputting a core of agreed measures (but maybe others as well)
Checked against their agreement against a core set of cases, with agreed data sets
Reported on in a standard format (though with a discussion section for further/other observations)
well documented and with code that is open access
Run a minimum of times with different random seeds

Any modeller/team that had a suitable model and was willing to adhere to the rules would be welcome to participate (commercial, government or academic) and these teams would collectively decide the rules, development and write any reports on the comparisons. Other interested stakeholder groups could be involved including professional/academic associations, NGOs and government departments but in a consultative role providing wider critique – it is important that the terms and reports from the exercise be independent or any particular interest or authority.

Conclusion

We call upon those who think ABMs have the potential to usefully inform policy decisions to work together, in order that the transparency and rigour of our modelling matches our ambition. Whilst model comparison exercises of the kind described are important for any simulation work, particular care needs to be taken when the outcomes can affect people’s lives.

References

Aodha, L. & Edmonds, B. (2017) Some pitfalls to beware when applying models to issues of policy relevance. In Edmonds, B. & Meyer, R. (eds.) Simulating Social Complexity – a handbook, 2nd edition. Springer, 801-822. (A version is at http://cfpm.org/discussionpapers/236)

Axtell, R., Axelrod, R., Epstein, J. M., & Cohen, M. D. (1996). Aligning simulation models: A case study and results. Computational & Mathematical Organization Theory, 1(2), 123-141. https://link.springer.com/article/10.1007%2FBF01299065

Eyring, V., Bony, S., Meehl, G. A., Senior, C. A., Stevens, B., Stouffer, R. J., & Taylor, K. E. (2016). Overview of the Coupled Model Intercomparison Project Phase 6 (CMIP6) experimental design and organization. Geoscientific Model Development, 9(5), 1937–1958. https://doi.org/10.5194/gmd-9-1937-2016

Gates, W. L., Boyle, J. S., Covey, C., Dease, C. G., Doutriaux, C. M., Drach, R. S., Fiorino, M., Gleckler, P. J., Hnilo, J. J., Marlais, S. M., Phillips, T. J., Potter, G. L., Santer, B. D., Sperber, K. R., Taylor, K. E., & Williams, D. N. (1999). An Overview of the Results of the Atmospheric Model Intercomparison Project (AMIP I). In Bulletin of the American Meteorological Society (Vol. 80, Issue 1, pp. 29–55). American Meteorological Society. https://doi.org/10.1175/1520-0477(1999)080<0029:AOOTRO>2.0.CO;2

Hales, D., Rouchier, J., & Edmonds, B. (2003). Model-to-model analysis. Journal of Artificial Societies and Social Simulation, 6(4), 5. http://jasss.soc.surrey.ac.uk/6/4/5.html

Jones, B.C., DeBruine, L.M., Flake, J.K. et al. To which world regions does the valence–dominance model of social perception apply?. Nat Hum Behav 5, 159–169 (2021). https://doi.org/10.1038/s41562-020-01007-2

Moshontz, H. + 85 others (2018) The Psychological Science Accelerator: Advancing Psychology Through a Distributed Collaborative Network , 1(4) 501-515. https://doi.org/10.1177/2515245918797607

Tittensor, D. P., Eddy, T. D., Lotze, H. K., Galbraith, E. D., Cheung, W., Barange, M., Blanchard, J. L., Bopp, L., Bryndum-Buchholz, A., Büchner, M., Bulman, C., Carozza, D. A., Christensen, V., Coll, M., Dunne, J. P., Fernandes, J. A., Fulton, E. A., Hobday, A. J., Huber, V., … Walker, N. D. (2018). A protocol for the intercomparison of marine fishery and ecosystem models: Fish-MIP v1.0. Geoscientific Model Development, 11(4), 1421–1442. https://doi.org/10.5194/gmd-11-1421-2018

Wei, Y., Liu, S., Huntzinger, D. N., Michalak, A. M., Viovy, N., Post, W. M., Schwalm, C. R., Schaefer, K., Jacobson, A. R., Lu, C., Tian, H., Ricciuto, D. M., Cook, R. B., Mao, J., & Shi, X. (2014). The north american carbon program multi-scale synthesis and terrestrial model intercomparison project – Part 2: Environmental driver data. Geoscientific Model Development, 7(6), 2875–2893. https://doi.org/10.5194/gmd-7-2875-2014

Bithell, M. and Edmonds, B. (2020) The Systematic Comparison of Agent-Based Policy Models - It’s time we got our act together!. Review of Artificial Societies and Social Simulation, 11th May 2021. https://rofasss.org/2021/05/11/SystComp/

Content

How To Make Your Code “Immortal”: NetLogo Edition

May 6, 2021 thesubmissionauthor Leave a comment

By Edmund Chattoe-Brown, Alvaro Gil and The STREAMS Group¹

¹ Professor Graeme Ackland (School of Physics and Astronomy, University of Edinburgh), Professor Fiifi Amoako Johnson (Department of Population and Health, University of Cape Coast), Dr Daniel Amoako-Sakyi (Department of Medical Biochemistry, University of Cape Coast), Dr Rebecca S. Balira (National Institute for Medical Research, Mwanza, Tanzania), Dr Edmund Chattoe-Brown (School of Media, Communication and Sociology, University of Leicester), Professor Elizabeth David-Barrett (School of Law, Politics and Sociology, University of Sussex), Dr Martins Ekor (Department of Pharmacology, University of Cape Coast), Dr Heather Hamill (Department of Sociology, University of Oxford), PI: Professor Kate Hampshire (Department of Anthropology, University of Durham), Dr Gerry Mshana (National Institute for Medical Research, Mwanza, Tanzania), Dr Simon Mariwah (Department of Geography and Regional Planning, University of Cape Coast), Dr Adams Osman (University of Education Winneba), Dr Samuel Asiedu Owusu (Directorate of Research, Innovation and Consultancy, University of Cape Coast).

The Issue

When should Agent-Base Modellers write their own code and when should they reuse or extend code that already exists? From a purely individual point of view, they should write their own code when, all things considered, it is quicker than reusing or extending existing code. This may happen because they are very fast (though not necessarily very accurate) coders, when existing code is impenetrable (either ineffectively documented or not documented at all) or when it is badly designed (so it cannot really be extended) and/or a long way from the use to which the researcher wants to put it.

The complication arises because (like many phenomena) code reuse has social as well as individual benefits. Reused code has been checked by at least one other person. This may disclose such things as bugs and potential efficiency improvements but also establishes whether the code and documentation is really accessible for general use or whether the original designers merely believe it is. (Coders always know more about their own code than they realise and therefore don’t always understand how to explain everything that is actually relevant.) Ultimately, it probably makes more sense if there is one well checked and somewhat general set of code for a particular area of modelling than a whole bunch of overlapping code of variable quality (which is what we tend to observe). But for the “supply” of code to travel in that direction, at least some individuals have to elect for code reuse. This note reports an experience of trying to do that and draws provisional conclusions from the case study (which can be supported or refuted by analysis of further cases in subsequent work).

The Example

Most of the authors (The STREAMS Group) are involved in a research project about supply chains and substandard/falsified medicines in Africa (see, for example, Ackland et al. 2019 and Hamill et al. 2019) where most of the available funding had to be allocated to fieldwork. This meant that as a modeller with a very small time involvement (but nonetheless a requirement to deliver research outputs) it seemed to the first author (Chattoe-Brown) that he would have to reuse an existing supply chain ABM (in NetLogo because that is the main language he uses) rather than build one from scratch. After looking in the NetLogo Modeling Commons (a resource for model access that can be found at <http://modelingcommons.org/>) he chose Gil (2012) as potentially suited to his purpose (and not having any feedback on its download page to suggest that there were any serious problems with it.) Despite being written in NetLogo 5, it worked straightaway in NetLogo 6.1.1 (although a “step by step” version also offered by Gil and documented in French would not – though an additional advantage of NetLogo is that older versions remain available and still run so use could have made use of that code if necessary.)

Rather to Chattoe-Brown’s surprise, however (given the known challenges of using undocumented code and aims of collective resources generally), the code was barely documented at all and certainly not in a way that would allow a reader to establish how it worked overall. The information provided in the Info Tab also did not fulfil that function (though it did provide a broad overview of the model’s aims).

The solution was simply to work through the code, annotating it line by line, both with regard to its procedures and “logic”. (This version of the code is available from Chattoe-Brown on request.) By the end of this process Chattoe-Brown was pretty confident that the programme did what it claimed and that he understood about 95% of it (with the remaining parts being equations with opaque variable names whose specifics did not seem crucial to the overall logic of the code.) This level of confidence certainly makes him think that he can extend the model in ways that are needed for the African project (which was the original aim). For the first analysis he “parked” various add on features (like being able to observe a specific organisation and organise “promotions” of the product) and just concentrated on the core code.

Below, therefore, are conclusions about how code needs to be documented based not on abstract principles but on the challenge of understanding a real piece of code intended for actual reuse.

Think about variable naming: This is a trivial point (though its implications are surprisingly non trivial to making sense of code) but a reader will probably not be able to interpret a variable called “K” even from its use in equations (particularly if these are not themselves explained). Don’t assume that the reader will know the standard abbreviations for particular variables in your field (or the standard equations linking them). Instead use variables that attempt to self-describe like fixed_cost_of_order.
Make sure any version of the code you make available is actually “finished”. Don’t declare variables for functions you no longer call (or even define). Don’t include commented out “scaffolding” for things you were either going to do or didn’t end up needing. (The person intending to reuse the code still has to spend time figuring these out only to discover they “lead nowhere”.)
Establish (or follow) conventions for naming in your code. For example, give sliders variable names with words connected by underscores (like population_size) to distinguish them from “in code” variables or turtle attributes, perhaps linked by dashes (total-sales). Consider having prefixes to further distinguish attributes and variables: For example t-age is the age of a turtle while v-age is a variable holding some kind of age data. For obvious reasons (debugging as well as reading) don’t use total_sales and total-sales in the same piece of code!
Be aware that “material” relating to model inputs (like slider values) and outputs (like the use of global variables in plotting) is much less directly accessible to the reader than material in the code itself (which can often partly be interpreted by context). Document your input and output materials specifically in the code (or, if this is more easily understood), in the Info Tab.
Make it clear (by self-describing naming or naming conventions) which are “local” or “throwaway” variables or constructs (like loop counters or lists in the process of construction) and which are meaningful throughout the code. Establish standard names for common local variable functions like loop-counter or variable-holder to make it clearer that they are “disposable”. Local variables for list construction should be used with caution as they miss out on the support to interpretation that comes from variables that name themselves.
If your code involves undifferentiated data structures (like lists of lists) the reader will find it very hard to infer which list elements represent what. It is much easier simply to explain in the comments that an “order” is defined as a two part list with the first element being a quantity ordered and the second a “tick for delivery due”. This is particularly important when using commands like map which, while very powerful as operators on lists, are extremely hard to parse unless you already know what the elements of a list are intended to represent.
If code is to be made available undocumented, it is particularly important that it has “good logic”. For example, an operation (like updating a plot) should not be incorporated into a procedure serving a totally different function (like evaluating potential suppliers) just because that is convenient (because the reader may become very frustrated trying to read programme logic into that decision when it is not actually there). Procedures should not be “aggregated” with names that only make sense to the programmer (like main-sequence) but should also be self-naming (all-attribute-updating-for-wholesalers). It is particularly important that key design issues do not “get lost” in the allocation of operations across procedures. For example, the main conceptual challenge Chattoe-Brown had in understanding the Gil model was that although there were plainly “such things” as orders (a two element list of a quantity and delivery date as above) and there were procedures that could be identified as transferring products from one organisation to another and “soliciting” products, the actual procedure that “created” the orders as objects was a sub procedure of a procedure with an uninformative name. Furthermore, because orders were not indexed by the organisation placing them, one had to infer the implication of order lists being sorted in a certain way (that you could tell which order “belonged” to which wholesaler, for example, by its position in a list which corresponded to another list with the ordered identities of the wholesalers). A third example is the difficulty of interpreting procedures which take arguments. These should only be used when the argument is really something that modifies or qualifies an otherwise similar process in an intuitive way and not as a way of “code sharing” to do qualitatively very different things within the same procedure depending on the argument used.
If the code relies on “external” materials (like libraries or data files) make sure these are briefly described and, in particular, say where they can be accessed, in case they become separated from the code itself. Otherwise, unless they are well known resources that can be tracked down independently, the code may become inoperable without them. (This seems to have happened in the Gil code with some, fortunately minor, dedicated “shapes” to display the various organisations in the supply chain.)
Try to document “design decisions” in the Info Tab if not in the code itself. For example, the Gil model relies on building lists for nearly every event in the model (lists of successful sales, lists of lost sales and so on). These lists extend as time passes to serve as a complete record of a particular kind of “event” (like when a particular distributor is unable to fulfil its order and how big each of these lost orders were.) There is thus a technical concern that such ever growing lists may raise efficiency issues if the simulation runs for very long period and, in fact, there is a procedure that stops some lists (but perhaps surprisingly not all of them) from growing too long. In addition it is clear that the functioning of the code relies on all these various lists remaining the same length (and thus having to be “padded” with zeroes when nothing relevant happens) but the reader cannot tell whether the aim is that these very long lists will ever be used (in data analysis for example) and whether they are “padded” as part of their role as material for analysis (or just because it is just less programming effort than doing lots of list length checking.) The motivation of decisions like this (and their implications for improving or extending the models) are the hardest things to infer from the code itself. Another example is the code needed to “hook together” clients and suppliers in both directions (so if I choose you as my supplier, you must then record me as your client.) In retrospect it is obvious what the “aim” of this part of the code is but the practical details of programming alone leave it rather hard to work out.
If you haven’t “finished” your code (or in Chattoe-Brown’s case the comments on it), make clear how/where it is unfinished so later readers (or you yourself) can add to it. For example, I (Chattoe-Brown) use the string “xx” (which almost never occurs in natural text) to mark places where, for example, I am still unsure how a variable or procedure works. Only when I have dealt with all such instances is an article ready to submit or the code ready to circulate. This can be combined with the use of conventions, for example to mark whether or not you have yet debugged a particular procedure in your code. As with documentation generally, the aim is that mistakes should not be made because you go away from your code and come back to it later. The code should always “explain its own current state” (however incomplete or inelegant that may be).
Think about the best “supporting material” suitable to documenting specific code. For example, in the Gil code, some variables and procedures are unique to specific organisations or positions in the supply chain while others apply to several kinds of organisation. A table would be a very effective way of showing the overall pattern and rationale here. For example, do these kinds of organisation all do this particular procedure because they are “internal” to the supply chain i. e. having both upstream and downstream partners? Does this organisation not have this procedure because customer purchases are modelled differently from the rest of the purchase decisions in the model? Again, it is particularly hard to read out this kind of “model motivation” from the workings of the code alone.
“Hacky” solutions (things that may be clever in programming terms) are generally harder to interpret. For example, to give a variable a value, the Gil code “strips off” a number that is part of the name of the strategy and uses that. The syntax of string processing is not necessarily what a reader expects to engage with in what is mainly a list based model. (Generally, the more different approaches a single piece of code takes – for example list processing and string processing – the fewer users will be skilled enough to read it overall and the harder it will be to interpret. Once one has explained the logic of, say, a list based approach, it may be better to stick to it as far as possible.)
Consider having a “variable table” as part of your documentation. It is particularly fiddly to sort out which variables come from sliders (as discussed above) and which may be defined at arbitrary points in the code as a whole. It is also useful to have a sense of the overall “call structure” of code (so this procedure is strictly used as a subroutine of another while this one is called at many different points, perhaps with different arguments too.) The same applies to variables (which may simply pull a slider value into the code “once only” or be intertwined with almost every procedure.)
Be aware of the “passing implications” of concise code (and record them even if you do not amend them). For example, in the Gil code, demand is subtracted from stock if stock is greater than demand but if stock is less than demand, stock is set to zero. This means that some demand “evaporates”. This not only has implications for the general applicability of the model (the good cannot be an essential one for example) but may also affect the simulation outcome in unexpected ways. (It is one thing to discard one unit of demand because you want 9 and the stock is 8 but this implementation will equally discard 15 units if you want 16 and the stock is just 1. “Demand evaporation” on this scale may prove to have significant implications for the overall dynamics of the model This seems unlikely to be intended.)

The Conclusions

These conclusions operate at a number of levels from the practical to the institutional:

One can get a long way with print statements. Chattoe-Brown often found it feasible to test “hypotheses” about (unexplained) data structures by printing these out along with associated variables (so the third element of this data structure must be “ticks” because it always matches when that variable is printed out as well.) By contrast, it is still extremely hard to infer the “motivation” of an equation even if you print out material that describes how it behaves.
Even though it involves significant effort (which is nonetheless quite educational) it is probably still quicker (unless you are a real programming ace) to document and develop a solid piece of code built by someone else than to build your own from scratch.
The caveat emptor approach to code availability needs to be considered carefully in terms of wider community benefits. The danger is that the lower the cost of making things available, the lower the expected value of doing so. Chattoe-Brown was lucky in finding a “solid” bit of code in the NetLogo Modelling Commons but it might be (had the code reuse exercise not been conducted partly out of intellectual curiosity) that he would have done better to look at the model library of the COMSES network (which enforces minimum standards of documentation for example).
We need to consider the institutional structures that may support the division of labour in code reuse to improve its efficiency (more improvement and consolidation of existing code and less proliferation of ad hoc overlapping code) and thus the “efficiency” of the modelling community as a whole. (There is actual little virtue in everyone building their own models from scratch unless it is part of their training for example.) If Chattoe-Brown et al. document the Gil model so it qualifies for COMSES (where it will probably be more used as it deserves) does that remain Gil’s contribution, does it now “belong” to Chattoe-Brown et al. or is it some sort of joint venture? (Chattoe-Brown has assumed the latter. This explains the complicated authorship of the paper and the somewhat awkward references to separate authors in the text.) When academics are busy, the right incentives to engage in certain activities are a key aspect in making them happen. Another example is the difference between measuring the quality of code by simple “downloads” and in terms of people actually running it (since maybe the download wouldn’t even run) or actively endorsing it (since even if it ran it might be no good for further use on closer examination). More generally this suggests that we might need to support more nuanced “versioning” of shared code and consider commentary (as well as the code itself) as something that can usefully be shared and communally improved. (For example, a better coder than Chattoe-Brown could perhaps add to his documentation and fill the remaining gaps when they wouldn’t be motivated to document the whole programme from scratch. There may be a “brain drain” from communal code development because the ablest coders are most likely to write their own code and therefore have the least individual incentive to contribute to collective code development activities.)
There is generally not that much discussion of the ‘nitty gritty’ of code documentation, coding “styles” and so on in order that common practices can be shared and improved. (Arguably the account given here is not “research” in the sense intended for peer reviewed publication and this may explain its relative lack of visibility.) This is particularly the case of work inductively based on practice (and thus guaranteed to be relevant) rather than on “principle” (which may be impractical or inapplicable). We would be interested to hear if others use these “tricks” or whether they have other or better ones in the same vein. In the spirit of collective action, it is clear that others could perform the kind of analysis presented here and thus support, refute or improve on our tentative conclusions.
This article is clearly geared up to the specifics of NetLogo. It would be interesting to know whether the same (or different) specific problems arise in reusing code written in SWARM, RePast, MASON or whatever. RofASSS might be a good outlet for such related contributions (following its brief to provide, in permanent citable form, useful material that does not have an obvious existing outlet). It would also be interesting if others attempting code reuse endorsed or qualified our analysis. (Just as programmers who know their own code very well don’t always know what to explain, perhaps Chattoe-Brown’s view of what is difficult in this code may just reflect his own limitations. On the other hand, the less experience/expertise it takes to interpret and reuse code, perhaps the better that is communally.)
In the spirit of collaborative endeavour, we would be very grateful if anyone could further illuminate any areas of the more documented version of the Gil code that we still cannot follow!
The possibility of code reuse is a real issue in terms of efficiency and reliability and not simply an “academic” matter. By my good fortune, Gil was still at the same email address and responded very promptly and helpfully to an inquiry about some aspects of the code but freely admitted, after all these years, that he could no longer follow it either! (In another piece of research Chattoe-Brown was working on, a replication came to a dead stop because the modeller on a project had left academia and was therefore now too busy with other things to answer “academic” questions and none of the other authors really “owned” the code on which the article was based. In this situation, the supposed conclusions of the article stand but the model on which they are based fades into unverifiability.) It is not simply a cliché that undocumented code can easily become unusable. This suggests practical steps (like deleting anything from code archives of more than a certain age unless some “responsible” can still be contacted to affirm that it should remain.)
There may be useful work to be done in assessing the extent to which, in practice, published models tend to overlap in their functionality. Is it in fact the case that most supply chain simulations do basically the same things (but based on separate ad hoc code rather than the communal creation of a robust platform). A further issue here is that a whole set of independent models which are basically similar may create a kind of “groupthink” about what modelling in certain areas should involve. By contrast, are there certain fields where such collective work would not be feasible or useful (and why?) How do we promote communal models as an “output” where these turn out to be suitable? (This is part of a more general problem: Can we rely on individualistic academics with a competitive structure imposed on them to “communalise” their activities at appropriate points in time to avoid negative social outcomes – including waste – or do we have to find ways to somehow “encourage” this process by institutional development/design?)

Acknowledgements

ECB’s contribution to this article was as part of “Towards Realistic Computational Models of Social Influence Dynamics” a project funded through ESRC (ES/S015159/1) by ORA Round 5.

References

Ackland , Graeme J., Chattoe-Brown, Edmund, Hamill, Heather, Hampshire, Kate R., Mariwah, Simon and Mshana, Gerry (2019) ‘Role of Trust in a Self-Organizing Pharmaceutical Supply Chain Model with Variable Good Quality and Imperfect Information’, Journal of Artificial Societies and Social Simulation, 22(2), March, article 5, <http://jasss.soc.surrey.ac.uk/22/2/5.html>. doi:10.18564/jasss.3984

Gil, Alvaro (2012) ‘Artificial Supply Chain’, <http://modelingcommons.org/browse/one_model/3378#model_tabs_browse_info>, École Polytechnique de Montréal.

Hamill, Heather, Hampshire, Kate, Mariwah, Simon, Amoako-Sakyi, Daniel, Kyei, Abigail and Castelli, Michele (2019) ‘Managing Uncertainty in Medicine Quality in Ghana: The Cognitive and Affective Basis of Trust in a High-Risk, Low-Regulation Context’, Social Science and Medicine, 234, August, article 112369. doi:10.1016/j.socscimed.2019.112369

Wilensky, Uri (1999) ‘NetLogo’, <http://ccl.northwestern.edu/netlogo/>. Center for Connected Learning and Computer-Based Modeling, Northwestern University, Evanston, IL.

Chattoe-Brown, E., Gil, A. and The STREAMS Group (2021) How To Make Your Code “Immortal”: NetLogo Edition. Review of Artificial Societies and Social Simulation, 6th May 2021. https://rofasss.org/2021/05/06/how-to-make-your-code-immortal-netlogo-edition/

By Corinna Elsenbroich & Petra Ahrweiler

References

By Bruce Edmonds

Motivation

Partially Understood Models

Layering Models to Leverage some Understanding

Concluding Discussion

Notes

Acknowledgements

References

References

By Marijn A. Keijzer

Data, code and supplementary analyses

Notes

Acknowledgements

References

By Edmund Chattoe-Brown

Acknowledgements

Notes

References

By Edmund Chattoe-Brown

References

By Frank Dignum

References

By Edmund Chattoe-Brown

Acknowledgements

References

By Mike Bithell and Bruce Edmonds

Model Intercomparison

Model Intercomparison Projects (MIP) in the Climate Community

Other Examples

The Benefits of the Intercomparison Exercises and Collaborative Model Building

Lessons for ABM?

Additional Challenges for ABMs of Social Phenomena

Towards an Initial Proposal

Conclusion

References

By Edmund Chattoe-Brown, Alvaro Gil and The STREAMS Group1

The Issue

The Example

The Conclusions

Acknowledgements

References

For discussion about social simulation research

By Edmund Chattoe-Brown, Alvaro Gil and The STREAMS Group¹