
The Challenge of Validation

By Martin Neumann

Introduction

In November 2021, Chattoe-Brown initiated a discussion on the SimSoc list about validation, which generated considerable traffic. The interest in this topic revealed that empirical validation remains a notorious challenge for agent-based modelling. The discussion raised many important points and questions, and even motivated a “smalltalk about big things” at the Social Simulation Fest 2022. Many contributors highlighted that validation cannot be reduced to a comparison of numbers between simulated and empirical data. Without attempting a comprehensive review of this insightful discussion, one point that was emphasized is that different kinds of science call for different kinds of quality criteria. Prediction may be one criterion that is particularly important in statistics, but it is not sufficient for agent-based social simulation. For instance, agent-based modelling is specifically suited to studying complex systems and turbulent phenomena. Modelling also enables the study of alternative and counterfactual scenarios, which deviates from the paradigm of prediction as a quality criterion. Besides output validation, other quality criteria for agent-based models include, for instance, input validation and process validation, which reflect the realism of the initialization and of the mechanisms implemented in the model.

Qualitative validation procedures

This brief introduction is by no means an exhaustive summary of the broad discussion on validation. Even the measurement of empirical data can be called into question. Less discussed, however, has been the role that qualitative methods could play in this endeavor. In fact, there has been a long debate on this issue in the community of qualitative social research as well. Like agent-based social simulation, qualitative methods are challenged by the notion of validation. It has been noted that the very vocabulary used in attempts to ensure scientific rigor has a background in a positivist understanding of science, whereas qualitative researchers often take up constructivist or poststructuralist positions (Cho and Trent 2006). For this reason, qualitative research sometimes prefers the notion of trustworthiness (Lincoln and Guba 1985) to that of validation. In an influential article (cited more than 17,000 times on Google Scholar as of May 2023), Creswell and Miller (2000) distinguish between a postpositivist, a constructivist, and a critical paradigm, as well as between the lens of the researcher, the lens of the study participants, and the lens of external people, and assign different validity procedures for qualitative research to the combinations of these paradigms and lenses.

| Paradigm / lens | Postpositivist | Constructivist | Critical |
| --- | --- | --- | --- |
| Lens of the researcher | Triangulation | Disconfirming evidence | Reflexivity |
| Lens of study participants | Member checking | Engagement in the field | Collaboration |
| Lens of external people | Audit trail | Thick description | Peer debriefing |

Table 1. Validity procedures according to Creswell and Miller (2000).

While it remains contested whether the validation procedure depends on the research design, this is at least one source of the diversity of accounts. Others differentiate between transactional and transformational validity (Cho and Trent 2006). The former concentrates on formal techniques in the research process for avoiding misunderstandings; such procedures include, for instance, member checking. The latter account perceives research as an emancipatory process on behalf of the research subjects. This goes along with questioning the notion of absolute truth in the human sciences, which calls for alternative sources of scientific legitimacy, such as the emancipation of the researched subjects. This concept of emancipatory research resonates with participatory modelling approaches. In fact, some of these procedures are well known in participatory modelling, even though the terminology differs. The participatory approach originates from research on resource management (Pahl-Wostl 2002). For this purpose, integrated assessment models have been developed, inspired by the concept of post-normal science (Funtowicz and Ravetz 1993). Post-normal science emphasizes the communication of uncertainty, the justification of practice, and complexity. This approach recognizes the legitimacy of multiple perspectives on an issue, both across scientific disciplines and among the laypeople involved. For instance, Wynne (1992) analyzed the knowledge claims of sheep farmers in their interaction with scientists and authorities. In such an extended peer community of a citizen science (Stilgoe 2009), laypeople from the affected communities play an active role in knowledge production, not only because of moral principles of fairness but also to increase the quality of science (Fjelland 2016). One of the best-known participatory approaches is so-called companion modelling (ComMod), developed at CIRAD, a French agricultural research center for international development. The term companion modelling was originally coined by Barreteau et al. (2003) and has been further developed into a research paradigm for decision making in complex situations to support sustainable development (Étienne 2014). These approaches have a strong emancipatory component and rely on collaboration and member checking to ensure the resonance and practicality of the models (Tesfatsion 2021).

An interpretive validation procedure

While the participatory approaches show a convergence of methods between modelling and qualitative research, even though they differ in terminology, in the following a further approach for examining the trustworthiness of simulation scenarios is introduced that has not been considered so far: interpretive methods from qualitative research. A strong feature of agent-based modelling is that it allows the study of “what-if” questions. The ex-ante investigation of possible alternative futures enables the identification of options for action, but also the detection of early warning signals of undesired developments. For this purpose, counterfactual scenarios are an important feature of agent-based modelling. It is important to note in this context that counterfactuals, by definition, do not match empirical data. In the following, it is suggested to examine the trustworthiness of counterfactual scenarios using a method from objective hermeneutics (Oevermann 2002), the so-called sequence analysis (Kurt and Herbrik 2014). In the terms of Creswell and Miller (2000), this examination of trustworthiness proceeds from the lens of the researcher within a constructivist paradigm. For this purpose, simulation results first have to be transformed into narrative scenarios, a method described in Lotzmann and Neumann (2017).
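To give a rough flavour of this transformation, a simulated event trace might be rendered as narrative text along the following lines. This is only a minimal Python sketch: the event types, templates, and trace contents are invented for illustration and are not the actual procedure of Lotzmann and Neumann (2017).

```python
# Minimal sketch of turning a simulated event trace into a narrative
# scenario. Event types, templates, and trace contents are hypothetical;
# the actual method is described in Lotzmann and Neumann (2017).

trace = [
    ("Achim", "attends_meeting", "lawyer's office"),
    ("the group", "assesses_investment", "the investment"),
    ("Achim", "complies", "the request"),
    ("the group", "restores_trust", ""),
]

templates = {
    "attends_meeting": "{actor} attended a meeting at the {context}",
    "assesses_investment": "{actor} assessed the value of {context}",
    "complies": "{actor} complied with {context}",
    "restores_trust": "trust within {actor} was restored",
}

def narrate(trace):
    """Render an ordered event trace as one narrative scenario string."""
    sentences = [templates[action].format(actor=actor, context=context)
                 for actor, action, context in trace]
    return " ".join(s[0].upper() + s[1:] + "." for s in sentences)

print(narrate(trace))
```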

In the social sciences, sequence analysis is regarded as the central instrument of hermeneutic interpretation of meaning. It is “a method of interpretation that attempts to reconstruct the meaning of any kind of human action sequence by sequence, i.e. sense unit by sense unit […]. Sequence analysis is guided by the assumption that in the succession of actions […] contexts of meaning are realized …” (Kurt and Herbrik 2014: 281). A first important rule is the sequential procedure. The interpretation takes place in the sequence that the protocol to be analyzed itself specifies. It is assumed that each sequence point closes possibilities on the one hand and opens new possibilities on the other hand. This is done practically by sketching a series of stories in which the respective sequence passage would make sense. The basic question that can be asked of each sequence passage can be summarized as, “Consider who might have addressed this utterance to whom, under what conditions, with what justification, and what purpose?” (Schneider 1995: 140). The answers to these questions are the thought-experimentally designed stories. These stories are examined for commonalities and differences and condensed into readings. Through the generation of readings, certain possibilities of connection to the interpreted sequence passage become visible at the same time. In this sense, each step of interpretation makes sequentially spaces of possibility visible and at the same time closes other spaces of possibility.
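Sequence analysis is an interpretive procedure and is not reducible to computation, but the bookkeeping of opening and closing spaces of possibility can be pictured schematically. In the following sketch all sequence units, readings, and possibilities are invented for illustration; in practice they are produced by interpreters, not by code.

```python
# Schematic bookkeeping for a sequence analysis. The method itself is
# interpretive; the units, readings, and possibilities below are invented.

sequence_units = [
    "They had a meeting at their lawyer's office",
    "Achim complied with the request",
    "Thus, trust was restored",
]

# Hypothetical readings: which continuations each unit rules out and
# which it newly opens (assigned by hand, as an interpreter would).
closes = {
    sequence_units[0]: {"spontaneous street confrontation"},
    sequence_units[1]: {"open defiance"},
    sequence_units[2]: {"violent escalation"},
}
opens = {
    sequence_units[0]: {"formal settlement", "staged ambush"},
    sequence_units[1]: {"restoration of trust"},
    sequence_units[2]: {"continued cooperation"},
}

possibilities = {"violent escalation", "open defiance",
                 "spontaneous street confrontation"}

for unit in sequence_units:
    # Each interpreted unit closes some possibilities and opens others.
    possibilities = (possibilities - closes[unit]) | opens[unit]
    print(f"after '{unit}': {sorted(possibilities)}")
```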

In the following, it is argued that this method enables an examination of the trustworthiness of counterfactual scenarios, using the example of a counterfactual simulation scenario of successful non-violent conflict regulation within a criminal group: ‘They had a meeting at their lawyer’s office to assess the value of his investment, and Achim complied with the request. Thus, trust was restored, and the group continued their criminal activities’ (names are fictitious). Following Dickel and Neumann (2021), it is argued that this is a meaningful story. It is an example of how the linking of algorithmic rules generates something new from the individual parts of the empirical material. It also shows how the individual pieces of the puzzle of the empirical data are put together into a collage that tells a story that makes sense: a sequence is produced that can be interpreted in a meaningful way. It should be noted, however, that this is a counterfactual sequence. In fact, a significantly different sequence is found in the empirical data: ‘Achim was ordered to his lawyer’s office. Instead of his lawyer, however, Toby and three thugs were waiting for him. They forced him to his knees and pointed a machine gun at his stomach.’ This was by no means a non-violent form of conflict regulation. And after Achim (in the real case) was forced to his knees by three thugs and threatened with a machine gun, the way to non-violent conflict regulation was hardly open any more. The sequence generated by the simulation, on the other hand, shows a way in which the violence could have been avoided – a way that was not taken in reality. Is this a programming error in the model? On the contrary, it is argued that it demonstrates the trustworthiness of the counterfactual scenario. From a methodological point of view, a comparison of the factual with the counterfactual is instructive: factually, Achim had a machine gun pointed at his stomach; counterfactually, Achim agreed on a settlement. From a sequence-analytic perspective, the latter is a logical conclusion to a story, even if it does not correspond to the factual course of events. The sequence analysis thus shows that the simulation has decided between two possibilities – a path branching in which certain possibilities open and others close.

The trustworthiness of a counterfactual narrative is shown by whether (1) a meaningful case structure can be generated at all, or whether the narrative reveals itself as an absurd series of sequence passages from which no rules of action can be reconstructed; and (2) whether the case structure withstands confrontation with the ‘external context’ and can be interpreted as a plausible structural variation. If both conditions are met, scenarios can be read as explorations of a space of cultural possibilities, or of a cultural horizon (in this case: a specific criminal milieu). The interpretation of the counterfactual scenario thereby provides a means for assessing the trustworthiness of the simulation.

References

Barreteau, O., et al. (2003). Our companion modelling approach. Journal of Artificial Societies and Social Simulation 6(2): 1. https://www.jasss.org/6/2/1.html

Cho, J., Trent, A. (2006). Validity in qualitative research revisited. Qualitative Research 6(3), 319-340. https://doi.org/10.1177/1468794106065006

Creswell, J., Miller, D. (2000). Determining validity in qualitative research. Theory into Practice 39(3), 124-130. https://doi.org/10.1207/s15430421tip3903_2

Dickel, S., Neumann. M. (2021). Hermeneutik sozialer Simulationen: Zur Interpretation digital erzeugter Narrative. Sozialer Sinn 22(2): 252-287. https://doi.org/10.1515/sosi-2021-0013

Étienne, M. (2014)(Ed.). Companion Modelling: A Participatory Approach to Support Sustainable Development. Springer, Dordrecht. https://link.springer.com/book/10.1007/978-94-017-8557-0

Fjelland, R. (2016). When Laypeople are Right and Experts are Wrong: Lessons from Love Canal. International Journal for Philosophy of Chemistry 22(1): 105–125. https://www.hyle.org/journal/issues/22-1/fjelland.pdf

Funtowicz, S., Ravetz, J. (1993). Science for the post-normal age. Futures 25(7): 739-755. https://doi.org/10.1016/0016-3287(93)90022-L

Kurt, R.; Herbrik, R. (2014). Sozialwissenschaftliche Hermeneutik und hermeneutische Wissenssoziologie. In: Baur, N.; Blasius, J. (eds.): Handbuch Methoden der empirischen Sozialforschung, pp. 473–491. Springer VS, Wiesbaden. https://link.springer.com/chapter/10.1007/978-3-658-21308-4_37

Lotzmann, U., Neumann, M. (2017). Simulation for interpretation. A methodology for growing virtual cultures. Journal of Artificial Societies and Social Simulation 20(3): 13. https://www.jasss.org/20/3/13.html

Lincoln, Y.S., Guba, E.G. (1985). Naturalistic Inquiry. Sage, Beverly Hill.

Oevermann, U. (2002). Klinische Soziologie auf der Basis der Methodologie der objektiven Hermeneutik. Manifest der objektiv hermeneutischen Sozialforschung. http://www.ihsk.de/publikationen/Ulrich_Oevermann-Manifest_der_objektiv_hermeneutischen_Sozialforschung.pdf (accessed 1 March 2020).

Pahl-Wostl, C. (2002). Participative and Stakeholder-Based Policy Design, Evaluation and Modeling Processes. Integrated Assessment 3(1): 3-14. https://doi.org/10.1076/iaij.3.1.3.7409

Schneider, W. L. (1995). Objektive Hermeneutik als Forschungsmethode der Systemtheorie. Soziale Systeme 1(1): 135–158.

Stilgoe, J. (2009). Citizen Scientists: Reconnecting Science with Civil Society. Demos, London.

Tesfatsion, L. (2021). “Agent-Based Modeling: The Right Mathematics for Social Science?,” Keynote address, 16th Annual Social Simulation Conference (virtual), sponsored by the European Social Simulation Association (ESSA), September 20-24, 2021.

Wynne, B. (1992). Misunderstood misunderstanding: social identities and public uptake of science. Public Understanding of Science 1(3): 281–304.


Neumann, M. (2023) The Challenge of Validation. Review of Artificial Societies and Social Simulation, 18th Apr 2023. https://rofasss.org/2023/04/18/ChallengeValidation


© The authors under the Creative Commons’ Attribution-NoDerivs (CC BY-ND) Licence (v4.0)

Discussions on Qualitative & Quantitative Data in the Context of Agent-Based Social Simulation

By Peer-Olaf Siebers, in collaboration with Kwabena Amponsah, James Hey, Edmund Chattoe-Brown and Melania Borit

Motivation

1.1: Some time ago, I had several discussions with my PhD students Kwabena Amponsah and James Hey (we are all computer scientists, with a research interest in multi-agent systems) on the topic of qualitative vs. quantitative data in the context of Agent-Based Social Simulation (ABSS). Our original goal was to better understand the role of qualitative vs. quantitative data in the life cycle of an ABSS study. But as you will see later, we conquered more ground during our discussions.

1.2: The trigger for these discussions came from numerous earlier discussions within the RAT task force (Sebastian Achter, Melania Borit, Edmund Chattoe-Brown, and Peer-Olaf Siebers) on the topic, while we were developing the Rigour and Transparency – Reporting Standard (RAT-RS). The RAT-RS is a tool to improve the documentation of data use in Agent-Based Modelling (Achter et al 2022). During our RAT-RS discussions we observed that the terms “qualitative data” and “quantitative data” could be interpreted in different ways in different phases of the ABM simulation study life cycle, and we found it difficult to clearly state the definition and role of these different types of data in the different contexts that the individual phases of the life cycle represent. This was aggravated by competing understandings of the terminology within the different domains (from the social and natural sciences) that form the field of social simulation.

1.3: As the ABSS community is a multi-disciplinary one, often doing interdisciplinary research, we thought that we should share the outcome of our discussions with the community. To demonstrate the different views that exist within the topic area, we asked some of our friends from the social simulation community to comment on our philosophical discussions. We were lucky enough to get our RAT-RS colleagues Edmund Chattoe-Brown and Melania Borit on board, who provided critical feedback and their own views. In the following we provide a summary of the overall discussion. Each of the following paragraphs contains a summary of the initial discussion outcomes, representing the computer scientists’ views, followed by thoughts provided by our two friends from the social simulation community (Borit’s in {} brackets and italic, and Chattoe-Brown’s in [] brackets and bold), both commenting on the initial discussion outcomes of the computer scientists. To show the diverse backgrounds of all the contributors, and perhaps to better explain their ways of thinking and their arguments, I have added short biographies of all contributors at the end of this Research Note. To support further (public) discussion, I have numbered the individual paragraphs to make it easier to refer back to them.

Terminology

2.1: As a starting point for our discussions I searched the internet for some terminology related to the topic of “data”. Following is a list of initial definitions of relevant terms [1]. First, the terms qualitative data and quantitative data, as defined by the Australian Bureau of Statistics: “Qualitative data are measures of ‘types’ and may be represented by a name, symbol, or a number code. They are data about categorical variables (e.g. what type). Quantitative data are measures of values or counts and are expressed as numbers. They are data about numeric variables (e.g. how many; how much; or how often).” (Australian Bureau of Statistics 2022) [Maybe don’t let a statistics unit define qualitative research? This has a topic that is very alien to us but argues “properly” about the role of different methods (Helitzer-Allen and Kendall 1992). “Proper” qualitative researchers would fiercely dispute this. It is “quantitative imperialism”.].

2.2: What might also help for this discussion is to better understand the terms qualitative data analysis and quantitative data analysis. Qualitative data analysis refers to “the processes and procedures that are used to analyse the data and provide some level of explanation, understanding, or interpretation” (Skinner et al 2021). [This is a much less contentious claim for qualitative data – and makes the discussion of the Australian Bureau of Statistics look like a distraction but a really “low grade” source in peer review terms. A very good one is Strauss (1987).] These methods include content analysis, narrative analysis, discourse analysis, framework analysis, and grounded theory and the goal is to identify common patterns. {These data analysis methods connect to different types of qualitative research: phenomenology, ethnography, narrative inquiry, case study research, or grounded theory. The goal of such research is not always to identify patterns – see (Miles and Huberman 1994): e.g., making metaphors, seeing plausibility, making contrasts/comparisons.} [In my opinion some of these alleged methods are just empire building or hot air. Do you actually need them for your argument?] These types of analysis must therefore use qualitative inputs, broadening the definition to include raw text, discourse and conceptual frameworks.

2.3: When it comes to quantitative data analysis “you are expected to turn raw numbers into meaningful data through the application of rational and critical thinking. Quantitative data analysis may include the calculation of frequencies of variables and differences between variables.” (Business Research Methodology 2022) {One does the same in qualitative data analysis – turns raw words (or pictures etc.) into meaningful data through the application of interpretation based on rational and critical thinking. In quantitative data analysis you usually apply mathematical/statistical models to analyse the data.}. While the output of quantitative data analysis can be used directly as input to a simulation model, the output of qualitative data analysis still needs to be translated into behavioural rules to be useful (either manually or through machine learning algorithms). {What is meant by “translated” in this specific context? Do we need this kind of translation only for qualitative data or also for quantitative data? Is there a difference between translation methods of qualitative and quantitative data?} [That seems pretty contentious too. It is what is often done, true, but I don’t think it is a logical requirement. I guess you could train a neural net using “cases” or design some other simple “cognitive architecture” from the data. Would this (Becker 1953), for example, best be modelled as “rules” or as some kind of adaptive process? But of course you have to be careful that “rule” is not defined so broadly that everything is one or it is true by definition. I wonder what the “rules” are in this: Chattoe-Brown (2009).]
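To make the notion of “translation” a little more concrete, here is a minimal sketch of hand-translating coded qualitative findings into condition-action rules. The codes, threshold, and rules are invented for illustration, not taken from any particular study.

```python
# Minimal sketch of hand-translating coded qualitative findings into
# condition-action rules for an agent. The codes, threshold, and rules
# are invented illustrations, not the output of any particular study.

# Themes identified in (hypothetical) interview material.
coded_findings = [
    "customers ask staff for help when uncertain",
    "customers leave if queues are long",
]

def shopper_rule(state):
    """Return an action for the agent's current state, following the
    coded findings above."""
    if state["uncertain"]:
        return "ask_for_help"
    if state["queue_length"] > 5:  # the threshold is an assumption
        return "leave_shop"
    return "browse"

print(shopper_rule({"uncertain": True, "queue_length": 2}))   # ask_for_help
print(shopper_rule({"uncertain": False, "queue_length": 9}))  # leave_shop
```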

2.4: Finally, let’s have a quick look at the difference between “data” and “evidence”. For this, we found the following distinction by Wilkinson (2022) helpful: “… whilst data can exist on its own, even though it is essentially meaningless without context, evidence, on the other hand, has to be evidence of or for something. Evidence only exists when there is an opinion, a viewpoint or an argument.”

Hypothesis

3.1: The RAT-RS divides the simulation life cycle into five phases, in terms of data use: model aim and context, conceptualisation, operationalisation, experimentation, and evaluation (Siebers et al 2019). We started our discussion by considering the following hypothesis: The outcome of qualitative data analysis is only useful for the purpose of conceptualisation and as a basis for producing quantitative data. It does not have any other roles within the ABM simulation study life cycle. {Maybe this hypothesis in itself has to be discussed. Is it so that you use only numbers in the operationalisation phase? One can write NetLogo code directly from qualitative data, without numbers.} [Is this inevitable given the way ABM works? Agents have agency and therefore can decide to do things (and we can only access this by talking to them, probably “open ended”). A statistical pattern – time series or correlation – has no agency and therefore cannot be accessed “qualitatively” – though we also sometimes mean by “qualitative” eyeballing two time series rather than using some formal measure of tracking. I guess that use would >>not<< be relevant here.]

Discussion

4.1: One could argue that qualitative data analysis provides causes for behaviour (and indications about their importance (ranking); perhaps also the likelihood of occurrence) as well as key themes that are important to be considered in a model. All of this sounds very useful for the conceptual modelling phase. The difficulty might be to model the impact (how do we know we model it correctly and at the right level), if that is not easily translatable into a quantitative value but requires some more (behavioural) mechanistic structures to represent the impact of behaviours. [And, of course, there is a debate in psychology (with some evidence on both sides) about the extent to which people are able to give subjective accounts we can trust (see Hastorf and Cantril 1954).] This might also pose issues when it comes to calibration – how does one calibrate qualitative data? {Triangulation.} One random idea we had was that perhaps fuzzy logic could help with this. More brainstorming and internet research is required to confirm that this idea is feasible and useful. [A more challenging example might be ethnographic observation of a “neighbourhood” in understanding crime. This is not about accessing the cognitive content of agents but may well still contribute to a well specified model. It is interesting how many famous models – Schelling, Zaller-Deffuant – actually have no “real” environment.]
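As a first, purely speculative illustration of the fuzzy-logic idea, a qualitative importance ranking elicited from interviews might be expressed as degrees of membership in linguistic categories, which a model could then use as weights. In this sketch the scale, categories, and membership functions are all assumptions.

```python
# Speculative sketch: expressing a qualitative importance ranking as
# fuzzy membership degrees. The scale, categories, and membership
# functions are all assumptions made for illustration.

def triangular(x, a, b, c):
    """Triangular membership function rising from a, peaking at b,
    falling to zero at c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Suppose interviews rank a cause of behaviour on a rough 0-10 scale.
importance = 7.0

memberships = {
    "minor":    triangular(importance, -1, 0, 5),
    "relevant": triangular(importance, 2, 5, 8),
    "central":  triangular(importance, 5, 10, 16),
}
print(memberships)  # degree of membership in each qualitative category
```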

4.2: One could also argue that what starts (or what we refer to initially) as qualitative data always ends up as quantitative data, as whatever comes out of the computer are numbers. {This is not necessarily true. Check  the work on qualitative outputs using Grounded Theory by Neumann (2015).} Of course this is a question related to the conceptual viewpoint. [Not convinced. It sounds like all sociology is actually physics because all people are really atoms. Formally, everything in computers is numbers because it has to be but that isn’t the same as saying that data structures or whatever don’t constitute a usable and coherent level of description: We “meet” and “his” opinion changes “mine” and vice versa. Somewhere, that is all binary but you can read the higher level code that you can understand as “social influence” (whatever you may think of the assumptions). Be clear whether this (like the “rules” claim) is a matter of definition – in which case it may not be useful (even if people are atoms we have no idea of how to solve the “atomic physics” behind the Prisoner’s Dilemma) or an empirical one (in which case some models may just prove it false). This (Beltratti et al 1996) contains no “rules” and no “numbers” (except in the trivial sense that all programming does).]

4.3: Also, an algorithm is expressed in code and can only be processed numerically, so it can only deliver quantitative data as output. These can perhaps be translated into qualitative concepts later. A way of doing this via the use of grounded theory is proposed in Neumann and Lotzmann (2016). {This refers to the same idea as my previous comment.} [Maybe it is “safest” to discuss this with rules because everyone knows those are used in ABM. Would it make sense to describe the outcome of a non trivial set of rules – accessed for example like this: Gladwin (1989) – as either “quantitative” or “numbers?”]

4.4: But is it true that data delivered as output is always quantitative? Let’s consider, for example, a consumer marketing scenario, where we define stereotypes (shopping enthusiast; solution demander; service seeker; disinterested shopper; internet shopper) that can change over time during a simulation run (Siebers et al 2010). These stereotypes are defined by likelihoods (likelihood to buy, wait, ask for help, and ask for refund). So, during a simulation run an agent could change its stereotype (e.g. from shopping enthusiast to disinterested shopper), influenced by the opinion of others and their own previous experience. So, at the beginning of the simulation run the agent can have a different stereotype compared to the end. Of course we could enumerate the five different stereotypes, and claim that the outcome is numeric, but the meaning of the outcome would be something qualitative – the stereotype related to that number. To me this would be a qualitative outcome, while the number of people that change from one stereotype to another would be a quantitative outcome. They would come in tandem. {So, maybe the problem is that we don’t yet have the right ways of expressing or visualising qualitative output?} [This is an interesting and grounded example but could it be easily knocked down because everything is “hard coded” and therefore quantifiable? You may go from one shopper type to another – and what happens depends on other assumptions about social influence and so on – but you can’t “invent” your own type. Compare something like El Farol (Arthur 1994) where agents arguably really can “invent” unique strategies (though I grant these are limited to being expressed in a specified “grammar”).]
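A stripped-down sketch of this scenario makes the tandem visible: one run yields a qualitative outcome (the stereotype label an agent ends up with) and a quantitative outcome (the number of label changes) side by side. The switching dynamics and probabilities below are invented; the actual model is described in Siebers et al (2010).

```python
import random

# Stripped-down sketch of the stereotype-switching idea. The switching
# dynamics and probabilities are invented; the actual model is described
# in Siebers et al (2010).

random.seed(1)

# Invented likelihood of a bad shopping experience per stereotype.
bad_experience_p = {
    "shopping enthusiast": 0.1, "solution demander": 0.2,
    "service seeker": 0.2, "disinterested shopper": 0.4,
    "internet shopper": 0.3,
}

def step(stereotype):
    """Toy update rule: a bad experience pushes the agent towards
    disinterest, an occasional good one back towards enthusiasm."""
    if random.random() < bad_experience_p[stereotype]:
        return "disinterested shopper"
    return "shopping enthusiast" if random.random() < 0.05 else stereotype

trajectory = ["shopping enthusiast"]
for _ in range(50):
    trajectory.append(step(trajectory[-1]))

# Qualitative outcome: the label the agent ends up with.
print("final stereotype:", trajectory[-1])
# Quantitative outcome: how often the label changed during the run.
changes = sum(a != b for a, b in zip(trajectory, trajectory[1:]))
print("number of stereotype changes:", changes)
```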

4.5: In order to define someone’s stereotype we would use numerical values (likelihood = proportion). However, stereotypes refer to nominal data (which refers to data that is used for naming or labelling variables, without any quantitative value). The stereotype itself would be nominal, while the way one would derive the stereotype would be numerical. Figure 1 illustrates a case in which the agent moves from the disinterested stereotype to the enthusiast stereotype. [Is there a potential confusion here between how you tell an agent is a type – parameters in the code just say so – and how you say a real person is a type? Everything you say about the code still sounds “quantitative” because all the “ingredients” are.]

Figure 1: Enthusiastic and Disinterested agent stereotypes

4.6: Let’s consider a second example, related to the same scenario: The dynamics over time to get from an enthusiastic shopper (perhaps via phases) to a disinterested shopper. This is represented as a graph where the x-axis represents time and the y-axis stereotypes (categorical data). If you want to take a quantitative perspective on the outcome you would look at a specific point in time (state of the system) but to take a qualitative perspective of the outcome, you would look at the pattern that the curve represents over the entire simulation runtime. [Although does this shade into the “eyeballing” sense of qualitative rather than the “built from subjective accounts” sense? Another way to think of this issue is to imagine “experts” as a source of data. We might build an ABM based on an expert perception of say, how a crime gang operates. That would be qualitative but not just individual rules: For example, if someone challenges the boss to a fight and loses they die or leave. This means the boss often has no competent potential successors.]
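Continuing in the same toy spirit, the two perspectives on one and the same categorical output might be expressed as follows. The trajectory here is a hand-made stand-in for a simulated run, and the pattern labels are invented.

```python
# Two views of one and the same categorical output. The trajectory is a
# hand-made stand-in for a simulated run; the pattern labels are invented.

trajectory = (["shopping enthusiast"] * 20 + ["service seeker"] * 15
              + ["disinterested shopper"] * 15)

def state_at(traj, t):
    """Quantitative-style view: the state of the system at one time point."""
    return traj[t]

def pattern_of(traj):
    """Qualitative-style view: a label for the shape of the whole curve."""
    if traj[0] == "shopping enthusiast" and traj[-1] == "disinterested shopper":
        return "gradual disengagement"
    return "stable" if traj[0] == traj[-1] else "mixed drift"

print(state_at(trajectory, 25))  # snapshot at t = 25: 'service seeker'
print(pattern_of(trajectory))    # whole-run reading: 'gradual disengagement'
```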

4.7: So, the inputs (parameters, attributes) to get the outcome are numeric, but the outcome itself in the latter case is not. The outcome only makes sense once it’s put into the qualitative context. And then we could say that the simulation produces some qualitative outputs. So, does the fact that data needs to be seen in a context make it evidence, i.e. do we only have quantitative and qualitative evidence on the output side? [Still worried that you may not be happy equating qualitative interview data with qualitative eyeballing of graphs. Mixes up data collection and analysis? And unlike qualitative interviews you don’t have to eyeball time series. But the argument of qualitative research is you can’t find out some things any other way because, to run a survey say, or an experiment, you already have to have a pretty good grasp of the phenomenon.]

4.8: Suppose one runs a marketing campaign that increases the number of enthusiastic shoppers. This can be seen as qualitative data, as it is descriptive of how the system works rather than providing specific values describing the performance of a system. Equally, you could express this in algebraic terms, which would make it quantitative data. So, it might be useful to categorise quantitative data to make the outcome easier to understand. [I don’t think this argument is definitely wrong – though I think it may be ambiguous about what “qualitative” means – but I think it really needs stripping down and tightening. I’m not completely convinced as a new reader that I’m getting at the nub of the argument. Maybe just one example in detail and not two in passing?]

Outcome

5.1: How we understand things and how the computer processes things are two different things. So, in fact qualitative data is useful for the conceptualisation and for describing experimentation and evaluation output, and needs to be translated into numerical data or algebraic constructs for the operationalisation. Therefore, we can reject our initial hypothesis, as we found more places where qualitative data can be useful. [Yes, and that might form the basis for a “general” definition of qualitative that was not tied to one part of the research process but you would have to be clear that’s what you were aiming at and not just accidentally blurring two different “senses” of qualitative.]

5.2: At the end of the discussion we picked up the idea of using Fuzzy Logic. Could fuzzy logic perhaps be used to describe qualitative output, as it describes a degree of membership to different categories? An interesting paper to look at in this context would be Sugeno and Yasukawa (1993). Also, a random idea that was mentioned is whether there is potential in using “fuzzy logic in reverse”, i.e. taking something that is fuzzy, making it crisp for the simulation, and making it fuzzy again for presenting the result. However, we decided to save this topic for another discussion. [Devil will be in the detail. Depends on exactly what assumptions the method makes. Devil’s advocate: What if qualitative research is only needed for specification – not calibration or validation – but it doesn’t follow from this that that use is “really” quantitative? How intellectually unappealing is that situation and why?]
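A rough, non-authoritative sketch of what “fuzzy logic in reverse” could mean: a linguistic input is made crisp for the simulation, and the crisp output is mapped back to linguistic labels for presentation. All labels, values, and membership functions below are invented.

```python
# Rough sketch of "fuzzy logic in reverse": a linguistic input is made
# crisp for the simulation, and the crisp output is made fuzzy again for
# presentation. All labels, values, and membership functions are invented.

crisp_input = {"low": 0.2, "medium": 0.5, "high": 0.8}

def simulate(adoption_rate):
    """Toy stand-in for a simulation run; returns a crisp output in [0, 1]."""
    return min(1.0, adoption_rate * 1.3)

def refuzzify(x):
    """Map a crisp output back to degrees of membership in verbal labels."""
    return {
        "weak effect":   max(0.0, 1 - 2 * x),
        "modest effect": max(0.0, 1 - abs(x - 0.5) * 2),
        "strong effect": max(0.0, 2 * x - 1),
    }

out = simulate(crisp_input["medium"])  # crisp in the middle...
print(refuzzify(out))                  # ...fuzzy again for presentation
```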

Conclusion

6.1: The purpose of this Research Note is really to stimulate you to think about, talk about, and share your ideas and opinions on the topic! What we present here is a philosophical impromptu discussion of our individual understanding of the topic, rather than a scientific debate that is backed up by literature. We still thought it is worthwhile to share this with you, as you might stumble across similar questions. Also, we don’t think we have found the perfect answers to the questions yet. So we would like to invite you to join the discussion and leave some comments in the chat, stating your point of view on this topic. [Is the danger of discussing these data types “philosophically”? I don’t know if it is realistic to use examples directly from social simulation but for sure examples can be used from social science generally. So here is a “quantitative” argument from quantitative data: “The view that cultural capital is transmitted from parents to their children is strongly supported in the case of pupils’ cultural activities. This component of pupils’ cultural capital varies by social class, but this variation is entirely mediated by parental cultural capital.” (Sullivan 2001). As well as the obvious “numbers” (social class by a generally agreed scheme) there is also a constructed “measure” of cultural capital based on questions like “how many books do you read in a week?” Here is an example of qualitative data from which you might reason: “I might not get into Westbury cos it’s siblings and how far away you live and I haven’t got any siblings there and I live a little way out so I might have to go on a waiting list … I might go to Sutton Boys’ instead cos all my mates are going there.” (excerpt from Reay 2002). As long as this was not just a unique response (but was supported by several other interviews) one would add to one’s “theory” of school choice: 1) Awareness of the impact of the selection system (there is no point in applying here whatever I may want) and 2) The role of networks in choice: This might be the best school for me educationally but I won’t go because I will be lonely.]

Biographies of the authors

Peer-Olaf Siebers is an Assistant Professor at the School of Computer Science, University of Nottingham, UK. His main research interest is the application of Computer Simulation and Artificial Intelligence to study human-centric and coupled human-natural systems. He is a strong advocate of Object-Oriented Agent-Based Social Simulation and is advancing its methodological foundations. It is a novel and highly interdisciplinary research field, involving disciplines like Social Science, Economics, Psychology, Geography, Operations Research, and Computer Science.

Kwabena Amponsah is a Research Software Engineer working for the Digital Research Service, University of Nottingham, UK. He completed his PhD in Computer Science at Nottingham in 2019 by developing a framework for evaluating the impact of communication on performance in large-scale distributed urban simulations.

James Hey is a PhD student at the School of Computer Science, University of Nottingham, UK. In his PhD he investigates the topic of surrogate optimisation for resource intensive agent based simulation of domestic energy retrofit uptake with environmentally conscious agents. James holds a Bachelor degree in Economics as well as a Master degree in Computer Science.

Edmund Chattoe-Brown is a lecturer in Sociology, School of Media, Communication and Sociology, University of Leicester, UK. His career has been interdisciplinary (including Politics, Philosophy, Economics, Artificial Intelligence, Medicine, Law and Anthropology), focusing on the value of research methods (particularly Agent-Based Modelling) in generating warranted social knowledge. His aim has been to make models both more usable generally and particularly more empirical (because the most rigorous social scientists tend to be empirical). The results of his interests have been published in 17 different peer reviewed journals across the sciences to date. He was funded by the project “Towards Realistic Computational Models Of Social Influence Dynamics” (ES/S015159/1) by the ESRC via ORA Round 5.

Melania Borit is an interdisciplinary researcher and the leader of the CRAFT Lab – Knowledge Integration and Blue Futures at UiT The Arctic University of Norway. She has a passion for knowledge integration and a wide range of interconnected research interests: social simulation, agent-based modelling; research methodology; Artificial Intelligence ethics; pedagogy and didactics in higher education, games and game-based learning; culture and fisheries management, seafood traceability; critical futures studies.

References

Achter S, Borit M, Chattoe-Brown E, and Siebers PO (2022) RAT-RS: a reporting standard for improving the documentation of data use in agent-based modelling. International Journal of Social Research Methodology, DOI: 10.1080/13645579.2022.2049511

Australian Bureau of Statistics (2022) Statistical Language > Qualitative and Quantitative data. https://www.abs.gov.au/websitedbs/D3310114.nsf/Home/Statistical+Language (last accessed 05/05/2022)

Arthur WB (1994) Inductive reasoning and bounded rationality. The American Economic Review, 84(2), pp.406-411. https://www.jstor.org/stable/pdf/2117868.pdf

Becker HS (1953). Becoming a marihuana user. American Journal of Sociology, 59(3), pp.235-242. https://www.degruyter.com/document/doi/10.7208/9780226339849/pdf

Beltratti A, Margarita S, and Terna P (1996) Neural Networks for Economic and Financial Modelling. International Thomson Computer Press.

Business Research Methodology (2022) Quantitative Data Analysis. https://research-methodology.net/research-methods/data-analysis/quantitative-data-analysis/ (last accessed 05/05/2022)

Chattoe-Brown E (2009) The social transmission of choice: a simulation with applications to hegemonic discourse. Mind & Society, 8(2), pp.193-207. DOI: 10.1007/s11299-009-0060-7

Gladwin CH (1989) Ethnographic Decision Tree Modeling. SAGE Publications.

Hastorf AH and Cantril H (1954) They saw a game; a case study. The Journal of Abnormal and Social Psychology, 49(1), pp.129–134.

Helitzer-Allen DL and Kendall C (1992) Explaining differences between qualitative and quantitative data: a study of chemoprophylaxis during pregnancy. Health Education Quarterly, 19(1), pp.41-54. DOI: 10.1177/109019819201900104

Miles MB and Huberman AM (1994) Qualitative Data Analysis: An Expanded Sourcebook. Sage.

Neumann M (2015) Grounded simulation. Journal of Artificial Societies and Social Simulation, 18(1), 9. DOI: 10.18564/jasss.2560

Neumann M and Lotzmann U (2016) Simulation and interpretation: a research note on utilizing qualitative research in agent based simulation. International Journal of Swarm Intelligence and Evolutionary Computing, 5(1).

Reay D (2002) Shaun’s Story: Troubling discourses of white working-class masculinities. Gender and Education, 14(3), pp.221-234. DOI: 10.1080/0954025022000010695

Siebers PO, Achter S, Palaretti Bernardo C, Borit M, and Chattoe-Brown E (2019) First steps towards RAT: a protocol for documenting data use in the agent-based modeling process (Extended Abstract). Social Simulation Conference 2019 (SSC 2019), 23-27 Sep, Mainz, Germany.

Siebers PO, Aickelin U, Celia H and Clegg C (2010) Simulating customer experience and word-of-mouth in retail: a case study. Simulation: Transactions of the Society for Modeling and Simulation International, 86(1), pp.5-30. DOI: 10.1177/0037549708101575

Skinner J, Edwards A and Smith AC (2021) Qualitative Research in Sport Management (2nd ed.), p.171. Routledge.

Strauss AL (1987). Qualitative Analysis for Social Scientists. Cambridge University Press.

Sugeno M and Yasukawa T (1993) A fuzzy-logic-based approach to qualitative modeling. IEEE Transactions on Fuzzy Systems, 1(1), pp.7-31.

Sullivan A (2001) Cultural capital and educational attainment. Sociology 35(4), pp.893-912. DOI: 10.1017/S0038038501008938

Wilkinson D (2022) What’s the difference between data and evidence? Evidence-based practice. https://oxford-review.com/data-v-evidence/ (last accessed 05/05/2022)


Notes

[1] An updated set of the terminology, defined by the RAT task force in 2022, is available as part of the RAT-RS in Achter et al (2022) Appendix A1.


Peer-Olaf Siebers, Kwabena Amponsah, James Hey, Edmund Chattoe-Brown and Melania Borit (2022) Discussions on Qualitative & Quantitative Data in the Context of Agent-Based Social Simulation. Review of Artificial Societies and Social Simulation, 16th May 2022. https://rofasss.org/2022/05/16/Q&Q-data-in-ABM


© The authors under the Creative Commons’ Attribution-NoDerivs (CC BY-ND) Licence (v4.0)

Cherchez Le RAT: A Proposed Plan for Augmenting Rigour and Transparency of Data Use in ABM

By Sebastian Achter, Melania Borit, Edmund Chattoe-Brown, Christiane Palaretti and Peer-Olaf Siebers

The initiative presented below arose from a Lorentz Center workshop on Integrating Qualitative and Quantitative Evidence using Social Simulation (8-12 April 2019, Leiden, the Netherlands). At the beginning of this workshop, the attendees divided themselves into teams aiming to work on specific challenges within the broad domain of the workshop topic. Our team took up the challenge of looking at “Rigour, Transparency, and Reuse”. The aim that emerged from our initial discussions was to create a framework for augmenting rigour and transparency (RAT) of data use in ABM when designing, analysing, and publishing such models.

One element of the framework that the group worked on was a roadmap of the modelling process in ABM, with particular reference to the use of different kinds of data. This roadmap was used to generate the second element of the framework: A protocol consisting of a set of questions, which, if answered by the modeller, would ensure that the published model was as rigorous and transparent in terms of data use, as it needs to be in order for the reader to understand and reproduce it.

The group (which had diverse modelling approaches and spanned a number of disciplines) recognised the challenges of this approach and much of the week was spent examining cases and defining terms so that the approach did not assume one particular kind of theory, one particular aim of modelling, and so on. To this end, we intend that the framework should be thoroughly tested against real research to ensure its general applicability and ease of use.

The team was also very keen not to “reinvent the wheel”, but to try to develop the RAT approach (in connection with data use) to augment and “join up” existing protocols or documentation standards for specific parts of the modelling process. For example, the ODD protocol (Grimm et al. 2010) and its variants are generally accepted as the established way of documenting ABM, but they do not request rigorous documentation or justification of the data used in the modelling process.

The plan to move forward with the development of the framework is organised around three journal articles and associated dissemination activities:

  • A literature review of best (data use) documentation and practice in other disciplines and research methods (e.g. PRISMA – Preferred Reporting Items for Systematic Reviews and Meta-Analyses)
  • A literature review of available documentation tools in ABM (e.g. ODD and its variants, DOE, the “Info” pane of NetLogo, EABSS)
  • An initial statement of the goals of RAT, the roadmap, the protocol and the process of testing these resources for usability and effectiveness
  • A presentation, poster, and round table at SSC 2019 (Mainz)

We would appreciate suggestions for items that should be included in the literature reviews, “beta testers” and critical readers for the roadmap and protocol (from as many disciplines and modelling approaches as possible), reactions (whether positive or negative) to the initiative itself (including joining it!) and participation in the various activities we plan at Mainz. If you are interested in any of these roles, please email Melania Borit (melania.borit@uit.no).

Acknowledgements

Chattoe-Brown’s contribution to this research is funded by the project “Towards Realistic Computational Models Of Social Influence Dynamics” (ES/S015159/1) funded by ESRC via ORA Round 5 (PI: Professor Bruce Edmonds, Centre for Policy Modelling, Manchester Metropolitan University: https://gtr.ukri.org/projects?ref=ES%2FS015159%2F1).

References

Grimm, V., Berger, U., DeAngelis, D. L., Polhill, J. G., Giske, J. and Railsback, S. F. (2010) ‘The ODD Protocol: A Review and First Update’, Ecological Modelling, 221(23):2760–2768. doi:10.1016/j.ecolmodel.2010.08.019


Achter, S., Borit, M., Chattoe-Brown, E., Palaretti, C. and Siebers, P.-O.(2019) Cherchez Le RAT: A Proposed Plan for Augmenting Rigour and Transparency of Data Use in ABM. Review of Artificial Societies and Social Simulation, 4th June 2019. https://rofasss.org/2019/06/04/rat/


© The authors under the Creative Commons’ Attribution-NoDerivs (CC BY-ND) Licence (v4.0)