16. Activity Theory
This chapter is about a theory that was developed decades ago. Some of the basic ideas of the theory were formulated before the first electronic computers were built. Then why does the Encyclopaedia of Human-Computer Interaction feature a chapter on the theory? In other words, why activity theory?
The question can be answered in two steps.
Activity is currently one of the most fundamental concepts in HCI research (Moran, 2006). Early HCI was predominantly concerned with understanding and supporting tasks, which people do to achieve clear predetermined goals (such as making certain changes in a document). The issues of why a person carries out a task and what the task means to the person were typically outside the scope of analysis, evaluation, and design. However, with interactive technology becoming a part of our everyday environments, the focus on tasks proved to be insufficient. Understanding and designing technology in the context of purposeful, meaningful activities is now a central concern of HCI research and practice. Virtually all significant recent developments in interactive technologies (think, for instance, of social media, smartphones, and e-book readers) owe their success to helping us live fuller lives rather than merely supporting new types of tasks.
Most people have an intuitive understanding of what activities are. Is there any need for a theory here?
A problem with intuitive, commonsense notions of activity is that they can be different for different people. In addition, they may not be specific enough. How can activities be distinguished from non-activities? Can activities be broken down into smaller units? What role does technology play in human activity? To answer these and other similar questions, HCI needs a more elaborated concept of activity. Such a concept is offered by activity theory, discussed in this chapter.
Activity theory is a conceptual framework originating from the socio-cultural tradition in Russian psychology. The foundational concept of the framework is “activity”, which is understood as purposeful, transformative, and developing interaction between actors (“subjects”) and the world (“objects”). The framework was originally developed by the Russian psychologist Aleksei Leontiev (Leontiev, 1978; Leontiev, 1981). A version of activity theory, based on Leontiev’s framework, was proposed in the 1980s by the Finnish educational researcher Yrjö Engeström (1987). Currently, both Leontiev’s and Engeström’s variants of activity theory, as well as their combinations, are widely used across disciplines, not only in psychology but also in a range of other fields, including education, organizational learning, and cultural studies.
Since the early 1990s, activity theory has been a visible landmark of the theoretical landscape of Human-Computer Interaction (HCI). In the last two decades, activity theory, along with some other frameworks, such as distributed cognition and phenomenology, has established itself as a leading post-cognitivist approach in HCI and interaction design (e.g., Bødker, 1991; Nardi, 1996a; Bertelsen and Bødker, 2003; Kaptelinin et al., 2003; Kaptelinin and Nardi, 2006). Carroll (2011) observes: “Information processing psychology and laboratory user studies, once the kernel of HCI research, became important, but niche areas. The most canonical theory-base in HCI now is socio-cultural, Activity Theory.”
This chapter discusses the past, present, and future of activity theory as a theoretical approach in HCI. It starts with a brief introduction to the basic concepts and principles of activity theory, continues to describe its key contributions to research in HCI and interaction design, and concludes with reflections on challenges and prospects for further development of the approach.
The chapter is not intended to be a comprehensive exposition of the framework and its uses in HCI. More detailed discussions of activity theory concepts and applications in the context of HCI research can be found, for instance, in Bødker (1991), Nardi (1996a), Engeström et al. (1999), Kaptelinin and Nardi (2006), and Kaptelinin and Nardi (2012).
16.2 Brief overview of activity theory
16.2.1 Historical roots and underlying assumptions
The immediate conceptual roots of activity theory can be traced to Russian/Soviet psychology of the 1920s and 1930s. During that time theoretical explorations in Russian psychology were heavily influenced by Marxist philosophy. A collective effort of a number of prominent psychologists, most notably Lev Vygotsky and Sergey Rubinshtein, gave rise to a socio-cultural perspective (understood in a broad sense) in Russian psychology, although that effort also involved much disagreement and even open conflicts (e.g., Vygotsky, 1978; Rubinshtein, 1946; Rubinshtein, 1986).
The main conceptual thrust of the socio-cultural perspective was to overcome the divide between, on the one hand, human mind, and on the other hand, culture and society. As opposed to most psychological frameworks of that time, the perspective considered culture and society generative forces, “responsible” for the very production of human mind, rather than external factors, however important, that merely constitute conditions for the functioning of the mind without changing its basic nature.
The work based on the socio-cultural perspective produced a number of fundamental insights. Some of the most important contributions were as follows:
- Vygotsky’s universal law of development, according to which human mental functions first emerge as distributed between the person and other people (i.e., “inter-psychological”) and only then as individually mastered by the person himself or herself (i.e., “intra-psychological”), and
- Rubinshtein’s principle of “unity and inseparability of consciousness and activity”, according to which human conscious experience and human acting in the world, the internal and the external, are closely interconnected and mutually determine one another.
Aleksei Leontiev’s activity theory emerged as an outgrowth of the socio-cultural perspective. The theory employs a number of ideas developed by Vygotsky, Leontiev’s mentor and friend. It is also strongly influenced by the work of Rubinshtein, a major figure in Russian psychology and a long-time colleague of Leontiev’s (Brushlinsky and Aboulhanova-Slavskaya, 2000). Arguably, activity theory also features some other important influences which are more difficult to discern, such as the framework developed by Mikhail Basov (Basov, 1991). The basic assumptions of activity theory are the same as those underlying the socio-cultural perspective in general: namely, the assumptions of the social nature of human mind and inseparability of human mind and activity.
At the same time, Leontiev’s activity theory is not a simple imprint of all these influences. As discussed below, while the theory incorporates a variety of ideas developed by Vygotsky, Rubinshtein, and others, these ideas have been revised and elaborated upon by Leontiev to form his own distinct and consistent conceptual framework.
16.3 Basic concepts and principles of Leontiev's framework
16.3.1 The concept of 'activity'
Activity, in a broad sense, is an interaction of the actor (e.g., a human being) with the world. The interaction, according to activity theory terminology, is described as a process relating the subject (S) and the object (O). A common way to represent activity is “S ⇔ O”. There are two key aspects differentiating activity from other types of interaction: (a) subjects of activities have needs, which should be met through an interaction with the world, and (b) activities and their subjects mutually determine one another; or, more generally, activities are generative forces that transform both subjects and objects.
Subjects have needs. Activity is understood as a “unit of life” of a material subject existing in the objective world. Subjects have their own needs and, in order to survive, have to carry out activities, that is, interact with objects of the world to meet their needs. Leontiev’s analysis was mostly concerned with activities of individual human beings, but the notion of “subject” is not limited to individual humans. Other types of entities, such as animals, teams, and organizations can also have need-based agency and, therefore, be subjects of activities (Kaptelinin and Nardi, 2006).
Activities and their subjects mutually determine one another. It is immediately obvious that activities are influenced by the attributes of subjects and objects. Consider a simple example. Undoubtedly, whether or not a person can solve a math problem depends on the nature of the problem (e.g., how difficult it is) and the person’s abilities and skills (i.e., how good the person is at math). In the long run, however, the opposite is also true: both the object and the subject are over time transformed by the activity. It is apparent, for instance, that a person’s math skills are a result of previous experience: they have developed through solving math problems in the past. In other words, while it is true that a person’s math abilities determine how the person solves math problems, it is also true that solving math problems determines the person’s math abilities. Therefore, subjects do not only express themselves in their activities; in a very real sense they are produced by the activities (cf. Rubinshtein, 1986).
Mind and activity: Leontiev vs. Rubinshtein
Leontiev extends and develops Rubinshtein’s principle of unity and inseparability of consciousness and activity in three respects. First, Leontiev states that psychological studies should not focus only on the “psychological aspect or facet of activity” (as suggested by Rubinshtein), such as the relationship between activity and subjective experiences. Instead, he maintains that the relevance of activity to psychology is of a more general nature: activity is of fundamental importance to psychology because of its special function, the function of placing the subject in the objective reality and transforming this reality into a form of subjectivity (Leontiev, 1978). Second, as discussed below, Leontiev’s analysis focuses on both conscious and unconscious mental phenomena. Third and finally, Leontiev offers a number of more concrete insights about the relationship between mind and activity, most notably the idea of structural similarity between internal and external processes (Leontiev, 1978; Leontiev, 1981).
16.3.2 Basic principles
The main ideas and assumptions of activity theory, outlined above, have been elaborated by Leontiev into a set of more specific notions, claims, and arguments. A common problem with interpreting Leontiev’s texts is that they often reflect the unfolding logic of his conceptual explorations rather than provide a systematic overview of the logical structure of the framework as a whole. There have been several attempts to translate the representation of Leontiev’s framework, as it is described in his texts, into a structured set of distinct principles. Kaptelinin and Nardi (2006), building on Wertsch (1981), identify the following principles:
Object-orientedness
This principle (which bears some similarity to phenomenology’s notion of “intentionality”; see Dourish, 2001) is directly related to the very concept of activity as a “subject-object” relationship. Why is subjects’ interaction with the world defined in terms of interacting with objects? The explanation, offered by the principle of object-orientedness, is as follows. The world is structured; it comprises discrete objectively existing entities, that is, objects. Subjects’ interaction with the world is also structured; it is organized around the objects. Objects have their “objective” meanings, determined by their relationship with other entities existing in the world (including the subject). In order to meet its needs, the subject has to reveal the objective meaning of the objects, at least partly, and act accordingly.
Therefore, the object of activity has two facets. It should be understood:
- First, in its independent existence as subordinating to itself and transforming the activity of the subject, and
- Second, as an image of the object, as a product of its property of psychological reflection that is realized as an activity of the subject and cannot exist otherwise (Leontiev, 1978).
These two facets do not necessarily always coincide. They are dynamically aligned in the unfolding “subject-object” interaction. The alignment involves a double transition: the subject’s activity is subordinated to properties of the object which gives rise to new activity structures; in turn, new activity structures bring about new subjective phenomena, such as a more developed image of the object. For instance, a tourist wandering around an area may initially have a vague idea about the area and simply follow the constraints and possibilities provided by the environment. Over time, emerging patterns of walking may result in a development of an elaborated cognitive map of the area.
The principle of object-orientedness applies differently to animals and human beings. Animals live in a structured world of natural objects which are material and mostly have direct positive or negative meanings and values, provide affordances for action, and so forth. Human beings live in a predominantly man-made world, where objects are not necessarily physical things: they can be intangible, but they can still be considered “objects” as long as they objectively exist in the world. For instance, the objects of learning a new language or making a company profitable are impossible to touch, physically weigh, or measure with a ruler. However, the grammatical structure of a language or profit margin of a company does not exist merely in a person’s imagination. Rather, they are “facts of life”, which need to be faced and dealt with. “Objective” is understood in activity theory in a broad sense as including not only the properties of things that can be directly registered with physical instruments, but also socially and culturally defined properties.
Therefore, the principle of object-orientedness states that all human activities are directed toward their objects and are differentiated from one another by their respective objects. Objects motivate and direct activities, around them activities are coordinated, and in them activities are crystallized when the activities are complete. Analysis of objects is therefore a necessary requirement for understanding human activities, both individual and collective ones.
Lost in translation: “Predmet” vs. “objekt”
There is a language problem which makes an adequate translation of Leontiev’s notion of “object” from Russian to English somewhat complicated. In Russian there are two words which have similar but distinct meanings: “objekt” and “predmet”. Both refer to objectively existing entities, but the notion of “predmet” typically also implies a relevance of the entity in question to certain human purposes or interests. Similar linguistic distinctions can be found in German and some other languages. Leontiev deliberately referred to the object of activity as “predmet” rather than “objekt”. However, this distinction is usually lost in English translation since both words are translated as “object”. This linguistic problem is a likely reason why the emergence of objects of activities (that is, the dynamics of “just objects” becoming involved in activities and acquiring the status of “objects of activities”, and vice versa) has not so far received the attention it deserves in concrete studies informed by activity theory.
Hierarchical structure of activity
Human activities, according to Leontiev, are units of life which are organized into three hierarchical layers (see Figure 2). The top layer is the activity itself, which is oriented toward a motive, corresponding to a certain need. The motive is the object that the subject ultimately needs to attain. For instance, in some cultural contexts people reaching a certain age need to learn how to drive a car (and get a driver’s license); it is a general prerequisite of being a fully functional member of society. Learning how to drive a car is an activity which is organized as a multi-layer system of sub-units directed at getting a driver’s license.
Actions are conscious processes directed at goals which must be undertaken to fulfil the object. Goals can be decomposed into sub-goals, sub-sub-goals, and so forth. For instance, one may decide to enroll in a driving school, purchase instructional materials, make a schedule of theoretical lessons and practice sessions, etc.
Actions are implemented through lower-level units of activity, called operations. Operations are routine processes providing an adjustment of an action to the ongoing situation. They are oriented toward the conditions under which the subject is trying to attain a goal. People are typically not aware of their operations. For instance, a driving school student taking notes during a lecture might be fully concentrated on traffic rules rather than the process of writing.
Operations emerge in two ways. First, an operation can be a result of step-by-step automatization of an originally conscious action (e.g., over time, the action of changing lanes may transform into a routine operation, which does not require conscious control). When such operations fail, they are often transformed into conscious actions again. Second, an operation can be a result of “improvisation”, a spontaneous adjustment of an action on the fly (e.g., in an emergency situation the driver may act “instinctively”, without thinking).
The three-layer model only applies to human activities. The complex relationship between motives (i.e., what motivates the activity) and goals (i.e., what directs the activity) is a characteristic feature of humans. While animals usually act directly toward the objects that motivate them (e.g., food), humans often attain their motives by directing their efforts to other things (e.g., however hungry, people usually grab a menu, rather than the first available food, upon entering a restaurant). This feature, according to Leontiev, is a product of the complex social organization of human life. In particular, the emergence of division of labour entails the need for some people to focus on objects different from the ones that actually meet a certain need. For instance, the actions of primordial hunters who scare the game away (i.e., “beaters”) may look paradoxical if one does not know that the game is directed toward another group of hunters, waiting in ambush (i.e., “ambushers”). Once a feature of the social organization of life, the dissociation between motivating objects (motives) and directing objects (goals) shapes the structure of individual activities and becomes their characteristic feature as well (Leontiev, 1981).
Considering human activity as a three-layer system opens up a possibility for a combined analysis of motivational, goal-directed, and operational aspects of human acting in the world, that is, bringing together the issues of Why, What, and How within a consistent conceptual framework (Bødker, 1991). Realizing this possibility in a concrete study may, however, be problematic. Revealing the ultimate motives of a person or the fine-grained structure of automatic operations may prove to be difficult, if not impossible. This limitation of Leontiev’s three-layer model as an analytical tool can be overcome by employing an expansive “actions first” strategy. This strategy involves starting analysis from the actions layer, which lends itself relatively easily to qualitative research methods. In particular, people are usually aware of their goals and can report or express them in a certain way. Then the analysis can be expanded both “up”, to progressively higher-level goals and, ultimately, motives, and “down”, to sub-goals and operations. The expanding scope of analysis may not cover the entire structure of the activity in question but may be sufficient for the purposes of the task at hand (see also Kaptelinin and Nardi, 2006).
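For readers who think in terms of data structures, the three-layer hierarchy and the “actions first” strategy described above can be sketched as a small tree. This is only an illustrative model, not part of Leontiev’s framework: the class `Unit`, the functions `expand_up`, `expand_down`, `automatize`, and `break_down`, and the wording of the example goals are all assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Unit:
    """A unit of activity at one of the three layers (hypothetical model)."""
    name: str
    level: str            # "activity", "action", or "operation"
    oriented_toward: str  # a motive, a goal, or the conditions, respectively
    children: List["Unit"] = field(default_factory=list)
    parent: Optional["Unit"] = field(default=None, repr=False, compare=False)

    def add(self, child: "Unit") -> "Unit":
        """Attach a lower-level unit and return it, so calls can be chained."""
        child.parent = self
        self.children.append(child)
        return child


def expand_up(unit: Unit) -> List[str]:
    """Expand "up": what the enclosing units are oriented toward (the Why)."""
    chain = []
    node = unit.parent
    while node is not None:
        chain.append(node.oriented_toward)
        node = node.parent
    return chain


def expand_down(unit: Unit) -> List[str]:
    """Expand "down": names of sub-actions and operations (the How)."""
    names = []
    for child in unit.children:
        names.append(child.name)
        names.extend(expand_down(child))
    return names


def automatize(unit: Unit) -> None:
    """With practice, a conscious action may become a routine operation."""
    if unit.level == "action":
        unit.level = "operation"


def break_down(unit: Unit) -> None:
    """When an operation fails, it often becomes a conscious action again."""
    if unit.level == "operation":
        unit.level = "action"


# The driving example from the text; the goal wording is illustrative.
learning = Unit("learning to drive", "activity", "getting a driver's license")
enrolling = learning.add(
    Unit("enrolling in a driving school", "action", "a place in the course"))
lecture = enrolling.add(
    Unit("attending a lecture", "action", "knowing the traffic rules"))
notes = lecture.add(Unit("taking notes", "operation", "pace of the lecture"))

why = expand_up(lecture)    # start from a reported action and move up
how = expand_down(lecture)  # then move down toward operations
```

Starting from the action a person can report (`lecture`), the analysis moves outward in both directions. In a real study either expansion may stop early, which matches the point that the analysis need not cover the entire structure of the activity.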
Mediation
Arguably, mediation is the primary dimension along which human beings differ from other animals. It is mediation which has made homo sapiens such a successful species: while we do not have sharp claws and thick fur, we compensate for that by employing mediating artefacts, such as hammers, knives, and warm clothes. The effect of complex social organization on the structure of individual human activity, discussed above and illustrated by the case of primordial hunters, is another example of mediation. In fact, the main distinctive features of humans, such as language, society and culture, the production and use of advanced tools, etc., all involve mediation. They represent different aspects of the same phenomenon, that is, the emergence of a complex system of objects and structures, both material and immaterial, which serve as mediating means embedded in the interaction between human beings and the world and shaping the interaction.
Activity theory inherits its special interest in mediation from the approach that made the most fundamental impact on Leontiev’s framework, that is, Vygotsky’s cultural-historical psychology. In cultural-historical psychology mediation is, arguably, the most important concept of all; it serves as the cornerstone of the approach as a whole. Vygotsky proposed that the very nature of human mental processes, as opposed to animals’ mental processes, is defined by mediation. Vygotsky’s ideas concerning mediation were explicitly incorporated into the conceptual framework of activity theory but placed in a somewhat different theoretical context. As opposed to Vygotsky, who was predominantly interested in particular higher mental functions and their ontogenetic development and, therefore, particularly concerned with means that mediate specific mental operations (especially signs), Leontiev mainly focused on means that mediate a purposeful object-oriented activity as a whole.
Tool mediation allows for appropriating socially developed forms of acting in the world. Tools reflect the previous experience of other people. This experience is accumulated in the structural properties of tools, such as their shape or material, as well as in the knowledge of how the tool should be used (see Figures 3 to 8). Therefore, the use of tools is a form of accumulation and transmission of social, cultural knowledge. Tools do not only shape external behaviour. As discussed below, through internalization they also influence the mental functioning of individuals. For instance, a person’s cognitive map of a city may depend on whether or not the person is a car driver. Some folklore sayings also suggest that our perception of the world is affected by the tools we are using, e.g.: “If all you have is a hammer, everything looks like a nail.”
Example of mediation and accumulation of experience over time: Devices for calculation and computation
Internalization and externalization
This principle states that human activities are distributed, and dynamically re-distributed, along the external/internal dimension. Any human activity contains both internal and external components. Sometimes external components are hardly visible: they can be reduced, for instance, to eye movements or even patterns of brain activation, but they are always present. The concepts of internalization and externalization refer to the processes of mutual transformations between internal and external components of an activity.
In the process of internalization external components become internal. For instance, young children often use their fingers to do simple math, but over time the use of fingers typically becomes redundant. An inexperienced driver may speak aloud to remind himself of the “parallel parking” procedure, but the need for speaking aloud is likely to disappear with practice.
The concept of internalization in activity theory is similar to the one proposed in some other frameworks, most notably Vygotsky’s cultural-historical psychology. Within Vygotsky’s framework the notion of internalization predominantly refers to a step in the development of higher mental functions, at which sign mediation initially emerging in the external plane eventually progresses to the internal plane. In activity theory internalization is used in a broader meaning as any re-distribution of internal and external components of an activity that results in a shift from the external to the internal.
Furthermore, in activity theory internalization is considered as just one type of transition, and providing a full account of the dynamics of activity “...necessarily presupposes the existence of regularly occurring transitions in the opposite direction also, from internal to external activity.” (Leontiev, 1978). The process opposite to internalization is externalization, that is, transformation of internal components of an activity into external ones. An example of externalization is sketching a design idea (see Figure 9).
Leontiev observed that in modern forms of work internal and external components of activity are becoming increasingly intertwined: “Physical work accomplishing a practical transformation of material objects, ever more “intellectualized,” incorporates into itself the carrying out of more complex mental acts; at the same time the work of the contemporary researcher, activity that is specially cognitive, intellectual par excellence, is ever more filled with processes that in their form are external actions” (Leontiev, 1978).
In a similar vein, an activity which is initially socially distributed, that is, distributed between several people (e.g., driving a car by a person taking a driving lesson, distributed between the learner and driving instructor), can be appropriated by a person (i.e., the learner) and then carried out individually. The opposite process is the transformation of an individual activity into a socially distributed one, e.g., when a person initiates a group project or other people intervene to help an individual to carry out her actions (Cole and Engeström, 1993). The dimensions of internal/external and individual/social are similar to one another in many respects and are closely related. For instance, when an internal activity is externalized, it also affects the individual-collective dimension: tools and signs employed in externally distributed actions can be shared and thus enable social distribution of the actions.
Development
Finally, activity theory requires that activities always be analysed in the context of development. Development in activity theory is both an object of study and a research strategy. As an object of study, development constitutes a complex phenomenon that can be analysed at different levels. Examples of the levels of analysis include studying the development of various forms of animal activity in biological evolution (phylogenesis), emergence of specifically human forms of activity in social history (sociogenesis), individual development throughout various phases of life (ontogenesis), appropriation of particular artefacts (instrumental genesis, Rabardel and Bourmaud, 2003), and so forth.
In activity theory development is also a research strategy. Analysis of the dynamics of how the object of study transforms over time is considered essential for a deep understanding of the object. Activity theory does not prescribe a single method of study since different types and levels of development require different methods or combinations of methods.
The developmental research perspective adopted by activity theory is often associated with dialectical logic, a concept and framework introduced by the Russian philosopher Evald Ilyenkov (Ilyenkov, 2008; see Engeström et al., 1999). Dialectical logic is different from traditional formal logic in how it views contradictions and development. Traditional logic invariably considers contradictions as indicators of problems that need to be addressed. Contradictions are to be eliminated in order to create a perfectly logical system (either an abstract one, such as a model or theory, or a more concrete one, such as the management structure of an organization). In addition, traditional logic is typically not concerned with development; perfectly logical systems do not need to be changed and may stay as they are indefinitely.
Dialectical logic starts from a different assumption. It is assumed that dialectical development (that is, development driven by contradictions) is a fundamental aspect of all imaginable objects of study and therefore should be taken into consideration in analysis. While some “superficial” contradictions can be eliminated in a relatively straightforward way, there are also other, deeper contradictions which cannot be simply resolved once and for all. Any solution intended to resolve such contradictions is temporary, for it gives rise to new contradictions. An example of a contradiction of this type, well known to HCI researchers, is the contradiction between tasks and artefacts. The notion of “task-artefact cycle” (Carroll, 1991) implies that the ultimate balance between tasks and artefacts cannot be achieved. A new artefact changes the task for which it is developed, which means that another artefact needs to be developed to support the new task, and so on and so forth.
Dialectical logic posits that analysis of the object of study which only deals with how the object exists at the present time is insufficient. Instead, analysis of the developmental trajectory of the object, preferably starting from an initial undeveloped form (i.e., a “germ”), is claimed to be critically important for understanding how the object has come to be what it is, and what contradictions can be expected to drive its further development.
The principles of activity theory, described above, comprise an integrated system: they represent different aspects of human activity as a whole. Systematic application of any of the principles often makes it necessary to eventually engage the others as well. For instance, analysis of the effects of certain technologies on human cognition from an activity theoretical perspective would require identifying the variety of activities within which the technologies are being employed, as well as their respective objects (object-orientedness), the role and place of the technologies in the hierarchical structure of each of these activities (hierarchical structure), how the activities are being re-shaped by using the technologies as mediating means (mediation), and how transformations of external components of activity are related to corresponding changes of internal components (externalization and internalization). And all these phenomena should be analysed as they unfold over time (development).
16.3.3 Engeström's activity system model
Leontiev’s approach is predominantly concerned with activities of individual human beings. While Leontiev explicitly mentions that activities can be carried out not only by individual human beings but also by social entities (collective subjects), he does not systematically explore the structure and development of collective activities and does not present a conceptual model of collective activity (which can probably be explained, at least partly, by the ideology-related limitations and constraints that were imposed on studies of social phenomena in the USSR). A model of collective activity, the “activity system model” (a.k.a. “Engeström’s triangle”), was proposed by the Finnish educational researcher Yrjö Engeström (1987). The model is a result of a two-step extension of Leontiev’s original concept of activity (that is, activity understood as the “subject-object” interaction) to the case of collective activity.
The first step, the most significant revision of Leontiev’s notion of activity as the “subject-object” interaction, was adding a third node, “community”, which resulted in a structure comprising a three-way interaction between “subject”, “object”, and “community”. This structure can be represented as a down-pointing triangle (see Figure 10). Second, it was suggested that each of the three particular interactions within the structure is mediated by a special type of mediational means. Concrete mediational means for these interactions, according to Engeström, are: (a) tools/instruments for the “subject - object” interaction (as also posited by Leontiev), (b) rules for the “subject - community” interaction, and (c) division of labour for the “community - object” interaction. In addition, the model includes the outcome of the activity system as a whole: a transformation of the object produced by the activity in question into an intended result, which can be utilized by other activity systems. The complete model is shown in Figure 11.
As an example, consider the activity of an interaction designer who works as a member of a design team on redesigning the user interface of a computer application. The object of the activity is the existing interface, and the expected outcome is a new interface. The interaction designer employs a variety of tools in her work on the object, including physical objects (e.g., computers), software (e.g., development environments), and methods and techniques (e.g., personas). The community comprises other members of the team: interaction designers, the project manager, technicians, etc. The interaction designer’s relation with the community is mediated by explicit and implicit rules, e.g., taking part in project meetings, receiving certain financial rewards, etc. Furthermore, producing the outcome of the activity system as a whole, a new interface, is the responsibility of the entire design team: the effort of the interaction designer is a part of a larger effort of the team. Therefore, the work of the interaction designer needs to be coordinated with the work of other team members. This coordination is achieved by employing a division of labour, which thus mediates the relation between the design team and its object.
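For readers who find it helpful to see the model as a data structure, the interaction designer example can be written down as a minimal Python sketch. All names here (`ActivitySystem`, `MEDIATORS`, `mediators`) are hypothetical and chosen only for illustration; they are not part of Engeström's model or of any published tool, and the single division-of-labour entry is an assumed example, since the text above does not list concrete ones.

```python
from dataclasses import dataclass, field
from typing import List

# Illustrative encoding only: which mediating node sits between each pair
# of core nodes in the activity system model.
MEDIATORS = {
    frozenset({"subject", "object"}): "tools",
    frozenset({"subject", "community"}): "rules",
    frozenset({"community", "object"}): "division_of_labour",
}


@dataclass
class ActivitySystem:
    subject: str
    object: str
    community: List[str]
    tools: List[str] = field(default_factory=list)
    rules: List[str] = field(default_factory=list)
    division_of_labour: List[str] = field(default_factory=list)
    outcome: str = ""

    def mediators(self, node_a: str, node_b: str) -> List[str]:
        """Return the mediating means for the relation between two core nodes."""
        key = frozenset({node_a, node_b})
        if key not in MEDIATORS:
            raise ValueError(f"no mediated relation between {node_a!r} and {node_b!r}")
        return getattr(self, MEDIATORS[key])


# The interaction designer example from the text.
redesign = ActivitySystem(
    subject="interaction designer",
    object="existing user interface",
    community=["interaction designers", "project manager", "technicians"],
    tools=["computers", "development environments", "personas"],
    rules=["taking part in project meetings", "receiving certain financial rewards"],
    # Assumed entry, for illustration only.
    division_of_labour=["interaction designer: redesign of the user interface"],
    outcome="new user interface",
)

print(redesign.mediators("subject", "community"))
```

The lookup is symmetric, so `redesign.mediators("community", "subject")` returns the same rules. The sketch captures only the static structure of the triangle; it says nothing about development or contradictions.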
When studying complex real-life phenomena, applying one activity system model is often not sufficient. Such phenomena need to be represented as networks of activity systems. For instance, redesigning the user interface of a computer application can be a part of an even larger-scale effort, involving several design teams, directed at developing a new version of the computer application in question. Redesigning the user interface in that case would provide a partial outcome which would need to be integrated with outcomes of other activity systems (e.g., a team developing new functionality of the product) to achieve the overarching purpose of a network of activity systems.
A key tenet of Engeström’s framework is that activity systems are constantly developing. The development is understood in a dialectical sense as a process driven by contradictions. Engeström identifies four types of contradictions in activity systems:
- Primary contradictions are inner contradictions of each of the nodes of an activity system. For instance, the mediating means used by a physician include various medications which, on the one hand, have certain medical effects, and, on the other hand, are products with associated costs, legal regulations, distribution channels, etc.
- Secondary contradictions are those that arise between the nodes of an activity system. For instance, a certain type of medical treatment may be unsuitable for certain patients.
- Tertiary contradictions describe potential problems emerging in the relationship between the existing forms of an activity system and its potential, more advanced object and outcome. The advancement of an activity system as a whole may be undermined by the resistance to change, demonstrated by the existing organization of the activity system.
- Finally, quaternary contradictions refer to contradictions within a network of activity systems, that is, between an activity system and other activity systems involved in the production of a joint outcome.
The activity system model has been employed in a range of disciplines, especially education and organizational learning (see, e.g., CRADLE, 2011).
16.3.4 Current diversity of activity theoretical frameworks
The approaches developed by, respectively, Leontiev and Engeström are currently the most common variants of activity theory. The approaches provide complementary perspectives on human activities. Leontiev’s variant mostly focuses on individuals understood as social creatures acting in social contexts. Engeström’s activity system model, on the other hand, is predominantly concerned with collective activities carried out by groups and organizations and implemented through contributions—i.e., actions—of individual subjects.
In addition, a number of other current frameworks are partly influenced by activity theory and partly built upon other approaches. Such frameworks include, for instance, instrumental genesis (Rabardel and Bourmaud, 2003), genre tracing (Spinuzzi, 2003), and the systemic-structural activity theory (Bedny and Harris, 2005; Bedny and Karwowski, 2003).
16.4 Activity theory in HCI and interaction design
16.4.1 Activity theory as a second-wave, post-cognitivist HCI theory
The dominant paradigm in HCI when it appeared as a field in the early 1980s was information processing (“cognitivist”) psychology. But the HCI community gradually came to realize that the focus on information processing was not sufficient. Individuals’ interests, needs, frustrations, and so forth, proved to be important and powerful factors in choosing, learning, and using a technology. Furthermore, it was becoming increasingly obvious that the use of technology critically depends on complex, meaningful, social, and dynamic contexts in which it takes place. The inner logic of the development of the field required that the scope of HCI be expanded to include the issues of motivation, meanings, culture, and social interactions. However, the cognitivist approach could not provide conceptual tools for dealing with such issues. When the limitations of information processing psychology in HCI became widely acknowledged (Carroll, 1991), activity theory was identified as a potential alternative theoretical foundation for the field (Bødker, 1991).
The impact of activity theory on HCI and interaction design in the last two decades has been, essentially, threefold. First, the theory offered some general theoretical insights that resonated with the need for a richer conceptual framework which would allow the field to move from the “first-wave HCI” to the “second-wave HCI” (see Cooper and Bowers, 1995). Second, it served as an analytical framework for design and evaluation of concrete interactive systems and stimulated the development of a variety of analytical tools. Third and finally, the application of the approach, especially in recent years, resulted in a number of novel systems, implementing the ideas of activity-centric (or activity-based) computing.
16.4.2 General theoretical insights
The unit of analysis proposed by activity theory, that is, “subject-object” interaction, may appear similar to the traditional focus of HCI on “human-computer” interaction. However, adopting an activity theoretical perspective had important implications for understanding how people use interactive technologies. First of all, it made it immediately obvious that “computer” is typically not an object of activity but rather a mediating artefact. Therefore, generally speaking, people are not interacting with computers: they interact with the world through computers. The book by Susanne Bødker, which played a key role in introducing activity theory to HCI, reflected this perspective on interactive technologies in its very title: “Through the interface: An activity-theoretical perspective on human-computer interaction” (Bødker, 1991).
Another general theoretical contribution of activity theory to HCI was placing computer use in the hierarchical structure of human activity, that is, relating the operational aspects of the interaction with technology to meaningful goals and, ultimately, needs and motives of technology users. It did not mean rejecting the formal models of users and tasks which were developed in early HCI research, but rather extending the scope of analysis beyond low-level interaction. Such an extension was considered by some researchers as perfectly consistent with the need of the field to move “from human factors to human actors” (Bannon, 1991).
Finally, adopting the conceptual framework of activity theory promised to open up new possibilities for analysing the context of technology use. As mentioned, the lack of conceptual tools for understanding context was a major limitation of the information-processing psychology in HCI. Activity theory, with its emphasis on society, culture, and development, offered a set of concepts for capturing the context of use and taking it into account in the design, evaluation, and deployment of interactive technologies. An edited collection entitled “Context and consciousness: Activity theory and human-computer interaction” (Nardi, 1996a) provided an in-depth exposition of a wide range of such concepts.
16.4.3 Activity theory and other 'second-wave' theories
There are both similarities and differences between activity theory and other “second-wave theories” (cf. Kaptelinin et al., 2003), such as phenomenology (Winograd and Flores, 1986; Svanaes, 2000; Dourish, 2001) and distributed cognition (Hollan et al., 2000; Rogers, 2004).
A fundamental assumption uniting most second-wave theories is that human beings cannot be understood separately from the world in which they live, act, and cognize. The need to analyse the inseparability of humans and their physical, social, and information environments is emphasized by activity theory’s notion of activity as “subject-object” interaction, phenomenology’s concept of “being-in-the-world” (Dourish, 2001), and distributed cognition’s models of the propagation of representations across the boundaries of humans and artefacts (Hollan et al., 2000; Rogers, 2004). Another key notion, common to many post-cognitivist frameworks, is that humans’ connection to the world is to a large degree determined by the artefacts used by the humans, which artefacts are variously defined in terms of mediating means (activity theory), equipment (phenomenology), or processing and transmission of information (distributed cognition and external cognition) (see also Nardi, 1996a; Kaptelinin and Nardi, 2006).
While similar in a number of important respects, "second-wave" theories are also different in their general perspectives on humans and human relation to the world. Phenomenology is relatively less interested in the issues of development and the social nature of human beings, compared to activity theory. For instance, the question of how “being-in-the-world” comes to exist in the first place (that is, how exactly we are thrown into the world) does not seem to play a critical role in phenomenology, which is in stark contrast with the attention paid in activity theory to how subjects emerge in evolution, social history, and individual development. In addition, a systematic exploration of the social dimension of being is a relatively recent development in the phenomenological tradition (Dourish, 2001), even though the need to take it into account was already emphasized, for instance, in the foundational work of Heidegger (1962).
The distributed cognition framework, at least as it is applied in HCI, is less explicit in its general assumptions about the nature of human beings and mostly focuses on concrete problems of understanding and supporting cognitive processes distributed between people and artefacts (Rogers, 2004). It is, however, apparent that activity theory and distributed cognition substantially differ in their respective views of human agency. Human agency is a major conceptual point of departure in activity theory, while in HCI research informed by the distributed cognition framework this issue does not play a significant role.
Comprehensive analysis of similarities and differences between activity theory and other post-cognitivist approaches is a complex issue, which is beyond the scope of this chapter. Such analysis can be found elsewhere. A systematic comparison of activity theory with a variety of other approaches is conducted by Nardi (1996b) and Kaptelinin and Nardi (2006). Halverson (2002) discusses activity theory and distributed cognition as theoretical frameworks for CSCW research. Rogers (2004) provides an overview of current theoretical approaches in HCI, including activity theory, distributed cognition, and external cognition.
16.4.4 Analytical tools
Activity theory is not a “theory” in the traditional sense in which “theory” is understood in natural sciences. Activity theory does not support creating and running predictive models which would only need to be “fed” with appropriate data. Instead, it aims to help researchers and practitioners to orientate themselves in complex real-life problems, identify key issues which need to be dealt with, and direct the search for relevant evidence and suitable solutions. In other words, the key advantage of activity theory appears to be in supporting researchers and practitioners in their own inquiry (for instance, by helping to ask the right questions) rather than providing ready-made answers.
A variety of analytical tools, informed by activity theory, have been proposed to support asking “the right questions” in analysis, design, and evaluation of interactive systems (Quek and Shah, 2004). Most such tools have the format of a checklist: they are, essentially, organized lists of questions or issues that researchers or practitioners need to pay attention to in order to make sure that the most important aspects of human activity are taken into account. The choice of the checklist format is intended to help bridge the gap between the theory’s high level of abstraction and the need to address concrete issues in analysis and design. Arguably, the elaborated system of concepts (and their relations) offered by activity theory can be used in HCI to better understand the role and place of concrete interactive technologies in the overall structure of purposeful, mediated, social human action. However, the framework provides a high-level description, not limited to particular types of artefacts, and needs to be specifically adjusted to the requirements of HCI research and practice. Such an adjustment can, in principle, be delegated to HCI researchers and practitioners themselves, but in many cases this strategy may not be realistic since it would require considerable time and effort. Activity theory-based checklists reduce the effort associated with domain-specific adjustment of the theory by converting the organized set of concepts, offered by the theory, into a set of concrete issues and questions, specifically related to analysis and design of interactive technologies.
Different types of such checklists are based on different variants of activity theory. For instance, the Activity Checklist (Kaptelinin et al., 1999) is intended to support systematic exploration of the “space of context” in design and evaluation of interactive technologies. The overall structure of the checklist is derived from the basic principles of Leontiev’s framework. The checklist comprises four sections (Means and ends; The environment; Learning, cognition and articulation; and Development), which are produced by combining the principle of mediation with, respectively, the principles of object-orientedness, the hierarchical structure of activity, internalization/externalization, and development. The checklist was employed in a number of design and evaluation projects (see Kaptelinin and Nardi, 2006).
Jonassen and Rohrer-Murphy (1999) introduce another analytical tool, based on a somewhat different (while partly overlapping) set of activity-theoretical concepts. The tool comprises several organized arrays of questions and issues mostly derived from Engeström’s activity system model. The basic components of the model (Subject, Object, and Community, as well as Tools, Rules, and Roles mediating the three-way interaction between the components) serve as the main rubric for issues that need to be taken into account and modeled when designing the components of a constructivist learning environment, as well as the relationship between the components. The AODM (Activity-oriented design method) approach to supporting technology-enhanced learning analysis and design, developed by Mwanza (2002), includes several lists of issues to explore, which mostly capitalize upon the conceptual structure provided by Engeström’s activity system model. For instance, the Eight-Step Model prescribes a sequence of analytical steps, starting from focusing on the activity system in question as a whole, then proceeding to each of the six individual nodes and, finally, analysing the outcome of the activity system.
16.4.5 Activity-centric computing
Adopting an activity-theoretical perspective has an immediate implication for design: it suggests that the primary concern of designers of interactive systems should be supporting meaningful human activities in everyday contexts, rather than striving for logical consistency and technological sophistication. Currently many systems fail to comply with this, seemingly obvious, requirement. For instance, traditional desktop systems organize digital resources into formal categories (e.g., files, email messages, bookmarks...) rather than according to the relevance of a resource to the task at hand, and most systems provide limited support for task switching and interruptions (Bardram et al., 2006; Kaptelinin and Czerwinski, 2007).
Activity-centric (also referred to as "activity-centred" or "activity-based") computing is an approach to designing interactive systems according to which the top priority and an explicit aim in the design of digital artefacts and environments should be supporting meaningful human activities. The work in activity-centric computing is being conducted from a diversity of perspectives; some of the key projects (e.g., Moran et al., 2005; Moran, 2006) do not employ activity theory as their theoretical foundation. It is fair to say, however, that the theory has influenced, one way or another, many (if not most) developments in the area.
An early attempt to propose an activity-centric alternative to then dominant application-centric and document-centric approaches was made by Don Norman and his colleagues at Apple Computer (Norman, 1998), who employed a somewhat modified version of Leontiev’s framework. More recently, Norman (2005) argues that activity-centred design has advantages over traditional human-centred design and should supersede the latter.
For various reasons, the attempt to introduce an activity-centric approach at Apple Computer has not resulted in the development of concrete novel technologies. However, in the recent decade a number of systems, adopting an activity-centric perspective and, for the most part, explicitly informed by activity theory, have been designed and implemented. They include, for instance, the UMEA system (Kaptelinin, 2003, see Figure 12), a variety of systems implementing the ABC framework, e.g., the Windows XP ABC system (Bardram et al., 2006, see Figure 13), and the Giornata system (Voida and Mynatt, 2009, see Figure 14). All these systems provide alternatives to, or extensions of, traditional desktop systems to enable organizing various digital resources, such as documents and URLs, around higher-level, meaningful tasks of the user, defined as “projects” or “activities”. In UMEA and the Windows XP ABC system this is achieved by automatically assigning resources to the activity selected by the user, while in Giornata a virtual desktop is set up for each new activity.
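The general idea of automatically assigning resources to the activity selected by the user can be illustrated with a small Python sketch. It is a deliberately simplified toy under stated assumptions, not a description of how UMEA, the ABC systems, or Giornata are implemented; the class `ActivityWorkspace`, its methods, and the example resource names are all hypothetical.

```python
class ActivityWorkspace:
    """Toy model: resources are grouped by the activity that was current
    when they were used, rather than by formal type (file, email, bookmark)."""

    def __init__(self):
        self._resources = {}   # activity name -> resources, in order of first use
        self.current = None    # the activity currently selected by the user

    def switch_to(self, activity):
        """Select an activity; later resource use is attributed to it."""
        self._resources.setdefault(activity, [])
        self.current = activity

    def touch(self, resource):
        """Record that a resource was used within the current activity."""
        if self.current is None:
            raise RuntimeError("select an activity first")
        used = self._resources[self.current]
        if resource not in used:
            used.append(resource)

    def resources_for(self, activity):
        """Return a copy of the resources associated with an activity."""
        return list(self._resources.get(activity, []))


ws = ActivityWorkspace()
ws.switch_to("interface redesign")
ws.touch("report.doc")
ws.touch("https://example.org/guidelines")
ws.switch_to("travel planning")
ws.touch("itinerary.pdf")
ws.touch("report.doc")  # the same resource can belong to several activities

print(ws.resources_for("interface redesign"))
```

Switching back to an activity would then let a system restore the documents and URLs used in it, which is the kind of support for task switching and interruptions that traditional desktop systems largely lack.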
The results of evaluation studies (Kaptelinin, 2003; Bardram et al., 2006; Voida and Mynatt, 2009) suggest that activity-centric systems have certain advantages over more conventional types of systems.
Table 1
|Search query||Number of hits|
|phenomenology & HCI||251|
|“activity theory” & HCI||512|
|“distributed cognition” & HCI||383|
|ethnomethodology & HCI||269|
|“situated action” & HCI||209|
|“language action” & HCI||66|
|“actor network theory” & HCI||43|
|“external cognition” & HCI||60|
16.5 Conclusions and prospects for the future
The discussion in this chapter indicates that in the last two decades, since its introduction to HCI, activity theory has established itself as a leading theoretical approach in the field. Along with some other post-cognitivist approaches, most notably distributed cognition (Hollan et al., 2000) and phenomenology (Svanaes, 2000; Dourish, 2001), it shapes the theoretical landscape of current HCI and interaction design. A number of fundamental notions, such as technological mediation, originating from activity theory have become widely accepted in the field. Table 1 gives an approximation, if very rough and imprecise, of the relative “popularity” of activity theory in studies of information technologies compared to some other theoretical approaches.
At the same time, lessons learnt from applying activity theory in HCI and interaction design indicate that the theory needs to be further developed, and there are some issues which must be addressed in the future. First, the concepts of activity theory should be more clearly specified and operationalized to make it easier for researchers and practitioners to see how the theory can be applied in concrete cases (cf. Rogers, 2004). Second, the conceptual framework of activity theory needs to be expanded to more adequately deal with coordination of multiple activities and cross-activity integration. Cross-activity integration is becoming an increasingly important issue in current uses of technology, characterized by complex social contexts (e.g., a combination of work and non-work factors typical of everyday practices of teleworkers) and employing multiple digital and non-digital technologies (or “webs of mediators” - see Bødker and Andersen, 2005). Third, as observed by Bødker (2006), HCI appears to be entering a new, third wave, during which there is a marked increase of interest in aesthetics and experience (see also Hassenzahl, 2011). The move toward third-wave HCI, according to Bødker (2006), presents a challenge for second-wave theories, including activity theory. Arguably, the conceptual apparatus of activity theory can, in principle, be employed to analyse subjective experiences, and some activity-theoretical analyses do address the issues of emotion, passion, and so forth (e.g., Vasilyuk, 1992; Nardi, 2005). However, the potential of activity theory to deal with such issues remains relatively untapped. Expanding the scope of activity-theoretical analysis in these three directions appears to be essential to make sure the theory continues to provide HCI with new insights and to help the field to deal with emerging challenges.
16.6 Where to learn more
16.6.1 Activity theory in HCI
16.6.1.1 Articles and book chapters
Kuutti, Kari (1995): Activity theory as a potential framework for human-computer interaction research. In: Nardi, Bonnie A. (ed.). "Context and Consciousness: Activity Theory and Human-Computer Interaction". The MIT Press. pp. 17-44
Rogers, Yvonne (2004): New Theoretical Approaches for HCI. In Annual Review of Information Science and Technology, (38) pp. 1-43
Wilson, T. D. (2008): Activity theory and information seeking. In Annual Review of Information Science and Technology, 42 (1) pp. 119-161
16.6.2 Activity theory in general
16.6.2.1 Online resources
- The Mind, Culture, and Activity Homepage is an interactive forum for a community of interdisciplinary scholars who share an interest in the study of human mind in its cultural and historical contexts.
- International Society for Cultural and Activity Research (ISCAR)
I would like to thank Mads Soegaard, Rikke Friis Dam, Klaus Bærentsen, Allan Holstein-Rathlou, Bonnie Nardi, and Dag Svanæs for their feedback, help, and support. Special thanks to Mads for providing numerous figures for this chapter.
16.8 Commentary by John M. Carroll
16.8.1 Activity Awareness
Activity-centric computing is exploring the idea of activity as an appropriate organizing rubric for user interfaces, in contrast to more traditional user interface designs organized around applications and data type hierarchies. This is an important direction for user interface design, and a specific design implication drawing upon Activity Theory. In my work, I am exploring the idea that collaborators need to share awareness of joint activities – in contrast, for example, to conceiving of awareness only with respect to joint actions, mutual presence, and/or shared synchronous situations.
Collaborators must attain and maintain reciprocal awareness in order to coordinate effectively (Dourish & Bellotti, 1992). Groups engaged in collaborative activities of significant scope and duration must achieve and maintain awareness of diverse aspects of their shared activity in order to coordinate effectively. For example, they must verify mutual presence and attention, which is fairly straightforward in face-to-face interaction, but often subtle, difficult, and a continuing challenge in computer-mediated collaboration. Members need to know what tools and resources they have access to, but also what tools and resources their counterparts can access. The availability of tools and resources may change throughout the course of an activity. The group must have an understanding of who among them might know potentially relevant information, or know how to do something that might be critical to the collective endeavor. Members need to know something of their partners’ attitudes and goals, and of what their partners expect from them and of the activity. They need to know what criteria their partners will use to evaluate joint outcomes, the moment-to-moment focus of their attention and action during the collaborative work, and how the view of the shared plan and the work actually accomplished evolves over time. All of these intentional variables change constantly as the task context itself changes.
Awareness in collaborative situations is sometimes regarded as a relatively discrete achievement – awareness of a task context (situation awareness), of group consensus, or of a shared mental model. These simplifications can be useful for scripted collaborative tasks, such as managing single-threaded processes or team training exercises. However, they do not address routine sources of complexity. In realistically complex tasks of significant scope and duration, the current situation is defined to a considerable extent by its history, which in turn is constantly reconstructed by the group and by its individual members. For instance, knowing how other group members respond to criticisms can have a profound effect on group discussion and argumentation. The current situation is defined also by continuous exogenous dynamics that present a constantly changing situation to the group. Indeed, if awareness were to be supported by discrete updates, it would require an unceasing torrent of information, which ipso facto could never be useful or even usable.
Shared mental models are a popular way to think about the knowledge and skills that teams use to manage collective activity. But the notion of identical copies of knowledge used and maintained by team members to enable coordination is both exotic and cumbersome as a foundation for joint endeavor. Team members who believe that they should hold exactly the same understanding of a current task might spend considerable time and effort verifying agreed-upon preconditions for action, making them less useful to their partners in action than members who have different perspectives, and who could play complementary roles and take complementary team responsibilities. Moreover, too much literal shared understanding could entrain redundant capabilities and produce teams no better than their best member. Teams with homogeneous understandings are maximally vulnerable to groupthink and stagnant thinking. Analogous to arguments regarding natural selection, the more variation that exists in a team, in individual backgrounds, mind sets and strategic approaches, the better the chances of that team to adapt to novel situations. For realistic and complex one-of-a-kind situations, such as emergency response, information analysis, and software design, creativity, learning and adaptation are critical to team performance. We are trying to articulate a sense of shared understanding among team members that is robust with respect to exogenous dynamics, and that can, in principle, leverage collaboration to produce performance better than that of any individual team member.
My colleagues and I are developing the concept of activity awareness as a programmatic analysis for the mutual awareness of partners sharing an activity of significant scope and duration (Carroll et al. 2003, 2006, 2009, 2011). Activity awareness builds upon, but transcends, synchronous awareness of where a partner's cursor is pointing, where the partner is looking, and other immediate features of a task situation. More importantly, it transcends the sharing of identical states of situation awareness or mental models. Indeed, we would argue that lower-level and simpler aspects of awareness are appropriately conceptualized as mediated by shared mental models: All stakeholders in a joint activity must have the same understanding of primitive and objective situation properties such as the document being edited, the key that was pressed, the reference of a deictic. But shared mental models are neither useful nor possible for intentional situation properties such as role-based interpretations and strategies, personal insights and perspectives, opportunistic problem solving derived from interactions with tools and other resources, value-based assessments drawing on personal histories, expectations and attributions about one’s teammates, etc.
In framing activity awareness, we appropriated the concept of activity from Activity Theory, to emphasize that collaborators need to be aware of a whole, shared activity as a complex, socially and culturally embedded endeavor, organized in dynamic hierarchies, and not merely aware of the synchronous and easily noticeable aspects of the activity. In this view, awareness is teleologically inseparable from collective regulation of a joint endeavor. Members need to be engaged with one another’s interests, values, and possibly relevant knowledge and skills, initial and current goals and motivations, criteria for evaluating outcomes, and assessments of the status and trajectory of ongoing work. This engagement is continually negotiated and developed. We articulated this continual process of activity awareness into arenas of conceptual negotiation among members of a team, a collection of ongoing interaction protocols rather than static sources of knowledge. Ours is a developmental framework in the traditional sense of Piaget and Vygotsky: higher-level facets are enabled by and resolve conflicts in lower-level facets.
When people plan, negotiate and coordinate with others in open-ended endeavors over significant spans of time, when they solve problems that are ill defined and consequential, when they stretch their own capabilities, they develop; that is, they come to experience and interact with the world in new ways. In Activity Theory, human development is a normal outcome of significant activity, but it is also profound in the sense that it qualitatively changes one's awareness of activity. As an individual develops, he or she becomes more able to understand, to reconcile, and to integrate different levels of performance and different approaches to problems by synthesizing zones of proximal development. The successive elaboration of personal perspectives further enhances each member's awareness of his or her own activity, and creates myriad new ways to construct common ground, codify practices, and build social capital. A shorthand for activity awareness is a group's awareness and regulation of its own activity.
Activity awareness is fundamentally a dynamic process, not a state of knowledge. It involves monitoring and integrating many different kinds of information at different levels of analysis, such as events, tasks, goals, social interactions and their meanings, group values and norms, and more. It involves monitoring and integrating more or less continuously to learn about developing circumstances and the initiatives, reactions, and sense making of other people with respect to on-going and anticipated courses of action. Activity awareness is not merely a matter of coordinating state information. It must be continually negotiated and constructed throughout the course of a collaborative interaction. It is a process that is constitutive of collaboration.
16.8.2 Additional References
16.9 Commentary by Clay Spinuzzi
In Kaptelinin’s conclusion, he argues that activity theory must develop to address new work organization. He says:
Examples might include university-industry partnerships (Gygi & Zachry 2010); massive multiplayer online role-playing games (Nardi 2010); coworking (Spinuzzi 2012, in press); classroom collaborations that span locations and disciplines (Paretti, McNair & Holloway-Attaway 2007); and sales engineers, who must bridge between clients and engineers (Ludvigsen et al. 2003). As Kaptelinin stated, such cross-activity work poses challenges to the conceptual framework of activity theory - and such examples are multiplying as activities become more networked.
Why is cross-activity integration such a critical issue now, and how must activity theory develop to address it? The answer lies, in part, in changes to work organization that were not anticipated during earlier stages of the theory’s development. And the challenge lies in addressing these changes while keeping the theory relatively coherent.
The foundational ideas of activity theory came of age during the industrial era, grounded in Marx’s critique of early industrialization (1990) and developed during the rapid industrialization of the Soviet Union (see especially Luria 1976). In fact, its early examples reflect agricultural and craft labor: hunting, fishing, farming, blacksmithing. But as Yrjö Engeström began developing third-generation activity theory (3GAT), he recognized that work organization is changing in “the age of information technology” (1990, p.50), i.e., in the age of knowledge work, and that we are undergoing a historical transformation in the nature of expertise, moving toward “multi-professional team and network work and expertise” (1992, p.25). More recently, Engeström has suggested that we need a fourth generation of activity theory to address such work (2009, p.310). He argues that “Third-generation activity theory still treats activity systems as reasonably well-bounded, although interlocking and networked, structured units. What goes on between activity systems is processes, such as the flow of rules from management to workers”. But, he says,
Like Kaptelinin and Engeström, others see challenges to activity theory as currently constituted (Bødker 2009; Lompscher 2006; Ruckriem 2009). For instance, Yamazumi (2009, p.212) argues that the knowledge society has shifted from mass production to interorganizational collaboration (cf. Castells 1996, 2003), resulting in “new types of agency [that] are collaborations and engagements with a shared object in and for relationships of interaction between multiple activity systems” (p.213). As Engeström puts it, “social production requires and generates bounded hubs of concentrated coordination efforts” (Engeström 2009, p.310), hubs in which interorganizational collaboration constitutes an aspect of the activity’s object (cf. Adler & Heckscher 2007; Gygi & Zachry 2010). Consequently, if we are to perform an activity theory analysis that is oriented toward knowledge work, we must examine the interorganizational collaborations to which that work contributes.
Given these changes, activity theorists are increasingly concerned with addressing knowledge work. In the past few years, at least three collections on activity theory have addressed how it must adapt to discussing knowledge work (Sawchuk et al. 2006; Sannino et al. 2009; Daniels et al. 2010), as have various monographs (Kaptelinin & Nardi 2006; Engeström 2008; Spinuzzi 2008).
As Kaptelinin and Nardi argue: “Work itself is changing. Work is more distributed, more contingent, less stable. How do we understand social forms such as networks and virtual teams that partially replace standard organizational hierarchies? ... Knowledge work usually involves multitasking and working with diverse groups and individuals” (2006, p.26). And they describe the theoretical difficulties associated with this sort of work:
So the issue is known, but the elaborated concepts are yet to be developed. As we attempt to develop them, our great challenge will be to keep the theory coherent and focused while expanding it to address such analyses.
16.9.1 Additional references
This commentary is a shorter version of the argument in Spinuzzi (2011).
16.10 Commentary by Antonio Rizzo
16.10.1 On Mediation and Play
The socio-cultural Russian approach to human psychology was the first tradition to devote special attention to tools and to assign them a specific role in human cognition. Contrary to activity theory, the so far dominant approach to cognition, the representational theory of mind, assumes that tools have no inherent meaning or intrinsic role in human cognition. In fact, all computational models of the human mind share the core assumption that the use of a tool (e.g., a hammer) requires the extraction of sensory information about object properties (heavy, rigid, etc.), which can then be translated directly or indirectly into appropriate motor outputs (grasping, hammering, etc.). Already back in 1890, William James pointed out the paradox of this position, i.e. that to perceive properties of objects we need to know in advance what the object is for:
We have to take into account that so far there is no clear experimental evidence supporting either the indirect (the semantic hypothesis) or the direct (the cognitive definition of affordance) route between perception (vision/touch) and action (Osiurak et al., 2010). An alternative vision of the relationship between perception and action has received more empirical support (Adolph, Eppler, & Gibson, 1993).
Given this premise, it is not surprising that cognitive psychology has had a very limited impact on interaction design and that the original enthusiasm for a potential contribution to the design of “information processing” artifacts (Carroll, 1989) soon disappeared following the evidence that cognitive psychology had very little or no impact at all on design practices. As a result, the relationship between psychology and interaction design was best described as a relationship for mutual opportunity to learn (Carroll, 1991).
As opposed to representational theories of the human mind, Soviet psychology attributes a special role to tools. Tools are, as remarked by Victor Kaptelinin, integral to a fundamental feature of the socio-cultural approach, namely the process of mediation. And it is specifically on mediation that I would like to elaborate on Victor Kaptelinin’s description and report on my own experience of the role of mediation in the design process.
16.10.2 Mediation and its genesis
Mediation is a central aspect in Vygotsky’s human psychology - constantly present in all his dynamic theoretical elaborations. The specific role that Vygotsky assigns to tools can be summarized in this principle: human mental processes can be understood only if we understand the tools and signs that mediate these mental processes. Yet, in order to understand the mediation process we have to consider its relationship with the other two main topics of Vygotsky's approach, that is, the social genesis of human cognition and the developmental (genetic) method (Wertsch, 1985).
The starting point of my argument is the difference between mediated and non-mediated activities - a distinction that is related to the difference between elementary and higher mental functions: The central characteristic of elementary functions is that they are totally and directly determined by stimulation from the environment. For higher functions, the central feature is self-generated stimulation, that is the creation and use of artificial stimuli, which become the immediate causes of behavior (Vygotsky, 1978; p.39)
Mediation is usually presented as a way of transmitting existing cultural knowledge. Here I will argue in favour of the other role of mediation (cf. Bødker & Klokmose, 2011), namely that of producing new meanings (and thus knowledge) by means of transforming the objects we interact with: the creation and use of artificial stimuli. This role I trace back to the seminal work of Vygotsky on children’s play (1933/1982).
It is worthwhile to note that Vygotsky - in order to present his ideas about children’s play - takes his point of departure in a very early idea of affordance, that of Kurt Lewin’s valence. Vygotsky quotes a study carried out by Lewin where it is shown how very young children in the attempt to exploit the opportunities for action offered by a stone exhibit a behavior that is strongly determined by the conditions in which the activity takes place.
In the following videos we can observe the original recording of Kurt Lewin’s study. The first video presents the interaction with the stone by Hannah (19 months old), while the second video shows the performance of Hans (who is older than Hannah).
Video 1: Hannah is one year and seven months old. The stone has a positive valence in the momentary living space of the child. The child is attracted by the stone. In order to sit down, the child has to turn around, that is away from the goal. This detour to reach the goal is extremely difficult for children.
Video 2: Hans solves the problem in an intelligent fashion. He does not lose sight of his goal.
For Vygotsky the interaction exhibited by the two children and the description provided by Kurt Lewin is a real illustration of the extent to which a very young child is bound in her/his action by situational constraints. He states:
Instead, for Vygotsky:
However, here comes an interesting consideration, namely that in the genesis the separation is not totally arbitrary:
Indeed, there is experimental evidence in social play to support that meaning is understood not by the shape, colour or other features of the objects involved in the activity but by the actions the object allows to be performed (Szokolsky, 2006).
It’s the action pattern that provides the cue for what is intended, not the objects. Indeed there is evidence that very young children (12 – 18 months old) imitate significantly more often when the pattern of action performed by an adult involves (is mediated by) an object compared to a condition where the same pattern of action is executed without any object (Rizzo and Carnesecchi, 2011). Pretend play is a privileged way of staying in touch with the environment as well as stepping out of the environment to mentally modify it.
For me, the most dramatic example of this was provided by the children involved in the design of POGO world (Rizzo et al. 2003; Decortis and Rizzo, 2002).
Together with Françoise Decortis and Patrizia Marti, I was in charge of mocking up and testing (role prototyping through dramatization) the design concepts produced by the Domus Academy. What we observed was that the kids had no problem at all overruling the intended use of a mock-up and producing new opportunities for action and new relationships with existing objects.
In the POGO dramatization we observed how children’s behaviour was guided by a merging of the sensory-motor affordances of the mock-ups and the meaning of the current situation. The same object, for example the pogo torch (a device to capture and project sounds and images), was used as a way to talk to yourself in one situation or as a way to move very large objects in another situation, according to the meaning the children were negotiating in their play.
Video 3: A short clip of different situations where children use the first generation of POGO mock-ups.
This resulted in new functionalities that were not anticipated by the design team. The attempt to give new objects with specific functionalities to children in order to see what role the objects would play in their activity did not work. As the children did not get directions by a teacher, they appropriated the tool to the game they played, thus producing new meanings on the fly. These new meanings were however linked to the actions (movements) made feasible by the objects.
It was pretty clear that we, the designers, were thinking about functions and they, the children, were negotiating and producing meaning through their activity: Meaning comes first, function later.
An interesting observation was that new opportunities for action introduced by the children (that is, specific manipulations of the torch or of the mambo) had a deontic power that sensorimotor affordances did not have: Children imposed the new opportunities for action onto their peers: “Nooo, it is not that way... look!” A situation never observed for any sensorimotor affordance (ways of grasping, pushing or waving).
Such observations inspired me to look for theoretical elaborations of the concept of affordance, which led me to Michael Tomasello’s idea of intentional affordances and their role in children’s cultural learning. Tomasello notes that children are involved in intentional mirroring processes (imitation and in some sense emulation) and that, through these processes, they start to perceive objects and artifacts as elements that evoke a set of affordances beyond basic sensory-motor affordances:
This way of producing and sharing new objects’ attributes had a profound impact on the design of POGO and subsequently influenced the whole design strategy:
Most of this was done in playful sessions involving designers, psychologists, and teachers (and sometimes selected children) where a mix of mock-ups, existing objects, and semi-working prototypes was put on stage following a hint script and improvisational theatre techniques (Rizzo and Bacigalupo, 2004). All the materials were used to explore new territories of interaction, and the ease with which the materials propagated among the team members was also tested.
A few years later, Banzi, the team leader and inventor of Arduino at Ivrea, made a similar point (although not mentioning the social dimension) by introducing the term Tinkering to the interaction design community (a term originally coined by Francois Jacob in 1977 in biology):
One of the best definitions of tinkering, as also acknowledged by Banzi, is the definition provided on the former website of the Exploratorium Museum in San Francisco:
Tinkering is, at its most basic, a process that marries play and inquiry.
I believe that to “tap into” the heuristic power of Activity Theory we need more analytical tools that, on the one hand, impact design processes and, on the other hand, may have an impact directly on the artifacts we design. To my mind, there is plenty of room at the interplay between sensorimotor and intentional affordances (Rizzo et al., 2009); a room that combines the dynamic and evolving relationship between non-mediated and mediated action, a room that needs to be explored and exploited in the design of human interaction with her/his environment.
16.10.4 Additional References
16.11 Commentary by Stephen Voida
In this chapter (as well as in numerous previous books and articles), Kaptelinin provides a thoughtful and comprehensive review of Activity Theory, its history, and some of the many ways that the conceptual framework has been taken up and appropriated by the CHI and interaction design community.
Whether explicitly acknowledged as such or not, Activity Theory has had a significant impact on the way that researchers and practitioners have approached the design and evaluation of interactive systems in the mobile and ubiquitous computing era. Bannon and Bødker introduced Activity Theory to the interaction design community at around the same time that Weiser published his seminal article establishing the vision for ubiquitous computing (Weiser 1991). While this wasn’t a coordinated or intentional effort, both researchers were responding in their own way to the limitations of computing technology and the way that we conceptualized people’s relationships with computers at the time. The desktop computing paradigm of the early 1990s placed practical limitations on the contexts in which human–computer interaction could occur, but the movement towards making computers smaller, more mobile, and more often embedded into other objects made it clear that computational tools would soon permeate the everyday world and play a much more significant role in all kinds of human activities. Activity Theory broke from the established theories of interaction-as-dialog and cognition-as-information processing to provide a lens for understanding how humans might interact with ubiquitous computational technologies in a much greater breadth of contexts beyond number crunching and word processing in the workplace.
One of the most striking things about the relationship between Activity Theory and HCI is the framework’s continued success and longevity as a relevant way of thinking about the mediating role of computational tools in the face of a dynamic and rapidly evolving technological landscape. Not only has Activity Theory been adopted as a general-purpose analytic tool within HCI (cf. section 16.3.4, above), the conceptual framework has been extended to better support reflection about the temporality and interconnectedness of activities in knowledge work organizations (Boer, van Baalen & Kumar 2002) and to incorporate the notion of external environmental factors (Döweling, Schmidt & Göb 2012). It has also been used as the basis for new methodologies that aim to make sense of empirical data collection carried out in complex, collaborative work environments, such as hospitals (Bardram & Doryab 2011).
But perhaps the most significant re-purposing of Activity Theory has been in re-casting what was primarily an analytic, inspirational, and discursive tool to one that has served as a guidepost in both the design and implementation of interactive systems. While the early command-line and windowed GUI interface paradigms were largely focused on supporting the creation or manipulation of a single file, document, or electronic artifact at a time, Activity Theory challenged the premise that computational support should focus on interaction with a single, decontextualized document at a time. Activity Theory’s emphasis on articulating the dynamic, at times complex, and occasionally conflicting relationships among subjects, tools/artifacts, and social/environmental context has influenced both the structure of various personal information management and desktop interfaces (including those enumerated by Kaptelinin in section 16.3.5) and the underlying data representations that are used to organize electronic artifacts and support further exploration of what has become known as the “activity-based” or “activity-oriented” computing movement.
In my research, I have found that the theoretical framework provided by Activity Theory, and particularly the modern instantiations articulated by Engeström (1987) and Boer et al. (2002), aligns and resonates surprisingly well with empirical observations of the ways that information workers organize their workspaces (e.g., Malone 1983), the ways that they transform themselves in the process of carrying out information work (e.g., Kidd 1994), and how they handle transitions among and interruptions within ongoing activities throughout the work day (e.g., González & Mark 2004). In the systems that I built to support information work and explore the role of activity in interface design, these points of resonance helped to shape both the systems’ interface design and their underlying data structures. The Kimura system (MacIntyre et al. 2001) displayed interactive visualizations of ongoing activities on an electronic whiteboard to facilitate multitasking and activity awareness. Each of the visualizations brought together representations of the computer application windows (mediating tools) that had been used over the course of the activity, along with icons representing the people (community) with whom further collaboration would be required in order to bring the activity to a successful conclusion. The Giornata system (Voida, Mynatt & Edwards 2008) extended Kimura’s model of activities to include discrete electronic resources and broadened the system’s focus from primarily supporting multitasking to also facilitating collaboration and evolving personal information management practices.
The data structure behind each activity in Giornata encoded a user-generated series of tags describing the current goals or meaning of the activity (which could change over time); a flexible set of documents and applications, both live and archival, representing the computational tools used to mediate, transform, and generate information content; and a “palette” of contact icons allowing quick access to and information sharing with the other people associated with the activity. Over the course of this research, I identified a number of key challenges that are brought to the fore when an activity-theoretical perspective is used in the design process for both desktop and ubiquitous computing systems (Voida, Mynatt & MacIntyre 2007, Voida 2008). Like Kaptelinin, Nardi & Macaulay’s activity checklist (1999), these challenges serve as scaffolding to transition between various facets of the Activity Theory framework and the articulation of concrete system requirements.
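As a rough illustration of the kind of record described above, the following Python sketch bundles changeable tags, a set of live and archival resources, and a palette of contacts into one structure. It is not Giornata's actual implementation: the class names `Resource` and `Activity`, the field names, and the `retag`, `archive`, and `live_resources` helpers are all hypothetical and introduced only for this example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Resource:
    """A document or application tied to an activity (hypothetical type)."""
    name: str
    kind: str           # "document" or "application"
    live: bool = True   # False marks an archival resource


@dataclass
class Activity:
    """Illustrative activity record: tags, resources, and a contact palette."""
    tags: list[str] = field(default_factory=list)
    resources: list[Resource] = field(default_factory=list)
    contacts: set[str] = field(default_factory=set)
    # Earlier tag sets are kept, since an activity's meaning can change over time.
    tag_history: list[tuple[datetime, tuple[str, ...]]] = field(default_factory=list)

    def retag(self, *tags: str) -> None:
        """Replace the current tags, recording the previous set in the history."""
        if self.tags:
            self.tag_history.append((datetime.now(timezone.utc), tuple(self.tags)))
        self.tags = list(tags)

    def archive(self, name: str) -> None:
        """Mark the named resource as archival rather than live."""
        for resource in self.resources:
            if resource.name == name:
                resource.live = False

    def live_resources(self) -> list[str]:
        """Names of the resources still in active use."""
        return [r.name for r in self.resources if r.live]


# Example usage with made-up data.
chapter = Activity()
chapter.retag("book", "chapter draft")
chapter.resources.append(Resource("chapter16.doc", "document"))
chapter.resources.append(Resource("word processor", "application"))
chapter.contacts.add("co-author")
chapter.retag("book", "chapter revision")   # the goal of the activity has shifted
chapter.archive("chapter16.doc")
```

Keeping the tag history alongside the current tags is one possible way to reflect the point that the goals or meaning of an activity may change; a real system could just as well version the whole record.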
Even though Activity Theory has been part of the theoretical tool belt in HCI for nearly two decades, there are still areas in which the flexibility of the framework raises practical issues about how to apply its concepts most effectively. For example, the hierarchical structure of activities and the inherent variability in the granularity at which people describe (and organize) their ongoing activities sometimes make it difficult to adequately model the relationships at play in empirical data, especially across multiple individuals or multiple activities. Although an Activity Theory analysis may be carried out at any of the levels of the hierarchy (e.g., operation, action, activity, or aggregate/higher-level activity), different people tend to articulate their activities at different levels of detail, depending on, for example, the scope (i.e., complexity, anticipated duration, or importance) of the activity, the person’s role in the activity, and the perceived expertise or familiarity level of the interviewer/listener. Likewise, computational systems that aim to explicitly model activities as part of the user experience often cannot anticipate at what level of detail individuals might wish to represent their work, for example, “editing a revision of a chapter” versus “writing a book” (Voida, Mynatt & MacIntyre 2007). These differences sometimes make it difficult to anticipate what specific types of computational support might be appropriate or helpful at a given time. Furthermore, facilitating collaboration through computational activity representations when participants have created representations of their work at different levels of granularity can be problematic.
One of the relatively underutilized aspects of Leontiev’s framework in interaction design is his focus on the continual development of activity systems. Historically, most personal computing systems have been designed to represent the state of a data structure at a particular point in time (the present). Representations of temporality have tended to exist as simple linear state-management tools (e.g., undo and redo) or very formal representations of milestones and revision numbers, such as those found in version control and transaction-based database systems. In most conventional operating systems and mobile computing platforms, users must maintain their own representations of information-through-time, constructing and maintaining their own artifact histories, often by duplicating documents at each milestone or using auxiliary information systems (like e-mail) to archive the state of a document through multiple points in time. Providing better solutions for these actions is becoming ever more urgent as computational systems enable the creation and sharing of more and more content through an increasing diversity of platforms and online services. If our computational systems were to better reflect the activity-theoretical idea that activities continually develop and evolve, we might arrive at better support for information management, long-term information curation, externalization, and routinization. However, the interaction techniques for dealing with this kind of temporality are neither widely used in mainstream computing systems nor as familiar as the typical application- and document-centric interaction paradigm.
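The granularity problem can be made concrete with a small sketch. Assuming, purely for illustration, that activities are stored as a tree of nested units, two collaborators who describe the same work at different levels of detail can be reconciled by finding the deepest unit that contains both descriptions. The `Unit` class, the `path_to` and `shared_level` helpers, and the intermediate labels below are hypothetical; only the two quoted activity descriptions come from the text.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Unit:
    """One unit of work at some level of the hierarchy (hypothetical model)."""
    label: str
    parts: list["Unit"] = field(default_factory=list)

    def path_to(self, label: str) -> Optional[list[str]]:
        """Return the labels from this unit down to the named one, or None."""
        if self.label == label:
            return [self.label]
        for part in self.parts:
            sub = part.path_to(label)
            if sub is not None:
                return [self.label] + sub
        return None


def shared_level(root: Unit, a: str, b: str) -> Optional[str]:
    """Deepest unit that contains both descriptions, or None if either is unknown."""
    path_a, path_b = root.path_to(a), root.path_to(b)
    if path_a is None or path_b is None:
        return None
    common = None
    for x, y in zip(path_a, path_b):
        if x != y:
            break
        common = x
    return common


# One person talks about the whole book, another about a single editing pass.
book = Unit("writing a book", [
    Unit("writing chapter 16", [Unit("editing a revision of a chapter")]),
    Unit("preparing the bibliography"),
])
level = shared_level(book, "writing a book", "editing a revision of a chapter")
```

The sketch only works when both descriptions already sit in one agreed hierarchy, which is exactly what cannot be assumed in practice; it shows the shape of the problem rather than a solution to it.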
16.11.1 Additional References
16.12 Commentary by Klaus B. Baerentsen
16.13 Commentary by Ellen Christiansen
Designers need tools for understanding the context of whatever new artifact they are working on. Certainly, among academically trained interface designers, activity theory has been instrumental to such understanding since Bødker’s book ‘Through the Interface’ came out. It was, as pointed out by Kaptelinin in this chapter, the very same year (1991) that Carroll, in a book with the title ‘Designing Interaction’, moved the HCI agenda from ‘interface’ to ‘interaction’, while pointing out that for this purpose, designing interaction, cognitive psychology alone fell short as a foundation for understanding context.
Bødker’s book ‘Through the Interface’ established the ‘subject-instrument-object’ relationship as an indispensable syntax for legitimate sentences in the field of HCI. She made it clear that the interface is not an endpoint, but a window to a world of activity, which people are inclined to embark on anyway. Hence, the success of an interface depends on the degree to which the interaction brings this world closer. Since then, humans have come to experience interaction in an explosion of ways, in digital arenas for artwork, gaming, and amusements of all sorts, to the extent that today we have a hard time telling what, regarding interaction, is text, and what is context, what is work and what is leisure.
Hence, when, in the field of HCI, activity theory has matured to take a position among what Kaptelinin calls “the visible landmarks of the theoretical landscape of Human-Computer Interaction”, we may ask if activity theory is more than a memorial. From a practitioner’s point of view, at least, we may ask, not what activity theory delivered to yesterday’s HCI designers and design thinking, but what it has to offer the designers of tomorrow.
Given that digital technology pervades all aspects of human life, almost as the air we breathe, it can be tempting to dismiss questions about context altogether: Designers were never able to predict use, and who would know today which needs to fulfill tomorrow? Soon it will be hard to distinguish a non-robotic human from a robotic one, Human-Computer Interaction may dissolve, interaction designers may work in global app-stores, and the only thing we know for sure is that we breathe, and that the stock market is a roller-coaster.
More persistent than these fluctuations, however, is the fact of the pendulum: what goes up must come down. While, at this moment, rational thought and critical reflection about context seem out of fashion, in the next moment, which may well be soon, designers will experience a craving for being able to reduce complexity of context in ways they can comprehend, communicate, criticize, and improve. To satisfy that hunger, the concept of activity, and the models for analysis presented in Kaptelinin’s chapter 16 on ‘Activity Theory’, will be a place to start to feed. Here you find key concepts for understanding the way humans interact with the world: ‘tools’, ‘mediation’, and ‘development’ in relation to ‘action’, ‘activity’ and ‘operation’, modeled in a hierarchical structure, a key rack on which to hang your experiences. Taken as a tool for designerly thinking, activity theory will help designers to communicate, sort out, categorize and evaluate experiences of any kind imaginable - definitely not an end-point, but possibly an access-point for communication about artifacts, of a trustworthy kind.
Not only is Victor Kaptelinin extraordinarily well read in the research literature on how and why to apply activity theory, he is also sufficiently experienced as a design practitioner to know designers’ needs regarding tools for thinking. Therefore, his account of activity theory in this chapter provides both a good blend of key-rack models and a scholarly grounding of the theory behind the key-racks.