CHAPTER 2: LITERATURE REVIEW

2.1 MULTIMODALITY IN EDUCATION

Allow me to introduce the literature review with this short video and a few words about the work ahead.

2.1.1 Multimodal meaning making

Society is becoming more global and individuals are increasingly inter-connected. Technology is becoming ubiquitous and mobile devices are now a constant presence in everyday life, and we are witnessing a change in how people interact with each other and the world around them. Communication is increasingly shaped by new technologies and new ways of interacting with and deriving meaning from the environment around us. New aspects of communication are emerging from new ways of dealing with text, image, action, and sound.
These advances in technology have implications for how people learn and for how technology can be incorporated and explored in education.

Over the last 20 years, we have seen an increase in new digital formats, incorporating the use of images, video and sound, and challenging the dominance of the printed written form and moving towards the medium of the screen (Mills, 2015). These changes in the communication landscape call for a review of traditional pedagogies, moving towards a multimodal approach to education.
In the following video clip, Burruss (2016) gives a very brief introduction to the need for a multimodal approach to literacy. The author proposes that the growing presence of digital media requires a rethinking of what literacy is. She is not proposing the elimination of text, but rather the inclusion of different modes along with text. Images and video are not mere illustration; they are an integral and equally important part of the message – and this shift necessitates enabling students to become fluent in multimodal learning:

2.1.2 The use of multimodality in education

The use of multiple modes of making meaning can apply to education. Reading and writing today increasingly involve a growing variety of forms, including images, digital forms, interaction, hyperlinks, and new web formats to be viewed on increasingly accessible mobile devices. Students have become producers of content to be shared on social media platforms, and they are generating a greater variety of texts that include different media, such as videos, images, and audio. For educators, these changes call for a new understanding of the different modes, their affordances, and their potential use in the classroom to aid the learning process (Cope & Kalantzis, 2016; Jewitt, 2005; Mills, 2017).

Currently, an increasing number of scholars and practitioners are conducting research on a variety of multimodal products and text types as media literacies, such as children’s animations, multimodal mapmaking, social media, and video gaming, to name a few (Gee, 2004; Hull & Nelson, 2005; Rowsell & Walsh, 2011; Vasudevan, Schultz & Bateman, 2010). Researchers are increasingly investigating new possibilities of digital spaces and multimodal practices.

Mills (2015) sees a shift in how reading is becoming progressively more interactive, where readers are now often engaged and responding to the content they encounter; and writing now has a much larger and immediate potential audience with the global connection afforded by the Internet. Both reading and writing can now be simultaneous activities in texting and social media, in turn requiring new ways of thinking about the skills necessary to function in this multimodal environment (Mills, 2015).

Mills (2015) also stresses the importance of addressing the social dimensions within education that draws on multiple modes of meaning making, such as power relations and situated practice. She points to the issues of diversity and making sure all groups have a voice; and the role of multimodal curriculum design in addressing these concerns. In the following video, Mills (2015) shares the work produced by a group of students in Australia, where they explore their cultural identities:

Jewitt (2005, 2008) looks at how reading and writing are changing through interaction with the computer screen. Image, sound, and movement interact on the screen and reconfigure the relationship between text and images. The author believes writing is increasingly incorporating more visual elements and images are increasingly gaining center stage, challenging the salient place of the written word in education. Bezemer and Jewitt (2010) propose an increase in multimodal approaches to educational research, with images that serve not just as decorations to the page, but as semiotic resources in their own right.

Jewitt (2008), however, also reminds us that much research has been carried out on language and the visual and verbal modes of communication, but substantially less has been studied about the use of sound, gestures, movement, and other modes of communication. While research on multimodal ways of making meaning has tried to account for different ways of representing knowledge and meaning, there are particular modes, such as taste, smell, and touch, that have received less attention than the contrast between the written and the visual modes. There is great potential and a serious need for looking into and accounting for the role of all the senses – touch and movement, smell and taste, tactile, and kinetic – in learning with multimodal resources (Jewitt, 2008; Mills, 2017). In the following video, Jewitt discusses the layering of different modes to convey ideas and make sense of the interaction:

One group of researchers who studied the use of touch is Simpson, Walsh and Rowsell (2013). They looked at how mobile devices, such as iPads, allowed students to take multimodal, multidirectional paths in their reading and meaning making. The researchers in this study were particularly interested in the mode of touch and how it affects students’ reading paths. They studied several dyads of students working together and collaborating in reading assignments using iPads. In each case, it appeared clear that students used touch both to follow multidirectional paths and to help and share information with each other. These authors were interested in learning how students used a series of movements, such as touching, tapping, and sliding, to follow non-linear pathways for reading and writing. They looked at the choices of movement made by the students and how touch itself influences how students work and how they make sense of the material to be studied.

Previous research on learning styles has pointed to the importance of kinesthetic learning modes; however, research on the specific role of touch in reading with mobile devices is still very limited (Mills, 2015). Given the expansion of touchscreen technology, new research is needed to understand how touch becomes an integral aspect of the learning process and how students use touch to interact with the content. In these new digital spaces, the use of touch enables students to explore different pathways and the dynamic multidirectional nature of reading becomes apparent. More research is needed on the role of touch in digital reading practices (Simpson et al., 2013).
Luke (2003) proposes a departure from the traditional pedagogy as a linear process of passive reception and mastering of the dominant modes of information, moving instead to the incorporation of digital technologies and the new modes of textual practice. Luke suggests that the classroom should not be a departure from the type of communication people do every day in the world. The classroom should not discourage children from exploring, mixing and reinventing from the diverse sources and modes they are now encountering outside school. Luke’s work also follows a social constructivist theory of education as a departure from the traditional pedagogy of linear knowledge transmission from teacher to student. Technology for Luke is not at the center of the learning process; computers are simply one resource among many. Rather, a critical, learner-centered constructivist pedagogy is proposed, with self-reflection and analysis of the socio-cultural context where learning takes place (Luke, 2003).

For Hull & Nelson (2005), the power of multimodality is not simply the addition of images and music juxtaposed to text and thus increasing the meaning of such text. Rather, the authors propose what they call a “process of braiding” (p. 225), or “orchestration,” where the combined total transcends the sum of its parts – offering different kinds of meaning (not possible through each isolated mode alone).

Selfe (2007) offers some guidance for teachers trying to incorporate a multimodal approach in their teaching, along with the following graph showing some of the challenges faced by teachers new to multimodality.

Figure 1: Selfe - Challenges teachers face

2.1.3 Classroom research on the use of multimodality across subject matter

Here is a call for the inclusion of multimodality in the classroom:

An increasing number of studies have been conducted to investigate how multimodality is employed in the classroom. Marchetti and Cullen (2016) describe a couple of studies focused on the use of different visual materials in the language classroom. In their studies, the authors claim that students showed preference for learning new language from images and other visual materials, rather than text alone. The majority of students preferred to work with the aid of visuals for vocabulary learning as well as for speaking activities in class. The authors concluded that it is useful to use a variety of modes for students to interact, and the alternation of different modes at different times over the course of a lesson seemed beneficial. Students responded positively to the addition of images and external audio and to the interaction of the different modes in the learning process, which assisted in the acquisition of new language forms (Marchetti & Cullen, 2016).

In another study, Dusenberry, Hutter and Robinson (2015) describe the use of multimodality in a technical communication course. The authors describe three different examples of multimodal student work: the production of infographics as part of a writing assignment; conducting and incorporating data from research interviews; and preparing a software presentation. The authors attempted a departure from the traditional linear school composition to a new model where the students have agency in their own learning and are required to effectively filter, evaluate, and remix information to create new multimodal artifacts that demonstrate their learning, and to learn to communicate with a diverse audience. In a layered and multidirectional approach, students are both consumers and designers. Students were asked to filter information generated from diverse sources and modes; then, considering the needs of their intended audience, they had to negotiate between the content and the mode of transmission to find the appropriate medium and form. Although the authors admit that students expressed frustration at times, they reported positive outcomes in the end (Dusenberry et al., 2015).
Smith, Kiili and Kauppinen (2016) looked at how university students build arguments in written essays versus multimodal videos. They analyzed how the different modalities enable students to produce different types of arguments and offer different affordances to students. In their study, the authors found that the more traditional and familiar mode of the written essay allowed students to follow pre-established paths to form well-structured and balanced arguments and counter-arguments. In contrast, the multimodal assignments provided more freedom and greater creativity on the part of the students to create unique arguments drawing from a greater variety of resources, such as including music to elicit specific feelings from the audience. When the students worked with the less familiar multimodal video elements, they demonstrated greater awareness of their audience in building their arguments and used multiple perspectives to build their case. Smith et al. (2016) did not propose to eliminate the written essay from the curriculum; rather, they argued that while the written essay offers a stable and familiar space for students to practice building a well-balanced and well-structured argument, the multimodal approach offered students more flexibility to try to appeal to their audience. In this study, the works produced by the students using the different modes were not just repetitions of each other, but in fact allowed students to do different things, be more creative and add new layers to their work and the message being conveyed (Smith et al., 2016).
Vasudevan, Schultz, and Bateman (2010) worked with composition writing in school. They engaged in an ethnographic study of a multimodal storytelling project with a fifth-grade group of students. The authors found that as students learned and used new forms of composition, they began to develop new literate identities. When the composing process was extended beyond the traditional written-only forms, students’ work shifted in significant ways to reflect their increased engagement within the classroom, and the students in turn started to see greater levels of success in the academic setting – through their engagement with multiple modalities of text production, the students shifted their own modes of participation in the school curriculum. The students’ choice of and engagement with the different modes of communication opened up new ways of engaging with their environment and with school in ways not possible before. They gained new voice and were able to tell stories that could not be told through writing alone (Vasudevan, Schultz, & Bateman, 2010).

Warschauer and Liaw (2010) look at the use and application of emerging technologies to adult literacy and language learning. The primary focus of the authors is adult education, in particular those students trying to learn a new language due to immigration (second language acquisition). The authors note that multimodality is nothing new in language learning. Because language learning focuses on communication, audio recordings have been used in the language classroom for decades, in the form of audio-cassettes and later CDs. Part of the difference now with new emerging technologies is the potential for students to become co-producers of content, from podcasts to new audio and video production tools, collaborative writing, blogs, wikis, online networking, virtual environments and games. However, the authors find that many adult-education teachers (and learners) still feel uncomfortable with technology and lack experience in using it in their classrooms. They conclude that teacher professional development in the area of technology integration is still lagging behind and should be addressed (Warschauer & Liaw, 2010).

In the following video, Burruss (2016) shares the work of one of her students and the student’s delight when her video reached 300 views, something unheard of when typical school work has an audience of just one teacher:

2.1.4 Criticism and caution

Although much has been written about the benefits of using multimodal meaning making resources in education, not all studies on the use of a multimodal approach in the classroom have reported positive findings. Fadel and Lemke (2008) review a number of research and case studies of multimodal practices in varying classroom settings. Their findings showed overall positive, albeit sometimes mixed, results on the use of various multimodal teaching activities. Overall, they found that students who engaged in multimodal practices, on average, outperformed students who engaged with one single mode; but the authors also described some situations in which not all multimodal activities were equally beneficial. Fadel and Lemke (2008) speculate that unless students have been trained to use and understand visual input, the impact of multimedia may be reduced. The authors conclude that pedagogy is more strongly correlated with achievement than media usage alone.

When analyzing research on multimodal learning, Fadel and Lemke (2008) also look at how students process information and whether they learn better by listening, reading, speaking or doing something. The authors question the often-cited statement that students learn better when doing something; they claim this belief is unsubstantiated and does not hold up against the results of numerous research studies. The authors conclude that doing is not always more efficient than seeing, and seeing is not always more effective than reading. Fadel and Lemke (2008) stress that informed educators should understand the different affordances and the optimal balance of modes; that curriculum design depends on and varies across content, context, and the individual learner; and that the answer is never one-size-fits-all.

Multimodality has also encountered some criticism from authors who feel multimodality has been used by teachers and policymakers without much understanding of the theory or the ideas behind it (Bazalgette & Buckingham, 2013). The authors caution against oversimplification and the tendency to use the term multimodal to mean just about anything that involves the use of computers. They see the term being used to promote the inclusion of digital materials in the classroom without critical consideration of content or understanding of what multimodality really means. The educator needs to be a professional with a repertoire of instructional choices to meet the needs of varied students and subject matter.
Bazalgette and Buckingham (2013) also provide a critique of Kress’ analysis of the value and use of multimodality. They criticize much of the literature on multimodality as being limited to stating that different modes have different affordances, while failing to provide a clear analysis of each affordance or of how the different modes should be used. This remains an important point. The authors warn against treating multimodality as a panacea or silver bullet that can solve all the problems in education, against oversimplifying the term to mean just about anything, and against assuming that simply incorporating different media into lessons will take care of everything (Bazalgette & Buckingham, 2013).

Another important point brought up by Mills (2015) is what she calls the dark side of learning with multimodal digital resources. Teachers must be knowledgeable about data mining, and aware of the types of information that can be gathered from students, shared and used by others. This brings up many potential issues related to security of information and ethical considerations regarding who owns the data and what can and cannot be shared with others. Security risks associated with new technology advances go beyond the scope of this study, but should be acknowledged as something teachers and students must learn to contend with and mitigate. The current advances in technology and data mining carry with them significant implications for teachers and policy makers. Technology and its multimodal affordances present great potential to be explored, but also risks that need to be understood and considered.

2.2 MULTIMODALITY IN ACADEMIA AND DISSERTATION WRITING

Scholars are conducting a lot of research on the topic of multimodality, but not necessarily producing multimodal works themselves. In this video I provide a brief introduction to the literature review regarding multimodality in academia and in dissertation writing:

In spite of all the recent advances in technology, academic work is still predominantly done in the traditional print text format. The number of scholars who are actually trying to produce multimodal works is still somewhat limited (Archer, 2010; Ball, 2004). The same limitations and challenges are felt by doctoral students trying to produce multimodal dissertations, who face numerous obstacles, from doubts and limitations imposed by their dissertation committees to specific requirements for depositing the work in the school archives (Adams & Blair, 2016; Andrews & England, 2012; Davidson et al., 2009).

In this section we turn to the question of what counts as scholarship, and how to evaluate multimodal research. I start by looking at some of the limitations and challenges faced by those trying to conduct and publish multimodal works. First, I look at academic work in general and the standard for academic publishing, the peer-reviewed journal, and how it shapes the expectations of researchers and those who wish to pursue a career in the university setting. I then look at how some of these same challenges and arguments also serve to limit and restrict the pursuit and creation of multimodal works among doctoral students working on their dissertations. I review a few existing examples of multimodal doctoral dissertations and how they can serve to inform those trying to incorporate multimodality in their own work. Finally, I look at the question of how to evaluate multimodal works and the need to further expand and define a framework for how to assess the quality, validity and reliability of multimodal scholarship.

If we accept the premises of the New Learning Theory (Cope & Kalantzis, 2016) and the importance of incorporating a multimodal approach in education, then it stands to reason that scholars, too, should be conducting their own work using different modes. However, there is a gap in the literature about multimodality with regard to the existence of multimodal works. As stated before, a lot has been written about multimodality, but the number of multimodal scholarly works, especially in education, is limited; and examples of multimodal dissertations in Education are still hard to find.

Anderson (2006) confirms:
“Scholars who compose (or want to compose) multimodal texts to advance knowledge in the field still face significant hurdles as to whether such work will count towards tenure or promotion. In addition, the dichotomy between support for teaching multimodal composition and researching (i.e., producing) multimodal composition as scholarship needs to be examined so that schools recognize this disparity between what instructors are able to teach versus what they are able to research.” (p.79)
The same obstacles faced by scholars trying to produce multimodal research and publish multimodal works are also faced by doctoral students trying to produce a multimodal doctoral dissertation. Most universities in the US still have a very narrowly prescribed format for what a doctoral dissertation should look like, and many professors who advise doctoral candidates have been slow to embrace multimodality.

2.2.1 Multimodality in scholarly work

One of the first challenges one encounters when trying to look at multimodality in academia is the question of terminology. Ball (2004) calls attention to the confusion of terminology in relation to what new advances in technology enable people to create. The term digital text is often used, but may simply refer to a traditional written text being reproduced as a PDF file for archiving purposes. Online scholarly publications often define digital text too broadly to mean just about anything that is to be viewed on a screen.

Ball (2004) offers the term new media to refer to “texts that juxtapose semiotic modes in new and aesthetically pleasing ways and, in doing so, break away from print traditions so that written text is not the primary rhetorical means” (p. 405).
In the words of Literat et al. (2018), “. . . now that we can easily produce, preserve and distribute multimodal content through digital channels, the limitations of paper-based formats should no longer define what counts as scholarly knowledge” (p. 568).
And yet, that seems to be far from the reality encountered by scholars trying to produce multimodal works in universities across the US (Archer, 2017; Ball, 2004; Literat et al., 2018). In the following video, Cheryl Ball (2015) discusses her own questions and challenges creating a digital tenure portfolio:

Literat et al. (2018) also address the question of “what is scholarship, and what activities must scholars engage in?” The authors argue for the need to increase the inclusion of multimodal research in academic inquiry. By promoting and valuing different ways of thinking and expressing knowledge, multimodal research can help expand participation in the production of knowledge and help expand the consumption and the audience for academic work beyond the academic circles (Literat et al., 2018).

The central question we address here concerns the implications of widening our ideas of acceptable forms of inquiry, analysis and representation in academic scholarship. (p.567)

It is ironic to note that, while their impetus was to promote the inclusion of multimodality in academic work, Literat and her team (2018) chose to write and publish their article in text-only form. The issue of balancing the affordances of multimodality and the practical reality of producing multimodal works in the traditional academic context continues to challenge researchers trying to produce multimodal scholarship. A large number of researchers and scholars continue to engage in the discussion of the inclusion of multimodality, but continue to do so in the traditional written text format (Literat et al., 2018; Ball, 2004; Archer, 2010).

Ball (2004) differentiates between what she calls new media scholarship versus scholarship about new media:
Composition and new media scholars write about how readers can make meaning from images, typefaces, videos, animations, and sounds . . . but most scholars don’t compose with these media . . . they do not seem to value creating new media texts for scholarly publications to explore the multimodal capabilities of new technologies. (p. 407)

Even online journals such as Kairos, designed specifically for scholars to explore new forms of publication, still contain a large number of articles that follow the traditional text-based forms of composing, with an occasional image or video embedded, but the written text is clearly the most important aspect of the piece. The other multimodal elements may serve as illustration, but the text is what carries the main argument, following the traditional criteria for scholarly work (Ball, 2004). When it comes to academic discourse, there has not been as much exploration of what argumentation may look like in visual, oral, or other alternative modes (Archer, 2010).

Wysocki (2005) questions the predominance of the traditional writing format in academic work: “. . . it is the neat rows of typographically clean letters on letter-size white paper that are necessary for serious thought” (p. 55). The author investigates the constraints of traditional writing and what is or what can be considered academic writing. The title of her article, “Awaywithwords,” can be read as “a way with words” or “away with words” – a play on how even the space between the letters, and how space is used on the page, can affect meaning. The big question for Wysocki (2005) is to understand what is gained and what is lost in the different types of communication, in particular when looking at the new affordances of computer technology. The author invites the reader to reflect and to question established assumptions about our textual practices and how these practices can be used to expand knowledge or to keep it in the hands of a few (Wysocki, 2005).

Another author trying to question the predominance of the written word in academia is Sousanis (2018). In the following image, Sousanis (2018) makes the case for the inclusion of the visual mode, “not as an afterthought” but integral to the meaning making process:

2.2.2 The predominance of print-text in academic discourse

Students are traditionally taught to develop academic argumentation, the form of work expected in higher education, through the written essay (Archer, 2017; Anderson et al., 2006; Andrews et al., 2012). The written academic argument has a long and established tradition and has taken a fixed format with very specific criteria to be followed. In academic discourse, knowledge is presented in a logical and linear sequence, supported by evidence – following the rationalist paradigm with its focus on logic, evidence, and citation.

The long-standing dominance of print-based practices in the academy has made certain practices taken for granted or invisible to the eye. Multimodality may press the reassessment of discoursal practices by providing a different angle to look at discoursal practice in the academy. (Archer, 2017, p. 68)

Hiippala (2017) looks at the academic research monograph in light of the current interest in multimodality. He admits that in spite of all the current interest in multimodality, the research monograph continues to be dominated by the written text format, which can be occasionally illustrated by figures, diagrams, tables and other graphic elements. Typically, in research dissertations, images and visual elements are used to support the dominant mode, the written word. Hiippala (2016) contends that the complexity of the contents in the dissertation is such that it may be helped by using a single mode and applying the linearity of the written text. He states:

Unpacking the highly compressed meanings of academic discourse already requires significant effort from the reader. For this reason, the shallow hierarchy in the layout structure facilitates the reader's access to the content. (p. 21)
Palmeri (2007) expresses his frustration at the dilemma between incorporating multimodality in his doctoral dissertation and meeting the requirements imposed by the Ohio State University. Although his dissertation topic was multimodality, he felt forced to produce his work in the traditional print-text format.

The institutional culture of the university still strongly pushed me to conceive of the dissertation as primarily a print alphabetic text. Moreover, the institutional pressures of the academic job market also influenced my choice to compose this dissertation primarily with words. (Palmeri, 2007, p.26)

Incorporating multimodality into the academic argument may encourage scholars and academics to re-evaluate their own practice and to critically incorporate new practices such as collage and remixing (Archer, 2017), which may allow the reader to choose different paths and follow different formats.

2.2.3 The peer reviewed journal

The academic journal has long been considered the standard for disseminating scientific and scholarly knowledge, and publishing in academic journals is still considered a requirement for tenure and advancement in most universities. However, the great majority of academic journals are still not able (or willing) to accept multimodal works. The reality is that publication requirements are still primarily text-based, which in turn shapes the type of work scholars normally end up producing if they wish to be considered for promotion and career advancement. Most academic publishing opportunities still limit scholars to presenting their work through text, with the occasional addition of images or graphics, sometimes limited to an appendix at the end of the work (Ball, 2004; Anderson et al., 2006; Cope & Phillips, 2014).
In the following TED Talk, Stone (2016) advocates for making academic work more accessible to the general public and also offers some insight and questions about how academic research and publishing work in universities in the U.S.:

New technological advances create new possibilities for the academic journal, opening up new avenues for academics, researchers, and publishers to produce multimodal works. In fact, most academic journals now have an online presence and researchers expect to find a digital copy of all print publications (Cope & Phillips, 2014). However, it is also important to note, as mentioned before, that many of these digital publications are simply PDF copies of the print edition. One must be careful to differentiate the terms digital and multimodal (Ball, 2004).

One of the big changes over the last two decades is the availability of publications and the manner in which they are accessed. A trip to the library is no longer the primary option, as researchers can now access most works online, and mobile devices enable users to work anywhere, anytime – changing how research is carried out (Cope & Phillips, 2014). Open access also raises the potential to expand the audience that can get access to knowledge, which is no longer limited to subscribers. Cost is a big factor in open access, as publishers are currently trying to find solutions and new alternatives to the traditional models (Cope & Phillips, 2014). Although the question of open access is not exclusively tied to multimodality, the digitization of scholarly work has profound implications for its accessibility and is a major issue concerning the peer reviewed journal (Cope, 2012).

An additional consideration is the new possibilities and affordances created by new technological advances, such as social networking and other collaborative tools; and the possibility to incorporate sound, video and other forms of animation that require new digital platforms and novel means of reproduction and dissemination. This growing trend also has tremendous implications for researchers interested in exploring multimodality, as well as publishers trying to keep up with the technological requirements for such work (Anderson et al., 2006; Andrews et al., 2012; Archer, 2017; Ball, 2012).

Multimodality opens up new doors and offers the potential to expand the reach of scholarly work outside the walls of the university. It is important to recognize the historic value, the current state and the challenges faced by the academic journal, since it is still considered by many as the standard measure of scholarly work (Anderson et al., 2006; Ball, 2012; Cope and Phillips, 2014).

2.2.4 Challenges and criticism to multimodality in academia

Not all scholars are so enthusiastic about the role of visual forms of communication in academic discourse (Gourlay, 2016; Krause, 2004; Palmeri, 2007). Contrary to proponents of the inclusion of visual argumentation in academia, Gourlay (2016) argues that the written language may, in fact, be the best mode for developing a complex academic argument:
. . . despite the many advantages of multimodal argumentation – the features of conventional written text remain well-suited to the particular demands of extended and complex development of propositional context and intertextual academic argument. (p. 79)

Gourlay (2016) argues that visual images alone, while they may add to the understanding of the whole, lack the ability to convey complex argumentation without accompanying written text. The author warns against the temptation to “demonize” the conventional text. She argues that in many instances visual images are more susceptible to individual interpretation and may require the understanding of previously accepted context, background and cultural information. In the end, the author concludes that the written text is still more precise and better suited for carrying a rigorous academic argument (Gourlay, 2016).

Krause (2004, 2012) cautions against being overly enthusiastic about the incorporation of multimodality, in particular video, in academic writing. He argues that while multimodality may be a valuable addition to the written text, few professors and teachers have the necessary skills, training or experience with video and multimedia production to be in a position to guide and evaluate multimodal writing. Palmeri (2007) expresses similar concerns about the additional burden imposed on scholars trying to master multiple technology programs and attempting to incorporate different modes into their work – rather than focusing on what they know how to do best: writing.

In addition to the challenges of producing multimodal scholarship, the challenges of carrying out multimodal research have also received some attention in the literature. Jewitt (2012) has become increasingly interested in the use of video in academic research. Her focus, however, is not on the production of multimodal scholarship, but rather on the analysis of visual data in research. She has carried out extensive research on the use of video, its advantages and the challenges for the researcher. Jewitt (2012) takes on the analysis of video in research from several different perspectives: the history of video in social research, the different ways video can be recorded and used, the technical and ethical aspects and choices involved in recording video, and the analysis of video data.

In the following video, Carey Jewitt answers the question about why she does not present her work through multimodal means:

Different modes offer different affordances and bring with them different constraints. An additional and unique challenge of working with video clips, for example, is the issue of attribution, copyright and fair use (Wysocki, 2005). Another compounding challenge in working with new media formats is the question of archiving and distribution. New and different platforms and programs are arising all the time, and as older technology becomes obsolete, it creates problems for accessing older materials that are no longer compatible with current devices (Wysocki, 2005; Kuhn, 2013).

In the following video, Wysocki (2017) discusses an added challenge of multimodal works: the complexity of creating citations for the ever-increasing diversity of source formats:

Another consideration for multimodal scholarly publications is the notion that aesthetics takes on a more prominent role in the communication process. This focus on the visual aspect of the work leads some critics to question whether these visual elements may become distractors, taking away from the scholarly value of the work (Ball, 2004; Gourlay, 2016).

2.2.5 Examples of multimodal scholarship

Although the great majority of academic publications is still focused on text-based works and has limited capacity to incorporate sound and moving images, there are a few initiatives currently taking place and trying to expand the notion of what scholarship may look like. Below are some examples of new initiatives to incorporate multimodality into academia.
Kairos: A Journal of Rhetoric, Technology, and Pedagogy - Kairos is a peer-reviewed online journal focused on publishing digital and multimodal scholarly articles. The first issue of Kairos was published in 1996, and their stated mission is:

To publish scholarship that examines digital and multimodal composing practices, promoting work that enacts its scholarly argument through rhetorical and innovative uses of new media.

Kairos publishes "webtexts," which they define as “texts authored specifically for publication on the World Wide Web.” Kairos publishes on topics related to the use of technology in education, English studies, communication, and related fields. They also publish reviews of print and other digital media, as well as interviews with scholars and other interactive exchanges.
AERA - American Educational Research Association is an online, open source national research society. Their mission, according to their website, is, “to advance knowledge about education, to encourage scholarly inquiry related to education, and to promote the use of research to improve education and serve the public good.” AERA aims to promote the dissemination and practical application of educational research. Although their focus is not the incorporation of multimodality, it is worth noting here given their concern with expanding the accessibility of scholarship. In the following video, Bill Cope (2014) discusses the importance of AERA, available for free online:

Research for All - is another free, open-access, peer-reviewed online journal, focusing on collaborative research. It started in 2017, with the intent to create and promote engagement between researchers, and “non-academics” interested in joining the conversation (Ilagan, 2019). The founders of Research for All were interested in creating a venue that would be familiar and acceptable to the academic circles, with the established rigor of academia; but that would also be open and encouraging to other communities to help expand and enrich the dialogue.

Published twice a year, Research for All gives voice to those who often go unheard in academia – such as those in NGOs, theatre, local TV, commercial enterprises, NHS, museum and government, teachers in schools or further education, students, and freelance participation practitioners.

In the following video, Sophie Duncan, Pat Gordon-Smith and Sandy Oliver discuss the creation of Research for All and the impact they have observed in how the online journal is creating opportunities for teachers and other community members to participate and contribute.

Scalar – is an online platform resulting from the Alliance for Networking Visual Culture initiative, which seeks to expand the possibilities and the practice of creating multimodal scholarly works, incorporating video and other rich media. Scalar is a free, open-source authoring and publishing platform designed to allow authors to create and publish multimodal scholarship online. Scalar allows users to compose using media from multiple sources and insert their own writing in different ways and it does not require special technical skills on the part of the author. You can watch their video trailer below:

Other disciplines, especially in the sciences, have been more pro-active in embracing multimodality. One example from the sciences is JoVE.com. JoVE publishes peer-reviewed scientific video articles.

Articles consist of high-quality video demonstrations and detailed text protocols which facilitate scientific reproducibility and productivity. The scope of the journal includes novel techniques, innovative applications of existing techniques, and gold standard protocols in the physical and life sciences.

These journals and platforms are examples of the current effort and trend in trying to expand the notion of academic scholarship and trying to expand access and collaboration between academia and the community. They are also an attempt to open up the opportunity for non-academics to participate and contribute to the knowledge making process.

2.2.6 Multimodality in dissertation writing

Within the different types of academic writing, the doctoral dissertation is another example of a genre that has a long-standing tradition and very specific requirements to be followed. Andrews and England (2012) examine the current landscape of new forms of dissertation. They do not limit their investigation exclusively to multimodal dissertations, but also consider all forms of digital work (traditional conventional print work transposed to digital formats). Andrews and England (2012) analyze how new technologies have opened up new possibilities for composing and producing new forms of work, including the use of images, video, audio, and web pages with hypertext. These innovations create new possibilities for the genre of dissertation writing that were not available before, but they also create new challenges for universities and libraries regarding how to evaluate and store these new formats (Andrews et al., 2012; Anderson et al., 2006). The following table shows a small sampling of the possibilities for multimodal dissertations according to Andrews and England (2012):

Figure 3: Examples of digital and multimodal dissertations

Multimodal research brings additional challenges and questions to the doctoral student, from data collection to recording and analysis, such as sound and video recording, website design and layout, and the many implications that different modes present in terms of creation and production. There are also questions of interpretation and evaluation, ethical and cultural issues, and issues of copyright and reproduction, as well as implications of methodology and methods for research (Andrews & England, 2012). A big question that comes up when students and researchers begin to explore new technologies and new modes of communication revolves around how different modes can be used to carry out an academic argument. Written academic argumentation is typically an expected and required component of a dissertation (Andrews et al., 2012; Archer, 2017).

Adams and Blair (2016) relate some of the trajectory they went through with the production and publication of a multimodal doctoral dissertation in 2011 at Bowling Green State University. The authors describe the frustration of having to submit a PDF version of the dissertation. This requirement is still the norm for most universities and is guided more by regulatory policy than by technological limitations. The authors describe the technical and bureaucratic challenges faced by students trying to conduct multimodal dissertations: . . . explain the hesitation of many graduate students who are focused on completing their degrees in a timely manner. . . . There is not only a larger time investment in regards to technological literacy learning curves and working with digital data but also in having to make the argument for the digital format of the dissertation. (para. 13)

Despite all the challenges and difficulties encountered by students trying to pursue new dissertation forms, one of the benefits of new technologies, according to Andrews et al. (2012), is the increase in the availability of dissertations in digital form and the increase in readership – primarily by other students looking for other successful examples of work being done in their respective fields.

Such sharing often happens globally and opens the channels for transcultural exchanges of theoretical, conceptual and pedagogical approaches in research . . . Hence, digital sharing not only increases readership but also opens new communities of sharing and new ways of publicly sharing knowledge. (p.5)

2.2.7 Research in education

In order to fully appreciate the resistance against multimodal dissertations in the field of education, it may be helpful to take a look at the history of education schools in the United States and their low status compared to the “hard” sciences. Labaree (2004) analyzes some of the obstacles facing the field of education research and the lack of consensus in methodology and purpose. Education research is viewed by some as inferior to other academic fields, which prioritize “pure” research, but when education researchers try to focus on theory, they are criticized for being detached from the reality of school life (Labaree, 2004; Lagemann, 1997).

Hard vs. Soft Knowledge – Education research suffers from a low status in the university setting, falling in the “soft” categories of applied knowledge rather than “hard” or pure knowledge; and as a consequence, we have seen a strong pressure for education researchers to pursue the type of research methodologies that can be reproduced, verified and validated as definitive. Educational researchers have worked very hard to establish processes that incorporate quantitative methodologies, and statistical tools to enhance their claims of validity and reliability of research results and findings (Labaree, 2004; Lagemann, 1997), and the more conservative scholars in education remain resistant to the recent increase in a more interpretive approach to educational research.

Over the last couple of decades, however, there has been a strong push against the predominance of quantitative research methods in education, calling for greater freedom and more interdisciplinary approaches to research. And as a result, we see a tug of war among education researchers between the need for greater freedom from quantitative constraints and the fear by some of losing the methodological rigor they have fought so hard to attain (Labaree, 2004).

The field of education is in a unique position in that it deals with matters of interest to the population in general and should therefore be able to speak to a wider audience. Labaree (2004) comments:
A paper that is truly interesting in a field such as math or biochemistry—that is, at the leading edge of theoretical development—is one that should be completely incomprehensible to an apprentice in the field, much less a layperson. (p.81)
There is a dichotomy between the practical nature of the role of the teacher and the analytical/theoretical nature of the role of the researcher (Lagemann, 1997). Teachers base their analysis on their classroom experience. The challenge is to engage and train teachers to be able to consume and produce research analysis and interpretation that conforms to standards of validity and rigor and contributes something new to the field – and at the same time, to understand and address the particulars of each case and its context. Education research has to contend with the challenge of bridging the “cultural divide” between teachers and researchers (Labaree, 2004):
… the students complain that the faculty’s vision of a doctoral program in a professional school of education is bizarrely academic in all the most pejorative meanings of that term: abstrusely theoretical, impractical, book-bound, and cut off from the real world of educational practice. (p. 103)

The future of education research calls for greater understanding and collaboration between teachers and researchers, scholarship, and practice; it calls for developing new relationships among all the stakeholders, including those directly involved and directly affected by the study and practice of education. Education requires scholarship that erodes boundaries and encourages more cross-discipline practices (Lagemann, 1997), and gives researchers the “satisfaction of knowing that they are working on issues that matter” (Labaree, 2004).

2.2.8 Examples of multimodal dissertations

To date, only a few trailblazers have succeeded in presenting truly multimodal dissertations in the field of education. In some disciplines, such as anthropology and visual arts, images, video and sound files have long been used as evidence artifacts. So far in this research, however, I have only encountered a handful of multimodal dissertations in education.
Sousanis (2015) was the first to write his entire doctoral dissertation in comics format for Teachers College at Columbia University. His work was later published by Harvard University Press with the title “Unflattening.” Sousanis’ work has received very enthusiastic reviews, has generated a lot of interest from a wide audience around the world, and is now being translated into many different languages. For a brief introduction to his work, you can watch the following video, where Sousanis discusses his book:

In the following image, taken from Sousanis' book, the author makes a case for learning to see from different perspectives and expanding our view of the world:

Figure 4: multidimensional view

In defense of his work and its unconventional format, Sousanis says of the comic book, “I hand this out to people on the street,” adding that “people around the world are reading a doctoral dissertation, and that’s really exciting.”

Rebecca Zak successfully defended her doctoral dissertation in video format in 2014. Her dissertation question was: how can we nurture creativity in education? Her dissertation took the form of five videos and an accompanying blog. Each of the five videos corresponds to one of the chapters of the traditional dissertation: the rationale, the literature review, the methodology, the observations and her recommendations. Put together, the five videos take the form of a documentary, composed of dozens of clips procured from YouTube and representing the main scholars in her field. The accompanying blog served as a space for self-reflection, for addressing issues that arose from the experience and specific topics not suited to video, such as copyright concerns, and for accommodating additions to the dissertation after each video was completed. Zak (2014) admits that working with video has its own challenges and that it is harder to edit a video after it is finished than to edit a written document in Word format. But she believes her choice of creating a multimodal dissertation was worth the effort.
As of October 2014, the videos have garnered over 37,000 combined views, and have been seen in 195 countries worldwide — far greater than the level of attention my work would have received otherwise.

You can see here Zak’s introduction to her dissertation:

Virginia Kuhn (2013) is another example of what is possible in terms of multimodal dissertations. She successfully defended one of the first “media-rich” digital dissertations at the University of Wisconsin–Milwaukee, in 2005. According to Kuhn (2013), although there may be an increased interest in the topic of multimodality and a growing number of scholars are studying and writing about it, the reality today is still that the majority of those who serve on tenure boards and committees have no experience evaluating multimodal work, and there is a bias against digital scholarship. New faculty in most universities often feel the pressure to produce traditional peer-reviewed articles published in more traditional text-based journals.
Kuhn makes the case for trying to expand the notion of scholarship:
. . . dissertations can circulate across online networks, linking to other forms of discourse and feeding into the public sphere.

Perhaps this, in turn, would help break us out of the ivory tower, keeping us vital, relevant, and connected to the world.

In the following video, Kuhn talks a little bit about her ideas for the expansion of scholarship to include visual and other modes of communication:

Lee (2014) defended her dissertation as a “hybrid, image and word integrated, multi-media text” at Ohio University in 2011. She describes the process, the obstacles and frustrations in trying to defend and submit a multimodal dissertation:

The purpose of its design is to disrupt logos-centric monologue and gendered assumptions about authority with pictures of artifacts, symbols, depictions of the rhetorical feminine. Footnotes identify and contextualize each image. In this way, images take authoritative positions and respond to ideas the text discusses, rather than merely accompanying or echoing words on its pages. (p.95)

The following images from Lee’s dissertation demonstrate her use of layout, typography, watermarks and other visual devices:

Figure 5: Sample page of Lee’s dissertation

Although Lee’s advisor, dean, and most of her committee supported her work and what she was trying to do, the university’s Thesis and Dissertation (TAD) service guidelines prevented her from submitting her work as planned. The guidelines imposed many technical constraints and requirements, from the size of the margins on the paper to the location and size of images and the position and spacing between text and images on the page. Filing her hybrid text-image dissertation was an uphill battle according to Lee:
Sixty e-mails, two weeks, and some feverish, tearful phone calls later, formal submission of the first hybrid dissertation of its kind at my university took place when, against my wishes, TAD staff converted my PDF file into a PowerPoint. . . . My dissertation is done and filed, but I would not wish the political problems that encumbered my process upon anyone. (p.98)
Lee calls for a revision of traditional practice in dissertation requirements to become aligned with current technology developments and new media-rich compositions.

Resistance to new ways of making meaning not only disserves faculty and students expected to innovate in multimodal environments but also elides crucial new media literacies. (p.99)
Although Lee’s work only contained text and images (no video, sound, or other digital artifacts), the difficulties she encountered illustrate some of the problems still encountered by doctoral students trying to produce multimodal dissertations.

2.3 THEORETICAL FOUNDATIONS OF MULTIMODALITY

Before proceeding with the theoretical foundations of multimodality, it is helpful to take a look at the definition and explanation of the terms multimodality and multimodal learning and define some of the key terminology encountered in the literature and this work.

2.3.1 Definition of terms and keywords

Multimodality

Multimodality refers to the use of different modes for communication, namely textual, aural, linguistic, spatial, and visual resources, used to compose messages. Key to multimodal perspectives on literacy is the basic assumption that meanings are made and interpreted through the many different modes by which people communicate. Spoken and written language are just part of the vast repertoire of available options for meaning making (Kress, 2000, 2005, 2010; Jewitt, 2008).

Mode

In the context of multimodality, the term mode refers to the organized use of all the available resources for purposes of meaning making, such as images, spoken and written language, videos, sound and music, gestures, etc. (Kress, 2000; Kress and van Leeuwen, 2006). All modes contribute to the whole in the understanding of a particular communication event, but at the same time all modes are partial. Each mode, including speaking and writing, contributes to the construction of meaning in different ways, but no one mode alone can account for the entire process of communication. Each mode plays a unique role in meaning making (Kress, 2000, 2006, 2010). Below, Kress explains what a mode is:

Social Semiotics

The study of multimodality originates in social semiotics. Semiotics is the study of signs and symbols and how they are used for communication. Social semiotics focuses on the social context within which communication takes place; it is not an abstract study of the forms used, but rather a study of the process undertaken by individuals within their group context. Social semiotics looks at the choices of resources and modes made by the participants, such as images, video, printed text; the different affordances of each mode; and the social and cultural context involving each communication event (Kress, 2000, 2010, 2016).

In the following video, Kress talks about how the cultural context influences the choice of modes:

Multiliteracies

Multiliteracy refers to the teaching of reading and writing through a combination of two or more modes, including printed text, still and moving images, sound and music, gestures, and more. Multimodal literacy is a relatively new and growing field of research. The basic assumption begins with the recognition that reading and writing nowadays are increasingly tied to the use of multimodal and digital texts. The theoretical framework of multiliteracies posits multimodality in literacies as a key principle of situated learning, which focuses on the students’ experiences; explicitly connecting meaning to their social and cultural contexts; and transformational practice, where students recreate meaning into their own experience (Cope & Kalantzis, 2000; New London Group, 1996; Jewitt, 2008).

Multimodal vs. Multimedia

These two terms are often used interchangeably in the literature, and their use seems to be contingent upon the point of view or the intended audience. The term multimedia is often found in general, non-technical articles in reference to the use of film and video in the classroom, whereas the term multimodal is often preferred in academia. Multimodal emphasizes the design and process, while multimedia often refers to the technology and the product (Lauer, 2009).

2.3.2 Multimodality: Theoretical background

The growing interest in multimodality in education, reflected in the rising number of studies, is associated with the seminal work by The New London Group (1996), a collaboration of scholars from around the world who met to discuss how new technologies were driving changes in how people communicate and how these changes called for a review of conventional print-based teaching and learning.

The authors argue that the multiplicity of communications channels and increasing cultural and linguistic diversity in the world today call for a much broader view of literacy than portrayed by traditional language-based approaches (The New London Group, 1996, p. 60).

The purpose of the New London Group was to consider the future of literacy teaching in the contemporary global world. They considered education as a “mission to provide students with the necessary skills and opening equal opportunities and access to their chosen paths in society” (The New London Group, 1996, p. 60). The New London Group questioned what schools can do and how to engage in a critical dialogue about developing a curriculum that serves in the design for social futures. The work done by the New London Group is significant in laying out the principles of multiliteracies and serving as the instigator for the discussion and the creation of a metalanguage of design and new pedagogy – it has served as inspiration and catalyst for numerous classroom-based studies and subsequent scholarly work and inquiry.

In this video, Mary Kalantzis introduces the work of the New London Group and their Multiliteracies Project. Within that project, the importance of multimodality is highlighted:

Cope and Kalantzis (2000, 2010, 2016, 2017) are two of the original scholars from the New London Group and they continue the work on multiliteracies and multimodal meaning making. They explore the different ways (or modes) of using text and language and understanding the world around us, and the role of new technologies in the transition from the traditional practice of literacy teaching to a model of multimodal literacy. They call for an expansion of our notion of literacy, especially with regard to the emergence of digital media. Spoken discourse, for Cope and Kalantzis, is not simply the aural representation of written discourse. Spoken language follows a different set of rules from standard written discourse, and the grammars of spoken and written texts are very different from each other.

In this video, Mary Kalantzis (2016) talks about the evolution from spoken language to the development of written language to the new media and new literacies:

Cope and Kalantzis (2000, 2016, 2017) call for a pedagogy that does not condemn the visual to a lesser role; a literacy that does not focus exclusively or predominantly on the written text. To be an effective communicator or educator nowadays requires the incorporation of multimodal texts that allow for image and text to work together. The authors stress the need to rethink how we approach literacy and teaching. All modes must be considered, incorporated and reflected upon in this new literacy, or what the authors call “New Learning” – contextualized in the society where it takes place and with greater agency on the part of the learner.

Another author who was also a member of the New London Group, and who has become a key scholar in the field of multimodality and our understanding of multimodal meaning making, is Gunther Kress (2000, 2001, 2006, 2010). Kress studies social semiotics: how different signs carry meaning and how they are used within a particular culture for expressing and communicating ideas. For Kress (2006), communication is a social process in which each mode does something slightly different and serves a different purpose. Writing allows us to do things we cannot do with images, and vice-versa. The different modes are not mere repetitions of each other; they carry different affordances – they allow us to do different things.

Kress and van Leeuwen (2006) provide the framework for reading images. The authors use the term “grammar of visual design” to signify that images contain structures of meaning that follow certain explicit and implicit culture-bound rules of shared understanding. Kress (2015) argues against the traditional approach to investigate each mode: writing, images, gestures, etc., through their individual disciplines of linguistics, art, anthropology and so on. For Kress (2015), the integration of the different modes and the utilization of multimodality tools enable us to understand how each mode contributes to and affects the message and the communication process. A multimodal approach allows us to have a richer meaning than any one single mode would enable us to do (Kress, 2000, 2001, 2006, 2010).

It is interesting to note that Kress (2010) points out that multimodality is not a theory; he states that there is no theory of multimodality. Multimodality, he claims, is an approach for investigating how different modes contribute to the message and how they interact with each other to create a communication event. Although we cannot speak of a single theory of multimodality, there are many emerging descriptions and methodologies for studying multimodal phenomena (O'Halloran & Smith, 2011; Jewitt, 2012). Kress (2010) talks about a multimodal approach in which each mode adds a different dimension to the whole and can only be understood within its social context:

All scholars from the New London Group (1996) undertake the study of multimodality from a social perspective: not as an isolated abstract concept, but embedded in the social context within which it takes place. It therefore becomes important to consider the social and cultural setting in which learning takes place. The use of multimodality as an educational approach attempts to bring the learner to the center of the learning process (New London Group, 1996; Cope & Kalantzis, 2016, 2017).

Multimodal literacy is positioned within the theoretical framework of situated learning and social constructivism, in which meaning is made from the real lives and experiences of the learner, situated within the socio-cultural context in which it occurs. To better understand the significance of multimodality, it may be helpful to look at Paulo Freire’s Critical Pedagogy and Vygotsky’s Sociocultural Theory of Cognitive Development.

Paulo Freire (1981, 1989) is well known for his work with adult literacy. He is concerned with the critical understanding of education – a critical way of thinking and a critical way of knowing. Freire questions who determines what forms of teaching and what forms of knowledge are acceptable, and what is taught in schools. Freire defends the rights and the value of all forms of knowledge and all forms of speech, which must be acknowledged and respected by teachers. This implicitly entails harnessing multimodal ways of making meaning. He does not suggest that the dominant culture and knowledge should be forgotten. Quite the contrary: he proposes that every human being should have access and the right to acquire the dominant knowledge. Freire’s pedagogy and entire body of work are devoted to creating social justice, to enabling dominated and oppressed populations to overcome and free themselves from the oppression they suffer. Freire’s work is about allowing the dominated to find their voice and their place in society (1981, 1989).

Freire (1981) opposes what he calls the banking model of education: teachers lecturing and filling up the heads of students, and students listening passively, without interrupting, without questioning, and without any engagement in the production of knowledge. Although Freire did not mention multimodality and did not focus on the use of technology in education, his ideas about the value of the varied meaning-making resources people bring to the classroom, and his insistence on a critical pedagogy, offer valuable insight for the application of multimodality in the classroom. They point toward allowing students to express themselves through different modes and valuing what the student brings to the education process through critical reflection and action, in a dialogical process.

In the following video, we hear Freire in his own words talking about the importance of having a critical understanding of education and the concept of how language and teaching are intrinsically connected to power. He emphasizes the importance of valuing what the student brings with them to the education process:

Vygotsky’s work can also serve to reinforce the importance of undertaking the study of multimodality from a social perspective. Vygotsky (1962) proposed that learning happens through the social interaction between the learner and the teacher or mentor. For Vygotsky, learning is not an isolated phenomenon that occurs exclusively inside the learner’s head. Instead, learning is understood as a social event that happens through the interaction of the learner with the teacher and other learners. This interaction involves multiple modes of meaning making, such as language, gestures, touch, space, and sound.

Vygotsky’s observations of how children learn to speak their first language led him to conclude that language develops primarily through the multiple ways in which meaning evolves via social interaction. Through his systematic observations of how children interact with adults and other children in their environment, he concluded that it is this social interaction that enables children to learn and develop their language skills. For Vygotsky, all knowledge exists within culture, social interaction is fundamental to the development of cognition, and different contexts create different forms of development. All cognitive processes (language, thought, reasoning) develop through social interaction. Learning, according to this model, is not something to be passed from the teacher to the learner, or something that can be acquired independently by the learner. Learning is a socio-cultural process, with a focus on the interaction between the learner and the teacher, as well as other learners (Vygotsky, 1962). This also implicitly entails multimodal ways of making meaning. In any given situation, learners and their interlocutors have a variety of resources available to them and can draw on images, speech, sound, touch, etc.

In their New Learning theory, Cope and Kalantzis (2017) propose seven affordances of the digital to guide learning and teaching in the 21st century: 1. ubiquitous learning: anywhere, anytime; 2. active knowledge making: the learner as knowledge maker; 3. multimodal meaning: using text, image, sound, new media; 4. recursive feedback: formative assessment, constructive feedback, learning analytics; 5. collaborative intelligence: peer-to-peer learning, sourcing social memory; 6. metacognition: critical self-reflection; 7. differentiated learning: flexible, adaptive learning addressing each student according to their interests and needs.

Figure 6: New Learning (Cope & Kalantzis, 2016)

It is the multimodal aspect of learning, which the digital now makes far more accessible for learning, teaching and meaning making, that I wish to focus on and explore in this study.

2.3.3 Evaluating multimodal academic works

One issue that comes up in much of the literature on multimodality in academia is the analysis and evaluation of the work, and the importance of making sure that images, videos and other multimodal elements contribute to the argument rather than merely serve as decoration. Multimodal analysis and evaluation must follow the same rigor and the same methodological and theoretical protocols established for text-based studies (Kuhn, 2013; Blevins, Rice & Carpenter, 2015). Many scholars, while interested in the new affordances created by new media, do not yet know how to compose in these new formats, and they may question how to approach and assess the scholarly value of these new forms of publication (Ball, 2004). Kuhn (2013) states:

We have to create standards for digital scholarship. They should be firm enough to ensure rigor yet flexible enough to allow for continued innovation. Most important, however, the standards should be set by the scholarly community, not by outside entities or by corporate interests. (p. 12)

Blevins, Rice & Carpenter (2015) look at the differences in design decisions necessary for online publishing, and the challenges faced by authors when trying to create and submit work that does not follow traditional and established criteria. Publishing multimodal scholarly work presents a unique set of decisions the author needs to consider that are different from print text-based work. Blevins et al. (2015) provide the following infographic to highlight important considerations for designing scholarly multimodal texts:

Figure 7: Designing scholarly multimodal texts

Blevins and her team (2015) stress the need to shift the work process and the need for authors to also become designers and learn to attend to form:

. . . shifting identity from “writer” to “designer” influences the way individual creators understand their own rhetorical and communication options and strategies. When scholars who previously identified as writers of the written word on a static page become designers who have the capability–and exigency–to create something multimodal on digital platforms, the underlying foundation of rhetorical choices expands and can create both a sense of creative freedom and overwhelming possibility. (p. 8)

Ball (2012) also discusses the issue of assessment in multimodal scholarship. The author is an editor for the online journal Kairos (mentioned above). Kairos uses the term “webtext” to include the different types of multimodal work. In the following video, Cheryl Ball explains the use of the term webtext:

Ball (2004) discusses the need to create a rubric for evaluating scholarly multimedia. The author admits that although she is the editor of Kairos, and regularly has the responsibility of evaluating article submissions to the online journal, the task is not an easy one: “Kairos has no standard set of criteria that the editorial board uses to evaluate webtext submissions. In some ways, that lack of criteria is purposeful.” And she admits to using the “I-know-it-when-I-see-it” way of evaluating scholarly multimedia.

The key is to strike a balance between convention and innovation, even as the line between image and text, between orality and literacy, between art and critique and, indeed, between scholarship and pedagogy grows ever more fuzzy. (p. 65)

Ball (2004) offers an initial framework that can be a starting point for analyzing multimodal work:

Figure 8: Multimedia Parameters

In the end, however, Ball (2004) concludes that it is impossible to find or create a rubric that works for all multimodal works, reminding the reader that:
. . . the rubric needs to be created fresh, for each kind of project . . . in fact, there are no set criteria for Kairos submissions, as each piece must be evaluated on its own terms in relation to that moment and to technology and media and genre, in time. (p.68)

Other authors have looked at the evaluation of multimodal texts not from the perspective of publishing requirements, but rather by trying to understand the relation between the different modes and how they combine to make meaning, and how the meaning of multimodal texts can be systematized and deciphered. Van Leeuwen (2011) focuses on how to study and evaluate the relation between different modes and the function that images and text play in relation to each other on the page. He looks at layout, composition, and framing, as well as the use of diagrams, color, and typography, to understand the function that each element plays in transmitting and interpreting information. His work is not about the value of multimodal scholarship, but rather about understanding the relationship between text and image:

Figure 9: Text Images Relation

Van Leeuwen (2011) concludes that multimodal texts require new forms of reading: reading is no longer linear, no longer organized sequentially or intended to proceed from beginning to end. The overall structure of digital texts and websites is much more a matter of design and layout; it may be taken both as a whole and as a collection of individual parts that contribute to the overall meaning. He states:
. . . the new writing is rapidly gaining importance, rapidly becoming the dominant form of multimodal communication. It is therefore a crucial task for visual analysis to develop tools for analyzing and interpreting it. (p.29)

Bateman et al. (2017) propose the Genre and Multimodality (GeM) Framework to describe and analyze multimodal works across different genres of text. The authors contend that it is important to address the issue of genre to help us compare different types of work, given the wide variation in the use of layout across different types of documents and texts. For Bateman et al. (2017), genre is used to account for the variation in document structure depending on the social purposes and communicative objectives of the work. The GeM framework provides a series of layers to help analyze a document. They define these layers as: 1. base layer, carrying the content; 2. layout layer, accounting for the organization of the content, graphic elements, and layout; 3. rhetorical layer, describing discourse relations; and 4. navigation layer, describing the structures that guide the user in moving through the document. Hiippala (2017) offers the following diagram to help us understand the GeM model:

Figure 10: GeM Framework

Cope & Kalantzis (in print) offer a different type of framework to analyze multimodal texts which they call transpositional grammar. The authors are interested in how to analyze meaning across different forms:
> text < > image < > space < > object < > sound < > speech <

For Cope and Kalantzis (in print), meaning can be expressed in many different forms, although never in exactly the same way. Transpositional grammar thus attempts to parse the meaning and understand the fluidity between text and speech and all the other forms in between, where all forms are inter-connected and interwoven in a state of constant multimodal meaning making. The authors use the word grammar in a wider sense to refer to patterns of meaning; they attempt to develop a common language with which to talk about all forms of meaning. Cope and Kalantzis (in print) use five questions to help analyze any instance of meaning making:
What is it about? – Reference
Who or what is doing it? – Agency
What holds it together? – Structure
What else is it connected to? – Context
What is it for? – Interest

These five questions can be applied to make sense of the world across and between text and speech and the other forms of communication. Text and speech are fundamentally different from each other. Text is more closely aligned with image and space, and speech is more closely aligned with sound and body. For Cope and Kalantzis (in print), transpositional grammar does not focus on fixed meanings but rather on movement and change. The same concept can be expressed in different forms, but when a transposition is made from one form to another, the meaning is never quite the same. Multimodality then refers to the juxtaposition of forms and the layering of meaning. All five meaning-functions are present in every meaning. Here, transposition occurs as we shift our attention from one function to another. And within each meaning, there is constant movement (Cope & Kalantzis, in print).

Archer (2010) argues that text and visual images may sometimes work in unison and complement each other in building an argument, but at other times they may work in different ways, provide different information, and serve different purposes. Multimodal texts therefore raise specific challenges for assessment. New criteria are required for evaluating multimodal texts, such as analyzing how each mode serves a particular purpose and how it is used in different contexts.

2.3.4 Criteria and recommendations for producing multimodal academic work

While academia has not yet agreed upon a definitive and complete framework for evaluating multimodal scholarship, the Kairos Style Guide offers the following set of recommendations and guidelines for aspiring authors:
Design Requirements - The design edit consists of checking for readability, accessibility, usability, and sustainability

Rhetorical Considerations

All media and design elements should be non-gratuitous and facilitate or enact the rhetorical and aesthetic argument
All links should contribute to the possible meanings and readings of the texts
Links must be as current and accurate as possible
Offsite/external links should open in a new browser window
Accessibility and Usability – all videos and sound files must be accompanied by a text transcription
Sustainability - need to be able to archive everything that is published
Coding Requirements
Modified APA Citation Style - Kairos follows a modified version of APA, 6th edition
Common Grammar, Style, and Usage Errors

Davidson et al. (2009) describe their experience in advising and producing dissertations using visual sources and offer some advice to others wanting to produce works that deviate from the standard norms. One point they stress is the need for students to back their choices and their use of visuals with strong methodological theories. The following table shows a summary of their recommendations.

The literature reviewed here regarding the value and evaluation of multimodal scholarship does not yet provide a complete and conclusive guide for students wishing to pursue a multimodal dissertation. Many questions still remain concerning how to determine the appropriate mode for a particular purpose, how to acquire the skills necessary to master the technological demands of such work, and how to overcome the requirements and limitations imposed by instructors and institutions. These are some of the issues and questions I hope to address in this work going forward.

Sonia Estima