Long and Short Papers
The following long and short papers were selected through a rigorous peer-review process. Long and short papers will both receive a 20-minute time slot (including 5 minutes for Q&A) for presentation at the conference. All paper sessions will be held in the Quays Theatre at The Lowry.
Session 1: 360 Media
- 5th June, 16:00-17:00
- Session Chair: David Green (UWE, Bristol)
Exploring Visual Guidance in 360-degree Videos
- Marco Speicher – DFKI, Saarland Informatics Campus, Saarbrücken, Saarland, Germany
- Christoph Rosenberg – Saarland University, Saarland Informatics Campus, Saarbrücken, Saarland, Germany
- Donald Degraen – Intel Visual Computing Institute (IVCI), Saarland Informatics Campus, Saarbrücken, Germany
- Florian Daiber – DFKI, Saarland Informatics Campus, Saarbrücken, Germany
- Antonio Krüger – DFKI, Saarland Informatics Campus, Saarbrücken, Germany
Abstract: 360-degree videos offer a novel viewing experience, with the ability to explore virtual environments much more freely than before. The technologies and aesthetics behind this approach to film-making are not yet fully developed. The newly gained freedom creates challenges, and new methods have to be established to guide users through narratives. This work provides an overview of methods for guiding users visually and contributes insights from an experiment exploring visual guidance in 360-degree videos with regard to task performance and user preferences. In addition, a smartphone and an HMD were used as output devices to examine possible differences. The results show that viewers preferred the HMD over the smartphone, and visual guidance over its absence. Overall, the Object to Follow method performed best, followed by the Person to Follow method. Based on the results, we defined a set of guidelines for drawing viewers’ attention in 360-degree videos.
What Are Others Looking at? Exploring 360° Videos on HMDs with Visual Cues about Other Viewers
- Ville Mäkelä – LMU Munich, Munich, Germany; Tampere University, Tampere, Finland
- Tuuli Keskinen – Tampere University, Tampere, Finland
- John Mäkelä – Tampere University, Tampere, Finland
- Pekka Kallioniemi – Tampere University, Tampere, Finland
- Jussi Karhu – Tampere University, Tampere, Finland
- Kimmo Ronkainen – Tampere University, Tampere, Finland
- Alisa Burova – Tampere University, Tampere, Finland
- Jaakko Hakulinen – Tampere University, Tampere, Finland
- Markku Turunen – Tampere University, Tampere, Finland
Abstract: Viewing 360° videos can be an immersive experience, especially when using a head-mounted display (HMD). Current research identifies the need to guide viewers, as the freedom to rotate the view may make them miss things. We explore a unique, automatic approach to this problem with dynamic guidance methods called social indicators. They utilize gaze data collected from viewers to recognize popular areas in 360° videos, which are then visualized to subsequent viewers. We developed two different social indicators and evaluated them in a 30-participant user study. We found that although the indicators show great potential in subtly guiding users and improving the experience, finding the balance between guidance and self-exploration is vital. Also, users showed varying interest in indicators that represented a larger audience, but reported a clear desire to use the indicators with their friends. We also present guidelines for providing dynamic guidance for 360° videos.
Camera Heights in Cinematic Virtual Reality: How Viewers Perceive Mismatches Between Camera and Eye Height
- Sylvia Rothe – LMU Munich, Munich, Germany
- Boris Kegeles – LMU Munich, Munich, Germany
- Heinrich Hussmann – LMU Munich, Munich, Germany
Session 2: Immersion and Content
- 6th June, 09:30-10:30
- Session Chair: Guy Schofield (University of York)
Development of a Questionnaire to Measure Immersion in Video Media: The Film IEQ
- Jacob M. Rigby – UCL Interaction Centre, University College London, London, United Kingdom
- Duncan P. Brumby – UCL Interaction Centre, University College London, London, United Kingdom
- Sandy J. J. Gould – School of Computer Science, University of Birmingham, Birmingham, United Kingdom
- Anna L. Cox – UCL Interaction Centre, University College London, London, United Kingdom
Abstract: Researchers and practitioners are keen to understand how new video viewing practices driven by technological developments impact viewers’ experiences. We detail the development of the Immersive Experience Questionnaire for Film and TV (Film IEQ). An exploratory factor analysis based on responses from 414 participants revealed a four-factor structure of (1) captivation, (2) real-world dissociation, (3) comprehension, and (4) transportation. We validated the Film IEQ in an experiment that replicated prior research into the effect of viewing on screens of varying size. Responses captured by the Film IEQ indicate that watching on a small phone screen reduces the viewer’s level of comprehension, and that this negatively impacts the viewing experience, compared to watching on a larger screen. The Film IEQ allows researchers and practitioners to assess video viewing experiences using a questionnaire that is easy to administer, and that has been empirically validated.
Deb8: A Tool for Collaborative Analysis of Video
- Guilherme Carneiro – School of Computer Science, University of St Andrews, St Andrews, United Kingdom
- Miguel Nacenta – School of Computer Science, University of St Andrews, St Andrews, United Kingdom
- Alice Toniolo – School of Computer Science, University of St Andrews, St Andrews, United Kingdom
- Gonzalo Mendez – Escuela Superior Politécnica del Litoral, Guayaquil, Ecuador; Department of Computer Science, University of Calgary, Calgary, Alberta, Canada
- Aaron J Quigley – SACHI, School of Computer Science, University of St Andrews, St Andrews, Fife, United Kingdom
Abstract: Public, parliamentary and television debates are commonplace in modern democracies. However, developing an understanding and communicating with others is often limited to passive viewing or, at best, textual discussion on social media. To address this, we present the design and implementation of Deb8, a tool that allows collaborative analysis of video-based TV debates. The tool provides a novel UI designed to enable and capture rich synchronous collaborative discussion of videos based on argumentation graphs that link quotes of the video, opinions, questions, and external evidence. Deb8 supports the creation of rich idea structures based on argumentation theory as well as collaborative tagging of the relevance, support and trustworthiness of the different elements. We report an evaluation of the tool design and a reflection on the challenges involved.
Viewers’ Visions of the Future: Co-Creating Hyper-Personalized and Immersive TV and Video Experiences
- David Geerts – Meaningful Interactions Lab (mintlab), KU Leuven, Leuven, Belgium; imec, Leuven, Belgium
- Evert van Beek – Department of Human Information and Communication Design, TU Delft, Delft, Netherlands
- Fernanda Chocron Miranda – Graduate Programme in Communication and Information, Federal University of Rio Grande do Sul, Porto Alegre, Rio Grande do Sul, Brazil
Session 3: Video in Society
- 6th June, 11:00-12:00
- Session Chair: Teresa Chambel (University of Lisbon)
Audience and Expert Perspectives on Second Screen Engagement with Political Debates
- Katerina Gorkovenko – DJCAD, University of Dundee, Dundee, United Kingdom
- Nick Taylor – DJCAD, University of Dundee, Dundee, United Kingdom
Abstract: Televised debates remain a key point in elections, during which there are vast amounts of online activity, much of it conducted through personal devices or second screens. Amidst growing recognition of the influence of online political discourse, we explore the issues and opportunities arising at this specific point in election cycles, using a design-led multi-stakeholder approach to understand both the audience and expert perspectives. Workshops with debate viewers highlighted six key issues and possible solutions, which were encapsulated in four speculative design concepts. These were used to prompt further discussion with political and media experts, who were able to identify the implications and challenges of addressing the opportunities identified by the participants. Together, these perspectives allow us to unravel some of the complexities of designing for this multifaceted problem.
Exploring the Limits of Linear Video in a Participatory Mental Health Film
- Simona Manni – Digital Creativity Labs, University of York, York, United Kingdom
- Marian F. Ursu – Digital Creativity Labs, Dept. of Theatre, Film and Television, University of York, York, North Yorkshire, United Kingdom
- Jonathan Hook – Digital Creativity Labs, Dept. of Theatre, Film and Television, University of York, York, United Kingdom
Abstract: Participatory filmmaking offers opportunities to counterbalance stereotypes about mental health often endorsed by the mainstream media, by involving participants who have a lived experience of mental health problems in production. It is our experience, however, that the linear videos traditionally resulting from such processes can fail to fully accommodate and represent a plurality of participant voices and viewpoints and, as a consequence, may lead to oversimplified accounts of mental health. Interactive film, on the other hand, could open up a space of opportunities for participatory films that allow multiple voices and complex representations to coexist. In this paper, we explore this opportunity by reviewing a linear film produced in 2016 by five men with mental health problems about isolation and recovery. Through a series of creative workshops, the film was deconstructed by its participants, who analysed which additional possibilities of both form and content could be revealed if the film were transformed into a non-linear interactive film. Our findings reveal several expressive needs that a non-linear interactive film could more easily accommodate, and opportunities for making participatory filmmaking truly dialogic by allowing an active exchange with audiences that preserves, rather than streamlines, the tension between collective views and personal accounts.
The Living Room of the Future
- Neelima Sailaja – Mixed Reality Laboratory, University of Nottingham, Nottingham, United Kingdom
- Andy Crabtree – School of Computer Science, University of Nottingham, Nottingham, United Kingdom
- James Colley – Mixed Reality Laboratory, University of Nottingham, Nottingham, United Kingdom
- Adrian Gradinar – Imagination Lancaster, Lancaster University, Lancaster, Lancashire, United Kingdom
- Paul Coulton – LICA, Lancaster University, Lancaster, United Kingdom
- Ian Forrester – BBC R&D, Manchester, United Kingdom
- Lianne Kerlin – BBC R&D, BBC, Salford, Manchester, United Kingdom
- Phil Stenton – BBC Research & Development, BBC, Salford, Manchester, United Kingdom
Session 4: Users and Devices
- 6th June, 14:30-15:30
- Session Chair: Florian Block (University of York)
Exploring Online Video Watching Behaviors
- Frank Bentley – Yahoo, Sunnyvale, California, United States
- Max Silverman – Oath (Formerly Yahoo), Sunnyvale, California, United States
- Melissa Bica – Computer Science, Human-Centered Computing, University of Colorado Boulder, Boulder, Colorado, United States
Abstract: Laptop and desktop computers are frequently used to watch online videos from a wide variety of services. From short YouTube clips, to television programming, to full-length films, users are increasingly moving much of their video viewing away from television sets towards computers. But what are they watching, and when? We set out to understand current video use on computers through analyzing full browsing histories from a diverse set of online Americans, finding some temporal differences in genres watched, yet few differences in the length of videos watched by hour. We also explore topics of videos, how users arrive at online videos through referral links, and conclude with several implications for the design of online video services that focus on the types of content people are actually watching online.
Methods for Device Characterisation in Media Services
- Ana Dominguez – Vicomtech, San Sebastian, Spain
- Julian Florez – TECNUN, Navarra University, Donostia/San Sebastián, Spain
- Alberto Lafuente – ATC, University of the Basque Country, San Sebastián, Spain
- Stefano Masneri – Vicomtech, Donostia, Spain
- Iñigo Tamayo – Vicomtech, Donostia, Spain
- Mikel Zorrilla – Digital Media, Vicomtech, Donostia-San Sebastian, Gipuzkoa, Spain
Abstract: Content distribution in video streaming services and interactive TV multi-screen experiences could be improved through the characterisation of each involved device. These applications could provide a much more coherent user experience if the underlying system is able to detect the type of each device, since a device type exhibits specific features that should be considered in the adaptation process. Some of these features are attainable in the browser through different libraries and APIs, but their limited reliability makes it necessary to turn to more abstract levels of characterisation. The device type property (smartphone, tablet, desktop, TV) makes it possible to classify devices according to a set of common specific features, and its detection at runtime is very helpful for the adaptation process. In the context of more extensive research, this paper compares three different methods of Web-based device type detection that improve upon the existing state of the art and have been validated through a hybrid broadcast-broadband interactive multi-screen service.
Browsing for Content Across Pay TV and Video On Demand Options
- Jennifer McNally – Verizon, San Jose, California, United States
- Elizabeth Harrington Diederich – Verizon, Waltham, Massachusetts, United States
Session 5: Interactive and Responsive
- 6th June, 16:00-17:00
- Session Chair: Sha Li (University of York)
Interacting with Smart Consumer Cameras: Exploring Gesture, Voice, and AI Control in Video Streaming
- David A. Shamma – FXPAL, Palo Alto, California, United States
- Jennifer Marlow – FXPAL, Palo Alto, California, United States
- Laurent Denoue – FXPAL, Palo Alto, California, United States
Abstract: Livestreaming and video calls have grown in popularity due to increased connectivity and advancements in mobile devices. Our interactions with these cameras are limited, as the cameras are either fixed or manually remote controlled. Here we present a Wizard-of-Oz elicitation study to inform the design of interactions with smart 360° cameras or robotic mobile desk cameras for use in video-conferencing and live-streaming situations. There was an overall preference for devices that can minimize distraction, as well as for devices that demonstrate an understanding of video-meeting context. We find that participants’ interactions dynamically grow in complexity, which illustrates the need for deeper event semantics within the camera AI. Finally, we detail interaction techniques and design insights to inform the next generation of personal video cameras for streaming and collaboration.
BookTubing Across Regions: Examining Differences based on Nonverbal and Verbal Cues
- Chinchu Thomas – Multimodal Perception Lab, International Institute of Information Technology, Bangalore, Karnataka, India
- Dinesh Babu Jayagopi – Multimodal Perception Lab, International Institute of Information Technology, Bangalore, Karnataka, India
- Daniel Gatica-Perez – Idiap-EPFL, Lausanne, Switzerland
Abstract: BookTubers are a rapidly growing community on YouTube who share content related to books. Previous literature has addressed problems related to automatically analyzing the opinions and mood of video logs (vlogs) as a generic category on YouTube. Unfortunately, the populations studied have not been diverse. In this work, we study and compare some aspects of the geographic/cultural context of BookTube videos, comparing non-Western (Indian) and Western populations. The role played by nonverbal and verbal cues in each of these contexts is analyzed automatically using audio, visual, and text features. The analysis shows that cultural context and popularity can be inferred to some degree using multimodal fusion of these features. The best results obtained are an average precision-recall score of 0.98 with Random Forest in a binary India vs. Western video classification task, and 0.66 in inferring binary popularity levels of BookTube videos.