Keynote Talks

Gregory Grefenstette (INRIA Saclay, France)

Bio: Gregory Grefenstette is a senior researcher at INRIA Saclay, France. An expert in information retrieval and natural language processing, Grefenstette established the field of Cross Language Information Retrieval by creating its first Workshop at SIGIR’96. He is also one of the pioneers of distributional semantics, following his PhD work « Exploring Automatic Thesaurus Generation » (Kluwer, 1994). Involved in information retrieval since the early TREC days, he has always been keen on large scale solutions to natural language processing problems, co-editing with Adam Kilgarriff a special issue of « Computational Linguistics » in 2003. Former chief scientist at the Xerox Research Centre Europe (1993-01), at Clairvoyance Corporation (2001-04), and with the French CEA (2004-08), and scientific director at Exalead (now part of Dassault Systèmes, 2008-13), he has been active in transferring research into products as inventor in 17 U.S. patents. His current research interests are lifelogging and personal semantics.

Title: Personal Information Systems and Personal Semantics

Abstract: People generally think of Big Data as something generated by machines or large communities of people interacting with the digital world. But technological progress means that each individual is currently, or soon will be, generating masses of digital data in their everyday lives. In every interaction with an application, every web page visited, every time your telephone is turned on, you generate information about yourself, Personal Big Data. With the rising adoption of quantified self gadgets, and the foreseeable adoption of intelligent glasses capturing daily life, the quantity of personal Big Data will only grow. In this Personal Big Data, as in other Big Data, a key problem is aligning concepts in the same semantic space. While concept alignment in the public sphere is an understood, though unresolved, problem, what does ontological organization of a personal space look like? Is it idiosyncratic, or something that can be shared between people? We will describe our current approach to this problem of organizing personal data and creating and exploiting a personal semantics.

Slides

Mounia Lalmas (Yahoo Labs, London, UK)

Bio: Mounia Lalmas is a Director of Research at Yahoo Labs London. She also holds an Honorary Professorship at University College London. Prior to this, she held a Microsoft Research/RAEng Research Chair at the School of Computing Science, University of Glasgow. Before that, she was Professor of Information Retrieval at the Department of Computer Science at Queen Mary, University of London. From 2002 until 2007, she co-led the Evaluation Initiative for XML Retrieval (INEX), a large-scale project with over 80 participating organizations worldwide, which was responsible for defining the nature of XML retrieval, and how it should be evaluated. Her work now focuses on studying user engagement in areas such as native advertising, digital media, social media, and search, and across devices (desktop, tablet and mobile). She currently leads a team of scientists working on Advertising Quality. She also pursues research in social media and search.

Title: Evaluating the search experience: from Retrieval Effectiveness to User Engagement

Abstract: Building retrieval systems that return results to users that satisfy their information need is one thing; Information Retrieval has a long history in evaluating how effective retrieval systems are. Many evaluation initiatives such as TREC and CLEF have allowed organizations worldwide to evaluate and compare retrieval approaches. Building a retrieval system that not only returns good results to users, but does so in a way that users will want to use that system again is something more challenging; a positive search experience has been shown to lead to users engaging long-term with the retrieval system. In this talk, I will review state-of-the-art approaches concerned with evaluating retrieval effectiveness. I will then focus on those approaches aiming at evaluating user engagement, and describe current work in this area. The talk will end with the proposal of a framework incorporating effectiveness evaluation into user engagement. An important component of this framework is to consider both within- and across-search session measurement.

Slides

Doug Oard (University of Maryland, USA)

Bio: Douglas Oard is a Professor at the University of Maryland, College Park, with joint appointments in the College of Information Studies (Maryland’s iSchool) and the University of Maryland Institute for Advanced Computer Studies (UMIACS). Dr. Oard earned his Ph.D. in Electrical Engineering from the University of Maryland. His research interests center around the use of emerging technologies to support information seeking by end users. Additional information is available at http://terpconnect.umd.edu/~oard/.

Title: Beyond Information Retrieval: When and how not to find things

Abstract: The traditional role of a search engine is much like the traditional role of a library: generally the objective is to help people find things. As we get better at this, however, we have been encountering an increasing number of cases in which some things that we know exist simply should not be found. Some well-known examples include removal of improperly posted copyrighted material from search engine indexes, and the evolving legal doctrine that is now commonly referred to as the “right to be forgotten.” Some such cases are simple, relying on users to detect specific content that should be flushed from a specific index. Other cases, however, are more complex. For example, in the aspect of the civil litigation process known as e-discovery, one side may be entitled to withhold entire classes of material that may not have been labeled in advance (because of attorney-client privilege). An even more complex example is government transparency, in which for public policy reasons we may want to make some information public, despite that information being intermixed with other information that must be protected. Professional archivists have long dealt with such challenges, so perhaps we should start thinking about how to build search engines that act less like a library and more like an archive. In this talk, I will use these and other examples to introduce the idea of “search among secrets” in which the goal is to help some users find some content while protecting some content from some users (or some uses). We’ll dive down to look at how this actually works today in a few specific cases, with particular attention to how queries are formulated and which parts of the process are, or might be, automated. With that as background, I will then offer a few initial thoughts on how we might evaluate such systems. I’ll conclude with an invitation to think together about how information retrieval researchers might, together with others, begin to tackle these challenges.
