Debiased offline evaluation of recommender systems: A weighted-sampling approach

Carraro, Diego; Bridge, Derek G. 2020-07-06T08:55:56Z 2020-07-06T08:55:56Z 2020-03
dc.identifier.citation Carraro, D. and Bridge, D. (2020) 'Debiased offline evaluation of recommender systems: A weighted-sampling approach', in Proceedings of the 35th Annual ACM Symposium on Applied Computing, Brno, Czech Republic, 30 March-3 April, pp. 1435-1442. doi: 10.1145/3341105.3375759 en
dc.identifier.startpage 1435 en
dc.identifier.endpage 1442 en
dc.identifier.isbn 978-1-4503-6866-7
dc.identifier.doi 10.1145/3341105.3375759 en
dc.description.abstract Offline evaluation of recommender systems mostly relies on historical data, which is often biased by many confounders. In such data, user-item interactions are Missing Not At Random (MNAR). Measures of recommender system performance on MNAR test data are unlikely to be reliable indicators of real-world performance unless something is done to mitigate the bias. One way that researchers try to obtain less biased offline evaluation is by designing new supposedly unbiased performance estimators for use on MNAR test data. We investigate an alternative solution, a sampling approach. The general idea is to use a sampling strategy on MNAR data to generate an intervened test set with less bias, one in which interactions are Missing At Random (MAR) or, at least, one that is more MAR-like. An example of this is SKEW, a sampling strategy that aims to adjust for the confounding effect that an item's popularity has on its likelihood of being observed. In this paper, we propose a novel formulation for the sampling approach. We compare our solution to SKEW and to two baselines which perform a random intervention on MNAR data (and hence are equivalent to no intervention in practice). We empirically validate for the first time the effectiveness of SKEW and we show our approach to be a better estimator of the performance one would obtain on (unbiased) MAR test data. Our strategy benefits from high generality properties (e.g. it can also be employed for training a recommender) and low overheads (e.g. it does not require any learning). en
dc.description.sponsorship Science Foundation Ireland and European Regional Development Fund (12/RC/2289-P2) en
dc.format.mimetype application/pdf en
dc.language.iso en en
dc.publisher Association for Computing Machinery en
dc.rights © 2020 Association for Computing Machinery. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from en
dc.subject Offline evaluation en
dc.subject Bias en
dc.subject Intervened test sets en
dc.title Debiased offline evaluation of recommender systems: A weighted-sampling approach en
dc.type Conference item en
dc.internal.authorcontactother Derek Bridge, Computer Science, University College Cork, Cork, Ireland. +353-21-490-3000 Email: en
dc.internal.availability Full text available en
dc.description.version Accepted Version en
dc.contributor.funder Science Foundation Ireland en
dc.contributor.funder European Regional Development Fund en
dc.description.status Peer reviewed en
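
Illustrative note on the sampling idea in the abstract: the sketch below shows one way a popularity-adjusting intervention on MNAR data could look. It is not the paper's formulation, which this record does not give. The weighting (each interaction weighted by the inverse of its item's interaction count) is an assumption made for illustration, and the function name `popularity_skewed_sample` and its parameters are hypothetical. Sampling without replacement is done with the Efraimidis-Spirakis key method.

```python
import random
from collections import Counter


def popularity_skewed_sample(interactions, sample_size, seed=0):
    """Draw an intervened test set from (user, item) interactions.

    Assumption for illustration only: each interaction is weighted by
    1 / (number of interactions its item has), so interactions with
    popular items are less likely to be kept.
    """
    if sample_size > len(interactions):
        raise ValueError("sample_size exceeds the number of interactions")
    rng = random.Random(seed)

    # Item popularity = number of observed interactions per item.
    popularity = Counter(item for _, item in interactions)

    # Efraimidis-Spirakis weighted sampling without replacement:
    # give each element the key u ** (1 / weight) and keep the largest keys.
    keyed = []
    for user, item in interactions:
        weight = 1.0 / popularity[item]
        u = 1.0 - rng.random()  # uniform in (0, 1]
        keyed.append((u ** (1.0 / weight), (user, item)))

    keyed.sort(key=lambda pair: pair[0], reverse=True)
    return [interaction for _, interaction in keyed[:sample_size]]
```

Calling the function on a list of `(user, item)` pairs returns a subset of the requested size in which popular items tend to be under-represented relative to the original data. The random baselines mentioned in the abstract would correspond to giving every interaction the same weight.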
