Debiased offline evaluation of recommender systems: A weighted-sampling approach

dc.contributor.author: Carraro, Diego
dc.contributor.author: Bridge, Derek G.
dc.contributor.funder: Science Foundation Ireland
dc.contributor.funder: European Regional Development Fund
dc.description.abstract: Offline evaluation of recommender systems mostly relies on historical data, which is often biased by many confounders. In such data, user-item interactions are Missing Not At Random (MNAR). Measures of recommender system performance on MNAR test data are unlikely to be reliable indicators of real-world performance unless something is done to mitigate the bias. One way that researchers try to obtain less biased offline evaluation is by designing new supposedly unbiased performance estimators for use on MNAR test data. We investigate an alternative solution, a sampling approach. The general idea is to use a sampling strategy on MNAR data to generate an intervened test set with less bias: one in which interactions are Missing At Random (MAR) or, at least, one that is more MAR-like. An example of this is SKEW, a sampling strategy that aims to adjust for the confounding effect that an item's popularity has on its likelihood of being observed. In this paper, we propose a novel formulation for the sampling approach. We compare our solution to SKEW and to two baselines which perform a random intervention on MNAR data (and hence are equivalent to no intervention in practice). We empirically validate for the first time the effectiveness of SKEW and we show our approach to be a better estimator of the performance one would obtain on (unbiased) MAR test data. Our strategy benefits from high generality properties (e.g. it can also be employed for training a recommender) and low overheads (e.g. it does not require any learning).
dc.description.sponsorship: Science Foundation Ireland and European Regional Development Fund (12/RC/2289-P2)
dc.description.status: Peer reviewed
dc.description.version: Accepted Version
dc.identifier.citation: Carraro, D. and Bridge, D. (2020) 'Debiased offline evaluation of recommender systems: A weighted-sampling approach', in Proceedings of the 35th Annual ACM Symposium on Applied Computing, Brno, Czech Republic, 30 March-3 April, pp. 1435-1442. doi: 10.1145/3341105.3375759
dc.publisher: Association for Computing Machinery
dc.rights: © 2020 Association for Computing Machinery. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from
dc.subject: Offline evaluation
dc.subject: Intervened test sets
dc.title: Debiased offline evaluation of recommender systems: A weighted-sampling approach
dc.type: Conference item
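
Illustrative note on the abstract: the abstract describes sampling from MNAR interactions to build an intervened test set that is less affected by item popularity. The sketch below shows one way such a popularity-adjusting sample could be drawn. It is not the authors' formulation, and it is not a confirmed implementation of SKEW. It assumes that each interaction is weighted by the inverse of its item's popularity, and it uses Efraimidis-Spirakis keys to sample without replacement. The function names, the (user, item) input format, and the weighting rule are all hypothetical choices made for illustration.

```python
import random
from collections import Counter


def inverse_popularity_weights(interactions):
    """Return a dict mapping each item to 1 / (number of interactions it has).

    ASSUMPTION: inverse-popularity weighting is an illustrative choice,
    not a rule taken from the paper.
    """
    popularity = Counter(item for _, item in interactions)
    return {item: 1.0 / count for item, count in popularity.items()}


def sample_intervened_test_set(interactions, sample_size, seed=0):
    """Draw a weighted sample (without replacement) of (user, item) pairs.

    Each interaction gets the key u ** (1 / w), where u is uniform in [0, 1)
    and w is the interaction's weight; the sample_size largest keys are kept
    (Efraimidis-Spirakis weighted sampling).
    """
    rng = random.Random(seed)
    weights = inverse_popularity_weights(interactions)
    keyed = []
    for user, item in interactions:
        key = rng.random() ** (1.0 / weights[item])
        keyed.append((key, (user, item)))
    # Sort by key only, largest first, and keep the top sample_size pairs.
    keyed.sort(key=lambda entry: entry[0], reverse=True)
    return [pair for _, pair in keyed[:sample_size]]


if __name__ == "__main__":
    # Toy MNAR-like log: item "hit" is far more popular than the others.
    log = [(u, "hit") for u in range(20)] + [(u, f"niche{u}") for u in range(5)]
    test_set = sample_intervened_test_set(log, sample_size=8, seed=42)
    print(Counter(item for _, item in test_set))
```

Under this weighting, interactions with popular items are individually less likely to be kept, so the sampled set tends to be less dominated by popular items than the original log. No learning step is involved, which is in keeping with the low-overhead property the abstract mentions.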