Patents

  • US8676736 (B2): Recommender systems and methods using modified alternating least squares algorithm

  • US2012030159 (A1): Recommender Systems and Methods

Publications

  • Quadrana, Massimo, Alexandros Karatzoglou, Balázs Hidasi, and Paolo Cremonesi. Personalizing Session-based Recommendations with Hierarchical Recurrent Neural Networks. arXiv preprint arXiv:1706.04148 (2017).

    Abstract: Session-based recommendations are highly relevant in many modern on-line services (e.g. e-commerce, video streaming) and recommendation settings. Recently, Recurrent Neural Networks have been shown to perform very well in session-based settings. While in many session-based recommendation domains user identifiers are hard to come by, there are also domains in which user profiles are readily available. We propose a seamless way to personalize RNN models with cross-session information transfer and devise a Hierarchical RNN model that relays and evolves latent hidden states of the RNNs across user sessions. Results on two industry datasets show large improvements over the session-only RNNs.
  • Balázs Hidasi, and Alexandros Karatzoglou. Recurrent Neural Networks with Top-k Gains for Session-based Recommendations. arXiv preprint arXiv:1706.03847 (2017).

    Abstract: RNNs have been shown to be excellent models for sequential data and in particular for data that is generated from users in a session-based manner. The use of RNNs provides impressive performance benefits over classical methods in session-based recommendations. In this work we introduce novel ranking loss functions tailored to RNNs in the recommendation setting. The better performance of these losses over alternatives, along with further tricks and improvements described in this work, allows for an overall improvement of up to 35% in terms of MRR and Recall@20 over previous session-based RNN solutions and up to 53% over classical collaborative filtering approaches. Unlike data augmentation-based improvements, our method does not increase training times significantly.
  • Balázs Hidasi, et al. Parallel recurrent neural network architectures for feature-rich session-based recommendations. Proceedings of the 10th ACM Conference on Recommender Systems. ACM, 2016.

    Abstract: Real-life recommender systems often face the daunting task of providing recommendations based only on the clicks of a user session. Methods that rely on user profiles – such as matrix factorization – perform very poorly in this setting, thus item-to-item recommendations are used most of the time. However, the items typically have rich feature representations such as pictures and text descriptions that can be used to model the sessions. Here we investigate how these features can be exploited in Recurrent Neural Network based session models using deep learning. We show that obvious approaches do not leverage these data sources. We thus introduce a number of parallel RNN (p-RNN) architectures to model sessions based on the clicks and the features (images and text) of the clicked items. We also propose alternative training strategies for p-RNNs that suit them better than standard training. We show that p-RNN architectures with proper training have significant performance improvements over feature-less session models while all session-based models outperform the item-to-item type baseline.
  • Balázs Hidasi, Alexandros Karatzoglou, Linas Baltrunas, and Domonkos Tikk. (17 February 2016). Session-based recommendations with recurrent neural networks. ICLR 2016, Proceedings of the 4th International Conference on Learning Representations.

    Abstract: We apply recurrent neural networks (RNN) on a new domain, namely recommender systems. Real-life recommender systems often face the problem of having to base recommendations only on short session-based data (e.g. a small sportswear website) instead of long user histories (as in the case of Netflix). In this situation the frequently praised matrix factorization approaches are not accurate. This problem is usually overcome in practice by resorting to item-to-item recommendations, i.e. recommending similar items. We argue that by modeling the whole session, more accurate recommendations can be provided. We therefore propose an RNN-based approach for session-based recommendations. Our approach also considers practical aspects of the task and introduces several modifications to classic RNNs such as a ranking loss function that make it more viable for this specific problem. Experimental results on two data-sets show marked improvements over widely used approaches.
  • Benjamin Kille, Fabian Abel, Balázs Hidasi, and Sahin Albayrak. Using interaction signals for job recommendations. SIREMTI 2015: Workshop on Situation Recognition by Mining Temporal Information, held in conjunction with the 7th International Conference on Mobile Computing, Applications and Services MobiCASE 2015.

    Abstract: Job recommender systems depend on accurate feedback to improve their suggestions. Implicit feedback arises in terms of clicks, bookmarks and replies. We present results from a member inquiry conducted on a large-scale job portal. We analyse correlations between ratings and implicit signals to detect situations where members liked their suggestions. Results show that replies and bookmarks reflect preferences much better than clicks.
  • Martha Larson, Domonkos Tikk, Roberto Turrin. (20 September 2015). Overview of ACM RecSys CrowdRec 2015 Workshop: Crowdsourcing and Human Computation for Recommender Systems. RecSys 2015, Proceedings of the 9th ACM Conference on Recommender Systems. pp 341-342

    Abstract: CrowdRec 2015 provides the recommender system community with a forum at which to discuss crowdsourcing and human computation. Systems that explicitly collect information from human annotators to improve recommendations are becoming more widespread. At this year’s workshop, we highlight incentivization and the issue of avoiding bias. We take a special look at how recommender systems can influence collective behavior, and the contribution that the crowd can make to recommender system evaluation.
  • Balázs Hidasi, Domonkos Tikk. (14 July 2015). Speeding up ALS learning via approximate methods for context-aware recommendations. Knowledge and Information Systems, Springer London. pp 1-25

    Abstract: Implicit feedback-based recommendation problems, typically set in real-world applications, recently have been receiving more attention in the research community. From the practical point of view, scalability of such methods is crucial. However, factorization-based algorithms efficient in explicit rating data applied directly to implicit data are computationally inefficient; therefore, different techniques are needed to adapt to implicit feedback. For alternating least squares (ALS) learning, several research contributions have proposed efficient adaptation techniques for implicit feedback. These algorithms scale linearly with the number of nonzero data points, but cubically in the number of features, which is a computational bottleneck that prevents the efficient usage of accurate high factor models. Also, map-reduce type big data techniques are not viable with ALS learning, because there is no known technique that solves the high communication overhead required for random access of the feature matrices. To overcome this drawback, here we present two generic approximate variants for fast ALS learning, using conjugate gradient (CG) and coordinate descent (CD). Both CG and CD can be coupled with all methods using ALS learning. We demonstrate the advantages of fast ALS variants on iTALS, a generic context-aware algorithm, which applies ALS learning for tensor factorization on implicit data. In the experiments, we compare the approximate techniques with the base ALS learning in terms of training time, scalability, recommendation accuracy, and convergence. We show that the proposed solutions offer a trade-off between recommendation accuracy and speed of training time; this makes it possible to apply ALS-based methods efficiently even for billions of data points.
  • István Pilászy. (20 September 2015). Neighbor methods vs. matrix factorization: case studies of real-life recommendations. RecSys ’15 Proceedings of the 9th ACM Conference on Recommender Systems.

  • Balázs Hidasi. (September 2015). Context-aware preference modeling with factorization. RecSys ’15 Proceedings of the 9th ACM Conference on Recommender Systems. pp 371-375.

    Abstract: This work focuses on solving the context-aware implicit feedback based recommendation task with factorization and is heavily influenced by the practical considerations. I propose context-aware factorization algorithms that can efficiently work on implicit data. I generalize these algorithms and propose the General Factorization Framework (GFF) in which experimentation with novel preference models is possible. This practically useful, yet neglected feature results in models that are more appropriate for context-aware recommendations than the ones used by the state-of-the-art. I also propose a way to speed up and enhance scalability of the training process, that makes it viable to use the more accurate high factor models with reasonable training times.
  • Bottyán Németh. (20 September 2015). Scaling up Recommendation Services in Many Dimensions. RecSys ’15 Proceedings of the 9th ACM Conference on Recommender Systems.

  • Balázs Hidasi and Domonkos Tikk: General factorization framework for context-aware recommendations, Data Mining and Knowledge Discovery, May 2015.

    Abstract: Context-aware recommendation algorithms focus on refining recommendations by considering additional information, available to the system. This topic has gained a lot of attention recently. Among others, several factorization methods were proposed to solve the problem, although most of them assume explicit feedback which strongly limits their real-world applicability. While these algorithms apply various loss functions and optimization strategies, the preference modeling under context is less explored due to the lack of tools allowing for easy experimentation with various models. As context dimensions are introduced beyond users and items, the space of possible preference models and the importance of proper modeling largely increases. In this paper we propose a general factorization framework (GFF), a single flexible algorithm that takes the preference model as an input and computes latent feature matrices for the input dimensions. GFF allows us to easily experiment with various linear models on any context-aware recommendation task, be it explicit or implicit feedback based. The scaling properties make it usable under real life circumstances as well. We demonstrate the framework’s potential by exploring various preference models on a 4-dimensional context-aware problem with contexts that are available for almost any real life datasets. We show in our experiments (performed on five real life, implicit feedback datasets) that proper preference modelling significantly increases recommendation accuracy, and previously unused models outperform the traditional ones. Novel models in GFF also outperform state-of-the-art factorization algorithms. We also extend the method to be fully compliant to the Multidimensional Dataspace Model, one of the most extensive data models of context-enriched data. Extended GFF allows the seamless incorporation of information into the factorization framework beyond context, like item metadata, social networks, session information, etc. Preliminary experiments show great potential of this capability.
  • Gábor Takács and Domonkos Tikk: Alternating least squares for personalized ranking, ACM Recsys 2012: Proceedings of the sixth ACM conference on Recommender systems, 83-90.

  • Gábor Takács, István Pilászy, Bottyán Németh, and Domonkos Tikk: Scalable collaborative filtering approaches for large recommender systems, Journal of Machine Learning Research, 10: 623-656, 2009.

    Abstract: The collaborative filtering (CF) using known user ratings of items has proved to be effective for predicting user preferences in item selection. This thriving subfield of machine learning became popular in the late 1990s with the spread of online services that use recommender systems, such as Amazon, Yahoo! Music, and Netflix. CF approaches are usually designed to work on very large data sets. Therefore the scalability of the methods is crucial. In this work, we propose various scalable solutions that are validated against the Netflix Prize data set, currently the largest publicly available collection. First, we propose various matrix factorization (MF) based techniques. Second, a neighbor correction method for MF is outlined, which alloys the global perspective of MF and the localized property of neighbor based approaches efficiently. In the experimentation section, we first report on some implementation issues, and we suggest on how parameter optimization can be performed efficiently for MFs. We then show that the proposed scalable approaches compare favorably with existing ones in terms of prediction accuracy and/or required training time. Finally, we report on some experiments performed on MovieLens and Jester data sets.
  • Balázs Hidasi and Domonkos Tikk: Approximate modeling of continuous context in factorization algorithms. CARR 2014: 4th Workshop on Context-awareness in Retrieval and Recommendation, held in conjunction with 36th European Conference on information retrieval. April 13-16, Amsterdam.

  • Balázs Hidasi: Factorization models for context-aware recommendations. Infocommunications Journal VI/4, December 2014.

    Abstract: The field of implicit feedback based recommender algorithms has gained increased interest in the last few years, driven by the need of many practical applications where no explicit feedback is available. The main difficulty of this recommendation task is the lack of information on the negative preferences of the users that may lead to inaccurate recommendations and scalability issues. In this paper, we adopt the use of context-awareness to improve the accuracy of implicit models, a model extension technique that was applied successfully for explicit algorithms. We present a modified version of the iTALS algorithm (coined iTALSx) that uses a different underlying factorization model. We explore the key differences between these approaches and conduct experiments on five data sets to experimentally determine the advantages of the underlying models. We show that iTALSx outperforms the other method on sparser data sets and is able to model complex user–item relations with fewer factors.
  • Alan Said, Domonkos Tikk, and Paolo Cremonesi: Benchmarking – A methodology for ensuring the relative quality of recommendation systems in software engineering. In Recommendation Systems in Software Engineering, Robillard, M.P., Maalej, W., Walker, R.J., Zimmermann, Th. (Eds.), Springer, 2014, ISBN 978-3-642-45135-5.

  • Alan Said, Martha Larson, Domonkos Tikk, Paolo Cremonesi, Alexandros Karatzoglou, Frank Hopfgartner, Roberto Turrin, and Joost Geurts: User-item reciprocity in recommender systems: Incentivizing the crowd. ProS 2014: Workshop on UMAP Projects Synergy, held in conjunction with 22nd Conference on User Modelling, Adaptation and Personalization. July 7-11, Aalborg, Denmark

    Abstract: Data consumption has changed significantly in the last 10 years. The digital revolution and the Internet have brought an abundance of information to users. Recommender systems are a popular means of finding content that is both relevant and personalized. However, today’s users require better recommender systems, capable of producing continuous data feeds keeping up with their instantaneous and mobile needs. The CrowdRec project addresses this demand by providing context-aware, resource-combining, socially-informed, interactive and scalable recommendations. The key insight of CrowdRec is that, in order to achieve the dense, high-quality, timely information required for such systems, it is necessary to move from passive user data collection, to more active techniques fostering user engagement. For this purpose, CrowdRec activates the crowd, soliciting input and feedback from the wider community.
  • Balázs Hidasi and Domonkos Tikk: Context-aware item-to-item recommendation within the factorization framework, CARR 2013: 3rd Workshop on Context-awareness in Retrieval and Recommendation, held in conjunction with 6th ACM Int. Conf. on Web Search and Data Mining

  • Bottyán Németh, Gábor Takács, István Pilászy, and Domonkos Tikk: Visualization of movie features in collaborative filtering, Proc. of the 12th IEEE Int. Conf. on Intelligent Software Methodologies, Tools and Techniques (SoMeT 2013), pp. 229–233

  • Balázs Hidasi and Domonkos Tikk: Initializing Matrix Factorization Methods on Implicit Feedback Databases, Journal of Universal Computer Science, Volume 19, Issue 12, Pages 1834-1853.

  • Alejandro Bellogín, Pablo Castells, Alan Said and Domonkos Tikk: Workshop on Reproducibility and Replication in Recommender Systems Evaluation – RepSys. Recsys 2013: 7th ACM Conference on Recommender Systems, Hong Kong, pp 485-486.

  • Martha Larson, Alan Said, Yue Shi, Paolo Cremonesi, Domonkos Tikk and A Karatzoglou: Activating the Crowd: exploiting user-item reciprocity for recommendation, CrowdRec 2013: Workshop on Crowdsourcing and Human Computation for Recommender Systems, held in conjunction with 7th ACM Conference on Recommender Systems (Recsys’13)

  • Alan Said, Domonkos Tikk, and Andreas Hotho: The challenge of recommender systems challenges, ACM Recsys 2012: Proceedings of the sixth ACM conference on Recommender systems, 9-10.

  • Dávid Zibriczky, Zoltán Petres, Márton Waszlavik, and Domonkos Tikk: EPG content recommendation in large scale: a case study on interactive TV platform. Machine Learning with Multimedia Data – Special session at the 12th IEEE International Conference on Machine Learning and Applications (ICMLA’13)

    Abstract: Recommender systems in TV applications mostly focus on the recommendation of video-on-demand (VOD) content, though the major part of users’ content consumption is realized on linear channel programs, termed EPG content. In this case study we present how we tackled the EPG recommendation task, which exhibits several differences compared to the VOD scenario, including the lack of explicit user feedback, the magnitude of the cold start problem, as well as the data cleaning and feature selection that must be applied to raw consumption data. We provide both offline and online model validation. First we showcase the typical approach in machine learning by evaluating models against recall in an offline setting. Then, we investigate in depth the real-world results of the recommendation app using the pre-trained models, and analyze how personalized recommendations influence users’ watching behavior. The experimentation results are based on our recommender system deployed at a Canadian IPTV service provider using Microsoft Mediaroom middleware.
  • Balázs Hidasi and Domonkos Tikk: Enhancing matrix factorization through initialization for implicit feedback databases, CARR 2012: 2nd Workshop on Context-awareness in Retrieval and Recommendation, pp. 2-9

  • Dávid Zibriczky, Balázs Hidasi, Zoltán Petres, and Domonkos Tikk: Personalized recommendation of linear content on interactive TV platforms: beating the cold start and noisy implicit user feedback, TVMMP 2012: International Workshop on TV and Multimedia Personalization, held in conjunction with UMAP 2012: 20th conference on User Modeling, Adaptation, and Personalization

    Abstract: Recommender systems in TV applications mostly focus on the recommendation of video-on-demand (VOD) content, although the major part of users’ content consumption is realized on linear channel programs (live or recorded), termed EPG programs. The accurate collaborative filtering algorithms suitable for VOD recommendation cannot be directly carried over for EPG program recommendation. First, EPG program recommendation features the cold start problem; a significant part of EPG programs are new in the system. Second, and more importantly, without explicit user feedbacks (ratings) the algorithms have to model user preference based on the noisy and less directly interpretable implicit user feedbacks. In this paper, we present several approaches that overcome these difficulties, by applying pre-filtering on noisy low-level data and taking into account channel preferences of users and program metadata if available to cope with the cold start. Using time-dependent tensor factorization approaches, the temporal preferences of users are also reflected in recommendation, that also hints on the person watching the TV. Experiments were performed on a dataset of SaskTel, a Canadian IPTV service provider using Microsoft Mediaroom middleware.
  • Domonkos Tikk: From a toolkit of recommendation algorithms into a real business: the Gravity R&D experience, Workshop of Recommender systems challenge 2012, held in conjunction with ACM Recsys 2012: the sixth ACM conference on Recommender systems

  • Alan Said, Domonkos Tikk, Klára Stumpf, Yue Shi, Martha Larson, and Paolo Cremonesi: Recommender Systems Evaluation: A 3D Benchmark, RUE 2012: Workshop on Recommendation Utility Evaluation: Beyond RMSE, held in conjunction with ACM Recsys 2012: the sixth ACM conference on Recommender systems

  • Nikos Manouselis, Alan Said, Domonkos Tikk, Jannis Hermanns, Benjamin Kille, Hendrik Drachsler, Katrien Verbert, and Kris Jack: Recommender systems challenge 2012, ACM Recsys 2012: Proceedings of the sixth ACM conference on Recommender systems, 353-354.

  • Balázs Hidasi and Domonkos Tikk: Fast ALS-Based tensor factorization for context-aware recommendation from implicit feedback, ECML-PKDD 2012: Proceedings of the 2012 European conference on Machine Learning and Knowledge Discovery in Databases – Volume Part II, 67-82.

  • Gábor Takács, István Pilászy, and Domonkos Tikk: Applications of the conjugate gradient method for implicit feedback collaborative filtering, ACM Recsys 2011: Proceedings of the 5th ACM conference on Recommender systems, 297-300

  • István Pilászy, Dávid Zibriczky, and Domonkos Tikk: Fast ALS-based matrix factorization for explicit and implicit feedback datasets, ACM Recsys 2010: Proceedings of the 4th ACM conference on Recommender systems, 71-78

    Abstract: Alternating least squares (ALS) is a powerful matrix factorization (MF) algorithm for both explicit and implicit feedback based recommender systems. As shown in many articles, increasing the number of latent factors (denoted by K) boosts the prediction accuracy of MF based recommender systems, including ALS as well. The price of the better accuracy is paid by the increased running time: the running time of the original version of ALS is proportional to K³. Yet, the running time of model building can be important in recommendation systems; if the model cannot keep up with the changing item portfolio and/or user profile, the prediction accuracy can be degraded. In this paper we present novel and fast ALS variants both for the implicit and explicit feedback datasets, which offer a better trade-off between running time and accuracy. Due to the significantly lower computational complexity of the algorithm – linear in terms of K – the model being generated under the same amount of time is more accurate, since the faster training makes it possible to build a model with more latent factors. We demonstrate the efficiency of our ALS variants on two datasets using two performance measures, RMSE and average relative position (ARP), and show that either a significantly more accurate model can be generated under the same amount of time or a model with similar prediction accuracy can be created faster; for explicit feedback the speed-up factor can be even 5-10.
  • István Pilászy and Domonkos Tikk: Recommending new movies: even a few ratings are more valuable than metadata, ACM Recsys 2009: Proceedings of the 3rd ACM conference on Recommender systems, 93-100.

    Abstract: The Netflix Prize (NP) competition gave much attention to collaborative filtering (CF) approaches. Matrix factorization (MF) based CF approaches assign low dimensional feature vectors to users and items. We link CF and content-based filtering (CBF) by finding a linear transformation that transforms user or item descriptions so that they are as close as possible to the feature vectors generated by MF for CF. We propose methods for explicit feedback that are able to handle 140,000 features when feature vectors are very sparse. With movie metadata collected for the NP movies we show that the prediction performance of the methods is comparable to that of CF, and can be used to predict user preferences on new movies. We also investigate the value of movie metadata compared to movie ratings in regards of predictive power. We compare our solely CBF approach with a simple baseline rating-based predictor. We show that even 10 ratings of a new movie are more valuable than its metadata for predicting user ratings.