Authors:
(1) Yunshan Ma, National University of Singapore;
(2) Xiaohao Liu, University of Chinese Academy of Sciences;
(3) Yinwei Wei, Monash University;
(4) Zhulin Tao, Communication University of China (corresponding author);
(5) Xiang Wang, University of Science and Technology of China, also affiliated with the Institute of Artificial Intelligence, Institute of Dataspace, Hefei Comprehensive National Science Center;
(6) Tat-Seng Chua, National University of Singapore.
4 RELATED WORK
We review the literature on bundles in two parts: 1) bundle recommendation and construction, and 2) bundle representation learning.
4.1 Bundle Recommendation and Construction
Product bundling is a mature marketing strategy that has been applied in various scenarios, including fashion outfits [11, 29], e-commerce [42], online music playlists [4, 7], online games [14], travel packages [32], and meals [28]. Personalized bundle recommendation [5, 9, 37] was the first line of work to focus on bundle-oriented problems in the data science community. Soon after, researchers realized that merely picking from predefined bundles cannot satisfy people’s diverse and personalized needs. The task of personalized bundle generation [1, 6, 13, 17, 24, 44] was therefore proposed, in which the model aims to automatically generate a bundle for a given user from a large set of items. Such a model has to handle both user personalization and item-item compatibility patterns simultaneously, with user-item interactions used specifically for personalization modeling. In this paper, we focus only on bundle construction, which aims to generate more bundles to enrich the platform's bundle catalog. In addition, most bundle-oriented research in general domains still follows the id-based paradigm, and only a few domains, such as fashion, have explored multimodality. We extend multimodal learning to one more domain, music playlists. Moreover, we also leverage user feedback for multimodal bundle construction.
4.2 Bundle Representation Learning
Bundle representation learning is the crux of all bundle-oriented problems. Initial studies [39] treat a bundle as a special type of item and simply use the bundle id to represent it. A natural next step is to consider the items encapsulated within a bundle to obtain a more detailed representation. The simplest method is to perform average pooling over the included items [51]. Later, sequential models such as Bi-LSTM [21] were used to capture the relations between consecutive items. However, the items within a bundle are essentially unordered, and sequential models cannot adequately capture all pairwise correlations. To address this limitation, attention models [9, 24, 33], Transformers [3, 30, 35, 40, 43, 46], and graph neural networks (GNNs) [5, 16, 38, 54, 55] are leveraged to model not only every pair of items within a bundle but also higher-order relations, by stacking multiple layers.
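To make the contrast between average pooling and attention-based aggregation concrete, the following is a minimal sketch in plain Python. It is an illustration only, not the implementation of any cited work: the function names (`mean_pool`, `self_attention_pool`), the toy two-dimensional embeddings, and the use of a single unparameterized self-attention layer (no learned query, key, or value projections) are all assumptions made for brevity.

```python
import math
from typing import List

Vector = List[float]


def mean_pool(items: List[Vector]) -> Vector:
    """Average the item embeddings dimension by dimension."""
    n = len(items)
    return [sum(column) / n for column in zip(*items)]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def self_attention_pool(items: List[Vector]) -> Vector:
    """One self-attention layer over the items, then mean pooling.

    Assumption: queries, keys, and values are the raw embeddings
    (no learned projections), so this is a toy version only.
    """
    dim = len(items[0])
    scale = math.sqrt(dim)
    contextualized = []
    for query in items:
        # Each item attends to every item in the bundle (all pairs).
        weights = softmax([dot(query, key) / scale for key in items])
        contextualized.append(
            [sum(w * value[j] for w, value in zip(weights, items))
             for j in range(dim)]
        )
    return mean_pool(contextualized)


# Hypothetical toy bundle of three items with 2-d embeddings.
bundle = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
shuffled = [bundle[2], bundle[0], bundle[1]]

print(mean_pool(bundle))
print(self_attention_pool(bundle))
print(self_attention_pool(shuffled))
```

Both aggregators give the same result (up to floating-point rounding) for `bundle` and `shuffled`, which matches the observation that items in a bundle are unordered. Unlike average pooling, the attention variant weighs every pair of items before aggregating, and stacking several such layers is how higher-order relations would be modeled.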
Even though much effort has been devoted to learning item correlations for good bundle representations, multimodal information has been less explored. Multimodal information, such as textual, visual, or knowledge graph information of items, has been shown to be effective in general recommendation [45, 47, 48]. In the fashion domain, visual and textual features have been extensively investigated for pairwise mix-and-match [20, 52] and outfit compatibility modeling [12, 41]. However, these works have not been extended to other domains such as music playlists, and the audio modality has rarely been studied in bundle recommendation or construction. More importantly, we argue that user-item interaction information, which is widely used in personalized recommendation, can serve as an additional modality in bundle construction. Sun et al. [42] leverage a pre-trained collaborative filtering (CF) model to obtain item representations that enhance the bundle completion task, but they do not fully justify the rationale and motivation. To the best of our knowledge, no previous work combines user-item interaction, bundle-item affiliation, and item content information for bundle construction.
This paper is available on arXiv under CC 4.0 license.