
1. Multi-LLM Collaboration for Medication Recommendation
2. Beyond Similarity: Personalized Federated Recommendation with Composite Aggregation
3. LLM as Explainable Re-Ranker for Recommendation System
4. AskNearby: An LLM-Based Application for Neighborhood Information Retrieval and Personalized Cognitive-Map Recommendations
5. Q-BERT4Rec: Quantized Semantic-ID Representation Learning for Multimodal Recommendation
6. Structured Spectral Reasoning for Frequency-Adaptive Multimodal Recommendation
7. UserSimCRS v2: Simulation-Based Evaluation for Conversational Recommender Systems
8. Toward a benchmark for CTR prediction in online advertising: datasets, evaluation protocols and perspectives
9. CoDeR+: Interest-aware Counterfactual Reasoning for Sequential Recommendation
10. Rethinking Convolutional Neural Network in Multimodal Sequential Recommendation
1. Multi-LLM Collaboration for Medication Recommendation
Huascar Sanchez, Briland Hitaj, Jules Bergmann, Linda Briesemeister
https://arxiv.org/abs/2512.05066
As healthcare increasingly turns to AI for scalable and trustworthy clinical decision support, ensuring reliability in model reasoning remains a critical challenge. Individual large language models (LLMs) are susceptible to hallucinations and inconsistency, whereas naive ensembles of models often fail to deliver stable and credible recommendations. Building on our previous work on LLM Chemistry, which quantifies the collaborative compatibility among LLMs, we apply this framework to improve the reliability in medication recommendation from brief clinical vignettes. Our approach leverages multi-LLM collaboration guided by Chemistry-inspired interaction modeling, enabling ensembles that are effective (exploiting complementary strengths), stable (producing consistent quality), and calibrated (minimizing interference and error amplification). We evaluate our Chemistry-based Multi-LLM collaboration strategy on real-world clinical scenarios to investigate whether such interaction-aware ensembles can generate credible, patient-specific medication recommendations. Preliminary results are encouraging, suggesting that LLM Chemistry-guided collaboration may offer a promising path toward reliable and trustworthy AI assistants in clinical practice.
2. Beyond Similarity: Personalized Federated Recommendation with Composite Aggregation
Honglei Zhang, Haoxuan Li, Jundong Chen, Sen Cui, Kunda Yan, Abudukelimu Wuerkaixi, Xin Zhou, Zhiqi Shen, Yidong Li
https://dl.acm.org/doi/10.1145/3779442
Federated recommendation aims to collect global knowledge by aggregating local models from massive devices, providing recommendations while preserving privacy. Current methods mainly leverage aggregation functions invented by the federated vision community, e.g., clustering aggregation, to aggregate parameters from similar clients. Despite their considerable performance, we argue that applying them directly to federated recommendation is suboptimal, mainly because of the disparate model structures. Unlike the structured parameters of convolutional neural networks in federated vision, federated recommender models typically distinguish themselves by employing a one-to-one item embedding table. This discrepancy induces the challenging embedding skew issue: aggregation continually updates the trained embeddings but ignores the non-trained ones, thus failing to predict future items accurately. To this end, we propose a personalized Federated recommendation model with Composite Aggregation (FedCA), which not only aggregates similar clients to enhance trained embeddings, but also aggregates complementary clients to update non-trained embeddings. In addition, we formulate the overall learning process as a unified optimization algorithm that jointly learns similarity and complementarity. Extensive experiments on several real-world datasets substantiate the effectiveness of the proposed model. Our code is available at https://github.com/hongleizhang/FedCA.
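To make the composite-aggregation idea more concrete, below is a minimal NumPy sketch under our own simplifying assumptions (binary trained-item masks, a precomputed client-similarity matrix, uniform weighting of complementary clients; none of this is taken from the FedCA code): trained embedding rows are averaged over similar clients, while rows a client never trained are filled in from complementary clients that did train them.

```python
# Sketch of composite aggregation for federated item-embedding tables.
# Assumptions (ours, not FedCA's): binary trained-item masks, a precomputed
# client-similarity matrix, and uniform weights for complementary clients.
import numpy as np

def composite_aggregate(client_embs, client_masks, similarity):
    """client_embs: list of (num_items, dim) arrays, one embedding table per client.
    client_masks: list of boolean arrays marking the items each client actually trained.
    similarity: (num_clients, num_clients) client-affinity matrix."""
    num_clients = len(client_embs)
    num_items, _ = client_embs[0].shape
    aggregated = [emb.copy() for emb in client_embs]
    for c in range(num_clients):
        for i in range(num_items):
            if client_masks[c][i]:
                # Trained row: weight by similarity over clients that also trained item i.
                weights = np.array([similarity[c, k] if client_masks[k][i] else 0.0
                                    for k in range(num_clients)])
            else:
                # Non-trained row: borrow uniformly from complementary clients that trained it.
                weights = np.array([1.0 if client_masks[k][i] else 0.0
                                    for k in range(num_clients)])
            if weights.sum() > 0:
                weights = weights / weights.sum()
                aggregated[c][i] = sum(w * client_embs[k][i]
                                       for k, w in zip(range(num_clients), weights))
    return aggregated  # one personalized table per client
```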
3. LLM as Explainable Re-Ranker for Recommendation System
Yaqi Wang, Haojia Sun, Shuting Zhang
https://arxiv.org/abs/2512.03439
The application of large language models (LLMs) in recommendation systems has recently gained traction. Traditional recommendation systems often lack explainability and suffer from issues such as popularity bias. Previous research has also indicated that LLMs, when used as standalone predictors, fail to achieve accuracy comparable to traditional models. To address these challenges, we propose to use an LLM as an explainable re-ranker, a hybrid approach that combines traditional recommendation models with LLMs to enhance both accuracy and interpretability. We constructed a dataset to train the re-ranker LLM and evaluated the alignment between the generated dataset and human expectations. Leveraging a two-stage training process, our model significantly improved NDCG, a key ranking metric. Moreover, the re-ranker outperformed a zero-shot baseline in ranking accuracy and interpretability. These results highlight the potential of integrating traditional recommendation models with LLMs to address limitations in existing systems and pave the way for more explainable and fair recommendation frameworks.
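Since NDCG is the headline metric here, a compact reference implementation of the standard NDCG@k definition may help readers gauge the reported gains; this is textbook code, not taken from the paper.

```python
# Standard NDCG@k: discounted gain of the predicted ranking, normalized by the
# ideal ordering of the same relevance labels. Textbook definition, not paper code.
import math

def ndcg_at_k(relevances_in_ranked_order, k):
    """relevances_in_ranked_order: graded relevance of items, in the order the model ranked them."""
    def dcg(rels):
        return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(rels))
    ideal = dcg(sorted(relevances_in_ranked_order, reverse=True)[:k])
    return dcg(relevances_in_ranked_order[:k]) / ideal if ideal > 0 else 0.0

# Example: the single relevant item is placed 3rd among 5 re-ranked candidates.
print(ndcg_at_k([0, 0, 1, 0, 0], k=5))  # 0.5
```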
4. AskNearby: An LLM-Based Application for Neighborhood Information Retrieval and Personalized Cognitive-Map Recommendations
Luyao Niu, Zhicheng Deng, Boyang Li, Nuoxian Huang, Ruiqi Liu, Wenjia Zhang
https://arxiv.org/abs/2512.02502
The "15-minute city" envisions neighborhoods where residents can meet daily needs via a short walk or bike ride. Realizing this vision requires not only physical proximity but also efficient and reliable access to information about nearby places, services, and events. Existing location-based systems, however, focus mainly on city-level tasks and neglect the spatial, temporal, and cognitive factors that shape localized decision-making. We conceptualize this gap as the Local Life Information Accessibility (LLIA) problem and introduce AskNearby, an AI-driven community application that unifies retrieval and recommendation within the 15-minute life circle. AskNearby integrates (i) a three-layer Retrieval-Augmented Generation (RAG) pipeline that synergizes graph-based, semantic-vector, and geographic retrieval with (ii) a cognitive-map model that encodes each user's neighborhood familiarity and preferences. Experiments on real-world community datasets demonstrate that AskNearby significantly outperforms LLM-based and map-based baselines in retrieval accuracy and recommendation quality, achieving robust performance in spatiotemporal grounding and cognitive-aware ranking. Real-world deployments further validate its effectiveness. By addressing the LLIA challenge, AskNearby empowers residents to more effectively discover local resources, plan daily activities, and engage in community life.
5. Q-BERT4Rec: Quantized Semantic-ID Representation Learning for Multimodal Recommendation
Haofeng Huang, Ling Gai
https://arxiv.org/abs/2512.02474
Sequential recommendation plays a critical role in modern online platforms such as e-commerce, advertising, and content streaming, where accurately predicting users' next interactions is essential for personalization. Recent Transformer-based methods like BERT4Rec have shown strong modeling capability, yet they still rely on discrete item IDs that lack semantic meaning and ignore rich multimodal information (e.g., text and image). This leads to weak generalization and limited interpretability. To address these challenges, we propose Q-BERT4Rec, a multimodal sequential recommendation framework that unifies semantic representation and quantized modeling. Specifically, Q-BERT4Rec consists of three stages: (1) cross-modal semantic injection, which enriches randomly initialized ID embeddings through a dynamic transformer that fuses textual, visual, and structural features; (2) semantic quantization, which discretizes fused representations into meaningful tokens via residual vector quantization; and (3) multi-mask pretraining and fine-tuning, which leverage diverse masking strategies -- span, tail, and multi-region -- to improve sequential understanding. We validate our model on public Amazon benchmarks and demonstrate that Q-BERT4Rec significantly outperforms many strong existing methods, confirming the effectiveness of semantic tokenization for multimodal sequential recommendation. Our source code will be made publicly available on GitHub after publication.
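Residual vector quantization, the mechanism behind stage (2), can be summarized in a few lines: each codebook quantizes the residual left by the previous one, so an item embedding becomes a short sequence of discrete codes, i.e. its semantic ID. The sketch below uses fixed random codebooks purely to show the encode step; it is not the paper's training procedure.

```python
# Residual vector quantization (RVQ) encode step: each stage snaps the current
# residual to its nearest codeword, and the residual shrinks stage by stage.
# Codebooks are random here for illustration; in practice they are learned.
import numpy as np

rng = np.random.default_rng(0)
dim, codebook_size, num_stages = 64, 256, 3
codebooks = [rng.normal(size=(codebook_size, dim)) for _ in range(num_stages)]

def rvq_encode(embedding):
    residual = embedding.copy()
    codes = []
    for codebook in codebooks:
        idx = int(np.argmin(np.linalg.norm(codebook - residual, axis=1)))  # nearest codeword
        codes.append(idx)
        residual = residual - codebook[idx]
    return codes  # the item's discrete semantic ID, one token per stage

fused_item_embedding = rng.normal(size=dim)   # stand-in for the fused multimodal embedding
print(rvq_encode(fused_item_embedding))       # three integer codes, one per quantization stage
```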
6. Structured Spectral Reasoning for Frequency-Adaptive Multimodal Recommendation
Wei Yang, Rui Zhong, Yiqun Chen, Chi Lu, Peng Jiang
https://arxiv.org/abs/2512.01372
Multimodal recommendation aims to integrate collaborative signals with heterogeneous content such as visual and textual information, but remains challenged by modality-specific noise, semantic inconsistency, and unstable propagation over user-item graphs. These issues are often exacerbated by naive fusion or shallow modeling strategies, leading to degraded generalization and poor robustness. While recent work has explored the frequency domain as a lens to separate stable from noisy signals, most methods rely on static filtering or reweighting, lacking the ability to reason over spectral structure or adapt to modality-specific reliability. To address these challenges, we propose a Structured Spectral Reasoning (SSR) framework for frequency-aware multimodal recommendation. Our method follows a four-stage pipeline: (i) Decompose graph-based multimodal signals into spectral bands via graph-guided transformations to isolate semantic granularity; (ii) Modulate band-level reliability with spectral band masking, a training-time masking with a prediction-consistency objective that suppresses brittle frequency components; (iii) Fuse complementary frequency cues using hyperspectral reasoning with low-rank cross-band interaction; and (iv) Align modality-specific spectral features via contrastive regularization to promote semantic and structural consistency. Experiments on three real-world benchmarks show consistent gains over strong baselines, particularly under sparse and cold-start settings. Additional analyses indicate that structured spectral modeling improves robustness and provides clearer diagnostics of how different bands contribute to performance.
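As a hedged illustration of step (i), the snippet below splits a node signal into low-, mid-, and high-frequency bands using the eigendecomposition of a normalized graph Laplacian; the equal-thirds band boundaries and the toy graph are assumptions made for this sketch, not SSR's actual decomposition.

```python
# Decompose a graph signal into spectral bands via the normalized Laplacian.
# Splitting the spectrum into equal thirds is an assumption for illustration.
import numpy as np

def spectral_bands(adjacency, signal, num_bands=3):
    """adjacency: (n, n) symmetric matrix; signal: (n, d) node features, e.g. item embeddings."""
    deg = adjacency.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(deg, 1e-12)))
    laplacian = np.eye(len(deg)) - d_inv_sqrt @ adjacency @ d_inv_sqrt
    eigvals, eigvecs = np.linalg.eigh(laplacian)            # frequencies in ascending order
    spectrum = eigvecs.T @ signal                            # graph Fourier transform
    bands = []
    for idx in np.array_split(np.arange(len(eigvals)), num_bands):
        mask = np.zeros(len(eigvals))
        mask[idx] = 1.0
        bands.append(eigvecs @ (mask[:, None] * spectrum))   # inverse transform per band
    return bands                                             # low-, mid-, high-frequency parts

A = np.array([[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]], dtype=float)
X = np.random.default_rng(0).normal(size=(4, 8))
low, mid, high = spectral_bands(A, X)
print(np.allclose(low + mid + high, X))                      # the bands sum back to the signal
```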
7. UserSimCRS v2: Simulation-Based Evaluation for Conversational Recommender Systems
Nolwenn Bernard, Krisztian Balog
https://arxiv.org/abs/2512.04588
Resources for simulation-based evaluation of conversational recommender systems (CRSs) are scarce. The UserSimCRS toolkit was introduced to address this gap. In this work, we present UserSimCRS v2, a significant upgrade that aligns the toolkit with state-of-the-art research. Key extensions include an enhanced agenda-based user simulator, the introduction of large language model-based simulators, integration with a wider range of CRSs and datasets, and new LLM-as-a-judge evaluation utilities. We demonstrate these extensions in a case study. The toolkit is available at https://github.com/iai-group/UserSimCRS.
8. Toward a benchmark for CTR prediction in online advertising: datasets, evaluation protocols and perspectives
Shan Gao, Yanwu Yang
https://arxiv.org/abs/2512.01179
This research designs a unified architecture for a CTR prediction benchmark (Bench-CTR) platform that offers flexible interfaces to datasets and to the components of a wide range of CTR prediction models. Moreover, we construct a comprehensive system of evaluation protocols encompassing real-world and synthetic datasets, a taxonomy of metrics, standardized procedures, and experimental guidelines for calibrating the performance of CTR prediction models. Furthermore, we implement the proposed benchmark platform and conduct a comparative study that evaluates a wide range of state-of-the-art models, from traditional multivariate statistical approaches to modern large language model (LLM)-based approaches, on three public datasets and two synthetic datasets. Experimental results reveal that (1) high-order models largely outperform low-order models, though the advantage varies across metrics and datasets; (2) LLM-based models demonstrate remarkable data efficiency, achieving performance comparable to other models while using only 2% of the training data; and (3) the performance of CTR prediction models improved significantly from 2015 to 2016 and then entered a stage of slow progress, a pattern that is consistent across datasets. This benchmark is expected to facilitate model development and evaluation and to enhance practitioners' understanding of the underlying mechanisms of CTR prediction models. Code is available at https://github.com/NuriaNinja/Bench-CTR.
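For readers who want a feel for what a standardized CTR evaluation protocol reduces to in code, the snippet below scores predicted click probabilities with the two metrics most CTR benchmarks report, AUC and log loss, using scikit-learn; the synthetic data and the logistic-regression baseline are placeholders, not part of Bench-CTR.

```python
# Minimal CTR evaluation protocol: fit a low-order baseline and report AUC and
# log loss on held-out impressions. Data and model are placeholders, not Bench-CTR.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss, roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 20))                                    # stand-in for encoded features
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=10_000) > 1.0).astype(int)  # synthetic clicks

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
p_click = model.predict_proba(X_test)[:, 1]

print(f"AUC:      {roc_auc_score(y_test, p_click):.4f}")
print(f"Log loss: {log_loss(y_test, p_click):.4f}")
```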
9. CoDeR+: Interest-aware Counterfactual Reasoning for Sequential Recommendation
Sitao Lin, Shuai Tang, Xiaofeng Zhang, Jianghong Ma, Ziao Wang
https://doi.org/10.1145/3778863
Sequential recommendation aims to predict users’ next interactions by analyzing historical behavioral data. Traditional methods typically focus on learning fine-grained feature representations or extracting high-level user preferences to enhance recommendation accuracy. However, they often overlook the dynamic nature of user demand, which can shift over short periods and may resemble random noise. In our previous work, we introduced CoDeR, a framework that captures demand shifts and mitigates confounding biases through backdoor adjustment. Despite its effectiveness, CoDeR has limitations in its causal relation modeling, particularly in neglecting the role of user interest as a confounder. In this work, we propose CoDeR+, an enhanced framework that refines key components of CoDeR. First, we extend the original User Demand Extraction module into Interest-aware User Demand Modeling, introducing two submodules that explicitly model user interest and integrate it into demand representations. Second, we introduce a new Robust Counterfactual Demand Reasoning module, where user interest is treated as an additional confounder alongside demand drift, improving the causal correction process. Additionally, we provide a rigorous theoretical analysis of the updated backdoor adjustment and propose a simplified probability estimation method that reduces computational complexity. Extensive experiments on four real-world datasets demonstrate the effectiveness of CoDeR+. The source code for both CoDeR and CoDeR+ is publicly available at https://github.com/hellolst23/CoDeR.
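For context, the backdoor adjustment that CoDeR+ extends takes the textbook form below when both demand drift and user interest are treated as observed confounders; the symbols D and I are our labels for illustration, not necessarily the paper's notation.

```latex
% Backdoor adjustment with two observed confounders: demand drift D and user interest I.
% Textbook identity; symbol names are illustrative, not necessarily the paper's notation.
P\bigl(Y \mid \mathrm{do}(X = x)\bigr)
  = \sum_{d}\sum_{i} P\bigl(Y \mid X = x,\, D = d,\, I = i\bigr)\, P(D = d,\, I = i)
```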
10. Rethinking Convolutional Neural Network in Multimodal Sequential Recommendation
hicheng Zhou, Xiangwu Meng, Yujie Zhang
https://dl.acm.org/doi/10.1145/3777377
Multimodal data can portray changes in user interests more comprehensively, and thus multimodal sequential recommendation (MSRS) has gained widespread attention in recent years. However, MSRS faces two key challenges: (1) how to effectively model long-range dependencies in user interaction sequences; and (2) how to efficiently fuse multimodal features. To address these challenges, this paper proposes a novel multimodal sequential recommendation architecture based on a pure convolutional neural network (CNN), named PCMSRec. PCMSRec contains two key innovations: first, by using the global receptive field of large-kernel convolution, it models the long-range dependencies of multimodal user interaction sequences, breaking through the limitation that existing CNN-based methods can only capture local short-distance dependencies; second, by taking advantage of the high flexibility of the CNN architecture, it models the relationships among the multimodal features of items through a carefully designed convolutional layer architecture and fusion strategy. Specifically, PCMSRec consists of two blocks: a sequence-feature block and a modal block. The sequence-feature block models long-range dependencies in the user interaction sequence through a large-kernel convolutional layer and extracts item features by incorporating a bottleneck architecture. The modal block models the complex relationships between multimodal features using multiple convolutional layers. Experimental results on five public datasets show that PCMSRec outperforms existing methods.
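To make the "large kernel plus bottleneck" idea concrete, here is a minimal PyTorch sketch of a sequence block that applies a depthwise large-kernel Conv1d over the interaction sequence followed by a pointwise bottleneck; the layer sizes and the depthwise/pointwise split are our assumptions, not the published PCMSRec architecture.

```python
# Minimal sketch of a large-kernel convolutional sequence block with a bottleneck.
# Layer sizes and the depthwise/pointwise split are assumptions for illustration,
# not the PCMSRec architecture itself.
import torch
import torch.nn as nn

class LargeKernelSeqBlock(nn.Module):
    def __init__(self, dim=64, kernel_size=31, bottleneck_ratio=4):
        super().__init__()
        # Depthwise large-kernel conv: each channel sees a wide window of the sequence.
        self.large_kernel = nn.Conv1d(dim, dim, kernel_size,
                                      padding=kernel_size // 2, groups=dim)
        # Pointwise bottleneck: squeeze then expand the channel dimension.
        self.bottleneck = nn.Sequential(
            nn.Conv1d(dim, dim // bottleneck_ratio, kernel_size=1),
            nn.GELU(),
            nn.Conv1d(dim // bottleneck_ratio, dim, kernel_size=1),
        )
        self.norm = nn.LayerNorm(dim)

    def forward(self, x):
        # x: (batch, seq_len, dim) item embeddings for one user's interaction sequence.
        h = x.transpose(1, 2)                     # (batch, dim, seq_len) for Conv1d
        h = self.bottleneck(self.large_kernel(h))
        h = h.transpose(1, 2)
        return self.norm(x + h)                   # residual connection

seq = torch.randn(8, 50, 64)                      # batch of 8 sequences, 50 items each
print(LargeKernelSeqBlock()(seq).shape)           # torch.Size([8, 50, 64])
```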
Submissions of original content, paper promotion, and collaboration inquiries are welcome.

