Zhancun Mu

Student





YuanPei College

Peking University



A Contextual Combinatorial Bandits Approach to Negotiation


Conference paper


Yexin Li, Zhancun Mu, Siyuan Qi
The Forty-first International Conference on Machine Learning, 2024

OpenReview
Cite

APA
Li, Y., Mu, Z., & Qi, S. (2024). A Contextual Combinatorial Bandits Approach to Negotiation. The Forty-first International Conference on Machine Learning.


Chicago/Turabian
Li, Yexin, Zhancun Mu, and Siyuan Qi. “A Contextual Combinatorial Bandits Approach to Negotiation.” The Forty-first International Conference on Machine Learning, 2024.


MLA
Li, Yexin, et al. “A Contextual Combinatorial Bandits Approach to Negotiation.” The Forty-first International Conference on Machine Learning, 2024.


BibTeX

@inproceedings{yexin2024a,
  title = {A Contextual Combinatorial Bandits Approach to Negotiation},
  author = {Li, Yexin and Mu, Zhancun and Qi, Siyuan},
  booktitle = {The Forty-first International Conference on Machine Learning},
  year = {2024}
}

Negotiation serves as a cornerstone for fostering cooperation among agents with diverse interests. Learning effective negotiation strategies poses two key challenges: the exploration-exploitation dilemma and large action spaces. However, learning-based approaches that effectively address these challenges in negotiation are lacking. This paper introduces a comprehensive framework to tackle a wide range of negotiation problems. Our approach leverages contextual combinatorial multi-armed bandits: the bandit formulation resolves the exploration-exploitation dilemma, and the combinatorial structure handles large action spaces. Building upon this framework, we introduce NegUCB, a novel method that also handles common issues in negotiation such as partial observations and complex reward functions. Notably, NegUCB is contextual and tailored for full-bandit feedback, without constraints on the reward functions. Under mild assumptions, NegUCB ensures a sub-linear regret upper bound that is independent of the negotiation bid cardinality. Experiments conducted on three representative negotiation tasks also demonstrate the superiority of our approach in learning negotiation strategies.
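To make the terminology in the abstract concrete, here is a minimal sketch of a generic contextual combinatorial bandit with an upper-confidence-bound rule. It is not NegUCB and is not taken from the paper. It assumes a linear reward model, a bid feature equal to the sum of its item features, and brute-force enumeration of bids, which is only feasible for tiny item sets. The names `LinUCBBidSelector`, `select`, `update`, and `run_demo` are hypothetical and chosen for illustration. Only the reward of the whole bid is observed, which is what full-bandit feedback refers to.

```python
import itertools

import numpy as np


class LinUCBBidSelector:
    """Generic LinUCB over combinatorial bids (illustrative only, not NegUCB)."""

    def __init__(self, dim, alpha=1.0, reg=1.0):
        self.alpha = alpha            # exploration weight (assumed hyperparameter)
        self.A = reg * np.eye(dim)    # regularized Gram matrix of played bid features
        self.b = np.zeros(dim)        # reward-weighted sum of played bid features

    def select(self, item_features, bid_size):
        """Return the bid (tuple of item indices) with the highest UCB score."""
        A_inv = np.linalg.inv(self.A)
        theta = A_inv @ self.b        # ridge-regression estimate of the reward weights
        best_bid, best_score = None, -np.inf
        # Assumption: enumerate every bid of the given size (exponential in general).
        for bid in itertools.combinations(range(len(item_features)), bid_size):
            x = item_features[list(bid)].sum(axis=0)
            # Estimated reward plus an exploration bonus for uncertain directions.
            score = theta @ x + self.alpha * np.sqrt(x @ A_inv @ x)
            if score > best_score:
                best_bid, best_score = bid, score
        return best_bid

    def update(self, item_features, bid, reward):
        """Full-bandit feedback: one scalar reward for the whole bid."""
        x = item_features[list(bid)].sum(axis=0)
        self.A += np.outer(x, x)
        self.b += reward * x


def run_demo(rounds=200, seed=0):
    """Simulate a synthetic linear environment (made-up data, not a paper task)."""
    rng = np.random.default_rng(seed)
    n_items, dim, bid_size = 6, 4, 2
    true_theta = rng.normal(size=dim)
    agent = LinUCBBidSelector(dim)
    total = 0.0
    for _ in range(rounds):
        feats = rng.normal(size=(n_items, dim))   # fresh context each round
        bid = agent.select(feats, bid_size)
        reward = feats[list(bid)].sum(axis=0) @ true_theta + 0.1 * rng.normal()
        agent.update(feats, bid, reward)
        total += reward
    return agent, float(total)


if __name__ == "__main__":
    _, total_reward = run_demo()
    print("cumulative reward:", total_reward)
```

The exploration bonus shrinks along feature directions that have been played often, so the selector gradually shifts from exploring to exploiting. The enumeration step is the part that scales poorly with the number of possible bids. The abstract states that NegUCB's regret bound is independent of bid cardinality, a property this sketch does not attempt to reproduce.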

