
The Problem of Behaviour and Preference Manipulation in AI Systems

EasyChair Preprint no. 7281

7 pages. Date: January 2, 2022


Statistical AI, or machine learning (ML), can be applied to user data to understand user preferences and thereby improve various services. This involves making assumptions about either stated or revealed preferences. Human preferences, however, are susceptible to manipulation and change over time. When AI/ML is applied iteratively, it becomes difficult to ascertain whether the system has learned something about its users, whether its users have changed or learned something, or whether the system has taught its users to behave in a certain way in order to maximise its objective function. This article discusses the relationship between behaviour and preferences in AI/ML and existing mechanisms that manipulate human preferences and behaviour, and relates them to the topic of value alignment.

Keyphrases: artificial intelligence, auto-induced distributional shift, behaviour change, choice architecture, cooperative inverse reinforcement learning, human preference, inverse reinforcement learning, libertarian paternalism, manipulation, preferences, value alignment

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@Booklet{EasyChair:7281,
  author = {Hal Ashton and Matija Franklin},
  title = {The Problem of Behaviour and Preference Manipulation in AI Systems},
  howpublished = {EasyChair Preprint no. 7281},
  year = {EasyChair, 2022}}