Dynamic Pricing and Placement for Distributed Machine Learning Jobs

EasyChair Preprint no. 3232

9 pages. Date: April 22, 2020


Nowadays, distributed machine learning (ML) jobs usually adopt a parameter server (PS) framework to train models over large-scale datasets. Such an ML job deploys hundreds of concurrent workers, and model parameter updates are exchanged frequently between workers and PSs. In current practice, workers and PSs may be placed on different physical servers, which introduces uncertainty in a job’s runtime. Also, existing cloud pricing policies often charge a fixed price according to the job’s runtime. Although this pricing strategy is simple to implement, it is not suitable for distributed ML jobs, whose runtime is stochastic and can only be estimated from the job’s placement after admission. To supplement existing cloud pricing schemes, we design a dynamic pricing and placement algorithm, DPS, for distributed ML jobs. DPS aims to maximize the cloud provider’s profit: it dynamically calculates the unit resource price upon a job’s arrival and, if the user accepts the offered price, determines the job’s placement to minimize its runtime. Our design exploits the multi-armed bandit (MAB) technique to learn unknown information from past sales. DPS balances exploration and exploitation, and selects the best price based on a reward that is related to job runtime. Our learning-based algorithm increases the provider’s profit compared to benchmark pricing schemes, and achieves a regret that is sub-linear in both the time horizon and the total number of jobs. Extensive evaluations using real-world data also validate the efficacy of DPS.

Keyphrases: dynamic pricing, MAB, machine learning
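
The MAB-based price selection described in the abstract can be illustrated with a minimal sketch. The abstract does not state which bandit algorithm DPS uses, so the code below substitutes the standard UCB1 rule over a discrete price grid as an assumption. The function name `ucb1_pricing`, the price grid, the acceptance probabilities, and the fixed resource demand are all hypothetical. The reward here is simply the revenue from an accepted offer; the runtime-dependent reward and the placement step of DPS are omitted.

```python
import math
import random


def ucb1_pricing(prices, accept_prob, demand, rounds, seed=0):
    """Pick a unit price per arriving job with UCB1 (illustrative only).

    prices      -- hypothetical candidate unit prices (the bandit arms)
    accept_prob -- hypothetical probability that a user accepts each price
    demand      -- resource units requested by every job (assumed constant)
    rounds      -- number of arriving jobs
    """
    rng = random.Random(seed)
    max_reward = max(prices) * demand      # normalizes rewards into [0, 1]
    counts = [0] * len(prices)             # times each price was offered
    totals = [0.0] * len(prices)           # sum of normalized rewards per price
    revenue = 0.0

    for t in range(1, rounds + 1):
        if t <= len(prices):
            arm = t - 1                    # offer every price once first
        else:
            # Exploitation term (empirical mean) plus exploration bonus.
            arm = max(
                range(len(prices)),
                key=lambda i: totals[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )

        # Simulated user decision: accept or reject the offered price.
        accepted = rng.random() < accept_prob[arm]
        reward = prices[arm] * demand if accepted else 0.0

        counts[arm] += 1
        totals[arm] += reward / max_reward
        revenue += reward

    return counts, revenue


if __name__ == "__main__":
    counts, revenue = ucb1_pricing(
        prices=[1.0, 2.0, 3.0, 4.0],
        accept_prob=[0.9, 0.8, 0.3, 0.1],
        demand=1.0,
        rounds=20000,
    )
    print("offers per price:", counts)
    print("total revenue:", revenue)
```

In this toy setting the price 2.0 has the highest expected revenue per job (2.0 × 0.8), so the learner should offer it far more often than the others as the number of jobs grows. That is the exploration-exploitation balance the abstract refers to, shown here under simplified assumptions.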

