Introduction to online optimization: introduction

online learning protocol:

图片说明
characteristic:
limited feedback

Exponentially weighed average forecaster

Bounded convex loss and expert regret

Hoeffding’s's inequality
lemma 2.1:


Therorem: For any convex loss taking values in [0,1], the Exp strategy satisfies:

Exp-concave loss and expert regret

Lower bound

General convex and bounded loss is unimprovable.


Anytime strategy

For any convex loss with values in [0,1], the exp strategy with time-varying parameter satisfies for all :

Subdifferentiable loss with bounded subgradient

online finite optimization

For any loss with values in [0,1], the finite Exp strategy with parameter satisfies with probability at least :

全部评论

相关推荐

点赞 收藏 评论
分享
牛客网
牛客企业服务