クッキーの日記 (Cookie's Diary)

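Paper Reading Notes

Last updated: 2017-02-19
A page for jotting down papers that look useful as references.

Stochastic Gradient Descent
Deep Learning (Fundamentals)
Reinforcement Learning (Fundamentals)
Reinforcement Learning (Applications)
Topological Data Analysis (Fundamentals)

Stochastic Gradient Descent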

Title	Shun-ichi Amari. Natural Gradient Works Efficiently in Learning, Neural Computation, Vol. 10, No. 2, pp. 251-276 (1998).
Link	http://www.maths.tcd.ie/~mnl/store/Amari1998a.pdf
Notes	The original paper on the natural gradient.

Title	Diederik Kingma and Jimmy Ba. Adam: A Method for Stochastic Optimization, arXiv:1412.6980 (2014).
Link	https://arxiv.org/pdf/1412.6980v8.pdf
Notes	The original paper on Adam.
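
As a reminder of what the Adam update looks like, here is a minimal pure-Python sketch. The default hyperparameters are the ones suggested in the paper; the function name `adam_minimize`, the list-based parameter representation, and the toy objective in the usage note are assumptions made only for illustration.

```python
import math


def adam_minimize(grad, theta, steps, alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Minimize a function given its gradient `grad(theta) -> list of floats`.

    `adam_minimize` is a hypothetical helper name, not part of any library.
    """
    theta = list(theta)
    m = [0.0] * len(theta)  # first-moment (mean) estimate of the gradient
    v = [0.0] * len(theta)  # second-moment (uncentered variance) estimate
    for t in range(1, steps + 1):
        g = grad(theta)
        for i in range(len(theta)):
            m[i] = beta1 * m[i] + (1.0 - beta1) * g[i]
            v[i] = beta2 * v[i] + (1.0 - beta2) * g[i] ** 2
            # bias correction compensates for initializing m and v at zero
            m_hat = m[i] / (1.0 - beta1 ** t)
            v_hat = v[i] / (1.0 - beta2 ** t)
            theta[i] -= alpha * m_hat / (math.sqrt(v_hat) + eps)
    return theta
```

For example, `adam_minimize(lambda th: [2.0 * (th[0] - 3.0)], [0.0], steps=2000, alpha=0.1)` minimizes the toy objective $(x - 3)^2$ and should return a value close to 3.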

Deep Learning (Fundamentals)

Title	George Cybenko. Approximation by Superpositions of a Sigmoidal Function (1989).
Link	http://www.dartmouth.edu/~gvc/Cybenko_MCSS.pdf
Notes	The original paper on the universal approximation theorem for neural networks.

Reinforcement Learning (Fundamentals)

Title	R. J. Williams. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, Machine Learning, Vol. 8, Issue 3, pp. 229-256 (1992).
Link	http://www-anw.cs.umass.edu/~barto/courses/cs687/williams92simple.pdf
Notes	The REINFORCE algorithm for policy gradients.

Title	R. S. Sutton, D. A. McAllester, S. P. Singh, and Y. Mansour. Policy Gradient Methods for Reinforcement Learning with Function Approximation, Advances in Neural Information Processing Systems 12, pp. 1057-1063 (2000).
Link	https://webdocs.cs.ualberta.ca/~sutton/papers/SMSM-NIPS99.pdf
Notes	The expression for the gradient with respect to the policy parameters, the actor-critic parameter update rules, and so on.

Title	Peter Auer, Nicolo Cesa-Bianchi, and Paul Fischer. Finite-time Analysis of the Multiarmed Bandit Problem. Machine Learning, 47(2/3):235-256 (2002).
Link	https://homes.di.unimi.it/~cesabian/Pubblicazioni/ml-02.pdf
Notes	The original paper on the UCB algorithm. Also covers the regret of the $\varepsilon$-greedy policy when $\varepsilon$ is decayed.
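
For quick reference, the UCB1 rule from this paper plays each arm once and then always picks the arm maximizing $\bar{x}_j + \sqrt{2 \ln n / n_j}$, where $\bar{x}_j$ is the empirical mean reward of arm $j$, $n_j$ is how often it has been played, and $n$ is the total number of plays so far. Below is a minimal sketch of that rule; the function names `ucb1` and `make_bernoulli_bandit` and the Bernoulli test bandit are assumptions made only for illustration.

```python
import math
import random


def ucb1(pull, n_arms, horizon):
    """Run UCB1 for `horizon` rounds; `pull(arm)` must return a reward in [0, 1]."""
    counts = [0] * n_arms   # n_j: number of times each arm was played
    means = [0.0] * n_arms  # empirical mean reward of each arm
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1     # initialization: play every arm once
        else:
            total = t - 1   # n: total number of plays so far
            arm = max(
                range(n_arms),
                key=lambda j: means[j] + math.sqrt(2.0 * math.log(total) / counts[j]),
            )
        reward = pull(arm)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental mean
    return counts, means


def make_bernoulli_bandit(probs, seed=0):
    """Hypothetical test bandit: arm i pays 1 with probability probs[i], else 0."""
    rng = random.Random(seed)
    return lambda arm: 1.0 if rng.random() < probs[arm] else 0.0
```

Calling `ucb1(make_bernoulli_bandit([0.2, 0.8]), n_arms=2, horizon=2000)` returns the per-arm play counts and empirical means; the better arm should end up with the large majority of the plays.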

Title	Sebastien Bubeck and Nicolo Cesa-Bianchi. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems (2012).
Link	https://arxiv.org/pdf/1204.5721.pdf
Notes	A survey analyzing regret for multi-armed bandit tasks under a variety of problem settings.

Reinforcement Learning (Applications)

Title	David Silver et al. Mastering the Game of Go with Deep Neural Networks and Tree Search (2016).
Link	http://airesearch.com/wp-content/uploads/2016/01/deepmind-mastering-go.pdf
Notes	AlphaGo.
Memo	Misc. notes: What is AlphaGo? - クッキーの日記

Title	Barret Zoph and Quoc Le. Neural Architecture Search with Reinforcement Learning (2016).
Link	https://openreview.net/forum?id=r1Ue8Hcxg
Notes	As far as I can tell, the question of how to design the RNN is itself handed over to reinforcement learning.

Topological Data Analysis (Fundamentals)

Title	Herbert Edelsbrunner, David Letscher, and Afra Zomorodian. Topological persistence and simplification (2002).
Link	https://www.cs.duke.edu/~edels/Papers/2002-J-04-TopologicalPersistence.pdf
Notes	The original paper on persistent homology.

Title	Robert Ghrist. Barcodes: The Persistent Topology of Data (2008).
Link	https://www.math.upenn.edu/~ghrist/preprints/barcodes.pdf
Notes	Barcodes (one format for representing the topological information extracted from a point cloud).