AI Glossary: What Is Gradient Descent (GD)? Definition & Meaning

勾配降下法

勾配降下法は広く使われている最適化アルゴリズム in 機械学習 and statistics, particularly for training models. The core idea behind gradient descent is to minimize a function by iteratively adjusting parameters in the direction of the steepest descent, which is identified by the gradient of the function.

Specifically, gradient descent starts with an initial set of parameters (or weights) and calculates the gradient, which is a vector that points in the direction of the steepest increase of the function. To minimize the function, parameters are updated by moving a small step in the opposite direction of the gradient. This step size is determined by a value known as the 学習率.

このプロセスは次のステップで要約できます：

初期のパラメータセットを選択します。
パラメータに関しての勾配を計算する損失関数パラメータに関しての勾配を計算する
勾配の逆方向に移動させてパラメータを更新します。
収束するまでこのプロセスを繰り返します。収束は、パラメータの変化が事前に定められた閾値より小さくなるときに起こります。

勾配降下法にはいくつかのバリエーションがあります：

バッチ勾配降下法: Uses the entire dataset to compute the gradient, which can be slow for large datasets.
確率的勾配降下法（SGD）： Uses one random sample to update parameters, which introduces variability but can be faster and help escape local minima.
ミニバッチ勾配降下法: 少量のサンプルを用いることで、両者の利点を組み合わせます。

Gradient descent is essential for training various models, including linear regression, neural networks, and support vector machines, making it a fundamental concept in the 人工知能の分野.