Principal Component Analysis: Mathematical derivation of PCA and application on the breast cancer dataset

9 min read · Oct 6, 2021

Hello ladies and gentlemen!! I have the honor and pleasure of presenting to you the beautiful, the useful, the undisputed champion of linear dimensionality reduction: welcome, PCA. In this article we will explain some major concepts related to PCA. We will then describe the mathematics behind the PCA algorithm, i.e. covariance, eigendecomposition, and singular value decomposition, and look at some limitations of PCA as a dimensionality reduction technique. We will conclude by applying PCA to reduce the dimensionality of the breast cancer dataset.

Prerequisites:

To follow this article, we hope you have a basic understanding of the following:

Covariance and correlation
Eigenvalues and eigenvectors
Matrix multiplication
Python

Assumptions:

For this article to be clear and useful, we will make some basic assumptions:

Our dataset has been explored and cleaned, and is ready to be served.
Our prerequisites have been met.

Lastly, we assume all our assumptions have been met. So have a nice read.

“What is PCA?” you may ask. Long story short: “it is an orthogonal linear transformation that maps data from a higher-dimensional space to a lower-dimensional space while maximizing the retained variance and maintaining the global structure of the dataset/matrix.”

From our definition, PCA rests on three major concepts: orthogonal linear transformation, transformation from a higher-dimensional space to a lower-dimensional space, and maximizing the variance. Let’s look at the meaning of each term in turn.

Orthogonal linear transformation:

This is a linear transformation that moves linearly related data from a space with a known basis to a space with a new basis whose vectors are orthogonal, i.e. at 90° (π/2) to each other.
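To make the idea of a new orthogonal basis concrete, here is a minimal sketch, assuming NumPy and a small made-up 2-D dataset (the numbers are for illustration only, not taken from the breast cancer dataset). It builds the new basis from the eigenvectors of the covariance matrix, checks that the basis vectors are at 90° to each other, and re-expresses the data in that basis.

```python
import numpy as np

# Toy 2-D data with correlated columns (made-up numbers for illustration only).
X = np.array([
    [2.5, 2.4],
    [0.5, 0.7],
    [2.2, 2.9],
    [1.9, 2.2],
    [3.1, 3.0],
    [2.3, 2.7],
])

# Center the data so each column has zero mean.
X_centered = X - X.mean(axis=0)

# Covariance matrix of the columns (rowvar=False: rows are observations).
cov = np.cov(X_centered, rowvar=False)

# eigh is meant for symmetric matrices; the eigenvectors are the columns of `basis`.
eigenvalues, basis = np.linalg.eigh(cov)

# The new basis vectors are orthonormal, so basis.T @ basis is the identity matrix.
gram = basis.T @ basis

# Express the centered data in the new orthogonal basis.
X_new = X_centered @ basis

# In the new basis the coordinates are uncorrelated (off-diagonal entries are ~0).
cov_new = np.cov(X_new, rowvar=False)

print(np.round(gram, 6))
print(np.round(cov_new, 6))
```

Printing `gram` should show the identity matrix, and the off-diagonal entries of `cov_new` should be numerically zero, which is what "orthogonal new basis" buys us: the transformed coordinates no longer vary together.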

From linear algebra, transformations are basically described by vectors; hence “PCA is an algorithm whereby linear vectors in a known space are moved to a…

Written by Landry Placid
