Submission details

- Archive (zip) of the source code. Please do not submit data.
- A README file stating the programming language you used, the operating system name and version, and the computer architecture.
- A PDF/DOC/DOCX/ODT file with your answers to the following problems.

Note

*Please do not use any library function(s) that can do the logistic regression for you.*

- Please download the dataset from this link: [bank.csv].

1. Load the data into memory. Then convert each categorical variable into a numerical one. For example, the 6th column ("housing") is a categorical variable with two values, {"no","yes"}. We can replace "no" with 0 and "yes" with 1 throughout the 6th column.
2. Implement logistic regression with SSE (sum of squared errors) as the loss function. You need to solve it using the "Gradient Descent" algorithm.
3. Perform 10-fold cross-validation to classify the dataset using the logistic regression you developed in step 2. Please report accuracy, precision, recall, and F1-score for each fold of the cross-validation, and also report the average of each metric across folds. Try 3 different learning rates, α={0.01,0.1,1}.
4. Scale the features of the dataset to the [0,1] range using Min-Max scaling, and repeat step 3. Please do not scale the target y, and do not scale the added bias column of all 1s (i.e., the x0=1 column).
5. Scale the features of the dataset using standardization, and repeat step 3. Please do not scale the target y, and do not scale the added bias column of all 1s (i.e., the x0=1 column).
6. Implement regularized logistic regression with SSE as the loss function. Again, solve it using the gradient descent algorithm.
7. On the standardized dataset, repeat step 3 using the **regularized logistic regression** you developed in step 6, varying the parameter λ={0,1,10,100,1000}.
8. Summarize (using a plot or a table) the classification performance metrics (i.e., accuracy, recall, precision, F1-score) obtained in each of the experiments above.
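
The encoding and scaling steps described above could be sketched as follows. This is only an illustration using NumPy: the helper names (`encode_categorical`, `min_max_scale`, `standardize`, `add_bias`) are hypothetical, and mapping categories to integers in sorted order is an assumption that happens to give "no" → 0 and "yes" → 1 for the "housing" example.

```python
import numpy as np

def encode_categorical(column):
    """Map each distinct string to an integer code (assumed: sorted order)."""
    categories = sorted(set(column))
    mapping = {cat: i for i, cat in enumerate(categories)}
    codes = np.array([mapping[v] for v in column], dtype=float)
    return codes, mapping

def min_max_scale(X):
    """Scale each feature column to [0, 1]; constant columns become 0."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    return (X - lo) / span

def standardize(X):
    """Scale each feature column to zero mean and unit variance."""
    mu, sigma = X.mean(axis=0), X.std(axis=0)
    sigma = np.where(sigma > 0, sigma, 1.0)  # avoid division by zero
    return (X - mu) / sigma

def add_bias(X):
    """Prepend the x0 = 1 column AFTER scaling, so it is never scaled."""
    return np.hstack([np.ones((X.shape[0], 1)), X])

# Example with the "housing" values mentioned in the assignment.
codes, mapping = encode_categorical(["no", "yes", "yes", "no"])
```

Because the bias column is added only after scaling, and y is never passed to the scaling helpers, neither of them is scaled.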
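
For the gradient descent steps, a minimal sketch is given below. It assumes the loss J(θ) = (1/2m) Σ (h − y)² + (λ/2m) Σ_{j≥1} θ_j², with h = sigmoid(Xθ); the 1/(2m) normalization, the choice not to regularize the bias weight, the zero initialization, and the fixed iteration count are all assumptions, since the assignment does not specify them. The function names are hypothetical. Setting `lam=0` gives the unregularized model.

```python
import numpy as np

def sigmoid(z):
    # Clip the argument to keep np.exp from overflowing.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -500, 500)))

def fit_logistic_sse(X, y, alpha=0.1, lam=0.0, n_iters=1000):
    """Batch gradient descent on the (optionally L2-regularized) SSE loss.

    X is expected to already contain the bias column x0 = 1 as column 0.
    """
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(n_iters):
        h = sigmoid(X @ theta)
        # Chain rule: d/dtheta of (h - y)^2 / 2 is (h - y) * h * (1 - h) * x
        grad = X.T @ ((h - y) * h * (1.0 - h)) / m
        reg = (lam / m) * theta
        reg[0] = 0.0  # assumed: the bias weight is not penalized
        theta -= alpha * (grad + reg)
    return theta

def predict(X, theta, threshold=0.5):
    """Return 0/1 labels by thresholding the predicted probability."""
    return (sigmoid(X @ theta) >= threshold).astype(int)
```

Note that the extra factor h(1 − h) is what distinguishes the SSE gradient from the more common cross-entropy gradient. When α·λ/m is large, the regularization term can make the update overshoot, so it may be worth checking that the loss decreases over the iterations.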
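
The 10-fold cross-validation and the four requested metrics might be organized along these lines. This is a sketch under assumptions: the rows are shuffled with a fixed seed before splitting, class 1 is treated as the positive class, and a metric whose denominator is zero is reported as 0. The `fit` and `predict` arguments are hypothetical callables standing in for whatever training and prediction functions you wrote.

```python
import numpy as np

def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 with class 1 as positive."""
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": (tp + tn) / len(y_true), "precision": precision,
            "recall": recall, "f1": f1}

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k shuffled, disjoint folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, folds[i]

def cross_validate(X, y, fit, predict, k=10, seed=0):
    """Return the per-fold metrics and their average across folds."""
    per_fold = []
    for train_idx, test_idx in k_fold_indices(len(y), k, seed):
        model = fit(X[train_idx], y[train_idx])
        y_pred = predict(X[test_idx], model)
        per_fold.append(classification_metrics(y[test_idx], y_pred))
    average = {name: float(np.mean([m[name] for m in per_fold]))
               for name in per_fold[0]}
    return per_fold, average
```

To compare learning rates or λ values, one option is to call `cross_validate` once per setting with the same `seed`, so that every setting is evaluated on identical folds, and collect the `average` dictionaries into the summary table.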