Project report work

Today I started working on a predictive model to find out whether an employee's income influences where the employee lives. To prepare the data for logistic regression, I first removed rows with NaN values in the ‘REGULAR’, ‘TOTAL GROSS’, and ‘POSTAL’ columns. I then trained a logistic regression model to predict postal codes from regular earnings, which reached an accuracy of approximately 8%. Next, I trained a second logistic regression model to predict postal codes from total gross earnings, which reached an accuracy of approximately 7%. The confusion matrix is extensive and consists mostly of zeros, which suggests that the model is not performing well. This could be due to several factors, including a possible class imbalance in the dataset or the difficulty of predicting postal codes, a target with many distinct values, from a single feature.
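The steps above can be sketched roughly as follows. This is a minimal illustration and not the exact code behind the numbers reported here: it assumes pandas and scikit-learn, it uses a small synthetic table because the real payroll data is not shown in this post, and the helper names `make_synthetic_payroll` and `evaluate_single_feature`, the placeholder postal codes, the 75/25 split, and the feature scaling step are all assumptions. It also compares the model against a majority-class baseline, which is one way to check whether class imbalance explains a low accuracy.

```python
import numpy as np
import pandas as pd
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def make_synthetic_payroll(n_rows=600, seed=0):
    """Build a stand-in table; the values are made up for illustration."""
    rng = np.random.default_rng(seed)
    regular = rng.normal(60_000, 15_000, n_rows)
    df = pd.DataFrame({
        "REGULAR": regular,
        "TOTAL GROSS": regular + rng.normal(5_000, 2_000, n_rows),
        # Placeholder postal codes, drawn independently of earnings,
        # with one code more common than the others (imbalance).
        "POSTAL": rng.choice(["P1", "P2", "P3", "P4"], size=n_rows,
                             p=[0.4, 0.2, 0.2, 0.2]),
    })
    # Blank out 5% of REGULAR so the NaN-removal step has something to do.
    missing = df.sample(frac=0.05, random_state=seed).index
    df.loc[missing, "REGULAR"] = np.nan
    return df


def evaluate_single_feature(df, feature, target="POSTAL", seed=0):
    """Fit logistic regression on one feature and compare to a baseline."""
    # Drop rows with NaN in any of the three columns used in the analysis.
    clean = df.dropna(subset=["REGULAR", "TOTAL GROSS", target])
    X = clean[[feature]]
    y = clean[target]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed, stratify=y
    )
    # Scaling helps the solver converge on raw dollar amounts.
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    model.fit(X_train, y_train)
    # Baseline that always predicts the most frequent postal code.
    baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
    return {
        "rows_used": len(clean),
        "model_accuracy": model.score(X_test, y_test),
        "baseline_accuracy": baseline.score(X_test, y_test),
    }


if __name__ == "__main__":
    payroll = make_synthetic_payroll()
    for column in ["REGULAR", "TOTAL GROSS"]:
        print(column, evaluate_single_feature(payroll, column))
```

If the model's accuracy is no better than the baseline's, the single earnings feature is adding little information about the postal code, which would be consistent with the low accuracies reported above.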

 

 
