Machine Learning Question on missing values in training and test data

By Ritesh Sahu - May 24, 2022

I'm training a text classifier for binary classification. In my training data, some rows of the .csv file have null values in the text field, and the same is true of my test file. I have loaded both files into pandas DataFrames. The affected rows are a small percentage of the overall data (less than 0.01).

Knowing this, is it better to replace the null text fields with an empty string or to leave them empty? And if the answer is to replace them with an empty string, is it "acceptable" to do the same for the test csv file before running it against the model?
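For concreteness, here is a minimal sketch of what the "replace with an empty string" option might look like in pandas, with the same step applied to both the training and the test DataFrame. The column names (`text`, `label`, `id`), the inline CSV contents, and the helper `fill_missing_text` are all hypothetical stand-ins, since the post does not show the real files.

```python
import io

import pandas as pd

# Hypothetical stand-ins for the real training and test .csv files.
train_csv = io.StringIO("text,label\ngood product,1\n,0\nterrible,0\n")
test_csv = io.StringIO("id,text\n1,works well\n2,\n")

train_raw = pd.read_csv(train_csv)
test_raw = pd.read_csv(test_csv)


def fill_missing_text(df, column="text"):
    """Return a copy of df with nulls in `column` replaced by empty strings."""
    out = df.copy()
    out[column] = out[column].fillna("").astype(str)
    return out


# Count the null text fields before changing anything.
missing_train = int(train_raw["text"].isna().sum())
missing_test = int(test_raw["text"].isna().sum())

# Apply the identical preprocessing step to both DataFrames.
train = fill_missing_text(train_raw)
test = fill_missing_text(test_raw)

print(f"filled {missing_train} training rows and {missing_test} test rows")
```

Routing both files through one function keeps the preprocessing identical at training and prediction time. Whether filling is preferable to dropping the affected training rows is still the open question above; the sketch only shows the mechanics.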
