Data hygiene is the process of cleaning up a data set. The process removes duplicates, corrects misspellings and punctuation errors, makes sure the data follows standard formatting rules, and fills in missing data, among other things. Good data hygiene is important for downstream analytics, such as predictive modeling, because it helps ensure the model is built on the best-quality data available.
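
The cleaning steps named above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the record layout (dicts with "name" and "state" fields), the hypothetical SPELLING_FIXES lookup table, the "Unknown" placeholder for missing values, and the rule that two records are duplicates when their cleaned name and state match. Real data sets usually need domain-specific rules for each step.

```python
# Hypothetical lookup of known misspellings; a real project would maintain its own.
SPELLING_FIXES = {"calfornia": "California", "new yrok": "New York"}


def clean_records(records, default_state="Unknown"):
    """Return a cleaned copy of `records`, a list of dicts with 'name' and 'state'."""
    cleaned = []
    seen = set()
    for rec in records:
        # Standard formatting: trim and collapse whitespace, title-case the name.
        name = " ".join((rec.get("name") or "").split()).title()
        state = " ".join((rec.get("state") or "").split())

        # Punctuation errors: drop stray trailing punctuation.
        state = state.rstrip(".,;")

        # Misspellings: case-insensitive lookup in the assumed correction table.
        state = SPELLING_FIXES.get(state.lower(), state)

        # Missing data: fill with an explicit placeholder rather than leaving it blank.
        if not state:
            state = default_state

        # Duplicates: keep only the first record for each (name, state) pair.
        key = (name.lower(), state.lower())
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"name": name, "state": state})
    return cleaned
```

Note the ordering: duplicates are checked only after formatting and spelling have been normalized, so records that differ only by whitespace, case, or a known typo collapse into one. Title-casing names is a simplification that will not suit every naming convention.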