Essential data manipulation approaches

Data manipulation involves several activities, including structuring, validating, enriching, discovering, and cleaning data. Here are some of the steps involved in getting to the right data before exploring it and training a model on it.

  • Replacing blank (space-only) values appropriately
  • Dropping the null values based on the volume of missing data
  • Data type conversion (e.g., integer to float)
  • Replacing category values to bring consistency, based on business needs
  • Binning – Converting numeric data to category buckets (e.g., Age into Age group Buckets)
  • Boolean Indexing – Filtering the values of a column based on conditions on another set of columns. Boolean indexing suits this kind of scenario (e.g., filtering out female bank accounts for special schemes based on Age, Gender and Educational Qualification)
  • Filling the not-available (NA) values using the mode, median or mean
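
The steps above can be sketched with pandas as follows. This is only a minimal illustration, not a prescribed pipeline: the sample data, the column names (age, gender, education, balance, referral_code), the 50% missing-data threshold, the age buckets and the eligibility rule are all assumptions made for the example, and the order of the steps is one reasonable choice among several.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; every column name and value is illustrative only.
df = pd.DataFrame({
    "age": [23, 35, np.nan, 52, 41, 67],
    "gender": ["F", "female", "M", " ", "Female", "male"],
    "education": ["Graduate", "Graduate", "School", "Graduate", "School", "School"],
    "balance": [1200, 560, 80, 4300, 990, 15],
    "referral_code": [None, None, None, "X1", None, None],
})

# Replace blank (space-only) values with a proper missing value.
df["gender"] = df["gender"].str.strip().replace("", np.nan)

# Bring category values to a consistent spelling (mapping is an assumption).
gender_map = {"f": "Female", "female": "Female", "m": "Male", "male": "Male"}
df["gender"] = df["gender"].str.lower().map(gender_map)

# Drop columns whose share of missing values is too high
# (the 0.5 threshold is an arbitrary choice for this sketch).
missing_share = df.isna().mean()
df = df.drop(columns=missing_share[missing_share > 0.5].index)

# Fill the remaining missing values: median for numeric, mode for categorical.
df["age"] = df["age"].fillna(df["age"].median())
df["gender"] = df["gender"].fillna(df["gender"].mode()[0])

# Data type conversion: integer balance to float, filled age back to integer.
df["balance"] = df["balance"].astype(float)
df["age"] = df["age"].astype(int)

# Binning: convert numeric age into age-group buckets (bucket edges assumed).
df["age_group"] = pd.cut(
    df["age"],
    bins=[0, 30, 50, 120],
    labels=["Young", "Middle-aged", "Senior"],
)

# Boolean indexing: combine conditions on several columns to filter rows
# (the eligibility rule itself is hypothetical).
eligible = df[
    (df["gender"] == "Female")
    & (df["age"] < 50)
    & (df["education"] == "Graduate")
]

print(df)
print(eligible)
```

Whether to drop or fill missing values depends on how much data is missing and on what the column means for the business, so the threshold and the fill statistic above should be chosen per dataset rather than copied as-is.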
