Open source tools for Data Profiling

Data profiling is the analysis of the existing data in a data source in order to identify metadata about it. This post is a high-level introduction to data profiling and provides pointers to tools and further reading.

What is the use of data profiling?

  1. To understand the metadata characteristics of the data under review.
  2. To build an enterprise-wide view of the data for Master Data Management and Data Governance.
  3. To identify the right candidates for source-to-target mapping.
  4. To ensure the data is fit for its intended purpose.
  5. To identify data issues and quantify them.
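As a rough illustration of points 1 and 5 above, here is a minimal sketch in plain Python that collects basic metadata per column (row count, missing values, distinct values, coarse inferred types, value lengths). The function names `profile_column` and `profile_rows` and the sample rows are made up for this example; they are not taken from any of the tools discussed in this post, and a real profiling tool does far more.

```python
from collections import Counter


def infer_type(value):
    """Guess a coarse type ("int", "float" or "str") for a raw string value."""
    try:
        int(value)
        return "int"
    except ValueError:
        pass
    try:
        float(value)
        return "float"
    except ValueError:
        return "str"


def profile_column(values):
    """Return basic metadata for one column given as a list of raw strings."""
    # Treat None and blank strings as missing values.
    present = [v for v in values if v is not None and v.strip() != ""]
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
        "types": dict(Counter(infer_type(v) for v in present)),
        "min_length": min((len(v) for v in present), default=0),
        "max_length": max((len(v) for v in present), default=0),
    }


def profile_rows(rows):
    """Profile every column of a table given as a list of dicts."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return {name: profile_column(vals) for name, vals in columns.items()}


# Hypothetical sample data with a few deliberate issues.
rows = [
    {"customer_id": "1", "country": "IN", "age": "34"},
    {"customer_id": "2", "country": "", "age": "forty"},
    {"customer_id": "2", "country": "US", "age": "29"},
]
report = profile_rows(rows)
for column, stats in report.items():
    print(column, stats)
```

Even on this tiny sample the report puts numbers on the issues: one missing `country`, a repeated `customer_id` (two distinct values in three rows), and one `age` value that is not numeric.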

Typical project types where it is put to use:

  • Data Warehousing/Business Intelligence Projects
  • Research Engagements
  • Data Research Projects
  • Data Conversion/Migration Projects
  • Source System Quality Initiatives

Some of the open source tools that can be used for data profiling:

Some links that cover the various commercial players, along with their comparison and evaluation:

In the next post we will evaluate certain aspects of data profiling using one of the tools mentioned in this post.
