Organising the Data Lake - The Central Role of an Information Catalog
A data deluge is under way: demand for data is increasing and the number of data sources is exploding. How do we know what is going on, so that we can bring order to the chaos of a rapidly moving digital world?
This paper examines the problem and discusses how information catalogs enable us to organise and rapidly discover new data, track what data and insights are being produced, and publish them as services that are easy for others to find and consume.
In this in-depth report, you will learn best practices for:
- Data profiling at scale
- Discovery of partitioned data sets
- Discovery of data lineage
- Schema generation on Hadoop data
- Advanced, scalable, and fault-tolerant search