
Data Governance — the foundation for implementing AI and a data-driven approach

Customer
Alatau City Bank
Project manager on the customer side
Yermek Karimov
Head of data governance
Year of project completion
2025
Project timeline
December 2024 - October 2025
Project scope
8,400 man-hours
Goals


1. Establish data as a strategic asset.
2. Centralize metadata across the organization.
3. Standardize and formalize data management processes.
4. Enable data-driven decision-making — data as the fuel for artificial intelligence.
5. Foster a strong data culture within the Bank.
6. Label and classify data across the Bank’s systems.

Project Results

The project stands out because it was implemented at one of the largest Tier 2 banks in Kazakhstan, completed in just 10 calendar months, and executed entirely with in-house resources, without any external vendors or consultants.

1. Systematic Data Management Established

1) A unified data policy and data management standards have been created.

2) Data Owners (19 top managers of the Bank, appointed within 1.5 months) and Data Stewards (97 employees across divisions) have been formally designated.

3) Roles, responsibilities, and accountability have been clearly defined for all participants in the data lifecycle.

2. Data Governance Infrastructure Built

1) Data Catalog — a centralized metadata repository.

2) Business Glossary — standardized business terminology and definitions.

3) Data Lineage — visualization of data flows from sources to reports.

4) Automated scanning configured for 15 data sources on a regular schedule.

3. Unified Understanding of Data Across the Business

1) A corporate data lexicon and consistent business definitions have been established.

2) Duplication and discrepancies in reporting have been eliminated.

3) Data provenance transparency has been ensured.

4) Over 10,000 data attributes from the Bank’s data warehouse have been documented.

5) More than 1,000 business terms have been created.

4. Reduction of Regulatory and Operational Risks

1) Accuracy of regulatory and financial reporting data has been significantly improved.

2) Risk of fines and incidents due to data errors has been minimized.

3) Audits and data source validations have become simpler and faster.

5. Development of a Data Management Culture

1) Employee awareness of data value has increased through targeted training programs.

2) The principle “Data is an asset” has been formally adopted.

3) A data governance framework with defined policies, roles, and responsibilities has been institutionalized.

4) A data-driven culture has been nurtured through education, engagement, and accountability.


6. Data Quality Process Implemented

1) Over 100 data quality rules have been developed and embedded into continuous monitoring.

2) Critical client data corrections have been performed.

3) A sustainable, ongoing data quality process has been established to ensure continuous oversight.
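
As an illustration only, a data quality rule of the kind described above could be expressed as a named check that is run over records, with the share of failing records reported per rule. The sketch below is not the Bank's implementation: the `Rule` class, the `run_rules` function, the field names, and the sample records are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Rule:
    """A hypothetical data quality rule: a name plus a pass/fail check."""
    name: str
    check: Callable[[dict], bool]  # returns True when the record passes


def run_rules(records: Iterable[dict], rules: List[Rule]) -> Dict[str, float]:
    """Return the share of failing records for each rule."""
    rows = list(records)
    report = {}
    for rule in rules:
        failed = sum(1 for row in rows if not rule.check(row))
        report[rule.name] = failed / len(rows) if rows else 0.0
    return report


# Hypothetical client records, invented for this example.
clients = [
    {"client_id": 1, "iin": "900101300123", "birth_date": "1990-01-01"},
    {"client_id": 2, "iin": "", "birth_date": "1985-05-17"},
    {"client_id": 3, "iin": "12345", "birth_date": None},
    {"client_id": 4, "iin": "850517400456", "birth_date": "1985-05-17"},
]

# Hypothetical rules: an identifier format check and a completeness check.
rules = [
    Rule(
        "iin_is_12_digits",
        lambda r: isinstance(r.get("iin"), str)
        and len(r["iin"]) == 12
        and r["iin"].isdigit(),
    ),
    Rule("birth_date_present", lambda r: bool(r.get("birth_date"))),
]

report = run_rules(clients, rules)
for name, share in report.items():
    print(f"{name}: {share:.0%} of records fail")
```

In continuous monitoring, a report like this would typically be produced on a schedule and compared against a threshold per rule; the scheduling and thresholds are left out of the sketch.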

7. Machine Learning Advancement

1) Quality of the Bank’s ML models has significantly improved following data quality enhancements.

2) Current model accuracy exceeds 93%, reflecting the direct impact of cleaner and more reliable data.

8. Business Impact

1) Reporting and analytics preparation time reduced by 30%.

2) Accuracy of management decisions improved, with all key metrics revalidated.

3) Digital product deployment speed doubled, enabled by enhanced data transparency.


The uniqueness of the project

The Data Governance project is unique for the Bank as it establishes a unified data management environment, integrating strategy, technology, and a culture of responsible data use. Specifically:

1. Unified Data Ecosystem

The project builds an end-to-end data management system — from source systems to reporting and analytics. This ensures a Single Source of Truth (SSOT) for all business units within the Bank.

2. Foundation for Data-Driven Management

Data Governance serves as the cornerstone for transforming the Bank into a Data-Driven Organization, where decisions are made based on verified, high-quality data rather than intuition.

3. Transparency and Trust in Data

Through the implementation of a Data Catalog, Business Glossary, and Data Lineage, the Bank gains full transparency of data origins, standardized business definitions, and a clear understanding of data flows across systems.

4. Enhanced Control and Risk Mitigation

The project strengthens control over data and supports compliance with the requirements of the National Bank of the Republic of Kazakhstan, ARFR, and other state authorities, reducing the operational, reputational, and regulatory risks associated with inaccurate reporting data.

5. Cultural Transformation

Data Governance goes beyond technology — it fosters a data-centric culture, where employees view data as a strategic asset and take responsibility for its quality and integrity.

6. Integration with Digital Initiatives

The project provides the foundation for AI, analytics, and reporting initiatives, as well as data warehouse transformation and Data Mart development, enabling scalable and sustainable growth for the Bank.

7. Data Order as the Basis for Quality AI

Within the project, the team works jointly with business units to establish data management processes, improve data quality, label and map data from source systems, and document metadata. These initiatives create a solid foundation for AI: they accelerate data preparation and improve Feature Store quality, which ultimately shortens the development time for ML models.

Software used

To implement and develop the project, the Bank uses the open-source platform OpenMetadata. The Bank’s development team has customized the tool to meet internal requirements and extended it with additional functionality, including:

1) A monitoring report for detecting duplicate tables and data,

2) An intelligent assistant for supporting data management processes.
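
One simple way to think about a duplicate-table report is to group tables whose column sets match. The sketch below assumes that approach purely for illustration; the source does not describe how the Bank's report works, and the function name `find_duplicate_tables`, the table names, and the schemas are hypothetical. It does not use any OpenMetadata API.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def find_duplicate_tables(
    schemas: Dict[str, List[Tuple[str, str]]]
) -> List[List[str]]:
    """Group tables whose (column name, data type) sets are identical.

    Names and types are compared case-insensitively; column order is ignored.
    """
    groups = defaultdict(list)
    for table, columns in schemas.items():
        signature = frozenset((name.lower(), dtype.lower()) for name, dtype in columns)
        groups[signature].append(table)
    # Keep only signatures shared by more than one table.
    return sorted(sorted(tables) for tables in groups.values() if len(tables) > 1)


# Hypothetical schemas, invented for this example.
schemas = {
    "dwh.clients": [("client_id", "INT"), ("full_name", "VARCHAR")],
    "stage.clients_copy": [("CLIENT_ID", "int"), ("FULL_NAME", "varchar")],
    "dwh.accounts": [("account_id", "INT"), ("client_id", "INT")],
}

duplicates = find_duplicate_tables(schemas)
print(duplicates)
```

A schema match only flags candidates. Confirming that two tables hold duplicate data would additionally require comparing their contents, for example through row counts or checksums.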

Additionally, the team has developed a custom parser that builds data lineage across the four ETL tools in use, ensuring comprehensive visibility and traceability of data flows within the Bank’s ecosystem.
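
To make the idea of parsed lineage concrete, the sketch below extracts source-to-target edges from simple `INSERT INTO ... SELECT` statements and then walks the resulting graph upstream from a report table. It is a deliberately simplified illustration, not the Bank's parser: real ETL tools store transformations in many formats, the regular expressions here handle only trivial SQL, and all table names and function names (`extract_edges`, `upstream`) are hypothetical.

```python
import re
from collections import defaultdict
from typing import Iterable, List, Set, Tuple

# Simplified patterns: they ignore subqueries, CTEs, quoting, and comments.
TARGET_RE = re.compile(r"\binsert\s+into\s+([\w.]+)", re.IGNORECASE)
SOURCE_RE = re.compile(r"\b(?:from|join)\s+([\w.]+)", re.IGNORECASE)

Edge = Tuple[str, str]  # (source table, target table)


def extract_edges(sql: str) -> List[Edge]:
    """Return (source, target) pairs for one INSERT INTO ... SELECT statement."""
    target = TARGET_RE.search(sql)
    if target is None:
        return []
    return [(source, target.group(1)) for source in SOURCE_RE.findall(sql)]


def upstream(edges: Iterable[Edge], table: str) -> Set[str]:
    """Return every table that feeds `table`, directly or indirectly."""
    parents = defaultdict(set)
    for source, target in edges:
        parents[target].add(source)
    seen: Set[str] = set()
    stack = [table]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# Hypothetical ETL statements, invented for this example.
statements = [
    "INSERT INTO dwh.clients SELECT * FROM core.customers c "
    "JOIN crm.contacts k ON c.id = k.customer_id",
    "INSERT INTO mart.client_report SELECT client_id FROM dwh.clients",
]

edges = [edge for sql in statements for edge in extract_edges(sql)]
print(sorted(upstream(edges, "mart.client_report")))
```

Representing lineage as plain edges keeps the parsing step separate from the traversal, so edges produced by parsers for different ETL tools can be merged into one graph before it is queried.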

Difficulty of implementation

1. Lack of Unified Understanding of Data Value
Initially, many employees saw data as an IT issue, not a strategic asset, slowing business engagement.

2. Resistance to Change
Process reengineering and new roles met organizational resistance; some Data Owners were reluctant to take responsibility.

3. Weak Data Culture
Documentation and consistent data management practices were missing across departments.

4. Complex Data Inventory
Dozens of systems, duplicate sources, and poor historical data quality complicated cataloging and standardization.

5. System-Level Challenges
Integration of four ETL tools, platform customization, and unification of Data Governance functions required major technical effort.

Despite these challenges, the project team overcame the barriers through a comprehensive engagement strategy:

1. Conducted three rounds of bank-wide demonstrations throughout the year to promote understanding and adoption.

2. Organized extensive training sessions and developed methodological guidelines for Data Stewards.

3. Gradually strengthened the data culture, ensuring that all participants recognized the strategic importance of Data Governance.

Project Description

The Data Governance project aims to establish a unified data management system within the Bank, ensuring accuracy, transparency, and controllability of information across all business processes.

The primary goal of the project is to foster a corporate data management culture, where every participant understands the value of data and their responsibility in maintaining its quality. This approach enables the Bank to make more accurate and data-driven decisions, improve reporting quality, and reduce regulatory risks.

Within the project, the following key components are being implemented:

1) Data Catalog: a centralized metadata repository containing information about all data sources and datasets in the Bank’s systems.

2) Business Glossary: a unified dictionary of business terms and indicators, ensuring consistent understanding of data across all departments.

3) Data Lineage: a data flow tracking tool that visualizes the movement of data from source to consumer, providing transparency and change control.

4) Role-Based Data Management Model: a governance framework defining clear responsibilities among Data Owners, Data Stewards, Data Consumers, and Data Users.

5) Data Management Processes: formalized and standardized processes governing data management across all Bank divisions. These include attribute and report documentation, business term creation by Data Stewards, a methodology for report development and enhancement, and structured data integration workflows among all process participants.

The implementation of Data Governance establishes the foundation for data-driven management, ensures regulatory compliance, enhances operational efficiency, and strengthens trust in data at all organizational levels.

Project geography
Republic of Kazakhstan
Additional presentations:
Data Governance — the foundation for implementing AI and a data-driven approach..pdf