Automating Data Governance with Apache Ranger and REST API

Blog | June 14, 2024 | By Kamlesh Kumar Pant

Streamlining Data Governance Using REST API

Enhance Data Security with Automated Governance

Safeguarding sensitive data is a key focus for businesses, and Apache Ranger provides a robust system for enforcing data security and controlling access on platforms like Hadoop, Hive, and HBase. This blog explores how to use the Ranger REST API to automate policy management, including creating, updating, and enforcing policies, to improve efficiency and security compliance while saving resources.

For the uninitiated, Apache Ranger is a flexible and scalable platform for implementing data security and governance policies on a variety of data platforms. It gives organizations the ability to control access to important resources such as databases, tables, and files in their data environment.

Governance Process in Ranger

Ranger’s governance process centres around creating, enforcing, and managing policies that dictate user access to data and resources. These policies come in various forms, each catering to specific access control needs. 

  • Resource-based Policies
    Resource-based policies are used to control access to specific resources within a data platform. These resources can include databases, tables, columns, files, directories, and more. 
  • Service-based Policies
    These policies control access to the service itself and include permissions such as start, stop, configure, and administer.
  • Tag-based Policies
    Tag-based policies grant access based on attributes or characteristics of resources rather than by explicitly naming each resource. For example, a policy might grant access to all files tagged as “confidential” or “sensitive.”
  • Row-level Policies
    Row-level policies are used to restrict access to specific rows or records within a dataset.
  • Column-level Policies
    Column-level policies extend the granularity of access control to individual columns within a dataset. 
  • Masking Policies
    Instead of outright access denial, masking policies dynamically alter the data presented to users based on their permissions. 

Power of Policy Automation 

In this blog, we will learn how to create resource-based policies, row filter policies, and masking policies.

Automating Ranger policies using its REST API can greatly enhance efficiency and ensure consistency in access control management across the data ecosystem. Here are the steps to automate Ranger policies using its REST API:

Step 1:

Explore API Documentation

Familiarize yourself with the Ranger REST API documentation to understand available endpoints, request parameters, and response formats. 

Step 2:

Authentication and Authorization

Before interacting with the Ranger REST API, ensure that proper authentication and authorization mechanisms are in place and that the Ranger user has READ/WRITE privileges to perform POST/PUT/DELETE operations.

For the purpose of this blog, we use user/password-based authentication, passing the user name and password in the “auth” parameter of each GET/PUT/POST/DELETE call.

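import requests
# UrlEndPoint, configdata, GetAuthAccessPassword() and LogMessageInProcessLog() are assumed to be defined elsewhere in the script.
responseRequest = requests.get(
    UrlEndPoint,
    auth=(configdata['RANGER']['AUTHENTICATION']['UserName'], GetAuthAccessPassword()),
    headers={"Content-Type": "application/json"},
)
LogMessageInProcessLog(UrlEndPoint)  # record which endpoint was called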
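
The call above passes the credentials on a single request. One way to avoid repeating them for every GET/PUT/POST/DELETE is a shared session, sketched below. The helper name `make_ranger_session` and its `session_factory` parameter are illustrative and not part of the original script; by default the helper uses `requests.Session` from the `requests` library.

```python
def make_ranger_session(user_name, password, session_factory=None):
    """Return a session that sends basic-auth credentials and JSON headers.

    `session_factory` is a hypothetical hook for substituting a stub in tests;
    when omitted, a regular requests.Session is created.
    """
    if session_factory is None:
        import requests  # third-party library, imported only when needed
        session_factory = requests.Session
    session = session_factory()
    # Same (user, password) tuple that the `auth` parameter takes per request.
    session.auth = (user_name, password)
    session.headers.update({
        "Content-Type": "application/json",
        "Accept": "application/json",
    })
    return session
```

Every later call made through the returned session then carries the same credentials and headers.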

Step 3:

Policy Creation

Identify the appropriate Ranger endpoint and create policies programmatically by sending a POST request to it.
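
As a minimal sketch of this step, the function below posts a policy definition and returns what the server sends back. The path `/service/public/v2/api/policy` is our assumption for Ranger's public v2 policy endpoint, so confirm it against the API documentation for your Ranger version. The function name `create_policy` and the `session` argument (any `requests.Session`-like object with credentials already set) are illustrative.

```python
# Assumed path of the public v2 policy endpoint; verify for your Ranger version.
POLICY_ENDPOINT = "/service/public/v2/api/policy"


def create_policy(session, base_url, policy):
    """POST a policy definition (a dict) and return the JSON response body."""
    url = base_url.rstrip("/") + POLICY_ENDPOINT
    # `json=` lets the HTTP library serialize the dict as the request body.
    response = session.post(url, json=policy)
    # Raise on 4xx/5xx so a rejected policy does not pass unnoticed.
    response.raise_for_status()
    return response.json()
```

Failing loudly on a non-2xx response is a deliberate choice here: an automation run that silently skips a rejected policy leaves access control in an unknown state.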

Resource Based Policy (Access Policy)

To create an access policy, it is imperative that the “policyType” parameter in the request body is set to 0. Additionally, ensure that the following mandatory parameters are provided: 

  • Policy name 
  • Resources (Catalog/Schema/Table/Columns) 
  • Permissions (Select/Create/Alter, etc.)
  • Associated users or groups 
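
The parameters listed above could be assembled into a request body along these lines. This is a sketch, not the exact payload from our implementation. The field names (`service`, `resources`, `policyItems`, `accesses`) follow our understanding of Ranger's policy JSON model, and the resource keys and access type names depend on the service definition in use. The function name and all sample values are hypothetical.

```python
def build_access_policy(service, name, resources, accesses, users=(), groups=()):
    """Build a resource-based (access) policy body with policyType 0."""
    return {
        "service": service,   # name of the Ranger service the policy belongs to
        "name": name,
        "policyType": 0,      # 0 = access policy
        "isEnabled": True,
        # e.g. {"catalog": ["hive"], "schema": ["sales"], "table": ["orders"]}
        "resources": {key: {"values": list(vals)} for key, vals in resources.items()},
        "policyItems": [{
            "accesses": [{"type": a, "isAllowed": True} for a in accesses],
            "users": list(users),
            "groups": list(groups),
        }],
    }


# Hypothetical sample values, for illustration only.
policy = build_access_policy(
    service="my_service",
    name="sales_orders_read",
    resources={"catalog": ["hive"], "schema": ["sales"],
               "table": ["orders"], "column": ["*"]},
    accesses=["select"],
    groups=["analysts"],
)
```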

Masking Policy

To create a masking policy, make sure that the “policyType” parameter in the request body is set to 1. Additionally, ensure that the following mandatory parameters are provided: 

  • Policy name 
  • Resources (Catalog/Schema/Table/Columns) 
  • Permissions (Select) 
  • Masking Policy Items (Mask Type, User/Group) 
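
A masking policy body might look like the sketch below. We assume the masking items are carried in `dataMaskPolicyItems`, each with a `dataMaskInfo` block. The mask type names that are valid (for example `MASK` or `MASK_HASH`) depend on the service definition, so check them in your environment. The function name and sample values are hypothetical.

```python
def build_masking_policy(service, name, resources, mask_type, users=(), groups=()):
    """Build a masking policy body with policyType 1."""
    return {
        "service": service,
        "name": name,
        "policyType": 1,      # 1 = masking policy
        "isEnabled": True,
        "resources": {key: {"values": list(vals)} for key, vals in resources.items()},
        "dataMaskPolicyItems": [{
            # Masking applies to read access, hence the select permission.
            "accesses": [{"type": "select", "isAllowed": True}],
            "users": list(users),
            "groups": list(groups),
            "dataMaskInfo": {"dataMaskType": mask_type},
        }],
    }


# Hypothetical sample values, for illustration only.
masking_policy = build_masking_policy(
    service="my_service",
    name="mask_customer_email",
    resources={"catalog": ["hive"], "schema": ["sales"],
               "table": ["customers"], "column": ["email"]},
    mask_type="MASK_HASH",
    groups=["analysts"],
)
```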

Row Filter Policy

To create a row filter policy, the “policyType” parameter in the request body must be set to 2. Additionally, ensure that the following mandatory parameters are provided:

  • Policy name 
  • Resources (Catalog/Schema/Table) 
  • Permissions (Select)
  • Row Filter Policy Items (User/Group, Filter Query) 
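
For a row filter policy, the body could be built as in the following sketch. We assume the filter items are carried in `rowFilterPolicyItems`, with the filter query in `rowFilterInfo.filterExpr`. The function name, the sample filter expression, and the other sample values are hypothetical.

```python
def build_row_filter_policy(service, name, resources, filter_expr, users=(), groups=()):
    """Build a row filter policy body with policyType 2."""
    return {
        "service": service,
        "name": name,
        "policyType": 2,      # 2 = row filter policy
        "isEnabled": True,
        # Row filters target tables, so no column key is expected here.
        "resources": {key: {"values": list(vals)} for key, vals in resources.items()},
        "rowFilterPolicyItems": [{
            "accesses": [{"type": "select", "isAllowed": True}],
            "users": list(users),
            "groups": list(groups),
            # Condition that decides which rows these users and groups may see.
            "rowFilterInfo": {"filterExpr": filter_expr},
        }],
    }


# Hypothetical sample values, for illustration only.
row_filter_policy = build_row_filter_policy(
    service="my_service",
    name="orders_emea_only",
    resources={"catalog": ["hive"], "schema": ["sales"], "table": ["orders"]},
    filter_expr="region = 'EMEA'",
    users=["alice"],
)
```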

These parameters are crucial for defining the scope and permissions of a policy accurately. It is essential to adhere to your organization’s security policies and compliance requirements when creating policies. By adhering to these standards, you can bolster data protection measures and ensure regulatory adherence.

Step 4:

Policy Updates

Determine the circumstances that necessitate policy updates, such as changes in user roles or data access requirements. Utilize the designated API endpoints for policy modification, ensuring that updates are precisely reflected in Ranger’s configuration.
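
As an illustration of an update triggered by a role change, the sketch below fetches an existing policy, adds a user to its access items, and writes the policy back with PUT. The path `/service/public/v2/api/policy/{id}` is our assumption for the public v2 API and should be verified for your Ranger version. The function name `add_user_to_policy` is hypothetical, and `session` is any `requests.Session`-like object with credentials set.

```python
# Assumed path of the public v2 policy endpoint; verify for your Ranger version.
POLICY_ENDPOINT = "/service/public/v2/api/policy"


def add_user_to_policy(session, base_url, policy_id, user):
    """Fetch a policy, add `user` to each of its access items, and PUT it back."""
    url = f"{base_url.rstrip('/')}{POLICY_ENDPOINT}/{policy_id}"

    response = session.get(url)
    response.raise_for_status()
    policy = response.json()

    # Illustrative rule: grant the user everything the existing items allow.
    for item in policy.get("policyItems", []):
        users = item.setdefault("users", [])
        if user not in users:
            users.append(user)

    # Send the whole modified policy back rather than only the changed part.
    response = session.put(url, json=policy)
    response.raise_for_status()
    return response.json()
```

Reading the current policy first keeps the update from overwriting fields that were changed elsewhere since the policy was created.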


In conclusion, harnessing the power of API automation for Ranger policies offers a pathway to streamlined access control management and heightened security within the data ecosystem. With this approach, organizations can achieve greater efficiency, consistency, and scalability in policy management, ultimately strengthening their data governance practices.

About the Author
Kamlesh has over two decades of experience in solution architecture, with a focus on big data, cloud, analytics, AI, and ML. He's adept at designing innovative solutions that not only drive business growth but also streamline efficiency. His commitment to staying ahead of tech trends makes him an asset to USEReady’s Data Value practice.
Kamlesh Kumar Pant, Senior Architect - L2 - Data Value | USEReady