
Enabling Data Governance with Apache Ranger (Part 2)

Micro-managing Data Governance across Hadoop

6 min read · Mar 28, 2024

Welcome back! It's been a while since my last article, and both Apache Ranger and its Trino plugin have changed significantly since then. Don't worry: I've updated my repository to work with the newest version of Apache Ranger and a newer version of Trino. Let's dive right in!

What is Apache Ranger?

Apache Ranger is a framework to enable, monitor, and manage comprehensive data security across the Hadoop platform.
Apache Ranger has the following features:

Centralized security administration to manage all security-related tasks in a central UI or using REST APIs.
Fine-grained authorization for specific actions or operations on a Hadoop component or tool, managed through a central administration tool.
A standardized authorization method across all Hadoop components.
Centralized auditing of user access and security-related administrative actions across all Hadoop components.

Apache Ranger uses two key components for authorization:

Apache Ranger policy admin server
Apache Ranger plugin
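
To make the split between these two components more concrete, here is a minimal toy sketch in Python. It is not Ranger's code or API: the class names `PolicyAdmin`, `Plugin`, and `Policy`, and the resource name `hive.sales.orders`, are all hypothetical and exist only for illustration. The idea it models is that policies are stored centrally in the admin server, while each plugin pulls a copy and evaluates access requests locally inside the component it protects. The deny-by-default behaviour is a simplification on my part.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """A heavily simplified policy: who may do what on which resource."""
    resource: str          # hypothetical name, e.g. "hive.sales.orders"
    users: frozenset
    actions: frozenset


class PolicyAdmin:
    """Toy stand-in for the policy admin server: the one place policies live."""

    def __init__(self):
        self._policies = []

    def add_policy(self, policy):
        self._policies.append(policy)

    def export(self):
        # Hand out an immutable snapshot of the current policies.
        return tuple(self._policies)


class Plugin:
    """Toy stand-in for a plugin running inside a component such as Trino."""

    def __init__(self, admin):
        self._admin = admin
        self._policies = ()    # local cache, empty until the first refresh

    def refresh(self):
        # Pull the latest snapshot from the admin server into the local cache.
        self._policies = self._admin.export()

    def is_allowed(self, user, resource, action):
        # Evaluate against the local cache only; deny when nothing matches.
        return any(
            p.resource == resource and user in p.users and action in p.actions
            for p in self._policies
        )


admin = PolicyAdmin()
plugin = Plugin(admin)
admin.add_policy(Policy("hive.sales.orders", frozenset({"alice"}), frozenset({"select"})))

print(plugin.is_allowed("alice", "hive.sales.orders", "select"))  # False: cache not refreshed yet
plugin.refresh()
print(plugin.is_allowed("alice", "hive.sales.orders", "select"))  # True
print(plugin.is_allowed("bob", "hive.sales.orders", "select"))    # False
```

The point of the sketch is the direction of the data flow: a new policy has no effect on a component until its plugin has picked up the change.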

Setting up Ranger is divided into three steps:

Setting up PostgreSQL with the Apache Ranger Admin Server
Setting up Ranger Trino plugin with Trino
Setting up policies to govern our data

In the first part, we covered installing and setting up the Apache Ranger Admin Server with a Postgres backend, along with a locally hosted Trino to connect to and query. Visit my first blog post if you missed it. So let's check that our server is up and running.
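
If you prefer a scripted check over opening the UI in a browser, a minimal sketch using only the Python standard library could look like this. It assumes the admin server listens on `http://localhost:6080` (6080 is Ranger's default HTTP port, but your setup from Part 1 may differ). The function name `ranger_admin_is_up` and the `opener` parameter are my own, hypothetical additions. It only tells you that something answers with HTTP 200 at that address, not that Ranger is correctly configured.

```python
import urllib.request


def ranger_admin_is_up(base_url="http://localhost:6080", timeout=5,
                       opener=urllib.request.urlopen):
    """Return True if the given URL answers with HTTP 200, else False.

    base_url is an assumption: adjust host and port to your own setup.
    opener is injectable so the function can be exercised without a network.
    """
    try:
        with opener(base_url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # urllib's URLError and HTTPError are OSError subclasses, so this also
        # covers refused connections, timeouts, and HTTP error responses.
        return False
```

Calling `ranger_admin_is_up()` from a Python shell on the machine running the admin server should return `True` once the server has finished starting.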

Written by Mustafa Mirza
