Adding a Ranger Access Permission Policy for Spark2x
Scenario
The Ranger administrator can use Ranger to set permissions for Spark2x users.
Note
After enabling or disabling Ranger authentication on Spark2x, you must restart Spark2x.
Then download the client again, or manually update the client configuration file Client installation directory/Spark2x/spark/conf/spark-defaults.conf as follows:
Enable Ranger: spark.ranger.plugin.authorization.enable=true
Disable Ranger: spark.ranger.plugin.authorization.enable=false
In Spark2x, spark-beeline (applications connected to JDBCServer) supports the Ranger IP address filtering policy (Policy Conditions in the Ranger permission policy), while spark-submit and spark-sql do not.
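If you update the client configuration file by hand, the change amounts to setting a single property. The following Python sketch rewrites that property in the text of spark-defaults.conf. The helper name set_ranger_authorization and the sample content are hypothetical, and the sketch assumes the property is written either as key=value (the form shown above) or as key and value separated by whitespace.

```python
PROPERTY = "spark.ranger.plugin.authorization.enable"


def set_ranger_authorization(conf_text: str, enabled: bool) -> str:
    """Return conf_text with the Ranger authorization switch set as requested."""
    new_line = f"{PROPERTY}={'true' if enabled else 'false'}"
    lines = conf_text.splitlines()
    found = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # leave comment lines untouched
        # Accept both "key=value" and "key value" layouts.
        parts = stripped.replace("=", " ", 1).split(None, 1)
        if parts and parts[0] == PROPERTY:
            lines[i] = new_line
            found = True
    if not found:
        lines.append(new_line)  # property was absent, so add it
    return "\n".join(lines) + "\n"


# Hypothetical file content used only for illustration.
sample = "spark.master=yarn\nspark.ranger.plugin.authorization.enable=false\n"
print(set_ranger_authorization(sample, True))
```

Writing the result back to the file does not replace the restart described in the note; Spark2x still has to be restarted for the change to take effect.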
Prerequisites
The Ranger service has been installed and is running properly.
The Ranger authentication function of the Hive service has been enabled. After enabling it, restart the Hive service first and then restart the Spark2x service.
You have created users, user groups, or roles for which you want to configure permissions.
The created user has been added to the hive user group.
Procedure
Log in to the Ranger management page.
On the home page, click the component plug-in name in the HADOOP SQL area, for example, Hive.
On the Access tab page, click Add New Policy to add a Spark2x permission control policy.
Configure the parameters listed in the table below based on service requirements.
Parameter
Description
Policy Name
Policy name, which can be customized and must be unique in the service.
Policy Conditions
IP address filtering policy, which can be customized. You can enter one or more IP addresses or IP address segments. The IP address can contain the wildcard character (*), for example, 192.168.1.10,192.168.1.20, or 192.168.1.*.
Policy Label
A label specified for the current policy. You can search for reports and filter policies based on labels.
database
Name of the Spark2x database to which the policy applies.
The Include policy applies to the current input object, and the Exclude policy applies to objects other than the current input object.
table
Name of the Spark2x table to which the policy applies.
To add a UDF-based policy, switch to UDF and enter the UDF name.
The Include policy applies to the current input object, and the Exclude policy applies to objects other than the current input object.
column
Name of the column to which the policy applies. The value * indicates all columns.
The Include policy applies to the current input object, and the Exclude policy applies to objects other than the current input object.
Description
Policy description.
Audit Logging
Whether to audit the policy.
Allow Conditions
Conditions under which the policy grants access. You can configure the permissions and exceptions that the policy allows.
In the Select Role, Select Group, and Select User columns, select the role, user group, or user to which the permission is to be granted, click Add Conditions, add the IP address range to which the policy applies, and click Add Permissions to add the corresponding permission.
select: permission to query data
update: permission to update data
Create: permission to create data
Drop: permission to drop data
Alter: permission to alter data
Index: permission to index data
All: all permissions
Read: permission to read data
Write: permission to write data
Temporary UDF Admin: temporary UDF management permission
Select/Deselect All: Select or deselect all.
To add multiple permission control rules, click the add (+) button.
If users or user groups in the current condition need to manage this policy, select Delegate Admin. These users then become delegated administrators, who can update and delete this policy and create sub-policies based on it.
Deny Conditions
Policy rejection condition, which is used to configure the permissions and exceptions to be denied in the policy. The configuration method is similar to that of Allow Conditions.
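The Policy Conditions parameter in the table above accepts a comma-separated list of IP addresses in which * acts as a wildcard. The Python sketch below illustrates that matching rule only. It is not Ranger's own evaluator, the function name ip_matches is hypothetical, and it does not handle other ways of writing an IP address segment.

```python
from fnmatch import fnmatchcase


def ip_matches(client_ip: str, condition: str) -> bool:
    """Return True if client_ip matches any comma-separated entry in condition."""
    patterns = [p.strip() for p in condition.split(",") if p.strip()]
    # fnmatchcase treats "*" as "any run of characters", which mirrors the
    # wildcard behaviour described for the Policy Conditions field.
    return any(fnmatchcase(client_ip, pattern) for pattern in patterns)


print(ip_matches("192.168.1.20", "192.168.1.10,192.168.1.20"))
print(ip_matches("192.168.2.7", "192.168.1.*"))
```

As the note in the Scenario section states, this IP filtering applies only to spark-beeline connections, not to spark-submit or spark-sql.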
Task
Operation
role admin operation
On the home page, click Settings and choose Roles > Add New Role.
Set Role Name to admin. In the Users area, click Select User and select a username.
Click Add Users, select Is Role Admin in the row where the username is located, and click Save.
Note
After a user is bound to the Hive administrator role, the user must perform the following operations before each maintenance operation:
Log in to the node where the Spark2x client is installed as the client installation user.
Configure the environment variables. For example, if the Spark2x client installation directory is /opt/client, run the following command:
source /opt/client/bigdata_env
Run the following command to perform user authentication:
kinit Spark2xService user
Run the following command to log in to the client tool:
spark-beeline
Run the following command to update the administrator permissions:
set role admin;
Creating a database table
Enter the policy name in Policy Name.
Enter and select the corresponding database on the right of database. (If you want to create a database, enter the name of the database to be created or enter * to indicate a database with any name, and then select the name.) Enter and select the corresponding table name on the right of table and column. Wildcard characters (*) are supported.
In the Allow Conditions area, select a user from the Select User drop-down list.
Click Add Permissions and select Create.
Deleting a table
Enter the policy name in Policy Name.
Enter and select the corresponding database on the right of database. (If you want to delete a database, enter the name of the database to be deleted or enter * to indicate a database with any name, and then select the name.) Enter and select the corresponding table name on the right of table and column. Wildcard characters (*) are supported.
In the Allow Conditions area, select a user from the Select User drop-down list.
Click Add Permissions and select Drop.
Note
For CarbonData tables, only the owner of the corresponding database or table can perform the drop operation.
ALTER operation
Enter the policy name in Policy Name.
Enter and select the corresponding database on the right of database, enter and select the corresponding table on the right of table, and enter and select the corresponding column name on the right of column. Wildcard characters (*) are supported.
In the Allow Conditions area, select a user from the Select User drop-down list.
Click Add Permissions and select Alter.
LOAD operation
Enter the policy name in Policy Name.
Enter and select the corresponding database on the right of database, enter and select the corresponding table on the right of table, and enter and select the corresponding column name on the right of column. Wildcard characters (*) are supported.
In the Allow Conditions area, select a user from the Select User drop-down list.
Click Add Permissions and select update.
INSERT operation
Enter the policy name in Policy Name.
Enter and select the corresponding database on the right of database, enter and select the corresponding table on the right of table, and enter and select the corresponding column name on the right of column. Wildcard characters (*) are supported.
In the Allow Conditions area, select a user from the Select User drop-down list.
Click Add Permissions and select update.
The user also needs the submit-app permission on the Yarn task queue. By default, the Hadoop user group has the submit-app permission on all Yarn task queues. For details about how to configure this permission, see Adding a Ranger Access Permission Policy for Yarn.
GRANT operation
Enter the policy name in Policy Name.
Enter and select the corresponding database on the right of database, enter and select the corresponding table on the right of table, and enter and select the corresponding column name on the right of column. Wildcard characters (*) are supported.
In the Allow Conditions area, select a user from the Select User drop-down list.
Select Delegate Admin.
ADD JAR operation
Enter the policy name in Policy Name.
Click database, and select global from the drop-down list. On the right of global, enter related information and select *.
In the Allow Conditions area, select a user from the Select User drop-down list.
Click Add Permissions and select Select/Deselect All.
VIEW and INDEX permissions
Enter the policy name in Policy Name.
On the right side of database, enter the database name and select the corresponding database. (If you want to delete a database, enter the database name and select *.) On the right side of table, enter a table name and select the view and index names. On the right side of column, enter a Hive column name, and select *.
In the Allow Conditions area, select a user from the Select User drop-down list.
Click Add Permissions and select permissions for the user as required.
Operations on other users' databases and tables
Perform the preceding operations to add the corresponding permissions.
Grant the current user the read, write, and execute permissions on the HDFS paths of the other users' databases and tables. For details, see Adding a Ranger Access Permission Policy for HDFS.
Note
After a Spark SQL access policy is added in Ranger, you must also add access policies for the corresponding paths in the HDFS access policy. Otherwise, the data files cannot be accessed. For details, see Adding a Ranger Access Permission Policy for HDFS.
The global policy in the Ranger policy is only used to associate with the Temporary UDF Admin permission to control the upload of UDF packages.
When Ranger is used to control Spark SQL permissions, the empower syntax is not supported.
Click Add. The basic information about the policy is displayed in the policy list. After the policy takes effect, verify that the related permissions work as expected.
To disable a policy, click the edit button for the policy and set the policy to Disabled.
If a policy is no longer used, click the delete button to delete it.
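For reference, the fields filled in during the procedure above can be pictured as one structured policy record. The Python sketch below builds such a record for the "Creating a database table" task. The field names are modeled on the policy JSON used by Apache Ranger's REST API and are an assumption to be checked against the Ranger version in your cluster. The service name Hive, the policy name, the database testdb, and the user sparkuser are hypothetical.

```python
import json


def build_access_policy(service, name, database, table, column, user, permissions):
    """Build a dict shaped like a Ranger access policy (field names are assumptions)."""

    def resource(value):
        # isExcludes=False corresponds to the Include switch described in the table.
        return {"values": [value], "isExcludes": False, "isRecursive": False}

    return {
        "service": service,          # component plug-in name, for example Hive
        "name": name,                # Policy Name, unique within the service
        "isEnabled": True,
        "isAuditEnabled": True,      # Audit Logging
        "resources": {
            "database": resource(database),
            "table": resource(table),
            "column": resource(column),
        },
        "policyItems": [             # Allow Conditions
            {
                "users": [user],
                "groups": [],
                "roles": [],
                "accesses": [{"type": p, "isAllowed": True} for p in permissions],
                "delegateAdmin": False,
            }
        ],
    }


policy = build_access_policy(
    "Hive", "create_in_testdb", "testdb", "*", "*", "sparkuser", ["create"]
)
print(json.dumps(policy, indent=2))
```

The sketch only constructs and prints the record; it does not contact Ranger. The other tasks in the table differ mainly in the permission list (for example drop, alter, or update) or in setting delegateAdmin for the GRANT operation.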
Data Masking of the Spark2x Table
Ranger supports data masking for Spark2x data. It processes the results returned by your select operations to mask sensitive information.
Log in to the Ranger WebUI and click the component plug-in name, for example, Hive, in the HADOOP SQL area on the home page.
On the Masking tab page, click Add New Policy to add a Spark2x data masking policy.
Configure the parameters listed in the table below based on service requirements.
Parameter
Description
Policy Name
Policy name, which can be customized and must be unique in the service.
Policy Conditions
IP address filtering policy, which can be customized. You can enter one or more IP addresses or IP address segments. The IP address can contain the wildcard character (*), for example, 192.168.1.10,192.168.1.20, or 192.168.1.*.
Policy Label
A label specified for the current policy. You can search for reports and filter policies based on labels.
Hive Database
Name of the Spark2x database to which the current policy applies.
Hive Table
Name of the Spark2x table to which the current policy applies.
Hive Column
Name of the Spark2x column to which the current policy applies.
Description
Policy description.
Audit Logging
Whether to audit the policy.
Mask Conditions
In the Select Group and Select User columns, select the user group or user to which the permission is to be granted, click Add Conditions, add the IP address range to which the policy applies, then click Add Permissions and select the select permission.
Click Select Masking Option and select a data masking policy.
Redact: Use x to mask all letters and 0 to mask all digits.
Partial mask: show last 4: Only the last four characters are displayed.
Partial mask: show first 4: Only the first four characters are displayed.
Hash: Perform hash calculation for data.
Nullify: Replace the original value with the NULL value.
Unmasked(retain original value): The original data is displayed.
Date: show only year: Only the year information is displayed.
Custom: You can use any valid Hive UDF (returns the same data type as the data type in the masked column) to customize the policy.
To add a multi-column masking policy, click the add (+) button.
Deny Conditions
Policy rejection condition, which is used to configure the permissions and exceptions to be denied in the policy. The configuration method is similar to that of Allow Conditions.
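To make the masking options in the table above concrete, the Python sketch below imitates four of them on plain strings. It is an illustration of the described behaviour, not the implementation used by Ranger or Spark2x. The function names are hypothetical, and the choice to redact the hidden part of a partial mask with x and 0 is an assumption. Hash, Date, and Custom are omitted.

```python
def redact(value: str) -> str:
    """Redact: letters become 'x', digits become '0', other characters are kept."""
    return "".join(
        "x" if ch.isalpha() else "0" if ch.isdigit() else ch for ch in value
    )


def show_last_4(value: str) -> str:
    """Partial mask: keep the last four characters and redact the rest."""
    return redact(value[:-4]) + value[-4:]


def show_first_4(value: str) -> str:
    """Partial mask: keep the first four characters and redact the rest."""
    return value[:4] + redact(value[4:])


def nullify(value):
    """Nullify: replace the original value with NULL (None in Python)."""
    return None


print(redact("Tom-2024"))
print(show_last_4("1234567890"))
print(show_first_4("1234567890"))
print(nullify("secret"))
```

Values of four characters or fewer pass through the two partial masks unchanged in this sketch, because there is nothing left to hide.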
Spark2x Row-Level Data Filtering
Ranger allows you to filter data at the row level when you perform the select operation on Spark2x data tables.
Log in to the Ranger WebUI and click the component plug-in name, for example, Hive, in the HADOOP SQL area on the home page.
On the Row Level Filter tab page, click Add New Policy to add a row data filtering policy.
Configure the parameters listed in the table below based on service requirements.
Parameter
Description
Policy Name
Policy name, which can be customized and must be unique in the service.
Policy Conditions
IP address filtering policy, which can be customized. You can enter one or more IP addresses or IP address segments. The IP address can contain the wildcard character (*), for example, 192.168.1.10,192.168.1.20, or 192.168.1.*.
Policy Label
A label specified for the current policy. You can search for reports and filter policies based on labels.
Hive Database
Name of the Spark2x database to which the current policy applies.
Hive Table
Name of the Spark2x table to which the current policy applies.
Description
Policy description.
Audit Logging
Whether to audit the policy.
Row Filter Conditions
In the Select Role, Select Group, and Select User columns, select the object to which the permission is to be granted, click Add Conditions, add the IP address range to which the policy applies, then click Add Permissions and select the select permission.
Click Row Level Filter and enter data filtering rules.
For example, to filter out the rows of table A whose name column is zhangsan, set the filtering rule to name <> 'zhangsan'. For more information, see the official Ranger documentation.
To add more rules, click the add (+) button.
Click Add. The basic information about the policy is displayed in the policy list.
After you perform the select operation on the Spark2x client on a table configured with a row-level filtering policy, the system filters the data before displaying it.
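The effect of the example rule name <> 'zhangsan' can be previewed with ordinary SQL. The Python sketch below uses an in-memory SQLite database as a stand-in for Spark2x and assumes the rule behaves like a WHERE predicate added to the query. The table name table_a and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_a (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO table_a VALUES (?, ?)",
    [(1, "zhangsan"), (2, "lisi"), (3, None)],
)

# The rule as it would be entered in the Row Level Filter field.
row_filter = "name <> 'zhangsan'"

# Assumption: the filter acts like a WHERE clause on the user's query.
visible = conn.execute(
    f"SELECT id, name FROM table_a WHERE {row_filter} ORDER BY id"
).fetchall()
print(visible)  # prints [(2, 'lisi')]
```

Row 3 is hidden as well: comparing NULL with <> does not evaluate to true in SQL, so a rule written this way also filters out rows whose name is NULL.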