
What Types of Jobs Are Supported by MRS?

MRS provides jobs as a platform for executing programs. MRS currently supports MapReduce, Spark, and Hive jobs. Table 1 describes each job type.

Table 1 Job types

Type

Description

MapReduce

MapReduce is a programming model that simplifies parallel computing and is used to process large data sets (over 1 TB) in parallel.

Map divides one task into multiple tasks, and Reduce summarizes the processing results of these tasks and produces the final analysis result.

After you finish developing the code, package it into a JAR file in IDEA or Eclipse, upload the file to the MRS cluster for execution, and obtain the execution result.

Spark

Spark is a fast batch data processing engine. Because Spark performs computing in memory, it has demanding memory requirements. A Spark job can be one of the following:

  • Spark: a program file whose name ends with .jar (case-insensitive).
  • Spark Script: a script file whose name ends with .sql (case-insensitive).
  • Spark SQL: standard Spark SQL statements, for example, show tables;.

Hive

Hive is a data warehouse framework built on Hadoop. Hive provides the Hive query language (HiveQL), which is similar to structured query language (SQL), for processing structured data. Hive automatically converts the HiveQL in a Hive Script into MapReduce tasks to query and analyze massive data stored in the Hadoop cluster.

An example of a standard HiveQL statement is as follows: create table page_view(viewTime INT, userid BIGINT, page_url STRING, referrer_url STRING, ip STRING COMMENT 'IP Address of the User');
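
The Map and Reduce phases described for MapReduce jobs in Table 1 can be illustrated with a minimal word-count sketch. This is plain Python that only mimics the model on a single machine; it is not the Hadoop or MRS API, and the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical names chosen for illustration.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one input chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group intermediate values by key, the step a framework runs between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: summarize all values collected for one key into a single result."""
    return key, sum(values)


def word_count(chunks):
    """Run the map step on each chunk, then reduce the grouped results."""
    intermediate = []
    for chunk in chunks:  # in a real cluster, chunks are processed in parallel
        intermediate.extend(map_phase(chunk))
    return dict(reduce_phase(k, v) for k, v in shuffle(intermediate).items())


if __name__ == "__main__":
    print(word_count(["big data big", "data sets"]))
```

Each input chunk is mapped independently, which is what allows a real MapReduce job to spread the work across many nodes before Reduce combines the partial results.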