mrjob


mrjob is a Python 2.5+ package that helps you write and run Hadoop Streaming jobs.

mrjob fully supports Amazon’s Elastic MapReduce (EMR) service, which allows you to buy time on a Hadoop cluster on an hourly basis. It also works with your own Hadoop cluster.
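For readers new to Hadoop Streaming, the following is a minimal plain-Python sketch of the map, sort, and reduce flow that such a job follows, using a word count as the example. It deliberately does not use mrjob's own API; the names `mapper`, `reducer`, and `run_local` are hypothetical and chosen only for illustration, and the in-process sort merely stands in for the shuffle that a Hadoop cluster would perform.

```python
from itertools import groupby
from operator import itemgetter


def mapper(line):
    """Emit a (word, 1) pair for every whitespace-separated word."""
    for word in line.split():
        yield word.lower(), 1


def reducer(word, counts):
    """Sum the counts collected for one word."""
    return word, sum(counts)


def run_local(lines):
    """Run map, sort, and reduce in-process on an iterable of lines."""
    pairs = [pair for line in lines for pair in mapper(line)]
    # Sorting by key stands in for Hadoop's shuffle/sort phase.
    pairs.sort(key=itemgetter(0))
    return [
        reducer(word, (count for _, count in group))
        for word, group in groupby(pairs, key=itemgetter(0))
    ]


if __name__ == "__main__":
    # Print tab-separated key/value lines, the convention Hadoop Streaming uses.
    for word, total in run_local(["the quick brown fox", "the lazy dog"]):
        print("%s\t%d" % (word, total))
```

On a real cluster the mapper and reducer run as separate processes across many machines; the sketch only shows the shape of the computation.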

To get started, install with pip:

pip install mrjob

Then read Getting started. Other common documentation destinations are:

Table of Contents

Indices and tables
