University of California, Berkeley
EECS Department - Computer Science Division

CS3 Lecture 14 : MapReduce


Project Questions

Review - Fractals


MapReduce

Overview

Non-computer example : Sorting cards

Working in parallel

Real-world problems that currently use parallel computation

Programming in a way that makes parallel programming easy ... MapReduce

MapReduce in CS3 on one local machine

;; 1 + 4 + 9 + 16 + 25 + 36 + 49 + 64 + 81 + 100 = 385
STk> (reduce + (map square '(1 2 3 4 5 6 7 8 9 10)))
385

MapReduce in CS3 on a cluster of machines!

(reduce-map-letter reducer-fun mapper-fun filename-or-directory)
(reduce-map-word   reducer-fun mapper-fun filename-or-directory)
(reduce-map-sent   reducer-fun mapper-fun filename-or-directory)
Example file "/ilovecs3" (two lines):

i love
cs3

Think of each call as working like this, where file-as-list depends on which function you call:

(reduce-arbitrarily reducer-fun (unsort (map mapper-fun file-as-list)))

Name of function
    How does it convert a file to a list to pass to map?
    What does map see as its second argument if the file is "/ilovecs3"? (I.e., what is file-as-list?)
    Domain of mapper-fun?

reduce-map-letter
    Ignores carriage returns and spaces (actually whitespace in general) and treats the file as a big list of all its letters.
    '( i l o v e c s 3 )
    letters

reduce-map-word
    Ignores carriage returns and treats the file as a big list of words.
    '( i love cs3 )
    words

reduce-map-sent
    Converts every line into a sentence, and makes a list of all the sentences.
    '( (i love) (cs3) )
    sentences
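
The three conversions can be imitated locally, without the cluster. The sketch below is only an illustration in plain Scheme: file-lines, file->sents, file->words, file->letters, and word->letters are hypothetical names (not part of the CS3 library), and the two-line file is written out by hand as a list of sentences rather than read from disk.

```scheme
;; Hypothetical stand-in for the file "/ilovecs3": one sentence per line.
(define file-lines '((i love) (cs3)))

;; reduce-map-sent view: every line is a sentence.
(define (file->sents lines) lines)

;; reduce-map-word view: line breaks are ignored, leaving one flat list of words.
(define (file->words lines) (apply append lines))

;; Split a single word (symbol or number) into its one-character pieces.
;; Digits come back as numbers, everything else as symbols.
(define (word->letters w)
  (let ((s (if (number? w) (number->string w) (symbol->string w))))
    (map (lambda (c)
           (let ((str (string c)))
             (or (string->number str) (string->symbol str))))
         (string->list s))))

;; reduce-map-letter view: whitespace is ignored, leaving one flat list of letters.
(define (file->letters lines)
  (apply append (map word->letters (file->words lines))))

(file->sents file-lines)    ; ((i love) (cs3))
(file->words file-lines)    ; (i love cs3)
(file->letters file-lines)  ; (i l o v e c s 3)
(word->letters 10)          ; (1 0)
```

The last line shows why a letter-based mapper sees the word 10 as two separate items, 1 and 0.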

Show me the money! Let's see an example...

Contents of the file "/1-10":

1
2
3
4
5
6
7
8
9
10


unix% ssh cs3-__@icluster.eecs.berkeley.edu
icluster1 [1] ~ > stk-simply
STk> (define (square x) (* x x))
STk> (define (identity arg) arg)
STk> (reduce + (map square '(1 2 3 4 5 6 7 8 9 10)))
385

;; This should be 286, because the "10" will be treated as a "1" and "0". 385 - 10^2 + 1^2 + 0^2 = 286
STk> (reduce-map-letter + square "/1-10")
Mapreduce in progress! Your ID number is 933510232. For progress info, see
http://icluster1.eecs.berkeley.edu:35702/jobdetails.jsp?jobid=job_200811231236_0614
Working...............................................................................
286

STk> (reduce-map-word + square "/1-10")
Mapreduce in progress! Your ID number is 28540981. For progress info, see
http://icluster1.eecs.berkeley.edu:35702/jobdetails.jsp?jobid=job_200811231236_0279
Working......................................
385

STk> (reduce-map-sent + (lambda (s) (reduce + (map square s))) "/1-10")
Mapreduce in progress! Your ID number is 475145543. For progress info, see
http://icluster1.eecs.berkeley.edu:35702/jobdetails.jsp?jobid=job_200811231236_0621
Working......................................................
385

;; Let's now try a non-associative, non-commutative reducer...
STk> (reduce-map-word list identity "/1-10")
Mapreduce in progress! Your ID number is 200766683. For progress info, see
http://icluster1.eecs.berkeley.edu:35702/jobdetails.jsp?jobid=job_200904091835_1498
Working...................................
(3 ((5 (8 (7 2))) ((1 9) ((4 6) 10))))

STk> (reduce-map-word list identity "/1-10")
Mapreduce in progress! Your ID number is 1615407800. For progress info, see
http://icluster1.eecs.berkeley.edu:35702/jobdetails.jsp?jobid=job_200904091835_1499
Working.......................
(9 ((7 1) (2 ((6 10) (8 ((3 5) 4))))))
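
The two runs with list as the reducer give different answers because the mapped values are combined in an unpredictable order and grouping. Here is a minimal local sketch of the grouping effect alone, assuming nothing about how the cluster actually schedules its work. reduce-left and reduce-right are hypothetical helpers (not part of the CS3 library) that combine the same list with two fixed groupings.

```scheme
;; Hypothetical helpers: two fixed ways of grouping the same values.
;; Both assume a non-empty list.
(define (reduce-left f lst)             ; groups as ((a b) c) d
  (let loop ((acc (car lst)) (rest (cdr lst)))
    (if (null? rest)
        acc
        (loop (f acc (car rest)) (cdr rest)))))

(define (reduce-right f lst)            ; groups as a (b (c d))
  (if (null? (cdr lst))
      (car lst)
      (f (car lst) (reduce-right f (cdr lst)))))

;; + is associative and commutative, so the grouping does not matter.
(reduce-left  + '(1 2 3 4))     ; 10
(reduce-right + '(1 2 3 4))     ; 10

;; list is neither, so the grouping shows up in the answer.
(reduce-left  list '(1 2 3 4))  ; (((1 2) 3) 4)
(reduce-right list '(1 2 3 4))  ; (1 (2 (3 4)))
```

A reducer like + or append gives a predictable total no matter how the pieces are paired up, which is why the earlier sums came out the same on every run.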

How about other files and examples?

Filename
    Contents

"/beatles-songs"
    This one is small and has all Beatles song names. There are 13 files in this directory, which you can think of as being all in one file. The files are:

    abbey-road
    a-hard-days-night
    beatles-for-sale
    beatles-white-album
    help
    let-it-be
    magical-mystery-tour
    please-please-me
    revolver
    rubber-soul
    sgt-peppers-lonely-hearts-club-band
    with-the-beatles
    yellow-submarine

"/gutenberg/shakespeare"
    The collected works of William Shakespeare

"/gutenberg/dickens"
    The collected works of Charles Dickens

;; I wonder how many times Shakespeare wrote the word love?
STk> (define (love? w) (equal? w 'love))
STk> (reduce-map-word + (lambda (w) (if (love? w) 1 0)) "/gutenberg/shakespeare")
Mapreduce in progress! Your ID number is 2052777448. For progress info, see
http://icluster1.eecs.berkeley.edu:35702/jobdetails.jsp?jobid=job_200811231236_0689
Working...............................................................................
1214
 
;; Let's double-check that
STk> (reduce-map-sent + (lambda (s) (appearances 'love s))  "/gutenberg/shakespeare")  
Mapreduce in progress! Your ID number is 861649500. For progress info, see
http://icluster1.eecs.berkeley.edu:35702/jobdetails.jsp?jobid=job_200811231236_0696
Working.............................................................
1214

;; I wonder what words in the Beatles songs start with u?
STk> (reduce-map-word se (lambda (w) (if (equal? (first w) 'u) w '())) "/beatles-songs")
Mapreduce in progress! Your ID number is 447796622. For progress info, see
http://icluster1.eecs.berkeley.edu:35702/jobdetails.jsp?jobid=job_200811231236_0714
Working............................................................................
(us universe u.s.s.r.)

;; What songs from the Beatles' Abbey Road have the word "the" in them?
STk> (define (keep-the-sents s)   ;; sentences with "the" in them pass through
       (if (member 'the s)
           (list s)               ;; "buffer" the sentence with a list
           (list)))               ;; the rest are turned into null lists.

STk> (reduce-map-sent append keep-the-sents "/beatles-songs/abbey-road")
Mapreduce in progress! Your ID number is 1945545779. For progress info, see
http://icluster1.eecs.berkeley.edu:35702/jobdetails.jsp?jobid=job_200811231236_0734
Working............................................................................
((the end) (here comes the sun) (she came in through the bathroom window))
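
These examples rely on two mapper patterns. To count, map each item to 1 or 0 and reduce with +. To filter, map each item to a one-element list or an empty list and reduce with append. The sketch below tries both on a small hand-written list, so it runs without the cluster. reduce-left is a hypothetical helper (not part of the CS3 library), and sents is sample data for illustration only, not the contents of the cluster files.

```scheme
;; Hypothetical helper; assumes a non-empty list.
(define (reduce-left f lst)
  (let loop ((acc (car lst)) (rest (cdr lst)))
    (if (null? rest)
        acc
        (loop (f acc (car rest)) (cdr rest)))))

;; Sample sentences for illustration only, not the cluster files.
(define sents '((here comes the sun) (i love cs3) (the end)))

;; Counting: each word becomes 1 or 0, and + adds them up.
(define (love? w) (equal? w 'love))
(define love-count
  (reduce-left + (map (lambda (w) (if (love? w) 1 0))
                      (apply append sents))))
love-count                                        ; 1

;; Filtering: matching sentences are wrapped in a list, the rest become
;; empty lists, and append glues the survivors together.
(define (keep-the-sents s)
  (if (member 'the s) (list s) (list)))
(define the-sents
  (reduce-left append (map keep-the-sents sents)))
the-sents                                         ; ((here comes the sun) (the end))
```

Wrapping each kept sentence in a list is what lets append return a list of sentences instead of running all their words together.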

Summary

In lab this week

In life this week