
What is MapReduce?

the power of collective intelligence

by 정원혁

Do you know how many gas stations are in San Jose?

Or how many convenience stores (CVSs) are in Seoul?


Can you count how many birds are in this picture?

[Image: a flock of birds]


We can be powerful when we are connected. We can do many wonderful things when we work together!


In 2004, Google introduced this word: MapReduce. To handle big data, you must know this concept. It's the foundation. Today, I will explain this tough word in an easy way. Google built this framework to handle petabytes of data, even on old and slow computers.


I heard that Google (or GM, or whoever) asks its interviewees very tricky questions, and the question above was one of them: "How many gas stations are in San Jose?"


OK, let's solve the number-of-birds question first. How can I count them? How could a newspaper reporter count a huge number of birds and write, "Over a hundred thousand birds flew from Russia to the Nakdong River"? Did he really count the birds one by one before writing that story, or did he just lie?

Here's the answer: he chose a 1 cm by 1 cm square in the picture, counted the birds in that square, and multiplied the count by the number of squares. If he wants more accuracy, he can count more squares and average them.

[Image 4.png]


Yes, that was the old way. We sample just one or a few of the squares. We do not actually count all the birds. That's statistics: what a sample tells us about the whole population.
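To make the reporter's trick concrete, here is a minimal sketch in Python. The grid of bird counts is made up for illustration, and the function name `estimate_total` is my own; it only shows the idea of counting a few squares, averaging them, and multiplying by the number of squares.

```python
import random

def estimate_total(grid, sample_size, seed=0):
    """Estimate the total by counting only a few randomly chosen squares."""
    rng = random.Random(seed)
    sampled = rng.sample(grid, sample_size)   # pick some squares
    average = sum(sampled) / sample_size      # average birds per square
    return average * len(grid)                # scale up to the whole picture

# Hypothetical picture: 100 squares, each with a made-up bird count.
picture_rng = random.Random(42)
grid = [picture_rng.randint(5, 15) for _ in range(100)]

estimate = estimate_total(grid, sample_size=10)
actual = sum(grid)
print("estimated:", estimate, "actual:", actual)
```

Counting more squares (a bigger `sample_size`) usually brings the estimate closer to the actual total, which is exactly what the reporter does when he wants more accuracy.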


Now that is changing: why not count all the birds? That is big data! With big data, I cannot count everything alone. I count my own part and you count yours. This process is called "Map".

After we have all counted our own parts, we add those numbers up. This process is called "Reduce".
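The two steps can be sketched with Python's built-in `map` and `functools.reduce`. This is only a toy illustration on one computer: the squares and the helper `count_birds` are invented for the example, and Google's real framework works with key/value pairs and extra machinery that I leave out here.

```python
from functools import reduce

def count_birds(square):
    """Map step: one person counts the birds in their own square."""
    return len(square)

# Hypothetical picture cut into four squares.
squares = [
    ["bird", "bird", "bird"],
    ["bird", "bird", "bird", "bird", "bird"],
    [],
    ["bird", "bird"],
]

# Map: everyone counts their own part.
partial_counts = list(map(count_birds, squares))

# Reduce: combine all the partial counts into one number.
total = reduce(lambda a, b: a + b, partial_counts, 0)

print(partial_counts)  # [3, 5, 0, 2]
print(total)           # 10
```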


Actually, these processes are assigned to many computers: sometimes 10, sometimes 100 or more. In the past, only supercomputers could take on this kind of work, but now any computer can join. Fast computers can count many squares; slow ones may count only one. Each computer can play its own role.
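Here is a rough sketch of that sharing of work. I use threads from Python's standard `concurrent.futures` module as stand-ins for separate computers, and the counts and the way the squares are split are made up: the "fast" worker gets four squares, the "slow" one gets only one.

```python
from concurrent.futures import ThreadPoolExecutor

def count_chunk(chunk):
    """One worker adds up the bird counts of the squares it was given."""
    return sum(chunk)

# Hypothetical bird counts for eight squares of the picture.
square_counts = [12, 7, 9, 14, 3, 8, 11, 6]

# Uneven split: a fast worker, a medium worker, and a slow worker.
chunks = [square_counts[0:4], square_counts[4:7], square_counts[7:8]]

# Map: each worker counts its own chunk at the same time.
with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
    partial_totals = list(pool.map(count_chunk, chunks))

# Reduce: add up what every worker reported.
total = sum(partial_totals)

print(partial_totals)  # [42, 22, 6]
print(total)           # 70
```

No matter how the squares are divided among the workers, the final total is the same, so even a slow computer that handles a single square still contributes.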


Yes, I found "collective intelligence" here. Each of us can do just a small but significant part of a project. That's it!


Do you know how ants work together? They cross a wide gap between leaves, as in the following picture.

[Image: ants crossing a gap between leaves]



Yes! We can work together, think together, and make things better. That is how computers imitate us humans. That is what we can learn from MapReduce.

