Nexus Weblogging
ChinaonRails
Disco: a Map/Reduce framework for distributed computing

by IceskYsl ... 3 months ago ... 302 views


Disco is an open-source implementation of the Map-Reduce framework for distributed computing. Like the original framework, Disco supports parallel computations over large data sets on an unreliable cluster of computers.

Disco was started at Nokia Research Center as a lightweight framework for rapid scripting of distributed data processing tasks.
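
To make the Map-Reduce model described above concrete, here is a minimal single-process sketch in plain Python. It does not use Disco's API; the function names (map_words, shuffle, reduce_counts, word_count) are hypothetical and chosen only for illustration. A real framework would run the map and reduce steps on many cluster nodes and rerun tasks that fail.

```python
from collections import defaultdict


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework does between the two phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: collapse all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over a list of text lines."""
    pairs = [pair for line in lines for pair in map_words(line)]
    return dict(reduce_counts(key, values) for key, values in shuffle(pairs).items())
```

Because each line is mapped independently and each key is reduced independently, both phases can in principle be spread across machines without changing the result.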

Erlang
Python



The code is on GitHub:
http://github.com/tuulos/disco/tree/master

2 - 9-8 10:28
404 Shenzhen
Skynet is the real Map/Reduce framework for Ruby.

robbin: Skynet, a Google-style Map/Reduce framework for Ruby
3 - 9-8 10:40
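404 Shenzhen
By the same token:
There is also octopy, a Map/Reduce framework for Python. True to Python's taste for simplicity, it is a single file, octo.py.

A mnemonic:
OCTO is a Swiss brand of calendar watches. Dating from 1848, it takes its name from the English word for an eight-sided shape, OCTAGON. Its distinctive design and image were established by its eight-day clocks for vehicle dashboards and by an octagonal wristwatch built around an elongated movement that runs for eight days. As a result, everything from the brand's logo to its case shapes is designed around the octagon.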
4 - 9-8 10:53
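404 Shenzhen
There is also erldmr, a distributed Map/Reduce framework for Erlang. When I first learned about Map/Reduce I assumed it was simply a form of distributed processing, so it is striking to see it thrive on top of Erlang's own distribution. plists (an Erlang module that replaces the standard lists module and performs list operations in parallel) also includes a simple Map/Reduce implementation. I once read a blog post by a foreign author who used plists while writing an HTTP/web server with Erlang and JSON on top of mochiweb, but I have forgotten the details.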
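
The idea of parallel list operations with a simple Map/Reduce on top can be sketched in Python as well. This is not the plists or erldmr API; parallel_map_reduce, chunked, and sum_of_squares are hypothetical names. The sketch uses threads from the standard library for simplicity, so it shows the structure of the technique rather than a real speedup for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
import operator


def chunked(items, size):
    """Split a list into consecutive sublists of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def parallel_map_reduce(items, map_fn, reduce_fn, initial, chunk_size=4, workers=4):
    """Map and partially reduce each chunk in a worker, then merge the partials.

    reduce_fn must be associative and `initial` must be its identity element,
    otherwise the chunked result can differ from a sequential fold.
    """
    def work(chunk):
        return reduce(reduce_fn, map(map_fn, chunk), initial)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(work, chunked(items, chunk_size)))
    return reduce(reduce_fn, partials, initial)


def sum_of_squares(numbers):
    """Example job: square every number in parallel chunks and add the results."""
    return parallel_map_reduce(numbers, lambda x: x * x, operator.add, 0)
```

Reducing inside each worker keeps the amount of data sent back to the coordinator small, which is the same reason distributed frameworks combine partial results near the data.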
5 - 9-10 9:20
bd7lx Shenzhen
http://railspikes.com/2008/9/9/ec2-mapreduce-slides

Besides the introductions above, there is also a practical example of using EventMachine for MapReduce parallel computation:

http://nutrun.com/weblog/eventmachine-mapreduce/
7 - 9-14 13:38
404 Shenzhen
Disco is an open-source Map/Reduce framework written in Erlang, but it is used from Python programs. It should be more robust than octopy.

8 - 10-4 1:01
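404 Shenzhen
Greenplum can be used for large-scale data analysis and parallel data processing. It is not MapReduce alone but an integration of MapReduce and SQL: programmers work with MapReduce, DBAs write SQL, and the details are left to Greenplum's parallel dataflow engine.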
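
To illustrate why SQL and MapReduce can sit side by side, the sketch below computes the same aggregate both ways. It does not use Greenplum; SQLite from the Python standard library stands in for the SQL side, and the table name, column names, and sample rows are hypothetical.

```python
import sqlite3
from collections import defaultdict

# Hypothetical (city, clicks) sample data.
ROWS = [("shenzhen", 3), ("beijing", 5), ("shenzhen", 4)]


def totals_with_sql(rows):
    """DBA view: a declarative GROUP BY over an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE clicks (city TEXT, n INTEGER)")
        conn.executemany("INSERT INTO clicks VALUES (?, ?)", rows)
        cursor = conn.execute("SELECT city, SUM(n) FROM clicks GROUP BY city")
        return dict(cursor.fetchall())
    finally:
        conn.close()


def totals_with_mapreduce(rows):
    """Programmer view: emit (key, value) pairs, group by key, reduce with sum."""
    groups = defaultdict(list)
    for city, n in rows:        # map: emit (city, n)
        groups[city].append(n)  # shuffle: group values by key
    return {city: sum(ns) for city, ns in groups.items()}  # reduce
```

Both functions return the same per-city totals for the same rows, which is the sense in which a GROUP BY query and a map/reduce job can describe one computation.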



The Power of Parallel Computing for Large-Scale Data Warehousing and Analytics

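This brings to mind BigTable, Google's distributed storage system for structured data. BigTable stores its data on GFS and can serve as an input source and output target for MapReduce jobs; even GAE data is stored in BigTable by default. It also brings to mind MySQL table partitioning, table splitting, and so on. Unfortunately I have not looked into those in depth, and I digress.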
9 - 10-4 1:24
404 Shenzhen
There is also a similar product, ASTER, which handles data processing with a cluster plus MapReduce.




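According to the introduction on the official site:
queries are 10 times faster than on a traditional RDBMS;
data can be scaled from gigabytes to petabytes with a simple click;
hardware costs can be cut by a factor of 10.
It seems to have a partnership with MySpace.com, and it is presumably a paid service.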