Dec 28, 2009

A Small Computing Cluster on Board

In my previous post, I mentioned a newly developed GPU-based parallel Gibbs sampling algorithm for inference of LDA. Of course, as you know, there are many other GPU-based parallel algorithms that can solve many interesting applications efficiently using NVidia's CUDA programming framework.

More over, by Googling "CUDA MapReduce", you will find MapReduce implementations based on CUDA and GPU, developed by researchers at UC Berkeley, U Texas, Hong Kong Univ. of Sci.&Tech, and etc.

About the supporting hardware, I recently noticed NVidia's Tesla processor board, which contains a 240-core Tesla 10 GPU and 4GB on-board memory. This card can be installed on workstations like Dell Precision T7500. At the time when this essay is written, the price of such a system is about RMB 43,000.

Last but not least, there are significant differences between GPU-clusters and relatively traditionally computer-clusters. Few listed here:
  1. There is no mature load-balancing mechanism on GPU-clusters. Currently, GPU-based parallel computing is in the early stage of CPU-based parallel computing, which I mean, no automatic balancing over processors used by a task, and no scheduling and balancing over tasks. This prevents multiple projects from sharing a GPU-cluster.
  2. GPU-cluster is based on shared-memory architecture, so it is suitable only for the class of computing-intensive but data-sparse tasks. I do not see more than a few problems in real world that fit in this class.
But anyway, it is not smart to compare GPU-based parallel computing directly with multi-core CPU based solutions, because the latter can be naturally incorporated into multi-computer parallel computing and achieve naturally much higher scalability.

1 comment:

CIIT Noida said...

CIITN is located in Prime location in Noida having best connectivity via all modes of public transport. CIITN offer both weekend and weekdays courses to facilitate Hadoop aspirants. Among all Hadoop Training Institute in Noida , CIITN's Big Data and Hadoop Certification course is designed to prepare you to match all required knowledge for real time job assignment in the Big Data world with top level companies. CIITN puts more focus in project based training and facilitated with Hadoop 2.7 with Cloud Lab—a cloud-based Hadoop environment lab setup for hands-on experience.

CIITNOIDA is the good choice for Big Data Hadoop Training in NOIDA in the final year. I have also completed my summer training from here. It provides high quality Hadoop training with Live projects. The best thing about CIITNOIDA is its experienced trainers and updated course content. They even provide you placement guidance and have their own development cell. You can attend their free demo class and then decide.

Hadoop Training in Noida
Big Data Hadoop Training in Noida