Editorial - Science & Technology 
High Performance Computing on the Mac
An Interview with Dr. Gaurav Khanna
By John Martellaro - OSXFAQ Senior Editor - Science & Technology
November 26, 2002
The elegance, craftsmanship, and UNIX-like nature of Mac OS X
along with great industrial design of the Macintosh hardware call
out for exploitation as a high performance computing (HPC) tool
for the scientist, engineer, and researcher. Whether you call it
HPC or high throughput computing (HTC), it all means the same
thing: utilizing the power of multiple computers, hardware and
software, combined with specialized software and a high performance network to
tackle very difficult problems.
Dr. Gaurav Khanna at Long Island University in New York is an
expert on this topic and has kindly agreed to be interviewed.
What follows is a lot of what you need to know to get started in
HPC. You'll also get some insight into the life of a theoretical
astrophysicist.
1. Tell us about yourself: your position at LIU, your background,
education, and your technical interests.
Let's go in chronological order. I received my initial education in my
home country, India. I was fortunate enough to get a chance to attend
some of the best schools available there. I got a Bachelor's degree in
Technology (B. Tech.) at the Indian Institute of Technology, Kanpur
with a major in Electrical Engineering and a minor in Physics. Then I
moved to the U.S. for graduate school. I attended Penn State's Center
for Gravitational Physics and Geometry and graduated with a doctorate
in Astrophysics in August 2000.
Right out of grad school, I joined Long Island University as a
tenure-track assistant professor where I have been ever since.
On campus here, I teach a variety of undergraduate courses, continue
my astrophysics research and also tinker with
computer technology, multimedia, and technical computing by being
involved actively with the on-campus Technology Center
(http://techcenter.southampton.liu.edu/). This Tech Center is a
cross-disciplinary, faculty run facility that serves as a teaching
lab, a multimedia center, a parallel computing cluster and a place to
experiment with several cutting edge computer technologies. Graduate
school helped mature my early interest in computers, since I got a
chance to involve myself with several high performance computational
facilities at Penn State for my Black Hole related research. So, it's no
surprise that I found myself playing with and contributing to the Tech
Center facilities here, at LIU.
In addition to analytic and numerical modeling using computers, I
tremendously enjoy playing with Web based technologies, streaming
media, video editing etc. In fact, I am also responsible (with a
student assistant) for webcasting, streaming and digitally archiving (on
DVD) our on-campus Natural Science Faculty Seminar Series, called
BrainWASH.
2. How did you become interested in astronomy? Are you also
an amateur astronomer?
The single most influential person in my life is probably my father. He
happens to be a particle physicist, so naturally I got interested in
physics at a very early age. I'm not exactly certain how
astronomy/astrophysics seeped in ... although I am certain that it was
very early (sometime in high school). I developed an almost obsessive
interest in General Relativity at that time!
Unfortunately, I am not an amateur astronomer. It's not that I am not
interested ... I just have not yet been able to find the time for it! I
will someday.
3. How long have you been using a Macintosh? What got you started
with Macs? What kinds of Macs are you currently using?
I got my first Mac in 1998 when I was a graduate student at Penn State. It
was an old Mac Classic that I picked up from university salvage!! I got
it simply because I always wanted to have one. As a child in India, I had
read so much about Apple Computers, but had never really gotten a chance
to have or play with a Mac. However, as I started out with that old Mac
Classic ... I found myself getting drawn to it, and soon I began buying more
recent models just because I wanted to do more. In a year or so, I was the
contented owner of a G3.
Currently I have four Macs. I have an old G3 Strawberry iMac (350 MHz,
96MB, OS 8.6, printer, scanner) that my two year old plays with. I also
have a G4 Cube (450 MHz, 0.5 GB RAM, 15" Apple Studio LCD, Mac OS X
Jaguar) and a 1.0 GHz DP G4 DDR PowerMac (my primary Mac, running
Jaguar) at home. At work, I have a G4 PowerMac (DP 450 MHz, 1.75 GB
RAM, Jaguar) with a 17" Apple Studio CRT. I also have a digital camcorder, a
Canon ZR-10 that talks to these Macs via FireWire.
4. What is it about the Mac, Mac OS X specifically, that appeals to
you? Why is Mac OS X your "scientist's desktop"? Are Macs in abundance at
LIU?
I love Macs, especially now with OS X. It's the only OS that is UNIX
based and has all the current writing, presentation, multimedia and
creativity software packages available for it. In my opinion that makes
it unique. Personally, I love it because I have always loved the
beautiful and intuitive Mac OS GUI and now with OS X I can also do my
scientific research work on it! It's the combination of that
workstation level OS with Aqua's simplicity and intuitiveness ... that
just does wonders for me!
Indeed, I can run my Black Hole collision and Gravity Wave simulations
on OS X, monitor their progress using graphing software like gnuplot
and proFit, make QT movies of the evolution, embed them and host them
on Web pages or in presentations, write technical papers about my
results ... all while sitting in the same chair at one machine ... my G4
PowerMac!
Let me give you an example. Last night at home, after dinner, I
started two different simulations on my dual processor G4 that model
gravity wave emission from a rotating black hole, when a small matter
particle falls into it. I set them to run in the background while I
started to do other things. Using iMovie and iDVD I started archiving
our faculty seminars from this semester on a DVD. And before I went to
bed, I had a professional looking DVD with a proper theme and
everything with the first month of talks and also some numerical data
ready to be included in a technical paper that I am currently working
on. Now, there you go. This one computer can handle two simultaneous
computationally intensive simulations and edit and encode video and
burn it on a DVD. Tell me about another platform that even comes close!
Macs are not at all abundant at LIU. There are two small Mac labs in
the Creative Arts and Media division and that's it. All other labs are
PC labs. However, we are beginning to change that here. Like I
mentioned before, we have a cross-disciplinary Tech Center that
involves projects in creative arts, multimedia, teaching and
computational research. We originally had all the machines there as SGI
IRIX workstations (O2s and Octanes). Now we are in the process of
replacing them with G4 PowerMacs running OS X. In fact, we have a third of
all the SGIs replaced by Macs already in this facility.
5. You are the author of a package called "Numerical Computation
Tools". Tell us about the package: where to get it, and what do the
various pieces do?
NCT is an extensive collection of open source tools and libraries all
pre-compiled and tested for OS X, and therefore ready-to-go for use
in scientific research on Macs. The website is:
http://gravity.psu.edu/~khanna/hpc.html
Without too much detail ... here is what is available on that site.
FORTRAN: Without any doubt, F77 is still the most commonly used
programming language in numerical scientific modeling. The site has
several open source options for F77 on OS X available for download. It
also includes a port of HPF or "High Performance Fortran".
Message Passing Interface (MPI) and OpenMP: This is an open source set
of compilers and libraries that allow one to use the industry standard
framework for developing parallel applications (applications that can
execute in parallel over a cluster of computers or processors).
Octave: This is an open source "Matlab clone". It is particularly
useful for large linear algebra problems. The site has an MPI based
Octave (that can run on a cluster) and also a (partially) AltiVec
optimized Octave available for download.
Cactus: An open source problem solving environment that is very
popular, in particular with astrophysicists. It has been developed
extensively to model Einstein's equations of General Relativity using a
variety of advanced computer science techniques, including parallel
computing, adaptive meshes and even Grid computing. The site provides details
and tricks on how to get Cactus to work on OS X.
RNPL: This is a great tool that takes as input details about the form of an
equation (usually a partial differential equation), and some parameters ... and
then spits out a C or FORTRAN Code that solves the equation numerically using
established iterative numerical techniques. It is used extensively by
computational astrophysicists. Again, an OS X port is available for download.
There are many other tools available, possibly of lesser significance.
There is also a full page devoted to the usage of the Velocity Engine
in scientific computation with some sample code and benchmarks, that
may be useful.
6. The NCT must have been a lot of work. What were the technical
motivations to write this package for the Mac?
I started porting some of those packages in connection with my own
codes and research. And some of them indeed took some effort ... so I
felt that it may make sense to have them available on the Web for other
scientists that are in the same boat. However, later on as I became
more experienced with porting to OS X, I did some of those packages
on request ... and as strange as it sounds ... some I even did for "sport"!
That aside, since OS X is genuinely UNIX underneath, it is a true
workstation class OS. So, many of the tools that have been
traditionally known to work on SGIs, Suns, etc. should with some effort
run on a Mac. And with all those updated and nicely documented
Developer Tools that Apple lets one have for free, the porting in
most cases is relatively easy.
Moreover, since the G4 benchmarks so well on creativity applications
like Photoshop, it's natural to wonder how it does for scientific
computation. So, part of the motivation for all this effort is to see
how the G4 and Apple hardware do in scientific research.
7. There is a concept called cluster computing related to problems
that are "tightly coupled" and there are compute farms (or
equivalently distributed processing systems) that tackle "parallel"
problems, like movie frame rendering. Can you go into some detail on
the two different technologies, explain the differences and typical
problems tackled with these two techniques, and mention notable
products for Mac OS X?
The main difference between these two types of parallel computing is one of
bandwidth usage.
In "tightly coupled" problems, processes working on different
processors/computers need to communicate at high speed. This is because
the different processes need to share information in order for the
calculation to continue. As a crude example, imagine weather
forecasting. Imagine you have a cluster of 48 computers, each of which
is evolving weather patterns for each of the 48 contiguous U.S. states.
It should be clear that all these machines cannot carry on by
themselves because each computer shall need information about air
currents, flows, storms etc. at least from neighboring states, so that
that data can be factored in and a proper prediction can be made.
Much of scientific computation is of the "tightly coupled" type. This
is essentially because the mathematical language that describes most
natural processes happens to be something called partial differential
equations, which have this "tightly coupled" property. The most common
tool that I am aware of to handle such high bandwidth problems is MPI
(Message Passing Interface) and there are several available open source
and commercial options for OS X.
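
To make the "tightly coupled" idea concrete, here is a minimal sketch in
Python. It is not taken from Dr. Khanna's codes and it does not use MPI: the
"workers" are simply slices of one list, advanced one after another in a
single process. It assumes a toy 1D heat diffusion problem, four workers,
and hypothetical function names. The point it illustrates is that each
worker must receive one boundary value from each neighbor before every
step, and that workers left to "carry on by themselves" drift away from the
correct answer.

```python
# Toy 1D heat diffusion split across several "workers" (serial stand-in for MPI).
# Assumptions: explicit finite-difference update, end points held fixed,
# and a grid length that divides evenly among the workers.

def step(u, alpha=0.25):
    """Advance a list of cell values one time step; both end cells stay fixed."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = u[i] + alpha * (u[i - 1] - 2.0 * u[i] + u[i + 1])
    return new

def split(u, workers):
    """Cut the grid into equal contiguous chunks, one per worker."""
    size = len(u) // workers
    return [u[k * size:(k + 1) * size] for k in range(workers)]

def step_decomposed(chunks, alpha=0.25):
    """Advance every chunk one step after exchanging boundary values."""
    new_chunks = []
    for k, chunk in enumerate(chunks):
        # The "messages": one cell from the left neighbor, one from the right.
        left = [chunks[k - 1][-1]] if k > 0 else []
        right = [chunks[k + 1][0]] if k < len(chunks) - 1 else []
        updated = step(left + chunk + right, alpha)
        # Keep only this worker's own cells; discard the borrowed ones.
        new_chunks.append(updated[len(left):len(updated) - len(right)])
    return new_chunks

def run_global(u, steps):
    """Reference answer: one machine evolves the whole grid."""
    for _ in range(steps):
        u = step(u)
    return u

def run_decomposed(u, steps, workers=4):
    """Workers exchange boundary values before every step."""
    chunks = split(u, workers)
    for _ in range(steps):
        chunks = step_decomposed(chunks)
    return [x for chunk in chunks for x in chunk]

def run_isolated(u, steps, workers=4):
    """Workers never communicate; each treats its chunk as the whole world."""
    chunks = split(u, workers)
    for _ in range(steps):
        chunks = [step(chunk) for chunk in chunks]
    return [x for chunk in chunks for x in chunk]
```

Starting from a single hot cell next to a chunk boundary, run_decomposed
reproduces run_global, while run_isolated does not, because the heat never
crosses into the neighboring chunk. In a real cluster the two borrowed
cells would travel over the network every step, which is why the speed of
the interconnect matters so much for this class of problem.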
On the other hand, in low bandwidth parallel computing, little or no
communication is needed. The processes on different
processors/computers are independent enough that they don't need to
share information to continue their own task. As an example consider
the SETI project. If you participate in this internet based distributed
computing project, you are essentially downloading data files that
correspond to different days, times and locations in the sky and
searching for intelligent alien signals. Other SETI participants do
the same with their data set, and everyone eventually sends their
results back to the SETI Institute. Note that as you analyze
your data set, there is nothing you need from any other participant's
analysis in order for you to complete your piece of work. They are
completely independent operations.
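
The same point can be sketched in a few lines of Python. This is an
illustration only, not the SETI software: the work units are made-up
numbers, the "signal" is planted by hand, the function names are
hypothetical, and a thread pool stands in for the separate computers of the
participants. What it shows is that each work unit is analyzed with no
reference to any other, so the units can be handed out in any order to any
number of workers.

```python
from concurrent.futures import ThreadPoolExecutor

def analyze(work_unit):
    """Find the strongest sample in one work unit, using only that unit's data."""
    unit_id, samples = work_unit
    peak = max(samples)
    return unit_id, peak, samples.index(peak)

def make_work_units(count=8, length=1000):
    """Build deterministic stand-in data (real units would be downloaded)."""
    units = []
    for unit_id in range(count):
        samples = [((i * 37 + unit_id * 101) % 97) / 97.0 for i in range(length)]
        units.append((unit_id, samples))
    # Plant an artificial strong "signal" in unit 5 so there is something to find.
    units[5][1][123] = 5.0
    return units

def run(units, workers=4):
    """Hand the units to a pool of workers and collect the results in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(analyze, units))
```

Calling run(make_work_units()) gives the same results as analyzing the
units one by one, and picking the entry with the largest peak points back
to the planted signal in unit 5. No worker ever waits on another, so a slow
network costs little here.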
There are some scientific applications of this type. Other examples
are protein folding, prime number searches, etc. There are a number of
different infrastructures that are used for such problems, for example
Cosm, BOINC, etc.
The notable products for Mac OS X have been conveniently listed at
Apple's Scitech page.
8. Tell us your experiences with the G4 Velocity Engine. How have you
been able to exploit the VE for various research projects?
On my NCT website, I have dedicated a full page to this subject.
Using a genuine astrophysical code (that evolves a distorted black
hole) as an example, I have tried to demonstrate how the V.E. can help in
scientific computing. As a crude estimate, from my simulations, code
optimized for the V.E. can run faster by a factor of anywhere between 2 and
5.
Unfortunately, the Velocity Engine has a serious limitation, i.e. it
cannot handle double precision floating point numbers. And in several
situations this
makes it unusable for scientific work. However, there are several
intelligent things one can attempt in that situation, for example, one
can attempt to speed up parts of the computation that do not need
double precision using the V.E. while keeping other parts doubly precise,
and so on. In any case, there is no denying that the V.E. can help
significantly in scientific computation, and if ever a double precision
version is developed, it would do wonders.
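
A small Python sketch, added here for illustration, shows why the missing
double precision matters and what the "mixed" strategy buys. It is not
AltiVec code and does not touch the Velocity Engine: single precision is
emulated by rounding through the standard struct module, and the function
names and the choice of summing 0.1 one hundred thousand times are
assumptions made for the example.

```python
import struct

def to_single(x):
    """Round a Python float (double precision) to the nearest single precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def sum_single(values):
    """Everything in single precision: inputs and the running total."""
    total = 0.0
    for v in values:
        total = to_single(total + to_single(v))
    return total

def sum_mixed(values):
    """Inputs rounded to single precision, running total kept in double."""
    total = 0.0
    for v in values:
        total += to_single(v)
    return total

def sum_double(values):
    """Everything in double precision."""
    total = 0.0
    for v in values:
        total += v
    return total

def errors(count=100000, value=0.1):
    """Return the absolute error of each strategy against the exact sum."""
    values = [value] * count
    exact = count * value
    return (abs(sum_single(values) - exact),
            abs(sum_mixed(values) - exact),
            abs(sum_double(values) - exact))
```

The three numbers returned by errors() come out in decreasing order: the
all-single sum is the least accurate, the mixed version is far better, and
the all-double sum is better still. Long running simulations accumulate
rounding error in much the same way, which is why keeping the sensitive
accumulations in double precision while handing the rest to a single
precision unit can be a workable compromise.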
It is worth mentioning at this point a product that will
automatically optimize any C or Fortran source code for the V.E.
This is Veridian's VAST (http://www.psrv.com/)
and it works quite well. It will take any ordinary C or Fortran source code
and auto-vectorize and also auto-parallelize it. This means
that VAST will make appropriate changes to the source so that the
resulting code and executable take advantage of the V.E. and dual
processors. This saves one from learning to program the V.E. (which
can be tricky) while still delivering the performance boost. As a
scientist, I can therefore continue to focus on the physics, as
opposed to getting deep into the realm of chip design, assembly
language, etc. It works quite well for me, and I highly recommend
it.
9. Beyond the concepts of parallel and cluster computing, there is
another, higher level protocol called Grid Computing. Can you explain
what Grid computing is? What Grid systems are operating in the world?
Are there packages for Grid computing that have been ported to Mac OS
X?
Grid computing is an infrastructure that allows for collaborative use of high
performance computer labs, scientific instruments, databases etc. that are
located possibly in different parts of the world. Normally, when one thinks
of cluster computing, one imagines a single room stacked up with several dozen
computers, all networked via a high speed connection working on some task in
parallel. However, say that is not enough power for your problem at hand. You
need to use the super-computing facilities at San Diego, UIUC, Los Alamos and
Fujitsu all together! Grid computing will let you do that.
There are several Grids being built all over the world. For example, National
Technology Grid (NPACI, NCSA), European DataGrid, NASA Information Power Grid,
The Grid Physics Network and others.
Now, regarding OS X and Grid computing. The set of open source software tools and
libraries that are relevant to accessing and serving grid services and
developing applications that are grid enabled is called the Globus Toolkit.
There have been several independent attempts to port Globus to OS X with
limited success. The issues that arise would be best addressed if there were a
coordinated effort between Apple and the Globus team. It is my understanding
that Apple is aware of this and is considering such a collaboration.
In the meantime, it is possible to access grid services on OS X using a Java
based package called the Java Commodity Grid Kit. However, currently the Java
CoG Kit provides client services only (there are very limited server side
services available).
10. What is your perception of Apple's support for science and
technology? Does Apple, via Apple's higher education division, tech
support, sales, and developer relations, give you the support you
need? What areas does Apple need to focus on to better support the
computational scientist - in both hardware and software?
In my opinion, Apple is doing an excellent job in supporting science and
technology. I've interacted with Apple SciTech closely on some issues, and I
have to commend them for their close attention and sincere efforts. I've had a
very open and healthy interaction with Apple SciTech so far. With education
sales, large or small, I've found Apple's managers very forthcoming and easy
to work with. For developers, it is simply awesome that Apple provides a well
documented and excellent set of developer tools that are regularly updated.
There are many places where Apple could help the scientific developer.
In my opinion it is almost imperative for Apple to develop a Fortran
compiler that optimizes very well for the G4 (including the V.E. and dual
processors). There is only one such product on the market that comes
close, and it is expensive. Competition always helps! Moreover, all
other platforms I know of have a Fortran compiler bundled with
developer tools. Including highly-optimized, standard computational
libraries with developer tools would also help. I realize Apple is
beginning to do that already. BLAS and LAPACK come standard with
Jaguar. And of course, OS X could still use some optimization and
tweaking. As far as hardware goes, better system architecture (bus
speeds, etc) and faster and more sophisticated processors (64-bit,
possibly a vector processing unit that can support double precision
floats) would be a necessity in the near future. Also, worth
considering would be quad MP G4 systems. I am not fully aware of the
technical challenges associated with these, but it would be incredibly
useful (at least to a scientist) to have a quad G4 processor or higher
system available.
11. What lies ahead in the research you are doing?
My Black Hole research is at a very exciting and unique juncture at this
moment in time. I didn't get a chance to say this before; I numerically model
the merger of two Black Holes and study the properties of the Gravitational
Radiation that emerges from such an event. All over the world there are these
Gravity Wave Observatories being built (LIGO and LISA in the U.S.) that shall be
able to detect this radiation. Over the next few years, these detectors will
come online and that will be an incredibly exciting time for us! Only time
will tell what will be witnessed through this new window into the universe,
and luckily the wait is not that long!
We'd like to thank Dr. Khanna once again for a stimulating and informative
interview. He can be reached at:
Gaurav Khanna
Assistant Professor
Natural Science Division
Long Island University - Southampton
Southampton NY 11968
PH: 631-287-8411
FAX: 631-287-8419
gkhanna@liu.edu
http://techcenter.southampton.liunet.edu/~gkhanna
References:
Tech Center @ LIU, Southampton: http://techcenter.southampton.liu.edu/
Penn State's Center for Gravity: http://gravity.psu.edu/
Silicon Graphics Inc.: http://www.sgi.com/
Fortran: http://www.gnu.org/software/fortran/fortran.html
MPI: http://www-unix.mcs.anl.gov/mpi/mpich/
Octave: http://www.octave.org/
Cactus: http://www.cactuscode.org/
RNPL: http://laplace.physics.ubc.ca/People/matt/Rnpl/
SETI Project: http://seti.berkeley.edu/
Protein Folding: http://folding.stanford.edu/
Prime Number Search: http://www.mersenne.org/
Globus: http://www.globus.org/
Java CoG Kit: http://www-unix.globus.org/cog/java/
LIGO: http://ligo.caltech.edu/
LISA: http://lisa.jpl.nasa.gov/
Send your comments to John Martellaro