EECS High Performance Computing Cluster
The following are some performance ratings for the EECS HPC Cluster.
Memory transfer rates
We ran the STREAM benchmark to measure sustained memory transfer rates, in MB/s, for simple computational kernels coded in C. These numbers reveal the quality of code generation for simple uncacheable kernels and show the cost of floating-point operations relative to memory accesses.
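As a rough illustration of what is being timed, the four STREAM kernels can be sketched as below. This is a pure-Python rendering for readability only; the benchmark itself is C, and the function name `stream_kernels` and the default `scalar=3.0` are assumptions made for this sketch.

```python
def stream_kernels(a, b, c, scalar=3.0):
    """Run the four STREAM kernels once, in order, over three equal-length lists.

    Illustrative only: the real benchmark times each loop separately in C.
    """
    n = len(a)
    for j in range(n):            # Copy:  1 read, 1 write, 0 flops per element
        c[j] = a[j]
    for j in range(n):            # Scale: 1 read, 1 write, 1 flop per element
        b[j] = scalar * c[j]
    for j in range(n):            # Add:   2 reads, 1 write, 1 flop per element
        c[j] = a[j] + b[j]
    for j in range(n):            # Triad: 2 reads, 1 write, 2 flops per element
        a[j] = b[j] + scalar * c[j]
    return a, b, c
```

Comparing Copy with Scale, and Add with Triad, isolates the cost of the extra floating-point work, because each pair touches the same number of arrays.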
Using the STREAM default values with an array size of 3000000 elements and a total memory requirement of 68.7 MB:
PIII/933/DUAL-PROC nodes
| Function | Rate (MB/s) | RMS time | Min time | Max time |
|---|---|---|---|---|
| Copy: | 369.2308 | 0.1331 | 0.1300 | 0.1400 |
| Scale: | 369.2308 | 0.1331 | 0.1300 | 0.1400 |
| Add: | 480.0000 | 0.1581 | 0.1500 | 0.1600 |
| Triad: | 342.8571 | 0.2161 | 0.2100 | 0.2200 |
P4/2.4 GHz/DUAL-PROC nodes
| Function | Rate (MB/s) | RMS time | Min time | Max time |
|---|---|---|---|---|
| Copy: | 1200.0000 | 0.0422 | 0.0400 | 0.0500 |
| Scale: | 1200.0000 | 0.0422 | 0.0400 | 0.0500 |
| Add: | 1440.0000 | 0.0562 | 0.0500 | 0.0600 |
| Triad: | 1440.0000 | 0.0605 | 0.0500 | 0.0800 |
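The rates reported above can be reproduced from the minimum times. The sketch below assumes 8-byte double-precision elements and counts Copy and Scale as touching two arrays and Add and Triad as touching three. The helper names are hypothetical, not part of STREAM.

```python
BYTES_PER_WORD = 8        # assumption: double-precision elements
N = 3_000_000             # array size used for the runs above

# Arrays read or written per element by each kernel.
ARRAYS_TOUCHED = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}

def total_memory_mb(n=N):
    """Memory for the three arrays, in units of 2**20 bytes."""
    return 3 * BYTES_PER_WORD * n / 2**20

def rate_mb_per_s(kernel, min_time, n=N):
    """Transfer rate in 10**6 bytes per second, from the best (minimum) time."""
    bytes_moved = ARRAYS_TOUCHED[kernel] * BYTES_PER_WORD * n
    return 1.0e-6 * bytes_moved / min_time
```

For example, `rate_mb_per_s("Copy", 0.13)` gives about 369.23, matching the PIII Copy row, and `total_memory_mb()` gives about 68.7. Because the rate depends only on the minimum time, the coarse timer resolution visible in the tables (times in steps of 0.01 s) limits the precision of the reported rates.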
Disk Access
Using a 100 MB file, we obtained the following results with the Bonnie benchmark.
PIII/933/DUAL-PROC nodes
| | Seq. Output Per Char K/sec | %CPU | Seq. Output Block K/sec | %CPU | Seq. Output Rewrite K/sec | %CPU | Seq. Input Per Char K/sec | %CPU | Seq. Input Block K/sec | %CPU | Random Seeks /sec | %CPU |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 100 MB image | 15492 | 100 | 235590 | 98.9 | 96205 | 99.6 | 15208 | 100.1 | 571987 | 100.5 | 22611.9 | 203.5 |
P4/2.4 GHz/DUAL-PROC nodes
| | Seq. Output Per Char K/sec | %CPU | Seq. Output Block K/sec | %CPU | Seq. Output Rewrite K/sec | %CPU | Seq. Input Per Char K/sec | %CPU | Seq. Input Block K/sec | %CPU | Random Seeks /sec | %CPU |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 100 MB image | 20420 | 99.7 | 144787 | 101.8 | 367117 | 100.4 | 24219 | 99.8 | 1263370 | 98.7 | 68428.7 | 205.3 |
Description
Bonnie performs a series of tests on a 100 MB file. For each test, Bonnie reports the bytes processed per elapsed second and per CPU second, and the percentage of CPU used.
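The three reported quantities are related by simple arithmetic, sketched below. The function names are hypothetical, and the sketch assumes that K means 1024 bytes and that %CPU is CPU time divided by elapsed time.

```python
import time

def report(nbytes, elapsed_s, cpu_s):
    """Return (K per elapsed second, K per CPU second, % CPU) for one test."""
    kbytes = nbytes / 1024.0
    return kbytes / elapsed_s, kbytes / cpu_s, 100.0 * cpu_s / elapsed_s

def measure(work, nbytes):
    """Time a callable that processes nbytes and report the same three figures."""
    wall0 = time.perf_counter()
    cpu0 = time.process_time()      # user + system CPU time of this process
    work()
    cpu_s = time.process_time() - cpu0
    elapsed_s = time.perf_counter() - wall0
    return report(nbytes, elapsed_s, cpu_s)
```

Under these assumptions, a %CPU figure near 100 means the test spent almost all of its elapsed time computing rather than waiting for the disk.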
Sequential Output
1. Per-Character. The file is written using the putc() stdio macro. The loop that does the writing should be small enough to fit into any reasonable I-cache. The CPU overhead here is that required to run the stdio code plus the OS file space allocation.
2. Block. The file is created using write(2). The CPU overhead should be just the OS file space allocation.
3. Rewrite. Each BUFSIZ of the file is read with read(2), dirtied, and rewritten with write(2), requiring an lseek(2). Since no space allocation is done, and the I/O is well-localized, this should test the effectiveness of the filesystem cache and the speed of data transfer.
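The rewrite pass described in step 3 can be sketched with the equivalent system calls exposed by Python's `os` module. This is a minimal illustration, not Bonnie's code: the function name `rewrite_pass`, the 8192-byte chunk size standing in for BUFSIZ, and the way the chunk is dirtied are all assumptions.

```python
import os

def rewrite_pass(path, chunk=8192):
    """Read each chunk, dirty it, seek back, and write it in place.

    Returns the number of bytes rewritten. The file size does not change,
    so no new space is allocated.
    """
    fd = os.open(path, os.O_RDWR)
    total = 0
    try:
        while True:
            buf = os.read(fd, chunk)                    # read(2)
            if not buf:
                break
            dirty = bytes([(buf[0] + 1) % 256]) + buf[1:]
            os.lseek(fd, -len(buf), os.SEEK_CUR)        # lseek(2) back to chunk start
            os.write(fd, dirty)                         # write(2) in place
            total += len(buf)
    finally:
        os.close(fd)
    return total
```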
Sequential Input
1. Per-Character. The file is read using the getc() stdio macro. Once again, the inner loop is small. This should exercise only stdio and sequential input.
2. Block. The file is read using read(2). This should be a very pure test of sequential input performance.
Random Seeks
This test runs SeekProcCount processes in parallel, doing a total of 4000 lseek() calls to locations in the file specified by random() on BSD systems and drand48() on SysV systems. In each case, the block is read with read(2). In 10% of cases, it is dirtied and written back with write(2).
The idea behind the SeekProcCount processes is to make sure there's always a seek queued up.
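A single-process sketch of the seek test follows. It deliberately omits the SeekProcCount parallel processes, so it shows only the access pattern, not the queuing behavior. The function name, block size, and seeded `random.Random` generator are assumptions made for illustration.

```python
import os
import random

def random_seeks(path, seeks=4000, block=8192, dirty_fraction=0.10, seed=0):
    """Seek to random block offsets, read each block, and rewrite about 10%.

    Returns (number of seeks, number of blocks written back).
    """
    rng = random.Random(seed)
    nblocks = max(1, os.path.getsize(path) // block)
    fd = os.open(path, os.O_RDWR)
    writes = 0
    try:
        for _ in range(seeks):
            offset = rng.randrange(nblocks) * block
            os.lseek(fd, offset, os.SEEK_SET)
            buf = os.read(fd, block)
            if buf and rng.random() < dirty_fraction:
                dirty = bytes([buf[0] ^ 1]) + buf[1:]   # flip one bit to dirty the block
                os.lseek(fd, offset, os.SEEK_SET)
                os.write(fd, dirty)
                writes += 1
    finally:
        os.close(fd)
    return seeks, writes
```

With a file that fits in the filesystem cache, a loop like this measures system call and cache overhead more than physical disk seeks.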
Network Performance
Netperf is a benchmark that measures various aspects of networking performance. Its primary focus is on bulk data transfer and request/response performance using either TCP or UDP. We ran the benchmark between two idle nodes while network usage was very low. We ran a UDP and a TCP stream test; the TCP stream results follow:
100 Mb Network
TCP STREAM TEST
| Recv Socket Size (bytes) | Send Socket Size (bytes) | Send Message Size (bytes) | Elapsed Time (secs) | Throughput (10^6 bits/s) |
|---|---|---|---|---|
| 87380 | 16384 | 16384 | 10.00 | 94.11 |
1 Gb Network
TCP STREAM TEST
| Recv Socket Size (bytes) | Send Socket Size (bytes) | Send Message Size (bytes) | Elapsed Time (secs) | Throughput (10^6 bits/s) |
|---|---|---|---|---|
| 87380 | 16384 | 16384 | 10.00 | 940.66 |
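To put the throughput figures in context, the sketch below converts them to link utilization and bytes per second, and estimates a payload ceiling for TCP over Ethernet. The ceiling rests on assumptions the measurements above do not state: a 1500-byte MTU, 52 bytes of IP and TCP headers (20 + 20 + a 12-byte timestamp option), and 38 bytes of Ethernet framing overhead per packet (header, FCS, preamble, and inter-frame gap). All function names are hypothetical.

```python
def utilization_pct(throughput_mbit, link_mbit):
    """Measured throughput as a percentage of the nominal link rate."""
    return 100.0 * throughput_mbit / link_mbit

def mbytes_per_sec(throughput_mbit):
    """Convert 10**6 bits/s to 10**6 bytes/s."""
    return throughput_mbit / 8.0

def tcp_payload_ceiling_pct(mtu=1500, ip_tcp_headers=52, wire_overhead=38):
    """Fraction of the wire rate available to TCP payload (assumed framing)."""
    payload = mtu - ip_tcp_headers
    return 100.0 * payload / (mtu + wire_overhead)
```

Under these assumptions the ceiling is roughly 94.1% of the link rate, so the measured 94.11 and 940.66 (10^6 bits/s) would both be close to the most a single TCP stream can deliver on each network.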