Draw this picture as quickly as possible!
Apply SIMD instructions
Apply multithreading
One thread per task
Created Threads and ran them in parallel
Runnable interface
start instances
join to wait until threads finish
PiEstimator
for (int i = 0; i < numThreads; i++) {
    threads[i] = new Thread(new PiThread(...));
}
for (Thread t : threads) {
    t.start();
}
for (Thread t : threads) {
    try { t.join(); }
    catch (InterruptedException e) { }
}
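A minimal, self-contained sketch of the same create / start / join pattern. The slides elide the arguments of PiThread and do not say how pi is estimated, so this sketch assumes Monte Carlo sampling; the names PiSketch, SamplingTask, and estimatePi are hypothetical and are not the course's PiEstimator.

```java
import java.util.concurrent.ThreadLocalRandom;

public class PiSketch {
    // Hypothetical task (assumption: Monte Carlo sampling).
    // Counts random points in the unit square that fall inside the quarter circle.
    static class SamplingTask implements Runnable {
        private final long samples;
        long hits; // read by the main thread only after join()

        SamplingTask(long samples) { this.samples = samples; }

        @Override
        public void run() {
            ThreadLocalRandom rng = ThreadLocalRandom.current();
            long local = 0;
            for (long s = 0; s < samples; s++) {
                double x = rng.nextDouble();
                double y = rng.nextDouble();
                if (x * x + y * y <= 1.0) local++;
            }
            hits = local;
        }
    }

    static double estimatePi(long totalSamples, int numThreads) {
        long perThread = totalSamples / numThreads;
        SamplingTask[] tasks = new SamplingTask[numThreads];
        Thread[] threads = new Thread[numThreads];
        for (int i = 0; i < numThreads; i++) {
            tasks[i] = new SamplingTask(perThread);
            threads[i] = new Thread(tasks[i]);
        }
        for (Thread t : threads) {
            t.start();
        }
        long hits = 0;
        for (int i = 0; i < numThreads; i++) {
            try {
                threads[i].join(); // wait for thread i before reading its result
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
            hits += tasks[i].hits;
        }
        return 4.0 * hits / (perThread * numThreads);
    }

    public static void main(String[] args) {
        System.out.println(estimatePi(4_000_000L, 4));
    }
}
```

Each task keeps its count in a local variable and publishes it once at the end, so the threads share no mutable state while they run.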
PiEstimator Performance
n threads | pi estimate | time (ms)
-----------------------------------
1 | 3.14158 | 8174
2 | 3.14161 | 4690
4 | 3.14161 | 2709
8 | 3.14163 | 1735
16 | 3.14156 | 1867
32 | 3.14167 | 1938
64 | 3.14156 | 1905
128 | 3.14157 | 1907
256 | 3.14164 | 1919
-----------------------------------
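One way to read the timings is as speedup relative to the single-thread run, T(1) / T(n). The sketch below copies the first five rows of the table; the class and method names are hypothetical, and the number of processors on the machine that produced the table is not stated in the slides.

```java
public class SpeedupSketch {
    // Speedup relative to the single-thread run: T(1) / T(n).
    static double speedup(long baselineMs, long timeMs) {
        return (double) baselineMs / timeMs;
    }

    public static void main(String[] args) {
        int[] threads = {1, 2, 4, 8, 16};
        long[] timesMs = {8174, 4690, 2709, 1735, 1867}; // copied from the table
        for (int i = 0; i < threads.length; i++) {
            System.out.printf("%3d threads: speedup %.2f%n",
                    threads[i], speedup(timesMs[0], timesMs[i]));
        }
        // The processor count of the current machine, for comparison.
        System.out.println("available processors: "
                + Runtime.getRuntime().availableProcessors());
    }
}
```

The times stop improving after 8 threads, where the speedup is about 4.7.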
Best performance when number of threads = number of available processors
Reasons:
Question. What if tasks involve different (unknown) amounts of work?
Threads have significant overhead
When tasks are fairly homogeneous (e.g., computing $\pi$, shortcuts), the previous approach is good
A nice Java feature: thread pools
Executor interface
void execute(Runnable command) method
ExecutorService interface:
ExecutorService Implementations
From java.util.concurrent.Executors:
newFixedThreadPool(int nThreads)
newSingleThreadExecutor()
newCachedThreadPool()
Define tasks
public class MyTask implements Runnable {
    ...
    public void run() {
        ...
    }
}
Create a pool, e.g., fixed thread pool
int nThreads = ...;
ExecutorService pool = Executors.newFixedThreadPool(nThreads);
Create and execute tasks
MyTask task = new MyTask(...);
pool.execute(task);
Shutting down the pool
pool.shutdown();
Wait for all pending tasks to complete (like the join() method)
try {
    pool.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
} catch (InterruptedException e) {
    // do nothing
}
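The steps above (define tasks, create a pool, execute, shut down, wait) can be put together in one runnable sketch. The task body is elided in the slides, so SumTask here is a hypothetical stand-in that adds its id to a shared counter; PoolSketch and runTasks are likewise illustrative names.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class PoolSketch {
    // Hypothetical task: adds its id to a shared, thread-safe total.
    static class SumTask implements Runnable {
        private final int id;
        private final AtomicLong total;

        SumTask(int id, AtomicLong total) {
            this.id = id;
            this.total = total;
        }

        @Override
        public void run() {
            total.addAndGet(id);
        }
    }

    static long runTasks(int nTasks, int nThreads) {
        AtomicLong total = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        for (int i = 1; i <= nTasks; i++) {
            pool.execute(new SumTask(i, total)); // queued; run by one of nThreads workers
        }
        pool.shutdown(); // accept no new tasks; already submitted tasks still run
        try {
            pool.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        return total.get();
    }

    public static void main(String[] args) {
        System.out.println(runTasks(100, 4)); // sum of 1..100
    }
}
```

The number of tasks and the number of threads are now independent: 100 tasks share 4 worker threads, so thread creation cost is paid 4 times rather than 100.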
Shortcuts from Lab 02:
for (int i = 0; i < size; ++i) {
    for (int j = 0; j < size; ++j) {
        float min = Float.MAX_VALUE;
        for (int k = 0; k < size; ++k) {
            float x = matrix[i][k]; float y = matrix[k][j];
            float z = x + y;
            if (z < min)
                min = z;
        }
        shortcuts[i][j] = min;
    }
}
For fixed row i, col j:
float min = Float.MAX_VALUE;
for (int k = 0; k < size; ++k) {
    float x = matrix[i][k]; float y = matrix[k][j];
    float z = x + y;
    if (z < min)
        min = z;
}
shortcuts[i][j] = min;
Approach 1: size * size threads
Approach 2: availableProcessors()
executer-shortcuts.zip
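A hedged sketch of how the shortcuts computation might be run on a fixed thread pool sized by availableProcessors(). It is independent of the linked archive and makes one assumption the slides do not state: each task handles a whole row i rather than a single (i, j) entry. The class and method names are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ShortcutsPoolSketch {
    static float[][] shortcuts(float[][] matrix) {
        int size = matrix.length;
        float[][] result = new float[size][size];
        int nThreads = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);

        for (int i = 0; i < size; ++i) {
            final int row = i; // lambdas may only capture effectively final locals
            // Assumption: one task per row; each task writes only its own row.
            pool.execute(() -> {
                for (int j = 0; j < size; ++j) {
                    float min = Float.MAX_VALUE;
                    for (int k = 0; k < size; ++k) {
                        float z = matrix[row][k] + matrix[k][j];
                        if (z < min) {
                            min = z;
                        }
                    }
                    result[row][j] = min;
                }
            });
        }

        pool.shutdown();
        try {
            pool.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result;
    }

    public static void main(String[] args) {
        float[][] m = {{0, 5, 1}, {5, 0, 1}, {1, 1, 0}};
        System.out.println(shortcuts(m)[0][1]); // route 0 -> 2 -> 1 costs 1 + 1
    }
}
```

Row-sized tasks keep the task count at size instead of size * size, while still giving the pool more tasks than threads so that idle workers can pick up remaining rows.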
Lab will be posted early next week
Runnable task that uses SIMD parallelism to compute escape times