forked from OSchip/llvm-project
				
			
		
			
				
	
	
		
			129 lines
		
	
	
		
			3.7 KiB
		
	
	
	
		
			HTML
		
	
	
	
			
		
		
	
	
			129 lines
		
	
	
		
			3.7 KiB
		
	
	
	
		
			HTML
		
	
	
	
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
 | 
						|
        "http://www.w3.org/TR/html4/strict.dtd">
 | 
						|
<!-- Material used from: HTML 4.01 specs: http://www.w3.org/TR/html401/ -->
 | 
						|
<html>
 | 
						|
<head>
 | 
						|
  <META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
 | 
						|
  <title>Polly - Memory access optimizations</title>
 | 
						|
  <link type="text/css" rel="stylesheet" href="../menu.css">
 | 
						|
  <link type="text/css" rel="stylesheet" href="../content.css">
 | 
						|
</head>
 | 
						|
<body>
 | 
						|
<div id="box">
 | 
						|
<!--#include virtual="../menu.html.incl"-->
 | 
						|
<div id="content">
 | 
						|
  <!--*********************************************************************-->
 | 
						|
  <h1>Memory access optimizations</h1>
 | 
						|
  <!--*********************************************************************-->
 | 
						|
<p><em>WARNING: This project was part of the Google Summer of Code 2011.
 | 
						|
Tt is currently not finished, but it is in the design and implementation stage.
 | 
						|
The Ideas/Plans described here may not yet be implemented in Polly and may be
 | 
						|
changed during the actual implementation.</em></p>
 | 
						|
 | 
						|
This project adds memory access transformations to Polly. In many cases
 | 
						|
changing the memory access pattern yields to better data locality or removes
 | 
						|
dependences that would otherwise block transformations.
 | 
						|
 | 
						|
<p>An examples which uses this feature is given below.</p>
 | 
						|
 | 
						|
Consider the following loop
 | 
						|
<pre>
 | 
						|
for (i = 0; i < 8; i++)
 | 
						|
  sum += A[i];
 | 
						|
</pre>
 | 
						|
Through memory access transformations this loop can be executed in parallel.
 | 
						|
It can be transformed to
 | 
						|
<pre>
 | 
						|
<em>// Create and initialize an array 'tmp' with size 4</em>
 | 
						|
for (i = 0; i < 8; i++)
 | 
						|
  tmp[i % 4] += A[i];
 | 
						|
sum = tmp[0] + tmp[1] + tmp[2] + tmp[3];
 | 
						|
</pre>
 | 
						|
 | 
						|
Optimizers like PluTo can schedule the code such that an outer, parallel
 | 
						|
loop is created:
 | 
						|
<pre>
 | 
						|
parfor (ii = 0; ii < 4; ii++) {
 | 
						|
  tmp[ii] = 0;
 | 
						|
  for (i = ii * 2; i < (ii+1) * 2; i++)
 | 
						|
    tmp[ii] += A[i];
 | 
						|
  }
 | 
						|
sum = tmp[0] + tmp[1] + tmp[2] + tmp[3];
 | 
						|
</pre>
 | 
						|
 | 
						|
<h2>TODO</h2>
 | 
						|
<h3>Step 1</h3>
 | 
						|
Polly exports its polyhedral description in a JSCoP file. Define how memory
 | 
						|
layout transformations are expressed in Polly and in the JSCOP file. 
 | 
						|
Example:
 | 
						|
 | 
						|
<p>Consider the following loop.</p>
 | 
						|
<pre>
 | 
						|
for (i = 0; i < 12; i++)
 | 
						|
  A[i] = 0;
 | 
						|
</pre>
 | 
						|
In the JSCOP file the memory accesses are represented as follows.
 | 
						|
<pre>
 | 
						|
"accesses": [{
 | 
						|
        "kind": "write",
 | 
						|
                "relation": "{ Stmt[i] -> A[i] }"
 | 
						|
}]
 | 
						|
</pre>
 | 
						|
To perform a transformation we generate the following code:
 | 
						|
<pre>
 | 
						|
for (i = 0; i < 12; i++)
 | 
						|
  A[0] = i;
 | 
						|
</pre>
 | 
						|
The representation in the JSCoP file is:
 | 
						|
<pre>
 | 
						|
"accesses": [{
 | 
						|
        "kind": "read",
 | 
						|
                "relation": "{ Stmt[i] -> A[0] }"
 | 
						|
}]
 | 
						|
</pre>
 | 
						|
We need to detect this access function change.
 | 
						|
 | 
						|
<h3>Step 2</h3>
 | 
						|
Update the code generation module to reflect the access function change made
 | 
						|
in Step 1.
 | 
						|
<h3>Step 2.1 Code generation for a constant</h3>
 | 
						|
In the JSCOP file an access function which has variables is changed to a
 | 
						|
constant. Code is generated to reflect this change. Let the content of original
 | 
						|
JSCOP file be:
 | 
						|
<pre>
 | 
						|
"accesses" : [{
 | 
						|
        "kind" : "read",
 | 
						|
                 "relation" : "{ Stmt_for_body[i0] -> MemRef_A[i0] }"
 | 
						|
}]
 | 
						|
</pre>
 | 
						|
The transformed JSCOP file is:
 | 
						|
<pre>
 | 
						|
"accesses" : [{
 | 
						|
        "kind" : "read",
 | 
						|
                 "relation" : "{ Stmt_for_body[i0] -> MemRef_A[10] }"
 | 
						|
}]
 | 
						|
</pre>
 | 
						|
Code is generated for this change.
 | 
						|
<h3>Step 2.2 Code generation for coefficients</h3>
 | 
						|
The coefficients of induction variables are changed here. Let the original
 | 
						|
JSCOP file be:
 | 
						|
<pre>
 | 
						|
"accesses" : [{
 | 
						|
      "kind" : "read",
 | 
						|
               "relation" : "{ Stmt_for_body3[i0, i1] -> MemRef_A[32i0+i1] }"
 | 
						|
}]
 | 
						|
</pre>
 | 
						|
The transformed JSCOP file is:
 | 
						|
<pre>
 | 
						|
"accesses" : [{
 | 
						|
      "kind" : "read",
 | 
						|
               "relation" : "{ Stmt_for_body3[i0, i1] -> MemRef_A[16i0+2i1+5] }"
 | 
						|
}]
 | 
						|
</pre>
 | 
						|
Code is generated for this change.
 | 
						|
</div>
 | 
						|
</div>
 | 
						|
</body>
 | 
						|
</html>
 | 
						|
 |