ENGR 325 ENGR 325 HOMEWORK #7 KEY FALL P2015
age|1
ENGR 325
HOMEWORK #7 KEY
FALL 2015
1. (5) In this exercise we look at memory locality properties of matrix computation. The following code
is written in C, where elements within the same row are stored contiguously. Assume each word is
a 32-bit integer. (P&H 5.1, §5.1)
for (I=0; I<8; I++)
for (J=0; J<8000; J++) A[I]
[J]=B[I][0]+A[J][I];
a. How many 32-bit integers can be stored in a 16-byte cache block?
b. References to which variables exhibit temporal locality?
c. References to which variables exhibit spatial locality?
Locality is affected by both the reference order and data layout. The same computation can also be
written below in Matlab, which differs from C by storing matrix elements within the same column
contiguously in memory.
for I=1:8
for J=1:8000
A(I,J)=B(I,0)+A(J,I);
en
d end
d. How many 16-byte cache blocks are needed to store all 32-bit matrix elements being
referenced?
e. References to which variables exhibit temporal locality?
f. References to which variables exhibit spatial locality?
SOLUTION:
a. Each 32-bit integer requires 4 bytes to store, therefore FOUR 32-bit integers can be stored in
the 16-byte block.
b. Consider the access sequence:
A[0][0] = B[0][0] + A[0][0]
A[1][0] = B[0][0] + A[0][1]
A[2][0] = B[0][0] + A[0][2]
…
A[7999] = B[0][0] + A[0]
[0] [7999]
A[0][1] = B[1][0] + A[1][0]
…
⇒ Variables I, J, and B[I][0] exhibit temporal locality.
Matrix A begins as an 8-row by 8000-column matrix and is transformed into an 8000-row by
8- column matrix. So for the A matrices:
Elements = 8 x 8000 = 64,000
Total elements in the beginning and ending matrix: 64000 x 2 =
128,000 Matrix elements per cache block: 4, as per 5.1.1.
Total cache blocks: 128,000 ÷ 4 = 32,000
Note, however, that the “upper left” eight by eight elements overlap, so cache is not
needed for those. Therefore we can subtract 8 x 8 ÷ 4 = 16 cache blocks.
For the B matrix:
Elements = 8 x 1 = 8
Matrix Elements per cache block: 4, as
above Total cache blocks: 8 ÷ 4 = 2
Total 16-byte cache blocks = 32000 – 16 + 2 ⇒ 31,986 cache blocks are required.
e. As in 5.1.2, variables I, J, and B(I,0) exhibit temporal locality.
f. Unlike in C, in MATLAB matrix elements in the same column are stored contiguously. Therefore
the variables A(J,I) and B(I,0) exhibit spatial locality.
2. (10) Caches are important to providing a high-performance memory hierarchy to processors. Below
is a list of 32-bit memory address references, given as word addresses:
The benefits of buying summaries with Stuvia:
Guaranteed quality through customer reviews
Stuvia customers have reviewed more than 700,000 summaries. This how you know that you are buying the best documents.
Quick and easy check-out
You can quickly pay through credit card or Stuvia-credit for the summaries. There is no membership needed.
Focus on what matters
Your fellow students write the study notes themselves, which is why the documents are always reliable and up-to-date. This ensures you quickly get to the core!
Frequently asked questions
What do I get when I buy this document?
You get a PDF, available immediately after your purchase. The purchased document is accessible anytime, anywhere and indefinitely through your profile.
Satisfaction guarantee: how does it work?
Our satisfaction guarantee ensures that you always find a study document that suits you well. You fill out a form, and our customer service team takes care of the rest.
Who am I buying these notes from?
Stuvia is a marketplace, so you are not buying this document from us, but from seller EvaTee. Stuvia facilitates payment to the seller.
Will I be stuck with a subscription?
No, you only buy these notes for $9.49. You're not tied to anything after your purchase.