Name: A. Very-Good-Student Student ID: u9507815 Lab Group: Afterhours Type in your answer to the following questions WITHIN this document. ----------------------------------------------------------------------- Q1: What is your estimate for the resolution (Tr) and overhead (Tc) of gettimeofday? Resolution of 1us, Overhead or much less than that. Overhead is the minimum time observed. Resolution is jump from the minimum to the next smallest value. You should see a bunch of 0s and some 1s. Occassionally you may see higher numbers, this is due to being on a multiuser machine and getting swapped out or interrupted between the two timing calls. ----------------------------------------------------------------------- Q2: Copy walltime.c to mywalltime.c and modify it to determine the overhead and resolution of elapsetime. What are the values of (Tr) and overhead (Tc) for this routine? (Note the makefile will compile and link your file mywalltime.c with libunknown.a correctly Possibly depending on the machine you are using, but you should see a something like: 27 27 24 24 24 24 24 24 24 24 21 24 24 21 36 36 33 36 48 42 42 24 24 24 24 24 24 24 24 24 24 24 24 24 18 36 36 36 36 48 45 42 24 24 24 24 24 24 24 24 24 24 24 24 24 21 36 36 36 36 45 45 45 24 24 24 24 24 24 24 24 24 24 24 24 Indicating that the overhead is 18us (smallest number) and the resolution is 3us (jump from 18 to 21). ----------------------------------------------------------------------- Q3: Submit your code (mywalltime.c). #include #include #include #define ASIZE 1000 void elapsetime(int* isec, int* iusec); int main(int argc, char** argv ) { int time[ASIZE]; int i,icnt; int isec0,isec1; int iusec0,iusec1; i=0; icnt=0; while(i160 time appears to go up by factor of 8. From 160->320 time goes up by .62/.08 or close to a factor of 8. If we assume scaling as O(n^k) then as we double problem size we would expect time to increase by 2^k. 2^3 = 8 ... so it looks like O(n^3) scaling ----------------------------------------------------------------------- Q9 Which routine is the dominant routine and what fraction of the total CPU is it taking? Typical profile running on another ULTRASparc machine with problem size of 800. 92.8 43.54 43.54 1 43540.00 43793.23 elim [4] 1.5 44.26 0.72 _mcount (634) 1.1 44.78 0.52 1922412 0.00 0.00 .umul [8] 1.0 45.27 0.49 1922403 0.00 0.00 .mul [9] 0.7 45.62 0.35 oldarc [10] 0.6 45.90 0.28 1 280.00 45710.00 main [2] 0.5 46.13 0.23 640800 0.00 0.00 next [7] 0.5 46.36 0.23 799 0.29 0.32 pivot [11] 0.3 46.49 0.13 done [13] 0.3 46.61 0.12 1 120.00 166.77 largest [12] 0.1 46.68 0.07 965488 0.00 0.00 __fabs [14] 0.1 46.74 0.06 640800 0.00 0.00 _drand48_u [6] 0.1 46.80 0.06 1 60.00 60.00 subst [15] 0.1 46.85 0.05 640800 0.00 0.00 _drand48 [5] 0.1 46.88 0.03 640833 0.00 0.00 mutex_unlock [16] 0.0 46.90 0.02 1281666 0.00 0.00 _return_zero [18] From this it is evident that elim takes 81.5% of the total time. (Note profile includes some other routines that are either from system libraries or arise from profiling, eg _mcount is not part of the code in linear.c ----------------------------------------------------------------------- Q10 How many times are the routines elim, pivot and largest called? From small section of profile shown above we see the number of calls elim - 1 pivot - 799 largest - 1 ----------------------------------------------------------------------- Q11: Use the profile output to construct a calling graph for those routines contained in linear.c, i.e. show how each routine is called and approximately how much time is spent in a given routine when it is called from the parent routine. You should verify your calling graph by inspecting the code. Total code takes 45.43s to execute. Time spent in each routine is as follows main(0.28) gauss_elim(0.0) elim(43.54) pivot(0.23) largest(0.12) subst(0.06) cpu_time(0.0) _times prtmatrix and iprtmatrix are not called ----------------------------------------------------------------------- Q12. Which line of the code is executed the most times (cut out the source code and report the number of times this line is executed). 2646700 -> a[ord[i]*n+j]=a[ord[i]*n+j]-factor*a[ord[k]*n+j]; ----------------------------------------------------------------------- Q13. Is the location of this line consistent with your gprof profiles? Yes this is in routine elim - which gprof showed to be taking nearly all of the time. ----------------------------------------------------------------------- Q14. How does the number of times this line is executed change as you vary the problem size? (NOTE, between tcov runs you should remove any tcov data files that relate to the previous run - else your results will be a combination of all previous tcov runs). Is this consistent with your earlier conclusions concerning the overall scaling of the code. Is it also consistent with the loop structure around this line? (You may want to run tcov profiling for a variety of problem sizes). Consider followint 3 problem sizes n=100 328350 -> a[ord[i]*n+j]=a[ord[i]*n+j]-factor*a[ord[k]*n+j]; n=200 2646700 -> a[ord[i]*n+j]=a[ord[i]*n+j]-factor*a[ord[k]*n+j]; n=400 21253400 -> a[ord[i]*n+j]=a[ord[i]*n+j]-factor*a[ord[k]*n+j]; We deduced early that doubling the problem size increased the time by a factor of 8. Thus consider multipling the line counts by 8, we get 328350*8= 2626800 is close to observed n=200 value of 2646700 2646700*8=21173600 is close to observed n=400 value of 21253400 This is all consistent with the loop structure that shows three loops running over something that relates to n LOOP1 for (k=0; k temp-=amatrix_kp[i*nsize+j]*xvector[j]; Line 155 in elim - copying an array, scales as n^2 80200 -> temp[i*n+j]=a[ord[i]*n+j]; Line 171 in pivot - scales as n^2 79800 -> test=fabs(a[ord[ii]*n+k]/sz[ord[ii]]); Line 192 in subst - scales as n^2 79800 -> sum=sum+a[i*n+j]*x[j]; ----------------------------------------------------------------------- Q16. How many basic blocks does the code contain, and how many of these were actually executed? from end of tcov output 51 Basic blocks in this file 41 Basic blocks executed ----------------------------------------------------------------------- Q17. Read the man page for er_print. How would you use er_print to obtain details on which routine called which, ie a calling graph similar to that from gprof? (Try this to make sure that it works). er_print -callers-callees test.1.er/ will give for gauss_elim, a similar calling graph to gprof Attr. Excl. Incl. Name User CPU User CPU User CPU sec. sec. sec. 5.240 0.080 5.550 main <<