more info: arps_mpi strange behavior as grid size increases

Kevin W. Thomas kwthomas at WIZARD.CAPS.OU.EDU
Mon Feb 24 10:57:51 CST 2003


>>TYPO CORRECTION:
>>
>>Hello,
>>
>>I previously sent:
>>
>>> In testing v5.0.0beta8 on a Beowulf Linux cluster, I have encountered
>>> some strange behavior in arps_mpi.  Up to approximately 125x125x30
>>> physical points, the model runs, and for larger grids it hangs.  This
>>> result is scalable no matter how many processors I use.  With one
>>> processor, the limit is at 128x128x33.  With 4x4 processors, the limit
>>> is at 483x483x33.  I compute (back of the envelope) RAM usage of 200-250
>>> MB per processor, so I'm still well below the RAM of the cluster's nodes
>>> (1 GB).  For the runs that fail, the job is picked up and started,
>>> creates the log file, and then hangs up.  I haven't done extensive
>>> testing to find where it's hanging yet.  I wanted to see whether you are
>>> aware of this problem and know of a quick fix.
>>>
>>
>>I have traced this problem.  The code hangs during mpi_send called by
>>mpsend2dew, which is called by jacob, which is called by inigrd, which
>>is called by initgrdvar, which is called by initial, which is called by
>>arps.
>>
>>                                                                     vv
>>Why would a grid of 487x487x33 cause this call to hang, when 483x483x33
>>                                                                     ^^
>>does not cause it to hang?  The code does not produce any error
>>messages.  It simply freezes up (until it's killed by the queue's
>>wallclock limit).  Advice, suggestions, or fixes most appreciated.  Many
>>thanks in advance!
>>
>>Best regards,
>>Matt
>>
>>--
>>Dr. Matthew D. Parker
>>Assistant Professor of Meteorology/Climatology
>>Department of Geosciences
>>University of Nebraska-Lincoln
>
>Dr. Parker...
>
>You are correct, the problem is in the mpi_send() call.  When 4071 bytes or
>less are sent, the code works fine.  When 4072 bytes or more are sent, the
>call hangs.  I've checked my MPI book and the MPI web page.  I see nothing
>to suggest that there is a limit to the size used in an mpi_send() call.
>This suggests either a buggy "mpich", or a problem related to the Linux kernel.
>I've forwarded the information that I found to our local Sys Admin people to
>see if they can come up with a solution.
>
>I've come up with an equation to determine if you are over the limit.  For
>simplicity, I'm assuming nx=ny (483 and 487 in your 16 processor example),
>and that nproc_x=nproc_y, which is 4 in your example.  nz=33 for your case.
>
>val = ( (nx-3)/nproc_x + 3 ) * nz
>
>If "val" is 4071 or less, you are okay.  If not, the run will hang.

"Bytes" is the wrong word here.  The results that I posted, including the
equation, are for reals, not bytes.  The failure is somewhere between
16284 bytes (4071*4) and 16288 bytes (4072*4).
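
For anyone who wants to check a grid before submitting a run, here is a
rough Python sketch of the equation above.  It is not part of ARPS, and the
function and constant names are made up for illustration.  It assumes 4-byte
reals (as in the 4071*4 arithmetic), nx=ny, nproc_x=nproc_y, and that nx-3
divides evenly by nproc_x.

```python
# Illustrative check of the message-size equation; names are hypothetical.
BYTES_PER_REAL = 4    # assumption: default 4-byte reals
MAX_REALS_OK = 4071   # largest message, in reals, observed to work


def message_reals(nx, nproc_x, nz):
    """Reals per message: ((nx-3)/nproc_x + 3) * nz.

    Assumes (nx - 3) is evenly divisible by nproc_x.
    """
    return ((nx - 3) // nproc_x + 3) * nz


def will_hang(nx, nproc_x, nz):
    """True if the message size exceeds the observed working limit."""
    return message_reals(nx, nproc_x, nz) > MAX_REALS_OK


if __name__ == "__main__":
    # The two 4x4 processor cases from the original report.
    for nx in (483, 487):
        reals = message_reals(nx, 4, 33)
        status = "hangs" if will_hang(nx, 4, 33) else "ok"
        print(nx, reals, reals * BYTES_PER_REAL, status)
```

For the 4x4 case this gives 4059 reals (16236 bytes) for nx=483, which is
under the limit, and 4092 reals (16368 bytes) for nx=487, which is over it.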

        Kevin W. Thomas
        Center for Analysis and Prediction of Storms
        University of Oklahoma
        Norman, Oklahoma
        Email:  kwthomas at ou.edu
