more info: arps_mpi strange behavior as grid size increases

Matthew Parker mparker at PAPAGAYO.UNL.EDU
Fri Feb 21 12:06:17 CST 2003


TYPO CORRECTION:

Hello,

I previously sent:

> In testing v5.0.0beta8 on a Beowulf Linux cluster, I have encountered
> some strange behavior in arps_mpi.  Up to approximately 125x125x30
> physical points, the model runs, and for larger grids it hangs.  This
> result is scalable no matter how many processors I use.  With one
> processor, the limit is at 128x128x33.  With 4x4 processors, the limit
> is at 483x483x33.  I compute (back of the envelope) RAM usage of 200-250
> MB per processor, so I'm still well below the RAM of the cluster's nodes
> (1 GB).  For the runs that fail, the job is picked up and started,
> creates the log file, and then hangs up.  I haven't done extensive
> testing to find where it's hanging yet.  I wanted to see whether you are
> aware of this problem and know of a quick fix.
>
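
The RAM estimate quoted above can be sanity-checked with a short calculation. The sketch below is not taken from the ARPS source. It assumes 4-byte reals and a horizontal decomposition in which each processor holds (n - 3)/nproc + 3 points per direction; the names local_points and implied_arrays are hypothetical. It asks how many full 3-D arrays per processor the 200-250 MB figure would correspond to for the 483x483x33 run on 4x4 processors.

```python
# Back-of-the-envelope check of the per-processor memory estimate.
# ASSUMPTIONS (not confirmed by the ARPS source): 4-byte reals, and a
# decomposition with 3 overlapping points per horizontal direction.

def local_points(n_global, nproc, overlap=3):
    """Points held by one processor along one horizontal direction."""
    interior = n_global - overlap
    if interior % nproc != 0:
        raise ValueError("grid does not divide evenly among processors")
    return interior // nproc + overlap


def implied_arrays(mbytes, nx, ny, nz, nproc_x, nproc_y, bytes_per_real=4):
    """Number of full 3-D arrays per processor that would occupy `mbytes` MB."""
    points = local_points(nx, nproc_x) * local_points(ny, nproc_y) * nz
    return mbytes * 2**20 / (points * bytes_per_real)


if __name__ == "__main__":
    for mb in (200, 250):
        print(mb, "MB ->", round(implied_arrays(mb, 483, 483, 33, 4, 4)), "arrays")
```

Under those assumptions, the quoted range corresponds to roughly 105 to 131 three-dimensional arrays per processor, each processor holding a 123x123x33 subdomain.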

I have traced this problem.  The code hangs during mpi_send called by
mpsend2dew, which is called by jacob, which is called by inigrd, which
is called by initgrdvar, which is called by initial, which is called by
arps.

                                                                     vv
Why would a grid of 487x487x33 cause this call to hang, when 483x483x33
                                                                     ^^
does not cause it to hang?  The code does not produce any error
messages.  It simply freezes (until it is killed by the queue's
wallclock limit).  Any advice, suggestions, or fixes would be most
appreciated.  Many thanks in advance!
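
To make the size difference between the two grids concrete, the sketch below computes the size of one east-west boundary message for each. It is only an illustration under assumptions: that the message is a single ny-by-nz face of 4-byte reals, and that each processor holds (n - 3)/nproc + 3 points per direction. Neither has been checked against mpsend2dew, and the function names are hypothetical. The relevance is that a blocking MPI send is allowed, but not required, to buffer the message; many implementations buffer only below an implementation-specific size, so a hang that appears at a sharp size threshold may be consistent with neighbouring processors both waiting in a send. That is one possible cause, not a confirmed diagnosis.

```python
# Size of one east-west boundary message for the two 4x4-processor runs.
# ASSUMPTIONS (not confirmed by the ARPS source): the message is one
# ny_local x nz face of 4-byte reals, with 3 overlapping points per direction.

def local_points(n_global, nproc, overlap=3):
    """Points held by one processor along one horizontal direction."""
    interior = n_global - overlap
    if interior % nproc != 0:
        raise ValueError("grid does not divide evenly among processors")
    return interior // nproc + overlap


def ew_boundary_bytes(ny_global, nz, nproc_y, bytes_per_real=4):
    """Bytes in one east-west boundary face on a single processor."""
    return local_points(ny_global, nproc_y) * nz * bytes_per_real


if __name__ == "__main__":
    for n in (483, 487):
        print(f"{n}x{n}x33:", ew_boundary_bytes(n, 33, 4), "bytes")
```

Under these assumptions the two messages are 16236 and 16368 bytes, a difference of only 132 bytes, so any explanation based on message size would have to involve a threshold that falls in that narrow range.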

Best regards,
Matt

--
Dr. Matthew D. Parker
Assistant Professor of Meteorology/Climatology
Department of Geosciences
University of Nebraska-Lincoln
-------------- next part --------------
An embedded message was scrubbed...
From: Matthew Parker <mparker at papagayo.unl.edu>
Subject: arps_mpi strange behavior as grid size increases
Date: Fri, 14 Feb 2003 11:10:09 -0600
Size: 1460
URL: <http://www.caps.ou.edu/pipermail/arpssupport/attachments/20030221/48aaf3e7/attachment.mht>

