pycaps.io package

Submodules

pycaps.io.MRMSGrid module

class MRMSGrid(start_date, end_date, variable, path_start, freq='1H')[source]

Bases: object

MRMSGrid reads time series of MRMS grib2 files, interpolates them, and outputs them to netCDF4 format.

interpolate_grid(in_lon, in_lat)[source]

Interpolates MRMS data to a different grid using cubic bivariate splines.

interpolate_to_netcdf(in_lon, in_lat, out_path, date_unit='seconds since 1970-01-01T00:00')[source]

Calls the interpolation function and then saves the MRMS data to a netCDF file. It will also create separate directories for each variable if they are not already available.
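The default date_unit counts time from a fixed reference epoch. The sketch below shows how hourly valid times could map onto that unit. It uses only the standard library; the example date and all variable names are illustrative assumptions and are not taken from pycaps.

```python
from datetime import datetime, timedelta

# Reference time implied by 'seconds since 1970-01-01T00:00'.
EPOCH = datetime(1970, 1, 1)

# Hypothetical day of hourly data, matching the default freq='1H'.
start_date = datetime(2015, 5, 6)
valid_times = [start_date + timedelta(hours=h) for h in range(24)]

# Encode each valid time as an integer number of seconds since the epoch.
offsets = [int((t - EPOCH).total_seconds()) for t in valid_times]

# Decoding reverses the operation.
decoded = [EPOCH + timedelta(seconds=s) for s in offsets]
```

Consecutive offsets differ by 3600 because the times are one hour apart.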

load_data()[source]

Loads data from MRMS GRIB2 files and handles compression duties if files are compressed.

MRMS_main()[source]
interpolate_mrms_day(start_date, variable, mrms_path, map_filename, out_path)[source]

For a given day, this function interpolates hourly MRMS data to a specified latitude and longitude grid, and saves the interpolated grids to CF-compliant netCDF4 files.

Parameters:
  • start_date (datetime.datetime) – Date of data being interpolated
  • variable (str) – MRMS variable
  • mrms_path (str) – Path to top-level directory of MRMS GRIB2 files
  • map_filename (str) – Name of the map file. Supports ARPS map file format and netCDF files containing latitude and longitude variables.
  • out_path (str) – Path to location where interpolated netCDF4 files are saved.
load_map_coordinates(map_file)[source]

Loads map coordinates from netCDF or pickle file created by util.makeMapGrids.

Parameters:map_file – Filename for the file containing coordinate information.
Returns:Latitude and longitude grids as numpy arrays.

pycaps.io.ModelGrid module

class ModelGrid(filenames, run_date, start_date, end_date, variable, frequency='1H')[source]

Bases: object

Base class for reading 2D model output grids from netCDF files.

Given a list of file names, loads the values of a single variable from a model run. Supports model output in netCDF format.

filenames

list of str

List of netCDF files containing model output

run_date

ISO date string or datetime.datetime object

Date of the initialization time of the model run.

start_date

ISO date string or datetime.datetime object

Date of the first timestep extracted.

end_date

ISO date string or datetime.datetime object

Date of the last timestep extracted.

frequency

str

Spacing between model time steps.

valid_dates

DatetimeIndex of all model timesteps

forecast_hours

array of all hours in the forecast

file_objects

list

List of the file objects for each model time step
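As a rough illustration of how valid_dates and forecast_hours relate to run_date, start_date, and end_date, the sketch below builds both with the standard library. The dates are made up, plain lists stand in for the DatetimeIndex and array, and the hourly step assumes the default frequency of '1H'; none of this is the actual ModelGrid implementation.

```python
from datetime import datetime, timedelta

# Hypothetical run: initialized at 00 UTC, extracting hours 12 through 24.
run_date = datetime(2015, 5, 6, 0)
start_date = datetime(2015, 5, 6, 12)
end_date = datetime(2015, 5, 7, 0)
step = timedelta(hours=1)  # assumed to correspond to frequency='1H'

# Every model time step from start_date to end_date, inclusive.
valid_dates = []
current = start_date
while current <= end_date:
    valid_dates.append(current)
    current += step

# Forecast hour = elapsed hours since the model initialization time.
forecast_hours = [int((t - run_date).total_seconds() // 3600) for t in valid_dates]
```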

__enter__()[source]

Open each file for reading.

__exit__()[source]

Close links to all open file objects and delete the objects.

close()[source]

Close links to all open file objects and delete the objects.
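The __enter__(), __exit__(), and close() methods suggest the usual context-manager pattern. The sketch below shows that pattern with a hypothetical stand-in class, FileSeries, which opens ordinary binary files rather than netCDF files; it is an assumption about the general shape, not the ModelGrid code.

```python
import os
import tempfile


class FileSeries(object):
    """Hypothetical stand-in that manages one open file per time step."""

    def __init__(self, filenames):
        self.filenames = list(filenames)
        self.file_objects = []

    def __enter__(self):
        # Open each file for reading.
        self.file_objects = [open(name, "rb") for name in self.filenames]
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()

    def close(self):
        # Close every open file and drop the references.
        for file_obj in self.file_objects:
            file_obj.close()
        del self.file_objects[:]


# Create two throwaway files to stand in for model output.
tmp_dir = tempfile.mkdtemp()
names = []
for i in range(2):
    name = os.path.join(tmp_dir, "step%d.bin" % i)
    with open(name, "wb") as f:
        f.write(b"\x00")
    names.append(name)

with FileSeries(names) as series:
    n_open = len(series.file_objects)
    handles = list(series.file_objects)

all_closed = all(h.closed for h in handles)
```

Using the with statement guarantees that the files are closed even if an exception is raised while reading.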

static format_var_name(variable, var_list)[source]

Searches var list for variable name, checks other variable name format options.

Parameters:
  • variable (str) – Variable being loaded
  • var_list (list) – List of variables in file.
Returns:

Name of variable in file containing relevant data, and index of variable z-level if multiple variables contained in same array in file.

load_data()[source]

Load data from netCDF file objects or list of netCDF file objects. Handles special variable name formats.

Returns:Array of data loaded from files with dimensions (time, y, x), and the units of the data.
load_data_old()[source]

Loads time series of 2D data grids from each opened file. The code handles loading a full time series from one file or individual time steps from multiple files. Missing files are supported.

pycaps.io.NCARModelGrid module

class NCARModelGrid(member, run_date, variable, start_date, end_date, path, single_step=False)[source]

Bases: pycaps.io.ModelGrid.ModelGrid

Extension of the ModelGrid class for interfacing with the NCAR ensemble.

Parameters:
  • member (str) – Name of the ensemble member
  • run_date (datetime.datetime object) – Date of the initial step of the ensemble run
  • start_date (datetime.datetime object) – First time step extracted.
  • end_date (datetime.datetime object) – Last time step extracted.
  • path (str) – Path to model output files.
  • single_step (boolean, default=False) – Whether variable information is stored with each time step in a separate file or one file containing all timesteps.

pycaps.io.SSEFModelGrid module

class SSEFModelGrid(member, run_date, variable, start_date, end_date, path, single_step=False)[source]

Bases: pycaps.io.ModelGrid.ModelGrid

Extension of ModelGrid to the CAPS Storm-Scale Ensemble Forecast system.

Parameters:
  • member (str) – Name of the ensemble member
  • run_date (datetime.datetime object) – Date of the initial step of the ensemble run
  • start_date (datetime.datetime object) – First time step extracted.
  • end_date (datetime.datetime object) – Last time step extracted.
  • path (str) – Path to model output files.
  • single_step (boolean, default=False) – Whether variable information is stored with each time step in a separate file or one file containing all timesteps.

pycaps.io.binfile module

class BinFile(file_name, mode='r', byteorder='<')[source]

Bases: object

Handles the low-level reading and writing of binary files. Inherit from this class for all your binary file I/O needs.

Parameters:
  • file_name (str) – Name of the file to open.
  • mode (str) – Read/write mode for the file. Default is ‘r’ (for reading).
  • byteorder (str) – Endianness for the file. Acceptable values are ‘<’ for little-endian, ‘>’ for big endian. Default is ‘<’ (little endian).
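The ‘<’ and ‘>’ byteorder values are the same prefixes used by Python's struct module. The short sketch below, which uses struct directly rather than BinFile, shows why choosing the right one matters; the variable names are illustrative.

```python
import struct

value = 1

# '<' stores the least significant byte first (little-endian).
little = struct.pack("<i", value)

# '>' stores the most significant byte first (big-endian).
big = struct.pack(">i", value)

# Reading little-endian bytes as big-endian silently yields a different number.
(misread,) = struct.unpack(">i", little)
```

No error is raised when the byte order is wrong, so implausibly large values are often the first sign of a mismatch.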
ANCH_CURPOS = 1
ANCH_FILEBEG = 0
ANCH_FILEEND = 2
_ateof()[source]

Returns True if the pointer has reached the end of the file, returns False otherwise.

_compute_block_size(type_dict)[source]
_peek(type_string)[source]

Peek at a value in the file.

Parameters:type_string (str) – See _read().
Returns:Data from the file (see _read()).
_read(type_string, _peeking=False)[source]

Read values from the file and return as a specific type.

Parameters:type_string (str) – The number and type of data to read from the file. The form of the string is nt, where n is the number of values to read, and t is the type of the values. If n is omitted, it is assumed to be 1. Acceptable values for t can be found in the table below.
Returns:Data from the file as the specified type. For example, a type string of ‘4f’ will return a list of 4 floats. A type string of ‘i’ will return a single integer.

Some possible values for the type character are as follows:

Type Character Meaning
i signed 32-bit integer
h signed 16-bit integer
b signed 8-bit integer
f single-precision float
d double-precision float
c single character
s character string

The s character may be post-fixed with a number that tells the length of the string. For example, s10 refers to a string of length 10. See https://docs.python.org/2/library/struct.html#format-characters for a full description.
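To make the nt form concrete, here is a minimal sketch of how numeric type strings could be decoded with the struct module. The helper read_values is hypothetical and only mirrors the documented return behavior (a single value for a count of 1, a list otherwise); it is not the BinFile implementation, and it sticks to numeric types.

```python
import struct


def read_values(buf, type_string, byteorder="<", offset=0):
    """Hypothetical helper: decode `type_string` from `buf` at `offset`."""
    fmt = byteorder + type_string
    size = struct.calcsize(fmt)
    values = struct.unpack_from(fmt, buf, offset)
    # A single value is returned bare; several values come back as a list.
    result = values[0] if len(values) == 1 else list(values)
    return result, offset + size


# One 32-bit integer followed by four single-precision floats.
payload = struct.pack("<i4f", 7, 1.0, 2.0, 3.0, 4.0)

count, pos = read_values(payload, "i")
floats, pos = read_values(payload, "4f", offset=pos)
```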

_read_block(type_dict, dest_dict, fortran=False)[source]

Read a block of data from the file.

Parameters:
  • type_dict (OrderedDict) – An ordered dictionary with variable names as keys and type strings (see _read()) as values.
  • dest_dict (dict) – A dictionary in which to place the values when they’ve been read from the file.
  • fortran (bool) – Whether or not to read the block as a Fortran-formatted block. Default is False.
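Fortran unformatted sequential files commonly wrap each record in a leading and trailing 4-byte length marker. Assuming that is what a Fortran-formatted block means here, the sketch below reads one such record into a dictionary. The function read_fortran_block and the field names are hypothetical and only mimic the documented type_dict and dest_dict roles.

```python
import io
import struct
from collections import OrderedDict


def read_fortran_block(stream, type_dict, byteorder="<"):
    """Hypothetical reader for one length-delimited Fortran record."""
    marker_fmt = byteorder + "i"
    (head,) = struct.unpack(marker_fmt, stream.read(4))

    dest = {}
    n_bytes = 0
    for name, type_string in type_dict.items():
        fmt = byteorder + type_string
        size = struct.calcsize(fmt)
        values = struct.unpack(fmt, stream.read(size))
        dest[name] = values[0] if len(values) == 1 else list(values)
        n_bytes += size

    (tail,) = struct.unpack(marker_fmt, stream.read(4))
    # Both markers should equal the number of payload bytes just read.
    if not (head == tail == n_bytes):
        raise ValueError("record markers do not match the block size")
    return dest


# Build a fake record: two integers and one float between length markers.
payload = struct.pack("<iif", 303, 303, 1000.0)
marker = struct.pack("<i", len(payload))
record = marker + payload + marker

types = OrderedDict([("nx", "i"), ("ny", "i"), ("dx", "f")])
header = read_fortran_block(io.BytesIO(record), types)
```

The ordered dictionary matters because the fields are decoded in the order they appear in the file.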
_read_grid(type_string, shape, fortran=False)[source]

Read a grid from the file.

Parameters:
  • type_string (str) – As in _read(), but only the type character. The number is determined by the shape argument.
  • shape (tuple) – The shape of the grid to return.
  • fortran (bool) – Whether or not to read the grid as a Fortran-formatted block. Default is False.
Returns:

A grid of data as a numpy array.

_seek(location, anchor=0)[source]

Move the file pointer to a particular location.

Parameters:
  • location (int) – Location in the file in number of bytes.
  • anchor (int) – The point in the file to which location is relative. For example, BinFile.ANCH_FILEBEG means that location is relative to the beginning of the file. Default value is BinFile.ANCH_FILEBEG.
_tell()[source]

Get the current location of the file pointer in bytes from the start of the file.
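The three anchor constants have the same numeric values as the whence argument of Python's file seek() (0 for the start, 1 for the current position, 2 for the end). Assuming _seek() and _tell() behave like their file-object counterparts, the sketch below shows the effect of each anchor on an in-memory stream; it does not use BinFile itself.

```python
import io

# Same values as BinFile.ANCH_FILEBEG, ANCH_CURPOS, and ANCH_FILEEND.
ANCH_FILEBEG, ANCH_CURPOS, ANCH_FILEEND = 0, 1, 2

stream = io.BytesIO(b"0123456789")

# Absolute position: 4 bytes from the beginning of the file.
stream.seek(4, ANCH_FILEBEG)
first = stream.read(1)

# Relative position: skip 2 bytes forward from where the last read ended.
stream.seek(2, ANCH_CURPOS)
second = stream.read(1)

# Position relative to the end: the final byte.
stream.seek(-1, ANCH_FILEEND)
last = stream.read(1)

# After reading the final byte, the pointer sits at the end of the data.
at_eof = stream.tell() == len(stream.getvalue())
```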

_write(value, type_string)[source]

Write values to the file.

Parameters:type_string (str) – See _read() for a full description.
_write_block(type_dict, src_dict, fortran=False)[source]

Write a block of data to the file.

Parameters:
  • type_dict (OrderedDict) – An ordered dictionary with variable names as keys and type strings (see _read()) as values.
  • src_dict (dict) – A dictionary containing the variable names as keys and their values.
  • fortran (bool) – Whether or not to write the block as a Fortran-formatted block. Default is False.
_write_grid(type_string, grid, fortran=False)[source]

Write a grid to the file.

Parameters:
  • type_string (str) – As in _read(), but only the type character. The number is determined by the shape of grid.
  • grid (np.array) – The grid of data to write to the file.
  • fortran (bool) – Whether or not to write the grid as a Fortran-formatted block. Default is False.
close()[source]

Close the file.

pycaps.io.coltilt module

class ColumnTiltFile(file_name, vars=('vr', 'Z'), mode='r')[source]

Bases: pycaps.io.binfile.BinFile

Read an ARPS EnKF column-tilt formatted radar observation file.

Parameters:
  • file_name (str) – The name of the file to open.
  • vars (list) – A list containing the names and order of the variables in the file. Defaults to [‘vr’, ‘Z’] (radial velocity, then reflectivity).
  • mode (str) – Read/write mode for this file. The default is ‘r’ for reading, currently the only supported option.
radar_id

str

4-character radar ID.

timestamp

datetime

The valid time for the data file.

elevations

np.array

List of elevation angles in the file.

__getitem__(var_name)[source]

Retrieve data from the file.

Parameters:var_name (str) – Name of the variable to retrieve. Acceptable values are ‘z’ (height), ‘r’ (slant range), or any of the names passed to the vars keyword argument in ColumnTiltFile.__init__().
Returns:A three-dimensional numpy array (NTILT \(\times\) NY \(\times\) NX)

Examples

>>> ctf = ColumnTiltFile("/path/to/columntiltfile/KTLX.20110524.210000")
>>> ctf['Z'].shape # Print the shape of the reflectivity array. Will be the same horizontal shape as the domain.
(14, 303, 303)
>>> ctf['Z'].max() # Print the maximum of the reflectivity array
68.41645

pycaps.io.dataload module

get_axes(base_path, base_name, agl=True, z_coord_type='', split=None, fcst=False)[source]

Get the axes from a grdbas file.

Parameters:
  • base_path (str) – Path to the grdbas file.
  • base_name (str) – Base name of the grdbas file (e.g. ‘enf001’).
  • agl (bool) – Whether to return the z coordinates relative to ground level (True) or MSL (False)
  • z_coord_type (str) – Which set of z coordinates to use. “” refers to the atmospheric coordinates, and “soil” refers to the soil z coordinates. Default is “” (atmospheric coordinates).
  • split (tuple) – If the domain is split, then this tuple contains the domain decomposition in (NX, NY).
Returns:

A dictionary containing the x and y coordinates in the ‘x’ and ‘y’ keys, and either ‘z’ and ‘z_MSL’ (if agl=True) or ‘z’ and ‘z_AGL’ (if agl=False)

load_domain(base_path, data_file, derived, interpolator, coords='hght', aggregator=None, split=None)[source]

Load one timestep of one ensemble member.

load_ensemble(base_path, members, times, var_names, derived=<function recarray_fm_dict>, interpolator=<pycaps.interp.interp.NullInterpolator object>, aggregator=None, max_concurrent=-1, single_var=False, z_coord_type='atmos', coords='hght', fcst=False, split=None)[source]

Load an entire ensemble into memory. Optionally, do interpolation, compute derived variables, and aggregate the ensemble.

Parameters:
  • base_path (str) – Path to the data to load.
  • members (int or list) – If an integer, determines the number of members to load (e.g. load members 1 through members). If a list, determines which members to load (e.g. load members 1, 4, 10, and 19).
  • times (list) – A list of times (in seconds since initialization) at which to load data.
  • var_names (list) – A list of variable names to load from the file.
  • derived (function) – A function to compute derived variables. The function must take a collection of keyword arguments (e.g. **kwargs) and return a single numpy array. Optional, default is to return all the variables in a numpy record array.
  • interpolator (Interpolator) – An interpolator object (as defined in pycaps.interp.interp) specifying how to interpolate each member at each time. Optional, default is to do no interpolation (return the full three-dimensional domain).
  • aggregator (function) – A function describing how to aggregate the ensemble members prior to computing derived variables. The function must take a single numpy array, the first dimension of which is the ensemble, and return another numpy array. Could be used for computing an ensemble mean prior to computing reflectivity. Optional, default is to do no aggregation.
  • max_concurrent (int) – The number of processes to run concurrently. Optional, default is to load all ensemble members at the same time (or in some configurations, all time steps from an individual member at the same time).
  • single_var (bool) – Whether or not to load data from a single variable file. If true, then var_names must be of length 1. Optional, default is False.
  • z_coord_type (str) – Specifies which z coordinates to use. Acceptable values are “atmos” for atmospheric z coordinates or “soil” for soil z coordinates. Optional, default is “atmos”.
  • coords (str) – Specifies which vertical coordinate to use in the atmosphere. Acceptable values are “hght” for height coordinates, and “pres” for pressure coordinates. Optional, default is “hght”.
  • fcst (bool) – Specifies whether to load the forecast (True) or analysis (False) ensemble. Optional, default is False (loads analysis ensemble).
  • split (tuple) – Specifies the domain configuration for split ensembles. Must be a tuple (NPX, NPY), where NPX is the number of subdomains in the x direction, and NPY is the same for the y direction. Optional, default is an already-joined ensemble.
Returns:

A numpy array containing the data in the ensemble. For the full ensemble with no interpolation or aggregation, the order of dimensions will be (NE, NT, NZ, NY, NX). Aggregation will remove the NE dimension, and interpolation will change the last three according to which interpolation is being done (for example, interpolation to a height will remove the NZ dimension, while interpolating to a set of points will replace the NZ, NY, and NX dimensions with NP).
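The dimension bookkeeping described above can be summarized in a few lines. The helper expected_dims below is purely hypothetical; it only restates the rules in the preceding paragraph for the two interpolation examples given there, and does not call pycaps.

```python
def expected_dims(aggregated=False, interpolation=None):
    """Hypothetical helper: dimension order of the array from load_ensemble.

    `interpolation` is None (no interpolation), "height" (interpolation to a
    height), or "points" (interpolation to a set of points).
    """
    dims = ["NT", "NZ", "NY", "NX"]
    if interpolation == "height":
        # Interpolating to a height removes the vertical dimension.
        dims.remove("NZ")
    elif interpolation == "points":
        # Interpolating to points replaces NZ, NY, and NX with NP.
        dims = ["NT", "NP"]
    if not aggregated:
        # Without aggregation, the ensemble dimension comes first.
        dims.insert(0, "NE")
    return tuple(dims)


full = expected_dims()
mean_at_height = expected_dims(aggregated=True, interpolation="height")
members_at_points = expected_dims(interpolation="points")
```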

load_run(base_path, base_name, times, var_names, derived=<function recarray_fm_dict>, interpolator=<pycaps.interp.interp.NullInterpolator object>, max_concurrent=-1, single_var=False, z_coord_type='atmos', coords='hght', split=None)[source]

Load a single run into memory. Optionally, do interpolation and compute derived variables.

Parameters:
  • base_path (str) – Path to the data to load.
  • base_name (str) – The
  • times (list) – A list of times (in seconds since initialization) at which to load data.
  • var_names (list) – A list of variable names to load from the file.
  • derived (function) – A function to compute derived variables. The function must take a collection of keyword arguments (e.g. **kwargs) and return a single numpy array. Optional, default is to return all the variables in a numpy record array.
  • interpolator (Interpolator) – An interpolator object (as defined in pycaps.interp.interp) specifying how to interpolate the run at each time. Optional, default is to do no interpolation (return the full three-dimensional domain).
  • max_concurrent (int) – The number of processes to run concurrently. Optional, default is to load all time steps at the same time.
  • single_var (bool) – Whether or not to load data from a single variable file. If true, then var_names must be of length 1. Optional, default is False.
  • z_coord_type (str) – Specifies which z coordinates to use. Acceptable values are “atmos” for atmospheric z coordinates or “soil” for soil z coordinates. Optional, default is “atmos”.
  • coords (str) – Specifies which vertical coordinate to use in the atmosphere. Acceptable values are “hght” for height coordinates, and “pres” for pressure coordinates. Optional, default is “hght”.
  • split (tuple) – Specifies the domain configuration for split ensembles. Must be a tuple (NPX, NPY), where NPX is the number of subdomains in the x direction, and NPY is the same for the y direction. Optional, default is an already-joined ensemble.
Returns:

A numpy array containing the data for the run. For the full run with no interpolation, the order of dimensions will be (NT, NZ, NY, NX). Interpolation will change the last three according to which interpolation is being done (for example, interpolation to a height will remove the NZ dimension, while interpolating to a set of points will replace the NZ, NY, and NX dimensions with NP).

pycaps.io.gridtilt module

class GridTiltFile(file_name, vars=('vr', 'Z'), mode='r')[source]

Bases: pycaps.io.binfile.BinFile

Read an ARPS EnKF grid-tilt formatted radar observation file.

Parameters:
  • file_name (str) – The name of the file to open.
  • vars (list) – A list containing the names and order of the variables in the file. Defaults to [‘vr’, ‘Z’] (radial velocity, then reflectivity).
  • mode (str) – Read/write mode for this file. The default is ‘r’ for reading; ‘w’ for writing is also supported.
timestamp

datetime

The valid time for the data file.

n_tilts

int

Number of elevations in the radar data.

n_gridx

int

Number of grid points in the x direction.

n_gridy

int

Number of grid points in the y direction.

radar_name

str

Name of the radar.

radar_lat

float

Latitude of the radar.

radar_lon

float

Longitude of the radar.

radar_x

float

x coordinate of the radar location on the domain.

radar_y

float

y coordinate of the radar location on the domain.

d_azimuth

float

Azimuthal spacing for the raw radar data.

range_min

float

Minimum range for the raw radar data.

range_max

float

Maximum range for the raw radar data.

elevations

np.array

List of elevation angles.

__getitem__(var_name)[source]

Retrieve data from the file.

Parameters:var_name (str) – Name of the variable to retrieve. Acceptable values are ‘z’ (height in meters), ‘r’ (slant range in meters), or any of the names passed to the vars keyword argument in GridTiltFile.__init__().
Returns:A three-dimensional numpy array (NTILT \(\times\) NY \(\times\) NX)

Examples

>>> gtf = GridTiltFile("/path/to/gridtiltfile/KTLX.20110524.210000")
>>> gtf['Z'].max() # Pull the reflectivity out of the file and take the maximum.
68.41645
__setitem__(var_name, data)[source]

Set data in the file.

Parameters:
  • var_name (str) – Name of the variable to set. Use ‘z’ for height and ‘r’ for slant range.
  • data (np.array) – The data put into the file. Must be the same shape as the rest of the variables.

Examples

>>> gtf = GridTiltFile("/path/to/gridtiltfile/KTLX.20110524.210000")
>>> gtf['Z'] = new_reflectivity_data
close()[source]

Close the file. Write data if opened for writing.

copy_headers(gtf)[source]

Copy the headers from another gridtilt file to this one.

Parameters:gtf (GridTiltFile) – The grid tilt file from which to take the header information.

pycaps.io.io_modules module

grdbas_read(grdbas_filename, format='hdf')[source]

Reads in basic grid data from an ARPS grdbas file (in HDF format)

Parameters:
  • grdbas_filename – The full path to the grdbas file to read from
  • format – OPTIONAL – The format of your input data (currently valid options are ‘hdf’ (default) and ‘netcdf’)
Returns:

A tuple containing the following values

Variable Description
ctrlat The latitude of the domain center
ctrlon The longitude of the domain center
trulat1 The first true latitude value for the lambert conformal map projection
trulat2 The second true latitude value for the lambert conformal map projection
trulon The true longitude value for the lambert conformal map projection
nx The number of gridpoints in the east-west direction
ny The number of gridpoints in the north-south direction
nz The number of gridpoints in the vertical direction
dx The grid spacing in the east-west direction
dy The grid spacing in the north-south direction
width_x The width of the domain in the east-west direction, in meters
width_y The width of the domain in the north-south direction, in meters

grdbas_read_patch(grdbas_filename, xpatches, ypatches, format='hdf')[source]

Reads in basic grid data from ARPS grdbas patch files (in HDF format)

Parameters:
  • grdbas_filename – The full path to the grdbas file to read from
  • xpatches – The number of patches in the x-direction (in arps.input, nproc_x)
  • ypatches – The number of patches in the y-direction (in arps.input, nproc_y)
  • format – OPTIONAL – The format of your file. Valid options include netcdf and HDF (default HDF).
Returns:

A tuple containing the following values

Variable Description
ctrlat The latitude of the domain center
ctrlon The longitude of the domain center
trulat1 The first true latitude value for the lambert conformal map projection
trulat2 The second true latitude value for the lambert conformal map projection
trulon The true longitude value for the lambert conformal map projection
nx The number of gridpoints in the east-west direction in the full domain
ny The number of gridpoints in the north-south direction in the full domain
nz The number of gridpoints in the vertical direction in the full domain
dx The grid spacing in the east-west direction
dy The grid spacing in the north-south direction
width_x The width of the domain in the east-west direction, in meters
width_y The width of the domain in the north-south direction, in meters
nx_patch The number of gridpoints in the east-west direction in a single patch
ny_patch The number of gridpoints in the north-south direction in a single patch

read_xy_slice(field, source, level, format='hdf', **kwargs)[source]

Reads the data needed for xyplot to generate an x-y variable field plot.

Parameters:
  • field – The variable name associated with the field to be plotted (e.g. u, v, pt, qr)
  • source – The FULL PATH to the file containing the data to be plotted.
  • level – The vertical (k) model layer for which data should be read and plotted.
  • format – OPTIONAL – The input data format (currently valid options are ‘hdf’ (default) and ‘netcdf’)
  • grdbas – OPTIONAL – A history file to read grid information from (if different from source)
  • h2 – OPTIONAL – A second source file containing other data needed for plotting, if you have one. (example: Read reflectivity from one file, and wind data from another).
  • truref – OPTIONAL – If reading from an ARPS file processed using ossedata (affects variable name and filename) this should be set to the full path to the truref file.
  • decompress – OPTIONAL – A flag to set to True if you’re using ARPS data with 32 bit integers mapped to 16 bit integers. Will call the decompression function if set to True.
Returns:

The 2D slice of data to be plotted

Return type:

var_sfc

pycaps.io.level2 module

class NCDCLevel2File(file_name, mode='r', byteorder='>')[source]

Bases: pycaps.io.binfile.BinFile

Read a raw Level II file downloaded from NCDC.

Parameters:
  • file_name (str) – The name of the file to load.
  • mode (str) – Read/write mode of the file. Default value is ‘r’, the only currently supported option.
__getitem__(var_name)[source]

Retrieve data from the file.

Parameters:var_name (str) – Name of the variable to retrieve from the file. Acceptable values are ‘REF’ for reflectivity and ‘VEL’ for radial velocity.
Returns:A three-dimensional numpy array (NTILT \(\times\) NAZIM \(\times\) NRANGE)
get_coords(var_name, sweep_no)[source]

Get the azimuth and range for a sweep in the data file.

Parameters:
  • var_name (str) – The variable name for which to retrieve the azimuth and range.
  • sweep_no (int) – The tilt for which to retrieve azimuth and range.
Returns:

A tuple of 1-dimensional numpy arrays of azimuth (in degrees) and range (in meters).
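Azimuth and range are polar coordinates centered on the radar. The sketch below converts them to Cartesian offsets. It assumes the usual meteorological convention (azimuth measured clockwise from north) and a flat-earth approximation with no beam refraction; the function polar_to_xy is hypothetical and not part of pycaps.

```python
import math


def polar_to_xy(azimuth_deg, range_m, elevation_deg=0.0):
    """Hypothetical helper: east/north offsets (meters) from the radar."""
    azimuth = math.radians(azimuth_deg)
    elevation = math.radians(elevation_deg)
    # Project the slant range onto the horizontal plane.
    ground_range = range_m * math.cos(elevation)
    x = ground_range * math.sin(azimuth)  # positive toward the east
    y = ground_range * math.cos(azimuth)  # positive toward the north
    return x, y


east = polar_to_xy(90.0, 1000.0)
north = polar_to_xy(0.0, 1000.0)
tilted = polar_to_xy(0.0, 1000.0, elevation_deg=60.0)
```

For operational work, a beam-height model that accounts for the curvature of the earth would be needed in place of the flat-earth projection.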

get_elevations(var_name)[source]

Get the elevation angles for a particular variable.

Parameters:var_name (str) – The variable name for which to retrieve the elevation angles.
Returns:A 1-dimensional numpy array containing the elevation angles in degrees.

pycaps.io.modelobs module

class ARPSModelObsFile(file_name, vars=('vr', 'Z'), mpi_config=(1, 1), mode='r')[source]

Bases: pycaps.io.binfile.BinFile

Read an ARPS EnKF model observation file (such as that created by arpsenkf or postinnov).

Parameters:
  • file_name (str) – The name of the file to open.
  • vars (list) – A list containing the names and order of the variables in the file. Defaults to [‘vr’, ‘Z’] (radial velocity, then reflectivity).
  • mpi_config (tuple) – A tuple containing the MPI configuration of the run that generated this file. Defaults to (1, 1), signifying no MPI.
  • mode (str) – Read/write mode for this file. The default is ‘r’ for reading, currently the only supported option.
timestamp

datetime

The valid time for the data in the file.

n_tilts

int

Number of elevation angles in the file.

n_gridx

int

Number of grid points in the x direction.

n_gridy

int

Number of grid points in the y direction.

radar_id

str

4-character ID for the radar.

radar_lat

float

Latitude of the radar.

radar_lon

float

Longitude of the radar.

radar_x

float

x coordinate of the radar on the domain.

radar_y

float

y coordinate of the radar on the domain.

d_azim

float

Azimuthal spacing of the raw data.

range_min

float

Minimum range of the raw data.

range_max

float

Maximum range of the raw data.

elevations

np.array

List of elevation angles in the file.

__getitem__(var_name)[source]

Retrieve data from the file.

Parameters:var_name (str) – Name of the variable to retrieve. Acceptable values are ‘z’ (height), ‘r’ (slant range), or any of the names passed to the vars keyword argument in ARPSModelObsFile.__init__().
Returns:A three-dimensional numpy array (NTILT \(\times\) NY \(\times\) NX)

Module contents