HRRR Download Scripting Tips



If you have not already, please register as a user before downloading data. Citation details can be found at the bottom of this page.



You may write your own script to automate the download process, but PLEASE do not download an excessive number of files in a short period of time on multiple nodes (you agreed to not do this when you read the Best Practices).

HRRR GRIB2 files are large. sfc files are >100 MB and prs files are >380 MB each. If you download a day's worth of prs analyses (24 files), that's over 9 GB!


GRIB2 files are downloaded from the URL

https://pando-rgw01.chpc.utah.edu/[model type]/[fields]/[YYYYMMDD]/[file name]

Metadata for each file can be viewed from the same URL, except with .idx appended to the grib2 file name:

https://pando-rgw01.chpc.utah.edu/[model type]/[fields]/[YYYYMMDD]/[file name].idx

The model type and variable fields available include:

  • [model type] hrrr for the operational HRRR
    • [fields] sfc
    • [fields] prs
    • [fields] subh (sparse availability, if any)
    • [fields] nat (sparse availability, if any)
  • [model type] hrrrX for the experimental HRRR
    • [fields] sfc
  • [model type] hrrrak for HRRR Alaska
    • [fields] sfc
    • [fields] prs

[YYYYMMDD] represents the UTC date format (e.g. 20171228).

[file name] is in the format [model type].t[00-23]z.wrf[fields]f[00-18].grib2
where the two digit number following t is the model run hour and the two digit number following f is the model forecast hour.

Model Name         | Model Type | Model Cycle                   | Archived Forecasts
Operational HRRR   | hrrr       | Hourly, range(0,24)           | sfc: f00-f18; prs: f00
Experimental HRRR  | hrrrX      | Hourly, range(0,24)           | sfc: f00
HRRR Alaska        | hrrrak     | Every 3 hours, range(0,24,3)  | sfc: f00

Example

https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20180101/hrrr.t00z.wrfsfcf00.grib2

https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20180101/hrrr.t00z.wrfsfcf00.grib2.idx
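The URL pattern above can be assembled in a few lines of Python. This is only a sketch of the naming convention described on this page; the function name `build_pando_url` and its arguments are made up for illustration and are not part of any existing library. It assumes Python 3.

```python
from datetime import datetime

BASE_URL = "https://pando-rgw01.chpc.utah.edu"


def build_pando_url(run_time, model="hrrr", field="sfc", fxx=0, idx=False):
    """Return the Pando URL for one HRRR GRIB2 file, or its .idx file.

    run_time : datetime of the model run (UTC); only the date and hour are used
    model    : 'hrrr', 'hrrrX', or 'hrrrak'
    field    : 'sfc', 'prs', ...
    fxx      : forecast hour (0-18)
    idx      : if True, return the URL of the .idx metadata file instead
    """
    file_name = "{m}.t{h:02d}z.wrf{f}f{x:02d}.grib2".format(
        m=model, h=run_time.hour, f=field, x=fxx
    )
    url = "{b}/{m}/{f}/{d}/{n}".format(
        b=BASE_URL, m=model, f=field, d=run_time.strftime("%Y%m%d"), n=file_name
    )
    return url + ".idx" if idx else url


if __name__ == "__main__":
    # Reproduces the two example URLs shown above.
    print(build_pando_url(datetime(2018, 1, 1, 0)))
    print(build_pando_url(datetime(2018, 1, 1, 0), idx=True))
```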

The alternative download page may help you better understand the URL structure.

cURL download full file

curl -O https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20180101/hrrr.t00z.wrfsfcf00.grib2

cURL download full file and rename

curl -o hrrr20180101_00zf00.grib2 https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20180101/hrrr.t00z.wrfsfcf00.grib2

wget download full file

wget https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20180101/hrrr.t00z.wrfsfcf00.grib2

If you know the byte range of the variable you want (found from the .idx file), you can retrieve that single variable. The .idx files share the same URL as the grib2, except with .idx appended to the end.

cURL download a single variable

From the .idx file, we see that the byte range for TMP:2 m starts at 34884036 and ends at 36136433.

curl -o 20180101_00zf00_2mTemp.grib2 --range 34884036-36136433 https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20180101/hrrr.t00z.wrfsfcf00.grib2

After inspecting the file, you will see cURL has returned a valid grib2 file with one variable.
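The same partial download can be done without cURL by sending an HTTP Range header. The sketch below uses only the Python 3 standard library; the function names are made up for illustration, and it assumes the server honors Range requests the same way it does for `curl --range`.

```python
import shutil
import urllib.request


def make_range_request(url, start_byte, end_byte):
    """Build a request for bytes start_byte..end_byte (inclusive) of url."""
    headers = {"Range": "bytes={}-{}".format(start_byte, end_byte)}
    return urllib.request.Request(url, headers=headers)


def download_byte_range(url, start_byte, end_byte, out_path):
    """Save the requested byte range of url to out_path and return out_path."""
    request = make_range_request(url, start_byte, end_byte)
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as out_file:
        shutil.copyfileobj(response, out_file)
    return out_path


# Usage (performs a real download, so it is left commented out):
# download_byte_range(
#     "https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20180101/hrrr.t00z.wrfsfcf00.grib2",
#     34884036, 36136433, "20180101_00zf00_2mTemp.grib2",
# )
```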

cURL download several variables

Unfortunately, the curl --range option won't work if you request more than one range. I don't know why this doesn't work, but it appears to be a limitation of the Pando archive. Fortunately, similar variables are usually grouped together, like the U and V wind components, so you can request a single range that spans the variables you want. This example gets TMP, POT, SPFH, DPT, and RH at 2 meters and UGRD, VGRD, and WIND at 10 meters.

curl -o 20180101_00zf00_2mTemp2mDPT10mwind.grib2 --range 34884036-44863087 https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20180101/hrrr.t00z.wrfsfcf00.grib2


I use python/2.7.11. Describing my set-up would be long, so just use these as a template.

Download full grib2 files

Download the full grib2 file from the Pando archive. 'sfc' files are about 150 MB each, and 'prs' files are about 320 MB each.

Download single HRRR variable from file

It is more efficient if you only download the variables you are interested in. While a full sfc file is about 150 MB, a single variable field is about 1 MB. You can perform partial downloads of the grib2 files to extract the variables you want with cURL.

Main function: download_HRRR_variable_from_pando()

Partial downloads with cURL require a known byte range. The grib2.idx files (sfc example, prs example) are metadata text files that contain the beginning byte of each field in the file. Each grib2 file has its own metadata file. To find the byte range for a variable, the main function searches for the line that contains the specified variable abbreviation and level.
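The byte-range search can be sketched as follows. This is not the code of download_HRRR_variable_from_pando(); it is a minimal illustration that assumes each .idx line is colon-separated with the message's beginning byte in the second field, and that a message ends one byte before the next one begins. The sample text is illustrative: only the TMP:2 m byte range comes from the cURL example on this page, and the message numbers are invented.

```python
# Illustrative .idx excerpt. The message numbers (54, 55) are made up;
# the byte offsets match the TMP:2 m range quoted on this page.
SAMPLE_IDX = """\
54:34884036:d=2018010100:TMP:2 m above ground:anl:
55:36136434:d=2018010100:POT:2 m above ground:anl:
"""


def find_byte_range(idx_text, variable):
    """Return (start_byte, end_byte) for the first line matching `variable`.

    variable : abbreviation and level, e.g. 'TMP:2 m'
    end_byte is None when the match is the last message listed, because
    the .idx file only records where each message begins.
    """
    lines = [line for line in idx_text.splitlines() if line.strip()]
    for i, line in enumerate(lines):
        if ":" + variable in line:
            start_byte = int(line.split(":")[1])
            if i + 1 < len(lines):
                # The message ends one byte before the next message starts.
                end_byte = int(lines[i + 1].split(":")[1]) - 1
            else:
                end_byte = None
            return start_byte, end_byte
    raise ValueError("{!r} not found in idx text".format(variable))


if __name__ == "__main__":
    print(find_byte_range(SAMPLE_IDX, "TMP:2 m"))
```

When the end byte is None, an open-ended range (for example `--range 36136434-` with cURL) requests everything to the end of the file.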
Example: Download single variable for single day

Example: Download single variable for date range

Example: Download multiple variables for date range, each variable in its own file.

Example: Multithreading for fast downloads. I tested the time it takes to download using a sequential download (one at a time), multiprocessing, and multithreading. Multithreading is the right way to do this. The download time stops improving after about 8 threads.
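A threaded download loop can be sketched with the Python 3 standard library. This is not the linked multithreading example, just an illustration of the pattern: `download_all` and `fetch` are hypothetical names, and `fetch` stands for whatever function downloads one URL. Threads suit this job because the work is mostly waiting on the network. The default of 8 workers follows the observation above that more threads do not help, and keeping it modest also respects the request not to hammer the archive.

```python
from concurrent.futures import ThreadPoolExecutor


def download_all(urls, fetch, max_workers=8):
    """Call fetch(url) for every URL using a pool of threads.

    Results are returned in the same order as `urls`.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, urls))


if __name__ == "__main__":
    # Stand-in for a real download function so the sketch runs offline.
    def fake_fetch(url):
        return "saved " + url.rsplit("/", 1)[-1]

    urls = [
        "https://pando-rgw01.chpc.utah.edu/hrrr/sfc/20180101/hrrr.t%02dz.wrfsfcf00.grib2" % hour
        for hour in range(3)
    ]
    for result in download_all(urls, fake_fetch):
        print(result)
```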

Example: Downloading adjacent variables.

You may be interested in retrieving more than one variable at a time and keeping them all in one grib2 file. This can be done by requesting a byte range that spans more than one variable. This method works only if the variables you want are adjacent to each other in the grib2 file. For example, if you want the 10 meter U and V wind components, you are in luck because they are adjacent variables. In the main function, set variable to the first field you want to download and set the more_vars argument to the number of additional fields that follow it.

For example: If you want all the variables at 500 mb, set variable='HGT:500 mb' and set more_vars=4, which will download 'HGT:500 mb', 'TMP:500 mb', 'DPT:500 mb', 'UGRD:500 mb', and 'VGRD:500 mb' in the same grib2 file.
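The byte range for a run of adjacent fields follows from the beginning bytes listed in the .idx file: start at the first field and stop one byte before the field that follows the last one you want. The sketch below illustrates that arithmetic. `span_byte_range` is a hypothetical helper, not the main function's implementation, and the offsets in the demo are invented round numbers.

```python
def span_byte_range(start_bytes, first_index, more_vars=0):
    """Byte range covering one message plus the `more_vars` messages after it.

    start_bytes : beginning byte of every message, in file order
                  (the second field of each .idx line)
    first_index : position in start_bytes of the first wanted message
    Returns (start, end); end is None if the span reaches the last message.
    """
    last_index = first_index + more_vars
    if first_index < 0 or last_index >= len(start_bytes):
        raise IndexError("requested span runs past the listed messages")
    start = start_bytes[first_index]
    if last_index + 1 < len(start_bytes):
        # Stop one byte before the message that follows the span.
        end = start_bytes[last_index + 1] - 1
    else:
        end = None
    return start, end


if __name__ == "__main__":
    # Invented offsets for five consecutive messages.
    offsets = [0, 100, 250, 400, 900]
    print(span_byte_range(offsets, 1, more_vars=2))
```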

Plot a single variable with Python and pluck a single point using cURL

This example does two things:

1) Gets a variable from the HRRR file (2-meter temperature) and plots the values on a graph (you can add a basemap yourself).

2) Plucks out a single value at KSLC (Salt Lake City) for a time range and creates a timeseries plot. Creating the time series utilizes multiprocessing to speed up the download process.

Useful HRRR S3 Download functions

If you were patient enough to read all that and have scrolled down here, you're in luck! This is a link to the Python functions I use to download the HRRR data: Code for HRRR_S3 Functions

You can use rclone to copy files from Pando to your own disk.

When you configure rclone, select 's3' as the type of storage and then set everything else blank except for endpoint, which should be `https://pando-rgw01.chpc.utah.edu`.


How the .idx files are created

wgrib2 hrrr.t09z.wrfsfcf17.grib2 -t -var -lev -ftime > hrrr.t09z.wrfsfcf17.grib2.idx

How do I get the latitude/longitude grid?

Latitude and longitude for every HRRR grid point are defined as part of each grib message. The values are not stored for each grid box (that would take a lot of memory), but are calculated by the wgrib2 utility from the stored projection information.

If you are using pygrib, you can get the variable data, latitude, and longitude like this, where grbs is the object returned by pygrib.open(): value, lat, lon = grbs[1].data()

For convenience and some unique applications, I created an HDF5 file that contains just the HRRR latitude and longitude grids. HRRR_latlon.h5

List of missing and incomplete files

Wind Vectors: Grid Relative vs Earth Relative

If you are dealing with a vector quantity, like wind, you need to convert the U and V wind components from grid-relative to earth-relative to correctly orient the wind vectors.

Convert winds Grid-relative to Earth-Relative
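As a rough illustration, the rotation commonly used for winds on a Lambert conformal grid can be written in a few lines of Python. The projection constants below (tangent latitude 38.5 degrees, reference longitude -97.5 degrees) are assumptions for the CONUS HRRR grid and should be checked against the projection information in your own files. They do not apply to HRRR Alaska. The function name is made up for this sketch, which works on one point at a time.

```python
import math

# Assumed Lambert conformal parameters for the CONUS HRRR grid.
# Verify these against the projection metadata in your grib2 file.
LAT_TAN_DEG = 38.5   # latitude where the projection is tangent (assumption)
LON_REF_DEG = -97.5  # longitude aligned with the grid's y-axis (assumption)


def grid_to_earth_relative(u_grid, v_grid, lon_deg):
    """Rotate grid-relative (u, v) at longitude lon_deg to earth-relative."""
    # Wrap the longitude difference into [-180, 180) so that both
    # 0..360 and -180..180 longitude conventions give the same answer.
    delta_lon = ((lon_deg - LON_REF_DEG + 180.0) % 360.0) - 180.0
    rotcon = math.sin(math.radians(LAT_TAN_DEG))
    angle = rotcon * math.radians(delta_lon)
    sin_a, cos_a = math.sin(angle), math.cos(angle)
    u_earth = cos_a * u_grid + sin_a * v_grid
    v_earth = -sin_a * u_grid + cos_a * v_grid
    return u_earth, v_earth


if __name__ == "__main__":
    # At the reference longitude the grid and earth directions coincide,
    # so the components come back unchanged.
    print(grid_to_earth_relative(3.0, 4.0, -97.5))
```

The rotation changes only the direction of the vector, so wind speed computed from U and V is the same before and after the conversion.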