FTP Data Access
URUNME can download data from websites over various file transfer protocols, e.g. FTP (File Transfer Protocol), on a periodic basis for real-time modeling, analysis, and visualization. The following example is adapted from a futurecasting project in which forecasted streamflow data are downloaded daily from the National Water Model (NWM) FTP servers (Figure 1). NWM is a hydrological model that forecasts streamflow across the continental United States and has been in operation since Aug. 16, 2016, on the National Oceanic and Atmospheric Administration (NOAA) supercomputing system. A long-range, 16-member ensemble forecast extending 30 days into the future is produced every day, with 4 ensemble members in each cycle at 00, 06, 12, and 18 UTC. Figure 1 shows the process created in URUNME to download and query the 4 ensemble members produced in the 18 UTC cycle. Each day, a new directory is created under the NWM FTP URL (ftp://ftpprd.ncep.noaa.gov/pub/data/nccf/com/nwm/prod), named after the production date with a ‘nwm.’ prefix, e.g. ‘nwm.20180805’.
A ‘Formula’ function is used to build the remote path and the local directory paths, using the following expressions (Figure 2):
- todate = Now();
- startdate = Sub(todate, TimeSpan(1, 0, 0, 0));
- todatetext = Text(startdate, ‘yyyyMMdd’);
- remotepath = Concatenate(‘/pub/data/nccf/com/nwm/prod/nwm.’, todatetext);
- localpath = Concatenate(‘\\Forecast\\’, todatetext);
- mem1path = Concatenate(localpath, ‘\\long_range_mem1’);
- mem2path = Concatenate(localpath, ‘\\long_range_mem2’);
- mem3path = Concatenate(localpath, ‘\\long_range_mem3’);
- mem4path = Concatenate(localpath, ‘\\long_range_mem4’);
The Now() formula returns the current date and time, and Sub(datetime, timespan) subtracts a time span from it, here one day. Text(datetime, format) converts the resulting date to text using the given format. The Concatenate(text1, text2, …) function is then used to build the remote path (the directory on the FTP server) and the local directories where each ensemble member is stored.
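For readers more familiar with a general-purpose language, the expressions above can be approximated in Python as follows. This is only an illustrative sketch, not URUNME code: the function name build_paths is hypothetical, and it assumes that ‘\\’ in the formula text stands for a single backslash and that TimeSpan(1, 0, 0, 0) means one day, as the text describes.

```python
from datetime import datetime, timedelta

REMOTE_ROOT = "/pub/data/nccf/com/nwm/prod/nwm."
LOCAL_ROOT = "\\Forecast\\"  # assumption: '\\' in the formula is one backslash


def build_paths(now=None):
    """Mirror the Formula expressions Now, Sub, Text, and Concatenate."""
    todate = now if now is not None else datetime.now()  # Now()
    startdate = todate - timedelta(days=1)               # Sub(todate, TimeSpan(1, 0, 0, 0))
    todatetext = startdate.strftime("%Y%m%d")            # Text(startdate, 'yyyyMMdd')
    remotepath = REMOTE_ROOT + todatetext                # Concatenate(...)
    localpath = LOCAL_ROOT + todatetext
    paths = {"remotepath": remotepath, "localpath": localpath}
    # One local directory per ensemble member: mem1path ... mem4path.
    for member in range(1, 5):
        paths[f"mem{member}path"] = f"{localpath}\\long_range_mem{member}"
    return paths


if __name__ == "__main__":
    # A run on 6 Aug 2018 targets the directory produced the day before.
    for name, value in build_paths(datetime(2018, 8, 6, 9, 30)).items():
        print(name, "=", value)
```

Passing the current time as an argument rather than calling datetime.now() inside the function keeps the path logic easy to check for a fixed date.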
The ‘Downloader’ function downloads each ensemble member in netCDF format and stores it in its respective directory, using the parameters shown in Figure 3. A total of 120 netCDF files are downloaded for each ensemble member. The ‘remotepath’ and ‘localpath’ parameters of the ‘Downloader’ function are set to ‘Use Variables’, and the corresponding variables from the ‘Formula’ function named ‘paths’ are used as inputs. Simple masks are used to include or exclude files or folders from the download.
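The idea of mask-based downloading can be sketched in Python with the standard ftplib and fnmatch modules. This is not how the ‘Downloader’ function is implemented; the function names, the wildcard syntax (shell-style patterns), and the file names in the usage lines are assumptions made for illustration. Only the host name comes from the FTP URL given earlier.

```python
import os
from fnmatch import fnmatchcase
from ftplib import FTP

HOST = "ftpprd.ncep.noaa.gov"  # host taken from the NWM FTP URL in the text


def select_files(names, include="*", exclude=None):
    """Keep names that match the include mask and do not match the exclude mask."""
    return [
        name for name in names
        if fnmatchcase(name, include)
        and not (exclude and fnmatchcase(name, exclude))
    ]


def download_directory(remote_dir, local_dir, include="*", exclude=None):
    """Download the files in remote_dir that pass the masks into local_dir."""
    os.makedirs(local_dir, exist_ok=True)
    with FTP(HOST) as ftp:
        ftp.login()  # anonymous login
        ftp.cwd(remote_dir)
        for name in select_files(ftp.nlst(), include, exclude):
            with open(os.path.join(local_dir, name), "wb") as handle:
                ftp.retrbinary(f"RETR {name}", handle.write)


if __name__ == "__main__":
    # Hypothetical file names, used only to show how the masks behave.
    listing = ["member.f006.channel.nc", "member.f006.land.nc", "readme.txt"]
    print(select_files(listing, include="*.nc", exclude="*land*"))
```

Keeping the mask logic in its own function means it can be checked against a plain list of names without opening a network connection.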
Four ‘Read netCDF’ functions, one for each ensemble member, are used to read the downloaded files. The ‘Read netCDF’ function (Figure 4) can read either a single file or an entire directory of files, combining the data from multiple files into a single time series. Flow data for the required streams are extracted from these netCDF files by providing the stream IDs, as defined by NWM.
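The combining step described above can be illustrated with a small Python sketch. It deliberately leaves out the netCDF parsing itself, which would need a third-party library; each file is represented here by a hypothetical (valid_time, {stream_id: flow}) pair standing in for data already read from it. The function name and the stream IDs are likewise invented for the example.

```python
from datetime import datetime


def combine_timeseries(file_records, stream_id):
    """Merge per-file values for one stream into a time-ordered series.

    file_records is an iterable of (valid_time, {stream_id: flow}) pairs,
    one pair per file. Files that lack the stream are skipped.
    """
    series = [
        (valid_time, flows[stream_id])
        for valid_time, flows in file_records
        if stream_id in flows
    ]
    series.sort(key=lambda pair: pair[0])  # order by valid time
    return series


if __name__ == "__main__":
    # Three hypothetical files, listed out of time order.
    records = [
        (datetime(2018, 8, 5, 12), {101: 3.5, 202: 1.0}),
        (datetime(2018, 8, 5, 6), {101: 3.0}),
        (datetime(2018, 8, 5, 18), {202: 2.0}),
    ]
    for when, flow in combine_timeseries(records, 101):
        print(when.isoformat(), flow)
```

Sorting by valid time makes the result independent of the order in which the files happen to be listed in the directory.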
Once the data are read into the temporary database, ‘Formula’ functions are used to calculate the minimum, maximum, and mean (Figure 5) of the four ensemble members, and the results are then stored in the embedded database using the ‘Write Database’ function. A ‘Scheduler’ is used to run this process automatically every day.
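The ensemble statistics and the storage step can be approximated as follows. This sketch assumes the four member series are equally long and aligned in time, and it uses an SQLite table as a stand-in for the embedded database; the function names, table name, and column names are hypothetical and do not reflect URUNME's internal schema.

```python
import sqlite3
from statistics import mean


def ensemble_stats(members):
    """Return (min, max, mean) per time step across aligned member series."""
    return [(min(vals), max(vals), mean(vals)) for vals in zip(*members)]


def write_stats(conn, times, stats):
    """Store one row per time step in a hypothetical 'ensemble_stats' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ensemble_stats "
        "(valid_time TEXT PRIMARY KEY, flow_min REAL, flow_max REAL, flow_mean REAL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO ensemble_stats VALUES (?, ?, ?, ?)",
        [(t, lo, hi, avg) for t, (lo, hi, avg) in zip(times, stats)],
    )
    conn.commit()


if __name__ == "__main__":
    # Four hypothetical member series with two time steps each.
    members = [[1.0, 4.0], [2.0, 4.0], [3.0, 6.0], [6.0, 2.0]]
    stats = ensemble_stats(members)
    conn = sqlite3.connect(":memory:")
    write_stats(conn, ["2018-08-05T18:00", "2018-08-06T00:00"], stats)
    print(conn.execute("SELECT * FROM ensemble_stats").fetchall())
```

Using INSERT OR REPLACE keyed on the valid time means a repeated daily run overwrites earlier values for the same time step instead of duplicating rows.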