Dates, times and timezones can be frustrating, especially when working with environmental time series such as those collected by air and water quality sensors.
Environmental time series data often have a strong diurnal signal and are typically plotted with a time axis displaying local time. However, when data are aggregated into larger collections, it is typical to store data with a universal time axis – UTC.
Problems can arise when parsing and formatting dates and times because R defaults to the system timezone available with Sys.timezone(). Imagine an agency scientist based in Washington, DC, using their laptop to display recent air quality data from Los Angeles while at a conference in Tasmania. The data center processing the data might be in Boulder but the data processing machine might be set to use UTC. Potential timezones (available with OlsonNames()) relevant to this scenario include:
America/New_York
America/Los_Angeles
Australia/Tasmania
America/Denver
UTC
Which timezone should be used to convert a request for data from “2019-08-08” to “2019-08-15” into POSIXct datetimes?
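The ambiguity can be made concrete with base R alone. The sketch below parses the same date string as local midnight in two of the candidate timezones and shows that the resulting instants differ; the variable names are illustrative only.

```r
# Parse the same date string as local midnight in two candidate timezones
la  <- as.POSIXct("2019-08-08", tz = "America/Los_Angeles")
utc <- as.POSIXct("2019-08-08", tz = "UTC")

# Midnight in Los Angeles (daylight time in August) is 7 hours after midnight UTC
offset_hours <- as.numeric(difftime(la, utc, units = "hours"))
print(offset_hours)

# The Los Angeles start of day rendered on a UTC time axis
print(format(la, tz = "UTC", usetz = TRUE))
```

A week-long request interpreted in the wrong timezone would therefore be shifted by several hours at both ends, which matters for data with a strong diurnal signal.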
To enforce specification of timezones and to help with the common user interface need to specify a range of dates or times, the MazamaCoreUtils package provides the following functions:
dateRange() – parses and returns POSIXct start and end dates representing full days in the specified timezone
timeRange() – parses and returns POSIXct start and end times in the specified timezone
parseDatetime() – parses and returns a vector of POSIXct values in the specified timezone
The parseDatetime() function is intended as a timezone-requiring replacement for lubridate::parse_date_time().
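To illustrate what "timezone-requiring" means in practice, here is a minimal base R sketch. The function parse_with_tz() is hypothetical and is not the MazamaCoreUtils implementation; it only shows the general idea of refusing to parse when no valid timezone is supplied, and it assumes dates in "YYYY-mm-dd" form.

```r
# Hypothetical wrapper (not part of MazamaCoreUtils): refuse to parse unless
# the caller supplies a timezone found in OlsonNames().
parse_with_tz <- function(datetime, timezone = NULL) {
  if (is.null(timezone)) {
    stop("'timezone' must be specified")
  }
  if (!timezone %in% OlsonNames()) {
    stop("unknown timezone: ", timezone)
  }
  # Assumes "YYYY-mm-dd" input for this sketch
  as.POSIXct(datetime, tz = timezone, format = "%Y-%m-%d")
}

# An explicit timezone succeeds
start <- parse_with_tz("2019-08-08", timezone = "America/Los_Angeles")
print(format(start, "%Y-%m-%d %H:%M", usetz = TRUE))

# Omitting the timezone is an error rather than a silent default
failed <- inherits(try(parse_with_tz("2019-08-08"), silent = TRUE), "try-error")
print(failed)
```

Failing loudly when the timezone is missing trades a little convenience for the guarantee that the system timezone never slips in unnoticed.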
Enforcing the specification of timezones throughout a body of code is the most robust way to remove timezone-related errors from your software. To help with this type of code review, the package also includes functions for testing whether specific named arguments are used with certain function calls:
lintFunctionArgs_file() – check a single file
lintFunctionArgs_dir() – check an entire directory
To use these functions you must define a set of function:argument rules to be applied such as:
timezoneLintRules <- list(
  "parse_date_time" = "tz",
  "with_tz" = "tzone",
  "now" = "tzone",
  "strftime" = "tz"
)
This is interpreted as:
The parse_date_time() function must use the tz argument explicitly.
The with_tz() function must use the tzone argument explicitly.
While these functions could be used to test for explicit use in any function:argument pair, our concern here is primarily with specification of timezones. The package includes a detailed list of timezoneLintRules to help with this. As an example, here is the result of linting the dateRange.R source file in this package:
> lintFunctionArgs_file("R/dateRange.R", timezoneLintRules)
# A tibble: 7 x 6
  file        line_number column_number function_name   named_args includes_required
  <chr>             <int>         <int> <chr>           <list>     <lgl>
1 dateRange.R         125            29 with_tz         <chr [1]>  TRUE
2 dateRange.R         128            27 with_tz         <chr [1]>  TRUE
3 dateRange.R         141            18 parse_date_time <chr [2]>  TRUE
4 dateRange.R         142            18 parse_date_time <chr [2]>  TRUE
5 dateRange.R         159            18 parse_date_time <chr [2]>  TRUE
6 dateRange.R         176            18 parse_date_time <chr [2]>  TRUE
7 dateRange.R         188            18 now             <chr [1]>  TRUE
The result shows that the dateRange.R source code is consistent in always explicitly specifying a timezone.
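For readers curious how a function:argument check of this kind might work, the following is a rough base R sketch. It is an assumption about the general technique (walking the parsed expression tree), not the package's actual implementation. The helper check_named_args() is hypothetical, reports only the function name and whether the required argument was named, and does not handle calls with empty arguments such as x[, 1].

```r
# Hypothetical helper, not part of MazamaCoreUtils: walk parsed R code and
# report whether each call to a watched function names the required argument.
check_named_args <- function(code, rules) {
  results <- list()

  visit <- function(expr) {
    if (is.call(expr)) {
      # Extract the function name from f(...) or pkg::f(...)
      head <- expr[[1]]
      fn <- if (is.call(head) && identical(head[[1]], as.name("::"))) {
        as.character(head[[3]])
      } else if (is.name(head)) {
        as.character(head)
      } else {
        NA_character_
      }

      if (!is.na(fn) && fn %in% names(rules)) {
        arg_names <- names(as.list(expr)[-1])
        results[[length(results) + 1]] <<- data.frame(
          function_name = fn,
          includes_required = rules[[fn]] %in% arg_names,
          stringsAsFactors = FALSE
        )
      }

      # Recurse into the function position and every argument
      for (part in as.list(expr)) visit(part)
    }
  }

  # The code is only parsed, never evaluated, so lubridate need not be installed
  for (expr in as.list(parse(text = code, keep.source = FALSE))) visit(expr)
  do.call(rbind, results)
}

code <- '
a <- lubridate::parse_date_time("2019-08-08", orders = "ymd", tz = "UTC")
b <- lubridate::now()
'
rules <- list("parse_date_time" = "tz", "now" = "tzone")

lint_result <- check_named_args(code, rules)
print(lint_result)
```

In this toy input the parse_date_time() call names tz and passes, while the bare now() call is flagged because it would fall back to the system timezone.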
Hopefully, this attention to timezones will help your code avoid misunderstandings when it comes to date and time requests.