group_by() for atlas_occurrences() queries (#258)
show_values(all_fields = TRUE) (#266)
read_zip() to reimport downloaded files
group_by() in occurrence queries to allow facet downloads by any variable (#195, #258)
atlas_citation() for improved clarity
galah_filter() to specify a taxon_concept_id rather than galah_identify() (#245)
field without data breaks occurrence downloads (#248)
! and %in% parse correctly (#251)
show_all(lists) no longer truncates results to first 500 rows (#252)
atlas_counts() no longer errors when group_by() is set but record count = 0 (#254)
atlas_species() no longer return different column names to queries that return a result (#255)
galah now supports media downloads for all atlases. The only exceptions are GBIF and France, for whom these APIs are not supported (yet)
dplyr syntax
atlas_species()) now work for Sweden, France, and Spain (#234)
select() now works for species downloads (i.e. via atlas_species(); #185, #227)
filter, group_by etc. not recognising fields (#237)
?taxonomic_searches (#241)
galah_geolocate(type = "radius") added. Supports filtering by point location and radius (in km) (#216)
galah_geolocate() and associated sub-functions for GBIF queries
galah_filter() no longer fails when assertions are specified in galah_filter() (#199)
atlas_species(), particularly for other atlases (#234)
select(), including supporting atlas_species() and adding new group = "taxonomy" option (#218)
collect_media() no longer fails when a thumbnail is missing (#215)
galah_filter() parses apostrophes correctly in value names (#214)
group_by() |> atlas_counts() no longer truncates rows at 30 (#223, #198)
search_values() did not return matched values
show_values() & atlas_counts() return correctly formatted values (#233)
atlas_occurrences() no longer overwrites returned field names with user-supplied ones
galah_apply_profile() now works as expected
show_values() (#235)
collapse() now returns a query object, rather than a query_set, and gains a .expand argument to optionally append a query_set for debugging purposes (#217).
collapse(), compute(), collect()
query_set that lists all APIs that will be pinged (collapse()), send the queries to required APIs (compute()), and return data as a tibble (collect()) (#183).
galah_filter(), galah_select() and related functions are now evaluated lazily; no API calls are made until compute() is called, meaning that earlier programming stages are faster and easier to debug.
galah_filter()
galah_filter() has been upgraded to use a hierarchical parsing architecture suggested by Advanced R. As a result, galah_filter() is faster and evaluates expressions more consistently (#196, #169)
galah_filter() now supports is.na, !, c() & %in% (#196)
galah_config() for better options management (#193)
slice_head() and desc() as masked functions to use in galah atlas_counts() query.
| in galah_filter() (#169)
show_values() errors nicely when API is down (#184)
atlas$region error when loading galah fixed with potions package implementation (#178)
atlas_occurrences(mint_doi = TRUE) (#182)
group_by() sometimes caused an error (#201)
&) in query results (#203)
data_request object when wrapped by a function (#207)
Patch release to fix minor issues on some devel systems on CRAN.
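The hierarchical parsing approach noted above for galah_filter() can be illustrated with a small base R sketch. This is not galah's internal code: the function names (expr_type, parse_filter, parse_call, to_query) and the Solr-like output syntax are assumptions made for illustration only. It shows the general pattern described in Advanced R, where each node of an expression is classified and then handled recursively.

```r
# Classify one node of an R expression (constant, symbol or call)
expr_type <- function(x) {
  if (is.call(x)) {
    "call"
  } else if (is.symbol(x)) {
    "symbol"
  } else if (is.atomic(x) && length(x) == 1) {
    "constant"
  } else {
    stop("Unsupported expression type: ", typeof(x))
  }
}

# Recursively translate one node into a piece of a query string
parse_filter <- function(x) {
  switch(expr_type(x),
    constant = if (is.character(x)) paste0("\"", x, "\"") else as.character(x),
    symbol = as.character(x),
    call = parse_call(x)
  )
}

# Handle a call: parse its arguments first, then combine them by operator
# (the output format here is hypothetical, not what galah sends to an API)
parse_call <- function(x) {
  op <- as.character(x[[1]])
  args <- lapply(as.list(x)[-1], parse_filter)
  switch(op,
    "==" = paste0(args[[1]], ":", args[[2]]),
    "!=" = paste0("-", args[[1]], ":", args[[2]]),
    ">=" = paste0(args[[1]], ":[", args[[2]], " TO *]"),
    "&" = paste0("(", args[[1]], ") AND (", args[[2]], ")"),
    "|" = paste0("(", args[[1]], ") OR (", args[[2]], ")"),
    "(" = args[[1]],
    "!" = paste0("-(", args[[1]], ")"),
    stop("Unsupported operator: ", op)
  )
}

# Entry point: takes a quoted expression
to_query <- function(expr) parse_filter(expr)

to_query(quote(year >= 2020 & genus == "Crinia"))
```

Because every operator is handled in one place and nested expressions are resolved by recursion, supporting a further operator in a design like this only requires one more branch in parse_call().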
Minor release to address CRAN issues. Last release before 2.0.0.
Minor release to resolve issues on CRAN, and a few recent bugs.
tibble as input to search_taxa() (e.g., to resolve homonyms, #168)
galah_select() while atlas = GBIF (which is not supported; #181)
... in galah_filter() (#186)
An experimental feature of version 1.5.1 is the ability to call functions from other packages (#161), as synonyms for galah_ functions. These are:
identify() ({graphics}) as a synonym for galah_identify()
select() ({dplyr}) as a synonym for galah_select()
group_by() ({dplyr}) as a synonym for galah_group_by()
slice_head() ({dplyr}) as a synonym for the limit argument in atlas_counts()
st_crop() ({sf}) as a synonym for galah_polygon()
count() ({dplyr}) as a synonym for atlas_counts()
These are implemented as S3 methods for objects of class data_request, which are created by galah_call(). Hence new function names only work when piped after galah_call().
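The S3 mechanism described above can be sketched in base R without loading galah or dplyr. Everything in this block is hypothetical: the class toy_request stands in for data_request, toy_call() stands in for galah_call(), and the generics are defined locally (in practice they would come from the other package). It is only meant to show why such synonyms dispatch correctly on a request object and fail on anything else.

```r
# Hypothetical request object, loosely analogous to what galah_call() creates
toy_call <- function() {
  structure(list(group_by = NULL), class = "toy_request")
}

# Generics defined locally so the sketch is self-contained
group_by <- function(.data, ...) UseMethod("group_by")
count <- function(x, ...) UseMethod("count")

# Method for the request class: store the unevaluated field names
group_by.toy_request <- function(.data, ...) {
  fields <- vapply(as.list(substitute(list(...)))[-1], deparse, character(1))
  .data$group_by <- fields
  .data
}

# Method for the request class: stands in for running the query
count.toy_request <- function(x, ...) {
  paste0("counts grouped by: ", paste(x$group_by, collapse = ", "))
}

# Works because the pipe starts from an object of the request class
result <- toy_call() |> group_by(year, basisOfRecord) |> count()

# Fails for other inputs, since no method exists for an integer vector
fails_without_request <- inherits(try(count(1:3), silent = TRUE), "try-error")
```

Dispatch depends only on the class of the first argument, which is why a pipeline has to begin with the constructor for the method to be found.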
The Global Biodiversity Information Facility (GBIF) is the umbrella organisation to which all other atlases supply data. Hence it is logical to be able to query GBIF and its “nodes” (i.e. the living atlases) via a common API. Supported functions are:
search_taxa and galah_identify for name matching
show_all(fields) and show_all(assertions)
show_all() calls that give ‘collections’ information are limited to 20 records by default, as GBIF datasets are often huge. search_all() is generally more reliable
show_values() for any GBIF field
galah_filter and galah_group_by (and therefore filter and group_by(), see above), but NOT galah_select.
atlas_counts() (and therefore count(), see above)
atlas_occurrences() & atlas_species(); both are implemented via the ‘downloads’ system, meaning that queries can be larger, but may be slow
The current implementation is experimental and back-end changes are expected in future. Users who require a more stable implementation should use the {rgbif} package.
galah_config() gains a print function, and now uses fuzzy matching for the atlas field to match to region, organisation or acronym (as defined by show_all(atlases)). An example use case is to match to organisations via acronyms, e.g. galah_config(atlas = "ALA").
readr::read_csv in place of utils::read.csv for improved speed
show_all (and associated sub-functions) gain a limit argument, set to NULL (i.e. no limit) by default
galah no longer imports {data.table}, since the only function previously used from that package (rbindlist) is duplicated by dplyr::bind_rows
url_paginate() to handle cases where pagination is needed, but total data length is unknown (e.g. show_all_lists(), #170).
galah_select(group = "assertions") is always enacted properly by atlas_occurrences, and won’t lead to overly long urls (#137). When called without any other field names, recordID is added to avoid triggering the ‘default’ set of columns.
atlas_species works again after some minor changes to the API; but requires a registered email to function
galah_call(), filtered with galah_ functions,
and downloaded with atlas_ functions. Previously, this functionality was only possible with queries to the ALA (#126)
collect_media()
atlas_media() has been improved to use 2 simplified functions to show & download media (#145, #151):
atlas_media() returns a tibble of available media files
collect_media() downloads the list of media from atlas_media() to a local machine
type = "thumbnails" in collect_media() (#140)
galah_geolocate()
galah_geolocate() now supports filtering queries using polygons and bounding boxes. Overall improvements and bug fixes to galah_geolocate() through new internal functions galah_polygon() and galah_bbox() (#125)
show_all(), search_all() & show_values(), search_values()
show_all() and search_all() are flexible look-up functions that can search for all information in {galah}, rather than by separate search_/show_all_ functions (e.g. search_fields(), search_atlases(), show_all_fields(), show_all_reasons(), etc) (#127, #132)
show_values() & search_values() (#131)
galah_apply_profile() function (#130)
galah_ functions (#133)
galah_geolocate() no longer depends on archived {wellknown} package (#141)
galah_filter(species != "") or galah_filter(species == "") (#143)
collect_doi() (#140)
galah_select() no longer adds “basic” group of columns automatically (#128)
galah_config() doesn’t display incorrect preserve = TRUE message (#136)
galah_select() (#137)
atlas_counts() and atlas_occurrences() no longer return different record numbers when a field is empty (#138)
atlas_media() results no longer differ to results returned by galah_filter() & atlas_counts() (#151)
ala_ functions are renamed to use the prefix
atlas_. This change reflects their functionality with international atlases (i.e., atlas_occurrences, atlas_counts, atlas_species, atlas_media, atlas_taxonomy, atlas_citation) (#103)
select_taxa is replaced by 3 functions: galah_identify, search_taxa and search_identifiers. galah_identify is used when building data queries, whereas search_taxa and search_identifiers are now exclusively used to search for taxonomic information. Syntax changes are intended to reflect their usage and expected output (#112, #122)
select_ functions are renamed to use the prefix galah_. Specifically, galah_filter, galah_select and galah_geolocate replace select_filters, select_columns and select_locations. These syntax changes reflect a move towards consistency with dplyr naming and functionality (#101, #108)
find_ functions that provide a listing of all possible values renamed to show_all_ (i.e., show_all_profiles, show_all_ranks, show_all_atlases, show_all_cached_files, show_all_fields, show_all_reasons). find_ functions that require an input and return specific results renamed to search_ (i.e., search_field_values, search_profile_attributes) (#112, #113)
galah_group_by
galah_group_by(), which groups and summarises record counts based on categorical field values, similar to dplyr::group_by() (#90, #95)
galah_down_to
galah_down_to() + atlas_taxonomy(), which uses tidy evaluation like other galah_ functions (#101, #120)
galah_call
|>, %>%) by first using galah_call(), narrowing queries with galah_ functions and finishing queries with an atlas_ function (#60, #120).
galah_filter (#91, #92)
search_taxa returns correct IDs for search terms with parentheses (#96)
search_taxa returns best-fit taxonomic result when ranks are specified in data.frame or tibble (#115)
search_taxonomy() renamed to ala_taxonomy()
ala_taxonomy no longer fails for nodes ranked as informal or unranked (#86)
data.tree package
galah
galah_config()
ala_config() has been renamed to
galah_config() to improve internal consistency (#68)
search_taxonomy()
search_taxonomy() provides a means to search for taxonomic names and check the results are ‘correct’ before proceeding to download data via ala_occurrences(), ala_species() or ala_counts() (e.g., not ambiguous or homonymous) (#64 #75)
search_taxonomy() returns information of author and authority of taxonomic names (#79)
search_taxonomy() consistently orders column names, including in correct taxonomic order by rank (#81)
find_cached_files() lists all user cached files and stored metadata (#57)
clear_cached_files() removes previously cached files and stored metadata (#71)
ala_counts(), ala_occurrences(), ala_media() and ala_species() now have refresh_cache argument to remove previously cached files and replace with the current query (#71)
ala_media() caches media metadata if galah_config(caching = TRUE)
search_fields() allows the user to pass a qid as an argument (#59)
galah_config(run_checks = FALSE). This helps users avoid slowing down data request download speeds when many requests are made in quick succession via galah_filter() or ala_occurrences() (#61, #80)
ala_counts(), select_columns() and search_fields() now use match.arg to approximate strings through fuzzy matching (#66)
select_columns(group = 'assertions') now sends qa = includeall to ALA web service API to return all assertion columns (#48)
ala_occurrences() returns data DOI when ala_occurrences(mint_doi = TRUE) and re-downloads data when called multiple times (#56)
ala_occurrences() no longer converts field names with all-CAPS to camelCase (#62)
ala_config() allows users to specify an international Atlas to download data from (#21)
ala_media() includes the file path to the downloaded media in the returned metadata (#22)
ala_occurrences() contains the search_url used to download records; this takes the user to the website search page (#32)
ala_species() provides a more helpful error if no species are found (#39)
select_taxa() has an optional all_ranks argument to return intermediate rank information (#35)
select_taxa() behaves as expected when character strings of 32 or 36 characters are provided (#23)
ala_occurrences() uses the columns as expected (#30)
galah_filter() negates assertion filters when required, fixing the issue of assertion values being ignored (#27)
select_taxa() no longer throws an error when queries of more than one term have a differing number of columns in the return value (#41)
ala_counts() returns data.frame with consistent column classes when a group_by parameter is called multiple times and ala_config(caching = TRUE) (#47)
ala_ functions fail gracefully if a non-id character string is passed (#49)
ala_media() now takes the same select_ arguments as other ala_ functions (#18)
search_fields now has media as a type argument option
verbose == TRUE (#8)
galah_location auto-detects the type of argument provided and so takes a single argument, query, in place of sf and wkt (#17)
select_taxa auto-detects the type of argument provided and so takes a single argument, query, in place of term and term_type (#16)
ala_counts uses the group_by field name as the returned data.frame column name (#6)
ala_occurrences sends sourceId parameter to ALA (#5)
search_fields provides a more helpful error for invalid types (#11)
First version of galah, built on earlier functionality from the ALA4R package.