nest_distinct() selects only unique/distinct rows in a nested data frame.
Arguments
- .data
A data frame, data frame extension (e.g., a tibble), or a lazy data frame (e.g., from dbplyr or dtplyr).
- .nest_data
A list-column containing data frames
- ...
Optional variables to use when determining uniqueness. If there are multiple rows for a given combination of inputs, only the first row will be preserved. If omitted, will use all variables.
- .keep_all
If TRUE, keep all variables in .nest_data. If a combination of ... is not distinct, this keeps the first row of values.
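The effect of ... and .keep_all can be illustrated with a small sketch. The toy data and the names df, nested, grp, x, and y are made up for illustration, and the sketch assumes nest_distinct() is provided by the nplyr package.

```r
library(dplyr)
library(tidyr)
library(nplyr)  # assumption: the package that provides nest_distinct()

# Hypothetical toy data with duplicated rows inside each group
df <- tibble(
  grp = c("a", "a", "a", "b", "b"),
  x   = c(1, 1, 2, 3, 3),
  y   = c("p", "q", "r", "s", "s")
)
nested <- df %>% nest(data = -grp)

# Distinct on x only: the nested data frames keep just the x column
only_x <- nested %>% nest_distinct(data, x)

# Distinct on x, keeping every column; the first row per x value is retained
all_cols <- nested %>% nest_distinct(data, x, .keep_all = TRUE)

# No variables supplied: all columns are used to determine uniqueness
all_vars <- nested %>% nest_distinct(data)

names(only_x$data[[1]])
names(all_cols$data[[1]])
nrow(all_vars$data[[2]])
```

In group "a", x = 1 appears with both y = "p" and y = "q"; with .keep_all = TRUE only the first of those rows is kept.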
Value
An object of the same type as .data. Each object in the column .nest_data will also be of the same type as the input. Each object in .nest_data has the following properties:
- Rows are a subset of the input but appear in the same order.
- Columns are not modified if ... is empty or .keep_all is TRUE. Otherwise, nest_distinct() first calls dplyr::mutate() to create new columns within each object in .nest_data.
- Groups are not modified.
- Data frame attributes are preserved.
Details
nest_distinct() is largely a wrapper for dplyr::distinct() and maintains the functionality of distinct() within each nested data frame. For more information on distinct(), please refer to the documentation in dplyr.
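Because the function is described as a wrapper around dplyr::distinct(), a roughly equivalent result can be sketched by hand with dplyr::mutate() and purrr::map(). This is only an illustration of the idea, not the package's actual implementation; the toy data and the names df, nested, and by_hand are hypothetical.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Hypothetical toy data: two groups, each with repeated x values
df <- tibble(
  grp = c("a", "a", "a", "b", "b"),
  x   = c(1, 1, 2, 3, 3),
  y   = c("p", "q", "r", "s", "s")
)
nested <- df %>% nest(data = -grp)

# Apply dplyr::distinct() to every data frame in the list-column
by_hand <- nested %>%
  mutate(data = map(data, ~ distinct(.x, x)))

by_hand$data[[1]]  # group "a", distinct on x
by_hand$data[[2]]  # group "b", distinct on x
```

The outer data frame keeps one row per group; only the contents of the list-column change.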
Examples
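library(dplyr)  # provides %>%
library(nplyr)  # assumed source of nest_distinct()
# Nest all gapminder columns except continent into a list-column
gm_nest <- gapminder::gapminder %>% tidyr::nest(country_data = -continent)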
gm_nest %>% nest_distinct(country_data, country)
#> # A tibble: 5 × 2
#>   continent country_data
#>   <fct>     <list>
#> 1 Asia      <tibble [33 × 1]>
#> 2 Europe    <tibble [30 × 1]>
#> 3 Africa    <tibble [52 × 1]>
#> 4 Americas  <tibble [25 × 1]>
#> 5 Oceania   <tibble [2 × 1]>
gm_nest %>% nest_distinct(country_data, country, year)
#> # A tibble: 5 × 2
#>   continent country_data
#>   <fct>     <list>
#> 1 Asia      <tibble [396 × 2]>
#> 2 Europe    <tibble [360 × 2]>
#> 3 Africa    <tibble [624 × 2]>
#> 4 Americas  <tibble [300 × 2]>
#> 5 Oceania   <tibble [24 × 2]>