Separate a character column into multiple columns in a column of nested data frames
Source: R/nest_separate.R
nest_separate.Rd
nest_separate() separates a single character column into multiple columns, using a regular expression or a vector of character positions, within each data frame in a list of nested data frames.
Usage
nest_separate(
  .data,
  .nest_data,
  col,
  into,
  sep = "[^[:alnum:]]+",
  remove = TRUE,
  convert = FALSE,
  extra = "warn",
  fill = "warn",
  ...
)
Arguments
- .data
A data frame, data frame extension (e.g., a tibble), or a lazy data frame (e.g., from dbplyr or dtplyr).
- .nest_data
A list-column containing data frames
- col
Column name or position within each nested data frame. Must be present in all data frames in .nest_data. This is passed to tidyselect::vars_pull().
This argument is passed by expression and supports quasiquotation (you can unquote column names or column positions).
- into
Names of new variables to create, as a character vector. Use NA to omit the variable in the output.
- sep
Separator between columns.
If character, sep is interpreted as a regular expression. The default value is a regular expression that matches any sequence of non-alphanumeric values.
If numeric, sep is interpreted as character positions to split at. Positive values start at 1 at the far-left of the string; negative values start at -1 at the far-right of the string. The length of sep should be one less than the length of into.
- remove
If TRUE, remove the input column from the output data frame.
- convert
If TRUE, will run type.convert() with as.is = TRUE on new columns. This is useful if the component columns are integer, numeric or logical.
NB: this will cause string "NA"s to be converted to NAs.
- extra
If sep is a character vector, this controls what happens when there are too many pieces. There are three valid options:
"warn" (the default): emit a warning and drop extra values.
"drop": drop any extra values without a warning.
"merge": only split at most length(into) times.
- fill
If sep is a character vector, this controls what happens when there are not enough pieces. There are three valid options:
"warn" (the default): emit a warning and fill from the right.
"right": fill with missing values on the right.
"left": fill with missing values on the left.
- ...
Additional arguments passed on to tidyr::separate() methods.
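Because these arguments are forwarded to tidyr::separate(), their effect can be explored on an ordinary (un-nested) data frame first. The following is a small sketch with made-up toy data; the tibbles and column names (codes, messy, typed, and so on) are illustrative assumptions, not part of the package.

```r
library(tidyr)

# Toy data (hypothetical): fixed-width codes and delimiter-separated strings.
codes <- tibble::tibble(id = c("AB2020", "CD2021"))
labels <- tibble::tibble(id = c("Asia-1952", "Europe-1957"))
messy <- tibble::tibble(x = c("a-b", "a-b-c", "a"))
typed <- tibble::tibble(x = c("a-1", "b-2"))

# Character `sep`: interpreted as a regular expression.
by_regex <- separate(labels, id, into = c("continent", "year"), sep = "-")

# Numeric `sep`: split after the 2nd character; one position for two columns.
by_position <- separate(codes, id, into = c("code", "year"), sep = 2)

# Too many pieces: "merge" keeps the remainder in the last column.
# Too few pieces: "right" pads with NA on the right, without a warning.
merged <- separate(messy, x, into = c("first", "second"), sep = "-",
                   extra = "merge", fill = "right")

# "drop" discards extra pieces; "left" pads with NA on the left.
dropped <- separate(messy, x, into = c("first", "second"), sep = "-",
                    extra = "drop", fill = "left")

# `convert = TRUE` runs type.convert() on the new columns.
converted <- separate(typed, x, into = c("letter", "n"), sep = "-",
                      convert = TRUE)
```

The same argument values can then be passed to nest_separate() to apply the split inside every nested data frame.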
Value
An object of the same type as .data. Each object in the column .nest_data will have the specified column split according to the regular expression or the vector of character positions.
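To make the return value concrete, here is a minimal sketch on a hand-built nested tibble. The data and the names nested, records, key, letter and number are hypothetical; only nest_separate() and its arguments come from this page.

```r
library(nplyr)

# A small nested tibble (hypothetical data): one list-column of data frames.
nested <- tibble::tibble(
  group = c("g1", "g2"),
  records = list(
    tibble::tibble(key = c("a-1", "b-2")),
    tibble::tibble(key = "c-3")
  )
)

result <- nest_separate(
  nested,
  records,
  col = key,
  into = c("letter", "number"),
  sep = "-"
)

# The outer object keeps the class of `nested`; the change is inside `records`.
class(result)

# With the default remove = TRUE, `key` is replaced by the new columns.
names(result$records[[1]])
```

Inspecting individual elements of the list-column, as in the last line, is usually the quickest way to confirm that the split happened in every nested data frame.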
Details
nest_separate() is a wrapper for tidyr::separate() and maintains the functionality of separate() within each nested data frame. For more information on separate() please refer to the documentation in 'tidyr'.
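As a rough mental model of "separate() within each nested data frame", the same result can be produced by hand by mapping tidyr::separate() over the list-column. This is only an illustrative sketch with hypothetical data and names; it is not the package's actual implementation, which may differ in validation and error handling.

```r
library(dplyr)

# Hypothetical nested tibble with a list-column named `records`.
nested <- tibble::tibble(
  group = c("g1", "g2"),
  records = list(
    tibble::tibble(key = c("a-1", "b-2")),
    tibble::tibble(key = "c-3")
  )
)

# Apply tidyr::separate() to each nested data frame in turn.
manual <- nested %>%
  mutate(
    records = purrr::map(
      records,
      ~ tidyr::separate(.x, key, into = c("letter", "number"), sep = "-")
    )
  )
```

Calling nest_separate() avoids writing the mutate() and map() boilerplate while keeping the arguments of separate() unchanged.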
See also
Other tidyr verbs: nest_drop_na(), nest_extract(), nest_fill(), nest_replace_na(), nest_unite()
Examples
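library(nplyr)
library(dplyr) # provides the %>% pipe

set.seed(123)
# Build a column to split: continent and year joined by "-"
gm <-
  gapminder::gapminder %>%
  dplyr::mutate(comb = paste(continent, year, sep = "-"))
# Nest every column except continent into country_data
gm_nest <- gm %>% tidyr::nest(country_data = -continent)
# Split comb into two columns inside each nested data frame
gm_nest %>%
  nest_separate(country_data,
                col = comb,
                into = c("var1", "var2"),
                sep = "-")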
#> # A tibble: 5 × 2
#> continent country_data
#> <fct> <list>
#> 1 Asia <tibble [396 × 7]>
#> 2 Europe <tibble [360 × 7]>
#> 3 Africa <tibble [624 × 7]>
#> 4 Americas <tibble [300 × 7]>
#> 5 Oceania <tibble [24 × 7]>