I am working with list-columns of data.frames in {dplyr} 1.1.0, and I would like to know whether it is possible to rename() and mutate() columns inside each data.frame without leaving the pipe when the nested data.frames are grouped rowwise.
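Why do I want to know or do this? As I understand {dplyr} 1.1.0, it recommends rowwise() over the {purrr} map family for operating on list-columns. Below I first show what I did before {dplyr} 1.1.0, and then several attempts targeting {dplyr} 1.1.0 (most of which do not work).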
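{rlang} supports glue strings on the left-hand side (LHS), which is useful when writing custom {dplyr} functions. However, this does not yet seem to be supported on the LHS of {dplyr} functions called inside a rowwise tibble (at least my examples below do not work).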
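For context, the glue-string LHS mentioned above can be sketched in a small custom function. This is only a minimal illustration, assuming {dplyr}/{rlang} versions that support the `"{name}" :=` syntax; the helper name `add_named_col()` is hypothetical and not part of either package.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical helper: adds a column whose name is taken from the string `name`.
# "{name}" on the LHS of `:=` is interpolated with glue syntax by rlang.
add_named_col <- function(df, name) {
  mutate(df, "{name}" := name)
}

out <- add_named_col(head(iris, 3), "a")
names(out)
```

Inside a function the string is an ordinary argument, which is why this case is straightforward; the rowwise case is harder because the name lives in a column.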
For rename I found a way that works using rename_with(), but I cannot figure out how to make it work with mutate.
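I also do not understand most of the error messages I get. They more or less say that I am not using a string on the LHS of :=, but in rowwise mode the column I refer to (new) is actually a character vector of length == 1.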
library(dplyr, quietly = TRUE, warn.conflicts = FALSE)
library(purrr)
myiris <- iris %>%
  nest_by(Species, .key = "mydat") %>%
  ungroup %>%
  mutate(new = letters[1:3])
# our data looks like this
# we want to use the strings in column `new` on the LHS of `rename` and `mutate`
myiris
#> # A tibble: 3 x 3
#> Species mydat new
#> <fct> <list<tbl_df[,4]>> <chr>
#> 1 setosa [50 x 4] a
#> 2 versicolor [50 x 4] b
#> 3 virginica [50 x 4] c
# For reference: under dplyr < 1.0 I did the following:
# rename in pipe
# working
myiris %>%
  mutate(mydat = map2(mydat, new,
                      ~ rename_at(.x, "Sepal.Length", function(z) paste(.y)))) %>%
  pull(mydat)
#> [[1]]
#> # A tibble: 50 x 4
#> a Sepal.Width Petal.Length Petal.Width
#> <dbl> <dbl> <dbl> <dbl>
#> 1 5.1 3.5 1.4 0.2
#> 2 4.9 3 1.4 0.2
#> 3 4.7 3.2 1.3 0.2
#> 4 4.6 3.1 1.5 0.2
#> # ... with 46 more rows
#>
#> [[2]]
#> # A tibble: 50 x 4
#> b Sepal.Width Petal.Length Petal.Width
#> <dbl> <dbl> <dbl> <dbl>
#> 1 7 3.2 4.7 1.4
#> 2 6.4 3.2 4.5 1.5
#> 3 6.9 3.1 4.9 1.5
#> 4 5.5 2.3 4 1.3
#> # ... with 46 more rows
#>
#> [[3]]
#> # A tibble: 50 x 4
#> c Sepal.Width Petal.Length Petal.Width
#> <dbl> <dbl> <dbl> <dbl>
#> 1 6.3 3.3 6 2.5
#> 2 5.8 2.7 5.1 1.9
#> 3 7.1 3 5.9 2.1
#> 4 6.3 2.9 5.6 1.8
#> # ... with 46 more rows
# mutate in pipe
# was never working even under dplyr < 1.0.0
myiris %>%
  mutate(mydat = map2(mydat, new,
                      ~ mutate(.x, eval(.y) := .y))) %>%
  pull(mydat)
#> Error: Problem with `mutate()` input `mydat`.
#> x The LHS of `:=` must be a string or a symbol
#> i Input `mydat` is `map2(mydat, new, ~mutate(.x, `:=`(eval(.y), .y)))`.
# mutate with custom function
# working
mymutate <- function(df, y) {
  mutate(df, !! y := y)
}
myiris %>%
  mutate(mydat = map2(mydat, new,
                      ~ mymutate(.x, .y))) %>%
  pull(mydat)
#> [[1]]
#> # A tibble: 50 x 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width a
#> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 5.1 3.5 1.4 0.2 a
#> 2 4.9 3 1.4 0.2 a
#> 3 4.7 3.2 1.3 0.2 a
#> 4 4.6 3.1 1.5 0.2 a
#> # ... with 46 more rows
#>
#> [[2]]
#> # A tibble: 50 x 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width b
#> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 7 3.2 4.7 1.4 b
#> 2 6.4 3.2 4.5 1.5 b
#> 3 6.9 3.1 4.9 1.5 b
#> 4 5.5 2.3 4 1.3 b
#> # ... with 46 more rows
#>
#> [[3]]
#> # A tibble: 50 x 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width c
#> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 6.3 3.3 6 2.5 c
#> 2 5.8 2.7 5.1 1.9 c
#> 3 7.1 3 5.9 2.1 c
#> 4 6.3 2.9 5.6 1.8 c
#> # ... with 46 more rows
# dplyr > 1.0.0
# objective: `rename()` or `mutate()` in pipe on list-column of data.frames
# while using different column names on LHS coming from another
# column (here `new`)
myiris_row <- myiris %>% rowwise
# rename --------
# not working
myiris_row %>%
  mutate(mydat = list(mydat %>% rename({{new}} := "Sepal.Length")))
#> Error: Problem with `mutate()` input `mydat`.
#> x The LHS of `:=` must be a string or a symbol
#> i Input `mydat` is `list(...)`.
#> i The error occured in row 1.
# not working
myiris_row %>%
  mutate(mydat = list(mydat %>% rename(!! new := "Sepal.Length")))
#> Error: Problem with `mutate()` input `mydat`.
#> x The LHS of `:=` must be a string or a symbol
#> i Input `mydat` is `list(...)`.
#> i The error occured in row 1.
# not working
myiris_row %>%
  mutate(mydat = list(mydat %>% rename(!! sym(new) := "Sepal.Length")))
#> Error: Only strings can be converted to symbols
# not working
myiris_row %>%
  mutate(mydat = list(mydat %>% rename(all_of(new) := "Sepal.Length")))
#> Error: Problem with `mutate()` input `mydat`.
#> x The LHS of `:=` must be a string or a symbol
#> i Input `mydat` is `list(mydat %>% rename(`:=`(all_of(new), "Sepal.Length")))`.
#> i The error occured in row 1.
# working, but only with `rename_with()`
myiris_row %>%
  mutate(mydat = list(mydat %>% rename_with(~ new, "Sepal.Length"))) %>%
  pull(mydat)
#> [[1]]
#> # A tibble: 50 x 4
#> a Sepal.Width Petal.Length Petal.Width
#> <dbl> <dbl> <dbl> <dbl>
#> 1 5.1 3.5 1.4 0.2
#> 2 4.9 3 1.4 0.2
#> 3 4.7 3.2 1.3 0.2
#> 4 4.6 3.1 1.5 0.2
#> # ... with 46 more rows
#>
#> [[2]]
#> # A tibble: 50 x 4
#> b Sepal.Width Petal.Length Petal.Width
#> <dbl> <dbl> <dbl> <dbl>
#> 1 7 3.2 4.7 1.4
#> 2 6.4 3.2 4.5 1.5
#> 3 6.9 3.1 4.9 1.5
#> 4 5.5 2.3 4 1.3
#> # ... with 46 more rows
#>
#> [[3]]
#> # A tibble: 50 x 4
#> c Sepal.Width Petal.Length Petal.Width
#> <dbl> <dbl> <dbl> <dbl>
#> 1 6.3 3.3 6 2.5
#> 2 5.8 2.7 5.1 1.9
#> 3 7.1 3 5.9 2.1
#> 4 6.3 2.9 5.6 1.8
#> # ... with 46 more rows
# mutate ------
# the values of the new column don't matter
# here we just use the same input as the name, to show that RHS evaluation is easier.
# not working
myiris_row %>%
  mutate(mydat = list(mydat %>% mutate(!! new := new)))
#> Error: Problem with `mutate()` input `mydat`.
#> x The LHS of `:=` must be a string or a symbol
#> i Input `mydat` is `list(...)`.
#> i The error occured in row 1.
# not working
myiris_row %>%
  mutate(mydat = list(mydat %>% mutate(!! sym(new) := new)))
#> Error: Only strings can be converted to symbols
# not working
myiris_row %>%
  mutate(mydat = list(mydat %>% mutate(all_of(new) := new)))
#> Error: Problem with `mutate()` input `mydat`.
#> x The LHS of `:=` must be a string or a symbol
#> i Input `mydat` is `list(mydat %>% mutate(`:=`(all_of(new), new)))`.
#> i The error occured in row 1.
# almost working (what's going on in the data[[1]] btw!)
myiris_row %>%
  mutate(mydat = list(mydat %>% mutate("{{new}}" := new))) %>%
  pull(mydat)
#> [[1]]
#> # A tibble: 50 x 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width `promise_fn(3L)`
#> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 5.1 3.5 1.4 0.2 a
#> 2 4.9 3 1.4 0.2 a
#> 3 4.7 3.2 1.3 0.2 a
#> 4 4.6 3.1 1.5 0.2 a
#> # ... with 46 more rows
#>
#> [[2]]
#> # A tibble: 50 x 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width `"b"`
#> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 7 3.2 4.7 1.4 b
#> 2 6.4 3.2 4.5 1.5 b
#> 3 6.9 3.1 4.9 1.5 b
#> 4 5.5 2.3 4 1.3 b
#> # ... with 46 more rows
#>
#> [[3]]
#> # A tibble: 50 x 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width `"c"`
#> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 6.3 3.3 6 2.5 c
#> 2 5.8 2.7 5.1 1.9 c
#> 3 7.1 3 5.9 2.1 c
#> 4 6.3 2.9 5.6 1.8 c
#> # ... with 46 more rows
Created on 2020-12-22 by the reprex package (v0.3.0)
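One possible reading of the errors in the reprex above, offered as a sketch rather than a definitive diagnosis: `!!` is applied by the outermost quoting function when it captures its arguments, which happens before any row of `myiris_row` is visited. At that moment `new` is not the column but whatever `new` means in the calling environment, which in a plain R session is the function `methods::new()`. The snippet uses `rlang::expr()` as a stand-in for the capture step of the outer `mutate()`; that substitution is an assumption made for illustration.

```r
library(rlang)

# Outside a data mask, `new` resolves to the function methods::new(),
# not to the column of the tibble.
is.function(new)

# So sym(new) fails at capture time, before any row is processed.
sym_result <- tryCatch(sym(new), error = function(e) e)
inherits(sym_result, "error")

# expr() applies `!!` immediately, much like the outer mutate() would.
captured <- expr(mutate(mydat, !!new := new))

# The LHS of `:=` now holds a function object, neither a string nor a symbol.
lhs <- captured[[3]][[2]]
is.function(lhs)
```

Under this reading, the message about the LHS of `:=` refers to the function that was inlined too early, not to the length-one character value of the column.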
You can use quote() to protect your !! from the outer call, and then unquote it with a second !! so that it is evaluated in the nested call:
myiris_row %>%
  mutate(mydat = list(mydat %>% mutate(!! quote(!!new) := new))) %>%
  pull(mydat)
#> [[1]]
#> # A tibble: 50 x 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width a
#> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 5.1 3.5 1.4 0.2 a
#> 2 4.9 3 1.4 0.2 a
#> 3 4.7 3.2 1.3 0.2 a
#> 4 4.6 3.1 1.5 0.2 a
#> 5 5 3.6 1.4 0.2 a
#> 6 5.4 3.9 1.7 0.4 a
#> 7 4.6 3.4 1.4 0.3 a
#> 8 5 3.4 1.5 0.2 a
#> 9 4.4 2.9 1.4 0.2 a
#> 10 4.9 3.1 1.5 0.1 a
#> # ... with 40 more rows
#>
#> [[2]]
#> # A tibble: 50 x 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width b
#> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 7 3.2 4.7 1.4 b
#> 2 6.4 3.2 4.5 1.5 b
#> 3 6.9 3.1 4.9 1.5 b
#> 4 5.5 2.3 4 1.3 b
#> 5 6.5 2.8 4.6 1.5 b
#> 6 5.7 2.8 4.5 1.3 b
#> 7 6.3 3.3 4.7 1.6 b
#> 8 4.9 2.4 3.3 1 b
#> 9 6.6 2.9 4.6 1.3 b
#> 10 5.2 2.7 3.9 1.4 b
#> # ... with 40 more rows
#>
#> [[3]]
#> # A tibble: 50 x 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width c
#> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 6.3 3.3 6 2.5 c
#> 2 5.8 2.7 5.1 1.9 c
#> 3 7.1 3 5.9 2.1 c
#> 4 6.3 2.9 5.6 1.8 c
#> 5 6.5 3 5.8 2.2 c
#> 6 7.6 3 6.6 2.1 c
#> 7 4.9 2.5 4.5 1.7 c
#> 8 7.3 2.9 6.3 1.8 c
#> 9 6.7 2.5 5.8 1.8 c
#> 10 7.2 3.6 6.1 2.5 c
#> # ... with 40 more rows
myiris_row %>%
  mutate(mydat = list(mydat %>% rename(!! quote(!!new) := "Sepal.Length"))) %>%
  pull(mydat)
#> [[1]]
#> # A tibble: 50 x 4
#> a Sepal.Width Petal.Length Petal.Width
#> <dbl> <dbl> <dbl> <dbl>
#> 1 5.1 3.5 1.4 0.2
#> 2 4.9 3 1.4 0.2
#> 3 4.7 3.2 1.3 0.2
#> 4 4.6 3.1 1.5 0.2
#> 5 5 3.6 1.4 0.2
#> 6 5.4 3.9 1.7 0.4
#> 7 4.6 3.4 1.4 0.3
#> 8 5 3.4 1.5 0.2
#> 9 4.4 2.9 1.4 0.2
#> 10 4.9 3.1 1.5 0.1
#> # ... with 40 more rows
#>
#> [[2]]
#> # A tibble: 50 x 4
#> b Sepal.Width Petal.Length Petal.Width
#> <dbl> <dbl> <dbl> <dbl>
#> 1 7 3.2 4.7 1.4
#> 2 6.4 3.2 4.5 1.5
#> 3 6.9 3.1 4.9 1.5
#> 4 5.5 2.3 4 1.3
#> 5 6.5 2.8 4.6 1.5
#> 6 5.7 2.8 4.5 1.3
#> 7 6.3 3.3 4.7 1.6
#> 8 4.9 2.4 3.3 1
#> 9 6.6 2.9 4.6 1.3
#> 10 5.2 2.7 3.9 1.4
#> # ... with 40 more rows
#>
#> [[3]]
#> # A tibble: 50 x 4
#> c Sepal.Width Petal.Length Petal.Width
#> <dbl> <dbl> <dbl> <dbl>
#> 1 6.3 3.3 6 2.5
#> 2 5.8 2.7 5.1 1.9
#> 3 7.1 3 5.9 2.1
#> 4 6.3 2.9 5.6 1.8
#> 5 6.5 3 5.8 2.2
#> 6 7.6 3 6.6 2.1
#> 7 4.9 2.5 4.5 1.7
#> 8 7.3 2.9 6.3 1.8
#> 9 6.7 2.5 5.8 1.8
#> 10 7.2 3.6 6.1 2.5
#> # ... with 40 more rows
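To see why the quote() trick works, the two unquoting stages can be inspected without any data. This is a rough sketch: `rlang::expr()` stands in for the argument capture done by each `mutate()`, and `local()` with the hypothetical value "a" stands in for the rowwise data mask. Both substitutions are assumptions made for illustration, not a description of dplyr internals.

```r
library(rlang)

# Stage 1: the outer capture evaluates quote(!!new) and inlines its result,
# so the inner `!!new` survives as unevaluated code.
outer_expr <- expr(list(mydat %>% mutate(!!quote(!!new) := new)))
inner_lhs <- outer_expr[[2]][[3]][[2]][[2]]
identical(inner_lhs, quote(!!new))

# Stage 2: the inner capture runs later, where `new` is the current row's value.
# local() with a hypothetical value "a" stands in for the rowwise data mask here.
resolved <- local({
  new <- "a"
  eval(call("expr", inner_lhs))
})
resolved
```

The outer `!!` is consumed by the first capture and only delivers code; the inner `!!` is consumed by the second capture, by which time `new` refers to a string.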