# R - How to use sum and group_by inside apply?

By : Talal Belal
Date : November 23 2020, 04:01 AM
You can do this very quickly using data.table.
First, create some dummy data:
``````
set.seed(123)
counters <- data.frame(
  A = rep(1:100000, each = 3),
  B = sample(c("02", "Z1"), size = 300000, replace = TRUE),
  G = sample(c(1, NA), size = 300000, replace = TRUE)
)
``````
``````
library(data.table)
setDT(counters)

# Combine B and G into a single key, then count rows per (A, key) pair
counters[, comb := paste0(B, "_", G)]
dcast(counters, A ~ comb, fun.aggregate = length, value.var = "A")

#              A 02_1 02_NA Z1_1 Z1_NA
#      1:      1    0     2    1     0
#      2:      2    1     0    1     1
#      3:      3    0     0    2     1
#      4:      4    1     1    0     1
#      5:      5    0     1    2     0
#     ---
#  99996:  99996    0     1    1     1
#  99997:  99997    0     2    1     0
#  99998:  99998    2     0    1     0
#  99999:  99999    1     0    1     1
# 100000: 100000    0     2    0     1
``````
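To check what the reshaping step does, it can help to run it on a handful of rows first. The following is a minimal sketch on a hypothetical six-row table (`toy` and its values are made up for illustration, not taken from the question), using the same `paste0()` plus `dcast()` calls:

```r
library(data.table)

# Hypothetical toy data: three rows per id in A, same layout as counters
toy <- data.table(
  A = c(1, 1, 1, 2, 2, 2),
  B = c("02", "Z1", "02", "Z1", "Z1", "02"),
  G = c(1, NA, 1, NA, 1, NA)
)

# Build the combined key, then count rows per (A, key) pair
toy[, comb := paste0(B, "_", G)]
wide <- dcast(toy, A ~ comb, fun.aggregate = length, value.var = "A")
print(wide)
```

Each row of `wide` should sum to 3 across the count columns, since every id has three rows and combinations that never occur for an id are filled with 0.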
``````
# Assign custom labels to comb by condition
counters[B == "02" & is.na(G), comb := "V"]
counters[B == "02" & !is.na(G), comb := "X"]
....
``````

## dplyr: How to apply do() on result of group_by?

By : Olumide Okeowo
Date : March 29 2020, 07:55 AM
I'd like to use dplyr to group a table by one column, then apply a function to the set of values in the second column of each group.

Let us define eaten like this:
``````
eaten <- data.frame(person, foods, stringsAsFactors = FALSE)
``````
``````
# %.% is the early dplyr pipe operator, since replaced by %>%
eaten %.% group_by(person) %.% do(function(x) combn(x$foods, m = 2))
``````
``````
[[1]]
     [,1]     [,2]       [,3]
[1,] "apple"  "apple"    "banana"
[2,] "banana" "cucumber" "cucumber"

[[2]]
     [,1]        [,2]        [,3]
[1,] "spaghetti" "spaghetti" "cucumber"
[2,] "cucumber"  "banana"    "banana"
``````
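If the old pipe is not available in your dplyr version, the same per-group result can be sketched in base R with `split()` and `lapply()`. This is a swapped-in technique, not the `do()` call above, and the `person` and `foods` vectors here are an assumption reconstructed from the printed output:

```r
# Reconstructed from the printed output; treat the exact vectors as an assumption
person <- c("Grace", "Grace", "Grace", "Rob", "Rob", "Rob")
foods  <- c("apple", "banana", "cucumber", "spaghetti", "cucumber", "banana")
eaten  <- data.frame(person, foods, stringsAsFactors = FALSE)

# Split the foods column by person, then build every 2-element combination per group
pairs_by_person <- lapply(
  split(eaten$foods, eaten$person),
  function(f) combn(f, m = 2)
)
print(pairs_by_person)
```

Because `split()` names the list elements after the groups, the result is a named list with one 2 x 3 character matrix per person.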
``````
library(gsubfn)
eaten %.% group_by(person) %.% fn$do2(~ combn(.$foods, m = 2))
``````
``````
$Grace
     [,1]     [,2]       [,3]
[1,] "apple"  "apple"    "banana"
[2,] "banana" "cucumber" "cucumber"

$Rob
     [,1]        [,2]        [,3]
[1,] "spaghetti" "spaghetti" "cucumber"
[2,] "cucumber"  "banana"    "banana"
``````

## How to apply self-defined function on the result of group_by

By : user3226449
Date : March 29 2020, 07:55 AM
I'd like to group the data by some column, then replace each NA with the most recent observation. Is there any way to apply a function other than an aggregation function to the result of group_by?

Here's how you would do this with dplyr:
``````
library(dplyr)
library(zoo)  # provides na.locf()

dt %>%
  group_by(A) %>%
  mutate(across(everything(), ~ na.locf(.x, na.rm = FALSE, fromLast = FALSE)))
``````
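To see that the fill really stays inside each group, here is a small sketch on hypothetical data (the `dt` below and its single value column `x` are invented for illustration). It assumes the dplyr and zoo packages are installed:

```r
library(dplyr)
library(zoo)

# Hypothetical toy data: A is the grouping column, x has gaps to fill
dt <- data.frame(
  A = c("a", "a", "a", "b", "b", "b"),
  x = c(1, NA, NA, NA, 5, NA)
)

filled <- dt %>%
  group_by(A) %>%
  mutate(x = na.locf(x, na.rm = FALSE)) %>%  # carry the last observation forward
  ungroup()

print(filled)
```

The first row of group "b" stays NA: there is no earlier observation within that group, and the value 1 from group "a" is not carried across the group boundary.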
``````
# data.table equivalent (dt must be a data.table, e.g. after setDT(dt))
dt[, lapply(.SD, na.locf, na.rm = FALSE, fromLast = FALSE), by = A]
``````
``````
# Or update in place by reference; this assumes A is the first column of dt
dt[, names(dt)[-1] := lapply(.SD, na.locf, na.rm = FALSE, fromLast = FALSE), by = A]
``````

## Trying to use dplyr to group_by and apply scale()

By : Ankit
Date : March 29 2020, 07:55 AM
The problem seems to be in the base scale() function, which is built for matrices: even for a plain vector it returns a one-column matrix with attributes, which does not fit neatly into a data frame column. Try writing your own.
``````
scale_this <- function(x) {
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}
``````
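A quick way to convince yourself that the helper matches `scale()` numerically is to compare the two on a short vector. This is only a sketch; the input `x` is hypothetical:

```r
scale_this <- function(x) {
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}

x <- c(2, 4, 6, NA, 8)  # hypothetical input with one missing value
z <- scale_this(x)

# scale() returns a one-column matrix carrying centre/scale attributes;
# stripping those with as.numeric() leaves the same numbers as the helper
same_values <- all.equal(z, as.numeric(scale(x)))
returns_matrix <- is.matrix(scale(x))

print(same_values)
print(returns_matrix)
```

The helper returns a plain numeric vector of the same length as its input, so it can be used directly inside `mutate()`.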
``````
library("dplyr")

# reproducible sample data
set.seed(123)
n <- 1000
df <- data.frame(stud_ID = sample(LETTERS, size = n, replace = TRUE),
                 behavioral_scale = runif(n, 0, 10),
                 cognitive_scale = runif(n, 1, 20),
                 affective_scale = runif(n, 0, 1))

# standardise each scale within each student
scaled_data <-
  df %>%
  group_by(stud_ID) %>%
  mutate(behavioral_scale_ind = scale_this(behavioral_scale),
         cognitive_scale_ind = scale_this(cognitive_scale),
         affective_scale_ind = scale_this(affective_scale))
``````
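With many columns, writing one `mutate()` argument per scale gets repetitive. One possible shortcut, assuming dplyr 1.0 or later, is `across()` with a `.names` pattern. The smaller data set and the `check` summary below are illustrative only:

```r
library(dplyr)

scale_this <- function(x) (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)

# Hypothetical, smaller version of the sample data
set.seed(123)
n <- 200
df <- data.frame(
  stud_ID = sample(LETTERS[1:4], size = n, replace = TRUE),
  behavioral_scale = runif(n, 0, 10),
  cognitive_scale = runif(n, 1, 20)
)

# Standardise every column ending in "_scale" within each student,
# writing the results to new columns with an "_ind" suffix
scaled <- df %>%
  group_by(stud_ID) %>%
  mutate(across(ends_with("_scale"), scale_this, .names = "{.col}_ind")) %>%
  ungroup()

# Sanity check: per-student mean should be ~0 and sd ~1
check <- scaled %>%
  group_by(stud_ID) %>%
  summarise(m = mean(behavioral_scale_ind), s = sd(behavioral_scale_ind))
print(check)
```

The original columns are kept, so raw and standardised scores can be compared side by side.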
``````
library("data.table")

setDT(df)

cols_to_scale <- c("behavioral_scale", "cognitive_scale", "affective_scale")

df[, lapply(.SD, scale_this), .SDcols = cols_to_scale, keyby = factor(stud_ID)]
``````

## using dplyr::group_by in a function within apply

By : Luke Anetsberger
Date : March 29 2020, 07:55 AM
You should apply over colnames(dat) to get the correct groupings:
``````
library(dplyr)

dat <- mtcars[c(2:4, 11)]

# Count rows per level of column x and express each count as a percentage
grp <- function(x) {
  group_by(dat, !!as.name(x)) %>%
    summarise(n = n()) %>%
    mutate(pc = scales::percent(n / sum(n)))
}

lapply(colnames(dat), grp)
``````

## Apply a custom function after group_by using dplyr in R

By : user3033653
Date : March 29 2020, 07:55 AM
Instead of passing the entire data frame to is.na.contiguous, pass only the column values. It is then simple to apply the function per group, and it stays flexible if you want to run the same check on a different column.
``````
# TRUE if x contains a run of at least `consecutive` NAs
is.na.contiguous <- function(x, consecutive) {
  na.rle <- rle(is.na(x))
  na.rle$values <- na.rle$values & na.rle$lengths >= consecutive
  any(na.rle$values)
}

library(dplyr)
d %>%
  group_by(c) %>%
  filter(!is.na.contiguous(b, 2))

#      a     b     c
#  <dbl> <dbl> <dbl>
#1     1     1     1
#2     2     2     1
#3     3     2     1
#4     7    NA     3
#5     8     2     3
``````
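The data frame `d` is not shown in the answer. As a sketch, the hypothetical input below was chosen so that the filter reproduces the rows printed above; the middle group (c = 2) has two NAs in a row and is dropped as a whole:

```r
library(dplyr)

# TRUE if x contains a run of at least `consecutive` NAs
is.na.contiguous <- function(x, consecutive) {
  na.rle <- rle(is.na(x))
  any(na.rle$values & na.rle$lengths >= consecutive)
}

# Hypothetical input, not taken from the original question
d <- data.frame(
  a = c(1, 2, 3, 4, 5, 6, 7, 8),
  b = c(1, 2, 2, NA, NA, 2, NA, 2),
  c = c(1, 1, 1, 2, 2, 2, 3, 3)
)

kept <- d %>%
  group_by(c) %>%
  filter(!is.na.contiguous(b, 2)) %>%  # one TRUE/FALSE per group keeps or drops it whole
  ungroup()

print(kept)
```

Group 3 survives even though it contains an NA, because a single NA is shorter than the required run of 2.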