R - How to use sum and group_by inside apply?



By : Talal Belal
Date : November 23 2020, 04:01 AM
You can do this very quickly using data.table.
Creating dummy data:
code :
set.seed(123)
counters <- data.frame(A = rep(1:100000, each = 3),
                       B = sample(c("02", "Z1"), size = 300000, replace = TRUE),
                       G = sample(c(1, NA), size = 300000, replace = TRUE))
library(data.table)
setDT(counters)
counters[,comb := paste0(B,"_",G)]
dcast(counters, A ~ comb, fun.aggregate = length, value.var = "A")
             A 02_1 02_NA Z1_1 Z1_NA
     1:      1    0     2    1     0
     2:      2    1     0    1     1
     3:      3    0     0    2     1
     4:      4    1     1    0     1
     5:      5    0     1    2     0
    ---                             
 99996:  99996    0     1    1     1
 99997:  99997    0     2    1     0
 99998:  99998    2     0    1     0
 99999:  99999    1     0    1     1
100000: 100000    0     2    0     1
# For custom column names, relabel comb per condition before calling dcast():
counters[B == "02" & is.na(G), comb := "V"]
counters[B == "02" & !is.na(G), comb := "X"]
....
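
As a cross-check on what the dcast() call counts, the same kind of table can be reproduced on a tiny example with base R's table(). This is only a sketch: the six-row `toy` data frame is made up for illustration and is not the answer's data.

```r
# Hypothetical toy data with the same columns as `counters`
toy <- data.frame(
  A = c(1, 1, 1, 2, 2, 2),
  B = c("02", "Z1", "Z1", "02", "02", "Z1"),
  G = c(1, NA, NA, NA, 1, 1)
)

# paste0() turns a missing G into the literal string "NA"
toy$comb <- paste0(toy$B, "_", toy$G)

# One row per A, one column per B_G combination; each cell is a row count
counts <- table(A = toy$A, comb = toy$comb)
print(counts)
```

Each cell holds the number of rows for that A and B_G pair, which is what fun.aggregate = length produces in the data.table version.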


dplyr: How to apply do() on result of group_by?



By : Olumide Okeowo
Date : March 29 2020, 07:55 AM
I'd like to use dplyr to group a table by one column, then apply a function to the set of values in the second column of each group.
Let us define eaten like this:
code :
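# person and foods as implied by the output below
person <- c("Grace", "Grace", "Grace", "Rob", "Rob", "Rob")
foods <- c("apple", "banana", "cucumber", "spaghetti", "cucumber", "banana")
eaten <- data.frame(person, foods, stringsAsFactors = FALSE)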
eaten %.% group_by(person) %.% do(function(x) combn(x$foods, m = 2))
[[1]]
     [,1]     [,2]       [,3]      
[1,] "apple"  "apple"    "banana"  
[2,] "banana" "cucumber" "cucumber"

[[2]]
     [,1]        [,2]        [,3]      
[1,] "spaghetti" "spaghetti" "cucumber"
[2,] "cucumber"  "banana"    "banana"  
library(gsubfn)
eaten %.% group_by(person) %.% fn$do2(~ combn(.$foods, m = 2))
$Grace
     [,1]     [,2]       [,3]      
[1,] "apple"  "apple"    "banana"  
[2,] "banana" "cucumber" "cucumber"

$Rob
     [,1]        [,2]        [,3]      
[1,] "spaghetti" "spaghetti" "cucumber"
[2,] "cucumber"  "banana"    "banana"  
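
The `%.%` operator used above comes from early dplyr releases and is not available in current ones, where `%>%` took its place. As a version-independent sketch (a base R substitute, not the answer's own method), the same per-person pairs can be built with split() and lapply(). The `person` and `foods` vectors are assumed from the printed output.

```r
# Input vectors assumed from the output shown in the answer
person <- c("Grace", "Grace", "Grace", "Rob", "Rob", "Rob")
foods <- c("apple", "banana", "cucumber", "spaghetti", "cucumber", "banana")
eaten <- data.frame(person, foods, stringsAsFactors = FALSE)

# Split foods by person, then build every pair of foods within each group
pairs_by_person <- lapply(split(eaten$foods, eaten$person), combn, m = 2)
print(pairs_by_person)
```

The result is a list named by person, with one 2-row matrix of food pairs per element.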
How to apply self-defined function on the result of group_by



By : user3226449
Date : March 29 2020, 07:55 AM
I'd like to group the data by some column, then replace each NA with the most recent observation. Is there any way to apply a function other than an aggregation function to the result of group_by?
Here's how you would do this with dplyr, followed by two data.table equivalents:
code :
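library(dplyr); library(zoo)  # zoo provides na.locf()
# dplyr (across() replaces the deprecated mutate_each()/funs()):
dt %>%
   group_by(A) %>%
   mutate(across(everything(), ~ na.locf(.x, na.rm = FALSE, fromLast = FALSE)))
# data.table, returning a new table (dt must be a data.table, e.g. after setDT(dt)):
dt[, lapply(.SD, na.locf, na.rm = FALSE, fromLast = FALSE), by = A]
# data.table, updating in place (assumes A is the first column):
dt[, names(dt)[-1] := lapply(.SD, na.locf, na.rm = FALSE, fromLast = FALSE), A]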
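
To make the "last observation carried forward within each group" behaviour concrete without extra packages, here is a minimal base R sketch. The helper `locf()` and the six-row `dt` are hypothetical stand-ins; `locf()` is meant to mirror what na.locf(x, na.rm = FALSE) does for a simple vector, not to replace zoo.

```r
# Carry the last non-missing value forward; leading NAs stay NA
locf <- function(x) {
  seen <- cumsum(!is.na(x))        # number of non-NA values seen so far
  c(NA, x[!is.na(x)])[seen + 1]    # slot 1 (NA) is used before any value is seen
}

# Hypothetical example data: A is the grouping column
dt <- data.frame(
  A = c("a", "a", "a", "b", "b", "b"),
  B = c(1, NA, NA, NA, 5, NA)
)

# ave() applies locf() within each level of A and keeps the row order
dt$B_filled <- ave(dt$B, dt$A, FUN = locf)
print(dt)
```

The leading NA in group "b" is left alone because the fill never crosses a group boundary.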
Trying to use dplyr to group_by and apply scale()



By : Ankit
Date : March 29 2020, 07:55 AM
The problem seems to be in the base scale() function, which works on matrices and returns a one-column matrix even when given a plain vector. Try writing your own.
code :
scale_this <- function(x){
  (x - mean(x, na.rm=TRUE)) / sd(x, na.rm=TRUE)
}
library("dplyr")

# reproducible sample data
set.seed(123)
n = 1000
df <- data.frame(stud_ID = sample(LETTERS, size=n, replace=TRUE),
                 behavioral_scale = runif(n, 0, 10),
                 cognitive_scale = runif(n, 1, 20),
                 affective_scale = runif(n, 0, 1) )
scaled_data <- 
  df %>%
  group_by(stud_ID) %>%
  mutate(behavioral_scale_ind = scale_this(behavioral_scale),
         cognitive_scale_ind = scale_this(cognitive_scale),
         affective_scale_ind = scale_this(affective_scale))
# Alternatively, with data.table:
library("data.table")

setDT(df)

cols_to_scale <- c("behavioral_scale","cognitive_scale","affective_scale")

df[, lapply(.SD, scale_this), .SDcols = cols_to_scale, keyby = factor(stud_ID)] 
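
The difference between scale() and the hand-written helper can be checked directly. The sketch below uses a made-up `toy` data frame and base R's ave() for the grouping, purely for illustration.

```r
scale_this <- function(x) {
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}

# scale() returns a one-column matrix; scale_this() returns a plain vector
x <- c(2, 4, 6)
print(is.matrix(scale(x)))
print(is.matrix(scale_this(x)))

# Standardise within groups using base R (hypothetical toy data)
toy <- data.frame(id = c("a", "a", "a", "b", "b", "b"),
                  score = c(1, 2, 3, 10, 20, 30))
toy$score_z <- ave(toy$score, toy$id, FUN = scale_this)
print(toy)
```

Both groups end up with the same z-scores even though their raw scales differ, which is the point of scaling per student.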
using dplyr::group_by in a function within apply



By : Luke Anetsberger
Date : March 29 2020, 07:55 AM
You should apply over colnames(dat), so that the function receives column names, to get the correct groupings:
code :
library(dplyr)

dat <- mtcars[c(2:4,11)]

grp <- function(x) {
  group_by(dat,!!as.name(x)) %>%
  summarise(n=n()) %>% 
  mutate(pc=scales::percent(n/sum(n))) %>% 
  arrange(desc(n)) %>% head()
}


lapply(colnames(dat), grp)
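
The key point is that the function receives a column name as a string, not the column's values. A minimal base R sketch of the same idea follows. The helper name `grp_base` is hypothetical, and the percentage is left as a rounded number in place of scales::percent().

```r
dat <- mtcars[c(2:4, 11)]   # columns cyl, disp, hp, carb

# For one column name, count each distinct value and its share of all rows
grp_base <- function(col) {
  counts <- sort(table(dat[[col]]), decreasing = TRUE)
  out <- data.frame(value = names(counts),
                    n = as.vector(counts),
                    pc = round(100 * as.vector(counts) / sum(counts), 1))
  head(out)   # six most frequent values, like the dplyr version
}

result <- lapply(colnames(dat), grp_base)
names(result) <- colnames(dat)
print(result$cyl)
```

Looking the column up with `dat[[col]]` plays the role that `!!as.name(x)` plays inside group_by().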
Apply a custom function after group_by using dplyr in R



By : user3033653
Date : March 29 2020, 07:55 AM
Instead of passing the entire data frame to is.na.contiguous, pass only the column values. That makes it simple to apply per group, and flexible if you want to run the same check on a different column.
code :
is.na.contiguous <- function(x, consecutive) {
   na.rle <- rle(is.na(x))
   na.rle$values <- na.rle$values & na.rle$lengths >= consecutive
   any(na.rle$values)
}

library(dplyr)
d %>%
  group_by(c) %>%
  filter(!is.na.contiguous(b, 2))

#      a     b     c
#  <dbl> <dbl> <dbl>
#1     1     1     1
#2     2     2     1
#3     3     2     1
#4     7    NA     3
#5     8     2     3
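
The answer does not show the input `d`, so the sketch below assumes one that is consistent with the printed result: group 2 has two NAs in a row and is dropped, group 3 has a single NA and is kept. It uses base R's tapply() in place of dplyr, purely to show the per-group logic.

```r
is.na.contiguous <- function(x, consecutive) {
  na.rle <- rle(is.na(x))
  na.rle$values <- na.rle$values & na.rle$lengths >= consecutive
  any(na.rle$values)
}

# Assumed input data (not given in the answer)
d <- data.frame(a = 1:8,
                b = c(1, 2, 2, NA, NA, 3, NA, 2),
                c = c(1, 1, 1, 2, 2, 2, 3, 3))

# One TRUE/FALSE flag per group, named by the group value
bad <- tapply(d$b, d$c, is.na.contiguous, consecutive = 2)

# Keep only the rows whose group is not flagged
kept <- d[!bad[as.character(d$c)], ]
print(kept)
```

Because the helper returns a single logical per group, the whole group is kept or dropped together, just as with filter() after group_by().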