
Cumulative sum based on factor on R



By : Torbjörn Seiger
Date : November 22 2020, 04:01 AM
I have a dataset in which I need to accumulate the values while the factor is 0, and then output the accumulated sum when the factor is != 0.

If you want to stick with the for-loop, you can try this code:
code :
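DF$Result <- NA
prev <- 0  # running total carried over from the previous row
for (i in seq_len(nrow(DF))) {
  DF$Result[i] <- DF$Variable.1[i] + prev
  if (DF$Factor[i] == 1) {
    prev <- 0               # factor hit: restart the accumulation on the next row
  } else {
    prev <- DF$Result[i]    # keep accumulating
  }
}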
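
If the loop is not a requirement, the same result can be sketched without one. This is an alternative to the answer's code, not part of it. It assumes, as the loop does, that the running total restarts on the row after each Factor == 1, and the values in DF below are made up for illustration (only the column names come from the answer).

```r
# Hypothetical example data; only the column names come from the answer above.
DF <- data.frame(Variable.1 = c(1, 2, 3, 4, 5, 6),
                 Factor     = c(0, 0, 1, 0, 1, 0))

# A new group starts on the row after each Factor == 1.
grp <- cumsum(c(0, head(DF$Factor == 1, -1)))

# Cumulative sum within each group.
DF$Result <- ave(DF$Variable.1, grp, FUN = cumsum)
DF$Result
# 1 3 6 4 9 6
```

The grouping vector is built once with cumsum, so the per-row bookkeeping of prev disappears; the sketch assumes DF has at least one row.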


Cumulative frequency by factor



By : Fahim Shahid
Date : March 29 2020, 07:55 AM
I have to find the cumulative frequency, converted to a percentage, of a continuous variable by factor.

I usually use ddply and transform for this type of thing:
code :
library(plyr)     # ddply
library(ggplot2)  # qplot
data <- ddply(data, c('Site', 'Plot'), transform, Gsum = cumsum(G), Gtot = sum(G))
qplot(x = d, y = Gsum/Gtot, facets = Plot ~ Site, geom = 'step', data = data)
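
For readers who prefer not to load plyr, roughly the same columns can be sketched in base R with ave. This is an alternative under assumptions, not the answer's method: the column names Site, Plot and G are taken from the call above, while the values and the extra Gpct column are invented for illustration.

```r
# Hypothetical data; column names follow the ddply call above.
data <- data.frame(Site = c("A", "A", "A", "B", "B"),
                   Plot = c(1, 1, 1, 1, 1),
                   G    = c(1, 2, 1, 3, 1))

# One grouping factor per Site/Plot combination that actually occurs.
grp <- interaction(data$Site, data$Plot, drop = TRUE)

data$Gsum <- ave(data$G, grp, FUN = cumsum)  # running total within group
data$Gtot <- ave(data$G, grp, FUN = sum)     # group total, repeated on each row
data$Gpct <- 100 * data$Gsum / data$Gtot     # cumulative percentage
data$Gpct
# 25 75 100 75 100
```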
Custom cumulative sum with decay factor



By : Grypher Gryphon
Date : March 29 2020, 07:55 AM
Your case is exactly the same as generating an AR(1) process with coefficient 0.5, so you can use the filter function to generate the data. filter also supports higher-order recursion, convolution, or a mixture of them (think of the ARMA model). You may also have a look at convolve for other convolutions. Alternatively, you could compile your code to speed up the loop. In my benchmark, the compiled and uncompiled loops are about 111 and 162 times slower than filter, respectively.
code :
library(compiler)
library(rbenchmark)

CustomCumsum<-function(x,alpha){
out<-x[1]
for(i in 2:length(x))
    out[i] <- out[i-1]*alpha+x[i]
out
}

compiledCustomCumsum<-cmpfun(CustomCumsum)

# stats:: is spelled out so the call still works if dplyr (which masks filter) is loaded
FilterCustomCumsum<-function(x,alpha) as.numeric(stats::filter(x,alpha, method = "recursive"))

x<-rnorm(1000)
# Test whether they are the same
identical(compiledCustomCumsum(x,0.5) , FilterCustomCumsum(x,0.5) )

benchmark(
  CustomCumsum         = CustomCumsum(x, 0.5),
  compiledCustomCumsum = compiledCustomCumsum(x, 0.5),
  FilterCustomCumsum   = FilterCustomCumsum(x, 0.5)
)
#                   test replications elapsed relative user.self sys.self user.child sys.child
# 2 compiledCustomCumsum          100    8.89  111.125      8.78     0.01         NA        NA
# 1         CustomCumsum          100   13.02  162.750     11.84     0.50         NA        NA
# 3   FilterCustomCumsum          100    0.08    1.000      0.08     0.00         NA        NA
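
To make the recursion concrete, here is a tiny worked check of what the recursive filter computes, namely y[i] = alpha * y[i-1] + x[i] with y[1] = x[1]. The three-element input is made up for illustration so the values can be verified by hand.

```r
x     <- c(1, 1, 1)   # hypothetical input, chosen for easy arithmetic
alpha <- 0.5          # decay factor

y <- as.numeric(stats::filter(x, alpha, method = "recursive"))

# By hand: y1 = 1
#          y2 = 0.5 * 1   + 1 = 1.5
#          y3 = 0.5 * 1.5 + 1 = 1.75
y
# 1.00 1.50 1.75
```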
R Cumulative sum by factor with sum 'reset'



By : Geoff Chappell
Date : March 29 2020, 07:55 AM
I'm trying to find the cumulative sum of rainfall by season (DJF, MAM, JJA, SON) and by year (1926-2000), with the sum resetting to zero at the end of each season.

Maybe you can try dplyr:
code :
library(dplyr)
rainfall %>% 
         group_by(season, year) %>%
         mutate(seasonal.cumsum=cumsum(RR))

#          DATE year month season RR yearly.cumsum seasonal.cumsum
#1 19260529 1926     5    MAM  0          2347               0
#2 19260530 1926     5    MAM  0          2347               0
#3 19260531 1926     5    MAM  9          2356               9
#4 19260601 1926     6    JJA  0          2356               0
#5 19260602 1926     6    JJA  3          2359               3
#6 19260603 1926     6    JJA 71          2430              74
#7 19260604 1926     6    JJA  0          2430              74
#8 19260605 1926     6    JJA 48          2478             122
 indx <- rainfall2$year-min(rainfall2$year) + rainfall2$month %in% c(1,2,12)
 indx1 <- cumsum(c(TRUE,diff(indx) <0))
 rainfall2$year2 <- indx1+ (min(rainfall2$year))

 res <-  rainfall2 %>%
                   group_by(season, year2) %>%
                   mutate(seasonal.cumsum=cumsum(RR))

 do.call(rbind,lapply(split(res, res$year2), head,2))
 #       DATE month year season  RR year2 seasonal.cumsum
 #1 19260504     5 1926    MAM  50  1927              50
 #2 19260505     5 1926    MAM  84  1927             134
 #3 19270301     3 1927    MAM  98  1928              98
 #4 19270302     3 1927    MAM 112  1928             210
 #5 19280301     3 1928    MAM  91  1929              91
 #6 19280302     3 1928    MAM  85  1929             176
 #7 19290301     3 1929    MAM  18  1930              18
 #8 19290302     3 1929    MAM 111  1930             129
 indx <- rainfall2$year-min(rainfall2$year) + !rainfall2$month %in% c(1,2,12)
 indx1 <- cumsum(c(TRUE,diff(indx) <0))
 rainfall2$year2 <- indx1+ (min(rainfall2$year)-1)      

 res2 <- rainfall2 %>%
        group_by(season, year2) %>%
        mutate(seasonal.cumsum=cumsum(RR))

  do.call(rbind,lapply(split(res2, res2$year2), head,2))
  #        DATE month year season  RR year2 seasonal.cumsum
  #1 19260504     5 1926    MAM  50  1926              50
  #2 19260505     5 1926    MAM  84  1926             134
  #3 19261201    12 1926    DJF 120  1927             120
  #4 19261202    12 1926    DJF  26  1927             146
  #5 19271201    12 1927    DJF 112  1928             112
  #6 19271202    12 1927    DJF  78  1928             190
  #7 19281201    12 1928    DJF  96  1929              96
  #8 19281202    12 1928    DJF  26  1929             122
 # Step-by-step illustration of how the index is built, using a toy data frame
 set.seed(24)
 df <- data.frame(month=rep(rep(1:12,each=4),3), year=rep(1926:1928, each=12*4))
head(!df$month %in% c(1,2,12), 15)
# [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE
#[13]  TRUE  TRUE  TRUE
df$year-min(df$year)
#[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
#[38] 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#[75] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
#[112] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 indx <- df$year-min(df$year) + !df$month %in% c(1,2,12)
 indx
 #[1] 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 #[38] 1 1 1 1 1 1 1 0 0 0 0 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 #[75] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 1 1 1 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3
 #[112] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 2 2 2 2
  head(diff(indx),55)
  #[1]  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
 #[26]  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 -1  0  0  0  1  0  0
 #[51]  0  0  0  0  0

  head(c(TRUE,diff(indx) <0), 55)
  #[1]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
  #[13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
  #[25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
  #[37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE
  #[49] FALSE FALSE FALSE FALSE FALSE FALSE FALSE

  head(cumsum(c(TRUE,diff(indx) <0)), 55)
  #[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  #[39] 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2

  indx1 <- cumsum(c(TRUE, diff(indx) <0))
  head( indx1+ (min(df$year)),55)
  #[1] 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927
  #[16] 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927
  #[31] 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1928
  #[46] 1928 1928 1928 1928 1928 1928 1928 1928 1928 1928

  indx2 <-  indx1+ (min(df$year))
  split(df, indx2) #to check the results
# Data used in the examples above: rainfall (small sample) and rainfall2 (simulated with sample())
rainfall <- structure(list(DATE = c(19260529L, 19260530L, 19260531L, 19260601L, 
 19260602L, 19260603L, 19260604L, 19260605L), year = c(1926L, 
 1926L, 1926L, 1926L, 1926L, 1926L, 1926L, 1926L), month = c(5L, 
 5L, 5L, 6L, 6L, 6L, 6L, 6L), season = c("MAM", "MAM", "MAM", 
 "JJA", "JJA", "JJA", "JJA", "JJA"), RR = c(0L, 0L, 9L, 0L, 3L, 
 71L, 0L, 48L), yearly.cumsum = c(2347L, 2347L, 2356L, 2356L, 
 2359L, 2430L, 2430L, 2478L), seasonal.cumsum = c(2518L, 2518L, 
 2530L, 2530L, 2530L, 2530L, 2530L, 2534L)), .Names = c("DATE", 
 "year", "month", "season", "RR", "yearly.cumsum", "seasonal.cumsum"
 ), class = "data.frame", row.names = c(NA, -8L))
 DATE= format(seq(as.Date("1926-05-04"), length.out=1200, by='1 day'), '%Y%m%d')
 month <- as.numeric(substr(DATE,5,6))
 year <- as.numeric(substr(DATE,1,4))
 season <- ifelse(month %in% c(12,1,2), 'DJF', 
         ifelse(month %in% 3:5, 'MAM', ifelse(month %in% 6:8, 'JJA','SON')))
 set.seed(25)
 RR <- sample(0:120, 1200, replace=TRUE)

 rainfall2 <- data.frame(DATE, month, year, season, RR, stringsAsFactors=FALSE)
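
A simpler way to keep a winter season together across the year boundary can be sketched as follows. This is a different technique from the index construction above, offered as an assumption-laden alternative: December is assigned to the following year's DJF season, rows are assumed to be in date order, and the data frame rain and the column season.year are hypothetical names with made-up values.

```r
# Hypothetical daily totals spanning a December-to-March boundary.
rain <- data.frame(year  = c(1926, 1926, 1927, 1927, 1927),
                   month = c(12, 12, 1, 2, 3),
                   RR    = c(5, 10, 1, 2, 7))

rain$season <- ifelse(rain$month %in% c(12, 1, 2), "DJF",
               ifelse(rain$month %in% 3:5, "MAM",
               ifelse(rain$month %in% 6:8, "JJA", "SON")))

# December counts towards the winter of the following year.
rain$season.year <- rain$year + (rain$month == 12)

# Cumulative sum within each season of each season year.
grp <- interaction(rain$season, rain$season.year, drop = TRUE)
rain$seasonal.cumsum <- ave(rain$RR, grp, FUN = cumsum)
rain$seasonal.cumsum
# 5 15 16 18 7
```

The running total carries from December 1926 into January and February 1927 and resets when MAM begins.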
Cumulative sum of factor variables



By : nygdjs
Date : March 29 2020, 07:55 AM
One way is to use the matrixStats::rowCummaxs function, but you will need to convert to a matrix first. Judging by your data structure, though, I would recommend working with a matrix instead of a data.frame in the first place.
code :
data1[-1] <- matrixStats::rowCummaxs(as.matrix(data1[-1]))
data1
#   id t1 t2 t3 t4
# 1  1  0  0  0  1
# 2  2  1  1  1  1
# 3  3  0  0  0  1
# 4  4  0  1  1  1
# 5  5  1  1  1  1
# Alternative in base R: cummax across each row
data1[-1] <- t(apply(data1[-1], 1, cummax))
# Alternative with data.table: melt to long, cummax per id, cast back to wide
library(data.table)
dcast(melt(setDT(data1), 
           id = "id"
           )[, value := cummax(value),
             by = id], 
      id ~ variable)

#    id t1 t2 t3 t4
# 1:  1  0  0  0  1
# 2:  2  1  1  1  1
# 3:  3  0  0  0  1
# 4:  4  0  1  1  1
# 5:  5  1  1  1  1
# Alternative with dplyr and tidyr
library(dplyr)
library(tidyr)
data1 %>%
  gather(variable, value, -id) %>%
  group_by(id) %>%
  mutate(value = cummax(value)) %>%
  spread(variable, value)

# Source: local data frame [5 x 5]
# Groups: id [5]
# 
#      id    t1    t2    t3    t4
#   (int) (int) (int) (int) (int)
# 1     1     0     0     0     1
# 2     2     1     1     1     1
# 3     3     0     0     0     1
# 4     4     0     1     1     1
# 5     5     1     1     1     1
# Alternative: running parallel maximum across the columns
data1[-1] <- Reduce(pmax, data1[-1], accumulate = TRUE)
data1
#   id t1 t2 t3 t4
# 1  1  0  0  0  1
# 2  2  1  1  1  1
# 3  3  0  0  0  1
# 4  4  0  1  1  1
# 5  5  1  1  1  1
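
The input data1 is not shown above, so here is a self-contained sketch with a made-up input that is consistent with the printed results. It uses the base R variant; the 0/1 values in data1 are an assumption for illustration only.

```r
# Hypothetical input: one row per id, 0/1 indicators at times t1..t4.
data1 <- data.frame(id = 1:5,
                    t1 = c(0, 1, 0, 0, 1),
                    t2 = c(0, 0, 0, 1, 0),
                    t3 = c(0, 0, 0, 0, 0),
                    t4 = c(1, 0, 1, 0, 0))

# Row-wise cumulative maximum: once a 1 appears, it is carried forward.
data1[-1] <- t(apply(data1[-1], 1, cummax))
data1
```

Each row becomes non-decreasing from left to right, which is what a "once set, stays set" indicator needs.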
How to dynamically create a cumulative overall total based on a non-cumulative categorical column in excel



By : Marisa D.
Date : March 29 2020, 07:55 AM
If my comment is the correct suggestion, then something like this should do it (£580 is in A2, so the first output is in D2):
code :
D2 =A2
D3 =D2+A3-IF(COUNTIF($C$2:C2,C3),INDEX(A:A,MAX(IF($C$2:C2=C3,ROW($A$2:A2)))))