# Cumulative sum based on factor in R

By : Torbjörn Seiger
Date : November 22 2020, 04:01 AM
I have a dataset in which I need to accumulate the value while the factor is 0, and then output the accumulated sum when I reach a row where the factor != 0.

If you want to stick with the for-loop, you can try this code:
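``````
DF$Result <- NA
prev <- 0  # running sum carried over from the previous row
for (i in seq_len(nrow(DF))) {
  DF$Result[i] <- DF$Variable.1[i] + prev
  if (DF$Factor[i] == 1)
    prev <- 0             # reset after a row flagged by the factor
  else
    prev <- DF$Result[i]  # otherwise keep accumulating
}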
``````
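The same reset logic can also be written without an explicit loop. The sketch below is only an illustration, not part of the original answer: the small `DF` is made-up sample data, and `grp` is a hypothetical helper that starts a new group on the row after each `Factor == 1`.

```r
# Made-up sample data (hypothetical; the original dataset is not shown)
DF <- data.frame(Variable.1 = c(1, 2, 3, 4, 5, 6),
                 Factor     = c(0, 0, 1, 0, 1, 0))

# A new group starts on the row *after* each row with Factor == 1
grp <- cumsum(c(0, head(DF$Factor, -1) == 1))

# Cumulative sum within each group
DF$Result <- ave(DF$Variable.1, grp, FUN = cumsum)
DF$Result
```

For this sample the result is 1, 3, 6, 4, 9, 6, which is what the loop produces for the same input.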

## Cumulative frequency by factor

By : Fahim Shahid
Date : March 29 2020, 07:55 AM
I need to find the cumulative frequency, converted to a percentage, of a continuous variable by factor.

I usually use `ddply` and `transform` to do this type of thing:
``````
library(plyr)     # ddply
library(ggplot2)  # qplot
data <- ddply(data, c('Site', 'Plot'), transform, Gsum = cumsum(G), Gtot = sum(G))
qplot(x = d, y = Gsum/Gtot, facets = Plot ~ Site, geom = 'step', data = data)
``````
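If `plyr` is not available, the per-group running total and group total can be sketched in base R with `ave`. The data frame below is invented for illustration (only the column names `Site`, `Plot` and `G` come from the answer), and `Gfrac` is a hypothetical column name.

```r
# Invented sample data; only the column names follow the answer above
data <- data.frame(Site = c("A", "A", "A", "B", "B"),
                   Plot = c(1, 1, 1, 1, 1),
                   G    = c(1, 1, 2, 3, 1))

# One group per Site/Plot combination
grp <- interaction(data$Site, data$Plot, drop = TRUE)

data$Gsum  <- ave(data$G, grp, FUN = cumsum)  # running total within the group
data$Gtot  <- ave(data$G, grp, FUN = sum)     # group total, repeated on every row
data$Gfrac <- data$Gsum / data$Gtot           # cumulative fraction (multiply by 100 for percent)
data
```

Here `Gfrac` comes out as 0.25, 0.5, 1 for site A and 0.75, 1 for site B.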

## Custom cumulative sum with decay factor

By : Grypher Gryphon
Date : March 29 2020, 07:55 AM
Your case is exactly the same as generating an AR(1) process with coefficient 0.5, so you can use the `filter` function to generate the data. `filter` also supports higher-order recursion, convolution, or a mixture of the two (think of an ARMA model). You may also have a look at `convolve` for other convolutions. Alternatively, you could compile your code to speed up the loop. In my benchmark, the compiled and uncompiled loops are about 111 and 162 times slower than `filter`, respectively.
``````
library(compiler)
library(rbenchmark)

# Plain R loop: out[i] = alpha * out[i-1] + x[i]
CustomCumsum <- function(x, alpha) {
  out <- x[1]
  for (i in 2:length(x))
    out[i] <- out[i-1] * alpha + x[i]
  out
}

# Byte-compiled version of the same loop
compiledCustomCumsum <- cmpfun(CustomCumsum)

# Recursive filter from the stats package
FilterCustomCumsum <- function(x, alpha) as.numeric(filter(x, alpha, method = "recursive"))

x <- rnorm(1000)
# Test whether they are the same
identical(compiledCustomCumsum(x, 0.5), FilterCustomCumsum(x, 0.5))

benchmark(
  CustomCumsum = CustomCumsum(x, 0.5),
  compiledCustomCumsum = compiledCustomCumsum(x, 0.5),
  FilterCustomCumsum = FilterCustomCumsum(x, 0.5)
)
``````
``````
                  test replications elapsed relative user.self sys.self user.child sys.child
2 compiledCustomCumsum          100    8.89  111.125      8.78     0.01         NA        NA
1         CustomCumsum          100   13.02  162.750     11.84     0.50         NA        NA
3   FilterCustomCumsum          100    0.08    1.000      0.08     0.00         NA        NA
``````
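To make the recursion concrete, here is a small sketch on a made-up four-element input. It assumes the recurrence `y[i] = alpha * y[i-1] + x[i]` with `y[1] = x[1]`, which is what `CustomCumsum` computes; `by_filter` and `by_reduce` are hypothetical names, and the `Reduce` version is just another way to write the same recursion.

```r
x <- c(1, 2, 3, 4)   # made-up input
alpha <- 0.5

# Recursive filter; the stats:: prefix avoids masking by other packages' filter()
by_filter <- as.numeric(stats::filter(x, alpha, method = "recursive"))

# Same recurrence via Reduce: carry the previous output forward
by_reduce <- Reduce(function(prev, cur) prev * alpha + cur, x, accumulate = TRUE)

by_filter                     # 1, 2.5, 4.25, 6.125
all.equal(by_filter, by_reduce)
```

Comparing with `all.equal` rather than `identical` is the more tolerant choice for floating-point results.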

## R Cumulative sum by factor with sum 'reset'

By : Geoff Chappell
Date : March 29 2020, 07:55 AM
I'm trying to find the cumulative sum of rainfall by season (DJF, MAM, JJA, SON) and by year (1926-2000), with the sum resetting to zero at the end of each season.

Maybe you can try `dplyr`:
``````
library(dplyr)
rainfall %>%
  group_by(season, year) %>%
  mutate(seasonal.cumsum = cumsum(RR))

#          DATE year month season RR yearly.cumsum seasonal.cumsum
#1 19260529 1926     5    MAM  0          2347               0
#2 19260530 1926     5    MAM  0          2347               0
#3 19260531 1926     5    MAM  9          2356               9
#4 19260601 1926     6    JJA  0          2356               0
#5 19260602 1926     6    JJA  3          2359               3
#6 19260603 1926     6    JJA 71          2430              74
#7 19260604 1926     6    JJA  0          2430              74
#8 19260605 1926     6    JJA 48          2478             122
``````

For a DJF season that spans the year boundary (December plus the following January and February), grouping by calendar `year` splits the season in two. The following two variants build a grouping variable `year2` from `rainfall2` (defined at the end of this answer) so that each DJF run stays in one group. They differ only in the `year2` labels: the second labels every season other than DJF with its calendar year.
``````
indx <- rainfall2$year - min(rainfall2$year) + rainfall2$month %in% c(1, 2, 12)
indx1 <- cumsum(c(TRUE, diff(indx) < 0))
rainfall2$year2 <- indx1 + min(rainfall2$year)

res <- rainfall2 %>%
  group_by(season, year2) %>%
  mutate(seasonal.cumsum = cumsum(RR))

#       DATE month year season  RR year2 seasonal.cumsum
#1 19260504     5 1926    MAM  50  1927              50
#2 19260505     5 1926    MAM  84  1927             134
#3 19270301     3 1927    MAM  98  1928              98
#4 19270302     3 1927    MAM 112  1928             210
#5 19280301     3 1928    MAM  91  1929              91
#6 19280302     3 1928    MAM  85  1929             176
#7 19290301     3 1929    MAM  18  1930              18
#8 19290302     3 1929    MAM 111  1930             129
``````
``````
indx <- rainfall2$year - min(rainfall2$year) + !rainfall2$month %in% c(1, 2, 12)
indx1 <- cumsum(c(TRUE, diff(indx) < 0))
rainfall2$year2 <- indx1 + (min(rainfall2$year) - 1)

res2 <- rainfall2 %>%
  group_by(season, year2) %>%
  mutate(seasonal.cumsum = cumsum(RR))

#        DATE month year season  RR year2 seasonal.cumsum
#1 19260504     5 1926    MAM  50  1926              50
#2 19260505     5 1926    MAM  84  1926             134
#3 19261201    12 1926    DJF 120  1927             120
#4 19261202    12 1926    DJF  26  1927             146
#5 19271201    12 1927    DJF 112  1928             112
#6 19271202    12 1927    DJF  78  1928             190
#7 19281201    12 1928    DJF  96  1929              96
#8 19281202    12 1928    DJF  26  1929             122
``````

To see how the index is built, here is the same calculation step by step on a small example data frame:
``````
set.seed(24)
df <- data.frame(month = rep(rep(1:12, each = 4), 3), year = rep(1926:1928, each = 12 * 4))
``````
``````
head(!df$month %in% c(1, 2, 12), 15)
# [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE
#[13]  TRUE  TRUE  TRUE
``````
``````
df$year - min(df$year)
#[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
#[38] 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#[75] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
#[112] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
``````
``````
indx <- df$year - min(df$year) + !df$month %in% c(1, 2, 12)
indx
#[1] 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#[38] 1 1 1 1 1 1 1 0 0 0 0 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
#[75] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 1 1 1 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3
#[112] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 2 2 2 2
``````
``````
head(diff(indx), 55)
#[1]  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
#[26]  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 -1  0  0  0  1  0  0
#[51]  0  0  0  0  0

head(c(TRUE, diff(indx) < 0), 55)
#[1]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
#[13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
#[25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
#[37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE
#[49] FALSE FALSE FALSE FALSE FALSE FALSE FALSE

head(cumsum(c(TRUE, diff(indx) < 0)), 55)
#[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#[39] 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2

indx1 <- cumsum(c(TRUE, diff(indx) < 0))
``````
``````
head(indx1 + min(df$year), 55)
#[1] 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927
#[16] 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927
#[31] 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1928
#[46] 1928 1928 1928 1928 1928 1928 1928 1928 1928 1928

indx2 <- indx1 + min(df$year)
split(df, indx2) #to check the results
``````

Data used in the examples above:
``````
rainfall <- structure(list(DATE = c(19260529L, 19260530L, 19260531L, 19260601L,
19260602L, 19260603L, 19260604L, 19260605L), year = c(1926L,
1926L, 1926L, 1926L, 1926L, 1926L, 1926L, 1926L), month = c(5L,
5L, 5L, 6L, 6L, 6L, 6L, 6L), season = c("MAM", "MAM", "MAM",
"JJA", "JJA", "JJA", "JJA", "JJA"), RR = c(0L, 0L, 9L, 0L, 3L,
71L, 0L, 48L), yearly.cumsum = c(2347L, 2347L, 2356L, 2356L,
2359L, 2430L, 2430L, 2478L), seasonal.cumsum = c(2518L, 2518L,
2530L, 2530L, 2530L, 2530L, 2530L, 2534L)), .Names = c("DATE",
"year", "month", "season", "RR", "yearly.cumsum", "seasonal.cumsum"
), class = "data.frame", row.names = c(NA, -8L))
``````
``````
DATE <- format(seq(as.Date("1926-05-04"), length.out = 1200, by = '1 day'), '%Y%m%d')
month <- as.numeric(substr(DATE,5,6))
year <- as.numeric(substr(DATE,1,4))
season <- ifelse(month %in% c(12,1,2), 'DJF',
ifelse(month %in% 3:5, 'MAM', ifelse(month %in% 6:8, 'JJA','SON')))
set.seed(25)
RR <- sample(0:120, 1200, replace=TRUE)

rainfall2 <- data.frame(DATE, month, year, season, RR, stringsAsFactors=FALSE)
``````
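
As a simpler alternative to the `diff`/`cumsum` index, a season-year label can be computed directly by counting December with the following year. This is a swapped-in technique, not the answer's method, shown on made-up records; `season_year` and `seasonal_cumsum` are hypothetical names, and base R `ave` stands in for the `dplyr` grouping.

```r
# Made-up daily records (hypothetical values)
dates <- as.Date(c("1926-11-30", "1926-12-01", "1927-01-15",
                   "1927-02-28", "1927-03-01", "1927-12-05"))
RR    <- c(5, 10, 20, 30, 7, 4)

month <- as.integer(format(dates, "%m"))
year  <- as.integer(format(dates, "%Y"))
season <- ifelse(month %in% c(12, 1, 2), "DJF",
          ifelse(month %in% 3:5, "MAM",
          ifelse(month %in% 6:8, "JJA", "SON")))

# December is counted with the following year, so Dec-Jan-Feb share one label
season_year <- year + (month == 12L)

# Cumulative rainfall within each (season, season_year) group
seasonal_cumsum <- ave(RR, season, season_year, FUN = cumsum)
data.frame(dates, season, season_year, RR, seasonal_cumsum)
```

For these records the DJF rows of winter 1926/27 accumulate to 10, 30, 60 under the label 1927, and the sum starts again at 7 for MAM 1927 and at 4 for the next December.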

## Cumulative sum of factor variables

By : nygdjs
Date : March 29 2020, 07:55 AM
One way is to use the `matrixStats::rowCummaxs` function, but you will need to convert to a matrix first. Judging by your data structure, though, I would recommend working with a matrix instead of a data.frame in the first place.
``````
data1[-1] <- matrixStats::rowCummaxs(as.matrix(data1[-1]))
data1
#   id t1 t2 t3 t4
# 1  1  0  0  0  1
# 2  2  1  1  1  1
# 3  3  0  0  0  1
# 4  4  0  1  1  1
# 5  5  1  1  1  1
``````
``````
data1[-1] <- t(apply(data1[-1], 1, cummax))
``````
``````
library(data.table)
dcast(melt(setDT(data1), id = "id")[, value := cummax(value), by = id],
      id ~ variable)

#    id t1 t2 t3 t4
# 1:  1  0  0  0  1
# 2:  2  1  1  1  1
# 3:  3  0  0  0  1
# 4:  4  0  1  1  1
# 5:  5  1  1  1  1
``````
``````
library(dplyr)
library(tidyr)
data1 %>%
  gather(variable, value, -id) %>%
  group_by(id) %>%
  mutate(value = cummax(value)) %>%
  spread(variable, value)  # back to wide format, as in the output below

# Source: local data frame [5 x 5]
# Groups: id [5]
#
#      id    t1    t2    t3    t4
#   (int) (int) (int) (int) (int)
# 1     1     0     0     0     1
# 2     2     1     1     1     1
# 3     3     0     0     0     1
# 4     4     0     1     1     1
# 5     5     1     1     1     1
``````
``````
data1[-1] <- Reduce(pmax, data1[-1], accumulate = TRUE)
data1
#   id t1 t2 t3 t4
# 1  1  0  0  0  1
# 2  2  1  1  1  1
# 3  3  0  0  0  1
# 4  4  0  1  1  1
# 5  5  1  1  1  1
``````
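
Since the input `data1` is not shown, here is a small self-contained sketch with an invented 0/1 table that is consistent with the outputs above. It compares the two base R variants; `by_apply` and `by_reduce` are hypothetical names.

```r
# Invented input (hypothetical); each row turns to 1 from its first 1 onwards
data1 <- data.frame(id = 1:5,
                    t1 = c(0, 1, 0, 0, 1),
                    t2 = c(0, 0, 0, 1, 0),
                    t3 = c(0, 0, 0, 0, 1),
                    t4 = c(1, 0, 1, 0, 0))

# Row-wise cumulative maximum with apply (transpose back to rows x columns)
by_apply <- data1
by_apply[-1] <- t(apply(data1[-1], 1, cummax))

# Same result column by column: each column is the pmax of itself and all earlier ones
by_reduce <- data1
by_reduce[-1] <- Reduce(pmax, data1[-1], accumulate = TRUE)

by_reduce
all(as.matrix(by_apply[-1]) == as.matrix(by_reduce[-1]))
```

The `Reduce` variant works on whole columns at once, so it avoids the matrix round trip that `apply` needs.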

## How to dynamically create a cumulative overall total based on a non-cumulative categorical column in Excel

By : Marisa D.
Date : March 29 2020, 07:55 AM
If my comment is the correct suggestion, then something like this should do it (£580 is in A2, so the first output goes in D2):
``````
D2 =A2
D3 =D2+A3-IF(COUNTIF($C$2:C2,C3),INDEX(A:A,MAX(IF($C$2:C2=C3,ROW($A$2:A2)))))
``````
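
The logic behind the formula can be sketched outside Excel as well. The R function below is a hypothetical illustration (the name `running_latest_total`, the category labels, and every sample value other than 580 are made up): each new row adds its amount to the running total and subtracts the most recent earlier amount for the same category, so the total always equals the sum of the latest amount per category.

```r
running_latest_total <- function(amount, category) {
  latest <- numeric(0)                # named vector: latest amount per category
  out <- numeric(length(amount))
  for (i in seq_along(amount)) {
    latest[category[i]] <- amount[i]  # overwrite any earlier amount for this category
    out[i] <- sum(latest)             # total of the latest amount in every category
  }
  out
}

amount   <- c(580, 200, 600, 150)                 # column A in the formula
category <- c("rent", "food", "rent", "food")     # column C in the formula
running_latest_total(amount, category)
```

For this input the totals are 580, 780, 800, 750, which is what the formula gives row by row (for example 780 + 600 - 580 = 800 in the third row).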