r - Selecting a subset of rows where a % of the values meet the threshold


I have a data frame with values in rows and samples in columns (two groups, a and b). An example df:

    df <- rbind(rep(1, times = 10),
                c(rep(1, times = 9), 2),
                c(rep(1, times = 8), rep(2, times = 2)),
                c(rep(1, times = 7), rep(2, times = 3)),
                rep(1, times = 10),
                c(rep(1, times = 9), 2),
                c(rep(1, times = 8), rep(2, times = 2)),
                c(rep(2, times = 7), rep(1, times = 3)))
    colnames(df) <- c("a1", "a2", "a3", "a4", "a5",
                      "b1", "b2", "b3", "b4", "b5")
    row.names(df) <- 1:8

I have been selecting the subset of rows where all samples are below a threshold using the following:

    selected <- apply(df, MARGIN = 1, function(x) all(x < 1.5))
    df.sel <- df[selected,]

The result of this is

df[c(1,5),] 

I require two further types of selection. The first is to select, for example, the rows where at least 90% of the samples are below the threshold value of 1.5. The result of this should be:

df[c(1,2,5,6),]

The second is to select by group: say, the rows where at least 50% of the values in at least one of the groups are > 1.5. This should give me the following df:

df[c(4,8),] 

I am new to Stack Overflow and have been asked in the past to put up an example. I hope this is good!

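    # Rows where every value is below 1.5 (no value >= 1.5)
    df[!rowSums(df >= 1.5),]
    ##   a1 a2 a3 a4 a5 b1 b2 b3 b4 b5
    ## 1  1  1  1  1  1  1  1  1  1  1
    ## 5  1  1  1  1  1  1  1  1  1  1

    # Rows where at least 90% of the values are below 1.5
    df[rowMeans(df < 1.5) >= 0.9,]
    ##   a1 a2 a3 a4 a5 b1 b2 b3 b4 b5
    ## 1  1  1  1  1  1  1  1  1  1  1
    ## 2  1  1  1  1  1  1  1  1  1  2
    ## 5  1  1  1  1  1  1  1  1  1  1
    ## 6  1  1  1  1  1  1  1  1  1  2

    # Rows where, in at least one group, more than half of the values are > 1.5.
    # Groups come from the column names with the digits stripped ("a", "b").
    # With 5 samples per group, "> 0.5" selects the same rows as "at least 50%".
    idx <- apply(df, 1, function(x) {
        any(tapply(x, gsub("[0-9]", "", names(x)), function(y) mean(y > 1.5)) > 0.5)
    })

    df[idx,]
    ##   a1 a2 a3 a4 a5 b1 b2 b3 b4 b5
    ## 4  1  1  1  1  1  1  1  2  2  2
    ## 8  2  2  2  2  2  2  2  1  1  1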
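If the cutoff and the percentage need to vary, the three selections above could be wrapped in small helpers. This is only a sketch: `rows_below` and `rows_above_in_any_group` are hypothetical names that are not part of base R or of the original answer, and the grouping assumes the column names are a group letter followed by digits, as in the example.

```r
# Rebuild the example matrix from the question.
df <- rbind(rep(1, 10),
            c(rep(1, 9), 2),
            c(rep(1, 8), rep(2, 2)),
            c(rep(1, 7), rep(2, 3)),
            rep(1, 10),
            c(rep(1, 9), 2),
            c(rep(1, 8), rep(2, 2)),
            c(rep(2, 7), rep(1, 3)))
colnames(df) <- c(paste0("a", 1:5), paste0("b", 1:5))
rownames(df) <- 1:8

# Hypothetical helper: TRUE for rows where at least `prop` of the
# values are below `cutoff`.
rows_below <- function(m, cutoff, prop) {
  rowMeans(m < cutoff) >= prop
}

# Hypothetical helper: TRUE for rows where, in at least one column
# group, at least `prop` of the values are above `cutoff`.
rows_above_in_any_group <- function(m, cutoff, prop, groups) {
  # Split the column indices by group, then compute one column of
  # per-row fractions for each group.
  per_group <- lapply(split(seq_len(ncol(m)), groups), function(cols) {
    rowMeans(m[, cols, drop = FALSE] > cutoff)
  })
  frac <- do.call(cbind, per_group)
  rowSums(frac >= prop) > 0
}

# Assumption: the group label is the column name with digits removed.
groups <- gsub("[0-9]", "", colnames(df))

sel_all   <- rows_below(df, 1.5, 1)      # every value below 1.5
sel_90    <- rows_below(df, 1.5, 0.9)    # at least 90% below 1.5
sel_group <- rows_above_in_any_group(df, 1.5, 0.5, groups)

df[sel_group, ]
```

Unlike the `apply`/`tapply` version, the group helper compares with `>=`, which matches "at least 50%" literally even when a group has an even number of samples. Each helper returns a logical vector, so the results can be combined with `&` and `|` before subsetting.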
