R - Selecting a subset of rows where a % of the values meet the threshold
I have a data frame with values in rows and samples in columns (two groups, a and b). Example df:
df <- rbind(rep(1, times = 10),
            c(rep(1, times = 9), 2),
            c(rep(1, times = 8), rep(2, times = 2)),
            c(rep(1, times = 7), rep(2, times = 3)),
            rep(1, times = 10),
            c(rep(1, times = 9), 2),
            c(rep(1, times = 8), rep(2, times = 2)),
            c(rep(2, times = 7), rep(1, times = 3)))
colnames(df) <- c("a1", "a2", "a3", "a4", "a5", "b1", "b2", "b3", "b4", "b5")
row.names(df) <- 1:8
I have been selecting the subset of rows where all samples are below a threshold using the following:
selected <- apply(df, MARGIN = 1, function(x) all(x < 1.5))
df.sel <- df[selected,]
The result of this is:
df[c(1,5),]
I need two further types of selection. The first is to select, for example, rows where at least 90% of the samples are below the threshold value of 1.5. The result of this should be:
df[c(1,2,5,6),]
The second is to select by group. Say, rows where at least 50% of the values in at least one of the groups are > 1.5. This should give me the following df:
df[c(4,8),]
I am new to Stack Overflow and have been asked in the past to include an example. I hope this one is good!
df[!rowSums(df >= 1.5),]
##   a1 a2 a3 a4 a5 b1 b2 b3 b4 b5
## 1  1  1  1  1  1  1  1  1  1  1
## 5  1  1  1  1  1  1  1  1  1  1

df[rowMeans(df < 1.5) >= 0.9,]
##   a1 a2 a3 a4 a5 b1 b2 b3 b4 b5
## 1  1  1  1  1  1  1  1  1  1  1
## 2  1  1  1  1  1  1  1  1  1  2
## 5  1  1  1  1  1  1  1  1  1  1
## 6  1  1  1  1  1  1  1  1  1  2

idx <- apply(df, 1, function(x) {
  any(tapply(x, gsub("[0-9]", "", names(x)), function(y) mean(y > 1.5)) >= 0.5)
})
df[idx,]
##   a1 a2 a3 a4 a5 b1 b2 b3 b4 b5
## 4  1  1  1  1  1  1  1  2  2  2
## 8  2  2  2  2  2  2  2  1  1  1
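The group-wise selection can also be written without looping over rows. The following is only a sketch under a few assumptions: the column names follow the "<group><number>" pattern of the example, the matrix has more than one row, and `frac_above` is a hypothetical helper name, not something from base R. It computes, per group, the fraction of values above the cutoff in each row, then keeps rows where any group reaches 50%.

```r
# Rebuild the example matrix from the question.
df <- rbind(rep(1, 10),
            c(rep(1, 9), 2),
            c(rep(1, 8), rep(2, 2)),
            c(rep(1, 7), rep(2, 3)),
            rep(1, 10),
            c(rep(1, 9), 2),
            c(rep(1, 8), rep(2, 2)),
            c(rep(2, 7), rep(1, 3)))
colnames(df) <- c(paste0("a", 1:5), paste0("b", 1:5))
rownames(df) <- 1:8

# Group label per column: the column name with digits removed
# (assumption: names look like "a1", "b3", ...).
grp <- gsub("[0-9]", "", colnames(df))

# Hypothetical helper: returns a rows-by-groups matrix holding, for each
# row, the fraction of that group's values that exceed `cutoff`.
frac_above <- function(m, groups, cutoff) {
  sapply(split(seq_len(ncol(m)), groups),
         function(cols) rowMeans(m[, cols, drop = FALSE] > cutoff))
}

props  <- frac_above(df, grp, 1.5)
keep   <- rowSums(props >= 0.5) > 0   # at least one group reaches 50%
df.grp <- df[keep, , drop = FALSE]
rownames(df.grp)
## [1] "4" "8"
```

Keeping `props` as a separate matrix makes it easy to inspect the per-group fractions or to try a different percentage without recomputing them.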