Friday, June 21, 2013

Count frequency of factor column

Name: frqTable
Scenario: You want to export frequency counts to Excel, or fit a long-tail distribution

Code:
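# Build a frequency table for a factor column, sorted by count.
# aIsDecs: TRUE sorts from most to least frequent.
frqTable <- function(aFactorCol, aColName, aIsDecs = TRUE) {
  ret <- as.data.frame(table(aFactorCol))
  colnames(ret) <- c(aColName, "frq")
  ret <- ret[order(ret$frq, decreasing = aIsDecs), ]
  rownames(ret) <- NULL  # renumber rows after sorting
  return(ret)
}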
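
Usage sketch: the example below repeats frqTable (with a logical default for aIsDecs) so that it runs on its own, then writes the result to a CSV file that Excel can open. The colours vector and the temporary file name are made up for illustration.

```r
# Copy of frqTable so that this example is self-contained
frqTable <- function(aFactorCol, aColName, aIsDecs = TRUE) {
  ret <- as.data.frame(table(aFactorCol))
  colnames(ret) <- c(aColName, "frq")
  ret <- ret[order(ret$frq, decreasing = aIsDecs), ]
  rownames(ret) <- NULL
  ret
}

# Hypothetical sample data
colours <- factor(c("red", "blue", "red", "green", "red", "blue"))

frq <- frqTable(colours, "colour")
print(frq)
#   colour frq
# 1    red   3
# 2   blue   2
# 3  green   1

# Write to a temporary CSV file; replace with a real path as needed
outFile <- tempfile(fileext = ".csv")
write.csv(frq, outFile, row.names = FALSE)
```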

Refactor: Drop unused levels of a factor variable

Name: refactor
Scenario: After subsetting a data frame, the factor columns may still include levels with zero frequency
Related:
1. drop = TRUE, e.g. problem.factor <- problem.factor[, drop = TRUE]
2. drop.levels() function in gdata package
References:
http://r.789695.n4.nabble.com/Refactor-all-factors-in-a-data-frame-td826749.html
http://www.r-bloggers.com/data-types-part-3-factors/
http://rwiki.sciviews.org/doku.php?id=tips:data-manip:drop_unused_levels

Code:
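# Rebuild every factor column so that unused levels are dropped.
refactor <- function(aDf) {
  isFac <- sapply(aDf, is.factor)  # named isFac to avoid shadowing base::cat
  aDf[isFac] <- lapply(aDf[isFac], factor)
  return(aDf)  # return the modified data frame
}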
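
Usage sketch: the example below shows the zero-frequency level left behind by a subset, and how refactor removes it. The function is repeated (with an explicit return of the data frame) so the block runs on its own, and the small data frame is made up. Base R's droplevels() is shown alongside as another option that should give the same levels.

```r
# Copy of refactor so that this example is self-contained
refactor <- function(aDf) {
  isFac <- sapply(aDf, is.factor)
  aDf[isFac] <- lapply(aDf[isFac], factor)
  aDf
}

# Hypothetical sample data
df <- data.frame(g = factor(c("a", "b", "c")), x = 1:3)

# Subsetting keeps the unused level "c"
sub <- df[df$g != "c", ]
levels(sub$g)
# [1] "a" "b" "c"

# refactor() rebuilds the factor from the values that remain
clean <- refactor(sub)
levels(clean$g)
# [1] "a" "b"

# Base R alternative
levels(droplevels(sub)$g)
# [1] "a" "b"
```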

Wednesday, June 19, 2013

Methods to save a data frame to a file in R

When you want to save a data set held in a data frame object in R, you have several options:

Method: Save an image of the object in binary format
Pro: Fast; keeps the object name and other environment information
Con: R specific
Functions:
save(df, file = "filename")
rm(df)
load("filename", .GlobalEnv)

Method: Save as coded text
Pro: Full information, e.g. data mode
Con: Large file size; cannot be exchanged with other software
Functions:
dump(c("df"), "filename")
newDf = source("filename")$value

Method: Export to plain text
Pro: Human readable, and exchangeable with other software
Con: May need to recast R types when reading the file back in
Functions:
write.table()
read.table()

Method: Export to another format
Pro: Software specific
Con: Software specific
Functions:
write.X()
read.X(), where X stands for the target format (SPSS, SAS, CSV, Excel), e.g. write.csv() and read.csv() in base R, or read.spss() in the foreign package
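
A minimal round-trip sketch of the first three methods, assuming a small made-up data frame and temporary files instead of a real "filename". The load and source calls use a fresh environment here (rather than .GlobalEnv) so that the restored copies do not overwrite the original df.

```r
# Hypothetical sample data with one factor column
df <- data.frame(id = 1:3, grp = factor(c("a", "b", "a")))

# 1. Binary image: save() / load()
rdaFile <- tempfile(fileext = ".RData")
save(df, file = rdaFile)
env <- new.env()
load(rdaFile, envir = env)          # restores the object under its own name, df
sameBinary <- identical(env$df, df)

# 2. Coded text: dump() / source()
dumpFile <- tempfile(fileext = ".R")
dump("df", file = dumpFile)
df2 <- source(dumpFile, local = new.env())$value
sameDump <- isTRUE(all.equal(df2, df))

# 3. Plain text: write.table() / read.table()
txtFile <- tempfile(fileext = ".txt")
write.table(df, file = txtFile, sep = "\t", row.names = FALSE)
df3 <- read.table(txtFile, header = TRUE, sep = "\t")
# The text file does not record that grp was a factor, so recast it
df3$grp <- factor(df3$grp)
sameText <- isTRUE(all.equal(df3, df))
```

The third method is the only one here that needs a manual recast, which is the "Con" listed for plain text export.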