Friday, January 6, 2012

Remove empty columns in Knime

Goal
Delete the columns that consist entirely of blank values (0, NaN, NA, NULL, or a fixed string).

Strategy
1. If the target is a numeric column (double or int), use the [Low Variance Filter] node.
2. Use [Transpose] + [Row Filter] + [Transpose], as suggested at
http://tech.knime.org/forum/knime-general/removing-columns-where-every-value-is-empty
3. Use a code snippet in a scripting node (R, JPython, or Java) to process the table.

Comment
None of the three works for me.
1. My target columns are of string type. The [String to Number] node has to be wired to each source column manually,
which does not make sense when I have tons of empty columns.
2. The Missing Value option in the [Row Filter] settings only works on one specific column, so it cannot be automated.
3. I can just as well write an R or Python script outside Knime to do the filtering.

Final solution
1. Pre-process the CSV file outside Knime.
2. Manually skip the columns in the [File Reader] settings.
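
For step 1, here is a minimal sketch of what the pre-processing could look like in Python, using only the standard csv module. It is one possible approach, not the exact script I used. The function names, the BLANK_VALUES set, and the assumption that the first row is a header are all illustrative, so adjust them to your own file.

```python
import csv

# Values treated as "blank". This set is an assumption based on the goal
# above; add your own fixed placeholder string if your data uses one.
BLANK_VALUES = {"", "0", "NaN", "NA", "NULL"}


def drop_empty_columns(rows, blank_values=BLANK_VALUES):
    """Return rows without the columns whose data cells are all blank.

    rows[0] is assumed to be the header row and is not checked for blanks.
    If there are no data rows, every column counts as empty.
    """
    if not rows:
        return []
    header, data = rows[0], rows[1:]
    # Keep a column if at least one data cell holds a non-blank value.
    keep = [
        i for i in range(len(header))
        if any(i < len(r) and r[i].strip() not in blank_values for r in data)
    ]
    # Short rows are padded with "" so every output row has the same width.
    return [[r[i] if i < len(r) else "" for i in keep] for r in rows]


def clean_csv(src_path, dst_path):
    """Read src_path, drop the empty columns, and write the rest to dst_path."""
    with open(src_path, newline="") as src:
        rows = list(csv.reader(src))
    with open(dst_path, "w", newline="") as dst:
        csv.writer(dst).writerows(drop_empty_columns(rows))
```

Calling clean_csv("input.csv", "cleaned.csv") (both file names are placeholders) writes a copy that [File Reader] can load without the empty columns. The whole file is read into memory, which is fine for moderate sizes. A very large file would need two passes instead.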
