Convert column classes in data.table

Asked on December 28, 2018

To convert a single column, use := to overwrite it with the converted values:

```r
# := modifies dt by reference; dtnew refers to the same table as dt
dtnew <- dt[, Quarter := as.character(Quarter)]
str(dtnew)

# Classes ‘data.table’ and 'data.frame': 10 obs. of 3 variables:
#  $ ID     : Factor w/ 2 levels "A","B": 1 1 1 1 1 2 2 2 2 2
#  $ Quarter: chr "1" "2" "3" "4" ...
#  $ value  : num -0.838 0.146 -1.059 -1.197 0.282 ...
```
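One caveat with the code above: := updates dt in place, so dt itself ends up with a character Quarter column as well. If the original table should stay unchanged, one option is to convert a copy instead. The following is a minimal sketch; the construction of dt is an assumption made only to resemble the str() output shown above, since the original definition is not given.

```r
library(data.table)

# Hypothetical data, shaped like the str() output above (not the original dt)
dt <- data.table(ID      = factor(rep(c("A", "B"), each = 5)),
                 Quarter = rep(1:5, 2),
                 value   = rnorm(10))

# copy() creates an independent table, so := only changes the copy
dtnew <- copy(dt)[, Quarter := as.character(Quarter)]

class(dt$Quarter)     # "integer"
class(dtnew$Quarter)  # "character"
```

Without copy(), both names would point at the same modified table.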

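To convert several columns at once, apply as.character to every column in .SD with lapply. Columns named in by are excluded from .SD, so grouping by ID leaves ID as a factor: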

```r
# Returns a new data.table; ID is the grouping column, so it is not converted
dtnew <- dt[, lapply(.SD, as.character), by = ID]
str(dtnew)

# Classes ‘data.table’ and 'data.frame': 10 obs. of 3 variables:
#  $ ID     : Factor w/ 2 levels "A","B": 1 1 1 1 1 2 2 2 2 2
#  $ Quarter: chr "1" "2" "3" "4" ...
#  $ value  : chr "1.487145280568" "-0.827845218358881" "0.028977182770002" "1.35392750102305" ...
```

Alternatively, pick out the columns of a given class and convert them by reference:

```r
library(data.table)
DT <- data.table(X1 = c("a", "b"), X2 = c(1, 2), X3 = c("hello", "you"))
# Names of all character columns
changeCols <- names(DT)[sapply(DT, is.character)]

# Convert those columns to factors by reference
DT[, (changeCols) := lapply(.SD, as.factor), .SDcols = changeCols]
```

The same kind of conversion can be written as a loop, building each := expression with substitute and evaluating it with eval:

```r
library(data.table)
dt <- data.table(ID = c(rep("A", 5), rep("B", 5)),
                 fac1 = c(1:5, 1:5),
                 fac2 = c(1:5, 1:5) * 2,
                 val1 = rnorm(10),
                 val2 = rnorm(10))

names_factors <- c('fac1', 'fac2')
names_values <- c('val1', 'val2')

# Build the call `col := as.factor(col)` for each name, then evaluate it in dt
for (col in names_factors) {
  e <- substitute(X := as.factor(X), list(X = as.symbol(col)))
  dt[, eval(e)]
}
# Same pattern for the numeric columns
for (col in names_values) {
  e <- substitute(X := as.numeric(X), list(X = as.symbol(col)))
  dt[, eval(e)]
}

str(dt)
```

This yields:

```
Classes ‘data.table’ and 'data.frame': 10 obs. of 5 variables:
 $ ID  : chr "A" "A" "A" "A" ...
 $ fac1: Factor w/ 5 levels "1","2","3","4",..: 1 2 3 4 5 1 2 3 4 5
 $ fac2: Factor w/ 5 levels "2","4","6","8",..: 1 2 3 4 5 1 2 3 4 5
 $ val1: num 0.0459 2.0113 0.5186 -0.8348 -0.2185 ...
 $ val2: num -0.0688 0.6544 0.267 -0.1322 -0.4893 ...
 - attr(*, ".internal.selfref")=<externalptr>
```
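For comparison, the loop can probably be written without eval and substitute by using data.table's set(), which assigns to a column by name and by reference. This is only a sketch of an alternative, not the method used in the answer above. It rebuilds the same example table, and set.seed(1) is an added assumption to make the random values reproducible.

```r
library(data.table)

set.seed(1)  # assumption: fixed seed, not part of the original example
dt <- data.table(ID = c(rep("A", 5), rep("B", 5)),
                 fac1 = c(1:5, 1:5),
                 fac2 = c(1:5, 1:5) * 2,
                 val1 = rnorm(10),
                 val2 = rnorm(10))

names_factors <- c("fac1", "fac2")

# set() takes the column name as a string, so no expression building is needed
for (col in names_factors) {
  set(dt, j = col, value = as.factor(dt[[col]]))
}

sapply(dt, class)
```

Because set() accepts the column name as a character string, it fits naturally in a for loop over names.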