Unseen factor levels when appending new records with unseen string values to a data frame cause a warning and result in NA

Asked on October 29, 2018 in R Programming.


  • 3 Answer(s)

    This could be caused by a mismatch of column types between the two data frames.

    First, check the types (classes) of all columns:

    new2old <- rbind( alltime, all2008 ) # this gives you a warning
    old2new <- rbind( all2008, alltime ) # this should be without warning
     
    cbind(
        alltime = sapply( alltime, class),
        all2008 = sapply( all2008, class),
        new2old = sapply( new2old, class),
        old2new = sapply( old2new, class)
    )
    

    I expect there will be a row that looks like this:

                alltime  all2008   new2old  old2new
    …           …        …         …        …
    some_column "factor" "numeric" "factor" "character"
    …           …        …         …        …
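To narrow the check down to the columns that actually differ, a minimal sketch along these lines may help. The two data frames below are small hypothetical stand-ins for alltime and all2008, and the column names are made up for illustration.

```r
# Hypothetical stand-ins for the two data frames in the question
alltime <- data.frame(some_column = factor(c("1", "2")), other = c(1.5, 2.5))
all2008 <- data.frame(some_column = c(3, 4), other = c(3.5, 4.5))

# Class of every column, side by side (one row per column name)
classes <- cbind(
    alltime = sapply(alltime, class),
    all2008 = sapply(all2008, class)
)
print(classes)

# Columns whose classes differ are the likely source of the warning
mismatched <- rownames(classes)[classes[, "alltime"] != classes[, "all2008"]]
print(mismatched)
```

This assumes every column has a single class, so that sapply() returns a plain character vector.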

    Answered on October 29, 2018.

    The read.{table,csv,…} functions take a stringsAsFactors parameter, which defaulted to TRUE before R 4.0.0. You can set it to FALSE while importing and rbind-ing your data.

    If we’d like to set the column to be a factor at the end, we can do that too.

    For example:

    alltime <- read.table("alltime.txt", stringsAsFactors=FALSE)
    all2008 <- read.table("all2008.txt", stringsAsFactors=FALSE)
    alltime <- rbind(alltime, all2008)
    # If you want the doctor column to be a factor, make it so:
    alltime$doctor <- as.factor(alltime$doctor)
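A self-contained variant of the same idea, assuming small inline tables in place of alltime.txt and all2008.txt. The data and the visits column are hypothetical; only the doctor column name comes from the example above.

```r
# Hypothetical inline data standing in for alltime.txt and all2008.txt
alltime <- read.table(text = "doctor visits
smith 3
jones 5", header = TRUE, stringsAsFactors = FALSE)

all2008 <- read.table(text = "doctor visits
lee 2
jones 7", header = TRUE, stringsAsFactors = FALSE)

# Both doctor columns are character, so rbind simply concatenates them
combined <- rbind(alltime, all2008)

# Convert to a factor only after all rows are in place,
# so every value seen so far becomes a level
combined$doctor <- as.factor(combined$doctor)
print(levels(combined$doctor))
```

Converting at the end means the levels are derived from the combined data, so no value is turned into NA.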
    
    Answered on October 29, 2018.
      • Create the data frame with stringsAsFactors set to FALSE. This should resolve the issue.

     

      • After that, we do not want to use rbind, because it messes up the column names if the data frame is empty. Assign the new row by index instead:

     

    df[nrow(df)+1,] <- c("d","gsgsgd",4)
    


    # Without stringsAsFactors = FALSE (the default before R 4.0.0), a and b become factors
    df <- data.frame(a = character(0), b = character(0), c = numeric(0))

    df[nrow(df)+1,] <- c("d","gsgsgd",4)

    Warning messages:
    1: In `[<-.factor`(`*tmp*`, iseq, value = "d") :
    invalid factor level, NAs generated
    2: In `[<-.factor`(`*tmp*`, iseq, value = "gsgsgd") :
    invalid factor level, NAs generated

    # With stringsAsFactors = FALSE, the same assignment succeeds
    df <- data.frame(a = character(0), b = character(0), c = numeric(0), stringsAsFactors = FALSE)

    df[nrow(df)+1,] <- c("d","gsgsgd",4)

    df
      a      b c
    1 d gsgsgd 4
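One caveat, offered as a sketch and not as part of the original answer: c("d","gsgsgd",4) builds a character vector, so the numeric column c will most likely end up as character after the assignment. Passing the new row as a list() should keep each column's type. The names df_chr and df_num are hypothetical and used only for this comparison.

```r
# Row supplied with c(): all three values are coerced to character first
df_chr <- data.frame(a = character(0), b = character(0), c = numeric(0),
                     stringsAsFactors = FALSE)
df_chr[nrow(df_chr) + 1, ] <- c("d", "gsgsgd", 4)

# Row supplied with list(): each element keeps its own type
df_num <- data.frame(a = character(0), b = character(0), c = numeric(0),
                     stringsAsFactors = FALSE)
df_num[nrow(df_num) + 1, ] <- list("d", "gsgsgd", 4)

# Compare the class of column c under each approach
print(class(df_chr$c))
print(class(df_num$c))
```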

    Answered on October 29, 2018.

