Parse CSV, ignoring commas inside string literals in VBA?

Parse CSV, ignoring commas inside string literals in VBA?

Asked on October 26, 2018 in String.
Add Comment


  • 3 Answer(s)

    The simple regex for parsing the CSV line, for assuming no quotes inside quoted fields, is:

    "[^"]*"|[^,]*
    
    
    Answered on October 26, 2018.
    Add Comment

      Use this code

    Function splitLine1(line As String) As String()
      Dim temp() As String
      'Splits the line in three. The string delimited by " will be at temp(1)
      temp = Split(line, Chr(34)) 'chr(34) = "
      'Replaces the commas in the numeric fields by semicolons
      temp(0) = Replace(temp(0), ",", ";")
      temp(2) = Replace(temp(2), ",", ";")
      'Joins the temp array with quotes and then splits the result using the semicolons
      splitLine1 = Split(Join(temp, Chr(34)), ";")
    End Function
    

      This function that solves the particular problem. With the other way to the job is using the regular expression object from VBScript.

    Function splitLine2(line As String) As String()
    Dim regex As Object
    Set regex = CreateObject("vbscript.regexp")
    regex.IgnoreCase = True
    regex.Global = True
    'This pattern matches only commas outside quotes
    'Pattern = ",(?=([^"]*"[^"]*")*(?![^"]*"))"
    regex.Pattern = ",(?=([^" & Chr(34) & "]*" & Chr(34) & "[^" & Chr(34) & "]*" & Chr(34) & ")*(?![^" & Chr(34) & "]*" & Chr(34) & "))"
    'regex.replaces will replace the commas outside quotes with semicolons and then the
    'Split function will split the result based on the semicollons
    splitLine2 = Split(regex.Replace(line, ";"), ";")
    End Function
    
    Answered on October 26, 2018.
    Add Comment

    This code can be used to Parse CSV, ignoring commas inside string literals in VBA:

    Function ParseCSV(FileName)
      Dim Regex 'As VBScript_RegExp_55.RegExp
      Dim MatchColl 'As VBScript_RegExp_55.MatchCollection
      Dim Match 'As VBScript_RegExp_55.Match
      Dim FS 'As Scripting.FileSystemObject
      Dim Txt 'As Scripting.TextStream
      Dim CSVLine
      ReDim ToInsert(0)
    Set FS = CreateObject("Scripting.FileSystemObject")
    Set Txt = FS.OpenTextFile(FileName, 1, False, -2)
    Set Regex = CreateObject("VBScript.RegExp")
    Regex.Pattern = """[^""]*""|[^,]*" '<- MRAB's answer
    Regex.Global = True
    Do While Not Txt.AtEndOfStream
      ReDim ToInsert(0)
      CSVLine = Txt.ReadLine
      For Each Match In Regex.Execute(CSVLine)
        If Match.Length > 0 Then
          ReDim Preserve ToInsert(UBound(ToInsert) + 1)
          ToInsert(UBound(ToInsert) - 1) = Match.Value
        End If
      Next
      InsertArrayIntoDatabase ToInsert
    Loop
    Txt.Close
    End Function
    
    Answered on October 26, 2018.
    Add Comment


  • Your Answer

    By posting your answer, you agree to the privacy policy and terms of service.