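import scala.util.parsing.combinator.RegexParsers

object CSVParser extends RegexParsers {
  // Whitespace is part of a field, so do not let RegexParsers skip it.
  override def skipWhitespace = false

  // Parse a file lazily, one record per line (the Source is never closed here).
  def apply(f: java.io.File): Iterator[List[String]] =
    scala.io.Source.fromFile(f).getLines().map(apply(_))

  // Parse one line into its fields; throws if the line does not match the grammar.
  def apply(s: String): List[String] = parseAll(fromCsv, s) match {
    case Success(result, _) => result
    case failure: NoSuccess => throw new Exception("Parse failed: " + failure.msg)
  }

  // A line is one or more terms, each followed by an optional comma.
  // Terms must be non-empty, so an empty field such as in "a,,b" fails to parse.
  def fromCsv: Parser[List[String]] = rep1(mainToken)
  def mainToken: Parser[String] = (doubleQuotedTerm | singleQuotedTerm | unquotedTerm) <~ ",?".r
  // Quoted terms may contain commas; the surrounding quotes are dropped.
  def doubleQuotedTerm: Parser[String] = "\"" ~> "[^\"]+".r <~ "\""
  def singleQuotedTerm: Parser[String] = "'" ~> "[^']+".r <~ "'"
  def unquotedTerm: Parser[String] = "[^,]+".r
}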
Tuesday, June 12, 2012
Parsing CSVs in Scala
I did a quick Google search for parsing CSVs in Scala, and one of the top hits was a Stack Overflow question where the answer was wrong. Very wrong. So I threw together a quick parser in Scala to get the job done. I'm not saying it's good, but it passes the spec tests I have, including quotes and quoted commas with both single and double quotes. I hope this is useful, and perhaps somebody can improve upon it.
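For anyone who wants to try it, here is a rough sketch of the kind of checks I mean. These are not my actual spec tests; the `CSVParserCheck` object and the sample inputs are made up for illustration. It assumes the `CSVParser` object from this post is compiled alongside it and that the scala-parser-combinators module is on the classpath.

```scala
import scala.util.Try

// Hypothetical check harness; assumes CSVParser from this post is in scope.
object CSVParserCheck {
  def main(args: Array[String]): Unit = {
    // Plain, unquoted fields
    assert(CSVParser("a,b,c") == List("a", "b", "c"))

    // A comma inside double quotes stays within one field
    assert(CSVParser("\"a,b\",c") == List("a,b", "c"))

    // The same with single quotes
    assert(CSVParser("'a,b',c") == List("a,b", "c"))

    // One kind of quote may appear inside the other kind
    assert(CSVParser("\"it's\",'say \"hi\"'") == List("it's", "say \"hi\""))

    // Every term must be non-empty, so an empty field is rejected
    assert(Try(CSVParser("a,,b")).isFailure)

    println("all checks passed")
  }
}
```

The last check is probably the first thing to fix if you want to improve on this: every term has to contain at least one character, so a line with an empty field throws instead of returning an empty string.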