Loading a JSON Dataset Into Spark, Then Using Filter, Map, Etc.
I'm new to Apache Spark, and would like to take a dataset saved as JSON (a list of dictionaries), load it into an RDD, then apply operations like filter and map.
Solution 1:
You could do something like this:
import org.apache.spark.rdd.RDD
import org.json4s.{JInt, JValue}
import org.json4s.native.JsonMethods._

// parseOpt returns Option[JValue], so flatMap silently drops lines that
// fail to parse as JSON.
val jsonData: RDD[JValue] = sc.textFile(path).flatMap(parseOpt)
and then do your JSON processing on those JValues, for example:
jsonData.foreach { json =>
  println(json \ "someKey")
  (json \ "id") match {
    case JInt(x) => ???
    case _       => ???
  }
}
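Once the lines are parsed into JValues, filter and map work as on any other RDD. A minimal sketch under the same setup as above (the field names "id" and "name" are hypothetical, chosen only for illustration):

```scala
import org.json4s.{JInt, JString}

// Keep only records whose "id" field is a JSON integer greater than 10.
val filtered = jsonData.filter { json =>
  (json \ "id") match {
    case JInt(id) => id > 10
    case _        => false
  }
}

// Extract the "name" field as a plain String, dropping records without one.
val names = filtered.flatMap { json =>
  (json \ "name") match {
    case JString(name) => Some(name)
    case _             => None
  }
}

names.collect().foreach(println)
```

Pattern matching on the JValue subtypes (JInt, JString, etc.) is the idiomatic way to handle missing or mistyped fields, since `json \ "key"` returns JNothing rather than throwing when the key is absent.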