How lazy is DeltaTable.toDF (Spark and delta.io)?
Suppose you do something like:
import io.delta.tables._
val deltaTable = DeltaTable.forPath(spark, "...")
deltaTable.updateExpr(
  "column_name = value",
  Map("updated_column" -> "'new_value'")
)
val df = deltaTable.toDF
After the update, will df re-read the underlying Delta table contents on demand each time it is accessed (e.g., via df.count()), so that deltaTable.toDF is effectively equivalent to spark.read.format("delta").load(path)?
Or will it re-apply the series of updates each time df is accessed?
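One way to probe this empirically is a small local experiment. The following is only a sketch under stated assumptions: the scratch path, the column names (id, status), and the object name are hypothetical and not part of my real job, and it assumes a local Spark session with Delta Lake on the classpath. It compares a DataFrame taken from toDF before the update, one taken after, and a fresh spark.read of the same path.

```scala
import io.delta.tables._
import org.apache.spark.sql.{DataFrame, SparkSession}

object ToDFLazinessCheck {
  def main(args: Array[String]): Unit = {
    // Assumption: local session with the Delta Lake extension and catalog configured.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("toDF-laziness-check")
      .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
      .config("spark.sql.catalog.spark_catalog",
        "org.apache.spark.sql.delta.catalog.DeltaCatalog")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical scratch location and toy data, for illustration only.
    val path = "/tmp/delta-todf-check"
    Seq(("a", "old"), ("b", "old")).toDF("id", "status")
      .write.format("delta").mode("overwrite").save(path)

    val deltaTable = DeltaTable.forPath(spark, path)

    // DataFrame obtained BEFORE the update, to see whether it goes stale.
    val before = deltaTable.toDF

    deltaTable.updateExpr("id = 'a'", Map("status" -> "'new'"))

    // DataFrame obtained AFTER the update, plus an independent fresh read.
    val after = deltaTable.toDF
    val fresh = spark.read.format("delta").load(path)

    // Count rows that reflect the update in each DataFrame.
    def updatedRows(df: DataFrame): Long =
      df.filter($"status" === "new").count()

    println(s"before: ${updatedRows(before)}")
    println(s"after:  ${updatedRows(after)}")
    println(s"fresh:  ${updatedRows(fresh)}")

    spark.stop()
  }
}
```

If all three counts agree, that would suggest toDF resolves the table contents at action time rather than replaying updates, at least for the Delta version being tested. Checking the table history with deltaTable.history() before and after calling df.count() should also show whether accessing df writes any new commits.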