Spark Performance: Is the DirectParquetOutputCommitter really better?

In our previous Spark optimization post, we discussed a general principle that can apply to all Spark jobs. Today we’re looking at something much more specific: writing Parquet files to S3.