PySpark size function: you can estimate the size of the data at the source (for example, in a Parquet file).
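One way to make that estimate is to add up the on-disk size of the Parquet files before Spark reads them. The following is a minimal sketch, assuming the dataset sits on a local filesystem as either a single `.parquet` file or a directory of part files; the helper name `estimate_source_size_bytes` is hypothetical and not part of PySpark.

```python
import os
from pathlib import Path


def estimate_source_size_bytes(path):
    """Sum the on-disk sizes of the Parquet files under `path`.

    Hypothetical helper (not a PySpark API). Assumes a local filesystem.
    `path` may be a single .parquet file or a directory of part files,
    including nested partition directories.
    """
    root = Path(path)
    if root.is_file():
        return root.stat().st_size

    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Count only data files; skip markers and checksums
            # such as _SUCCESS or *.crc.
            if name.endswith(".parquet"):
                total += (Path(dirpath) / name).stat().st_size
    return total
```

Treat the result as a rough lower bound: Parquet is a compressed, encoded columnar format, so the same data usually takes more space once loaded into memory. Note also that `pyspark.sql.functions.size` does something different: it returns the number of elements in an array or map column, not a size in bytes.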