yasc.eda.missing_stat

yasc.eda.missing_stat(data, columns=None, show_print=True, only_missing_columns=False)

Return statistics on missing values.

Parameters
data : DataFrame

Observed data.

columns : str or list, optional

A column name or a list of column names. Defaults to None, in which case all columns are examined.

show_print : bool, optional

Whether to print summary information. Defaults to True.

only_missing_columns : bool, optional

Whether to include only columns with missing values in the output. Defaults to False.

Returns
A str if columns is passed as a str; otherwise a DataFrame.

Examples

Check missing statistics of all columns:

>>> from yasc.eda import missing_stat
>>> import pandas as pd
>>> import numpy as np
>>> df = pd.DataFrame({"a": [1, np.nan, np.nan],
...                    "b": [2, np.nan, 3], "c": [4, 5, 6]})
>>> missing_stat(df)
3 columns, of which 2 columns with missing values
  column  #missing  missing_rate
2      c         0      0.000000
1      b         1      0.333333
0      a         2      0.666667

Check missing statistics of a single column:

>>> missing_stat(df, "a")
Column a of dtype float64, 2 missings (0.67)
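
For comparison, the per-column counts and rates shown above can be reproduced with plain pandas. The following is a minimal sketch, not yasc's actual implementation: the helper name missing_summary is hypothetical, and the ascending sort by missing count is an assumption inferred from the example output.

```python
import numpy as np
import pandas as pd


def missing_summary(data: pd.DataFrame, only_missing_columns: bool = False) -> pd.DataFrame:
    """Hypothetical helper (not part of yasc): per-column missing counts and rates."""
    mask = data.isna()  # True wherever a value is missing
    stats = pd.DataFrame({
        "column": data.columns,
        "#missing": mask.sum().to_numpy(),       # number of missing values per column
        "missing_rate": mask.mean().to_numpy(),  # fraction of rows that are missing
    })
    if only_missing_columns:
        # Keep only the columns that have at least one missing value.
        stats = stats[stats["#missing"] > 0]
    # Assumption: order rows by ascending missing count, as in the example output.
    return stats.sort_values("#missing")


df = pd.DataFrame({"a": [1, np.nan, np.nan],
                   "b": [2, np.nan, 3], "c": [4, 5, 6]})
summary = missing_summary(df)
print(summary)
```

Passing only_missing_columns=True to this sketch drops column c, since it has no missing values.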