🔒 Privacy First
Your file is processed entirely in memory and never stored.
All data is cleared automatically after analysis.
We use dynamic rendering; nothing is saved to disk or database.

๐Ÿ“ Upload Your Dataset

Supported formats: .csv, .xlsx, .xls, .json, .feather

Max file size: 500.0MB

🧠 How It Works

This app runs a full data profiling and quality audit using statistical methods, all locally in your browser session. No code, no setup, just results.

  • 📋 Comprehensive Column Classification
    Columns are automatically categorised into numeric, boolean, categorical, datetime, or timeseries types. Timeseries checks include monotonicity and time span detection.
  • 📊 Descriptive Statistics
    For every numeric column, we compute mean, standard deviation, min, max, variance, skewness, and kurtosis. Outliers are detected using Z-scores with thresholds at 3σ, 4σ, and 5σ.
  • 📈 Distribution Diagnostics
    Skewed columns are flagged and classified (moderate or severe). You'll get transformation suggestions like log, Box-Cox, and Yeo-Johnson, with histograms for visual inspection.
  • 🚨 Outlier Detection & Visualisation
    Outliers are split by severity level and shown with plots. This helps identify influential points or data errors before they affect your models.
  • 🧪 Data Quality Audit
    The app automatically checks for:
    • ✔️ Missing data, graded by severity (0–29% or 30%+ of values missing)
    • ✔️ Fully null columns
    • ✔️ Duplicate rows
    • ✔️ Constant and low-variance columns (including categorical and boolean)
    • ✔️ High and medium cardinality features
    • ✔️ Imbalanced boolean features (over 70% one class)
  • 🔗 Correlation Analysis
    Computes Pearson correlation for numeric features and Cramér's V for categoricals. Visualises numeric correlation heatmaps and flags highly collinear pairs.
  • 🧮 Multicollinearity Detection (VIF)
    Applies preprocessing (null filtering, constant drop, imputation), then calculates Variance Inflation Factors. Warns about numeric features whose VIF exceeds 5 or 10.
  • 💡 Actionable Insights
    Recommendations are shown for each issue, complete with example code, severity badges, and justifications, so you can clean data efficiently.
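
As a rough illustration of the Z-score outlier step described above, the sketch below groups values by the highest threshold they exceed (3σ, 4σ, or 5σ). It is not the app's actual code: the function name `zscore_outlier_tiers` is hypothetical, and the use of the population standard deviation is an assumption.

```python
from statistics import fmean, pstdev

# Thresholds taken from the feature description: 3, 4 and 5 standard deviations.
THRESHOLDS = (3.0, 4.0, 5.0)


def zscore_outlier_tiers(values):
    """Return {threshold: [indices]} keyed by the highest threshold each outlier exceeds."""
    mean = fmean(values)
    std = pstdev(values)  # assumption: population standard deviation
    tiers = {t: [] for t in THRESHOLDS}
    if std == 0:
        return tiers  # constant column: nothing can be flagged
    for i, v in enumerate(values):
        z = abs(v - mean) / std
        exceeded = [t for t in THRESHOLDS if z > t]
        if exceeded:
            tiers[max(exceeded)].append(i)
    return tiers


# Fifty identical readings plus one extreme value at index 50.
sample = [10.0] * 50 + [60.0]
tiers = zscore_outlier_tiers(sample)
print(tiers)
```

Splitting by the highest exceeded threshold keeps each point in exactly one severity tier, which is one plausible reading of "split by severity level".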

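For the categorical side of the correlation analysis, Cramér's V can be computed from the chi-squared statistic of a contingency table. The sketch below is a minimal, uncorrected version in plain Python. The function name `cramers_v` is hypothetical, and whether the app applies a bias correction is not stated, so none is assumed here.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length sequences of labels."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    observed = Counter(zip(xs, ys))
    k = min(len(row_totals), len(col_totals))
    if n == 0 or k < 2:
        # V is undefined when a variable is constant; returning 0.0 is a choice made here.
        return 0.0
    chi2 = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n
            chi2 += (observed[(r, c)] - expected) ** 2 / expected
    return sqrt(chi2 / (n * (k - 1)))


# Perfectly associated labels give 1.0; independent labels give 0.0.
v_dependent = cramers_v(["a", "b", "a", "b"], ["x", "y", "x", "y"])
v_independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
print(v_dependent, v_independent)
```
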
๐Ÿ” All data stays in memory โ€” nothing is stored or shared. This is a fully stateless, secure analysis workflow.
