This project is a Flask web application for uploading CSV files and running data quality checks on them. Users upload their data through a web interface and get immediate feedback on missing values, duplicate records, text formatting problems, and data type inconsistencies.
- Upload CSV files via a web form.
- Perform comprehensive data quality checks:
  - Missing values
  - Duplicate records
  - Data type validation
  - Text data validation
- Display results in a user-friendly HTML format.
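The four checks listed above could be sketched with Pandas roughly as follows. This is a minimal illustration, not the project's actual code: the function name `run_quality_checks`, the report keys, and the heuristics (a column counts as "mixed" when only some of its non-missing values parse as numbers; text is flagged when it has leading or trailing whitespace) are all assumptions made for the example.

```python
import io

import pandas as pd


def run_quality_checks(df: pd.DataFrame) -> dict:
    """Build a small data quality report (hypothetical helper, illustrative only)."""
    report = {
        # Count of missing cells per column.
        "missing_values": {col: int(n) for col, n in df.isna().sum().items()},
        # Rows that exactly repeat an earlier row.
        "duplicate_rows": int(df.duplicated().sum()),
        "mixed_types": {},
        "whitespace_issues": {},
    }
    for col in df.columns:
        if pd.api.types.is_numeric_dtype(df[col]):
            continue  # numeric columns need no text or type checks here
        values = df[col].dropna().astype(str)
        # Data type validation: flag columns where some, but not all,
        # non-missing values fail to parse as numbers.
        unparsable = int(pd.to_numeric(values, errors="coerce").isna().sum())
        if 0 < unparsable < len(values):
            report["mixed_types"][col] = unparsable
        # Text validation: count values with leading or trailing whitespace.
        padded = int((values != values.str.strip()).sum())
        if padded:
            report["whitespace_issues"][col] = padded
    return report


# Small in-memory CSV standing in for an uploaded file.
sample_csv = "name,age\nAlice,30\n Bob ,abc\nAlice,30\nCarol,\n"
report = run_quality_checks(pd.read_csv(io.StringIO(sample_csv)))
print(report)
```

Returning a plain dictionary keeps the checks independent of Flask, so they can be tested on their own and then passed to whatever template renders the results.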
- Flask
- Pandas
To set up the project environment, follow these steps:
- Clone the repository to your local machine.
- Ensure that Python is installed on your system.
- Install the required Python packages (Flask and Pandas) using pip, for example: pip install flask pandas
To run the application, navigate to the project directory and execute the following command in the terminal:
python -m flask run
This will start the Flask server on localhost with the default port 5000. You can access the application by visiting http://127.0.0.1:5000 in your web browser.
- Uploading Files:
  - Navigate to the home page.
  - Use the file upload form to select and upload a CSV file.
  - The application will automatically process the file and display the results.
- Viewing Results:
  - After uploading the file, you will be redirected to a results page.
  - This page will show a sample of the data and any data quality issues detected.
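The upload-and-results flow described above might look something like the sketch below. It is an assumption-laden simplification rather than the repository's implementation: the route `/`, the form field name `file`, and the inline HTML are hypothetical, and the results are rendered directly in the POST response instead of on a separate results page.

```python
import io

import pandas as pd
from flask import Flask, request

app = Flask(__name__)

# Hypothetical minimal upload form; the real project may use a template instead.
UPLOAD_FORM = """
<form method="post" enctype="multipart/form-data">
  <input type="file" name="file" accept=".csv">
  <button type="submit">Upload</button>
</form>
"""


@app.route("/", methods=["GET", "POST"])
def upload():
    if request.method == "GET":
        return UPLOAD_FORM
    uploaded = request.files.get("file")
    if uploaded is None or uploaded.filename == "":
        return "No file selected.", 400
    try:
        df = pd.read_csv(io.BytesIO(uploaded.read()))
    except (pd.errors.ParserError, pd.errors.EmptyDataError, UnicodeDecodeError):
        return "Could not parse the file as CSV.", 400
    # Two of the checks, computed inline to keep the sketch short.
    missing = int(df.isna().sum().sum())
    duplicates = int(df.duplicated().sum())
    # Show the counts followed by a sample (first rows) of the data.
    return (
        f"<p>Missing values: {missing}</p>"
        f"<p>Duplicate rows: {duplicates}</p>"
        + df.head().to_html()
    )
```

If a file like this were saved as app.py in the project directory, `python -m flask run` would discover the `app` object automatically.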
Feel free to fork this repository and submit pull requests. You can also open issues if you encounter bugs or have suggestions for improvements.
This project is licensed under the MIT License - see the LICENSE file for details.