Overview
In this project, our team developed an automated report generation system using Python and Django, which extracts data from the OLAP datastore Apache Pinot. The goal was to create a user-friendly web application that streamlines the reporting process, providing real-time insights and analytics.
Key Components
1. Languages and Libraries
- Python with Pandas: For data manipulation and analysis.
- SQL: For extracting data from Pinot through its Python client.
2. Django
- Role: Web framework for building the application.
- Features Implemented:
- User authentication for secure access.
- Dynamic report generation based on user inputs.
3. Apache Pinot
- Role: OLAP datastore for real-time data analytics.
- Key Features:
- Fast querying of large datasets.
- Support for complex analytics, making it ideal for report generation.
Workflow
1. Data Extraction
- Established a connection to Apache Pinot using its Python client.
- Wrote SQL-like queries to retrieve relevant data, focusing on metrics and trends necessary for reporting.
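The project's actual queries and schema are confidential, so the following is only a minimal sketch of how this step could look, assuming the open-source pinotdb DB-API client and a broker on the default port 8099. The table name, column names, and the helper names open_pinot_connection and fetch_rows are hypothetical and chosen purely for illustration.

```python
from typing import Any


def open_pinot_connection(host: str = "localhost", port: int = 8099):
    """Open a DB-API connection to a Pinot broker (host and port are placeholders)."""
    # Imported lazily so the rest of the sketch runs without pinotdb installed.
    from pinotdb import connect

    return connect(host=host, port=port, path="/query/sql", scheme="http")


def fetch_rows(conn: Any, sql: str) -> list[dict]:
    """Run a query on a DB-API connection and return each row as a dictionary."""
    cursor = conn.cursor()
    cursor.execute(sql)
    # DB-API cursors expose column metadata; the first field is the column name.
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


# Hypothetical table and columns, not the project's real schema.
REGION_SUMMARY_SQL = """
    SELECT region, COUNT(*) AS order_count, SUM(amount) AS revenue
    FROM sales_events
    GROUP BY region
    ORDER BY region
    LIMIT 100
"""
```

With a broker running, usage would be along the lines of fetch_rows(open_pinot_connection(), REGION_SUMMARY_SQL). Keeping fetch_rows independent of the Pinot client means it can be exercised against any DB-API connection during testing.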
2. Data Processing
- Utilized Python and Pandas to clean, filter, and aggregate the imported data.
- Transformed the data into formats suitable for analysis and visualization.
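As a rough illustration of the clean, filter, and aggregate sequence, the sketch below works on a list of row dictionaries. The column names region and amount, the rule of keeping only positive amounts, and the function name summarize are assumptions made for the example, not details of the real project.

```python
import pandas as pd


def summarize(rows: list[dict]) -> pd.DataFrame:
    """Clean raw rows, filter out invalid amounts, and aggregate by region."""
    df = pd.DataFrame(rows)

    # Clean: coerce amounts to numbers (unparseable values become NaN),
    # then drop rows that lack a region or a usable amount.
    df["amount"] = pd.to_numeric(df["amount"], errors="coerce")
    df = df.dropna(subset=["region", "amount"])

    # Filter: keep only positive amounts (an assumed business rule).
    df = df[df["amount"] > 0]

    # Aggregate: one row per region, largest revenue first.
    return (
        df.groupby("region", as_index=False)
        .agg(order_count=("amount", "size"), revenue=("amount", "sum"))
        .sort_values("revenue", ascending=False)
        .reset_index(drop=True)
    )
```

Returning a tidy DataFrame with one row per group keeps the result easy to write to a spreadsheet or to pass to a plotting library.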
3. Report Generation
- Generated structured reports (Excel) using Python, ensuring they were visually appealing and easy to interpret.
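One possible way to produce such a workbook is sketched below, assuming Pandas with the openpyxl engine. The original does not say which Excel library was used, so the engine choice, the sheet name, and the function name build_excel_report are assumptions. The formatting shown (frozen header row, widened columns) is only an example of readability tweaks, not the project's actual layout.

```python
from io import BytesIO

import pandas as pd


def build_excel_report(summary: pd.DataFrame, sheet_name: str = "Summary") -> bytes:
    """Write a DataFrame to an in-memory .xlsx workbook and return its bytes."""
    buffer = BytesIO()
    with pd.ExcelWriter(buffer, engine="openpyxl") as writer:
        summary.to_excel(writer, sheet_name=sheet_name, index=False)
        worksheet = writer.sheets[sheet_name]

        # Keep the header row visible while scrolling.
        worksheet.freeze_panes = "A2"

        # Widen each column to fit its longest value, plus a small margin.
        for column_cells in worksheet.columns:
            width = max(len(str(cell.value)) for cell in column_cells) + 2
            letter = column_cells[0].column_letter
            worksheet.column_dimensions[letter].width = width
    return buffer.getvalue()
```

Building the workbook in memory avoids temporary files, and the returned bytes can be sent straight back in an HTTP response.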
4. Django Integration
- Developed a Django application to handle user requests and display reports.
- Set up views and templates for a seamless user experience, so that users can enter their inputs and then generate and download the matching report.
5. Automation
- Once the user enters the inputs, the application retrieves all the information needed and exports it to an Excel document in the pre-established format.
Benefits
- Efficiency: Reduced manual effort in report creation, minimizing errors and saving time.
- Real-time Insights: Users can access up-to-date data, enabling timely decision-making.
No further information can be shared because the project involves confidential information.