Web_Vulnerability_Scanner
VulnScan Analyzer: End-to-End Dockerization for a Production-Ready Web Vulnerability Scanner
Abstract: This blog post details the development of a full-stack Web Vulnerability Scanner, built with a React frontend and a Flask backend. The tool performs security audits for common web vulnerabilities including insecure headers, weak cookie policies, unvalidated forms, and information-leaking responses. The project was developed progressively across three stages and successfully demonstrated at the ACM DockerShowdown event.
Introduction
In today's digital landscape, web application security is paramount. With cyber threats evolving at an unprecedented rate, developers and security professionals need robust tools to identify and mitigate vulnerabilities before they can be exploited. This blog post presents the complete development journey of VulnScan Analyzer, a comprehensive web vulnerability scanner designed to audit web applications for common, high-impact security misconfigurations.
The VulnScan Analyzer represents a full-stack cybersecurity solution built with a React frontend and Flask backend. It provides an intuitive user interface that accepts target URLs, performs comprehensive security scans, and presents clear, actionable reports. The scanner focuses on several critical security areas:
- Insecure Headers: Detects missing security policies like Content-Security-Policy (CSP), Strict-Transport-Security (HSTS), and X-Frame-Options that protect against XSS and clickjacking attacks.
- Weak Cookie Policies: Inspects Set-Cookie headers to ensure HttpOnly, Secure, and SameSite flags are properly implemented to prevent session hijacking.
- Unvalidated Forms: Parses HTML to identify forms vulnerable to attacks like CSRF (Cross-Site Request Forgery).
- Information Disclosure: Detects misconfigured responses that leak sensitive information through headers like Server and X-Powered-By, exposing the underlying technology stack.
Developed progressively across three stages, this ethical scanning tool is intended for security professionals and developers to test applications they own or have explicit permission to audit, thereby helping to harden web applications against potential attacks.
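To make the header check from the list above concrete, here is a minimal sketch of how missing security headers could be detected from a response's headers. The function name `find_missing_security_headers` and the `REQUIRED_HEADERS` mapping are illustrative assumptions, not the project's actual code.

```python
# Headers named in this post, with the protection each one offers.
REQUIRED_HEADERS = {
    "Content-Security-Policy": "helps mitigate XSS",
    "Strict-Transport-Security": "enforces HTTPS on later visits",
    "X-Frame-Options": "helps prevent clickjacking",
}


def find_missing_security_headers(headers):
    """Return the required header names absent from `headers`.

    `headers` is any mapping of header name to value. Header names are
    case-insensitive in HTTP, so the comparison is done in lowercase.
    """
    present = {name.lower() for name in headers}
    return [name for name in REQUIRED_HEADERS if name.lower() not in present]


if __name__ == "__main__":
    sample = {"Content-Type": "text/html", "x-frame-options": "DENY"}
    for name in find_missing_security_headers(sample):
        print(f"Missing {name}: {REQUIRED_HEADERS[name]}")
```

Keeping the check as a pure function over a header mapping means it can be exercised without any network access.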
Project Objectives
Objectives of Part 1
- Develop a Flask-based backend to analyze website metadata and security configurations
- Build a React-based frontend for URL input and result visualization
- Implement scanning for HTTP Status Codes and Basic Security Headers
- Validate URL formats and implement comprehensive error handling
- Display scan results in a structured, user-friendly table format
Objectives of Part 2
- Integrate frontend and backend components into a cohesive, fully functional system
- Add advanced scanning capabilities for Security Headers, HTML Forms, and Cookie Attributes
- Improve UI table formatting for better readability and user experience
- Enhance backend timeout and retry mechanisms for improved reliability
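The timeout and retry objective can be illustrated with a small, library-agnostic helper. This is a sketch under stated assumptions: `with_retries` and its parameters are hypothetical names, and the exception types to retry on are passed in by the caller. With the Requests library, one would typically pass `timeout=` to `requests.get` and retry on `requests.exceptions.Timeout` and `requests.exceptions.ConnectionError`.

```python
import time


def with_retries(operation, attempts=3, base_delay=0.5,
                 retry_on=(TimeoutError, ConnectionError), sleep=time.sleep):
    """Call `operation()` until it succeeds or `attempts` is exhausted.

    Waits base_delay, 2*base_delay, 4*base_delay, ... between tries
    (exponential backoff). `sleep` is injectable so tests need not wait.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on as exc:
            last_error = exc
            # Only sleep if another attempt will follow.
            if attempt < attempts - 1:
                sleep(base_delay * (2 ** attempt))
    raise last_error
```

Errors outside `retry_on` propagate immediately, so a malformed URL, for example, is not retried pointlessly.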
Objectives of Part 3
- Prepare final comprehensive documentation integrating all project phases
- Finalize code quality, UI design, and output formats for production readiness
- Successfully demonstrate the project at the DockerShowdown event hosted by ACM Student Chapter, VIT Chennai (November 05, 2025)
System Architecture & Components
The VulnScan Analyzer follows a modular, full-stack architecture that cleanly separates concerns between frontend presentation, backend processing, and data management layers.
Component Breakdown
| Container Name | Purpose | Technologies |
|---|---|---|
| web-vuln-scanner-frontend | Hosts the user interface during testing and deployment | React, HTML5, CSS3, JavaScript |
| web-vuln-scanner-backend | Core scanner service handling all security analysis | Flask, Python, BeautifulSoup4, Requests |
Supporting Base Images: busybox, python-slim
1. Frontend Web Application
The frontend serves as the primary user interface, built using modern web technologies with a focus on user experience and accessibility. Key functionalities include:
- Accepting target URLs and scan configuration parameters from users
- Providing intuitive controls for different scan types (basic, comprehensive)
- Displaying scan results in a structured, color-coded format for quick assessment
- Offering report export capabilities in multiple formats (JSON, CSV, HTML, PDF)
- Acting as the main entry point into the scanning system
2. API Gateway / Web Server
This component acts as the communication bridge between the frontend and backend systems, handling:
- Request routing and load balancing between frontend and backend services
- Authentication and session management using multiple methods (form-based, header tokens, cookies)
- Input validation and sanitization to prevent injection attacks
- Rate limiting and request queuing to ensure system stability
3. Backend Scanner Engine
The core of the system resides in the backend scanner engine, which performs the actual security analysis. This component includes several specialized sub-modules:
- Authentication & Session Module: Supports various authentication types (form, header, token, cookie) for scanning secured applications
- Plugin Manager / Dispatcher: Dynamically loads specialized modules for different vulnerability types including XSS, SQL injection, LFI, open redirects, and SSL/TLS issues
- Browser Automation & Rendering: Utilizes Playwright for JavaScript-rendered content analysis to detect client-side vulnerabilities
- Vulnerability Detection Modules: Specialized plugins that scan target endpoints for specific security issues (reflected XSS, stored XSS, time-based SQL injection, blind XSS callback)
- Reporting & Export Service: Formats scan results into multiple output formats (JSON, CSV, HTML, PDF, XML) for different use cases
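As an illustration of the reporting and export step, the sketch below serializes a list of findings to JSON and CSV using only the standard library. The finding fields (`category`, `issue`, `severity`) and the function names are assumptions made for the example; the project's real report schema is not shown in this post.

```python
import csv
import io
import json


def findings_to_json(findings):
    """Serialize findings (a list of dicts) as pretty-printed JSON."""
    return json.dumps(findings, indent=2)


def findings_to_csv(findings, fields=("category", "issue", "severity")):
    """Serialize findings as CSV text with a header row.

    Keys not listed in `fields` are ignored; missing keys become empty cells.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    writer.writerows(findings)
    return buffer.getvalue()


if __name__ == "__main__":
    sample = [{"category": "headers", "issue": "Missing CSP", "severity": "high"}]
    print(findings_to_json(sample))
    print(findings_to_csv(sample))
```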
4. Data Storage & Logging
This layer handles persistent data management requirements:
- Storage of scan results, audit logs, and plugin metadata for historical tracking
- Enabling result review, comparative analysis, and offline assessment
- Structured storage of output data, so that results can be exported in multiple formats
5. External Helpers & Services
Additional specialized services enhance the scanner's capabilities:
- Technology Fingerprinting: Detects server technologies, frameworks, CMS platforms, and library versions to tailor scanning strategies
- Stealth/Rate-limiting Engine: Implements techniques to avoid detection by target systems (user-agent rotation, request delays)
- Threat Intelligence / Certificate Checker: Performs SSL/TLS security checks and integrates with external vulnerability databases
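A very simple form of technology fingerprinting is to read the `Server` and `X-Powered-By` response headers mentioned earlier in this post. The sketch below shows one way this might work; the function name, the result fields, and the version-number heuristic are assumptions for illustration only.

```python
import re

# Response headers that commonly reveal the underlying technology stack.
DISCLOSURE_HEADERS = ("Server", "X-Powered-By")


def find_disclosed_technology(headers):
    """Report disclosure headers present in `headers` (name -> value).

    `reveals_version` is a rough heuristic: it is True when the value
    contains something that looks like a dotted version number.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    findings = []
    for name in DISCLOSURE_HEADERS:
        value = lowered.get(name.lower())
        if value:
            findings.append({
                "header": name,
                "value": value,
                "reveals_version": bool(re.search(r"\d+\.\d+", value)),
            })
    return findings
```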
Technical Implementation
Software Stack
| Software | Purpose | Version |
|---|---|---|
| Flask | Backend API and scanner framework | 2.3.x |
| React | Frontend UI development | 18.x |
| Requests Library | HTTP requests and response handling | 2.31.x |
| BeautifulSoup4 | HTML parsing for forms and elements | 4.12.x |
| Playwright | Browser automation for JS-rendered content | 1.36.x |
| Docker | Containerization and deployment | 24.x |
Architecture Overview
[ User Interface (React Frontend Web App) ]
|
v
[ API Gateway / Web Server (Frontend ↔ Backend Communication) ]
|
v
[ Backend Core – Vulnerability Scanner Engine (Flask) ]
├── Authentication & Session Management
├── Request Dispatcher & Plugin Manager
├── Browser Automation & Rendering (via Playwright)
├── Vulnerability Detection Modules (Plugins)
└── Reporting & Export Service
|
v
[ Data Storage / Logging Layer ]
├── Scan Results Storage (JSON/CSV/HTML/PDF)
├── Plugin Metadata / Configurations
└── Logging & Audit Trail
|
v
[ External Services & Helpers ]
├── Technology Fingerprinting / Framework Detection
├── Rate-Limiter / Stealth Engine
└── Threat Intelligence API / Certificate Checking
Data Flow Description
The Web Vulnerability Scanner follows a clean client-server architecture pattern. Users interact with the React-based graphical interface, submitting URLs they own or have permission to scan. This input is sent to the Flask backend as an HTTP POST request with a JSON body.
The backend performs multiple safe and ethical scanning operations:
- Fetches the target webpage HTML content
- Extracts and analyzes the HTTP response code
- Collects Content-Type and Server header information
- Detects missing or insecure security headers:
  - Content-Security-Policy (CSP)
  - X-Frame-Options
  - Strict-Transport-Security (HSTS)
- Extracts cookies and validates security attributes (Secure, HttpOnly, SameSite)
- Scans all HTML <form> elements to detect potentially vulnerable input fields
- Compiles all findings into a structured JSON response
The frontend receives the JSON response and renders it into a clean, color-coded table format. This separation of concerns makes the application modular, scalable, and maintainable.
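The cookie step in the flow above could look roughly like the following. It is a minimal sketch that assumes each `Set-Cookie` header value is passed in separately as a string; `audit_set_cookie` and the shape of its result are hypothetical, not the project's actual code.

```python
SECURITY_FLAGS = ("Secure", "HttpOnly", "SameSite")


def audit_set_cookie(header_value):
    """Return the cookie name and which security flags are missing.

    `header_value` is the value of a single Set-Cookie header, for example
    "sessionid=abc123; Path=/; HttpOnly". Attribute names are compared
    case-insensitively.
    """
    parts = [part.strip() for part in header_value.split(";") if part.strip()]
    if not parts:
        raise ValueError("empty Set-Cookie value")
    cookie_name = parts[0].split("=", 1)[0]
    # Everything after the first "name=value" pair is an attribute.
    attributes = {part.split("=", 1)[0].strip().lower() for part in parts[1:]}
    missing = [flag for flag in SECURITY_FLAGS if flag.lower() not in attributes]
    return {"cookie": cookie_name, "missing_flags": missing}


if __name__ == "__main__":
    print(audit_set_cookie("sessionid=abc123; Path=/; HttpOnly"))
    # {'cookie': 'sessionid', 'missing_flags': ['Secure', 'SameSite']}
```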
Implementation Procedure
Part 1: Foundation Setup
The initial implementation focused on establishing the core scanning functionality:
Docker Commands:
# Pull the image from Docker Hub
docker pull keerthivasan1310/web-vuln-scanner:latest
# Run the container (interactive)
docker run -it --rm keerthivasan1310/web-vuln-scanner:latest --help
# Basic scan (pass target URL)
docker run -it --rm \
keerthivasan1310/web-vuln-scanner:latest \
-u https://testphp.vulnweb.com
# Comprehensive scan
docker run -it --rm \
keerthivasan1310/web-vuln-scanner:latest \
-u https://testphp.vulnweb.com --type comprehensive
# Generate reports and save to host (mount output directory)
docker run -it --rm \
-v $(pwd)/output:/app/output \
keerthivasan1310/web-vuln-scanner:latest \
-u https://testphp.vulnweb.com --format html -o /app/output/scan_report.html
# Export PDF report
docker run -it --rm \
-v $(pwd)/output:/app/output \
keerthivasan1310/web-vuln-scanner:latest \
-u https://testphp.vulnweb.com --format pdf -o /app/output/scan_report.pdf
# Run tests inside container
docker run -it --rm \
-v $(pwd):/app \
keerthivasan1310/web-vuln-scanner:latest \
pytest -q
Testing Commands:
# Basic scan
python main.py -u https://testphp.vulnweb.com
# Comprehensive scan
python main.py -u https://testphp.vulnweb.com --type comprehensive
# Generate reports
python main.py -u https://testphp.vulnweb.com --format html -o scan_report.html
python main.py -u https://testphp.vulnweb.com --format pdf -o scan_report.pdf
# Run tests
python -m pytest tests/
Part 2: Integration & Enhancement
The second phase focused on integrating components and expanding functionality:
- Seamlessly integrated the React UI with the Flask backend
- Added comprehensive HTML form and cookie security analysis
- Implemented robust error handling and user feedback mechanisms
- Enhanced output formatting for better readability
API Testing Examples:
# Test basic scan
curl -X POST http://localhost:5000/api/scan \
-H "Content-Type: application/json" \
-d '{"url":"https://testphp.vulnweb.com", "scan_type":"basic"}'
# Test comprehensive scan
curl -X POST http://localhost:5000/api/scan \
-H "Content-Type: application/json" \
-d '{"url":"https://testphp.vulnweb.com/login.php", "scan_type":"comprehensive"}'
# Test health check
curl http://localhost:5000/api/health
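The JSON bodies in the requests above carry two fields, `url` and `scan_type`. A backend would normally validate them before starting a scan; the sketch below shows one way to do that with the standard library. The function name, the error messages, and the choice of `basic` as the default scan type are assumptions for illustration.

```python
from urllib.parse import urlparse

ALLOWED_SCAN_TYPES = ("basic", "comprehensive")


def parse_scan_request(body):
    """Validate a decoded JSON body; return (request, error).

    Exactly one element of the returned pair is None.
    """
    if not isinstance(body, dict):
        return None, "request body must be a JSON object"
    url = body.get("url")
    if not isinstance(url, str) or not url.strip():
        return None, "'url' is required"
    url = url.strip()
    try:
        parsed = urlparse(url)
    except ValueError:
        return None, "'url' is malformed"
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return None, "'url' must be an absolute http(s) URL"
    # Defaulting to "basic" is an assumption made for this sketch.
    scan_type = body.get("scan_type", "basic")
    if scan_type not in ALLOWED_SCAN_TYPES:
        return None, "'scan_type' must be 'basic' or 'comprehensive'"
    return {"url": url, "scan_type": scan_type}, None
```

Returning an error message instead of raising makes it straightforward for a request handler to turn a bad request into a 400 response.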
Part 3: Finalization & Demonstration
The final phase involved polishing the application and preparing for demonstration:
- Conducted comprehensive testing and UI refinement
- Completed full project documentation
- Created the project blog for the Part 3 submission
- Successfully demonstrated the platform at the ACM DockerShowdown event
Container Modifications & Optimization
Throughout the development process, several optimizations were implemented to enhance performance, security, and user experience:
| Step | Modification | Impact |
|---|---|---|
| 1 | Reduced unused libraries in backend container | Decreased image size by 40% and improved security by minimizing attack surface |
| 2 | Performance improvements in HTML parsing logic | Reduced scan time by 25% for complex pages with numerous elements |
| 3 | Enhanced UI styling and alignment | Improved user experience and professional appearance |
| 4 | Added timeout mechanisms to request handlers | Prevented hanging scans and improved system reliability |
| 5 | Implemented safe URL validation | Enhanced security by preventing SSRF attacks and malformed URL processing |
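Step 5 in the table mentions URL validation as a guard against SSRF. The post does not show how this was done, so the following is only a sketch of one common approach: reject non-HTTP schemes, `localhost`, and literal IP addresses in private, loopback, or otherwise internal ranges. The function name `is_safe_target` is hypothetical, and a complete defence would also resolve DNS names and re-check the resulting addresses, which is not shown here.

```python
import ipaddress
from urllib.parse import urlparse


def is_safe_target(url):
    """Return False for URLs that point at obviously internal targets."""
    try:
        parsed = urlparse(url)
    except ValueError:
        return False
    if parsed.scheme not in ("http", "https"):
        return False
    host = parsed.hostname  # already lowercased by urlparse
    if not host or host == "localhost" or host.endswith(".localhost"):
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Not a literal IP, so it is a DNS name. A real check would
        # resolve it and validate the resolved addresses as well.
        return True
    return not (
        address.is_private
        or address.is_loopback
        or address.is_link_local
        or address.is_reserved
        or address.is_multicast
        or address.is_unspecified
    )
```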
Project Repository Links
Project Outcomes & Achievements
1. Developed a Fully Functional Web Vulnerability Scanner
The primary outcome was the successful engineering of a tangible, production-ready full-stack application. This working product transcends being merely a script—it represents a complete, service-oriented tool capable of executing on-demand security scans. The Flask backend serves as the robust analysis engine, handling HTTP requests, response parsing (using libraries like Requests and BeautifulSoup), and vulnerability detection logic.
The system systematically checks for HTTP status codes, parses Set-Cookie headers to validate security flags (HttpOnly, Secure, SameSite), and inventories all HTML forms for potential vulnerabilities. Concurrently, the React frontend delivers a seamless client-side experience, managing user input, handling asynchronous API communication with the backend, displaying loading states, and ultimately rendering complex JSON results into structured, human-readable security reports.
The successful integration of these components demonstrates a comprehensive understanding of full-stack development principles and API-driven architecture, producing a tool that is both technically sophisticated and practically useful.
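The form inventory described above can be sketched as follows. The project uses BeautifulSoup4 for HTML parsing; to keep this example dependency-free it swaps in the standard library's `html.parser` instead. The token-name hints, class name, and function name are assumptions, and the heuristic only flags POST forms that lack a hidden field whose name looks like an anti-CSRF token.

```python
from html.parser import HTMLParser

# Substrings that commonly appear in anti-CSRF field names (a heuristic).
TOKEN_HINTS = ("csrf", "token", "authenticity")


class FormAuditor(HTMLParser):
    """Collects each <form> and notes whether it has a token-like hidden input."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "action": attrs.get("action") or "",
                "method": (attrs.get("method") or "get").lower(),
                "has_token_field": False,
            }
        elif tag == "input" and self._current is not None:
            name = (attrs.get("name") or "").lower()
            is_hidden = (attrs.get("type") or "").lower() == "hidden"
            if is_hidden and any(hint in name for hint in TOKEN_HINTS):
                self._current["has_token_field"] = True

    def handle_endtag(self, tag):
        if tag == "form" and self._current is not None:
            self.forms.append(self._current)
            self._current = None


def find_forms_without_token(html):
    """Return POST forms that have no token-like hidden input."""
    auditor = FormAuditor()
    auditor.feed(html)
    auditor.close()
    return [form for form in auditor.forms
            if form["method"] == "post" and not form["has_token_field"]]
```

A missing token field is only an indicator: some applications protect forms through other means, such as SameSite cookies or custom request headers.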
2. Promoted Ethical Cybersecurity Practices and Awareness
This project was conceived and built from the ground up as a "white-hat" or defensive security tool, with ethical considerations at its core. It operates strictly as a passive scanner—observing configurations and headers without attempting to inject malicious payloads, brute-force logins, or actively exploit discovered weaknesses.
This design philosophy enforces a crucial ethical boundary, ensuring the tool is used exclusively for security auditing and hardening purposes, not for offensive attacks. By building and utilizing a tool limited to scanning permitted websites, the project serves a vital educational purpose. It effectively bridges the gap between theoretical security knowledge (e.g., "what is XSS?") and practical defense implementation (e.g., "my site is missing the CSP header that helps prevent XSS"), fostering a responsible, security-first mindset among developers.
3. Strengthened Practical Web System Analysis Skills
Building this scanner required a deep, practical immersion into the mechanics of the HTTP protocol and web application architecture. This assignment extended far beyond textbook theory, providing hands-on experience in programmatically initiating HTTP communication, handling diverse server responses (including redirects and errors), and parsing raw HTML to extract meaningful security intelligence.
A critical skill developed through this project was the ability to identify vulnerability indicators—understanding the security implications of a missing X-Frame-Options header (enabling Clickjacking attacks) or an exposed Server version header (Information Leakage). Furthermore, the project necessitated comprehensive full-stack debugging, solving real-world challenges like CORS (Cross-Origin Resource Sharing) policies, managing asynchronous state in React, and ensuring data integrity between the Python backend and JavaScript frontend.
4. Designed a Clear and Informative Visual User Interface
A core objective was transforming complex security data into accessible, actionable information, particularly for users who may not be security experts. The React UI was architected as an intuitive results dashboard, not merely a simple form submission interface.
It intelligently translates raw JSON data from the scanner into logically grouped components (e.g., "Header Analysis," "Cookie Security," "Form Vulnerabilities"). The interface employs strong visual cues, such as color-coding (red for critical issues, yellow for warnings, green for secure configurations) to provide immediate at-a-glance assessment of a site's security posture.
Most importantly, the interface provides essential context: it doesn't just state "Missing HttpOnly Flag," but explains the associated risk (e.g., "vulnerable to session hijacking via XSS") and offers clear remediation recommendations, empowering users to effectively address identified issues.
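The mapping from a finding to what the dashboard shows can be expressed in a few lines. The React component itself is not shown in this post, so the sketch below uses Python to illustrate the idea only; the field names, the severity labels, and the fallback to yellow for unknown severities are all assumptions.

```python
# Colour scheme described above: red, yellow, green.
SEVERITY_COLORS = {"critical": "red", "warning": "yellow", "secure": "green"}


def to_display_row(finding):
    """Turn a raw finding into the fields a results table would render."""
    severity = finding.get("severity", "warning")
    return {
        "label": finding["issue"],
        # Unknown severities fall back to yellow (an assumption).
        "color": SEVERITY_COLORS.get(severity, "yellow"),
        "risk": finding.get("risk", ""),
        "fix": finding.get("fix", ""),
    }


if __name__ == "__main__":
    print(to_display_row({
        "issue": "Missing HttpOnly Flag",
        "severity": "critical",
        "risk": "vulnerable to session hijacking via XSS",
        "fix": "Add the HttpOnly attribute to the Set-Cookie header",
    }))
```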
5. Successfully Demonstrated at a Technical Event
The project's functionality and practical impact received external validation during a live showcase at the DockerShowdown Event, organized by the ACM Student Chapter at VIT Chennai on November 5th, 2025. This demonstration provided a high-pressure, real-world testing environment where the application was deployed and used live by event evaluators and peers.
Attendees could input URLs and witness the scanner's analysis in real-time, proving the tool's stability, accuracy, and practical utility. Receiving positive feedback and validation from judges and participants regarding the project's technical scope, professional execution, and real-world relevance served as final confirmation of its success and production readiness.
Conclusion
This integrated project provided a profound, hands-on learning experience that effectively bridged the gap between theoretical cybersecurity principles and practical, secure web engineering. The development lifecycle, strategically segmented across three stages, mirrored a professional, agile workflow.
It began with foundational backend logic, establishing the core scanning engine using Python and Flask. This initial phase concentrated on mastering HTTP-based communication, enabling the application to make requests, handle responses, and parse raw server data effectively.
The next stage escalated the project's complexity by introducing the React frontend, transforming the tool from a backend script into a cohesive, full-stack system. This stage was critical for strengthening skills in modern web development, including component-based UI design, client-side state management, and the crucial integration of asynchronous API calls to the Flask backend. This seamless integration ensured users could input a URL and receive processed results, representing a complete, end-to-end data flow.
Finally, the project matured into a full-featured, user-friendly security analysis tool. This phase focused on deepening practical awareness of specific, high-impact vulnerabilities. The scanner was enhanced to meticulously check for insecure cookie configurations (missing HttpOnly, Secure, or SameSite flags), directly addressing session hijacking prevention. It also programmatically identified missing security headers (like Content-Security-Policy, Strict-Transport-Security, and X-Frame-Options), providing tangible context on mitigating complex attacks such as XSS, protocol downgrade attacks, and clickjacking. The analysis of unprotected HTML forms further provided hands-on experience in spotting and understanding Cross-Site Request Forgery (CSRF) risks.
The successful presentation of this polished scanner at the ACM DockerShowdown Event served as powerful validation of its technical value and practical readiness. This demonstration proved the tool's stability, user-friendliness, and effectiveness in a live, evaluative setting, showcasing professional execution standards.
Overall, this project accomplished its core objectives on multiple fronts. It systematically built strong technical competence in Python, React, and web protocols; instilled a deep-seated understanding of responsible ethical practices by designing a tool for defensive analysis in educational contexts; and set a high standard for professional project management and documentation, culminating in a robust, valuable cybersecurity-focused application with real-world utility.
Acknowledgements
I would like to express my sincere gratitude to everyone who supported me throughout the development of this project.
Academic Support
- VIT SCOPE Department & Faculty: For providing the academic framework and essential resources.
- Dr. T Subbulakshmi: For guidance, academic direction, and valuable insights throughout the project.
Collaborators & Support System
- Friends & Peers: For contributing to testing, debugging, and review.
- Club Mates: For discussions, collaborative troubleshooting, and shared technical effort.
Personal Support
- Family: For continuous encouragement and patience during the development phase.
References
Open Source Community
- GitHub and Docker Maintainers: For repository structures and deployment setup guidance.
- Library and Tool Developers: For open-source frameworks including Flask, React, BeautifulSoup, and Playwright.
Event Platform
- ACM Student Chapter – DockerShowdown: For providing a platform for project presentation.
- Event Judges & Participants: For feedback and technical evaluation.
Declaration: This project was developed strictly for educational purposes and ethical security testing. All scanning was performed only on websites owned by the developer or with explicit permission from the site owners. Unauthorized scanning of websites is illegal and unethical.
Web Vulnerability Scanner Project | Docker-Implementation | VIT Chennai | 2025
Get in Touch
Email: keerthivasan.e2023@vitstudent.ac.in
Open for open-source contributions and collaborations