A Lightweight Comparative Machine Learning Framework for Phishing Website Detection Using Optimized URL Features

Priya Pal; Vivek Shukla; Atul; Divya Mishra; Rishabh Tiwari; Mehul Kumar Das

doi:10.20944/preprints202605.1680.v1

Submitted: 23 May 2026

Posted: 26 May 2026


Abstract

Phishing is among the most common cybersecurity threats. In a phishing attack, an attacker creates a fraudulent website or manipulates a URL to obtain a user’s sensitive information, such as credentials, payment details, or personal data. Attackers bait online users into clicking a fraudulent link, and the threat is a growing concern for users worldwide. We propose a lightweight, fast phishing detection framework that identifies phishing URLs. The framework extracts a reduced set of URL features covering the characteristics, structures, and patterns observed in URLs. These features serve as input to three supervised machine learning methods: Logistic Regression, Decision Tree, and Random Forest. The models were evaluated on four classification metrics: accuracy, precision, recall, and F1-score. Within the framework, the Random Forest classifier was the most accurate at phishing detection while keeping computational requirements minimal. The framework is intended to support real-time cybersecurity protection in browsers and is designed to be scalable and efficient.
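The pipeline described in the abstract (lexical URL features in, four classification metrics out) can be illustrated with a minimal sketch. The feature names below are hypothetical examples of lexical URL features, since the abstract does not list the optimized feature set, and the helper names `extract_url_features` and `classification_metrics` are assumptions introduced only for illustration. Labels are assumed to be 1 for phishing and 0 for legitimate.

```python
import re
from urllib.parse import urlparse


def extract_url_features(url: str) -> dict:
    """Return a small, illustrative set of lexical URL features.

    Assumption: these features are examples only; the paper's actual
    optimized feature set is not given in the abstract.
    """
    # urlparse needs a scheme to populate hostname, so add one if missing.
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = parsed.hostname or ""
    is_ip = bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host))
    return {
        "url_length": len(url),
        "num_dots": url.count("."),
        "num_hyphens": url.count("-"),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": int("@" in url),
        "has_ip_host": int(is_ip),
        "uses_https": int(parsed.scheme == "https"),
        # Labels before the registered domain; 0 for raw IP hosts.
        "num_subdomains": 0 if is_ip else max(host.count(".") - 1, 0),
    }


def classification_metrics(y_true, y_pred) -> dict:
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    print(extract_url_features("https://www.example.com/"))
    print(classification_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 1]))
```

In a fuller experiment, the feature dictionaries would be converted to a numeric matrix and passed to classifiers such as scikit-learn's `LogisticRegression`, `DecisionTreeClassifier`, and `RandomForestClassifier`, with the same four metrics computed on a held-out test split. That step is omitted here to keep the sketch dependency-free.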

Keywords: phishing detection; machine learning; URL features; cybersecurity; random forest; lightweight detection

Subject: Computer Science and Mathematics - Computer Science

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.

