Preprint
Article

This version is not peer-reviewed.

A Lightweight Comparative Machine Learning Framework for Phishing Website Detection Using Optimized URL Features

Submitted:

23 May 2026

Posted:

26 May 2026


Abstract
Phishing is among the most common cybersecurity threats. In a phishing attack, an attacker creates a fraudulent website or manipulates a URL to obtain a user’s sensitive information, such as credentials, payment details, or personal data, typically by baiting the user into clicking a fraudulent link. Phishing is a growing concern for users worldwide. I propose a lightweight, fast framework for detecting phishing URLs. The framework extracts a reduced set of URL features that capture the characteristics, structures, and patterns observed in URLs. These features are supplied as input to three supervised machine learning methods: Logistic Regression, Decision Tree, and Random Forest. The models were evaluated on four classification metrics: accuracy, precision, recall, and F1-score. Within the framework, the Random Forest classifier was the most accurate at phishing detection while keeping computational requirements minimal. The framework is intended to support real-time cybersecurity protection in browsers, and it proved scalable and efficient.
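As a rough illustration of the pipeline the abstract describes, the sketch below derives a handful of lexical features from a URL string and computes the four reported metrics from binary labels. The abstract does not list the actual feature set, so every feature name here (`url_length`, `num_dots`, `host_is_ip`, and so on) and both function names are assumptions chosen for illustration, not the author's implementation. Labels are assumed to be 1 for phishing and 0 for legitimate.

```python
from urllib.parse import urlparse


def extract_url_features(url: str) -> dict:
    """Return a small set of lexical URL features.

    The feature set is hypothetical; the source does not enumerate its features.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "num_dots": host.count("."),
        "num_hyphens": host.count("-"),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": int("@" in url),
        "uses_https": int(parsed.scheme == "https"),
        # Crude IPv4 check: the host consists only of digits and dots.
        "host_is_ip": int(host != "" and host.replace(".", "").isdigit()),
    }


def classification_metrics(y_true, y_pred) -> dict:
    """Compute accuracy, precision, recall, and F1 for binary labels (1 = phishing)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # Fall back to 0.0 when a denominator is zero.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

In a setup like the one described, the feature dictionaries would be turned into numeric vectors and passed to each of the three classifiers, and `classification_metrics` would be applied to each model's predictions on held-out URLs to compare them.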
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.


© 2026 MDPI (Basel, Switzerland) unless otherwise stated