A Large-Scale Pretrained Deep Model for Phishing URL Detection

Bibliographic Details
Main Authors:	Wang, Yanbin; Zhu, Weifan; Xu, Haitao; Qin, Zhan; Ren, Kui; Ma, Wenrui
Format:	Conference Proceeding
Language:	English
Subjects:	Fine-tune; Phishing; Phishing detection; Pre-training; Robustness; Security; Self supervised learning; Signal processing; Task analysis; Transformers; Uniform resource locators

Description
Summary:	Phishing attacks have long been a security issue of great concern to the cybersecurity community. Recently, well-known pre-trained models have been used as an anti-phishing solution. However, existing studies either simply transfer models pre-trained on text to the phishing detection task, or pre-train models using only extremely small sets of phishing samples. In this paper, we propose PhishBERT, a veritable pre-trained deep transformer network model for phishing URL detection. Using a tailored pre-training objective, PhishBERT obtains a general understanding of various URLs by being pre-trained on a corpus of more than 3 billion unlabeled URLs. It is then transferred to the task of detecting benign and malicious URLs, with supervised fine-tuning using adversarial methods. Extensive and rigorous benchmark studies verify that PhishBERT is significantly superior to the current state-of-the-art methods in terms of efficiency, robustness and accuracy on the task of phishing website detection.
ISSN:	2379-190X
DOI:	10.1109/ICASSP49357.2023.10095719