A Large-Scale Pretrained Deep Model for Phishing URL Detection

Bibliographic Details
Main Authors: Wang, Yanbin, Zhu, Weifan, Xu, Haitao, Qin, Zhan, Ren, Kui, Ma, Wenrui
Format: Conference Proceeding
Language: English
Description
Summary: Phishing attacks have long been a security issue of great concern to the cybersecurity community. Recently, well-known pre-trained models have been applied as an anti-phishing solution. However, existing studies either simply transfer models pre-trained on text to the phishing detection task, or pre-train models on only extremely small sets of phishing samples. In this paper, we propose PhishBERT, a genuinely pre-trained deep transformer network model for phishing URL detection. Using a tailored pre-training objective, PhishBERT acquires a general understanding of diverse URLs by being pre-trained on a corpus of more than 3 billion unlabeled URLs. It is then transferred to the task of distinguishing benign from malicious URLs through supervised fine-tuning with adversarial methods. Extensive and rigorous benchmark studies verify that PhishBERT is significantly superior to current state-of-the-art methods in efficiency, robustness, and accuracy on the task of phishing website detection.
ISSN: 2379-190X
DOI: 10.1109/ICASSP49357.2023.10095719