Improved Malicious URL Detection Scanner
Malicious URLs are a common cybersecurity threat and cause billions of dollars in losses to unsuspecting victims every year. They lure people into scams that steal personal information, gain access to devices, or install malware, so it is important to detect them promptly. Traditionally, blacklists are used for this purpose. However, blacklists are never exhaustive and cannot flag malicious URLs that have not yet been reported. This is why machine learning (ML) based methods are increasingly being explored.
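To make the blacklist limitation concrete, here is a minimal Python sketch. The blacklist entry, the example URL, and the feature names are all hypothetical and chosen only for illustration; they are not taken from any real scanner. It contrasts an exact-match lookup with the kind of lexical features an ML model could learn from.

```python
import re
from urllib.parse import urlparse

# Hypothetical blacklist; a real deployment would load this from a threat feed.
BLACKLIST = {"http://known-bad.example/login"}

def is_blacklisted(url: str) -> bool:
    """Exact-match lookup: only catches URLs that were already reported."""
    return url in BLACKLIST

def lexical_features(url: str) -> dict:
    """Extract a few simple lexical features (an illustrative, assumed set)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "length": len(url),
        "num_dots": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_ip_host": bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host)),
        "has_at_symbol": "@" in url,
        "uses_https": parsed.scheme == "https",
    }

new_url = "http://192.168.10.5/secure-login/update?id=4471"
print(is_blacklisted(new_url))    # False: the blacklist has never seen this URL
print(lexical_features(new_url))  # features a trained model could still score
```

The lookup returns nothing useful for an unseen URL, while the feature vector still exposes suspicious traits such as a raw IP address in place of a domain name.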
Detect Dangerous Links with a URL Threat Scanner
Using a combination of machine learning models and instance selection techniques, this article presents an ensemble approach that improves the detection accuracy of malicious URLs. Four machine learning models, namely decision trees (DTs), random forests (RFs), k-nearest neighbors (KNN), and support vector machines (SVMs), are paired with three instance selection techniques: the locality-sensitive hashing methods BPLSH and DRLSH, and random selection. Instance selection produces a considerably smaller but still representative training set, which speeds up training, makes patterns in the data easier to discover, and improves research efficiency.
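Of the three instance selection techniques, random selection is the simplest to illustrate. The sketch below is a minimal pure-Python version under stated assumptions: the function name, the `keep_ratio` parameter, the per-class (stratified) sampling, and the toy data are all choices made for this example, not details from the study. It does not implement BPLSH or DRLSH.

```python
import random
from collections import defaultdict

def random_instance_selection(samples, labels, keep_ratio=0.2, seed=0):
    """Keep a random, class-stratified fraction of the training instances."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    selected = []
    for indices in by_class.values():
        # Keep at least one instance per class so no class disappears.
        k = max(1, round(len(indices) * keep_ratio))
        selected.extend(rng.sample(indices, k))

    selected.sort()  # preserve the original ordering of the kept instances
    return [samples[i] for i in selected], [labels[i] for i in selected]

# Toy data: 100 feature vectors (URL length, digit count), 30 labelled malicious (1).
X = [(20 + i, i % 7) for i in range(100)]
y = [1 if i % 10 < 3 else 0 for i in range(100)]

X_small, y_small = random_instance_selection(X, y, keep_ratio=0.2)
print(len(X_small), sum(y_small))  # 20 6
```

Sampling within each class keeps the malicious-to-benign ratio of the reduced set close to that of the full set, which matters when malicious URLs are the minority class.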
The results show that the ensemble model performs significantly better than the individual models and a purely lexical solution. Specifically, the detection rate for malicious URLs improved by 22% when the ensemble model was integrated into the FireEye Advanced URL Detection Engine workflow.
In addition to improving overall performance, the ensemble model eliminates ambiguous decisions by applying a condition-based elimination procedure. This ensures that only unambiguous decisions are made, which helps the model avoid false positives and false negatives and further reduces prediction errors. MATLAB users can take advantage of the new model by downloading it from GitHub and integrating it into their own application development and security workflows.
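The exact elimination condition is not described here, so the following Python sketch shows only one plausible reading: a majority vote that withholds a verdict whenever the base models do not agree strongly enough. The function name, the `min_agreement` threshold, and the "undecided" outcome are assumptions made for illustration.

```python
from collections import Counter

MALICIOUS, BENIGN, UNDECIDED = "malicious", "benign", "undecided"

def ensemble_decision(votes, min_agreement=0.75):
    """Majority vote that withholds a verdict when base models disagree too much."""
    if not votes:
        return UNDECIDED
    label, count = Counter(votes).most_common(1)[0]
    # Assumed condition: accept the majority label only if enough models share it.
    if count / len(votes) >= min_agreement:
        return label
    return UNDECIDED

# Votes from four hypothetical base models (DT, RF, KNN, SVM) for two URLs.
print(ensemble_decision([MALICIOUS, MALICIOUS, MALICIOUS, BENIGN]))  # malicious
print(ensemble_decision([MALICIOUS, BENIGN, MALICIOUS, BENIGN]))     # undecided
```

URLs that come back undecided could be routed to a secondary check instead of being forced into a label, which is one way an ensemble can trade coverage for fewer false positives and false negatives.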
