Understanding URL Obfuscation

URL obfuscation is a tactic cybercriminals use to disguise malicious URLs so that they appear legitimate and slip past security filters. It involves altering the URL’s structure or encoding it in ways that hide its true destination, such as using URL shorteners, adding unnecessary characters, or employing hexadecimal encoding.

The goal is to trick users into clicking on the link, which leads to phishing sites or malware downloads while avoiding detection by security systems.

URL Component Misuse Explained

A specific type of URL obfuscation, URL component misuse, involves manipulating parts of the URL structure to deceive users. Certain components, like the authority component (which holds the optional user information, the host, and an optional port), are often targeted for malicious purposes. Understanding and mitigating threats from URL obfuscation requires familiarity with the structure of URLs as detailed in RFC 3986.

RFC stands for “Request for Comments”; the sequence of numbered RFCs is the set of published specifications for most internet-related standards. RFC 3986 is the specification for URI (Uniform Resource Identifier) syntax.

Cybercriminals manipulate the user information section of the URL (traditionally used for usernames) to include what appears to be a legitimate domain, creating confusion and making the URL seem trustworthy at a glance.

For example:
https://netflix.com login=secure [email protected]” 

Here, “netflix.com” appears to be the domain but actually sits in the username position, misleading users into thinking they are visiting Netflix’s website. The trick relies on the “@” symbol: browsers treat everything in the authority before the “@” as user information rather than as the destination, and send the request to the host that follows the “@”. Attackers exploit this by placing familiar text in the username and password fields to lure users to malicious sites.
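To see how a standards-based parser splits such a link, here is a minimal Python sketch using the standard library's `urllib.parse`. The URL and the host `example.test` are hypothetical stand-ins chosen for illustration, not the link from the attack, and `has_userinfo` is an illustrative helper name.

```python
from urllib.parse import urlsplit

# Hypothetical URL for illustration: "netflix.com" sits in the userinfo
# component, and "example.test" is the host the request would go to.
url = "https://netflix.com@example.test/login"

parts = urlsplit(url)
print("userinfo (username):", parts.username)  # the text before "@"
print("actual host:        ", parts.hostname)  # the text after "@"


def has_userinfo(candidate: str) -> bool:
    """Return True if the URL's authority contains a userinfo part."""
    split = urlsplit(candidate)
    return split.username is not None or split.password is not None


print(has_userinfo(url))                          # True
print(has_userinfo("https://example.test/login")) # False
```

Legitimate web links rarely carry user information, so its mere presence is a reasonable signal to inspect the link more closely.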

A similar example involves URLs that appear to be legitimate Microsoft login pages but actually lead to phishing sites using the same “@” obfuscation technique.

The URL https://[email protected]/XwRFqh leads to the fake Microsoft login page:

[Image: Fake Microsoft login page]

This technique is designed to confuse reputation engines, causing them to parse the URL as “microsoft.com” instead of recognizing “is.gd” as the actual hosting site, because of where the “@” symbol is placed within the URL. Users who only glance at the link believe they are clicking on a legitimate “microsoft.com” URL and overlook the true destination hidden in the rest of the URL.

Perception Point detected a similar method of sneaking malicious links into email inboxes, in which attackers took advantage of a key difference in how email security filters and browsers read URLs. With the “@” symbol inserted in the middle of the URL, ordinary email security filters interpreted that part of the URL as a comment, while browsers interpreted it as a legitimate web domain. The phishing emails therefore bypassed security, and targets who clicked the link were still directed to a fake landing page.
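One way a filter could reduce this kind of mismatch is to parse every link the same way a browser would and flag any that carry user information. The sketch below is an illustration under stated assumptions, not Perception Point's detection logic: the regular expression is a deliberately simple link extractor, and the message body, the function name `links_with_userinfo`, and the host `example.test` are hypothetical.

```python
import re
from urllib.parse import urlsplit

# Deliberately simple link extractor (assumption): real email parsing
# would also need to handle HTML attributes, encodings, and more.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def links_with_userinfo(text: str) -> list[tuple[str, str]]:
    """Return (url, real_host) pairs for links that carry a userinfo part."""
    flagged = []
    for url in URL_PATTERN.findall(text):
        try:
            parts = urlsplit(url)
            username, host = parts.username, parts.hostname
        except ValueError:
            continue  # skip links the parser rejects as malformed
        if username is not None and host:
            flagged.append((url, host))
    return flagged


# Hypothetical message body for illustration.
body = (
    "Verify your account: https://microsoft.com@example.test/login "
    "or visit https://example.org/help for support."
)
for url, host in links_with_userinfo(body):
    print(f"suspicious link {url!r} actually points to {host!r}")
```

The design choice here is to report the host that a browser would contact, so that reputation checks run against the real destination rather than the text before the “@”.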

Homograph Attacks

A homograph attack, also known as an IDN homograph attack, homoglyph attack, or Punycode attack, is a method of deception that takes advantage of the visual similarity between characters, sometimes even within the same alphabet (the lowercase “l” vs. the uppercase “I”), to create fake or misleading but familiar-looking domain names or URLs. These tactics exploit both the limits of human perception and the technical intricacies of internationalized domain names (IDNs).

A classic example is when a user receives an email from a seemingly legitimate source, like a bank, but the URL in the email leads not to the trusted bank’s site but to a fraudulent one, meticulously crafted to siphon sensitive information. The recipient is unlikely to notice the slight differences in the letters or characters of the malicious URL or domain name.

Homoglyph attacks can be seen as a subset of homograph attacks that specifically exploit characters that appear identical or nearly identical, such as the Latin letter “o” and the Cyrillic “о”. The subtlety of this attack lies in its ability to bypass the user’s vigilance and even some security measures, as the deceptive domain name looks virtually the same as the intended one.
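The difference between the two letters is invisible on screen but obvious to software. The following minimal Python sketch compares the Latin and Cyrillic “o” and shows the ASCII (Punycode) form of a look-alike domain. The domain `domain.example` is a hypothetical placeholder, and Python's built-in `idna` codec implements the older IDNA 2003 rules, so treat this as an illustration rather than a full IDN implementation.

```python
latin = "o"          # U+006F LATIN SMALL LETTER O
cyrillic = "\u043e"  # U+043E CYRILLIC SMALL LETTER O

print(latin == cyrillic)                    # False: different code points
print(hex(ord(latin)), hex(ord(cyrillic)))  # 0x6f 0x43e

# Hypothetical look-alike: "domain.example" with a Cyrillic "o".
lookalike = "d\u043emain.example"
print(lookalike == "domain.example")        # False, although it renders alike

# The built-in "idna" codec (IDNA 2003) gives the ASCII form used in DNS;
# labels with non-ASCII characters are encoded with an "xn--" prefix.
encoded = lookalike.encode("idna")
print(encoded)
```

Seeing an unexpected `xn--` label behind a familiar-looking name is one practical sign that a domain contains non-Latin characters.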

Looking at the same Netflix example as above:

“ոеtflⅰх”

Beyond manipulating the URL syntax with the “@” symbol, the attacker also used the homograph technique to bypass detection. This looks like “netflix” but is a combination of Armenian and Cyrillic characters that mimic visually similar Latin characters, which defeats direct comparison and can confuse matching mechanisms or reputation engines. These obfuscation techniques often appear in the URL description, the title, the email From address, and the email content itself.
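A quick way to expose such a string is to list the Unicode name of every non-ASCII character in it. The sketch below uses Python's standard `unicodedata` module. The code points are my reading of the displayed string and should be treated as an assumption, and `describe_non_ascii` is an illustrative helper name.

```python
import unicodedata

# Code points below are an assumed reading of the displayed string.
spoofed = "\u0578\u0435tfl\u2170\u0445"


def describe_non_ascii(text: str) -> list[tuple[str, str]]:
    """Return (code point, Unicode name) for each non-ASCII character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in text
        if not ch.isascii()
    ]


print(spoofed == "netflix")  # False: the strings only look alike
for code_point, name in describe_non_ascii(spoofed):
    print(code_point, name)
```

A brand name that should be plain ASCII but reports any non-ASCII characters is a strong candidate for closer inspection.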

[Screenshot: An attempted phishing email using the homograph technique.]
[Screenshot: How the Perception Point detection managed to catch this phishing attempt.]

Additional examples using the homograph evasion technique: 

“؜؜؜А؜؜؜؜؜؜ᴄ؜؜؜؜؜؜ᴄ؜؜؜؜؜؜о؜؜؜؜؜؜υ؜؜؜؜؜؜n؜؜؜؜؜؜ṭ؜؜؜ ؜؜؜Р؜؜؜؜؜؜а؜؜؜؜؜؜у؜؜؜؜؜؜а؜؜؜؜؜؜Ь؜؜؜؜؜؜l؜؜؜؜؜؜е؜؜؜ ؜؜؜ᴠ؜؜؜؜؜؜і؜؜؜؜؜؜а؜؜؜ ؜؜؜А؜؜؜؜؜؜ԁ؜؜؜؜؜؜о؜؜؜؜؜؜Ь؜؜؜؜؜؜е؜؜؜ ؜؜؜А؜؜؜؜؜؜ᴄ؜؜؜؜؜؜г؜؜؜؜؜؜о؜؜؜؜؜؜Ь؜؜؜؜؜؜а؜؜؜؜؜؜ṭ؜؜؜ ؜؜؜Ѕ؜؜؜؜؜؜і؜؜؜؜؜؜ɡ؜؜؜؜؜؜n؜؜؜” <[email protected]>

“ꓠꓰꓔꓝꓡꓲꓫ” <[email protected]>

“Am󠄂󠄂az󠄂on󠄂󠄂 Pr󠄂i󠄂me” <[email protected]>

؜؜؜؜؜؜М؜؜؜؜؜؜і؜؜؜؜؜؜ᴄ؜؜؜؜؜؜г؜؜؜؜؜؜о؜؜؜؜؜؜ѕ؜؜؜؜؜؜о؜؜؜؜؜؜f؜؜؜؜؜؜ṭ؜؜؜ ؜؜؜ᴠ؜؜؜؜؜؜і؜؜؜؜؜؜а؜؜؜ ؜؜؜А؜؜؜؜؜؜ԁ؜؜؜؜؜؜о؜؜؜؜؜؜Ь؜؜؜؜؜؜е؜؜؜ ؜؜؜А؜؜؜؜؜؜ᴄ؜؜؜؜؜؜г؜؜؜؜؜؜о؜؜؜؜؜؜Ь؜؜؜؜؜؜а؜؜؜؜؜؜ṭ؜؜؜ ؜؜؜Р؜؜؜؜؜؜Ḍ؜؜؜؜؜؜F؜؜؜؜؜؜ <[email protected]>

” 𝙰𝚌𝚌𝚘𝚞𝚗𝚝 𝙽𝚘𝚝𝚒𝚏𝚒𝚌𝚊𝚝𝚒𝚘𝚗𝚜” <[email protected]>

Асtiоn Nееdеd: [email protected]

“W‌ᴇ‌ʟ‌ʟ‌s‌F‌ᴀ‌ʀ‌ɢ‌ᴏ‌” <[email protected]>

“ᎪpрІe” <[email protected]>

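These all look like Latin characters, but they are actually built from look-alike characters in other scripts and Unicode blocks, including Cyrillic and Armenian, in some cases with invisible characters inserted between the letters. As a result, simple string matching against databases of known brands fails, and the text can even confuse some language models.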
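One common countermeasure is to reduce a display name to a normalized “skeleton” before comparing it with known brands. The sketch below is a minimal illustration using Python's `unicodedata` module, not a description of any vendor's method. The `CONFUSABLES` table is a tiny hand-picked assumption (a production system would rely on a far larger mapping, such as the Unicode confusables data), the function name `skeleton` is illustrative, and the sample strings are reconstructions written with escapes, including assumed invisible characters.

```python
import unicodedata

# Tiny illustrative look-alike table (assumption, far from complete).
CONFUSABLES = {
    "\u0410": "A",  # CYRILLIC CAPITAL LETTER A
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u0445": "x",  # CYRILLIC SMALL LETTER HA
    "\u0456": "i",  # CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
    "\u0578": "n",  # ARMENIAN SMALL LETTER VO
}


def skeleton(text: str) -> str:
    """Fold a display name into a lowercase, Latin-only approximation."""
    # NFKD folds compatibility forms (for example mathematical monospace
    # letters) and splits accented letters into base letter plus mark.
    decomposed = unicodedata.normalize("NFKD", text)
    kept = []
    for ch in decomposed:
        # Drop format characters (Cf) and nonspacing marks (Mn), which
        # covers many invisible characters and combining accents.
        if unicodedata.category(ch) in ("Cf", "Mn"):
            continue
        kept.append(CONFUSABLES.get(ch, ch))
    return "".join(kept).lower()


samples = [
    "\U0001D670\U0001D68C\U0001D68C\U0001D698\U0001D69E\U0001D697\U0001D69D",
    "Am\U000E0102az\U000E0102on",        # variation selectors between letters
    "\u0430\u061c\u0440\u0440l\u0435",   # Cyrillic letters plus U+061C
    "\u0578\u0435tfl\u2170\u0445",       # Armenian, Cyrillic, Roman numeral
]
for sample in samples:
    print(repr(skeleton(sample)))
```

Comparing skeletons rather than raw strings lets a brand database match these variants, at the cost of occasional false positives on legitimate non-Latin names.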

Conclusion

While URL obfuscation techniques are not new, the simultaneous use of multiple obfuscation strategies illustrates the evolving tactics of phishing attacks. In the “Netflix” example above, the attacker employed both URL component misuse and homograph attacks. Combining techniques in this way increases the attacker’s chances of evading detection and deceiving users, and it reflects a broader trend toward multi-layered attacks.
