Organizations subject to PCI DSS compliance validation spend significant amounts of time, effort, and money to maintain and validate their compliance. So, the idea that a common graphics card can threaten compliance or lead to a compromise may at first seem ridiculous. This article will show you why it is not as ridiculous as it seems, and what you can do about it.
This article explores the use of hashing in the context of PCI using examples and results from our experiments to guide you. Each section is explained in detail with highlighted takeaways. And as always, we provide supporting references.
Takeaway: The growth of GPU power and crypto-currency mining technology has made simple hashes of PAN insecure, even when none of the digits are known. The decade-old anti-correlation guidance provides no protection, and the use of simple hashes should be deprecated. Treating hashes of PAN as out-of-scope for PCI DSS is not effective and puts your organization at risk. If you are using credit card hashing in your business or applications, you should review this article in detail and conduct a risk analysis immediately.
A hash is just a large number that stands in as a signature for other, often sensitive, data. Hashes are calculated by a complex “one-way” function that takes an input of any length (e.g. a credit card number, a password, a program file, or a document) and calculates a number called a signature. The mathematics is closely related to encryption.
A few properties of hashes are worth knowing: the signature is always the same length; the same input always produces exactly the same signature; tiny changes in the input produce very different signatures; the chance of two inputs having the same signature (a collision) is incredibly small; and, unlike encryption, a hash can’t be undone (that’s the one-way part).
There is a large variety of hashing algorithms in common use today (the popular cracking suite, hashcat, easily supports over 100 variations). Hashes, unlike encryption algorithms and protocols, have not been updated by industry. Quite a number of these, such as MD5 and SHA1, are no longer safe, yet they are still used in commercial products.
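The properties above can be checked in a few lines. This is a minimal sketch using Python's standard `hashlib` with SHA-256; the helper name `sha256_hex` and the choice of a well-known test card number as input are assumptions made for illustration.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 signature of a string as 64 hex characters."""
    return hashlib.sha256(text.encode("ascii")).hexdigest()


short_sig = sha256_hex("4111111111111111")
long_sig = sha256_hex("4111111111111111" * 1000)  # 16,000 characters of input
near_sig = sha256_hex("4111111111111112")         # only the last digit changed

# Fixed length: both signatures are 64 hex digits (256 bits).
print(len(short_sig), len(long_sig))

# Deterministic: hashing the same input again gives the same signature.
print(short_sig == sha256_hex("4111111111111111"))

# Sensitivity: count the hex positions that differ after a one-digit change.
differing = sum(a != b for a, b in zip(short_sig, near_sig))
print(differing, "of 64 hex digits differ")
```

Nothing in the sketch reverses a signature; that gap is exactly what guessing attacks work around.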
Hashes can be used to detect changes (e.g. file or message integrity), to validate that a user knows a password (without having to store it or send it over a network), and to “render cardholder data unreadable anywhere it is stored” under PCI DSS.
Takeaway: While you can’t undo a hash, you can often achieve the same thing by guessing the plain-text inputs. This works when there are not too many possible inputs and the attacker’s computers are fast. Brute force guessing, or "cracking", is commonly used to recover passwords. We will show you that the space of possible credit card numbers is small enough for this to work.
Beginning with DSS v1.0 in 2004, requirement 3.4 introduced the concept of rendering cardholder data “unreadable” using: one-way hashes, truncation, index tokens, or strong cryptography.
Many organizations seized upon this to simplify their compliance. The idea was to remove the data from the scrutiny of PCI DSS. People reasoned that because hashes were one-way, the process would make the data safe. The resulting data sets could then be de-scoped and exported, and would no longer need the strong protections of PCI DSS.
Most people can’t tell if a hashing implementation is secure or not. Security, especially hashing security, isn’t immediately obvious and can be tricky to get right. As a result, many applications of card hashing are flawed to this day.
Once companies began to adopt hashing as a strategy to de-scope data, the need for additional guidance started to emerge.
In 2010 PCI DSS v2.0 added a clarifying note to requirement 3.4:
It is a relatively trivial effort for a malicious individual to reconstruct original PAN data if they have access to both the truncated and hashed version of a PAN. Where hashed and truncated versions of the same PAN are present in an entity’s environment, additional controls should be in place to ensure that the hashed and truncated versions cannot be correlated to reconstruct the original PAN.
And in 2015 PCI DSS v3.1 added an additional sub-requirement to ensure the note was not overlooked:
3.4.e If hashed and truncated versions of the same PAN are present in the environment, examine implemented controls to verify that the hashed and truncated versions cannot be correlated to reconstruct the original PAN.
In 2009, FAQ#1089 tried to address the intent of hashing. While the guidance is dated and could use an update, it includes the following (italics added):
PCI DSS Requirement 3.4 also states that the hash must be strong and one-way. This implies that the algorithm must use strong cryptography (e.g. collisions would not occur frequently) and the hash cannot be recovered or easily determined during an attack. It is also a recommended practice, but not specified requirement, that a salt be included.
Our demonstration (below) clearly shows that PAN can be easily recovered from simple hashes (using minimal or no correlation with truncated PAN), and that simple hashes therefore no longer meet the intent of the PCI DSS.
The future of hashing in PCI DSS is unclear:
Takeaway: The current guidance on hashing has remained unchanged for almost five years. PCI requirements and guidance continue to evolve along with threats and risks. New guidance is possible at any time, even if new requirements must wait for updates.
Hash cracking doesn't break the cryptography or reverse the "one-way" hash. Cracking commonly uses powerful computer components called graphics processing units (GPUs) to generate and hash long lists of "guesses", then correlate these with real hashed data. The goal is to recover the original plaintext input values by brute force. When this succeeds, the flaw is not in the hash algorithm but in the limited length and complexity of the message.
We conducted an experiment to demonstrate how easy it is to "crack" and "correlate" hashed PAN using the industry standard tool hashcat.
First, we took a pair of well-known test card numbers and calculated their hashes.
PAN | SHA2-256 |
---|---|
4111111111111111 | 9bbef19476623ca56c17da75fd57734dbf82530686043a6e491c6d71befe8f6e |
5454545454545454 | 3cc8217a6aad545082e07e563edeec444ce961a2468fa1a5eddf238969095735 |
We chose the cards so that the first would be found and the second would not. The miss forces hashcat to search the full range of guesses.
Next, we ran the following "hashcat" commands for the brute force tests on several GPUs:
hashcat64.exe --potfile-disable -m1400 -a3 -D2 -w3 HASHFILE 411111?d?d?d?d?d?d1111
hashcat64.exe --potfile-disable -m1400 -a3 -D2 -w3 HASHFILE ?d?d?d?d?d?d?d?d?d?d?d?d1111
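For readers unfamiliar with hashcat masks: `-m1400` selects SHA2-256, `-a3` selects the mask (brute force) attack, and each `?d` stands for one unknown digit. The following Python sketch shows, very roughly, what the first mask enumerates. It is an illustration only, not how hashcat works internally, and the function name `crack_masked_pan` is hypothetical. It runs on the CPU and is far slower than a GPU, but a million guesses still take only about a second or two.

```python
import hashlib
from itertools import product


def crack_masked_pan(target_hex: str, prefix: str, suffix: str, unknown: int):
    """Try every digit string of length `unknown` between prefix and suffix.

    Returns the matching PAN, or None once the whole keyspace is exhausted.
    """
    for digits in product("0123456789", repeat=unknown):
        candidate = prefix + "".join(digits) + suffix
        digest = hashlib.sha256(candidate.encode("ascii")).hexdigest()
        if digest == target_hex:
            return candidate
    return None


# Hash the first test card, then recover it knowing only the first six
# and last four digits (the equivalent of 411111?d?d?d?d?d?d1111).
target = hashlib.sha256(b"4111111111111111").hexdigest()
print(crack_masked_pan(target, "411111", "1111", 6))  # prints 4111111111111111

# The second test card does not fit the mask, so the search runs all
# 10^6 guesses and returns None.
miss = hashlib.sha256(b"5454545454545454").hexdigest()
print(crack_masked_pan(miss, "411111", "1111", 6))  # prints None
```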
The table below shows how fast a single modern GPU (a 2080Ti - see banner photo) can crack the hash and recover the card.
# | Truncation | # Guesses | Hash Rate | Time |
---|---|---|---|---|
1. | 411111xxxxxx1111 | 10^6 | 176.8 MH/s | 0s (Milliseconds) |
2. | 411111xxxxxxxxxx | 10^10 | 245.9 MH/s | 3s (1.8s) |
3. | xxxxxxxxxxxx1111 | 10^12 | 5469.1 MH/s | 3m 3s |
4. | xxxxxxxxxxxxxxxx | 10^16 | 5469.1 to 6557.2 MH/s | 17 to 21 d (estimated) |
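The times for the full searches in rows 3 and 4 follow directly from dividing the number of guesses by the hash rate. The sketch below reproduces that arithmetic using the rates from the table; it assumes the rate stays constant for the whole run, and the function name `seconds_to_exhaust` is made up for this example.

```python
def seconds_to_exhaust(unknown_digits: int, hashes_per_second: float) -> float:
    """Worst-case time to try every combination of the unknown digits."""
    return 10 ** unknown_digits / hashes_per_second


SECONDS_PER_DAY = 86_400

# Row 3: 12 unknown digits at 5469.1 MH/s.
row3 = seconds_to_exhaust(12, 5469.1e6)
print(f"row 3: {row3:.0f} s")  # about 183 s, i.e. 3m 3s

# Row 4: 16 unknown digits at the slow and fast ends of the measured range.
row4_slow = seconds_to_exhaust(16, 5469.1e6) / SECONDS_PER_DAY
row4_fast = seconds_to_exhaust(16, 6557.2e6) / SECONDS_PER_DAY
print(f"row 4: {row4_fast:.1f} to {row4_slow:.1f} days")  # about 17.7 to 21.2 days
```

Each known digit divides the work by ten, which is why knowing the last four digits turns a multi-week search into a three-minute one.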
The real power of hash cracking is not that you can recover one PAN per search, but that a single search recovers every PAN in the target list in the same amount of time.
To demonstrate this, we re-ran test #2 against 1000 hashes: the two above, plus 998 randomly generated PANs starting with "411111".
hashcat64.exe --potfile-disable -m1400 -a3 -D2 -w3 HASHFILE 411111?d?d?d?d?d?d?d?d?d?d
The 2080Ti didn't even get close to full speed and still recovered the 999 PANs in 43 seconds.
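The reason the list size barely matters is that each guess is hashed once and then checked against the entire set of target hashes in a single lookup. Below is a scaled-down sketch of the idea. To keep it fast on a CPU it assumes only five unknown digits (10^5 guesses) rather than the ten used in the test above, and the names `crack_all` and `sha256_hex`, along with the fixed random seed, are illustrative assumptions.

```python
import hashlib
import random
from itertools import product


def sha256_hex(pan: str) -> str:
    return hashlib.sha256(pan.encode("ascii")).hexdigest()


def crack_all(targets: set, prefix: str, suffix: str, unknown: int) -> dict:
    """Enumerate the keyspace once and recover every target hash found in it."""
    recovered = {}
    for digits in product("0123456789", repeat=unknown):
        candidate = prefix + "".join(digits) + suffix
        digest = sha256_hex(candidate)
        if digest in targets:  # one set lookup, regardless of list size
            recovered[digest] = candidate
    return recovered


# Build 999 distinct PANs that fit the reduced mask, plus one that does not.
rng = random.Random(2021)  # fixed seed so the run is repeatable
pans = ["411111" + f"{n:05d}" + "11111" for n in rng.sample(range(10**5), 999)]
targets = {sha256_hex(p) for p in pans}
targets.add(sha256_hex("5454545454545454"))  # outside the mask, never found

recovered = crack_all(targets, "411111", "11111", 5)
print(len(recovered), "of", len(targets), "hashes recovered")  # 999 of 1000
```

The work done is the same 10^5 hash computations whether the target set holds one hash or a thousand.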
Notes:
Takeaway: Given all of this, it's feasible to brute force every possible 16-digit PAN in one day with little more than a good gaming PC and some extra work. Selective cracking is not only feasible but trivial.
Given what you now know, imagine the following:
Hashing of PAN is not the silver bullet that many people thought would let them escape PCI DSS scope and requirements. Most of the use cases for it are either unsafe or require as much effort (or more) to do securely as encryption does. Tokens and index-pads are often better suited to applications, especially where long-term storage is a requirement.
The following fail to meet the intent of PCI DSS:
These can still meet the intent of PCI DSS but can be more burdensome to implement correctly and may be more restricted than other options.
These also work but are equivalent to other methods of protecting card data.
Migrating away from hashes of PAN is a potentially large project. The first step is to conduct your risk analysis. This should weigh how secure your current solution actually is against whether you think it is (or will remain) compliant. You may need assistance from your trusted advisor.
Regardless of your organization's use cases, the planning phases should be similar:
Many solutions are possible, and each will have unique challenges. This phase will be highly dependent on your business's use cases and requirements. For example, a marketing analytics database will potentially be quite different from a distributed hot-card list or an in/out token used to connect pre-authorizations and completions. Again, depending on the solution and your objectives, you may require assistance from your trusted advisors and QSAs. Some possible approaches:
We should caution that there may be open compliance and security questions:
PCI:
Other:
Original Publication: 2021-05-20
Updated PCI FAQ & Learn More links: 2023-06-16