
PassGAN: A Deep Learning Approach to Password Guessing

Password guessing tools such as HashCat and John the Ripper enable users to check millions, even billions, of candidate passwords per second against password hashes. When a password is “hashed”, it is turned into a scrambled representation of itself: the hash value is derived from a combination of the given password and, typically, a salt or key known to the site or service. Password guessing tools make it easy to identify weak passwords when they are stored in this hashed form. The effectiveness of the guessing software relies on its ability to test large numbers of highly likely passwords against each password hash. “Instead of exhaustively trying all possible character combinations, password guessing tools use words from dictionaries and previous password leaks as candidate passwords.” Newer, flashier password guessing tools use Markov models in conjunction with defined heuristics for password transformations (combining multiple words, mixing letter case, and leet speak, i.e., substituting numbers for letters in a word, as in il0v3you) to generate a large number of new ‘highly likely’ passwords, as sketched below.
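As a concrete (and deliberately simplistic) Python sketch of the idea, here is what a few such transformation rules might look like. The rules themselves are illustrative assumptions, not HashCat's or John the Ripper's actual rule engines:

```python
# Toy rule-based candidate generation: take dictionary words and apply
# common mangling rules such as leet substitutions and case changes.
LEET = str.maketrans({"a": "4", "e": "3", "i": "1", "o": "0", "s": "5"})

def mangle(word):
    """Yield 'highly likely' candidate passwords from one dictionary word."""
    yield word
    yield word.capitalize()
    yield word.translate(LEET)         # leet speak: "iloveyou" -> "1l0v3y0u"
    for suffix in ("1", "123", "!"):   # commonly appended suffixes
        yield word + suffix

print(list(mangle("iloveyou")))
```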

This method is relatively successful; however, the heuristics are ad hoc, based on intuitions about how users choose passwords rather than on actual data. It also has limited scalability: developing new rules and heuristics takes time and requires substantial domain knowledge.

A word about GANs…

Generative Adversarial Networks (GANs) belong to the family of deep learning models known as generative models, which simply means they are able to generate new content. The GAN architecture actually consists of two sub-models: a generator model and a discriminator model.

Generator — Model used to generate new, likely samples from the problem domain.

Discriminator — Model used to classify samples as real (from domain) or fake (from the generator).

The Generator model takes a fixed-length random vector as input and uses it to generate a sample in the domain. This vector is drawn randomly from a Gaussian distribution and is used as the starting point for the generative process. After training, points in this n-dimensional vector space will correspond to points in the problem domain, basically representing a compressed version of the data distribution. Once trained, the generator can be used to produce new samples.
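For illustration, here is a minimal PyTorch sketch of a generator: a toy fully-connected model, not PassGAN's architecture, with an arbitrary latent dimension and layer sizes chosen purely for the example.

```python
import torch
import torch.nn as nn

LATENT_DIM = 128   # length of the random input vector (an assumed value)

# A deliberately tiny generator: maps a latent vector to a flat
# 28*28 "sample"; real generators are far deeper and domain-specific.
generator = nn.Sequential(
    nn.Linear(LATENT_DIM, 256),
    nn.ReLU(),
    nn.Linear(256, 28 * 28),
    nn.Tanh(),                        # squash outputs into [-1, 1]
)

z = torch.randn(16, LATENT_DIM)       # 16 points drawn from a Gaussian
fake_samples = generator(z)           # 16 generated samples
print(fake_samples.shape)             # torch.Size([16, 784])
```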

The Discriminator takes a sample (real or generated) from the domain and outputs a binary class prediction of real or fake (generated). After training the model, we can actually discard the Discriminator because at this point, we are purely interested in the Generator, as it outputs new likely samples.
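A matching toy discriminator is just a small classifier that maps a sample to a single real-versus-fake score, mirroring the generator sketch above:

```python
import torch
import torch.nn as nn

# Toy discriminator: flat 28*28 sample in, one logit out. A sigmoid
# over the logit gives the probability that the sample is real.
discriminator = nn.Sequential(
    nn.Linear(28 * 28, 256),
    nn.LeakyReLU(0.2),
    nn.Linear(256, 1),                # single logit: real vs. fake
)

sample = torch.randn(16, 28 * 28)     # stand-in for a batch of samples
probs = torch.sigmoid(discriminator(sample))  # probability each is real
```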

Generative modeling is an unsupervised learning problem, although a clever aspect of the GAN architecture is that it frames the training of the model as a supervised problem. In a GAN system, both models are trained together. The generator produces a batch of samples, and the discriminator takes these in, along with real samples, and attempts to classify all of them as either real or fake. After each round, both models are updated: the discriminator gets better at telling real data from generated data, but more importantly, the generator is updated based on how well its fake samples fooled the discriminator into classifying them as real.
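Putting the two together, one training round looks roughly like this. This is a minimal sketch reusing the toy generator, discriminator, and LATENT_DIM from the snippets above, with the standard binary cross-entropy GAN losses (PassGAN itself uses a Wasserstein loss instead):

```python
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)

def train_step(real_batch):
    batch_size = real_batch.size(0)
    real_labels = torch.ones(batch_size, 1)
    fake_labels = torch.zeros(batch_size, 1)

    # Discriminator update: learn to separate real from generated samples.
    z = torch.randn(batch_size, LATENT_DIM)
    fake_batch = generator(z).detach()     # freeze the generator this step
    d_loss = (bce(discriminator(real_batch), real_labels)
              + bce(discriminator(fake_batch), fake_labels))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator update: reward fakes that the discriminator calls real.
    z = torch.randn(batch_size, LATENT_DIM)
    g_loss = bce(discriminator(generator(z)), real_labels)
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()
```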

You can see that in a way the two models are competing against each other (you could consider them adversarial) in a zero-sum game. When the Discriminator correctly classifies samples as real or fake, it is rewarded; in this case, the reward is that its parameters do not need to change, while the Generator is penalized with large updates to its parameters. When the Generator fools the Discriminator, the reverse happens: the Generator is rewarded with no change to its parameters, and the Discriminator's parameters are updated.

How does this work in PassGAN?

PassGAN applies a Generative Adversarial Network to leaked password sets in order to learn the distribution of real passwords, then uses that knowledge to generate fresh password guesses. As its base, PassGAN uses the Improved Training of Wasserstein GANs (IWGAN) with the Adam optimizer; this is essentially a tuned, more stable variant of the plain GAN. Each sub-model (Generator, Discriminator) is built ResNet-style (as a residual network) by stacking five residual blocks. Each residual block passes its input through two 1-D convolutional layers, scales the convolutional output by 0.3, and adds the result back to the block's input via an identity (skip) connection to produce the block's output. The architecture of the two sub-models is sketched below…
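From that description, a residual block can be approximated with the following PyTorch sketch. PassGAN's reference implementation is in TensorFlow, and the channel width and kernel size here are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    """Identity skip connection plus two 1-D convolutions, with the
    convolutional output scaled by 0.3 before being added back."""
    def __init__(self, channels=128, kernel_size=5):
        super().__init__()
        pad = kernel_size // 2
        self.conv1 = nn.Conv1d(channels, channels, kernel_size, padding=pad)
        self.conv2 = nn.Conv1d(channels, channels, kernel_size, padding=pad)

    def forward(self, x):
        h = self.conv1(F.relu(x))
        h = self.conv2(F.relu(h))
        return x + 0.3 * h            # identity input + scaled conv output

# Each sub-model (Generator, Discriminator) stacks five of these blocks.
blocks = nn.Sequential(*[ResidualBlock() for _ in range(5)])
x = torch.randn(1, 128, 10)           # (batch, channels, password length)
out = blocks(x)                       # same shape as the input
```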

After being tested on two different datasets (a distinct subset of the RockYou dataset and a set of leaked LinkedIn passwords), PassGAN proved competitive with other state-of-the-art rules-based password guessing tools, and when paired with HashCat, matched 51%-73% more passwords than HashCat alone. This shows PassGAN's ability to capture a considerable number of password properties that traditional password guessing tools simply cannot. While PassGAN generated correct matches for roughly 34% of the passwords in the testing data (the RockYou and LinkedIn test sets), what's even more impressive is that the majority of the model's guesses “looked like” user-generated passwords. The researchers conducting this experiment theorize that these could potentially be real passwords for users not found in their datasets.

In rules-based password generation models, the number of passwords the model can generate is dictated by the number of rules defined and the size of the password dataset used to instantiate the model. PassGAN's success is largely due to the fact that it does not share this limitation: because it is a generative neural network, PassGAN can output a practically inexhaustible number of model-generated passwords. Rules-based models produce only a fraction of that output, although their early predictions tend to outperform PassGAN's first few guesses. PassGAN shines because it can generate a far more comprehensive list of candidates than rules-based tools. The researchers argue that this trade-off (expressiveness, generality, and autonomous learning capacity on one side; output size on the other) can be resolved by combining multiple tools such as HashCat and PassGAN. In practice, that means running HashCat first to see if you get lucky, and then, once its rule-generated guesses are exhausted, switching to a more comprehensive model like PassGAN. Remember, PassGAN can output an effectively endless stream of password predictions; for scale, an 8 TB hard drive can hold roughly 10¹² passwords.
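That 10¹² figure is simple back-of-the-envelope arithmetic, assuming an average of roughly eight bytes per stored password (an assumed average length, one password per line):

```python
drive_bytes = 8 * 10**12                   # an 8 TB drive, in bytes
bytes_per_password = 8                     # ~7 characters plus a newline (an assumption)
print(drive_bytes // bytes_per_password)   # 1000000000000, i.e. ~10**12
```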
