Introduction

This notebook presents a method to train an embedding layer using the Word2Vec Skip-Gram method on the Wikipedia text8 dataset.

We are going to work with the text8 dataset, which is 100MB of cleaned English Wikipedia text. The name comes from the size: $\text{100MB} = 10^8$ bytes, hence text8.

Dataset:

References:

Skip-Gram Model Architecture

Imports

In [1]:
import time
import math
import collections
import numpy as np
import matplotlib.pyplot as plt
In [2]:
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
In [3]:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

Read Data

Dataset location

In [4]:
dataset_location = '/home/marcin/Datasets/wiki-text8/text8'
In [5]:
with open(dataset_location, 'r') as f:
    text = f.read()
print(text[:500])
 anarchism originated as a term of abuse first used against early working class radicals including the diggers of the english revolution and the sans culottes of the french revolution whilst the term is still used in a pejorative way to describe any act that used violent means to destroy the organization of society it has also been taken up as a positive label by self defined anarchists the word anarchism is derived from the greek without archons ruler chief king anarchism as a political philoso

The dataset is cleaned: it contains only lowercase letters and spaces.

In [6]:
print(sorted(set(text)))
[' ', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
In [7]:
words_raw = text.split()
print(words_raw[:20])
['anarchism', 'originated', 'as', 'a', 'term', 'of', 'abuse', 'first', 'used', 'against', 'early', 'working', 'class', 'radicals', 'including', 'the', 'diggers', 'of', 'the', 'english']
In [8]:
print('Total words:', len(words_raw))
print('Unique words:', len(set(words_raw)))
Total words: 17005207
Unique words: 253854

Preprocess the Data

First we will look at the word-count distribution.

In [9]:
words_counter = collections.Counter(words_raw)
print('WORD : COUNT')
for w in list(words_counter)[:10]:
    print(w, ':', words_counter[w])
WORD : COUNT
anarchism : 303
originated : 572
as : 131815
a : 325873
term : 7219
of : 593677
abuse : 563
first : 28810
used : 22737
against : 8432

Let's plot word counts on linear and logarithmic scales.

In [10]:
sorted_all = np.array(sorted(list(words_counter.values()), reverse=True))
fig, [ax1, ax2] = plt.subplots(1, 2, figsize=[16,6])
ax1.plot(sorted_all); ax1.set_title('Word Counts (linear scale)')
ax2.plot(sorted_all); ax2.set_title('Word Counts (log scale)')
ax2.set_yscale('log')

This is an extremely skewed distribution. The most common word appears over 1 million times, while over 100k words appear only once.
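The two ends of such a distribution can be summarised with a few lines of code. The sketch below is illustrative only: `count_summary` and the toy sentence are assumptions made for this example, not part of the notebook, and the same function could be called on `words_raw`.

```python
import collections

def count_summary(words):
    """Return (most common word, its count, number of words seen exactly once)."""
    counter = collections.Counter(words)
    top_word, top_count = counter.most_common(1)[0]
    singletons = sum(1 for c in counter.values() if c == 1)
    return top_word, top_count, singletons

# Toy corpus (hypothetical), standing in for words_raw
toy_words = "the cat and the dog and the bird".split()
summary = count_summary(toy_words)
print(summary)  # ('the', 3, 3)
```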

Common Words

Let's have a look at the most common words.

In [11]:
words_counter.most_common()[:10]
Out[11]:
[('the', 1061396),
 ('of', 593677),
 ('and', 416629),
 ('one', 411764),
 ('in', 372201),
 ('a', 325873),
 ('to', 316376),
 ('zero', 264975),
 ('nine', 250430),
 ('two', 192644)]

We will deal with these very frequent words later on using subsampling, as described by Mikolov.

Rare Words

Let's look at some of the uncommon words.

In [12]:
words_counter.most_common()[-10:]
Out[12]:
[('kajn', 1),
 ('gorbacheva', 1),
 ('mikhailgorbachev', 1),
 ('englander', 1),
 ('workmans', 1),
 ('erniest', 1),
 ('metzada', 1),
 ('metzuda', 1),
 ('fretensis', 1),
 ('exortation', 1)]

Words like 'metzada' or 'metzuda' are so rare (this is the first time I have seen them) that we are not concerned about them when building our NLP system. We will subsequently drop all words that occur 5 or fewer times.

Create dictionaries

Tokenize words, but keep only ones that occur six or more times

In [15]:
i2w = {i : w for i, (w, c) in enumerate(words_counter.most_common()) if c > 5}
w2i = {w : i for i, w in i2w.items()}
print('Number of words after filter:', len(i2w))
Number of words after filter: 63641

Confirm that both dictionaries agree

In [16]:
for i in range(10):
    word = i2w[i]
    print(i, ':', word, ':', w2i[word])
0 : the : 0
1 : of : 1
2 : and : 2
3 : one : 3
4 : in : 4
5 : a : 5
6 : to : 6
7 : zero : 7
8 : nine : 8
9 : two : 9
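One property worth noting is that the ids stay contiguous (0, 1, 2, ...) even though the filter sits inside the comprehension. This holds because `most_common()` sorts by descending count, so the words that pass the filter form a prefix of the list. A small sketch on a toy counter (the toy data and the `toy_` names are assumptions for illustration):

```python
import collections

# Toy counts: a=3, b=2, c=4, d=1
toy_counter = collections.Counter('a a a b b c c c c d'.split())
min_count = 2  # hypothetical cutoff for the toy data

# Same construction as in the notebook: enumerate first, then filter
toy_i2w = {i: w for i, (w, c) in enumerate(toy_counter.most_common())
           if c >= min_count}
toy_w2i = {w: i for i, w in toy_i2w.items()}

print(toy_i2w)  # {0: 'c', 1: 'a', 2: 'b'}
```

Contiguous ids matter later, because an embedding table with `len(w2i)` rows can only be indexed by ids in the range `0 .. len(w2i) - 1`.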

Tokenize

In [17]:
words_tok = [w2i[w] for w in words_raw if w in w2i]
print('Number of words after removing uncommon:', len(words_tok))
Number of words after removing uncommon: 16680599
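It can help to put the printed numbers side by side. The arithmetic below uses only the four counts printed in this notebook (total words, unique words, vocabulary size after the filter, tokens after the filter); the variable names are chosen for this sketch.

```python
# Counts copied from the notebook output above
total_words = 17005207
unique_words = 253854
kept_vocab = 63641
kept_tokens = 16680599

vocab_kept_pct = 100 * kept_vocab / unique_words
tokens_dropped_pct = 100 * (total_words - kept_tokens) / total_words

print(f'Vocabulary kept: {vocab_kept_pct:.1f}%')
print(f'Tokens dropped:  {tokens_dropped_pct:.1f}%')
```

Roughly a quarter of the vocabulary survives the filter, yet only about 2% of the text is removed, which is why dropping rare words is a cheap way to shrink the model.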

This is our text, with uncommon words removed and converted to tokens:

In [18]:
print(words_tok[:100])
[5233, 3080, 11, 5, 194, 1, 3133, 45, 58, 155, 127, 741, 476, 10571, 133, 0, 27349, 1, 0, 102, 854, 2, 0, 15067, 58112, 1, 0, 150, 854, 3580, 0, 194, 10, 190, 58, 4, 5, 10712, 214, 6, 1324, 104, 454, 19, 58, 2731, 362, 6, 3672, 0, 708, 1, 371, 26, 40, 36, 53, 539, 97, 11, 5, 1423, 2757, 18, 567, 686, 7088, 0, 247, 5233, 10, 1052, 27, 0, 320, 248, 44611, 2877, 792, 186, 5233, 11, 5, 200, 602, 10, 0, 1134, 19, 2621, 25, 8983, 2, 279, 31, 4147, 141, 59, 25, 6437]

Subsampling

Equation from the paper, where $P(w_i)$ is the probability of dropping word $w_i$, $f(w_i)$ is its relative frequency in the corpus and $t$ is a threshold parameter:

$$ P(w_i) = 1 - \sqrt{\frac{t}{f(w_i)}} $$

Calculate probabilities

In [19]:
threshold = 1e-5

tokens_counter = collections.Counter(words_tok)    # token : num_occurrences

prob_drop = {}
for tok, count in tokens_counter.items():
    word_freq = count / len(words_tok)
    prob_drop[tok] = 1 - math.sqrt(threshold / word_freq)

Print probabilities for some words. Note that for frequent words like 'the' the drop probability is quite high, while for uncommon words it is actually negative (meaning the word is never dropped).

In [20]:
print('word        occurrences    p_drop')
for word in ['the', 'at', 'dog', 'cat', 'extravagant', 'sustaining']:
    token = w2i[word]
    print(f'{word:11}    '
          f'{tokens_counter[token]:7}     '
          f'{prob_drop[token]: .2f}')
word        occurrences    p_drop
the            1061396      0.99
at               54576      0.94
dog                958      0.58
cat                692      0.51
extravagant         51     -0.81
sustaining          49     -0.85
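The values in this table can be reproduced by hand from the formula. The sketch below also clamps negative results to zero so that the return value is a proper probability; the clamp and the helper name `drop_probability` are additions for illustration, while the notebook itself keeps the raw negative values (which behave the same way in the comparison used later).

```python
import math

def drop_probability(count, total, threshold=1e-5):
    """P(drop) = 1 - sqrt(t / f), clamped at 0 (the clamp is an assumption)."""
    freq = count / total
    raw = 1 - math.sqrt(threshold / freq)
    return max(0.0, raw)

total = 16680599  # number of tokens after removing uncommon words
p_the = drop_probability(1061396, total)   # 'the'
p_rare = drop_probability(51, total)       # 'extravagant'

print(f'{p_the:.2f} {p_rare:.2f}')  # 0.99 0.00
```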

Drop words according to their drop probability (this could also be done on a per-batch basis)

In [21]:
words_fin = [tok for tok in words_tok if np.random.rand() > prob_drop[tok] ]
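If subsampling were moved to a per-batch basis, the same comparison could be wrapped in a small function and called on each batch of tokens. This is only a sketch: `subsample`, the toy probabilities and the use of the standard-library `random` module (instead of `np.random`) are assumptions made to keep the example self-contained.

```python
import random

def subsample(tokens, prob_drop, rng):
    """Keep a token when a uniform draw in [0, 1) exceeds its drop probability."""
    return [tok for tok in tokens if rng.random() > prob_drop[tok]]

rng = random.Random(0)
# Toy probabilities: token 0 is always dropped, token 1 is never dropped
toy_prob_drop = {0: 1.0, 1: -0.5}
batch = [0, 1, 0, 1, 1]
kept = subsample(batch, toy_prob_drop, rng)
print(kept)  # [1, 1, 1]
```

A negative drop probability needs no special handling here, because a draw from [0, 1) is always greater than a negative number.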

Generate Training Dataset

For each word, sample the surrounding context of $R$ words on each side. To reflect the fact that more distant words are less relevant, we pick $R$ as a random integer in the range $[1, \text{max\_window}]$. This could also be done on a per-batch basis.

In [22]:
max_window = 5

data_x, data_y = [], []

for i, tok in enumerate(words_fin):
    R = np.random.randint(1, max_window+1)
    start = max(i - R, 0)
    stop = i + R
    targets = words_fin[start:i] + words_fin[i+1:stop+1]
    
    data_x.extend([tok] * len(targets))
    data_y.extend(targets)    

Show sample data

In [23]:
print('Original:', words_fin[:10])
print('Inputs: ', data_x[:10])
print('Targets:', data_y[:10])
Original: [5233, 3080, 741, 10571, 133, 27349, 2, 15067, 58112, 854]
Inputs:  [5233, 5233, 5233, 5233, 3080, 3080, 3080, 741, 741, 10571]
Targets: [3080, 741, 10571, 133, 5233, 741, 10571, 3080, 10571, 5233]
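The effect of the random radius can be made explicit. Since $R$ is drawn uniformly from 1 to max_window, a neighbour at distance $d$ is included only when $R \ge d$, which happens with probability $(\text{max\_window} - d + 1) / \text{max\_window}$. A short sketch (the function name is chosen for this example, and boundary effects at the start and end of the text are ignored):

```python
def inclusion_probability(distance, max_window):
    """Chance that a neighbour at `distance` falls inside a random radius R."""
    if distance < 1 or distance > max_window:
        return 0.0
    return (max_window - distance + 1) / max_window

probs = [inclusion_probability(d, 5) for d in range(1, 6)]
print(probs)  # [1.0, 0.8, 0.6, 0.4, 0.2]
```

Immediate neighbours are therefore always used as targets, while words five positions away are used only one time in five.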

Convert to tensors

In [24]:
train_x = torch.tensor(data_x).to(device)
train_y = torch.tensor(data_y).to(device)

Skip-Gram Model

In [31]:
class SkipGram(nn.Module):
    def __init__(self, n_vocab, n_embed):
        super(SkipGram, self).__init__()
        self.embed = nn.Embedding(num_embeddings=n_vocab, embedding_dim=n_embed)
        self.fc = nn.Linear(n_embed, n_vocab)
        
    def forward(self, x):
        x = self.embed(x)
        return self.fc(x)
In [32]:
n_vocab = len(w2i)  # size of vocabulary
n_embed = 300       # size of embedding dimension
In [33]:
model = SkipGram(n_vocab, n_embed)
model.to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.003)
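A useful sanity check for the loss values reported during training: a model that assigns equal probability to every word in the vocabulary has a cross-entropy of $\ln(\text{n\_vocab})$, so an untrained model should start near that value. The vocabulary size below is the one printed after filtering.

```python
import math

n_vocab = 63641  # vocabulary size after filtering, from the output above
uniform_loss = math.log(n_vocab)  # cross-entropy of a uniform prediction
print(f'{uniform_loss:.2f}')  # 11.06
```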
In [34]:
def get_most_similar(model, test_words, topk):
    result = {}
    with torch.no_grad():
        for word in test_words:
            tok = w2i[word]
            x = torch.tensor([tok]).to(device)
            x_embed = model.embed(x)
            cos_sim = F.cosine_similarity(x_embed, model.embed.weight)
            _, indices = cos_sim.topk(topk+1)  # +1 because self is always most similar
            similar_words = [i2w[tok.item()] for tok in indices]
            result[word] = similar_words[1:]
        
    return result
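The ranking step is easier to follow on a tiny example without tensors. The sketch below uses plain Python lists and made-up three-dimensional vectors (all names and values are hypothetical); it excludes the query word by name rather than by dropping the first result, which gives the same answer as long as no other word has an identical vector.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def most_similar(word, vectors, topk):
    """Return the `topk` words closest to `word`, excluding the word itself."""
    query = vectors[word]
    ranked = sorted(vectors, key=lambda w: cosine(query, vectors[w]), reverse=True)
    return [w for w in ranked if w != word][:topk]

# Made-up embeddings for illustration only
toy_vectors = {
    'king':  [1.0, 0.9, 0.0],
    'queen': [0.9, 1.0, 0.1],
    'dog':   [0.0, 0.1, 1.0],
}
nearest = most_similar('king', toy_vectors, topk=1)
print(nearest)  # ['queen']
```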
In [35]:
n_batch = 3072
nb_epochs = 1

trace = {'loss': []}  # per iteration

iteration = 0
for epoch in range(nb_epochs):

    time_start = time.time()

    #
    #   Train Model
    #
    model.train()
    for i in range(0, len(train_x), n_batch):

        # Pick mini-batch (over sequence dimension)
        inputs = train_x[i:i+n_batch]    # [n_batch]
        targets = train_y[i:i+n_batch]   # [n_batch]
        
        # Optimize
        optimizer.zero_grad()
        outputs = model(inputs)  # logits
        loss = criterion(outputs, targets)
        loss.backward()
        optimizer.step()

        # Record
        trace['loss'].append( loss.item() )
                
        if i % (100*n_batch) == 0:
            percent_complete = i * 100 / len(train_x)
            time_delta = time.time() - time_start
            print(f'Epoch: {epoch:3} ({percent_complete:.0f}%)     '
                f'Loss: {loss:.4f}     Time: {time_delta:.2f}s')
            
            test_words = ['king', 'rock', 'dog', 'jump', 'five', 'http']
            res_dict = get_most_similar(model, test_words, topk=5)
            for word, similar in res_dict.items():
                print(f'{word:<6}:', ' '.join(similar))
            print('----------')
            
            time_start = time.time()
            
Epoch:   0 (0%)     Loss: 11.2223     Time: 0.13s
king  : breakbeat devotions sv defends inserting
rock  : sociologist hypnotists fita contained octahedral
dog   : unclassified byng meridian midrashim laos
jump  : twice mickiewicz torturing intermarriage utmost
five  : stott bestiaries pirenne reinforcement iquique
http  : magnetospheric fluctuates mekhilta aggressors ductile
----------
Epoch:   0 (1%)     Loss: 10.9685     Time: 12.20s
king  : breakbeat devotions sv defends inserting
rock  : sociologist hypnotists fita contained octahedral
dog   : unclassified byng laos meridian dubnium
jump  : twice mickiewicz torturing utmost intermarriage
five  : stott bestiaries pirenne faction reinforcement
http  : fluctuates magnetospheric aggressors mekhilta ductile
----------
Epoch:   0 (2%)     Loss: 11.0106     Time: 12.13s
king  : breakbeat inserting sv defends crumpled
rock  : sociologist hypnotists contained fita octahedral
dog   : unclassified byng laos meridian dubnium
jump  : twice mickiewicz utmost torturing intermarriage
five  : stott bestiaries pirenne faction reinforcement
http  : fluctuates magnetospheric mekhilta aggressors barbarossa
----------
Epoch:   0 (3%)     Loss: 10.4936     Time: 12.20s
king  : breakbeat devotions sv defends inserting
rock  : sociologist hypnotists fita contained octahedral
dog   : unclassified byng laos dubnium meridian
jump  : twice mickiewicz utmost torturing intermarriage
five  : stott bestiaries faction pirenne reinforcement
http  : fluctuates magnetospheric barbarossa aggressors mekhilta
----------
Epoch:   0 (4%)     Loss: 10.7373     Time: 13.03s
king  : breakbeat sv devotions inserting defends
rock  : hypnotists sociologist fita octahedral contained
dog   : unclassified byng laos meridian dubnium
jump  : twice mickiewicz utmost torturing intermarriage
five  : stott bestiaries faction pirenne reinforcement
http  : fluctuates magnetospheric barbarossa aggressors mekhilta
----------
Epoch:   0 (6%)     Loss: 10.8086     Time: 13.15s
king  : breakbeat devotions cushions defends sv
rock  : hypnotists sociologist fita contained octahedral
dog   : unclassified byng meridian dubnium padre
jump  : twice mickiewicz utmost torturing intermarriage
five  : stott bestiaries faction affluent preservation
http  : fluctuates magnetospheric barbarossa ductile mekhilta
----------
Epoch:   0 (7%)     Loss: 10.5008     Time: 13.05s
king  : breakbeat devotions obituary sv navarro
rock  : hypnotists sociologist fita contained octahedral
dog   : unclassified byng meridian padre dubnium
jump  : twice mickiewicz utmost intermarriage torturing
five  : stott faction bestiaries affluent preservation
http  : fluctuates magnetospheric mekhilta ductile barbarossa
----------
Epoch:   0 (8%)     Loss: 10.4753     Time: 12.98s
king  : breakbeat devotions sv obituary cushions
rock  : sociologist hypnotists fita contained octahedral
dog   : unclassified byng meridian padre dubnium
jump  : twice mickiewicz utmost intermarriage torturing
five  : stott faction bestiaries affluent preservation
http  : fluctuates magnetospheric ductile mekhilta barbarossa
----------
Epoch:   0 (9%)     Loss: 10.5241     Time: 13.03s
king  : breakbeat devotions cushions obituary sv
rock  : sociologist hypnotists fita octahedral contained
dog   : unclassified byng meridian laos dubnium
jump  : twice mickiewicz utmost intermarriage torturing
five  : stott faction bestiaries frontline affluent
http  : fluctuates magnetospheric ductile mekhilta barbarossa
----------
Epoch:   0 (10%)     Loss: 10.7246     Time: 13.02s
king  : breakbeat devotions obituary cushions sv
rock  : hypnotists fita sociologist octahedral contained
dog   : unclassified byng meridian laos dubnium
jump  : twice mickiewicz utmost intermarriage torturing
five  : stott faction bestiaries affluent frontline
http  : fluctuates magnetospheric ductile barbarossa mekhilta
----------
Epoch:   0 (11%)     Loss: 10.3479     Time: 13.04s
king  : breakbeat obituary navarro devotions sv
rock  : sociologist hypnotists fita octahedral contained
dog   : unclassified byng meridian laos dubnium
jump  : twice mickiewicz utmost intermarriage torturing
five  : stott faction bestiaries hike affluent
http  : fluctuates magnetospheric ductile barbarossa mekhilta
----------
Epoch:   0 (12%)     Loss: 10.5709     Time: 13.03s
king  : breakbeat obituary navarro sv devotions
rock  : sociologist hypnotists fita octahedral contained
dog   : unclassified byng midrashim dubnium laos
jump  : twice mickiewicz utmost torturing intermarriage
five  : stott faction bestiaries hike frontline
http  : fluctuates magnetospheric ductile barbarossa aggressors
----------
Epoch:   0 (13%)     Loss: 10.6129     Time: 13.04s
king  : breakbeat obituary devotions cushions navarro
rock  : sociologist fita hypnotists octahedral contained
dog   : unclassified byng midrashim dubnium meridian
jump  : twice utmost mickiewicz intermarriage torturing
five  : stott faction bestiaries hike frontline
http  : fluctuates magnetospheric ductile barbarossa aggressors
----------
Epoch:   0 (14%)     Loss: 10.1539     Time: 13.04s
king  : breakbeat cushions obituary navarro devotions
rock  : sociologist fita octahedral hypnotists contained
dog   : unclassified byng meridian midrashim dubnium
jump  : twice mickiewicz utmost intermarriage torturing
five  : stott faction hike preservation bestiaries
http  : fluctuates magnetospheric ductile barbarossa aggressors
----------
Epoch:   0 (15%)     Loss: 10.2195     Time: 13.08s
king  : breakbeat cushions navarro obituary devotions
rock  : sociologist octahedral fita hypnotists contained
dog   : unclassified byng meridian midrashim dubnium
jump  : twice mickiewicz utmost intermarriage torturing
five  : stott faction hike bestiaries affluent
http  : fluctuates magnetospheric ductile barbarossa aggressors
----------
Epoch:   0 (17%)     Loss: 11.3025     Time: 13.05s
king  : breakbeat cushions obituary navarro devotions
rock  : octahedral sociologist fita hypnotists contained
dog   : unclassified byng meridian dubnium midrashim
jump  : twice mickiewicz utmost intermarriage torturing
five  : stott faction hike bestiaries affluent
http  : fluctuates magnetospheric ductile barbarossa aggressors
----------
Epoch:   0 (18%)     Loss: 10.6344     Time: 13.03s
king  : breakbeat cushions obituary devotions defends
rock  : octahedral sociologist fita hypnotists contained
dog   : unclassified byng meridian midrashim laos
jump  : twice mickiewicz utmost intermarriage torturing
five  : stott faction hike iquique affluent
http  : fluctuates magnetospheric ductile barbarossa aggressors
----------
Epoch:   0 (19%)     Loss: 10.4506     Time: 13.03s
king  : breakbeat cushions obituary navarro devotions
rock  : octahedral sociologist fita hypnotists contained
dog   : unclassified byng meridian midrashim dubnium
jump  : twice mickiewicz utmost intermarriage torturing
five  : stott faction hike frontline iquique
http  : fluctuates magnetospheric barbarossa ductile aggressors
----------
Epoch:   0 (20%)     Loss: 10.7135     Time: 13.10s
king  : breakbeat cushions devotions navarro obituary
rock  : octahedral fita sociologist hypnotists contained
dog   : unclassified byng meridian midrashim dubnium
jump  : twice mickiewicz utmost torturing intermarriage
five  : stott faction iquique hike affluent
http  : fluctuates magnetospheric ductile barbarossa aggressors
----------
Epoch:   0 (21%)     Loss: 10.5298     Time: 12.99s
king  : breakbeat cushions navarro defends devotions
rock  : octahedral fita sociologist hypnotists contained
dog   : unclassified byng meridian midrashim dubnium
jump  : twice mickiewicz utmost torturing intermarriage
five  : stott faction hike iquique affluent
http  : fluctuates magnetospheric ductile barbarossa aggressors
----------
Epoch:   0 (22%)     Loss: 10.3797     Time: 13.04s
king  : breakbeat cushions navarro defends devotions
rock  : octahedral fita sociologist hypnotists contained
dog   : unclassified byng meridian midrashim dubnium
jump  : twice utmost mickiewicz torturing intermarriage
five  : stott faction bestiaries affluent hike
http  : fluctuates magnetospheric ductile slayton barbarossa
----------
Epoch:   0 (23%)     Loss: 10.3056     Time: 13.06s
king  : cushions breakbeat navarro defends devotions
rock  : octahedral fita sociologist hypnotists contained
dog   : byng unclassified meridian midrashim dubnium
jump  : twice utmost mickiewicz torturing intermarriage
five  : stott faction bestiaries hike affluent
http  : fluctuates magnetospheric ductile barbarossa slayton
----------
Epoch:   0 (24%)     Loss: 9.9083     Time: 13.04s
king  : cushions breakbeat navarro devotions obituary
rock  : octahedral fita sociologist hypnotists contained
dog   : byng unclassified meridian midrashim dubnium
jump  : twice utmost mickiewicz torturing intermarriage
five  : stott faction hike bestiaries affluent
http  : fluctuates magnetospheric ductile by barbarossa
----------
Epoch:   0 (25%)     Loss: 10.3970     Time: 13.00s
king  : cushions breakbeat navarro defends devotions
rock  : octahedral sociologist fita hypnotists contained
dog   : byng unclassified meridian midrashim dubnium
jump  : twice utmost mickiewicz torturing intermarriage
five  : stott faction iquique aelia frontline
http  : fluctuates magnetospheric by aggressors mekhilta
----------
Epoch:   0 (27%)     Loss: 10.8339     Time: 12.99s
king  : cushions breakbeat navarro devotions defends
rock  : octahedral sociologist fita hypnotists contained
dog   : byng unclassified meridian midrashim dubnium
jump  : twice utmost mickiewicz torturing intermarriage
five  : stott faction iquique aelia chilean
http  : fluctuates magnetospheric by barbarossa slayton
----------
Epoch:   0 (28%)     Loss: 10.7849     Time: 12.99s
king  : cushions breakbeat navarro devotions defends
rock  : octahedral fita sociologist hypnotists contained
dog   : byng unclassified meridian midrashim dubnium
jump  : twice utmost mickiewicz torturing intermarriage
five  : stott faction iquique hike aelia
http  : fluctuates magnetospheric by barbarossa slayton
----------
Epoch:   0 (29%)     Loss: 10.6845     Time: 13.06s
king  : cushions breakbeat navarro devotions obituary
rock  : octahedral fita sociologist hypnotists contained
dog   : byng unclassified meridian midrashim dubnium
jump  : twice utmost mickiewicz torturing intermarriage
five  : stott faction hike iquique frontline
http  : fluctuates magnetospheric by slayton aggressors
----------
Epoch:   0 (30%)     Loss: 10.7814     Time: 13.01s
king  : cushions breakbeat navarro devotions defends
rock  : octahedral fita sociologist hypnotists contained
dog   : byng unclassified meridian midrashim dubnium
jump  : twice utmost mickiewicz torturing intermarriage
five  : stott faction iquique aelia affluent
http  : fluctuates magnetospheric by slayton barbarossa
----------
Epoch:   0 (31%)     Loss: 10.6334     Time: 13.02s
king  : cushions breakbeat navarro devotions obituary
rock  : octahedral fita sociologist contained hypnotists
dog   : byng unclassified meridian dubnium midrashim
jump  : twice utmost mickiewicz torturing boasts
five  : stott faction iquique hike gallen
http  : fluctuates magnetospheric barbarossa aggressors slayton
----------
Epoch:   0 (32%)     Loss: 10.0962     Time: 13.03s
king  : cushions breakbeat navarro devotions obituary
rock  : octahedral fita sociologist contained hypnotists
dog   : byng unclassified meridian dubnium midrashim
jump  : twice utmost mickiewicz torturing boasts
five  : stott faction gallen hike iquique
http  : fluctuates magnetospheric barbarossa aggressors slayton
----------
Epoch:   0 (33%)     Loss: 10.1201     Time: 12.99s
king  : cushions breakbeat navarro obituary devotions
rock  : octahedral fita sociologist contained hypnotists
dog   : byng unclassified meridian dubnium cows
jump  : twice utmost mickiewicz torturing boasts
five  : stott faction frontline affluent iquique
http  : fluctuates magnetospheric barbarossa by aggressors
----------
Epoch:   0 (34%)     Loss: 10.5759     Time: 13.06s
king  : cushions breakbeat navarro obituary devotions
rock  : octahedral fita sociologist contained hypnotists
dog   : byng unclassified meridian dubnium cows
jump  : twice utmost mickiewicz torturing bacterium
five  : stott faction iquique frontline affluent
http  : fluctuates magnetospheric barbarossa aggressors slayton
----------
Epoch:   0 (35%)     Loss: 10.6580     Time: 13.07s
king  : cushions breakbeat navarro obituary devotions
rock  : octahedral fita sociologist contained hypnotists
dog   : byng unclassified meridian dubnium cows
jump  : twice utmost mickiewicz torturing bacterium
five  : stott faction iquique affluent aelia
http  : fluctuates magnetospheric aggressors uprooted slayton
----------
Epoch:   0 (37%)     Loss: 9.4880     Time: 13.01s
king  : cushions breakbeat navarro obituary devotions
rock  : octahedral fita sociologist contained hypnotists
dog   : byng unclassified meridian dubnium cows
jump  : twice utmost mickiewicz torturing intermarriage
five  : stott faction iquique aelia affluent
http  : magnetospheric fluctuates aggressors uprooted slayton
----------
Epoch:   0 (38%)     Loss: 10.9283     Time: 12.98s
king  : cushions breakbeat navarro obituary maddox
rock  : octahedral fita sociologist contained hypnotists
dog   : byng unclassified cows meridian outraged
jump  : twice mickiewicz utmost torturing intermarriage
five  : stott faction two iquique affluent
http  : fluctuates magnetospheric by uprooted aggressors
----------
Epoch:   0 (39%)     Loss: 10.3218     Time: 12.99s
king  : cushions breakbeat navarro maddox obituary
rock  : octahedral fita sociologist contained hypnotists
dog   : byng unclassified cows outraged meridian
jump  : twice mickiewicz utmost torturing intermarriage
five  : stott faction chilean two affluent
http  : fluctuates magnetospheric by uprooted aggressors
----------
Epoch:   0 (40%)     Loss: 10.0964     Time: 13.01s
king  : cushions breakbeat navarro maddox obituary
rock  : octahedral fita sociologist contained hypnotists
dog   : byng unclassified cows dubnium outraged
jump  : twice mickiewicz utmost torturing intermarriage
five  : stott two faction chilean iquique
http  : fluctuates magnetospheric by uprooted aggressors
----------
Epoch:   0 (41%)     Loss: 10.6877     Time: 12.87s
king  : cushions breakbeat obituary navarro maddox
rock  : octahedral fita sociologist contained hypnotists
dog   : byng unclassified outraged yogi dubnium
jump  : twice mickiewicz utmost torturing intermarriage
five  : stott two faction chilean iquique
http  : fluctuates by magnetospheric uprooted aggressors
----------
Epoch:   0 (42%)     Loss: 10.2543     Time: 12.96s
king  : cushions breakbeat obituary navarro sv
rock  : octahedral fita sociologist contained negev
dog   : byng unclassified outraged cows yogi
jump  : twice mickiewicz utmost torturing intermarriage
five  : stott two gallen chilean faction
http  : fluctuates by uprooted magnetospheric slayton
----------
Epoch:   0 (43%)     Loss: 10.4638     Time: 13.06s
king  : cushions breakbeat obituary navarro maddox
rock  : octahedral fita sociologist contained hypnotists
dog   : byng unclassified outraged cows yogi
jump  : twice mickiewicz utmost torturing intermarriage
five  : stott two gallen chilean iquique
http  : fluctuates by uprooted magnetospheric slayton
----------
Epoch:   0 (44%)     Loss: 10.7690     Time: 12.94s
king  : cushions navarro breakbeat obituary maddox
rock  : octahedral fita sociologist contained hypnotists
dog   : byng unclassified outraged cows yogi
jump  : twice mickiewicz utmost intermarriage torturing
five  : stott two gallen chilean affluent
http  : fluctuates by uprooted slayton magnetospheric
----------
Epoch:   0 (45%)     Loss: 10.5092     Time: 12.96s
king  : cushions obituary navarro breakbeat devotions
rock  : octahedral fita sociologist contained hypnotists
dog   : byng unclassified outraged cows yogi
jump  : twice utmost mickiewicz intermarriage torturing
five  : stott two gallen chilean affluent
http  : fluctuates by uprooted slayton magnetospheric
----------
Epoch:   0 (46%)     Loss: 10.6718     Time: 12.99s
king  : obituary cushions navarro breakbeat maddox
rock  : octahedral fita sociologist hypnotists contained
dog   : unclassified byng outraged cows yogi
jump  : twice utmost mickiewicz intermarriage torturing
five  : stott two gallen affluent chilean
http  : uprooted fluctuates by magnetospheric slayton
----------
Epoch:   0 (48%)     Loss: 10.7360     Time: 12.98s
king  : obituary cushions navarro breakbeat devotions
rock  : octahedral fita sociologist sam hypnotists
dog   : unclassified byng outraged cows yogi
jump  : twice utmost mickiewicz intermarriage torturing
five  : stott two gallen affluent chilean
http  : by fluctuates slayton uprooted magnetospheric
----------
Epoch:   0 (49%)     Loss: 10.6004     Time: 13.01s
king  : obituary ruler cushions breakbeat navarro
rock  : octahedral sociologist fita sam hypnotists
dog   : unclassified byng outraged cows yogi
jump  : twice utmost mickiewicz intermarriage torturing
five  : stott two gallen affluent hungarians
http  : html uprooted slayton by fluctuates
----------
Epoch:   0 (50%)     Loss: 11.1435     Time: 12.94s
king  : obituary cushions breakbeat ruler navarro
rock  : octahedral sociologist fita sam skeleton
dog   : unclassified byng outraged yogi cows
jump  : twice utmost mickiewicz intermarriage torturing
five  : stott two seven gallen sofa
http  : html by slayton uprooted fluctuates
----------
Epoch:   0 (51%)     Loss: 10.2044     Time: 12.97s
king  : obituary breakbeat maddox navarro cushions
rock  : octahedral sociologist fita sam skeleton
dog   : byng unclassified outraged yogi jehu
jump  : twice utmost mickiewicz intermarriage torturing
five  : stott two seven gallen blackstone
http  : html by slayton fluctuates uprooted
----------
Epoch:   0 (52%)     Loss: 10.6340     Time: 12.98s
king  : obituary breakbeat maddox navarro cushions
rock  : octahedral sociologist fita sam hypnotists
dog   : byng unclassified outraged yogi jehu
jump  : twice mickiewicz utmost intermarriage torturing
five  : stott seven two gallen politician
http  : html slayton by uprooted fluctuates
----------
Epoch:   0 (53%)     Loss: 10.7180     Time: 12.99s
king  : obituary breakbeat cushions maddox devotions
rock  : octahedral sociologist fita sam hypnotists
dog   : byng unclassified outraged jehu yogi
jump  : twice mickiewicz utmost boasts intermarriage
five  : stott two seven politician blackstone
http  : html slayton uprooted medvedev fluctuates
----------
Epoch:   0 (54%)     Loss: 10.1787     Time: 12.98s
king  : obituary breakbeat devotions maddox navarro
rock  : octahedral sociologist fita sam skeleton
dog   : byng unclassified outraged jehu cows
jump  : twice mickiewicz boasts utmost intermarriage
five  : seven two stott gallen blackstone
http  : html uprooted slayton medvedev fluctuates
----------
Epoch:   0 (55%)     Loss: 10.5638     Time: 13.06s
king  : obituary breakbeat devotions maddox navarro
rock  : octahedral sociologist fita sam negev
dog   : byng unclassified outraged cows jehu
jump  : twice mickiewicz boasts utmost esas
five  : two seven stott gallen blackstone
http  : html uprooted slayton medvedev fluctuates
----------
Epoch:   0 (56%)     Loss: 9.9627     Time: 12.95s
king  : obituary breakbeat maddox devotions navarro
rock  : octahedral sociologist fita negev skeleton
dog   : unclassified byng outraged cows jehu
jump  : twice mickiewicz boasts utmost intermarriage
five  : two stott seven gallen blackstone
http  : html uprooted slayton medvedev barbarossa
----------
Epoch:   0 (58%)     Loss: 10.7108     Time: 13.00s
king  : obituary breakbeat devotions maddox navarro
rock  : octahedral sociologist fita negev isra
dog   : unclassified byng outraged cows jehu
jump  : twice mickiewicz boasts utmost intermarriage
five  : two stott seven gallen blackstone
http  : html uprooted slayton medvedev barbarossa
----------
Epoch:   0 (59%)     Loss: 10.1034     Time: 12.96s
king  : obituary devotions breakbeat navarro maddox
rock  : octahedral sociologist foxhound hypnotists negev
dog   : byng unclassified outraged cows yogi
jump  : twice mickiewicz boasts utmost esas
five  : two stott seven gallen blackstone
http  : html google uprooted medvedev slayton
----------
Epoch:   0 (60%)     Loss: 10.0191     Time: 12.97s
king  : devotions breakbeat obituary navarro maddox
rock  : octahedral sociologist hypnotists negev foxhound
dog   : byng unclassified outraged cows yogi
jump  : twice mickiewicz utmost boasts intermarriage
five  : two stott gallen seven blackstone
http  : html uprooted google medvedev slayton
----------
Epoch:   0 (61%)     Loss: 10.5854     Time: 12.99s
king  : obituary navarro maddox breakbeat devotions
rock  : octahedral sociologist hypnotists negev foxhound
dog   : byng unclassified outraged cows yogi
jump  : twice mickiewicz utmost boasts intermarriage
five  : two stott seven gallen blackstone
http  : html uprooted medvedev google org
----------
Epoch:   0 (62%)     Loss: 10.0884     Time: 12.95s
king  : obituary navarro maddox ruler breakbeat
rock  : octahedral sociologist hypnotists negev foxhound
dog   : byng unclassified cows outraged yogi
jump  : twice mickiewicz utmost boasts intermarriage
five  : two stott seven gallen blackstone
http  : html uprooted medvedev google slayton
----------
Epoch:   0 (63%)     Loss: 10.5402     Time: 12.99s
king  : obituary navarro ruler maddox breakbeat
rock  : octahedral sociologist hypnotists negev foxhound
dog   : byng unclassified outraged cows yogi
jump  : twice utmost mickiewicz boasts intermarriage
five  : two stott seven gallen three
http  : html google medvedev uprooted slayton
----------
Epoch:   0 (64%)     Loss: 9.9091     Time: 13.01s
king  : obituary navarro ruler maddox breakbeat
rock  : octahedral sociologist hypnotists negev foxhound
dog   : byng unclassified cows outraged yogi
jump  : twice utmost mickiewicz boasts intermarriage
five  : two stott seven three gallen
http  : html medvedev google uprooted slayton
----------
Epoch:   0 (65%)     Loss: 9.9044     Time: 12.95s
king  : obituary ruler emperor navarro heir
rock  : octahedral sociologist negev hypnotists foxhound
dog   : byng unclassified cows outraged yogi
jump  : twice boasts utmost mickiewicz thorium
five  : two seven stott three six
http  : html medvedev org google uprooted
----------
Epoch:   0 (66%)     Loss: 10.2254     Time: 12.91s
king  : obituary ruler emperor navarro devotions
rock  : octahedral sociologist negev foxhound fita
dog   : byng unclassified cows outraged yogi
jump  : twice boasts utmost mickiewicz thorium
five  : two seven stott three six
http  : html medvedev google org uprooted
----------
Epoch:   0 (68%)     Loss: 10.3730     Time: 12.99s
king  : obituary ruler devotions heir navarro
rock  : octahedral sociologist negev foxhound fita
dog   : byng unclassified cows outraged yogi
jump  : twice boasts utmost mickiewicz thorium
five  : two seven stott zero six
http  : html medvedev org google uprooted
----------
Epoch:   0 (69%)     Loss: 10.3639     Time: 12.99s
king  : obituary devotions ruler ingenious heir
rock  : octahedral sociologist negev foxhound funk
dog   : byng unclassified cows jehu outraged
jump  : twice boasts mickiewicz utmost thorium
five  : two seven stott zero six
http  : html medvedev org google uprooted
----------
Epoch:   0 (70%)     Loss: 9.9222     Time: 12.99s
king  : obituary devotions heir ingenious maddox
rock  : octahedral sociologist negev foxhound funk
dog   : byng unclassified cows outraged jehu
jump  : twice boasts mickiewicz utmost thorium
five  : two seven zero stott three
http  : html medvedev org google uprooted
----------
Epoch:   0 (71%)     Loss: 9.3872     Time: 13.00s
king  : obituary devotions heir maddox ingenious
rock  : octahedral sociologist foxhound funk negev
dog   : byng unclassified cows outraged jehu
jump  : twice boasts mickiewicz utmost thorium
five  : two zero seven six three
http  : html medvedev org google uprooted
----------
Epoch:   0 (72%)     Loss: 9.8873     Time: 12.94s
king  : obituary maddox heir devotions navarro
rock  : octahedral sociologist negev funk foxhound
dog   : byng cows unclassified outraged padre
jump  : twice boasts utmost mickiewicz thorium
five  : two zero seven six three
http  : html medvedev org google uprooted
----------
Epoch:   0 (73%)     Loss: 10.5275     Time: 12.95s
king  : obituary maddox devotions heir ingenious
rock  : octahedral sociologist funk foxhound contained
dog   : cows byng unclassified outraged padre
jump  : twice boasts mickiewicz utmost surfboard
five  : two zero seven three six
http  : html org medvedev google se
----------
Epoch:   0 (74%)     Loss: 10.5691     Time: 13.02s
king  : obituary maddox devotions heir ingenious
rock  : octahedral sociologist foxhound skeleton thyme
dog   : cows byng unclassified yogi padre
jump  : twice boasts utmost mickiewicz surfboard
five  : two zero seven three six
http  : html org medvedev google se
----------
Epoch:   0 (75%)     Loss: 9.7746     Time: 12.98s
king  : obituary maddox heir devotions tot
rock  : octahedral sociologist funk skeleton foxhound
dog   : cows byng unclassified yogi padre
jump  : twice boasts mickiewicz utmost surfboard
five  : two seven zero nine six
http  : html org medvedev google se
----------
Epoch:   0 (76%)     Loss: 10.3947     Time: 12.98s
king  : maddox obituary devotions heir tot
rock  : octahedral sociologist funk skeleton negev
dog   : unclassified cows byng yogi jehu
jump  : twice boasts utmost mickiewicz surfboard
five  : two zero seven six nine
http  : html org se medvedev google
----------
Epoch:   0 (77%)     Loss: 10.2181     Time: 13.03s
king  : maddox obituary heir tot devotions
rock  : octahedral sociologist funk skeleton negev
dog   : unclassified cows byng outraged yogi
jump  : twice utmost boasts mickiewicz surfboard
five  : two seven zero six three
http  : html org se medvedev google
----------
Epoch:   0 (79%)     Loss: 10.1526     Time: 12.99s
king  : maddox obituary heir tot devotions
rock  : octahedral sociologist skeleton funk negev
dog   : cows unclassified byng gingerbread outraged
jump  : twice utmost boasts mickiewicz surfboard
five  : two seven zero six three
http  : html org se google medvedev
----------
Epoch:   0 (80%)     Loss: 10.2632     Time: 13.05s
king  : maddox obituary heir devotions aleksandar
rock  : octahedral sociologist funk skeleton sam
dog   : cows unclassified byng outraged gingerbread
jump  : twice surfboard utmost boasts mickiewicz
five  : two seven zero six three
http  : html org se google medvedev
----------
Epoch:   0 (81%)     Loss: 10.6909     Time: 13.00s
king  : maddox obituary ingenious aleksandar stupor
rock  : octahedral sociologist funk skeleton musicians
dog   : cows unclassified byng outraged gingerbread
jump  : twice surfboard boasts utmost mickiewicz
five  : two zero seven three six
http  : html org google se medvedev
----------
Epoch:   0 (82%)     Loss: 10.4086     Time: 13.02s
king  : maddox obituary stupor ingenious aleksandar
rock  : octahedral funk sociologist skeleton musicians
dog   : cows unclassified byng outraged gingerbread
jump  : twice boasts utmost surfboard mickiewicz
five  : two seven zero three six
http  : html org google se com
----------
Epoch:   0 (83%)     Loss: 10.4131     Time: 12.94s
king  : maddox obituary stupor ingenious aleksandar
rock  : octahedral sociologist funk skeleton music
dog   : unclassified cows byng gingerbread yogi
jump  : twice surfboard boasts utmost mickiewicz
five  : two zero seven three six
http  : html org google se com
----------
Epoch:   0 (84%)     Loss: 9.8197     Time: 12.95s
king  : maddox obituary stupor aleksandar ingenious
rock  : octahedral funk skeleton sociologist musicians
dog   : cows unclassified byng yogi outraged
jump  : twice surfboard boasts utmost mickiewicz
five  : two zero seven three fixative
http  : html org google se medvedev
----------
Epoch:   0 (85%)     Loss: 10.5381     Time: 12.97s
king  : maddox obituary stupor aleksandar ingenious
rock  : octahedral funk sociologist skeleton musicians
dog   : unclassified cows byng outraged yogi
jump  : twice surfboard boasts utmost mickiewicz
five  : two zero nine seven six
http  : html org google se medvedev
----------
Epoch:   0 (86%)     Loss: 10.3994     Time: 12.98s
king  : obituary maddox ingenious stupor aleksandar
rock  : octahedral funk skeleton sociologist negev
dog   : unclassified byng cows outraged yogi
jump  : twice surfboard boasts utmost mickiewicz
five  : two zero seven six nine
http  : html org google se com
----------
Epoch:   0 (87%)     Loss: 10.1399     Time: 12.96s
king  : obituary maddox stupor ingenious aleksandar
rock  : octahedral funk sociologist skeleton thyme
dog   : unclassified cows byng outraged yogi
jump  : twice surfboard boasts mickiewicz utmost
five  : two zero seven six one
http  : html org google se com
----------
Epoch:   0 (89%)     Loss: 10.6667     Time: 12.98s
king  : obituary maddox ingenious aleksandar stupor
rock  : octahedral sociologist funk skeleton musicians
dog   : unclassified byng cows outraged yogi
jump  : twice surfboard boasts mickiewicz utmost
five  : two seven zero six nine
http  : html org google com se
----------
Epoch:   0 (90%)     Loss: 10.2227     Time: 12.95s
king  : obituary maddox ingenious stupor aleksandar
rock  : octahedral funk sociologist skeleton musicians
dog   : cows byng unclassified outraged yogi
jump  : twice surfboard boasts mickiewicz utmost
five  : two zero seven six nine
http  : html org google se com
----------
Epoch:   0 (91%)     Loss: 10.3509     Time: 12.93s
king  : obituary maddox stupor aleksandar hereditary
rock  : octahedral sociologist skeleton funk inspirational
dog   : cows byng unclassified outraged yogi
jump  : twice surfboard mickiewicz utmost boasts
five  : two zero seven six one
http  : html org google se webcam
----------
Epoch:   0 (92%)     Loss: 9.9819     Time: 12.98s
king  : obituary maddox stupor ruler hereditary
rock  : octahedral sociologist skeleton funk inspirational
dog   : cows byng unclassified outraged yogi
jump  : twice surfboard boasts mickiewicz hydrological
five  : two seven zero six one
http  : html org google com se
----------
Epoch:   0 (93%)     Loss: 9.2164     Time: 12.95s
king  : obituary maddox ruler stupor hereditary
rock  : octahedral skeleton sociologist funk mixing
dog   : cows byng unclassified outraged yogi
jump  : twice surfboard boasts utmost mickiewicz
five  : two seven zero six nine
http  : html org google se webcam
----------
Epoch:   0 (94%)     Loss: 10.1278     Time: 12.95s
king  : obituary maddox ruler aleksandar stupor
rock  : octahedral skeleton sociologist funk thyme
dog   : cows byng unclassified outraged yogi
jump  : twice surfboard boasts utmost mickiewicz
five  : seven two zero six nine
http  : html org google com se
----------
Epoch:   0 (95%)     Loss: 9.9521     Time: 12.98s
king  : obituary maddox ruler aleksandar heir
rock  : octahedral skeleton sociologist funk thyme
dog   : cows byng unclassified outraged gingerbread
jump  : twice surfboard utmost boasts mickiewicz
five  : seven two zero nine six
http  : html org google com se
----------
Epoch:   0 (96%)     Loss: 10.0453     Time: 12.97s
king  : obituary maddox aleksandar stupor molloy
rock  : octahedral skeleton sociologist funk mixing
dog   : cows byng unclassified outraged yogi
jump  : twice surfboard boasts utmost hydrological
five  : seven two zero six nine
http  : html org google se com
----------
Epoch:   0 (97%)     Loss: 10.2882     Time: 12.95s
king  : obituary maddox aleksandar stupor molloy
rock  : octahedral skeleton sociologist thyme funk
dog   : cows byng unclassified outraged yogi
jump  : twice surfboard boasts utmost hydrological
five  : seven two zero six one
http  : html org google se webcam
----------
Epoch:   0 (99%)     Loss: 10.2121     Time: 12.96s
king  : obituary maddox aleksandar heir stupor
rock  : octahedral skeleton sociologist thyme funk
dog   : cows byng unclassified outraged gingerbread
jump  : twice surfboard boasts hydrological utmost
five  : two seven zero one six
http  : html org google com se
----------
Epoch:   0 (100%)     Loss: 10.7685     Time: 12.90s
king  : obituary heir aleksandar maddox stupor
rock  : octahedral mixing skeleton funk sociologist
dog   : cows unclassified byng gingerbread yogi
jump  : twice surfboard boasts utmost hydrological
five  : two seven zero six one
http  : html org google com htm
----------

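After one epoch the model picks up numbers and web domains: the nearest neighbors of "five" are other number words (two, seven, zero, six, one), and those of "http" are html, org, google and com. The neighbors of "king", "rock" and "dog" are still mostly unrelated, so it needs at least a couple more epochs.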
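
The word lists in the log above are nearest neighbors in embedding space. As a minimal sketch of how such a lookup might work, assuming cosine similarity over the rows of the embedding matrix: the helper `nearest_words`, the toy vocabulary and the 2-dimensional toy vectors below are all hypothetical and made up for illustration; they are not the notebook's actual code or trained weights.

```python
import numpy as np

def nearest_words(query, embeddings, word2idx, idx2word, k=5):
    """Return the k words most similar to `query` by cosine similarity.

    Hypothetical helper for illustration; not the notebook's own function.
    """
    # Scale every row to unit length so a dot product equals cosine similarity
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)

    # Similarity of every word to the query word
    sims = unit @ unit[word2idx[query]]
    sims[word2idx[query]] = -np.inf  # exclude the query word itself

    # Indices of the k largest similarities, best first
    top = np.argsort(-sims)[:k]
    return [idx2word[i] for i in top]

# Toy data (made up): number words point one way, web words another
toy_vocab = ['five', 'two', 'seven', 'http', 'html']
toy_word2idx = {w: i for i, w in enumerate(toy_vocab)}
toy_embeddings = np.array([
    [1.0, 0.1],   # five
    [0.9, 0.2],   # two
    [0.8, 0.3],   # seven
    [0.1, 1.0],   # http
    [0.2, 0.9],   # html
])

toy_neighbors = nearest_words('five', toy_embeddings, toy_word2idx, toy_vocab, k=2)
print(toy_neighbors)
```

With a trained model, the embedding layer's weight matrix (copied to a NumPy array) would take the place of `toy_embeddings`. Watching these lists settle between log entries is a cheap qualitative check alongside the loss, which stays noisy throughout the first epoch.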