Failing Word Analogies
Unfortunately, there's a big flaw in the linear projection trick.
Code
Let's use whatlies to explore these analogies.
import numpy as np
import pandas as pd

from whatlies import Embedding, EmbeddingSet
from whatlies.transformers import Pca
from whatlies.language import FasttextLanguage, SpacyLanguage, BytePairLanguage

lang_ft = FasttextLanguage("cc.en.300.bin")
lang_sp = SpacyLanguage("en_core_web_md")
Similar to king
We can start by retrieving the most similar embeddings based on cosine distance.
lang_ft.score_similar(lang_ft['king'], n=10, metric='cosine')
This gives us these results.
[(Emb[king], 0.0),
(Emb[kings], 0.2449641227722168),
(Emb[queen], 0.2931479215621948),
(Emb[King], 0.3408734202384949),
(Emb[prince], 0.35047459602355957),
(Emb[royal], 0.41696715354919434),
(Emb[throne], 0.42722034454345703),
(Emb[kingdom], 0.434279203414917),
(Emb[emperor], 0.44683873653411865),
(Emb[lord], 0.4479447603225708)]
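The score reported here is a cosine distance (1 minus the cosine similarity), so 0.0 means identical. If you want to double-check one of the numbers, you can recompute it from the raw vectors; a minimal sketch, assuming whatlies Embedding objects expose their vector through the .vector attribute.

def cosine_distance(u, v):
    # cosine distance = 1 - cosine similarity
    return 1 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

king_vec = lang_ft['king'].vector
queen_vec = lang_ft['queen'].vector

# should land close to the 0.2931 reported for queen in the list above
print(cosine_distance(king_vec, queen_vec))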
Similar to king - man + woman
We can also expand the query by adding operations.
lang_ft.score_similar(lang_ft['king'] - lang_ft['man'] + lang_ft['woman'], n=10, metric='cosine')
This gives us these results.
[(Emb[king], 0.2713325619697571),
(Emb[queen], 0.3457321524620056),
(Emb[kings], 0.45897185802459717),
(Emb[Queen], 0.49255800247192383),
(Emb[royal], 0.49954700469970703),
(Emb[King], 0.5179671049118042),
(Emb[throne], 0.554189920425415),
(Emb[princess], 0.5551300048828125),
(Emb[prince], 0.6072607636451721),
(Emb[palace], 0.623775064945221)]
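The + and - used here are overloaded on whatlies Embedding objects and come down to element-wise vector arithmetic. As a sketch (assuming the .vector attribute and the Embedding(name, vector) constructor), the same query can be built by hand.

# build the "king - man + woman" query directly from the raw vectors
vec = lang_ft['king'].vector - lang_ft['man'].vector + lang_ft['woman'].vector
manual_query = Embedding("king - man + woman", vec)

# should produce the same ranking as the overloaded operators above
lang_ft.score_similar(manual_query, n=10, metric='cosine')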
Similar to king - slow + fast
Let's try another one.
lang_ft.score_similar(lang_ft['king'] - lang_ft['slow'] + lang_ft['fast'], n=10, metric='cosine')
This gives us these results.
[(Emb[king], 0.20691156387329102),
(Emb[kings], 0.3835362195968628),
(Emb[queen], 0.45022904872894287),
(Emb[King], 0.45194685459136963),
(Emb[prince], 0.48818516731262207),
(Emb[royal], 0.5023854970932007),
(Emb[kingdom], 0.5079109072685242),
(Emb[throne], 0.5353788137435913),
(Emb[emperor], 0.5441315174102783),
(Emb[princess], 0.5490601658821106)]
Strange
It seems like the analogy barely works: king - man + woman ends up further away from queen (0.3457) than plain king does (0.2931), and even the nonsensical king - slow + fast still ranks queen among its closest neighbours.
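You can make this comparison explicit by computing the cosine distances to queen directly instead of eyeballing the ranked lists; a small sketch, again assuming the .vector attribute.

from scipy.spatial.distance import cosine

queen = lang_ft['queen'].vector
analogy = (lang_ft['king'] - lang_ft['man'] + lang_ft['woman']).vector
nonsense = (lang_ft['king'] - lang_ft['slow'] + lang_ft['fast']).vector

print(cosine(lang_ft['king'].vector, queen))  # roughly 0.29, plain king is closest
print(cosine(analogy, queen))                 # roughly 0.35
print(cosine(nonsense, queen))                # roughly 0.45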
Explore
There are many more of these examples worth exploring. In general though, it's safe to say that word analogies do not hold. If you're interested in exploring further, you may appreciate the helper functions below.
def to_dataf(emb_list_before, emb_list_after):
    """Turns before/after Embedding score-lists into a single dataframe."""
    names_before = [_[0].name for _ in emb_list_before]
    scores_before = [_[1] for _ in emb_list_before]
    names_after = [_[0].name for _ in emb_list_after]
    scores_after = [_[1] for _ in emb_list_after]
    res = pd.DataFrame({'before_word': names_before,
                        'before_score': scores_before,
                        'after_word': names_after,
                        'after_score': scores_after})
    return (res
            .assign(before_score=lambda d: np.round(d['before_score'], 4))
            .assign(after_score=lambda d: np.round(d['after_score'], 4)))
def retrieve_most_similar(lang, start, positive=(), negative=(), orthogonal=(), unto=(), n=10, metric='cosine'):
    """Utility function to quickly perform arithmetic and get an overview."""
    start_emb = lang[start]
    base_dist = lang.score_similar(start_emb, n=n, metric=metric)
    for pos in positive:
        start_emb = start_emb + lang[pos]
    for neg in negative:
        start_emb = start_emb - lang[neg]
    for ort in orthogonal:
        # remove the component along this embedding ("away from")
        start_emb = start_emb | lang[ort]
    for un in unto:
        # project onto this embedding ("map unto")
        start_emb = start_emb >> lang[un]
    proj_dist = lang.score_similar(start_emb, n=n, metric=metric)
    return to_dataf(base_dist, proj_dist)
retrieve_most_similar(lang_sp, start="king", positive=["woman"], negative=["man"])
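The helper also accepts orthogonal and unto arguments, which here use whatlies' | ("away from") and >> ("map unto") operators rather than plain subtraction. A couple of hypothetical follow-up queries, for illustration:

# project the "man" direction out of "king" instead of subtracting the vector
retrieve_most_similar(lang_ft, start="king", orthogonal=["man"])

# or map "king" onto the "royal" direction and inspect what lands nearby
retrieve_most_similar(lang_ft, start="king", unto=["royal"])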
Exercises
Try to answer the following questions to test your knowledge.
- What other analogies can you come up with besides king - man + woman? Can you verify whether these hold?
- If you'd like to do a small coding exercise: can you confirm that analogies don't hold for BERT-like models?