Caption similarity as measured by NLP tools
[1]:
import tensorflow as tf
import tensorflow_hub as hub
import seaborn as sns
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import scipy.io
import string
import operator
import matplotlib as mpl
[2]:
import caption_contest_data as ccd
The sentence encoder
We’re using the Universal Sentence Encoder (version 2, from TF Hub) for sentence embedding. If you don’t have it downloaded to the local directory used below, replace the cell that loads it from disk with
embed = hub.Module("https://tfhub.dev/google/universal-sentence-encoder/2")
to have the model downloaded.
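The choice between the local copy and the TF Hub URL can also be made automatically. The following is a minimal sketch, not part of the original notebook: the helper name `resolve_module_handle` is hypothetical, and the two constants simply reuse the URL and local path that appear in this notebook.

```python
import os

# Both values are taken from the cells in this notebook.
MODULE_URL = "https://tfhub.dev/google/universal-sentence-encoder/2"
LOCAL_DIR = "./universal-sentence-encoder"


def resolve_module_handle(local_dir=LOCAL_DIR, url=MODULE_URL):
    """Hypothetical helper: prefer the local copy if the directory exists,
    otherwise fall back to the TF Hub URL (which triggers a download)."""
    return local_dir if os.path.isdir(local_dir) else url


print(resolve_module_handle())
```

The returned string could then be passed along as `hub.Module(resolve_module_handle())`, so the same cell works whether or not the model has already been downloaded.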
[3]:
# hub.Module is a TF1-style API and needs graph mode, so turn eager execution off.
tf.compat.v1.disable_eager_execution()
[4]:
import os
# Log download progress while TF Hub fetches the module.
os.environ["TFHUB_DOWNLOAD_PROGRESS"] = "1"
embed = hub.Module("https://tfhub.dev/google/universal-sentence-encoder/2")
This cell cached the downloaded files (though I don’t know where on macOS).
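As far as I know, TF Hub reads the cache location from the `TFHUB_CACHE_DIR` environment variable and otherwise falls back to a `tfhub_modules` folder inside the system temporary directory. The sketch below prints that expected location. The function name `guess_tfhub_cache_dir` is hypothetical, and the fallback rule is an assumption about the library's default rather than something checked in this notebook.

```python
import os
import tempfile


def guess_tfhub_cache_dir():
    """Hypothetical helper: return the directory where TF Hub is expected
    to cache downloaded modules (assumed default, not verified here)."""
    explicit = os.environ.get("TFHUB_CACHE_DIR")
    if explicit:
        return explicit
    # Assumed fallback: <system temp dir>/tfhub_modules
    return os.path.join(tempfile.gettempdir(), "tfhub_modules")


print(guess_tfhub_cache_dir())
```

Setting `os.environ["TFHUB_CACHE_DIR"]` to a known folder before the model is loaded should make the cache location predictable on any platform, macOS included.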
[5]:
# embed = hub.Module("./universal-sentence-encoder")