This repo contains background information and implementations (PyTorch, TensorFlow) of the Inception Score (IS) and the Fréchet Inception Distance (FID). It is a handy toolbox that you can easily add to your projects. The TF implementations are intended to compute exactly the same output as the official ones, so they can be used for reporting in papers. Discussions, PRs, and issues are very welcome.
Put the metrics/ folder in your project, then see the PyTorch section below and the head comment of each .py file for usage.
Some files under res/ also need to be downloaded; see res/ for details.
TF implementations (almost the same as the official ones, only the interface is changed; results can be reported in papers)
- inception score
- FID score
- calculate stats (mu, sigma)
PyTorch implementation (results CANNOT be reported in papers, but it gives a quick view)
- pytorch, torchvision, scipy, numpy, tqdm
- inception score, gets around mean=9.67278, std=0.14992 for CIFAR-10 train data when n_split=10
- FID score
- calculate stats for custom images in a folder (mu, sigma)
- multi-GPU support by
- e.g.
will use 4 GPUs.
command line usage
calculate IS, FID
    # calc IS score on CIFAR10, will download CIFAR10 data to ../data/cifar10
    python

    # calc IS score on custom images in a folder/
    python --path foldername/

    # calc IS, FID score on custom images in a folder/, compared to CIFAR10 (given precalculated stats)
    python --path foldername/ --fid res/stats_pytorch/fid_stats_cifar10_train.npz

    # calc FID on custom images in two folders/
    python --path foldername1/ --fid foldername2/

    # calc FID on two precalculated stats
    python --path res/stats_pytorch/fid_stats_cifar10_train.npz --fid res/stats_pytorch/fid_stats_cifar10_train.npz
precalculate stats
    # precalculate stats and store them as npz for CIFAR-10, will download CIFAR10 data to ../data/cifar10
    python --save-stats-path res/stats_pytorch/fid_stats_cifar10_train.npz

    # precalculate stats and store them as npz for images in folder/
    python --path foldername/ --save-stats-path res/stats_pytorch/fid_stats_folder.npz
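As a rough illustration of what "precalculated stats" means here, the sketch below computes the mean vector and covariance matrix of a set of feature vectors and round-trips them through an .npz file. It is a minimal NumPy sketch, not this repo's code: the helper names `compute_stats`, `save_stats`, and `load_stats` are hypothetical, and the key names `mu` and `sigma` inside the .npz file are an assumption.

```python
import numpy as np


def compute_stats(features):
    """Return (mu, sigma) for an (N, D) array of feature vectors."""
    features = np.asarray(features, dtype=np.float64)
    mu = features.mean(axis=0)
    # rowvar=False: each row is one image, each column one feature dimension.
    sigma = np.cov(features, rowvar=False)
    return mu, sigma


def save_stats(path, mu, sigma):
    # The key names "mu" and "sigma" are an assumption made for this sketch.
    np.savez(path, mu=mu, sigma=sigma)


def load_stats(path):
    with np.load(path) as data:
        return data["mu"], data["sigma"]
```

In practice the features would be the 2048-dim Inception activations of the images, so `mu` has shape (2048,) and `sigma` has shape (2048, 2048).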
in code usage
: the image tensor has already been normalized by mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]
: the image tensor has already been normalized by mean=[0.500, 0.500, 0.500], std=[0.500, 0.500, 0.500]
    import torch
    import torchvision.datasets as dset
    import torchvision.transforms as transforms

    from metrics import is_fid_pytorch

    cuda = torch.cuda.is_available()  # assumed here to be a boolean flag
    C, H, W = 3, 32, 32               # e.g. CIFAR-10-sized RGB images

    # using precalculated stats (.npz) for FID calculation
    is_fid_model = is_fid_pytorch.ScoreModel(mode=2, stats_file='res/stats_pytorch/fid_stats_cifar10_train.npz', cuda=cuda)
    # placeholder for your images: torch.Tensor in -1~1,
    # normalized by mean=[0.500, 0.500, 0.500], std=[0.500, 0.500, 0.500]
    imgs_nchw = torch.Tensor(50000, C, H, W)
    is_mean, is_std, fid = is_fid_model.get_score_image_tensor(imgs_nchw)

    # we can also pass in mu, sigma for get_score_image_tensor()
    is_fid_model = is_fid_pytorch.ScoreModel(mode=2, cuda=cuda)
    mu, sigma = is_fid_pytorch.read_stats_file('res/stats_pytorch/fid_stats_cifar10_train.npz')
    is_mean, is_std, fid = is_fid_model.get_score_image_tensor(imgs_nchw, mu1=mu, sigma1=sigma)

    # if FID is not needed
    is_fid_model = is_fid_pytorch.ScoreModel(mode=2, cuda=cuda)
    is_mean, is_std, _ = is_fid_model.get_score_image_tensor(imgs_nchw)

    # if you want the stats (mu, sigma) of imgs_nchw, pass return_stats=True
    is_mean, is_std, _, mu, sigma = is_fid_model.get_score_image_tensor(imgs_nchw, return_stats=True)

    # for a pytorch dataset, use get_score_dataset() instead of
    # get_score_image_tensor(); the rest of the usage is the same
    cifar = dset.CIFAR10(root='../data/cifar10', download=True,
                         transform=transforms.Compose([
                             transforms.Resize(32),
                             transforms.ToTensor(),
                             transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
                         ]))
    # IgnoreLabelDataset (its import is not shown in the original) presumably
    # wraps the dataset so that it yields images only
    is_mean, is_std, _ = is_fid_model.get_score_dataset(IgnoreLabelDataset(cifar))
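If your images were normalized with the ImageNet statistics but you need the mean=0.5, std=0.5 convention (values in -1~1), the two are related by a simple per-channel affine map. The following is a minimal NumPy sketch of that conversion, not part of this repo; the function name `imagenet_to_minus_one_one` is hypothetical, and the same arithmetic applies to a `torch.Tensor` with broadcasting.

```python
import numpy as np

# Per-channel statistics, shaped to broadcast over NCHW arrays.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406]).reshape(1, 3, 1, 1)
IMAGENET_STD = np.array([0.229, 0.224, 0.225]).reshape(1, 3, 1, 1)


def imagenet_to_minus_one_one(imgs_nchw):
    """Map ImageNet-normalized NCHW images to the mean=0.5, std=0.5 convention."""
    # Undo (x - mean) / std to recover pixel values in [0, 1].
    pixels = imgs_nchw * IMAGENET_STD + IMAGENET_MEAN
    # Re-apply normalization with mean=0.5, std=0.5, which lands in [-1, 1].
    return (pixels - 0.5) / 0.5
```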
Inception Score (IS)
- MEANINGFUL: A generated image should be clear, so the output probability of a classifier network should look like [0.9, 0.05, ...] (heavily skewed toward one class). That is, $p(y|\mathbf{x})$ has low entropy.
- DIVERSITY: With 10 classes, the generated images should be spread evenly across them, so that the marginal distribution $p(y) = \frac{1}{N} \sum_{i=1}^{N} p(y|\mathbf{x}^{(i)})$ has high entropy.
- Better models: the KL divergence between $p(y|\mathbf{x})$ and $p(y)$ should be high.
- $\mathbf{IS} = \exp (\mathbb{E}_{\mathbf{x} \sim p_g} D_{KL} [p(y|\mathbf{x}) || p(y)] )$
  - where
    - $\mathbf{x}$ is sampled from the generated data
    - $p(y|\mathbf{x})$ is the output probability of Inception v3 when the input is $\mathbf{x}$
    - $p(y) = \frac{1}{N} \sum_{i=1}^{N} p(y|\mathbf{x}^{(i)})$ is the average output probability over all generated data (from InceptionV3, a 1000-dim vector)
    - $D_{KL} (\mathbf{p}||\mathbf{q}) = \sum_{j} p_{j} \log \frac{p_j}{q_j}$, where $j$ indexes the dimensions of the output probability
  - ideally, $p(y)$ is an evenly distributed vector
  - larger $\mathbf{IS}$ score -> larger KL divergence -> larger diversity and clearness
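To make the formula concrete, here is a minimal NumPy sketch that computes the score from a matrix of class probabilities. It is an illustration under stated assumptions, not this repo's implementation: the function name `inception_score` and the `eps` guard are hypothetical, the rows of `probs` are assumed to be softmax outputs that already sum to 1, and the marginal $p(y)$ is computed separately within each of the `n_split` chunks before the mean and standard deviation over chunks are reported.

```python
import numpy as np


def inception_score(probs, n_split=10, eps=1e-12):
    """Return (mean, std) of the Inception Score over n_split chunks.

    probs: (N, num_classes) array, each row a softmax output p(y|x).
    """
    probs = np.asarray(probs, dtype=np.float64)
    scores = []
    for chunk in np.array_split(probs, n_split):
        # Marginal p(y) of this chunk: average of the per-image distributions.
        p_y = chunk.mean(axis=0, keepdims=True)
        # KL(p(y|x) || p(y)) for every image; eps guards against log(0).
        kl = (chunk * (np.log(chunk + eps) - np.log(p_y + eps))).sum(axis=1)
        # exp of the mean KL divergence is the score of this chunk.
        scores.append(np.exp(kl.mean()))
    return float(np.mean(scores)), float(np.std(scores))
```

Two sanity checks follow from the definition: perfectly confident predictions spread evenly over K classes give a score of K, while identical uniform predictions for every image give a score of 1.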
- Official TF implementation: openai/improved-gan
- PyTorch implementation: sbarratt/inception-score-pytorch
- TF seems to provide a good implementation
- scipy.stats.entropy
- zhihu: Inception Score: principles and limitations
- A Note on the Inception Score
Fréchet Inception Distance (FID)
- $\mathbf{FID} = ||\mu_r - \mu_g||^2 + Tr(\Sigma_{r} + \Sigma_{g} - 2(\Sigma_r \Sigma_g)^{1/2})$
  - where
    - $Tr$ is the trace of a matrix (wikipedia)
    - $X_r \sim \mathcal{N}(\mu_r, \Sigma_r)$ and $X_g \sim \mathcal{N}(\mu_g, \Sigma_g)$ are the 2048-dim activations of the InceptionV3 pool3 layer
    - $\mu_r$ is the mean of the real images' features
    - $\mu_g$ is the mean of the generated images' features
    - $\Sigma_r$ is the covariance matrix of the real images' features
    - $\Sigma_g$ is the covariance matrix of the generated images' features
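The formula can be evaluated directly from the two sets of statistics. The sketch below is a NumPy-only illustration, not this repo's implementation; the function name `frechet_distance` is hypothetical. Instead of forming the matrix square root explicitly (for example with `scipy.linalg.sqrtm`), it uses the fact that $Tr((\Sigma_r \Sigma_g)^{1/2})$ equals the sum of the square roots of the eigenvalues of $\Sigma_r \Sigma_g$, which are real and non-negative when both covariance matrices are positive semi-definite.

```python
import numpy as np


def frechet_distance(mu_r, sigma_r, mu_g, sigma_g):
    """Fréchet distance between N(mu_r, sigma_r) and N(mu_g, sigma_g)."""
    mu_r = np.asarray(mu_r, dtype=np.float64)
    mu_g = np.asarray(mu_g, dtype=np.float64)
    sigma_r = np.asarray(sigma_r, dtype=np.float64)
    sigma_g = np.asarray(sigma_g, dtype=np.float64)

    diff = mu_r - mu_g
    # Eigenvalues of Sigma_r Sigma_g; tiny negative or imaginary parts can
    # appear from floating-point error, so keep the real part and clip at 0.
    eigvals = np.linalg.eigvals(sigma_r @ sigma_g)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    return float(diff @ diff + np.trace(sigma_r) + np.trace(sigma_g) - 2.0 * tr_sqrt)
```

For identical statistics the distance is 0 up to rounding, and with identity covariances it reduces to the squared Euclidean distance between the means.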
- Official TF implementation: bioinf-jku/TTUR
- PyTorch implementation: mseitzer/pytorch-fid
- TF seems to provide a good implementation
- zhihu: Frechet Inception Score (FID)
- Explanation from Neal Jean