• Stars: 145
• Rank: 246,140 (Top 5 %)
• Language: Python
• License: Creative Commons ...
• Created almost 8 years ago
• Updated 6 months ago

Repository Details

Calculate the Gini coefficient of a numpy array.

gini

A Gini coefficient calculator in Python.

Overview

This is a function that calculates the Gini coefficient of a numpy array. Gini coefficients are often used to quantify income inequality.

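The function in gini.py is based on the third equation on the StatsDirect help page (http://www.statsdirect.com/help/default.htm#nonparametric_methods/gini.htm), which defines the Gini coefficient for values x_1, ..., x_n sorted in ascending order as: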

G = \dfrac{ \sum_{i=1}^{n} (2i - n - 1) x_i}{n  \sum_{i=1}^{n} x_i}
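
As a quick sanity check on the formula, the sketch below compares it against the mean-absolute-difference definition of the Gini coefficient on a four-element sample. The names gini_rank_formula and gini_pairwise are hypothetical helpers written for this illustration; they are not part of gini.py, and the pairwise definition is brought in here only for comparison.

```python
import numpy as np


def gini_rank_formula(x):
    """Sorted-rank form: sum((2i - n - 1) * x_i) / (n * sum(x))."""
    x = np.sort(np.asarray(x, dtype=float))  # the formula assumes ascending order
    n = x.size
    i = np.arange(1, n + 1)  # 1-based rank of each sorted value
    return np.sum((2 * i - n - 1) * x) / (n * np.sum(x))


def gini_pairwise(x):
    """Mean-absolute-difference form: sum_ij |x_i - x_j| / (2 * n^2 * mean(x))."""
    x = np.asarray(x, dtype=float)
    n = x.size
    diffs = np.abs(x[:, None] - x[None, :])  # all ordered pairs (i, j)
    return diffs.sum() / (2 * n**2 * x.mean())


sample = [4, 1, 3, 2]
print(gini_rank_formula(sample))  # 0.25
print(gini_pairwise(sample))      # 0.25
```

For the sorted sample (1, 2, 3, 4) the weights 2i - n - 1 are (-3, -1, 1, 3), so the numerator is 10 and the denominator is 4 * 10 = 40.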

Examples

For a very unequal sample, 999 zeros and a single one,

>>> import numpy as np
>>> from gini import gini
>>> a = np.zeros(1000)
>>> a[0] = 1.0

the Gini coefficient is very close to 1.0:

>>> gini(a)
0.99890010998900103

For uniformly distributed random numbers, it will be low, around 0.33:

>>> s = np.random.uniform(-1,0,1000)
>>> gini(s)
0.3295183767105907

For a homogeneous sample, the Gini coefficient is 0.0:

>>> b = np.ones((1000))
>>> gini(b)
0.0
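
The first two values above can be checked by hand. The derivation below is a sketch under two assumptions: that gini() adds an offset of 1e-7 to every element (written as \varepsilon here), and that the shifted uniform sample behaves like a draw from Uniform(0, 1). The Lorenz-curve form G = 1 - 2\int_0^1 L(p)\,dp is a standard identity used here for illustration; it is not taken from this README.

```latex
% Unequal sample: n = 1000, x_1 = \dots = x_{999} = \varepsilon, x_{1000} = 1 + \varepsilon,
% with \varepsilon = 10^{-7}. Since \sum_{i=1}^{n} (2i - n - 1) = 0, the offset
% cancels in the numerator and only the single 1 contributes.
G = \dfrac{\varepsilon \sum_{i=1}^{n} (2i - n - 1) + (2n - n - 1) \cdot 1}{n\,(1 + n\varepsilon)}
  = \dfrac{999}{1000 \times 1.0001}
  \approx 0.9989001

% Uniform(0, 1): the quantile function is Q(p) = p and the mean is \mu = 1/2,
% so the Lorenz curve is L(p) = \frac{1}{\mu} \int_0^p Q(t)\,dt = p^2.
G = 1 - 2 \int_0^1 p^2 \, dp = 1 - \dfrac{2}{3} = \dfrac{1}{3}
```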

Input Assumptions

The formula above, as implemented here, requires a 1d vector of non-zero, non-negative values sorted in ascending order. gini() enforces each of these requirements itself, so the input may violate any of the four:

import numpy as np

def gini(array):
    """Calculate the Gini coefficient of a numpy array."""
    # based on bottom eq: http://www.statsdirect.com/help/content/image/stat0206_wmf.gif
    # from: http://www.statsdirect.com/help/default.htm#nonparametric_methods/gini.htm
    # flatten() returns a copy, so the caller's array is not modified;
    # the float cast keeps the in-place operations below valid for integer input
    array = np.asarray(array, dtype=float).flatten()  # all values are treated equally, arrays must be 1d
    if np.amin(array) < 0:
        array -= np.amin(array)  # values cannot be negative
    array += 0.0000001  # values cannot be 0
    array = np.sort(array)  # values must be sorted
    index = np.arange(1, array.shape[0] + 1)  # index per array element
    n = array.shape[0]  # number of array elements
    return np.sum((2 * index - n - 1) * array) / (n * np.sum(array))  # Gini coefficient
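
The handling of these input assumptions can be exercised with a small check. The sketch below uses a copy of the function with one assumed addition, a cast to float so that integer input also works; the variable names are hypothetical. It compares a sorted 1d baseline against an unsorted 2d array and a negatively shifted array holding the same values.

```python
import numpy as np


def gini(array):
    """Copy of the function above, with an assumed float cast for integer input."""
    array = np.asarray(array, dtype=float).flatten()  # 1d copy of the input
    if np.amin(array) < 0:
        array -= np.amin(array)  # shift so the minimum is 0
    array += 0.0000001  # avoid exact zeros
    array = np.sort(array)  # ascending order
    index = np.arange(1, array.shape[0] + 1)
    n = array.shape[0]
    return np.sum((2 * index - n - 1) * array) / (n * np.sum(array))


baseline = gini(np.array([0.0, 1.0, 2.0, 3.0]))         # sorted, 1d, non-negative
unsorted_2d = gini(np.array([[3.0, 0.0], [2.0, 1.0]]))  # same values, 2d and unsorted
shifted = gini(np.array([-2.0, -1.0, 0.0, 1.0]))        # same values minus 2

print(np.isclose(baseline, unsorted_2d))  # True
print(np.isclose(baseline, shifted))      # True
```

Note that the shift makes the result depend only on differences from the minimum, so an array with negative values gets the same coefficient as its shifted, non-negative counterpart.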

Notes

Many other Gini coefficient functions found online do not produce equivalent results, which is why I wrote this one.
