Machine Vision Toolbox for Python

A Python Robotics Package. Powered by Spatial Maths, OpenCV and Open3D. QUT Centre for Robotics Open Source. License: MIT.

A Python implementation of the Machine Vision Toolbox for MATLAB®

Synopsis

The Machine Vision Toolbox for Python (MVTB-P) provides many functions that are useful in machine vision and vision-based control. The main components are:

  • An Image object with nearly 200 methods and properties that wrap functions from OpenCV, NumPy and SciPy. Methods support monadic and dyadic operations, filtering, edge detection, mathematical morphology and feature extraction (blobs, lines and point/corner features), as well as operator overloading. Images are stored as encapsulated NumPy arrays along with image metadata.
  • An object-oriented wrapper of Open3D functions that supports a subset of operations, but allows operator overloading and is compatible with the Spatial Math Toolbox.
  • A collection of camera projection classes for central (normal perspective), fisheye, catadioptric and spherical cameras.
  • Some advanced algorithms such as:
    • multiview geometry: camera calibration, stereo vision, bundle adjustment
    • bag of words

Advantages of this Python Toolbox are that:

  • it uses, as much as possible, OpenCV and NumPy, which are portable, efficient, comprehensive and mature collections of functions for image processing and feature extraction;
  • it wraps the OpenCV functions in a consistent way, hiding some of the gnarly details of OpenCV like conversion to/from float32 and the BGR color order;
  • it is similar to the Machine Vision Toolbox for MATLAB.

Getting going

Using pip

Install a snapshot from PyPI

% pip install machinevision-toolbox-python

From GitHub

Install the current code base from GitHub and pip install a link to that cloned copy

% git clone https://github.com/petercorke/machinevision-toolbox-python.git
% cd machinevision-toolbox-python
% pip install -e .

Examples

Reading and displaying an image

from machinevisiontoolbox import Image
mona = Image.Read("monalisa.png")
mona.disp()

Mona Lisa image

Images can also be returned by iterators that operate over folders, zip files, local cameras, web cameras and video files.

Simple image processing

The toolbox supports many operations on images such as 2D filtering, edge detection, mathematical morphology, colorspace conversion, padding, cropping, resizing, rotation and warping.

mona.smooth(sigma=5).disp()

Mona Lisa image with smoothing
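
As a rough illustration of what Gaussian smoothing with a given sigma involves, the sketch below builds a normalized 1D Gaussian kernel in plain Python. The function name `gaussian_kernel` and the default half-width of 3 sigma are assumptions made for this example; they are not taken from the toolbox, which may size its kernel differently.

```python
import math

def gaussian_kernel(sigma, half_width=None):
    """Return a normalized 1D Gaussian kernel as a list of floats.

    half_width defaults to ceil(3 * sigma), an assumption for this sketch.
    """
    if half_width is None:
        half_width = math.ceil(3 * sigma)
    weights = [math.exp(-(x * x) / (2 * sigma * sigma))
               for x in range(-half_width, half_width + 1)]
    total = sum(weights)
    return [w / total for w in weights]  # weights sum to 1

kernel = gaussian_kernel(sigma=5)
print(len(kernel))             # number of taps: 2 * half_width + 1
print(round(sum(kernel), 6))   # normalization check
```

A 2D Gaussian is separable, so applying a kernel like this along the rows and then along the columns of an image gives the 2D smoothing result.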

There are also many functions that operate on pairs of images. All the arithmetic operators are overloaded, and there are methods to combine images in more complex ways. Multiple images can be stacked horizontally, vertically or tiled in a 2D grid. For example, we could display the original and smoothed images side by side

Image.Hstack([mona, mona.smooth(sigma=5)]).disp()

where Hstack is a class method that creates a new image by stacking the images from its argument, an image sequence, horizontally.

Mona Lisa image with smoothing

Binary blobs

A common problem in robotic vision is to extract features from the image, to describe the position, size, shape and orientation of objects in the scene. For simple binary scenes blob features are commonly used.

im = Image.Read("shark2.png")   # read a binary image of two sharks
im.disp();   # display it with interactive viewing tool
blobs = im.blobs()  # find all the white blobs
print(blobs)

	┌───┬────────┬──────────────┬──────────┬───────┬───────┬─────────────┬────────┬────────┐
	│id │ parent │     centroid │     area │ touch │ perim │ circularity │ orient │ aspect │
	├───┼────────┼──────────────┼──────────┼───────┼───────┼─────────────┼────────┼────────┤
	│ 0 │     -1 │ 371.2, 355.2 │ 7.59e+03 │ False │ 557.6 │       0.341 │  82.9° │  0.976 │
	│ 1 │     -1 │ 171.2, 155.2 │ 7.59e+03 │ False │ 557.6 │       0.341 │  82.9° │  0.976 │
	└───┴────────┴──────────────┴──────────┴───────┴───────┴─────────────┴────────┴────────┘

where blobs is a list-like object and each element describes a blob in the scene. The element's attributes describe various parameters of the object, and methods can be used to overlay graphics such as bounding boxes and centroids

import matplotlib.pyplot as plt

blobs.plot_box(color="g", linewidth=2)  # put a green bounding box on each blob
blobs.plot_centroid(label=True)  # put a circle+cross on the centroid of each blob
plt.show(block=True)  # display the result

Binary image showing bounding boxes and centroids
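
To make the area and centroid attributes more concrete, here is a minimal sketch that computes them from image moments for a single blob. It is an illustration only: the function name `blob_moments` is hypothetical, the whole image is treated as one blob, and the toolbox itself relies on OpenCV rather than a pure-Python loop like this.

```python
def blob_moments(image):
    """Return (area, (uc, vc)) for the nonzero pixels of a 2D list.

    u is the column (horizontal) index and v is the row (vertical) index.
    """
    m00 = m10 = m01 = 0
    for v, row in enumerate(image):
        for u, pixel in enumerate(row):
            if pixel:
                m00 += 1      # zeroth moment: pixel count (area)
                m10 += u      # first moment in u
                m01 += v      # first moment in v
    if m00 == 0:
        raise ValueError("image contains no nonzero pixels")
    return m00, (m10 / m00, m01 / m00)

# a 3x2 rectangle of ones inside a 5x4 image
image = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
area, centroid = blob_moments(image)
print(area, centroid)  # 6 (2.0, 1.5)
```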

Binary blob hierarchy

A more complex image is

im = Image.Read("multiblobs.png")
im.disp()

Binary image with nested blobs

and we see that some blobs are contained within other blobs. The results in tabular form are

blobs  = im.blobs()
print(blobs)
	┌───┬────────┬───────────────┬──────────┬───────┬────────┬─────────────┬────────┬────────┐
	│id │ parent │      centroid │     area │ touch │  perim │ circularity │ orient │ aspect │
	├───┼────────┼───────────────┼──────────┼───────┼────────┼─────────────┼────────┼────────┤
	│ 0 │      1 │  898.8, 725.3 │ 1.65e+05 │ False │ 2220.0 │       0.467 │  86.7° │  0.754 │
	│ 1 │      2 │ 1025.0, 813.7 │ 1.06e+05 │ False │ 1387.9 │       0.769 │ -88.9° │  0.739 │
	│ 2 │     -1 │  938.1, 855.2 │ 1.72e+04 │ False │  490.7 │       1.001 │  88.7° │  0.862 │
	│ 3 │     -1 │  988.1, 697.2 │ 1.21e+04 │ False │  412.5 │       0.994 │ -87.8° │  0.809 │
	│ 4 │     -1 │  846.0, 511.7 │ 1.75e+04 │ False │  496.9 │       0.992 │ -90.0° │  0.778 │
	│ 5 │      6 │  291.7, 377.8 │  1.7e+05 │ False │ 1712.6 │       0.810 │ -85.3° │  0.767 │
	│ 6 │     -1 │  312.7, 472.1 │ 1.75e+04 │ False │  495.5 │       0.997 │ -89.9° │  0.777 │
	│ 7 │     -1 │  241.9, 245.0 │ 1.75e+04 │ False │  496.9 │       0.992 │ -90.0° │  0.777 │
	│ 8 │      9 │ 1228.0, 254.3 │ 8.14e+04 │ False │ 1215.2 │       0.771 │ -77.2° │  0.713 │
	│ 9 │     -1 │ 1225.2, 220.0 │ 1.75e+04 │ False │  496.9 │       0.992 │ -90.0° │  0.777 │
	└───┴────────┴───────────────┴──────────┴───────┴────────┴─────────────┴────────┴────────┘

We can display a label image, where the value of each pixel is the label of the blob that the pixel belongs to, that is, the blob's id attribute

labels = blobs.label_image()
labels.disp(colormap="viridis", ncolors=len(blobs), colorbar=dict(shrink=0.8, aspect=20*0.8))

False color label image

We can also think of the blobs as forming a hierarchy, and that relationship is reflected in the parent and children attributes of the blobs. We can also express it as a directed graph

blobs.dotfile(show=True)

Blob hierarchy as a graph
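
The parent and children relationship can be sketched without the toolbox. The example below takes the parent column of the ten-blob table printed earlier and derives the children lists, the root blobs and the nesting depth. The helper names `build_children` and `depth` are hypothetical and are not part of the toolbox API.

```python
# parent column of the ten-blob table; -1 means the blob has no parent
parents = [1, 2, -1, -1, -1, 6, -1, -1, 9, -1]

def build_children(parents):
    """Map each blob id to the list of ids that name it as their parent."""
    children = {i: [] for i in range(len(parents))}
    for blob_id, parent in enumerate(parents):
        if parent >= 0:
            children[parent].append(blob_id)
    return children

def depth(blob_id, parents):
    """Count how many ancestors lie above blob_id in the hierarchy."""
    d = 0
    while parents[blob_id] >= 0:
        blob_id = parents[blob_id]
        d += 1
    return d

children = build_children(parents)
roots = [i for i, p in enumerate(parents) if p < 0]
print(roots)                      # blobs with no parent
print(children[2], children[1])   # direct children of blobs 2 and 1
print(depth(0, parents))          # blob 0 -> blob 1 -> blob 2
```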

Camera modelling

We can create a model of a central (perspective) camera

from machinevisiontoolbox import CentralCamera
cam = CentralCamera(f=0.015, rho=10e-6, imagesize=[1280, 1024], pp=[640, 512], name="mycamera")
print(cam)
           Name: mycamera [CentralCamera]
     pixel size: 1e-05 x 1e-05
     image size: 1280 x 1024
           pose: t = 0, 0, 0; rpy/yxz = 0°, 0°, 0°
   principal pt: [     640      512]
   focal length: [   0.015    0.015]

and its intrinsic parameters are

print(cam.K)
	[[1.50e+03 0.00e+00 6.40e+02]
	 [0.00e+00 1.50e+03 5.12e+02]
	 [0.00e+00 0.00e+00 1.00e+00]]
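
The entries of this matrix follow directly from the constructor arguments: the focal length in pixels is f / rho, and the principal point fills the last column. The sketch below rebuilds the matrix in plain Python, assuming square pixels and zero skew; `intrinsic_matrix` is a hypothetical helper written for this example, not a toolbox function.

```python
def intrinsic_matrix(f, rho, pp):
    """Build a 3x3 intrinsic matrix for square pixels of side rho (metres)."""
    u0, v0 = pp
    return [
        [f / rho, 0.0, u0],   # focal length in pixels, principal point u0
        [0.0, f / rho, v0],   # same focal length vertically, principal point v0
        [0.0, 0.0, 1.0],
    ]

K = intrinsic_matrix(f=0.015, rho=10e-6, pp=(640, 512))
for row in K:
    print(row)
```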

We can define an arbitrary point in the world

P = [0.3, 0.4, 3.0]

and then project it into the camera

p = cam.project(P)
print(p)
	[790. 712.]

which is the corresponding coordinate in pixels. If we shift the camera slightly the image plane coordinate will also change

from spatialmath import SE3
p = cam.project(P, T=SE3(0.1, 0, 0))
print(p)
[740. 712.]
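
Both results can be checked by hand with the pinhole projection equations u = (f/rho) X/Z + u0 and v = (f/rho) Y/Z + v0. The sketch below is a simplified stand-in that assumes a camera with no rotation, so a camera translation t just shifts the point to P - t in the camera frame. The function name `project` and its keyword arguments are chosen for this example and do not reflect the toolbox signature.

```python
def project(P, f=0.015, rho=10e-6, pp=(640, 512), t=(0.0, 0.0, 0.0)):
    """Project world point P to pixels for a camera translated by t.

    The camera is assumed to have no rotation, so the point expressed in
    the camera frame is simply P - t.
    """
    X, Y, Z = (P[i] - t[i] for i in range(3))
    if Z <= 0:
        raise ValueError("point is not in front of the camera")
    u = (f / rho) * X / Z + pp[0]
    v = (f / rho) * Y / Z + pp[1]
    return u, v

P = [0.3, 0.4, 3.0]
print([round(c, 3) for c in project(P)])                 # [790.0, 712.0]
print([round(c, 3) for c in project(P, t=(0.1, 0, 0))])  # [740.0, 712.0]
```

Moving the camera 0.1 m in the positive x direction reduces the point's x coordinate in the camera frame from 0.3 to 0.2, which moves its image 50 pixels to the left.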

We can define an edge-based cube model and project it into the camera's image plane

from machinevisiontoolbox import mkcube
from spatialmath import SE3
X, Y, Z = mkcube(0.2, pose=SE3(0, 0, 1), edge=True)
cam.plot_wireframe(X, Y, Z)

Perspective camera view of cube

Color space

Plot the CIE chromaticity space

plot_chromaticity_diagram("xy");
plot_spectral_locus("xy")

CIE chromaticity space

Load the spectrum of sunlight at the Earth's surface and compute the CIE xy chromaticity coordinates

import numpy as np
nm = 1e-9
lam = np.linspace(400, 701, 5) * nm # visible light
sun_at_ground = loadspectrum(lam, "solar")
xy = lambda2xy(lam, sun_at_ground)  # "lambda" is a reserved word in Python, so pass lam
print(xy)
	[[0.33272798 0.3454013 ]]
print(colorname(xy, "xy"))
	khaki
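
Chromaticity coordinates are tristimulus values normalized by their sum, which removes overall brightness. The sketch below shows only that final normalization step, assuming the XYZ tristimulus values have already been obtained from the spectrum; `xyz_to_xy` is a hypothetical helper and not a toolbox function.

```python
def xyz_to_xy(X, Y, Z):
    """Convert CIE XYZ tristimulus values to xy chromaticity coordinates."""
    total = X + Y + Z
    if total == 0:
        raise ValueError("tristimulus values sum to zero")
    return X / total, Y / total

# equal tristimulus values give the equal-energy white point
x, y = xyz_to_xy(1.0, 1.0, 1.0)
print(round(x, 4), round(y, 4))  # 0.3333 0.3333
```

The sunlight chromaticity printed above, roughly (0.333, 0.345), lies close to this white point.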

Hough transform

im = Image.Read("church.png", mono=True)
edges = im.canny()
h = edges.Hough()
lines = h.lines_p(100, minlinelength=200, maxlinegap=5, seed=0)

im.disp(darken=True)
h.plot_lines(lines, "r--")

Hough transform
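
The voting idea behind the Hough transform can be sketched in a few lines. Note that this is the standard (non-probabilistic) transform with the line model rho = u cos(theta) + v sin(theta), written in plain Python for illustration; it is not the algorithm behind lines_p, and `hough_votes` is a hypothetical name.

```python
import math
from collections import Counter

def hough_votes(points, n_theta=180):
    """Accumulate votes over (theta in degrees, rho rounded to a pixel)."""
    votes = Counter()
    for u, v in points:
        for theta_deg in range(n_theta):
            theta = math.radians(theta_deg)
            rho = u * math.cos(theta) + v * math.sin(theta)
            votes[(theta_deg, round(rho))] += 1
    return votes

# five edge points on the horizontal line v = 20
points = [(0, 20), (25, 20), (50, 20), (75, 20), (100, 20)]
(theta_deg, rho), count = hough_votes(points).most_common(1)[0]
print(theta_deg, rho, count)  # 90 20 5
```

Each edge point votes for every line that could pass through it, and the accumulator cell that collects the most votes identifies the dominant line.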

SIFT features

We load two images and compute a set of SIFT features for each

view1 = Image.Read("eiffel-1.png", mono=True)
view2 = Image.Read("eiffel-2.png", mono=True)
sf1 = view1.SIFT()
sf2 = view2.SIFT()

We can match features between images based purely on the similarity of the features, and display the correspondences found

matches = sf1.match(sf2)
print(matches)
813 matches
matches[1:5].table()
┌──┬────────┬──────────┬─────────────────┬────────────────┐
│# │ inlier │ strength │              p1 │             p2 │
├──┼────────┼──────────┼─────────────────┼────────────────┤
│0 │        │     26.4 │ (1118.6, 178.8) │ (952.5, 418.0) │
│1 │        │     28.2 │ (820.6, 519.1)  │ (708.1, 701.6) │
│2 │        │     29.6 │ (801.1, 632.4)  │ (694.1, 800.3) │
│3 │        │     32.4 │ (746.0, 153.1)  │ (644.5, 392.2) │
└──┴────────┴──────────┴─────────────────┴────────────────┘

where we have displayed the feature coordinates for four correspondences.

We can also display the correspondences graphically

matches.subset(100).plot("w")

in this case, a subset of 100/813 of the correspondences.

Feature matching

Clearly there are some bad matches here, but we can use RANSAC to estimate the fundamental matrix, and then use the epipolar constraint that it implies to classify correspondences as inliers or outliers

F, resid = matches.estimate(CentralCamera.points2F, method="ransac", confidence=0.99, seed=0)
print(F)
array([[1.033e-08, -3.799e-06, 0.002678],
       [3.668e-06, 1.217e-07, -0.004033],
       [-0.00319, 0.003436,        1]])
print(resid)
0.0405

import matplotlib.pyplot as plt

Image.Hstack((view1, view2)).disp()
matches.inliers.subset(100).plot("g", ax=plt.gca())  # inliers in green
matches.outliers.subset(100).plot("r", ax=plt.gca())  # outliers in red

where green lines show correct correspondences (inliers) and red lines show bad correspondences (outliers)

Feature matching after RANSAC
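
The inlier test rests on the epipolar constraint: for a correct correspondence, the homogeneous points satisfy p2ᵀ F p1 = 0. The sketch below evaluates that algebraic residual for a few hand-made point pairs and thresholds it. The matrix F, the point pairs and the threshold are all made up for illustration, and the order of p1 and p2 is an assumed convention; the toolbox may use a different residual, such as a distance to the epipolar line.

```python
def epipolar_residual(F, p1, p2):
    """Return p2^T F p1 for pixel coordinates p1 = (u1, v1), p2 = (u2, v2)."""
    h1 = (p1[0], p1[1], 1.0)   # homogeneous form of p1
    h2 = (p2[0], p2[1], 1.0)   # homogeneous form of p2
    Fp1 = [sum(F[r][c] * h1[c] for c in range(3)) for r in range(3)]
    return sum(h2[r] * Fp1[r] for r in range(3))

# Hypothetical F for two views related by a pure sideways translation:
# corresponding points must then lie on the same image row.
F = [[0.0, 0.0, 0.0],
     [0.0, 0.0, -1.0],
     [0.0, 1.0, 0.0]]

pairs = [((100, 50), (80, 50)),     # same row: consistent with F
         ((200, 120), (150, 121)),  # one row off
         ((30, 10), (25, 90))]      # far off: a bad match
threshold = 2.0  # assumed tolerance, in the units of the residual
inliers = [abs(epipolar_residual(F, p1, p2)) < threshold for p1, p2 in pairs]
print(inliers)  # [True, True, False]
```

RANSAC wraps a test like this in a loop: it repeatedly fits F to a small random subset of the matches and keeps the estimate that the largest number of matches agree with.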

History

This package can be considered as a Python version of the Machine Vision Toolbox for MATLAB. That Toolbox, now quite old, is a collection of MATLAB functions and classes that supported the first two editions of the Robotics, Vision & Control book. It is a somewhat eclectic collection reflecting my personal interest in the areas of photometry, photogrammetry and colorimetry. It includes over 100 functions spanning operations such as image file reading and writing, acquisition, display, filtering, blob, point and line feature extraction, mathematical morphology, homographies, visual Jacobians, camera calibration and color space conversion.

This Python version differs in using an object to encapsulate the pixel data and image metadata, rather than just a native object holding pixel data. The many functions become methods of the image object, which reduces namespace pollution and allows sequential operations to be expressed easily using "dot chaining".

The first version was created by Dorian Tsai during 2020, and was based on the MATLAB version. That work was funded by an Australian University Teacher of the Year award (2017) to Peter Corke.
