
Computer Science ETDs
Similarity Equivariance in Visual Computation
Abstract
Equivariance is a desirable property of computations that process constantly changing visual information. The commutative aspect of equivariance, i.e., that a computation commutes with a transformation, not only avoids unnecessary non-linearity but also allows the computation to preserve certain information when the visual input is transformed. Convolution is a linear operator that is equivariant to translation. Group convolution generalizes the concept to linear operations on functions of group elements, where the group elements represent more general geometric transformations and the operations commute with those transformations. Since the similarity transformation is the most general geometric transformation on images that preserves shape, the group convolution that is equivariant to similarity transformations is the most general shape-preserving linear operator. Because similarity transformations have four free parameters (two for translation, one for rotation and one for scale), group convolutions are defined on four-dimensional joint orientation-scale spaces. Although prior work on equivariant linear operators has been limited to discrete groups, the similarity group is continuous. In this dissertation, we propose a framework for continuous similarity group convolution in the joint orientation-scale space. This is achieved by using a basis of functions that is jointly shiftable, steerable and scalable. These pinwheel functions use a Fourier series in the orientation dimension and a Laplace transform in the log-scale dimension to form a basis of spatially localized functions that can be continuously interpolated in position, orientation and scale. The two-dimensional pinwheel is then used to form a basis in the four-dimensional joint orientation-scale space. The integral of the product of a function and this basis computes the frequency response of the function. The continuous group convolution is implemented in the frequency domain according to the convolution theorem.
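The two properties this paragraph relies on, that convolution commutes with a transformation and that it can be computed in the frequency domain, can be checked numerically in the simplest setting. The sketch below is only an illustration on the cyclic translation group Z_n, not the continuous similarity group or the pinwheel basis of the dissertation; the function names, the test signal and the kernel are all hypothetical.

```python
import cmath

def shift(f, s):
    """Cyclically translate a sampled signal by s positions."""
    n = len(f)
    return [f[(i - s) % n] for i in range(n)]

def circular_convolve(f, g):
    """Convolution on the cyclic group Z_n, computed directly."""
    n = len(f)
    return [sum(f[j] * g[(i - j) % n] for j in range(n)) for i in range(n)]

def dft(f):
    """Discrete Fourier transform (frequency response on Z_n)."""
    n = len(f)
    return [sum(f[j] * cmath.exp(-2j * cmath.pi * k * j / n) for j in range(n))
            for k in range(n)]

def idft(F):
    """Inverse discrete Fourier transform."""
    n = len(F)
    return [sum(F[k] * cmath.exp(2j * cmath.pi * k * j / n) for k in range(n)) / n
            for j in range(n)]

def convolve_via_frequency(f, g):
    """Convolution theorem: multiply the spectra pointwise, then invert."""
    product = [a * b for a, b in zip(dft(f), dft(g))]
    return [z.real for z in idft(product)]

def max_abs_diff(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

# Hypothetical test data, chosen only for illustration.
signal = [0.0, 1.0, 3.0, 2.0, 0.0, 0.0, 1.0, 0.0]
kernel = [0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0, 0.25]

# Equivariance: transforming the input and then convolving gives the same
# result as convolving and then transforming the output.
equivariance_error = max_abs_diff(
    circular_convolve(shift(signal, 3), kernel),
    shift(circular_convolve(signal, kernel), 3),
)

# Convolution theorem: the direct and frequency-domain results agree.
theorem_error = max_abs_diff(
    circular_convolve(signal, kernel),
    convolve_via_frequency(signal, kernel),
)
```

Both errors should be zero up to floating-point rounding. The same two checks are what one would expect to hold, under suitable definitions, when the translation group is replaced by a larger transformation group.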
The similarity group convolution between a 2D image and a set of filters produces similarity-equivariant representations in the joint orientation-scale space. An invariant representation can be derived from an equivariant representation by discarding information such as orientation and scale. We demonstrate a method that uses these representations to perform similarity-invariant recognition of 2D shapes. This method requires an order of magnitude less training data than comparable methods based on convolutional neural networks. Finally, we demonstrate the utility of similarity group convolution by using it to compute a shape-equivariant distribution of closed contours traced by particles undergoing Brownian motion in velocity. The contours are constrained by sets of points and line endings representing well-known bistable patterns that induce illusory contours. Prior computational models of the bistability of these patterns demonstrated that the alternative percepts correspond to local optima in particle speed (scale). We replicate these results using a recurrent neural network that implements similarity-equivariant convolution in a finite pinwheel basis.
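To make the step from an equivariant to an invariant representation concrete, the following sketch uses a single periodic orientation axis. It assumes a hypothetical response sampled at n orientations and treats rotation as a cyclic shift. Under that assumption, rotating the input multiplies each Fourier series coefficient by a known phase (equivariance), and discarding the phase by taking magnitudes yields an invariant. It is not the recognition method of the dissertation, only an illustration of the principle.

```python
import cmath

def fourier_coefficients(samples):
    """Fourier series coefficients of a periodic, uniformly sampled signal."""
    n = len(samples)
    return [sum(samples[j] * cmath.exp(-2j * cmath.pi * k * j / n)
                for j in range(n)) / n
            for k in range(n)]

def rotate(samples, steps):
    """Rotation acts on the orientation axis as a cyclic shift."""
    n = len(samples)
    return [samples[(i - steps) % n] for i in range(n)]

# Hypothetical orientation response, chosen only for illustration.
response = [0.1, 0.9, 0.4, 0.0, 0.2, 0.7, 0.3, 0.0]
steps = 3
n = len(response)

coeffs = fourier_coefficients(response)
coeffs_rotated = fourier_coefficients(rotate(response, steps))

# Equivariance: coefficient k of the rotated signal equals coefficient k of
# the original multiplied by the phase exp(-2*pi*i*k*steps/n).
predicted = [c * cmath.exp(-2j * cmath.pi * k * steps / n)
             for k, c in enumerate(coeffs)]
phase_error = max(abs(a - b) for a, b in zip(coeffs_rotated, predicted))

# Invariance: dropping the phase (keeping only magnitudes) removes the
# dependence on orientation.
invariant = [abs(c) for c in coeffs]
invariant_rotated = [abs(c) for c in coeffs_rotated]
invariance_error = max(abs(a - b)
                       for a, b in zip(invariant, invariant_rotated))
```

The equivariant coefficients still record how far the input was rotated, whereas the magnitudes do not. This is the sense in which the invariant representation discards information.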