Vision Starter Kit™
User's Manual
Version 3.00
© Copyright 1999, 2002
Cogimatic
All Rights Reserved
Cogimatic
1436 Nighthawk Drive
Santa Rosa, CA 95409
Phone: (707)539-7977
FAX: (707)539-7977
Email: support@cogimatic.com
WWW: http://www.cogimatic.com
Cogimatic License Agreement
This is a legal agreement between you,
the end user, and Cogimatic. By using
this software you are agreeing to be bound by the terms of this Agreement. If you do not agree to the terms of this
Agreement, promptly return the unused software and the accompanying items to
the place you obtained them for a full refund.
License Grant
Cogimatic grants you a non-exclusive
license to make as many copies of the enclosed software (the
"SOFTWARE") as you need for back-up purposes. You may use this software on more than one
computer, provided there is no chance it will be used simultaneously on more
than one computer. You may not copy the
enclosed documentation for any purposes.
Redistributable Components
You have a royalty-free right to
reproduce and distribute the redistributable code files
"VISIONSK.DLL" and "VisionStarterKit.dll", provided that
you (a) distribute the files only in conjunction with and as a part of your
software application product; (b) agree to indemnify and hold Cogimatic
harmless from and against any claims or liabilities arising out of the use,
reproduction, or distribution of your software application product.
All Cogimatic libraries, source code,
redistributables, and other files remain Cogimatic's exclusive property.
LIMITED WARRANTY
NO WARRANTIES.
TO THE MAXIMUM EXTENT PERMITTED BY
APPLICABLE LAW, COGIMATIC EXPRESSLY DISCLAIMS ANY WARRANTY FOR THE
SOFTWARE. THE SOFTWARE AND ANY RELATED
DOCUMENTATION IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND EITHER
EXPRESS OR IMPLIED, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK ARISING OUT OF USE OR PERFORMANCE OF THE SOFTWARE
REMAINS WITH YOU.
NO LIABILITY FOR CONSEQUENTIAL DAMAGES.
TO THE MAXIMUM EXTENT PERMITTED BY
APPLICABLE LAW, IN NO EVENT SHALL COGIMATIC OR ITS SUPPLIERS BE LIABLE FOR ANY
DAMAGES WHATSOEVER (INCLUDING, WITHOUT LIMITATION, DAMAGES FOR LOSS OF BUSINESS
PROFIT, BUSINESS INTERRUPTION, LOSS OF BUSINESS INFORMATION, OR ANY OTHER
PECUNIARY LOSS) ARISING OUT OF THE USE OR INABILITY TO USE THIS COGIMATIC
PRODUCT, EVEN IF COGIMATIC HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
Some states/jurisdictions do not allow
the exclusion or limitation of liability for consequential or incidental
damages, so the above limitation may not apply to you.
This agreement shall be governed by the
laws of the State of California and shall inure to the benefit of Cogimatic and
any successors, administrators, heirs and assigns. Any action or proceeding brought by either party against the
other arising out of or related to this agreement shall be brought only in a
STATE or FEDERAL COURT of competent jurisdiction located in San Francisco
County, California. The parties hereby
consent to in personam jurisdiction of said courts.
Information in this document is subject
to change without notice and does not represent a commitment on the part of
Cogimatic.
Borland, Borland C++, C++ Builder, OWL
and Delphi are trademarks of Borland International Corporation.
Microsoft, Microsoft Windows, Visual
Basic and MS DOS are trademarks of Microsoft Corporation.
Other trademarks referenced in this
publication are owned by their respective manufacturers.
TABLE OF CONTENTS
1. Getting Started
  Minimum System Requirements
  Installation
  Quick Results
  Using the Demo Program
  What’s new with this release?
2. Introduction
  Computer Vision
  Image Processing with the Vision Starter Kit™
  Preview of the functions offered
  And more...
  Creating Applications using the DLL
  What else may be needed
  A reminder to make sure the correct copy of visionsk.dll is present
3. General notes on the DLL and class objects
  What is a DLL
  About the new .NET DLL, ‘VisionStarterKit.dll’
  About the main DLL, ‘visionsk.dll’
  Creating And Using Images
  More Details on Image Memory
4. Delphi class usage
  Creation, Destruction and Saving of Delphi Image Objects
  Displaying Delphi Image Objects
  Manipulating Delphi Image Objects
  Delphi Image Class Details
    Gr8Image: 8 Bit Grayscale Images
    Gr32Image: 32 Bit Grayscale Images
    GrFImage: 32 Bit Float (Single) Grayscale Images
    RGBImage: 24 Bit Color Images
5. C++ Image classes
  Creation, Destruction and Saving of C++ Image Objects
  Displaying C++ Image Objects
  Manipulating C++ Image Objects
  C++ Image Class Details
    Gr8Image: 8-Bit Grayscale Images
    Gr32Image: 32-Bit Grayscale Images
    GrFImage: 32-Bit Float Grayscale Images
    RGBImage: 24-Bit Color Images
6. Visual Basic Image Classes
  Setting up the VB Environment
  Creation, Destruction and Saving of VB Image Objects
  Displaying VB Image Objects
  Manipulating VB Image Objects
  VB Image Class Details
    cGray8BitImage: 8-Bit Grayscale Images
    cGray32BitImage: 32-Bit Grayscale Images
    cGrayFloatImage: 32-Bit Float Grayscale Images
    cFFTImage: Float FFT Images (written for VB)
    RGB24BitImage: 24-Bit Color Images
7. .NET Image Classes
  Setting up the .NET Environment
  Creation, Destruction and Saving of .NET Image Objects
  Manipulating and Displaying .NET Image Objects
  .NET Image Class Details
    VSKImage: 8-Bit Grayscale Images
    Gray8BitImage: 8-Bit Grayscale Images
    Gray32BitImage: 32-Bit Grayscale Images
    GrayFloatImage: 32-Bit Float Grayscale Images
    RGBImage: 24-Bit Color Images
8. Detailed DLL Function Descriptions
  Processors, Languages and Calling Conventions
  8 Bit, 16 Bit and Floating Point Image Destruction
  24 Bit RGB Image Destruction
  8 Bit Image Functions
  32 Bit Image Functions
  Float Image Functions
  RGB Image Functions
  Matrix Functions
A. Book References
Index
Chapter 1. Getting Started
Minimum System Requirements
An IBM PC/AT or compatible computer running Windows 98, Windows NT 4.0 or higher. There should be at least 2 MB of disk space available, and at least 16 MB of RAM is recommended.
Installation
The Vision Starter Kit™ (VSK) consists primarily of a 32-bit DLL file which is called by your program. To use the DLL, simply copy it to the directory where your EXE file is located, or to the Windows directory.
To develop programs using the VSK, first create a directory and copy the self-extracting file (named “unpack.exe”) to it, then execute the file. The VSK interface code and example programs for MSVC 5.0, Delphi and VB 5.0 will appear in separate subdirectories, ready to be copied into your own development directories.
Quick Results
For fast results, copy the "quick" project for your chosen language to your development area, then load and run it from your development platform. This program loads an image and displays it, and contains all the ingredients necessary to develop more advanced applications using the VSK DLL.
Using the Demo Program
The source code for the Demo program is also provided as an example of a more involved program. The Demo exercises many functions in the DLL, but often passes fixed values as parameters in the calls. These fixed values can be replaced by variables for rapid experimentation; for example, the convolution filter masks could be altered.
What’s new with this release?
This release contains a few minor bug fixes, a new target ‘platform’ called ‘.NET’, and a few new DLL functions, which have also been added to some of the wrapper classes.
The .NET classes are described in their own chapter in this
manual. Briefly, this new programming
environment offers easier integration of code written in different languages.
In addition, there is a vast class library to speed up object-oriented
development. The VSK image classes are supplied for this environment as a .NET DLL, so
whichever .NET language you prefer, it is likely that the VSK classes can be
imported and used immediately.
This DLL, named ‘VisionStarterKit.dll’, in turn calls the original
‘visionsk.dll’. When you deploy an application using these classes, you merely
copy both DLLs along with your EXE file into the same directory on the target
computer. No interaction with the Windows registry is necessary.
As for the new functions, there are two subject areas. The first concerns Windows graphics resources and display settings. There is now the ability to create an image based on a handle to a bitmap, or based on a handle to a display context. These functions are meant for non-palettized displays. That is, if your display settings use 256 colors or fewer, then you are using a palette, and the functions won’t succeed. They are meant for graphics environments where the selected bitmap has pixels whose colors are fully specified. For example, 16-bit displays use RGB pixel values, where each pixel holds a 5- or 6-bit value for each color (red, green and blue).
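To make the 16-bit pixel layout concrete, the following C++ sketch unpacks one pixel into 8-bit red, green and blue values. It assumes the common 5-6-5 layout (red in the top five bits, green in the middle six, blue in the low five); some displays use 5-5-5 instead. The names Rgb8 and unpack565 are illustrative only and are not part of the VSK DLL.

```cpp
#include <cstdint>

// Illustrative type, not a VSK structure.
struct Rgb8 {
    std::uint8_t r, g, b;
};

// Unpack a 16-bit pixel laid out as RRRRRGGG GGGBBBBB (5-6-5, an assumption).
// Each field is widened to 0..255 by copying its high bits into the low bits,
// so that the maximum field value maps to exactly 255.
inline Rgb8 unpack565(std::uint16_t pixel) {
    const unsigned r5 = (pixel >> 11) & 0x1F;
    const unsigned g6 = (pixel >> 5) & 0x3F;
    const unsigned b5 = pixel & 0x1F;
    Rgb8 out;
    out.r = static_cast<std::uint8_t>((r5 << 3) | (r5 >> 2));
    out.g = static_cast<std::uint8_t>((g6 << 2) | (g6 >> 4));
    out.b = static_cast<std::uint8_t>((b5 << 3) | (b5 >> 2));
    return out;
}
```

A palettized (256-color) pixel cannot be decoded this way, because its value is an index into a color table rather than a packed color.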
The second area involves finding
bright points inside regions and returning them in an array, or finding bright
blobs inside regions and returning these in an array. These functions can be
useful for template matching. The result of a template match operation is a new
image with varying tones of gray. The brighter pixels represent a good match
between the original image and the template. The blob functions offer a
convenient way to find bright clusters, one cluster per ‘find’. By combining
the two operations, you can get a list of locations where the template was
found in the original image.
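The idea of turning a match image into a list of locations can be sketched in C++ as follows. This is a minimal illustration, assuming the match result is a row-major 8-bit image; it reports pixels that reach a threshold and are not exceeded by any neighbour. The names Point and findBrightPoints are hypothetical, and the VSK's own functions may work differently.

```cpp
#include <cstdint>
#include <vector>

// Illustrative types and names; they are not the VSK DLL's interface.
struct Point {
    int x, y;
};

// Scan a row-major 8-bit match-result image and return every pixel that is
// at least `threshold` and not exceeded by any of its 8 neighbours.
// Note: a flat plateau of equal values reports every pixel on the plateau.
std::vector<Point> findBrightPoints(const std::vector<std::uint8_t>& img,
                                    int width, int height,
                                    std::uint8_t threshold) {
    std::vector<Point> hits;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const std::uint8_t v = img[y * width + x];
            if (v < threshold) continue;
            bool isPeak = true;
            for (int dy = -1; dy <= 1 && isPeak; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    const int nx = x + dx, ny = y + dy;
                    if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
                    if (img[ny * width + nx] > v) { isPeak = false; break; }
                }
            }
            if (isPeak) hits.push_back({x, y});
        }
    }
    return hits;
}
```

Each returned point is a candidate location of the template. A blob-based search would instead group neighbouring bright pixels and report one location per cluster.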
Chapter 2. Introduction
Computer Vision
Although it is assumed that the reader is already familiar with image processing, a brief introduction seems in order. Computer vision is a field which uses computers to extract and analyze information from images in order to identify (or measure) objects of interest. Applications abound, including text and handwriting recognition, astronomy, artificial intelligence and industrial quality control.
The image is often processed in stages. For example, consider an application that reads dates on coins passed under a camera. The software accepts an image of the coin and must determine the date of the coin. The first stage might be a filter to remove noise and correct for uneven illumination. Next, the image might be scanned to find the location of the coin by searching for a dark circular shape. This region might then be resized to some convenient dimensions, and then perhaps some spatial moments (or an FFT) would be calculated. These moments might be compared to a database for a match to indicate the orientation of the coin and ensure that it is facing heads up. With the orientation established, the date region could be extracted. This new sub-image might then be converted to a line drawing and fed to a text recognition function, which might be a highly customized algorithm.
Due to the large amount of
information contained in images, past approaches to vision systems often used
dedicated hardware for image processing.
But desktop computers have become powerful enough to perform many computer
vision tasks at reasonable speed.
Image Processing with the Vision Starter Kit™
The kit provides many of the fundamental image processing functions commonly used in vision systems. It is intended for anyone who programs computers: students, scientists, engineers and hobbyists. The emphasis is on utility for vision applications. The functions provided are fully described in the literature of the field. The kit was designed to allow the developer to focus on higher-level design issues, not on tedious coding of standard algorithms. The good news is that most of the standard algorithms, and even some exotic ones, are in the kit, and all for a low price.
Preview of the functions offered
Convolution: the user can specify masks of up to 127 by 127 for both integer and floating point convolutions. There is a very fast 3 by 3 integer convolution, median filtering and a fast mean filter.
Point Operations: the usual point (brightness) operations, such as thresholding, contrast, negatives, clipping, logarithms, bitwise and generic.
Cutting and Pasting: rectangular chunks can be lifted from images and pasted to other images at any point. Images can also be cut and pasted to/from the Windows Clipboard.
Geometrical: translation with bilinear interpolation, rotation by any floating point value, or by 90 degrees, fast zoom, flipping and polynomial warping.
Morphological: erode, dilate, open, close and Laplacian of Gaussian edge detection operator.
Profiles and Histograms: line profiles of any length and direction; row, column and overall image histograms; horizontal and vertical projections.
Prepackaged Shapes: generate images as ramps, mound, Gaussian, cone, pyramid, rectangle, circle, etc.
Image Pairwise Combinations: linear, multiply pixels, divide, add, difference, use max/min, and, or, etc.
Transforms: Fourier (FFT and IFFT) and Hough.
Template Matching.
Blob Detection: typically used with Template Matching; bright blobs are detected and their coordinates returned.
Hue Changing.
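As an illustration of the convolution entry in the list above, here is a minimal C++ sketch of a 3 by 3 integer convolution on an 8-bit grayscale image. It is not the kit's implementation: the function name convolve3x3, the row-major vector storage and the choice to copy border pixels unchanged are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// 3x3 integer convolution on a row-major 8-bit image (illustrative, not the VSK routine).
// Each interior output pixel is sum(mask * neighbourhood) / divisor, clamped to 0..255.
// The mask is applied without flipping (correlation form); for symmetric masks
// this gives the same result as a true convolution.
// Border pixels are copied unchanged, which is one of several possible edge policies.
std::vector<std::uint8_t> convolve3x3(const std::vector<std::uint8_t>& src,
                                      int width, int height,
                                      const int mask[9], int divisor) {
    std::vector<std::uint8_t> dst(src);
    if (divisor == 0) return dst;  // nothing sensible to compute
    for (int y = 1; y < height - 1; ++y) {
        for (int x = 1; x < width - 1; ++x) {
            int sum = 0;
            for (int ky = -1; ky <= 1; ++ky) {
                for (int kx = -1; kx <= 1; ++kx) {
                    sum += mask[(ky + 1) * 3 + (kx + 1)] *
                           src[(y + ky) * width + (x + kx)];
                }
            }
            const int v = sum / divisor;
            dst[y * width + x] =
                static_cast<std::uint8_t>(std::min(255, std::max(0, v)));
        }
    }
    return dst;
}
```

A mask of nine ones with a divisor of 9 gives a mean (smoothing) filter, while a mask such as 0, -1, 0, -1, 4, -1, 0, -1, 0 with a divisor of 1 responds only where brightness changes.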
And more...
There are also floating point and integer-based images with limited operations. The integer size allows for averaging images together (for noise removal), and the floating point allows for adding and multiplying images and for accurate FFT work.
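The reason a wider integer type helps with averaging can be shown in a short C++ sketch: summing several 8-bit frames quickly exceeds 255, so the running total is kept in 32-bit integers and divided only at the end. The function name averageFrames and the use of std::vector are assumptions for illustration, not the kit's interface.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Average several equally sized 8-bit frames, summing into 32-bit integers
// so that the running total cannot overflow the 8-bit range.
// Returns an empty image if there are no frames or the sizes disagree.
std::vector<std::uint8_t> averageFrames(
        const std::vector<std::vector<std::uint8_t>>& frames) {
    if (frames.empty()) return {};
    const std::size_t n = frames[0].size();
    std::vector<std::uint32_t> sum(n, 0);
    for (const auto& frame : frames) {
        if (frame.size() != n) return {};
        for (std::size_t i = 0; i < n; ++i) sum[i] += frame[i];
    }
    const std::uint32_t count = static_cast<std::uint32_t>(frames.size());
    std::vector<std::uint8_t> mean(n);
    for (std::size_t i = 0; i < n; ++i) {
        // Adding count / 2 before dividing rounds to the nearest integer.
        mean[i] = static_cast<std::uint8_t>((sum[i] + count / 2) / count);
    }
    return mean;
}
```

Averaging N frames of a static scene reduces uncorrelated noise by roughly a factor of the square root of N.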
There are some useful matrix
operations, such as multiplication, finding the inverse, finding the
pseudo-inverse and solving matrix equations.
Finally there is a vector form of the FFT (for one dimensional arrays, such as sound
clips).
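For readers unfamiliar with the one-dimensional transform, the following C++ sketch computes a discrete Fourier transform of a real array directly from its definition. It is deliberately the slow O(n²) form rather than an FFT, which produces the same values in O(n log n); the function name dft is hypothetical and unrelated to the kit's routine.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Naive O(n^2) discrete Fourier transform of a real 1-D signal.
// An FFT computes the same values faster; this version favours clarity.
std::vector<std::complex<double>> dft(const std::vector<double>& signal) {
    const std::size_t n = signal.size();
    const double pi = std::acos(-1.0);
    std::vector<std::complex<double>> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        std::complex<double> acc(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            const double angle = -2.0 * pi * static_cast<double>(k * t) /
                                 static_cast<double>(n);
            acc += signal[t] * std::complex<double>(std::cos(angle), std::sin(angle));
        }
        out[k] = acc;
    }
    return out;
}
```

For a sound clip, the magnitude of each output element indicates how strongly the corresponding frequency is present in the samples.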
Creating Applications using the DLL
Purchase of the Vision Starter Kit entitles the purchaser to create and distribute applications which use the DLL. The DLL itself can be distributed with the application as long as its name is not changed (VISIONSK.DLL). However, information about the interface and source code included in the Vision Starter Kit must not be distributed.
Programs using .NET will also need VisionStarterKit.dll along with VISIONSK.DLL. The VisionStarterKit.dll file is a C# “wrapper DLL” which converts the functionality of the original DLL into a suite of .NET classes, and is meant to work with VB.NET and C# applications. Other .NET languages should also be able to use the DLL. As long as both DLLs and your application file (an EXE) all occupy the same directory, the application should be able to load and call the DLLs.
Currently the supported file format is the Windows BMP