Vision Starter Kit™

 

User's Manual

 

 

 

 

 

Version 3.00

 

 

 

 

© Copyright 1999, 2002 Cogimatic

All Rights Reserved

 

 

 

 

 

 

 

Cogimatic

1436 Nighthawk Drive

Santa Rosa, CA 95409

Phone: (707) 539-7977

FAX: (707) 539-7977

Email: support@cogimatic.com

WWW: http://www.cogimatic.com

 

 


Cogimatic License Agreement

 

This is a legal agreement between you, the end user, and Cogimatic.  By using this software you are agreeing to be bound by the terms of this Agreement.  If you do not agree to the terms of this Agreement, promptly return the unused software and the accompanying items to the place you obtained them for a full refund.

 

License Grant

Cogimatic grants you a non-exclusive license to make as many copies of the enclosed software (the "SOFTWARE") as you need for back-up purposes.  You may use this software on more than one computer, provided there is no chance it will be used simultaneously on more than one computer.  You may not copy the enclosed documentation for any purposes.

 

Redistributable Components

You have a royalty-free right to reproduce and distribute the redistributable code files "VISIONSK.DLL" and "VisionStarterKit.dll", provided that you (a) distribute the files only in conjunction with and as a part of your software application product; (b) agree to indemnify and hold Cogimatic harmless from and against any claims or liabilities arising out of the use, reproduction, or distribution of your software application product.

All Cogimatic libraries, source code, redistributables, and other files remain Cogimatic's exclusive property.

 

LIMITED WARRANTY

NO WARRANTIES.

TO THE MAXIMUM EXTENT PERMITTED BY APPLICABLE LAW, COGIMATIC EXPRESSLY DISCLAIMS ANY WARRANTY FOR THE SOFTWARE.  THE SOFTWARE AND ANY RELATED DOCUMENTATION IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND EITHER EXPRESS OR IMPLIED, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.  THE ENTIRE RISK ARISING OUT OF USE OR PERFORMANCE OF THE SOFTWARE REMAINS WITH YOU.

 

NO LIABILITY FOR CONSEQUENTIAL DAMAGES.

TO THE MAXIMUM EXTENT PERMITTED BY APPLICABLE LAW, IN NO EVENT SHALL COGIMATIC OR ITS SUPPLIERS BE LIABLE FOR ANY DAMAGES WHATSOEVER (INCLUDING, WITHOUT LIMITATION, DAMAGES FOR LOSS OF BUSINESS PROFIT, BUSINESS INTERRUPTION, LOSS OF BUSINESS INFORMATION, OR ANY OTHER PECUNIARY LOSS) ARISING OUT OF THE USE OR INABILITY TO USE THIS COGIMATIC PRODUCT, EVEN IF COGIMATIC HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

 

Some states/jurisdictions do not allow the exclusion or limitation of liability for consequential or incidental damages, so the above limitation may not apply to you.

 

This agreement shall be governed by the laws of the State of California and shall inure to the benefit of Cogimatic and any successors, administrators, heirs and assigns.  Any action or proceeding brought by either party against the other arising out of or related to this agreement shall be brought only in a STATE or FEDERAL COURT of competent jurisdiction located in San Francisco County, California.  The parties hereby consent to in personam jurisdiction of said courts.


 

Information in this document is subject to change without notice and does not represent a commitment on the part of Cogimatic.

 

Borland, Borland C++, C++ Builder, OWL and Delphi are trademarks of Borland International Corporation.

 

Microsoft, Microsoft Windows, Visual Basic and MS DOS are trademarks of Microsoft Corporation.

 

Other trademarks referenced in this publication are owned by their respective manufacturers.


 


TABLE OF CONTENTS

 

1. Getting Started

Minimum System Requirements

Installation

Quick Results

Using the Demo Program

What’s new with this release?

2. Introduction

Computer Vision

Image Processing with the Vision Starter Kit™

Preview of the functions offered

And more

Creating Applications using the DLL

What else may be needed

A reminder to make sure the correct copy of visionsk.dll is present

3. General notes on the DLL and class objects

What is a DLL

About the new .NET DLL, ‘VisionStarterKit.dll’

About the main DLL, ‘visionsk.dll’

Creating And Using Images

More Details on Image Memory

4. Delphi class usage

Creation, Destruction and Saving of Delphi Image Objects

Displaying Delphi Image Objects

Manipulating Delphi Image Objects

Delphi Image Class Details

Gr8Image: 8 Bit Grayscale Images

Gr32Image: 32 Bit Grayscale Images

GrFImage: 32 Bit Float (Single) Grayscale Images

RGBImage: 24 Bit Color Images

5. C++ Image classes

Creation, Destruction and Saving of C++ Image Objects

Displaying C++ Image Objects

Manipulating C++ Image Objects

C++ Image Class Details

Gr8Image: 8-Bit Grayscale Images

Gr32Image: 32-Bit Grayscale Images

GrFImage: 32-Bit Float Grayscale Images

RGBImage: 24-Bit Color Images

6. Visual Basic Image Classes

Setting up the VB Environment

Creation, Destruction and Saving of VB Image Objects

Displaying VB Image Objects

Manipulating VB Image Objects

VB Image Class Details

cGray8BitImage: 8-Bit Grayscale Images

cGray32BitImage: 32-Bit Grayscale Images

cGrayFloatImage: 32-Bit Float Grayscale Images

cFFTImage: Float FFT Images (written for VB)

RGB24BitImage: 24-Bit Color Images

7. .NET Image Classes

Setting up the .NET Environment

Creation, Destruction and Saving of .NET Image Objects

Manipulating and Displaying .NET Image Objects

.NET Image Class Details

VSKImage: 8-Bit Grayscale Images

Gray8BitImage: 8-Bit Grayscale Images

Gray32BitImage: 32-Bit Grayscale Images

GrayFloatImage: 32-Bit Float Grayscale Images

RGBImage: 24-Bit Color Images

8. Detailed DLL Function Descriptions

Processors, Languages and Calling Conventions

8 Bit, 16 Bit and Floating Point Image Destruction

24 Bit RGB Image Destruction

8 Bit Image Functions

32 Bit Image Functions

Float Image Functions

RGB Image Functions

Matrix Functions

A. Book References

Index

 


Chapter 1

 

1. Getting Started

 

Minimum System Requirements

 

            IBM PC/AT or compatible running Windows 98, Windows NT 4.0 or higher.  At least 2 MB of disk space should be available, and at least 16 MB of RAM is recommended.

 

 

Installation

 

            The Vision Starter Kit™ (or VSK) consists primarily of a 32-bit DLL file which is called by your program.  To use the DLL, simply copy it to the directory where your EXE file is located, or to the Windows directory.

 

            To develop programs using the VSK, first create a directory and copy the self-extracting file (named “unpack.exe”) to this directory.  Then execute the file.  The VSK interface code and example programs for MSVC 5.0, Delphi and VB 5.0 will appear in separate subdirectories, ready to be copied into your own development directories.

 

Quick Results

 

            For fast results, copy the "quick" project for your language to your development area, then load and run it from your development platform.  This program loads an image and displays it, and contains all the ingredients necessary to develop more advanced applications using the VSK DLL.

 

Using the Demo Program

 

            The source code for the Demo program is also provided as an example of a more involved program.  The Demo exercises many functions in the DLL, but often passes fixed values as parameters in the calls.  These fixed values can be replaced by variables for rapid experimentation.  For example, the convolution filter masks could be altered.

 

What’s new with this release?

 

            There are a few minor bug fixes in this release, a new target ‘platform’, called ‘.NET’, and a few new functions added to the DLL, which have also been added to some of the wrapper classes.

 

The .NET classes are described in their own chapter in this manual.  Briefly, this new programming environment offers easier integration of code written in different languages. In addition, there is a vast class library to speed up object-oriented development. The VSK image classes are provided as a .NET DLL, so whichever .NET language you prefer, it is likely that the classes can be imported and used immediately. This DLL, named ‘VisionStarterKit.dll’, in turn calls the original ‘visionsk.dll’. When you deploy an application using these classes, you merely copy both DLLs along with your EXE file into the same directory on the target computer. No interaction with the Windows registry is necessary.

 

As for the new functions, there are two subject areas. The first concerns Windows graphics resources and display settings. There is now the ability to create an image based on a handle to a bitmap, or based on a handle to a display context. These functions are meant for non-palettized displays. That is, if your display settings use 256 colors or fewer, then you are using a palette, and the functions won’t succeed. They are meant for graphics environments where the selected bitmap has pixels whose colors are fully specified. For example, 16-bit displays use RGB pixel values, where each pixel holds a 5- or 6-bit value for each color (red, green and blue).
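To make the 16-bit pixel description concrete, here is a minimal C++ sketch that expands one such pixel to 8 bits per color. It assumes the common 5-6-5 layout (5 bits red, 6 bits green, 5 bits blue); the names `Rgb8` and `unpack565` are hypothetical and are not part of the VSK interface.

```cpp
#include <cstdint>

// Hypothetical helper type for illustration only (not a VSK type).
struct Rgb8 {
    std::uint8_t r, g, b;
};

// Expand one 16-bit pixel, assumed to be laid out as RRRRRGGGGGGBBBBB,
// to 8 bits per channel. The top bits of each field are copied into the
// low bits so that a full-scale field (31 or 63) maps to 255.
inline Rgb8 unpack565(std::uint16_t pixel) {
    const unsigned r5 = (pixel >> 11) & 0x1Fu;  // 5 bits of red
    const unsigned g6 = (pixel >> 5) & 0x3Fu;   // 6 bits of green
    const unsigned b5 = pixel & 0x1Fu;          // 5 bits of blue
    return Rgb8{
        static_cast<std::uint8_t>((r5 << 3) | (r5 >> 2)),
        static_cast<std::uint8_t>((g6 << 2) | (g6 >> 4)),
        static_cast<std::uint8_t>((b5 << 3) | (b5 >> 2)),
    };
}
```

Some 16-bit modes use a 5-5-5 layout instead, in which case the shifts and masks would differ.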

 

            The second area involves finding bright points inside regions and returning them in an array, or finding bright blobs inside regions and returning these in an array. These functions can be useful for template matching. The result of a template match operation is a new image with varying tones of gray. The brighter pixels represent a good match between the original image and the template. The blob functions offer a convenient way to find bright clusters, one cluster per ‘find’. By combining the two operations, you can get a list of locations where the template was found in the original image.
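The idea of collecting bright points inside a region can be sketched in a few lines of C++. This is an illustration of the general technique only, not the DLL's implementation; the function `findBrightPoints`, the `Point` type, and the row-major 8-bit buffer layout are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical coordinate type for illustration only.
struct Point {
    int x, y;
};

// Scan the rectangle [left, right) x [top, bottom) of a row-major 8-bit
// image and return the coordinates of every pixel at or above `threshold`.
// The rectangle is clipped to the image bounds before scanning.
std::vector<Point> findBrightPoints(const std::vector<std::uint8_t>& image,
                                    int width, int height,
                                    int left, int top, int right, int bottom,
                                    std::uint8_t threshold) {
    assert(width >= 0 && height >= 0);
    assert(image.size() ==
           static_cast<std::size_t>(width) * static_cast<std::size_t>(height));
    left = std::max(left, 0);
    top = std::max(top, 0);
    right = std::min(right, width);
    bottom = std::min(bottom, height);

    std::vector<Point> hits;
    for (int y = top; y < bottom; ++y) {
        for (int x = left; x < right; ++x) {
            const std::size_t idx =
                static_cast<std::size_t>(y) * static_cast<std::size_t>(width) +
                static_cast<std::size_t>(x);
            if (image[idx] >= threshold) {
                hits.push_back(Point{x, y});
            }
        }
    }
    return hits;
}
```

Applied to the output of a template match, a scan like this yields candidate match locations; grouping neighbouring hits into one location per cluster is the job the blob functions are described as doing.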


Chapter 2

 

2. Introduction

 

Computer Vision

 

            Although it is assumed that the reader is already familiar with image processing, a brief introduction seems in order.  Computer vision is a field which uses computers to extract and analyze information from images in order to identify (or measure) objects of interest.  Applications abound, including text and handwriting recognition, astronomy, artificial intelligence and industrial quality control.

            The image is often processed in stages.  For example, consider an application that reads dates on coins passed under a camera.  The software accepts an image of the coin and must determine the date of the coin.  The first stage might be a filter to remove noise and correct for uneven illumination.  Next the image might be scanned to find the location of the coin, by searching for a dark circular shape.  This region might then be resized to some convenient dimensions, and then perhaps some spatial moments (or an FFT) would be calculated.  These moments might be compared to a database for a match to indicate the orientation of the coin and ensure that it was facing heads up.  With the orientation established, the date region could be extracted.  This new sub-image might then be converted to a line drawing and fed to a text recognition function, which might be a highly customized algorithm.

            Due to the large amount of information contained in images, past approaches to vision systems often used dedicated hardware for image processing.  But desktop computers have become powerful enough to perform many computer vision tasks at reasonable speed.

 

Image Processing with the Vision Starter Kit™

 

            The kit provides many of the fundamental image processing functions commonly used in vision systems.  It is intended for anyone who programs computers: students, scientists, engineers and hobbyists.  The emphasis is on utility for vision applications.  The functions provided are fully described in the literature of the field.  The kit was designed to allow the developer to focus on higher level design issues, not on tedious coding of standard algorithms.  The good news is that most of the standard algorithms, and even some exotic ones, are in the kit, and all for a low price.

 

Preview of the functions offered

 

      Convolution: the user can specify masks of up to 127 by 127 for both integer and floating point convolutions.  There is a very fast 3 by 3 integer convolution, median filtering and a fast mean filter.

      Point Operations: the usual point (brightness) operations, such as thresholding, contrast, negatives, clipping, logarithms, bitwise and generic.

      Cutting and Pasting: rectangular chunks can be lifted from images and pasted to other images at any point.  Images can also be cut and pasted to/from the Windows Clipboard.

      Geometrical: translation with bilinear interpolation, rotation by any floating point value, or by 90 degrees, fast zoom, flipping and polynomial warping.

      Morphological: erode, dilate, open, close and Laplacian of Gaussian edge detection operator.

      Profiles and Histograms: line profiles of any length and direction; row, column and overall image histograms; horizontal, and vertical projections.

      Prepackaged Shapes: generate images as ramps, mound, Gaussian, cone, pyramid, rectangle, circle, etc.

      Image Pairwise Combinations: linear, multiply pixels, divide, add, difference, max/min, AND, OR, etc.

      Transforms: Fourier (FFT and IFFT) and Hough.

      Template Matching.

      Blob Detection: bright blobs are detected and their coordinates returned.  Typically used with Template Matching.

      Hue Changing.
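As an illustration of the first item in the list above, here is a minimal C++ sketch of applying a 3 by 3 integer mask to an 8-bit grayscale image. It is not the DLL's code: the function name `applyMask3x3`, the row-major buffer layout, the `divisor` parameter, and the choice to leave border pixels unchanged are all assumptions made for this example. The mask is applied without flipping, which is strictly a correlation and gives the same result as convolution for symmetric masks.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Apply a 3x3 integer mask to a row-major 8-bit image. Each interior output
// pixel is the weighted sum of its 3x3 neighbourhood divided by `divisor`
// and clamped to 0..255. Border pixels are copied from the input unchanged
// (an assumption of this sketch).
std::vector<std::uint8_t> applyMask3x3(const std::vector<std::uint8_t>& src,
                                       int width, int height,
                                       const std::array<int, 9>& mask,
                                       int divisor) {
    assert(width > 0 && height > 0 && divisor != 0);
    assert(src.size() ==
           static_cast<std::size_t>(width) * static_cast<std::size_t>(height));
    const std::size_t w = static_cast<std::size_t>(width);
    std::vector<std::uint8_t> dst(src);
    for (int y = 1; y + 1 < height; ++y) {
        for (int x = 1; x + 1 < width; ++x) {
            int sum = 0;
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    const std::size_t idx =
                        static_cast<std::size_t>(y + dy) * w +
                        static_cast<std::size_t>(x + dx);
                    const std::size_t m =
                        static_cast<std::size_t>((dy + 1) * 3 + (dx + 1));
                    sum += mask[m] * static_cast<int>(src[idx]);
                }
            }
            const int scaled = std::min(255, std::max(0, sum / divisor));
            dst[static_cast<std::size_t>(y) * w + static_cast<std::size_t>(x)] =
                static_cast<std::uint8_t>(scaled);
        }
    }
    return dst;
}
```

With a mask of nine ones and a divisor of 9 this acts as a mean filter; other masks give sharpening or edge responses.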

 

And more

 

            There are also floating point and integer-based images with limited operations.  The integer size allows for averaging images together (for noise removal), and the floating point type allows for adding and multiplying images and for accurate FFT work.
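The reason a wider integer image helps with averaging is that sums of 8-bit pixels quickly exceed 255. A minimal C++ sketch of the idea follows; the function `averageFrames` and the use of `std::vector` buffers are assumptions for illustration and do not reflect the VSK classes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Average several equally sized 8-bit frames. Pixel sums are accumulated in
// 32-bit integers, which cannot overflow for any realistic frame count,
// then divided by the frame count with rounding to the nearest integer.
std::vector<std::uint8_t> averageFrames(
    const std::vector<std::vector<std::uint8_t>>& frames) {
    assert(!frames.empty());
    const std::size_t n = frames.front().size();
    std::vector<std::uint32_t> sum(n, 0);
    for (const auto& frame : frames) {
        assert(frame.size() == n);
        for (std::size_t i = 0; i < n; ++i) {
            sum[i] += frame[i];
        }
    }
    const std::uint32_t count = static_cast<std::uint32_t>(frames.size());
    std::vector<std::uint8_t> mean(n);
    for (std::size_t i = 0; i < n; ++i) {
        mean[i] = static_cast<std::uint8_t>((sum[i] + count / 2) / count);
    }
    return mean;
}
```

Averaging several frames of the same static scene reduces random noise, since the noise partly cancels while the scene content does not.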

            There are some useful matrix operations, such as multiplication, finding the inverse, finding the pseudo-inverse and solving matrix equations.  Finally there is a vector form of the FFT (for one dimensional arrays, such as sound clips).

 

Creating Applications using the DLL

 

            Purchase of the Vision Starter Kit entitles the purchaser to create and distribute applications which use the DLL.  The DLL itself can be distributed with the application as long as its name is not changed (VISIONSK.DLL).  However, information about the interface and source code included in the Vision Starter Kit must not be distributed.

Programs using .NET will also need the VisionStarterKit.dll along with VISIONSK.DLL. The VisionStarterKit.dll file is a C# “wrapper DLL” which converts the functionality of the original DLL into a suite of .NET classes, and is meant to work with VB.NET and C# applications. Other .NET languages should also be able to use the DLL. As long as both DLLs and your application file (an EXE) occupy the same directory, the application should be able to load and call the DLLs.

 

What else may be needed

 

            Currently the supported file format is the Windows BMP