Last updated on
by Frans Janssens
Checklist of the Collembola:
Image Analysis, Morphometry and Classification of scanned Collembola samples applied to Specimen Identification
Department of Biology, University of Antwerp (RUCA), Antwerp, B-2020, Belgium
Frank B. Dazzo,
Department of Microbiology & Molecular Genetics, Michigan State University, East Lansing, Michigan 48824-4320, USA
Identifying specimens of Collembola is a very time-consuming task.
Especially in ecological studies, where the number of specimens to be
investigated can be very large, the effort needed to identify
all specimens is usually too high to produce the final results within the
relatively short time frame of the study.
The automatic counting of specimens of the same species, using digital
image analysis techniques, has been done with success
(Krogh, Johansen & Holmstrup, 1998:201-205).
The commercial application LemnaTec Collembol
counts springtails automatically
after suspending the soil sample in stained water.
Besides number, the size of each springtail is quantified to give additional
information on number and size of each generation and individual quality of the
Further automation of the identification process might relieve researchers
from the tedious task of handling the many specimens manually.
Image processing and analysis tools have now advanced
to such an extent that it might become feasible to apply them to the
classification and identification of specimen samples.
In the experimental approach described in this work,
commonly available equipment and software are used to process samples of
specimens in order to produce a report in which the specimens
are classified according to predefined criteria.
Overview of the process
- prepare a sample of specimens, extracted with a Tullgren or Berlese funnel,
in a petri dish.
- make a scan (Fig.1) of the petri dish using a flatbed scanner.
- preprocess the image with a (customised) erosion filter to remove
all noise and speckles (Fig.2) without deforming the scanned objects.
- with an image analysis tool, find, number and count all isolated
objects in the image. Collect a number of measurements from the isolated
objects, such as area, perimeter, roundness, elongation, etc.
- classify the objects according to predefined classes based on
Preparing the specimens
Prepare a sample of specimens, extracted with a Tullgren or Berlese funnel,
in a petri dish. Use a clear glass dish with a flat bottom. Specimens are
put in ethanol + 10% glycerol. The addition of glycerol improves
the distribution of the individual specimens in the petri dish
and thus prevents specimens from lying across one another.
Also make sure to keep the liquid level in the dish low:
just as high as the specimens are thick. Then you can make use of
the surface tension to spread specimens evenly across the dish.
The problem remains that they can still touch each other.
This is solved by applying an erosion filter during image (pre)processing.
Fig.1. A scanned sample of Collembola specimens
Scanning the specimens sample
Initially, a low resolution scan (Fig.1) of a sample is used to evaluate the
feasibility of the image processing and analysis technique
for specimen identification purposes. The advantage of the low resolution
is that the image files are relatively small. Since a lot of processing
is to be done on the images, the small size will reduce processing time,
making results for evaluation available in a short time.
The disadvantage of a low resolution is that a lot of information
is lost in the scanned image. This loss of information implies that
fewer criteria are available to classify the specimens.
The resolution used to make the scan is 300 x 300 dpi (dots per inch).
This means that the smallest morphological feature recognisable in the image
is about 85 micron (25.4 mm / 300 = 0.0847 mm). It is clear that a higher
resolution will give more
detailed information that can be used by the classification logic to produce
a more accurate classification.
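The relation between scan resolution and the smallest resolvable feature can be sketched as follows. This is only an illustrative calculation: the function names, the 1 mm specimen length and the 600 and 1200 dpi settings are assumptions chosen for the example, not values from the experiment.

```python
MM_PER_INCH = 25.4


def pixel_size_um(dpi: float) -> float:
    """Edge length of one scanned pixel in micrometres."""
    return MM_PER_INCH / dpi * 1000.0


def pixels_spanned(length_mm: float, dpi: float) -> float:
    """Number of pixels covered by a feature of the given length."""
    return length_mm * dpi / MM_PER_INCH


# 300 dpi is the resolution used in the text; the others are hypothetical.
for dpi in (300, 600, 1200):
    print(f"{dpi:>4} dpi: {pixel_size_um(dpi):5.1f} micron per pixel, "
          f"a 1 mm specimen spans {pixels_spanned(1.0, dpi):5.1f} pixels")
```

At 300 dpi this gives roughly 84.7 micron per pixel, which is the "about 85 micron" mentioned above, so a specimen of 1 mm is described by only about a dozen pixels along its length.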
The scan is saved as a bilevel (black and white) image in the
OS/2 or Windows Bitmap file format (BMP). The bilevel image keeps the file
relatively small. The BMP format is an uncompressed format, which guarantees that
no information is lost while saving the scan.
Filtering the scanned image
The scanned image contains a lot of noise. It is necessary to preprocess
the image to remove this noise before starting with the
image analysis itself. Several off-the-shelf filters were
evaluated, but none produced the desired result: the images of the
specimens themselves should not be deformed by the noise filter.
Eventually 'BMP Wizard' by Andrea Benoni, a fully programmable filter
for BMP formatted image files,
was used. This filter can be programmed
to manipulate the image down to pixel level. The filter is programmed with
a simplified binary erosion algorithm (see pseudocode in Tab.1).
The erosion operation is one of the basic operators of what is called
mathematical morphology. Mathematical morphology operators concentrate
on the task of reducing imaging information. The erosion process causes
all isolated points and small objects to disappear.
The structuring element (mask) used in the erosion is a square of 4 pixels.
As the mask slides over the image, all
foreground (black) pixels that touch 3 background (white) pixels within the mask
are removed.
This operation is repeated until no more pixels can be removed.
The result of this repeated binary erosion is that all isolated dots,
small objects and small protrusions of larger objects are removed (Fig. 2).
Tab.1. The binary erosion algorithm
For y=1 to Inbmp.Height-2
For x=1 to Inbmp.Width-2
Fig.2. The eroded scan
Fig.2a. All appendages removed
Fig.2b. Combined image to show removed parts
Applying another filter to the eroded image removes all appendages from the
bodies of the specimens (fig. 2a). In our experiment, all processes narrower
than 3 pixels are considered to be appendages.
The combined image in fig.2b shows in red what the filter has removed.
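The two filtering steps can be illustrated with a textbook morphological opening (erosion followed by dilation with the same square mask). Note that this is a swapped-in standard technique and not the BMP Wizard script of Tab.1: the function names, the tiny test image and the choice of an opening are assumptions made for this sketch only. A 2 x 2 mask removes speckles, a 3 x 3 mask removes everything narrower than 3 pixels.

```python
def parse(rows):
    """Turn '#'/'.' strings into rows of 1 (foreground) and 0 (background)."""
    return [[1 if ch == "#" else 0 for ch in row] for row in rows]


def render(image):
    return ["".join("#" if v else "." for v in row) for row in image]


def erode(image, size):
    """Keep a pixel only if the size x size square anchored at it is all foreground."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height - size + 1):
        for x in range(width - size + 1):
            if all(image[y + dy][x + dx]
                   for dy in range(size) for dx in range(size)):
                out[y][x] = 1
    return out


def dilate(image, size):
    """Paint the size x size square at every foreground pixel."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if image[y][x]:
                for dy in range(size):
                    for dx in range(size):
                        if y + dy < height and x + dx < width:
                            out[y + dy][x + dx] = 1
    return out


def opening(image, size):
    """Erosion followed by dilation: drops parts narrower than the mask."""
    return dilate(erode(image, size), size)


def removed(before, after):
    """Pixels that are foreground in `before` but not in `after` (cf. Fig.2b)."""
    return [(x, y) for y, row in enumerate(before)
            for x, v in enumerate(row) if v and not after[y][x]]


# Hypothetical scan: a 5x5 body, a 2-pixel-wide appendage and one speckle.
scan = parse([
    "............",
    ".#####...#..",
    ".#####......",
    ".#########..",
    ".#########..",
    ".#####......",
    "............",
])
denoised = opening(scan, 2)      # speckle disappears, appendage survives
bodies = opening(denoised, 3)    # parts narrower than 3 pixels disappear

print("\n".join(render(bodies)))
print("speckle pixels removed:  ", removed(scan, denoised))
print("appendage pixels removed:", len(removed(denoised, bodies)))
```

Unlike a bare erosion, the dilation step restores the outline of whatever survives, so the remaining bodies keep their original size. Whether this matches the behaviour of the customised filter used in the experiment is not verified here.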
Object analysis of the filtered image
Object analysis and classification (also called 'blob analysis')
is performed with the UTHSCSA Image Tool.
Each object can be analysed using measurements such as:
area, perimeter, compactness, roundness, elongation, major axis,
minor axis, gray level, etc.
Fig.3. Some zoomed-in traced objects
Finding the objects using automatic thresholding of the image
is the first step of the analysis.
The find objects operation identifies isolated regions in the current image.
Automatic thresholding will produce reproducible results.
Automatic object selection ignores objects which touch any edge
of the image: since such border objects are most likely incomplete,
analysing them further would not make sense.
Objects smaller than a predefined minimum size can also be
discarded by automatic object selection.
Objects are annotated with sequential identification numbers on the original image.
This is useful for interpreting the results of further analysis functions.
Identified objects are marked with annotated outlines.
The objects in the image are extracted by tracing the outlines of the objects.
Each traced object is numbered and counted.
The 8 times zoomed-in part (Fig.3) contains 9 traced objects, numbered as follows:
4, 7, 14, 15, 17, 19, 20, 21, and 22.
Based on their size, the objects can be grouped in 4 categories:
from small to large:
(20), (4,7,15,17,19,21), (22), and (14).
Obviously, object 20 is too small compared with the others. It can be
discarded because it is probably an artefact of the imaging process
or a sample contamination (dust, sand, etc.).
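The find objects step described above can be approximated with a simple flood fill. This is a minimal sketch and not ImageTool's implementation: the function name, the use of 8-connectivity and the `min_area` parameter are assumptions for illustration.

```python
from collections import deque


def find_objects(image, min_area=1):
    """Collect 8-connected foreground regions of a bilevel image.

    image is a list of rows with 1 = foreground and 0 = background.
    Returns one pixel list [(x, y), ...] per kept object, in scan order.
    Objects touching the image border or smaller than min_area pixels
    are dropped.
    """
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    objects = []
    for y0 in range(height):
        for x0 in range(width):
            if not image[y0][x0] or seen[y0][x0]:
                continue
            # Breadth-first flood fill from the first unseen foreground pixel.
            queue = deque([(x0, y0)])
            seen[y0][x0] = True
            pixels = []
            touches_border = False
            while queue:
                x, y = queue.popleft()
                pixels.append((x, y))
                if x in (0, width - 1) or y in (0, height - 1):
                    touches_border = True
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = x + dx, y + dy
                        if (0 <= nx < width and 0 <= ny < height
                                and image[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            if not touches_border and len(pixels) >= min_area:
                objects.append(pixels)
    return objects


# Hypothetical image: two border objects, one 3x3 body, one single-pixel speck.
rows = [
    "##........",
    "##..###...",
    "....###..#",
    "....###...",
    ".#........",
    "..........",
]
image = [[1 if ch == "#" else 0 for ch in row] for row in rows]

for number, pixels in enumerate(find_objects(image, min_area=2), start=1):
    print(f"object {number}: area = {len(pixels)} pixels")
```

With `min_area=2` only the 3 x 3 body is reported; the two objects on the border and the single-pixel speck are discarded, in the same way as object 20 is discarded above.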
The object analysis process extracts the dimensional features
of identified objects in an image.
With UTHSCSA ImageTool, 19 different attributes of an object
can be computed.
Since a bilevel scanned image is used, not all attributes are relevant
(e.g. all attributes related to gray scale images are ignored).
The relevant attributes are defined as follows:
- Area: the area of the object;
- Perimeter: the length of the outside boundary of the object;
- Major Axis Length: the length of the longest line that can be drawn
through the object;
- Minor Axis Length: the length of the longest line that can be drawn
through the object perpendicular to the major axis;
- Elongation: the ratio of the length of the major axis to the length
of the minor axis.
If the elongation is 1, the object is roughly circular or square;
- Roundness: computed as: (4.PI.Area)/Perimeter^2;
- Feret Diameter: the diameter of a circle having the same area
as the object, it is computed as: sqrt(4.Area/PI);
- Compactness: computed as: sqrt(4.Area/PI)/Major Axis Length;
this provides a measure of the object's circularity.
At 1, the object is roughly circular.
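The derived attributes in this list follow directly from the four basic measurements. The sketch below only restates the formulas given above; the function name and the example measurements are hypothetical and are not taken from Tab.2.

```python
import math


def shape_attributes(area, perimeter, major_axis, minor_axis):
    """Derive the shape attributes defined above from basic measurements."""
    feret_diameter = math.sqrt(4.0 * area / math.pi)
    return {
        "elongation": major_axis / minor_axis,
        "roundness": 4.0 * math.pi * area / perimeter ** 2,
        "feret_diameter": feret_diameter,
        "compactness": feret_diameter / major_axis,
    }


# An ideal circle of radius 10: every ratio should come out as 1.
circle = shape_attributes(area=math.pi * 10 ** 2,
                          perimeter=2 * math.pi * 10,
                          major_axis=20.0, minor_axis=20.0)

# A hypothetical elongated object (illustrative values only).
elongated = shape_attributes(area=400.0, perimeter=100.0,
                             major_axis=40.0, minor_axis=10.0)

for name, attributes in (("circle", circle), ("elongated", elongated)):
    print(name, {key: round(value, 3) for key, value in attributes.items()})
```

For the circle, elongation, roundness and compactness all equal 1 and the Feret diameter equals the true diameter; for the elongated object roundness and compactness drop well below 1 while elongation rises to 4.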
| Object | Area | Perimeter | Major Axis Length | Minor Axis Length | Elongation | Roundness | Feret Diameter | Compactness |
Tab.2. Relevant morphometric measurements of the objects in Fig.3
The complete image contains 82 objects in total.
of 19 morphometric parameters for all 82 objects.
The object analysis procedure has to be applied to the image with removed
appendages (fig.2a). Comparing both sets of measurements gives an indication
of the relative length of the appendages. To be completed.
Classification of the analysed objects
With UTHSCSA ImageTool, the object classification process can classify the
objects in an image based upon a single criterion.
Any of the classifiable attributes, such as area, compactness, elongation,
feret diameter, major axis length, minor axis length, perimeter
and roundness can be used to classify the objects into different groups.
A more object-oriented classification of the specimens,
as opposed to the default feature-oriented classification of UTHSCSA ImageTool,
can be performed with the
Center for Microbial Ecology Image Analysis System (CMEIAS),
which is actually a UTHSCSA ImageTool plug-in.
CMEIAS 1.27 is a free, scientific software tool of computer-assisted
microscopy and digital image analysis originally intended for use in
microbiological research and education. It was developed by a team of
microbial ecologists and computer scientists at the
Michigan State University Center for Microbial Ecology
to perform a semi-automatic
morphotype classification of the microbes present in digital images of
microbial populations and communities. CMEIAS 1.27 operates within the
UTHSCSA ImageTool Ver. 1.27
on a PC running Windows NT 4.0/2000/ME/XP.
To perform a morphotype classification using CMEIAS 1.27 in ImageTool,
the operator first finds the objects of interest in the image by using a
thresholding procedure, then conducts an Object Analysis to extract
various size and shape measurements from each microbe present, and
finally uses these Object Analysis data to perform an Object
Classification that automatically assigns the appropriate morphotype to
each microbe found. This object classification procedure uses a series
of pattern recognition algorithms optimized for
11 major microbial morphotypes
represented by 98% of the genera described in the 9th Edition of
Bergey's Manual of Determinative Bacteriology. Extensive testing using
large ground truth data sets indicates that CMEIAS performs with an
overall morphotype classification accuracy of 97% on properly edited
Fig.4. CMEIAS v1.27 supervised morphotype classification
Dr Frank Dazzo of the MSU
read with interest about our initial morphometric experiments of 1999, in which
digital image analysis was applied to Collembola specimens using UTHSCSA ImageTool.
He was curious how the CMEIAS morphotype classifier
would perform on the same images,
and applied CMEIAS to the image of the scanned
petri dish sample of Collembola specimens in Fig.2.
The result of the classification is illustrated in the pseudocoloured
image of Fig.4.
Depending on the shape of the Collembola, the specimens are classified as
regular rods (13, blue), curved rods (3, magenta), U-shaped rods (1, pink),
prosthecates (3, yellow),
clubs (2, green), and rudimentary branched rods (2, gray) corresponding to the
major microbial morphotypes as currently defined by CMEIAS.
Note that the small Collembola specimens in the original image were too small
to classify, so they were removed manually.
This preliminary test result is quite promising, and
Dr Frank Dazzo is prepared to develop the pattern recognition algorithms
for the major morphotypes of Collembola.
Contact Frank B. Dazzo, Professor of Microbiology,
with questions and comments.
An unsupervised clustering or classification procedure could be used to
determine the ranges of the classification criterion for each class.
In unsupervised clustering a given collection of samples
is classified according to a criterion function.
The set of samples is partitioned into
disjoint subsets. Each subset represents a cluster,
with samples in the same cluster being somehow more
similar than samples in different clusters.
Hierarchical clustering is typically applied in biological taxonomy,
where individual specimens are hierarchically grouped into
species, species into genera, genera into families,
and so on.
Agglomerative (bottom-up) procedures start with singleton
clusters and successively merge clusters. Divisive (top-down)
procedures start with one cluster containing all samples
and successively split clusters.
Fig.5. Basic Agglomerative Hierarchical Clustering
1. start with each sample = its own singleton cluster
2. stop if criterion function is satisfied
3. merge nearest distinct clusters pairwise
4. loop to 2
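The four steps of Fig.5 can be sketched in a few lines. This is an illustrative version under stated assumptions: the criterion function is taken to be "stop when `n_clusters` clusters remain", the distance between clusters is single linkage on Euclidean distance, and the feature values (elongation, Feret diameter) are hypothetical.

```python
import math


def agglomerate(samples, n_clusters):
    """Basic agglomerative clustering of feature tuples.

    Returns a list of clusters, each a list of indices into `samples`.
    """
    # Step 1: every sample starts as its own singleton cluster.
    clusters = [[i] for i in range(len(samples))]

    def distance(a, b):
        # Single linkage: distance between the two closest members.
        return min(math.dist(samples[i], samples[j]) for i in a for j in b)

    # Step 2: stop as soon as the (assumed) criterion is satisfied.
    while len(clusters) > n_clusters:
        # Step 3: find the nearest pair of distinct clusters and merge it.
        _, i, j = min((distance(clusters[i], clusters[j]), i, j)
                      for i in range(len(clusters))
                      for j in range(i + 1, len(clusters)))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
        # Step 4: loop back to step 2.
    return clusters


# Hypothetical (elongation, Feret diameter) measurements of five objects.
samples = [(1.0, 10.0), (1.5, 12.0), (4.0, 30.0), (4.5, 33.0), (3.5, 29.0)]
for cluster in agglomerate(samples, n_clusters=2):
    print(sorted(cluster))
```

In this sketch the features are used unscaled, so the attribute with the largest numeric range dominates the distance; in practice the columns would normally be normalised first.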
A classification is defined by specifying the ranges of the
classification criterion for each cluster.
Once a classification scheme is defined, the classification process
will classify the objects based upon this scheme.
The classification process basically can provide 3 different types of information:
it can report statistics on the classifications themselves,
it can report on the objects, and
it can display an image in which objects are colored by their classification.
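A range-based classification scheme on a single criterion might look like the sketch below. The class names, the elongation threshold of 2.0 and the measured values are purely hypothetical; they only show how a scheme of ranges yields a per-object report and per-class statistics.

```python
from collections import Counter


def classify(values, scheme):
    """Assign each value to the first class whose [low, high) range contains it.

    scheme is a list of (class_name, low, high) tuples.
    Values outside every range are labelled None.
    """
    labels = []
    for value in values:
        for name, low, high in scheme:
            if low <= value < high:
                labels.append(name)
                break
        else:
            labels.append(None)
    return labels


# Hypothetical scheme on the elongation attribute (thresholds are assumptions).
scheme = [
    ("globular", 1.0, 2.0),
    ("elongate", 2.0, float("inf")),
]
elongations = [1.2, 3.5, 1.8, 4.0]

labels = classify(elongations, scheme)

# Report on the objects ...
for number, (value, label) in enumerate(zip(elongations, labels), start=1):
    print(f"object {number}: elongation {value} -> {label}")

# ... and statistics on the classes themselves.
print(Counter(labels))
```

The third type of output, an image in which objects are coloured by class, would follow by painting the pixels of each object with a colour looked up from its label.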
To be completed...
applied to the feature extraction matrix might help to obtain a kind of
pseudotaxonomic classification of the specimens.
E.g. for Collembola it should be feasible to at least
classify the specimens into the two main groups:
Arthropleona (with a long, stretched body) and
Symphypleona (with a more globular body).
Classifying the specimens taking into account the measured features of
both the images of the complete specimens (fig.2) and the postprocessed images
of specimens without appendages (fig.2a)
allows further classification of the arthropleon Collembola into poduromorphs
(typically with short antennae, legs and furca) and
entomobryomorphs (typically with long antennae, legs and furca).
To be completed.
The standardised reproduction test with Collembola (Folsomia candida)
counts the springtails automatically
after suspending the soil in stained water.
Besides number, the size of each springtail is quantified to give additional
information on number and size of each generation and individual quality of the
Dirk Vandenhirtz, CEO,
I thank Veselin Pizurica for his advice on using the mathematical morphology
- Benoni, A. 1996.
BMP Wizard, version 1.81,
a free programmable script driven digital image filter.
- Duda, R.O. & Hart, P.E. 1973.
Pattern Classification and Scene Analysis. John Wiley & Sons.
- Liu, J., Dazzo, F.B., Glagoleva, O., Yu, B. & Jain, A.K. 2001.
CMEIAS: A computer-aided system for the image analysis of bacterial morphotypes
in microbial communities. Microbial Ecology 41 (3), p.173-194 and 42, p.215.
- Wilcox, C.D., Dove, S.B., Doss-McDavid, W. & Greer, D.B. 1997.
UTHSCSA Image Tool,
a free image processing and analysis tool of the
University of Texas Health Science Center, San Antonio, TX, USA, October 12, 1997.