Compressing the Geospatial Data of Testing Grounds

ANATOL MITSIUKHIN

Institute of Information Technologies,

Belarusian State University of Informatics and Radioelectronics,

Belarus, 220037, Minsk, Kozlova St. 28,

REPUBLIC OF BELARUS

Abstract: - The paper discusses an algorithm for obtaining compressed spatial images of testing grounds that are described by attribute data. Attributive data representations make it possible to use the hardware and software resources of geographical information systems (GIS) more economically. Objects that can be represented by a boundary are considered as the observable ones. The efficiency of the testing ground description is achieved by representing all its areas by boundaries, encoding the boundaries with the Freeman code, and applying entropy encoding at the last stage of image processing. The compressed set of attributive data can be unpacked into the original image with almost no recovery errors, or with an acceptable accuracy that is not below the limit set by the specification. The theoretical principles of the method are illustrated by processing the segmented image of a contour. A comparative assessment of the compression efficiency of the combined method and of entropy encoding alone is presented.

Key-Words: - Testing ground, Protection of Nature, Image, Attribute, Freeman-Code, Compression, Boundary.

Received: March 21, 2023. Revised: October 13, 2023. Accepted: December 12, 2023. Published: December 31, 2023.

1 Introduction

An efficient method for encoding remote observation data is considered. The areas of application of monitoring of man-made and natural objects are covered fairly fully in the scientific and technical literature, [1], [2]. The task of monitoring nature and landscape is to provide targeted and relevant information for the efficient protection of nature and landscape. Environmental monitoring of agricultural territories makes it possible to quantify changes in biodiversity and in the state of the landscape, and to draw practical conclusions from the results, with appropriate correction of production activities or protective measures, [3].

Spatial data may have high dimensionality. In many applications, it is desirable to replace the set of pixels depicting an object with a description of the object's boundary in order to reduce the volume of the data array. The boundary representation is suitable when the following geometrical characteristics of the object are the focus of attention: length, area, bends, contours, and concavities, [4].

The methods and technical means used for reliable measurement, fast registration, and automated analysis of remote observation video information require continuous modernization of monitoring systems and complexes. The technical improvement of GIS is associated to a large extent with the introduction of new observation and storage tools, as well as modern methods for transmitting signals and images based on an information-theoretic approach and fast (efficient) computational algorithms for digital processing of 1D and 2D data.

2 Problem Formulation

The efficient description of object boundaries becomes a challenge when solving problems related to the detection or search of certain objects in images, as well as their recognition or identification. One of the important characteristics of engineering and geodetic monitoring is the efficiency of automated measurements, that is, of obtaining spatial information about changes in the behavior of infrastructure facilities (utilities, transport systems, etc.) and natural objects, for example, the forests and wetlands of Polesia in Belarus. Some monitoring images, for example, those of agricultural areas and landscape territories, usually do not require measurements with centimetre-level accuracy. For such monitored objects, an attributive representation is convenient and efficient.

WSEAS TRANSACTIONS on ENVIRONMENT and DEVELOPMENT

DOI: 10.37394/232015.2023.19.125

Anatol Mitsiukhin

E-ISSN: 2224-3496

1386

Volume 19, 2023

3 Problem Solution

It is assumed that the data are generated after the

process of segmentation of images of the geographic

objects of interest. The description of monitoring

data proposed for consideration is implemented

through the following computational stages.

1. Stage of initial description of the object of interest. The stage includes processing and coding, in the spatial domain, of the initial data of the testing ground on a discrete grid, [5]. It is assumed that the data of various monitoring objects (swamp, forest, agricultural land, etc.) are stored in separate files. The categories of land plots do not overlap. In this case, the testing ground attributes can be stored in a single spatial data layer, while the encoding matrices differ in numerical attribute values or non-spatial data. To reduce the computational costs of spatial processing, it is proposed to perform the encoding for each testing ground category separately. In addition, the encoding should only be performed for the boundaries that define the territory of the category (reservoir, roads, field, etc.).

2. Encoding the boundaries of the testing ground

objects using the Freeman-Code, [6].

3. Entropy encoding of the Freeman-Code

sequence with the specified component

connectivity, [7].

4. Formation, storage, and transmission of the

integrated compressed file of the observed image

data.

3.1 Theoretical Principles

Let $\mathbf{g}_i$, $i = 1, \ldots, K$, denote the $i$-th spatial object present in the observed image of interest, where $K$ is the number of objects of a single-layer testing ground $\mathbf{G}$. The highlighted objects of the testing ground were obtained after the segmentation process. Note that the mathematical principles under consideration also underlie the processing of multilayer testing grounds. In that case, either a sequential algorithm for processing each layer or a parallel algorithm for processing the images of objects simultaneously on a multi-core processor is implemented. In the digital representation, the image of the testing ground is described by the matrix $\mathbf{G} = (g(m, n))$ with the dimensions $M \times N$. The matrix $\mathbf{G}$ is characterized by the values of the variances $\sigma_G^2$ and those of the covariance function $\operatorname{cov}(\mathbf{G})$. A priori knowledge of these statistical characteristics makes it possible to pre-evaluate the efficiency of the data presentation and description. Expression (1) corresponds to a single-layer testing ground.

$$\mathbf{G} = \bigcup_{i=1}^{K} \mathbf{g}_i, \quad \mathbf{g}_i \cap \mathbf{g}_j = \varnothing, \quad i \neq j, \quad i, j = 1, \ldots, K. \tag{1}$$

The efficiency of processing the testing ground image is improved by reducing the number of arithmetic operations. To do this, each object is represented by attributes by means of a point encoding operation. As a result, the statistical characteristics of the testing ground image change, which is important for obtaining its compact record and efficient transmission. The uniform point encoding, [5], of the following kind

$$\mathbf{C} = \left( f[g_i(m, n)] \right) \tag{2}$$

is performed, where $\mathbf{C} = (c(m, n))$ is the matrix whose elements correspond to homogeneous images of the testing ground objects, and $f(g_i(m, n))$, $i = 1, \ldots, K$, is the point operation function over the matrix (1).

The point operation (2) changes the brightness levels of all pixels of the objects $\mathbf{g}_i$ of the testing ground $\mathbf{G}$. The encoding (2) highlights the main attribute of the observed object of interest. It is implemented by assigning the attributes $c_i$, $i = 1, \ldots, K$, $c_i \in \mathbf{C}$, $c_i \neq c_j$. For example, the 2D code word with the attribute $c_i \in \mathbf{C}$ corresponds to all pixels of the “forest” category; the code word with a different attribute $c_j \in \mathbf{C}$ corresponds to the pixels of the “swamp” category, etc. The new data image $\mathbf{C}$ has the variance $\sigma_C^2$ and the covariance $\operatorname{cov}(\mathbf{C})$. After operation (2), the distribution of the spatial two-dimensional variances of the matrix $\mathbf{C} = (c(m, n))$ becomes highly uneven, which makes it possible to reduce the computational complexity of processing the source data, [8].
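The point operation (2) can be illustrated with a minimal sketch. It assumes that segmentation has already produced a category label for every pixel; the label names, the numeric attribute values in `ATTRIBUTES`, and the function `point_encode` are hypothetical and serve only as an illustration.

```python
# Hypothetical attribute values c_i: one distinct code per object category.
ATTRIBUTES = {"forest": 1, "utility": 2, "lake": 3, "deforested": 4}


def point_encode(segmented, attributes):
    """Point operation: replace each pixel's category label by its attribute."""
    return [[attributes[label] for label in row] for row in segmented]


# Toy segmented image (assumed input): one category label per pixel.
segmented = [
    ["forest", "forest", "lake"],
    ["forest", "lake", "lake"],
    ["utility", "deforested", "lake"],
]

C = point_encode(segmented, ATTRIBUTES)
for row in C:
    print(row)
```

Because every pixel of a category receives the same value, the resulting matrix is piecewise constant, which is what makes the subsequent boundary-only description possible.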

To reduce the dimension of the testing ground represented by the matrix $\mathbf{C} = (c(m, n))$, the additional encoding is only performed for the boundaries of the entire data array. The result consists of a sequential set of adjacent pixels located on the boundaries of the testing ground objects. The distinctive statistical property of a typical contour image is high linear dependence, that is, high correlation of the values of discrete samples.

The existence of this property makes it possible to perform efficient data compression with a zero value of the image restoration error,

$$\varepsilon = \frac{1}{N} \sum_{n=0}^{N-1} \left( x(n) - \tilde{x}(n) \right) \approx 0,$$

where $x(n) = (x_0, \ldots, x_{N-1})$ and $\tilde{x}(n) = (\tilde{x}_0, \ldots, \tilde{x}_{N-1})$ are the sequences of values of the boundary pixels before compression and after the inverse transform (restoration); $N$ is the number of pixels of the boundary; $n$ is the current sample number.
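A simple way to pick out the boundary pixels of one category from an attribute matrix is sketched below. The 4-neighbour test, the function name `boundary_pixels`, and the toy matrix are assumptions of this sketch rather than the author's implementation. The function returns the boundary pixels in raster order, so ordering them along the contour remains a separate step.

```python
def boundary_pixels(C, attribute):
    """Return (row, col) of pixels with the given attribute that touch another
    category, or the image edge, through one of their 4-neighbours."""
    rows, cols = len(C), len(C[0])
    result = []
    for r in range(rows):
        for c in range(cols):
            if C[r][c] != attribute:
                continue
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                outside = not (0 <= rr < rows and 0 <= cc < cols)
                if outside or C[rr][cc] != attribute:
                    result.append((r, c))
                    break  # one differing neighbour is enough
    return result


# Toy attribute matrix: a 3x3 "lake" (hypothetical attribute 3) inside attribute 1.
C = [
    [1, 1, 1, 1, 1],
    [1, 3, 3, 3, 1],
    [1, 3, 3, 3, 1],
    [1, 3, 3, 3, 1],
    [1, 1, 1, 1, 1],
]

lake_boundary = boundary_pixels(C, 3)
print(len(lake_boundary), "boundary pixels out of 9 lake pixels")
```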

It is assumed that the elements of the boundaries of the objects form a connected set with the connectivity component $S$. In this case, it is possible to achieve a highly efficient description of the boundaries using the Freeman-Code, Figure 1.

Fig. 1: Direction coding in an 8-neighborhood

In general, the accuracy of the boundary description is determined by the size of the grid step and the number $S$, [4]. Several modifications of the code with different connectivity are used, in which the direction code, starting from the initial phase of 0 degrees, changes clockwise in increments of 90, 45, or 22.5 degrees, respectively. The boundary image is then represented by a closed sequence of vectors, that is, binary code words. The sequence elements are determined by the current changes of the direction of motion along the boundary. The length of each vector (code word) is determined by the connectivity value $S$. For example, encoding the boundary with four-component connectivity is implemented by words with a length of 2 bits. In this case, each pixel is described by two bits instead of eight. As a result of the encoding, the 1-D sequence

$$x(n) = (x_0, \ldots, x_{N-1}) \tag{3}$$

is formed, with the base depending on the connectivity component value $S$.
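The formation of sequence (3) and its error-free inversion can be sketched as follows. The direction numbering is an assumption (code 0 points east and the codes grow clockwise in 45-degree steps, with rows increasing downwards) and may differ from the numbering in Figure 1; `freeman_encode` and `freeman_decode` are hypothetical helper names.

```python
# Assumed 8-connectivity convention: (d_row, d_col) offsets, code 0 = east,
# codes increase clockwise in 45-degree steps (rows grow downwards).
DIRECTIONS = [(0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1)]


def freeman_encode(points):
    """Chain-code a closed boundary given as an ordered list of (row, col)."""
    codes = []
    for i, (r, c) in enumerate(points):
        nr, nc = points[(i + 1) % len(points)]  # wrap around: closed contour
        codes.append(DIRECTIONS.index((nr - r, nc - c)))
    return codes


def freeman_decode(start, codes):
    """Rebuild the boundary pixels from the start pixel and the chain code."""
    points = [start]
    r, c = start
    for code in codes[:-1]:  # the last step only returns to the start pixel
        dr, dc = DIRECTIONS[code]
        r, c = r + dr, c + dc
        points.append((r, c))
    return points


# Ordered boundary of a 3x3 square object (toy data).
boundary = [(1, 1), (1, 2), (1, 3), (2, 3), (3, 3), (3, 2), (3, 1), (2, 1)]

codes = freeman_encode(boundary)
restored = freeman_decode(boundary[0], codes)
print(codes)
print("restoration error is zero:", restored == boundary)
```

Only the start pixel and one 3-bit symbol per boundary step need to be stored, and decoding reproduces the boundary exactly.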

Following the concepts of information theory, sequence (3) describes a discrete memoryless source $\{q_0, \ldots, q_{S-1}\}$ with a known probability law $P = \{P(q_0), \ldots, P(q_{S-1})\}$ of the boundary pixels. The set of symbols $\{q_0, \ldots, q_{S-1}\}$ of the source corresponds to the direction vectors, Figure 1. For example, the binary code word vector $x = (0\ 1\ 1)$ corresponds to the symbol $q_3$: $q_3 \to x = (0\ 1\ 1)$.
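The mapping from a direction symbol to its fixed-length binary code word can be sketched as below, assuming that the word is simply the binary representation of the symbol index with $\lceil \log_2 S \rceil$ bits; `code_word` is a hypothetical helper.

```python
from math import ceil, log2


def code_word(symbol, connectivity=8):
    """Fixed-length binary code word of a direction symbol, as a tuple of bits."""
    width = ceil(log2(connectivity))  # 3 bits for S = 8, 2 bits for S = 4
    return tuple((symbol >> shift) & 1 for shift in range(width - 1, -1, -1))


print(code_word(3))      # symbol q_3 with 8-connectivity
print(code_word(3, 4))   # the same index with 4-connectivity
```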

The main characteristic of a random source from the standpoint of its efficient representation is the entropy $H(X_s)$, [9]. An algorithm for optimal non-uniform entropy encoding of sequence (3) can then be applied for a compact description of such a source. To ensure reliable transmission of the geospatial information over a noisy channel, the resulting entropy code words are further encoded with an error-correcting (interference-resistant) code.

3.2 Example of the Efficient Description of

Geodata

Figure 2 shows a relatively complex single-layer testing ground.

Fig. 2: Testing ground image

The spatial objects of the testing ground belong to the class of fuzzy ones, [10]. The image contains four categories of objects: forest area, utility territory used for processing and storage of wood, deforested area, and lake surface. Monitoring of this testing ground may include information about changes in the areas of objects, their extent, features of the soil structure, the forest, and other characteristics.

At the stage of the initial compact description of the testing ground, the grid sampling rate shall be selected to meet the given specification (level of detail of the objects or spatial resolution of the image, time spent on data processing, etc.), [11].

The pixel size should be small enough to capture the required details, yet large enough that the data can be analyzed efficiently and stored (transmitted) in a GIS database. The categories of visible objects are encoded by the attributes (identifiers) presented in Table 1.

Table 1. Testing ground attributes

Testing ground

Object       Forest    Utility territory    Lake    Deforested territory
Attribute

The efficiency of the geomonitoring data description method is evaluated for one of the testing ground objects, the “Lake”. Based on the selected grid sampling rate, the cost of the digital representation of the original image of this object in pixels is $L = 271$. In general, the amount of memory for storing the image with 8-bit quantization is $V = 2168$ bits. When the approach with the transition to attribute data is applied, the required amount of memory for storing the “Lake” category is $V_{\mathrm{Attribut}} = 813$ bits.
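The two storage figures above can be checked by a short calculation. This is a sketch under the assumption that each volume is the pixel count multiplied by the number of bits per pixel; the pixel count of 271 follows from 2168/8, and the 3-bit attribute depth is inferred here from the ratio 813/271 rather than stated explicitly in the text.

```latex
V = 271 \cdot 8 = 2168 \ \text{bits}, \qquad
V_{\mathrm{Attribut}} = 271 \cdot 3 = 813 \ \text{bits}.
```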

At the same time, the shape of the object remains

unchanged. Note that the amount of memory does

not depend on the number of object categories

because it is necessary to ensure the storage

(transmission) of all pixels of the testing ground

image. In addition, testing grounds with a larger

number of categories require a larger number of bits

per pixel due to the peculiarities of the Freeman-

Code build.

The object boundary description is implemented using the Freeman-Code with the connectivity component $S = 8$. Since the entropy value does not depend on the order of the source symbols, the 1-D sequence (3) of the boundary is represented by the matrix $X_s$ in the alphabet $\{q_0, \ldots, q_{S-1}\} = \{0, 1, \ldots, 7\}$. To do this, a lexicographical record of sequence (3) by columns was applied.

$$X_s = \begin{bmatrix}
2 & 2 & 2 & 2 & 2 & 2 & 2 & 2 & 4 & 4 & 4 & 4 & 4 & 4 & 4 \\
4 & 4 & 4 & 4 & 4 & 4 & 5 & 4 & 4 & 4 & 4 & 5 & 5 & 5 & 5 \\
5 & 7 & 7 & 6 & 6 & 6 & 7 & 7 & 1 & 1 & 0 & 0 & 0 & 0 & 0 \\
7 & 7 & 0 & 1 & 0 & 7 & 1 & 0 & 0 & 7 & 0 & 0 & 1 & 0 & 0
\end{bmatrix}$$

The amount of memory for storing the image of the boundary of the “Lake” category is $V_{\mathrm{Freeman}} = 180$ bits.

The matrix $X_s$ describes a discrete memoryless source with a known probability law of the pixels

$$P = \{P(q_0), \ldots, P(q_{S-1})\} = \{P(0), \ldots, P(7)\} = \{0.2,\ 0.08,\ 0.133,\ 0,\ 0.28,\ 0.08,\ 0.1,\ 0.133\}.$$

The characteristic of the source $X_s$ is the entropy

$$H(X_s) = -\sum_{i=0}^{S-1} P(q_i) \log_2 P(q_i) = 2.6.$$
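The probability law and the entropy can be recomputed from the Freeman sequence with a few lines of code. This is a sketch rather than the author's program: the rows in `ROWS` are copied from the matrix as printed in this section, and the relative frequencies obtained this way may differ slightly from the rounded probabilities quoted in the text.

```python
from collections import Counter
from math import log2

# Rows of the Freeman-code matrix, as transcribed from this section.
ROWS = ["222222224444444", "444444544445555", "577666771100000", "770107100700100"]


def symbol_probabilities(rows):
    """Relative frequency of every direction symbol that occurs in the rows."""
    counts = Counter("".join(rows))
    total = sum(counts.values())
    return {int(sym): n / total for sym, n in sorted(counts.items())}


def entropy(probabilities):
    """Shannon entropy in bits per symbol; zero-probability symbols are skipped."""
    return -sum(p * log2(p) for p in probabilities.values() if p > 0)


P = symbol_probabilities(ROWS)
H = entropy(P)
print({sym: round(p, 3) for sym, p in P.items()})
print("entropy, bits per symbol:", round(H, 2))
```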

The value $H(X_s)$ determines the lower bound of the average length $E$ of a prefix code with which the source $X_s$ can be compressed. As a result of the entropy encoding of the source $X_s$ by the optimum Huffman code, the obtained average length of the code approaches the value $E \Rightarrow H(X_s) = 2.6$. The amount of memory for storing the image of the boundary of the “Lake” category is $V_{\mathrm{Huffman}} \approx 156$ bits.
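The Huffman stage can be sketched with the standard library alone. The code below builds Huffman code lengths for the symbol frequencies of the Freeman sequence and reports the total and average code length; `huffman_lengths` is a hypothetical helper, and the rows are again the matrix as transcribed in this section, so the totals are indicative only.

```python
import heapq
from collections import Counter
from itertools import count
from math import log2


def huffman_lengths(frequencies):
    """Return {symbol: code length in bits} for a Huffman code."""
    tie = count()  # unique tie-breaker so the heap never compares the dicts
    heap = [(f, next(tie), {sym: 0}) for sym, f in frequencies.items() if f > 0]
    heapq.heapify(heap)
    if len(heap) == 1:  # degenerate source with a single symbol
        return {sym: 1 for sym in heap[0][2]}
    while len(heap) > 1:
        f1, _, a = heapq.heappop(heap)
        f2, _, b = heapq.heappop(heap)
        # Merging two subtrees makes every contained symbol one bit longer.
        merged = {sym: depth + 1 for sym, depth in {**a, **b}.items()}
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return heap[0][2]


ROWS = ["222222224444444", "444444544445555", "577666771100000", "770107100700100"]
freq = Counter("".join(ROWS))
n_symbols = sum(freq.values())

lengths = huffman_lengths(freq)
total_bits = sum(freq[s] * lengths[s] for s in freq)
average = total_bits / n_symbols
H = -sum(f / n_symbols * log2(f / n_symbols) for f in freq.values())

print("total bits:", total_bits)
print("average length:", round(average, 3), "entropy:", round(H, 3))
```

For the matrix as transcribed here the sketch yields 158 bits in total, slightly above the entropy-based estimate of about 156 bits, which is expected because a prefix code assigns an integer number of bits to each symbol.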

Table 2 presents the data on the compression efficiency for the “Lake” category at the intermediate and final stages of the image processing. The efficiency was evaluated by the compression ratio

$$\eta = V / V_{\mathrm{cod}},$$

where $V$ is the cost of the object description without encoding, and $V_{\mathrm{cod}}$ is the cost of storing (transmitting) the data after efficient encoding according to the algorithm: attributive encoding $\Rightarrow$ Freeman-Code encoding $\Rightarrow$ Huffman encoding.

Table 2. Compression Efficiency for Different Source Data View

Data type       Data size, bit   Compression ratio   Bit/pixel
8-bit           2168
Attributive     813              2.66
Freeman-Code    180                                  0.67
Huffman-Code    156              13.8                0.58
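The entries of Table 2 can be cross-checked with a short script. It is a sketch that assumes the compression ratio is the 8-bit size divided by the coded size and that the bit/pixel figure is the coded size divided by the 271 pixels of the “Lake” object; `efficiency` is a hypothetical helper, and the last digit may differ from the table because of rounding.

```python
PIXELS = 271     # pixel count of the "Lake" object (2168 bits / 8 bits per pixel)
RAW_BITS = 2168  # size of the 8-bit representation


def efficiency(coded_bits, raw_bits=RAW_BITS, pixels=PIXELS):
    """Return (compression ratio, bits per pixel) for a coded size in bits."""
    return raw_bits / coded_bits, coded_bits / pixels


stages = [("Attributive", 813), ("Freeman-Code", 180), ("Huffman-Code", 156)]
for name, bits in stages:
    ratio, bpp = efficiency(bits)
    print(f"{name}: ratio {ratio:.2f}, {bpp:.2f} bit/pixel")
```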

As seen from Table 2, the cost of describing the attributes of the “Lake” object of the testing ground (see the testing ground image in Section 3.2) has been reduced from 8 bits per pixel to 0.58 bits per pixel. The gain will increase even more if the original testing ground image is presented with a higher bit depth, for example, 10 or 12 bits per pixel.

As follows from the example in Section 3.2, the considered

computational algorithm for the representation and

description of the boundary can be used for

analyzing the shape of technical objects. The efficient solution of the problem of recognizing technical (artificial) objects by shape is relevant for many applications, in particular communication, space, and military ones. The recognition of objects by shape in these applications is implemented with algorithms that use computationally expensive linear operations of convolution or correlation. The transition from an operation on the set of all pixels of the object to an operation on the set of pixels of the boundary simplifies the execution of the convolution or correlation operations, which is important when the recognition problem has to be solved quickly.

4 Results and Their Discussion

Experimental studies have shown that the method makes it possible to reduce the initial size of a testing ground represented by attributes down to 15-20%. The efficiency of the description depends on the geometric characteristics of the testing ground and on the correlation between adjacent elements of the Freeman sequence. On average, the implementation of the three encoding stages made it possible to reduce the storage volume of the geospatial data of the testing ground by about 80%.

The main investigations have been performed for boundaries with smooth shape characteristics. Abrupt changes in the boundary shape reduce the processing efficiency under the condition that the image restoration error tends to zero.

Improving the accuracy of the boundary description requires reducing the image sampling grid interval and pre-filtering the image to eliminate distortion caused by noise in the data channel. This, in turn, reduces the data processing efficiency because the Freeman-Code becomes longer. Therefore, the practical application of the method should be based on a compromise between the efficiency of describing the geodata and the accuracy and complexity of the processing.

The proposed method provides a computational

effect in solving problems of recognizing technical

objects by shape.

4.1 Conclusion

The effective application of aerospace sensing

systems and geographic information systems to

solve various monitoring tasks requires the

introduction of more productive methods of digital

image processing. In this direction, an approach

making it possible to solve problems related to the

analysis of testing ground structures with less time

and computational costs was described. Based on

the research performed, the following conclusions

can be drawn:

1. The features of the algorithm for the point

numerical encoding of images of testing ground

areas make it possible to use relatively simple

encoding algorithms based on the Freeman code and

entropy approach for compression of testing

grounds.

2. The considered method based on the union of

the three encoding algorithms ensures the

acceleration of the process of information

transmission and processing and allows energy

saving.

3. The method can be used not only to describe

the shape of objects but also to solve the problems

of segmentation and classification of objects.

4. The application of the considered method for compressing information on the infrastructure of transport networks and utility lines makes it possible to reduce the redundancy of the original spatial data concerning such objects.

5. Shortening the time required for information transmission improves the reliability of the system from the information security standpoint and reduces the probability of the information being intercepted by an attacker.

6. The considered computational algorithm can

be used to quantify such geometric parameters as

the length of the boundary, contour, and area of the

object of observation.

7. The approach can be used in medicine to

increase the trustworthiness of clinical and

morphological diagnosis of diseases using a medical

image analysis system.

8. The considered data processing method is

relatively simple to implement with the use of

modern programming languages.

9. Further theoretical and experimental research in the field of representation and description of geospatial data can be aimed at constructing efficient computational algorithms that use coordinate transformations on the Euclidean plane, [12]. For example, the coefficients of the discrete cosine transform can be used in this case as descriptors of the data source. In contrast to the considered method of data processing in the spatial domain, a computational and temporal gain can also be expected from processing the geodata in the spatial frequency domain.

References:

[1] Zhu, X. GIS for Environmental Applications: A Practical Approach, London, Routledge, 2016.

[2] de Lange, N. Geoinformatics in Theory and Practice: An Integrated Approach to Geoinformation Systems, Remote Sensing and Digital Image Processing, Berlin, Heidelberg, Springer Spektrum, 2020.

[3] Longley, P. Geographic Information Science & Systems, NJ, Wiley, 2015.

[4] Gonzalez, R. C., Woods, R. E. Digital Image Processing, 4th ed., New Jersey, Prentice Hall, 2018.

[5] Jahne, B. Digital Image Processing: Concepts, Algorithms, and Scientific Applications, Heidelberg, Springer-Verlag, 2013.

[6] Mitsiukhin, A. I., Konopelko, V. K. Description of the Binary Image Outline of the Object of Interest. Eighth Belarusian Space Congress, October 25-27, 2022, Minsk: Proceedings of the Congress in 2 Vol., Minsk, OIPI NAS Belarus, Vol. 1, 2022, pp. 250-253, ISBN: 978-985-7198-10-8, ISBN: 978-985-7198-11-5, Vol. 2 (Митюхин, А. И., Конопелько, В. К. Описание контура бинарного изображения объекта интереса. Восьмой Белорусский космический конгресс, 25-27 октября 2022 года, Минск: материалы конгресса в 2 Т., Минск, ОИПИ НАН Беларуси, Т. 1, 2022, с. 250-253, ISBN: 978-985-7198-10-8, ISBN: 978-985-7198-09-2, Т. 2).

[7] Cover, T. M., Thomas, J. A. Elements of Information Theory, John Wiley & Sons, Inc., 2012.

[8] Yan, L., Zhao, H., Lin, Y., Sun, Y. Digital Image Compression. In: Math Physics Foundation of Advanced Remote Sensing Digital Image Processing, Springer, Singapore, 2023.

[9] Gray, R. M. Entropy and Information, Springer-Verlag, New York, 2023.

[10] Burger, W., Burge, M. J. Digital Image Processing, London, Springer, 2016.

[11] Sundararajan, D. Digital Image Processing: A Signal Processing and Algorithmic Approach, Springer Singapore, 2017.

[12] Pearlman, W. A., Said, A. Digital Signal Compression: Principles and Practice, Cambridge University Press, 2011.

Contribution of Individual Authors to the

Creation of a Scientific Article (Ghostwriting

Policy)

The author contributed to the present research at all stages, from the formulation of the problem to the final findings and solution.

Sources of Funding for Research Presented in a

Scientific Article or Scientific Article Itself

No funding was received for conducting this study.

Conflict of Interest

The author has no conflicts of interest to declare.

Creative Commons Attribution License 4.0

(Attribution 4.0 International, CC BY 4.0)

This article is published under the terms of the

Creative Commons Attribution License 4.0

https://creativecommons.org/licenses/by/4.0/deed.en_US
