Shape and Geometric Features-based Semantic Image Retrieval Using Multi-class Support Vector Machine

In this paper, a new approach to semantic image retrieval is proposed, based on shape and geometric features of the image in conjunction with a multi-class support vector machine. Zernike moments serve as the shape feature and provide invariance for objects in silhouette images. In addition, a set of geometric features describes the object's shape using two measures, rectangularity and circularity. The extracted features are then normalized and fed to the multi-class support vector machine for both the learning and the retrieving processes. The retrieving process relies on three main tasks, namely the Query Engine, the Matching Module and the Ontology Manager. The Query Engine builds the input text or image query using the SPARQL language. The Matching Module extracts the shape and geometric features of the image's objects and passes them to the Ontology Manager, which in turn inserts them into the ontology knowledge base. Experiments on a benchmark mammal dataset were conducted to empirically assess the proposed approach. Our experiments on text and image retrieval yield efficient results on this challenging problem, improving on those previously reported.


INTRODUCTION
Image retrieval is a method for searching, browsing and retrieving images from a large database of digital images. Over the last decade, researchers have focused on low-level vision, for which the retrieval accuracy still falls short of user expectations. Interactive techniques have therefore appeared in recent years to narrow the large gap between low-level and high-level concepts [1], [2], [3], [4]. Low-level features such as color, texture and shape are mostly used to interpret the meaning of images. In contrast, high-level techniques retrieve patterns by scanning the whole image. Semantic digital image retrieval is one of the promising research fields, in which several researchers have considered approaches either for image analysis [5], [6], [7], [8] or for image retrieval [9], [10], [11], [12]. Most image retrieval techniques have used text metadata that relies on a textual description of the image [13], [14], [15]. Currently, the retrieval of semantic images depends on keyword-based search. However, there are still numerous problems in search engines such as Google due to the absence of stored keys and relationships among web images [16].
In the natural image domain, images are stored together with their features in a semantic manner to construct an ontological data set. The accessible images, with their ontological structure, are then retrieved when the pictorial features of an image are supplied as input to a SPARQL query. To learn these pictorial features, various classifier techniques are used, and images are then retrieved using the features of the input query image. Only a few studies have addressed image retrieval with respect to content similarity. A small number of experimental prototype systems have been developed, such as QBIC [17], Photobook [18], Netra [19], Virage [20], SIMPLIcity [21] and VisualSEEk [22]. Furthermore, comprehensive surveys of content-based image retrieval (CBIR) are given in [23], [24]. S.K. Chang and S.H. Liu presented an approach to index and retrieve images with respect to a proposed database retrieval [25]. The images are indexed using their pictorial content, such as color, texture and shape.
Existing methods retrieve images on either a content basis or a text basis. The essential difference between content-based and text-based retrieval is that human interaction (HI) is an indispensable part of the latter. Humans tend to use text keywords and descriptors as high-level features to interpret images and measure their similarity, whereas most features extracted automatically by computer vision techniques are low-level features such as color, texture, shape and spatial layout. In general, there is no direct link between high-level concepts and low-level features [26].
The main contribution of this paper is a new approach to retrieve semantic images based on shape and geometric features of the image in conjunction with a multi-class support vector machine. A set of geometric features describes the object's shape using two measures, rectangularity and circularity. The retrieving process relies on the Query Engine, the Matching Module and the Ontology Manager. The Query Engine builds the input text or image query using the SPARQL language. The Matching Module employs the shape and geometric features of the image's objects, and the Ontology Manager constructs the ontology knowledge base. Our experiments on text and image retrieval for a benchmark mammal dataset yield efficient results, and the approach remains effective when applied to a large number of test images. The paper is organized as follows. Section 2 gives a detailed description of the shape and geometric features and of the support vector machine. Section 3 details the proposed approach. The experimental results are discussed in Section 4, and Section 5 concludes the paper.

RELATED LITERATURE
The success of semantic image retrieval depends on a good choice of classifier, in addition to feature extraction that performs a robust view-invariant task. The motivation behind this section is to understand and investigate recent techniques that are capable of real-time application. The following subsections briefly review the shape and geometric features as well as the multi-class support vector machine classifier.

Shape Feature: Zernike Moments
Zernike moments are an orthogonal set of rotation-invariant moments that provide invariance for objects in silhouette images [27]. Translation and scale invariance are achieved through moment normalization. Mathematically, the complex Zernike moment $Z_{pq}$ of order $p$ and repetition $q$ for the intensity function $f(\rho, \theta)$ of a given image is defined in Eq. (1), where $\lambda_N$ is the normalization factor, $p$ is a non-negative integer, and $q$ is a positive or negative integer subject to the constraints that $p - |q|$ is even and $|q| \le p$. With respect to translation and scale, the function $f$ is normalized using the centroid of the silhouette image and the scale factor $a$. $R_{pq}(\rho)$ is a radial polynomial [27]. Thus, the Zernike moment features that are invariant to scale, rotation and translation of the shape (i.e., moment invariants, as with Hu's invariant moments) are given by $G_z = [z_{00}, z_{11}, z_{22}]$. Experimentally, the percentage error of the feature invariants is smaller than 0.5%.
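The moment definition above can be illustrated with a short Python sketch. It assumes the commonly used discrete form, $Z_{pq} = \frac{p+1}{\lambda_N} \sum f(\rho, \theta) R_{pq}(\rho) e^{-jq\theta}$, with $\lambda_N$ taken as the number of pixels inside the unit disk. That convention, the function names and the square-image assumption are ours and are not confirmed by the paper. The centroid and scale normalization of $f$ is omitted for brevity.

```python
import math
import numpy as np


def radial_polynomial(p, q, rho):
    """Zernike radial polynomial R_pq(rho), valid for p - |q| even and |q| <= p."""
    q = abs(q)
    result = np.zeros_like(rho, dtype=float)
    for s in range((p - q) // 2 + 1):
        coeff = ((-1) ** s * math.factorial(p - s)
                 / (math.factorial(s)
                    * math.factorial((p + q) // 2 - s)
                    * math.factorial((p - q) // 2 - s)))
        result += coeff * rho ** (p - 2 * s)
    return result


def zernike_moment(image, p, q):
    """Discrete Zernike moment Z_pq of a square image (assumed convention)."""
    if (p - abs(q)) % 2 != 0 or abs(q) > p:
        raise ValueError("require p - |q| even and |q| <= p")
    n = image.shape[0]
    # Map pixel centres onto [-1, 1] x [-1, 1]; only the unit disk is used.
    coords = (2.0 * np.arange(n) + 1.0 - n) / n
    x, y = np.meshgrid(coords, coords)
    rho = np.sqrt(x ** 2 + y ** 2)
    theta = np.arctan2(y, x)
    inside = rho <= 1.0
    lambda_n = inside.sum()  # assumed normalization: pixel count in the disk
    kernel = radial_polynomial(p, q, rho) * np.exp(-1j * q * theta)
    return (p + 1) / lambda_n * np.sum(image[inside] * kernel[inside])


def shape_features(image):
    """Magnitudes of z00, z11 and z22, mirroring G_z = [z00, z11, z22]."""
    return [abs(zernike_moment(image, p, p)) for p in (0, 1, 2)]
```

Taking the magnitude of each moment is what makes the features insensitive to rotation, since rotating the image only changes the phase of $Z_{pq}$.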

Geometric Features
The object's shape is explored geometrically using two standard descriptors, how circle-like it is (circularity, Cir) and how rectangle-like it is (rectangularity, Rect), as in Eq. (2). These features vary from one object to another and therefore help to discriminate between objects correctly.

Circularity
Circularity measures how close the shape of an object is to a circle. The feature is defined so that a perfect circle has a circularity of one; in general, circularity ranges from 1 to infinity. It is denoted by Cir and computed as follows:
$$\mathrm{Cir} = \frac{\mathrm{Perimeter}^2}{4\pi \cdot \mathrm{Area}} \qquad (3)$$
where Perimeter is the length of the object's contour and Area is the total number of pixels of the object.
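As a quick numerical illustration of the circularity measure, the following Python sketch assumes the common definition $\mathrm{Cir} = \mathrm{Perimeter}^2 / (4\pi \cdot \mathrm{Area})$, which equals one for an ideal circle and grows for less compact shapes. The function name and the example shapes are hypothetical.

```python
import math


def circularity(perimeter, area):
    """Cir = Perimeter^2 / (4 * pi * Area); about 1 for an ideal circle."""
    if area <= 0:
        raise ValueError("area must be positive")
    return perimeter ** 2 / (4.0 * math.pi * area)


# Ideal circle of radius 5: the result is approximately 1.
circle = circularity(2 * math.pi * 5, math.pi * 5 ** 2)

# Square of side 10: the result is 4 / pi, i.e. larger than 1.
square = circularity(4 * 10, 10 * 10)
```

On a pixel grid the measured perimeter and area are only approximations, so real silhouettes of circles will give values slightly different from one.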

Rectangularity
Rectangularity measures how close the shape of an object is to a rectangle. We compute the orientation of the target object from the central moments of all contour points. The length $l$ and width $w$ are calculated as the difference between the largest and the smallest orientation in a rotation. Rect equals 1 for a perfect rectangle and ranges from 0.5 to infinity (Eq. 4), where Area is the total number of pixels of the object's contour. Because the geometric features are unbounded in range, a scaling problem arises. It is alleviated by normalizing the feature vector as follows:
$$\mathrm{Cir}_{nom} = \frac{\mathrm{Cir} - \mathrm{minCir}}{\mathrm{maxCir} - \mathrm{minCir}} \qquad (5)$$
where maxCir and minCir are the maximum and minimum circularity of the object region, respectively. Rectangularity is normalized in the same way. Finally, the shape and geometric features are combined into the vector $F_{img}$ (Eq. 6).
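The normalization and the assembly of the combined feature vector can be sketched in Python as follows. This is a minimal illustration under stated assumptions: min-max scaling is applied to each geometric feature separately, and the ordering of the entries in the combined vector (Zernike invariants first, then normalized circularity and rectangularity) is our guess and is not confirmed by the paper. All function names are hypothetical.

```python
import numpy as np


def min_max_normalize(values):
    """Scale a 1-D array of raw feature values into the range [0, 1]."""
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    if hi == lo:
        # Constant feature: return zeros rather than divide by zero.
        return np.zeros_like(values)
    return (values - lo) / (hi - lo)


def build_feature_vector(zernike, cir_nom, rect_nom):
    """Concatenate Zernike invariants with the normalized geometric features."""
    return np.concatenate([np.asarray(zernike, dtype=float), [cir_nom, rect_nom]])


# Hypothetical raw circularity values for three objects.
cir_nom = min_max_normalize([1.0, 2.0, 3.0])
f_img = build_feature_vector([0.30, 0.10, 0.05], cir_nom[1], 0.2)
```

Scaling both geometric features into the same bounded range keeps an unbounded feature from dominating the distance computations inside the classifier.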

Support Vector Machine
Training a multi-class model raises many problems, which calls for a careful choice of classifier so that the retrieving process is carried out correctly. Here, the features extracted from one frame are assigned to one class. The motivation for using Support Vector Machines (SVMs) is their high accuracy and their outstanding generalization capability. Moreover, SVMs follow the structural risk minimization principle and alleviate the data over-fitting problem found in neural networks [28]. Furthermore, SVMs readily separate two-class (dichotomous) problems, notably in a higher-dimensional space. An SVM builds a maximal separating hyperplane by maximizing the distance between two corresponding parallel hyperplanes, as in Figure 1. The larger this distance (the margin), the better the separation and the lower the generalization error of the classifier.
Mathematically, given a learning dataset, Vapnik et al. [28] formulated the learning problem by allowing some examples to violate their margin constraints. To this end, slack variables are introduced to express the potential violations, together with a penalty parameter that discourages margin violations. The linear function learned by the SVM is
$$f(x) = w \cdot x + b \qquad (7)$$
where $w$ is the weight vector, $x$ is the input sample and $b$ is the threshold. To characterize the minimum distance between the hyperplane and the support vectors, the margin of the separating hyperplane found by the trained SVM is maximized. The SVM formulation is given in Eq. (8), one of whose parameters denotes the hyperplane margin. Figure 1 shows the maximum-margin hyperplane, while Figure 2 shows how the input data are mapped by the SVM to a high-dimensional space in which they become linearly separable. The mapping does not increase the learning time, since it only involves dot products, which are evaluated with the kernel trick. When the number of extracted features is large, the SVM remains a good classifier and is robust to the curse of dimensionality.
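The roles of the weight vector, the threshold, the margin and the slack variables can be made concrete with a small Python sketch. It assumes the textbook soft-margin conventions, namely a decision value $f(x) = w \cdot x + b$, supporting hyperplanes at $f(x) = \pm 1$ (so the margin width is $2 / \lVert w \rVert$) and slack $\xi = \max(0, 1 - y f(x))$ for a label $y \in \{-1, +1\}$. These conventions, the function names and the numerical values are assumptions for illustration and are not taken from the paper.

```python
import numpy as np


def decision_value(w, b, x):
    """Linear SVM decision value f(x) = w . x + b."""
    return float(np.dot(w, x) + b)


def margin_width(w):
    """Distance between the supporting hyperplanes w . x + b = +1 and -1."""
    return 2.0 / float(np.linalg.norm(w))


def slack(w, b, x, y):
    """Slack xi = max(0, 1 - y * f(x)); positive when the margin is violated."""
    return max(0.0, 1.0 - y * decision_value(w, b, x))


# Hypothetical trained parameters and two positive samples.
w = np.array([3.0, 4.0])
b = -1.0
safe = slack(w, b, np.array([1.0, 0.0]), +1)      # outside the margin
violating = slack(w, b, np.array([0.5, 0.0]), +1)  # inside the margin
```

A sample with zero slack lies on or beyond its supporting hyperplane, while a positive slack measures how far the sample intrudes into the margin, which is the quantity the penalty parameter discourages.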
One advantage of SVMs is that they realize regression optimization throughout the learning and testing operations. In addition, the SVM structure can be characterized in terms of its margin, kernel type and duality properties. Moreover, problems such as local minima and non-linearity can be handled by SVMs, which discriminate between the classes and separate them correctly.

[Figure 1 labels: Slack, Margin, Origin]

The decision of the SVMs relies on a relevance score: the class with the highest score among all classes is selected. The retrieved images can be graded using a fuzzy membership that determines the relevance score, which serves as a measure for building a graded ground truth [28]. The ground truth is constructed from binary labels without manual effort. The fuzzy score is the posterior probability of either the negative or the positive class. It is obtained by fitting a sigmoid function to the decision value $f(x_s)$ computed for each learning sample $x_s$ (Eq. 9). Note that the two sigmoid parameters $a$ and $b$ are adapted to the learning dataset. The learning samples, whether negative or positive, are also sorted according to their fuzzy relevance scores. Here, a multi-SVM classifier is used to handle the multi-class problem, in which the classes are associated with their increasing relevance scores. The motivation for the multi-SVM algorithm is to reduce the time complexity by constructing binary SVMs. Each binary classification can thus be extended to the multi-class case easily using the relevance scores $\sigma_s$.
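The relevance-score decision can be sketched in Python as follows. The sketch assumes a Platt-style sigmoid, $\sigma_s = 1 / (1 + \exp(a f(x_s) + b))$, and a single shared pair of parameters $a$ and $b$ for all binary SVMs. Both choices are assumptions, since the exact form of Eq. 9 and the per-class fitting are not recoverable from the text. The function names, class labels and numerical values are hypothetical.

```python
import math


def relevance_score(decision_value, a, b):
    """Assumed sigmoid fit: 1 / (1 + exp(a * f(x) + b))."""
    return 1.0 / (1.0 + math.exp(a * decision_value + b))


def predict_class(decision_values, a, b):
    """Return the class whose binary SVM yields the highest relevance score."""
    scores = {label: relevance_score(f, a, b)
              for label, f in decision_values.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical decision values from three one-vs-rest binary SVMs.
outputs = {"Lion": -0.8, "Horse": 1.5, "Deer": 0.2}
label, scores = predict_class(outputs, a=-1.0, b=0.0)
```

With a negative $a$, a larger decision value maps to a score closer to one, so the scores of the binary SVMs become comparable and the multi-class decision reduces to picking the maximum.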

PROPOSED APPROACH
The proposed approach consists of two major processes, learning and retrieving, as in Figure 3. In the learning process, the shape and geometric features are normalized and fed to the SVMs to construct the ontology knowledge base of the mammal dataset. The retrieving process performs Semantic Image Retrieval (SIR) based on the Query Engine, Matching Module and Ontology Manager processes.

Query Engine
From Figure 3, the first process of semantic image retrieval is the Query Engine (QE). Two kinds of input can be used: text-based or image-based. In the text-based method, the user performs the search through the SIR system interface by entering text. This process is commonly supported by current search engines such as Yahoo, AltaVista, Google and Bing. The main motivation is to give users the possibility to learn and interact with the SIR interface automatically. For example, the user enters a text query such as Giraffe, Deer, Ox or Tiger. The query is then passed directly to the QE as a text-based input, and the QE is responsible for building the input text query. The second SIR input is the image-based method, in which the input image contains objects and some optional description. The key motivation is to open a new dimension for flexible searching. The QE constructs the query in the SPARQL language from the input image with reference to the ontology knowledge base. Here, the object features are extracted using Zernike moments, circularity and rectangularity, and are then converted into high-level ontology features. After that, SPARQL generates the object parameters. In short, the high-level ontology features extracted from the semantic image are used as follows:

Query: Find the image of mammals with
SPARQL form: SELECT ?x ?y WHERE { ?y rdfs:subClassOf :mammals . ?x :

Matching Module
The second process of semantic image retrieval is the Matching Module (MM), as in Figure 3. The input of this process is the SPARQL query produced by the QE, which is matched against the ontology knowledge base; in case of a successful search, the output is passed to the Ontology Manager process. If the relevant images cannot be retrieved, the matching process carries out three major steps. First, the MM searches for relevant images using a web search engine such as Google, and the images obtained are passed to the processing module for content verification (Figure 3). Next, the obtained images are checked for relevance to the user query and are verified based on the Zernike moment, circularity and rectangularity features, which are converted into high-level ontology features. Finally, the SPARQL query is built based on the high-level ontology features and the ontology knowledge base. The retrieved images are considered relevant when they match the user's search query; otherwise they are discarded.

Ontology Manager
The final process of the retrieving approach is the Ontology Manager (OM), which carries out the tasks of filtering, insertion and ranking. The OM filters the relevant images obtained using the properties, classes and instances of the ontology knowledge base. It then inserts the resulting images into the ontology knowledge base, which is constructed using semantic descriptions of the retrieved images. Finally, the ranking task ranks the images according to a matching value, calculated as a summation over the matched ontology features and the user query reference. The resulting images are sorted in descending order of matching value, and the highest-ranked images are returned in response to the user request. For further details, the reader is referred to [27].
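The ranking task can be illustrated with a minimal Python sketch. It assumes that the matching value of an image is simply the number of its ontology features that also appear in the user query; the paper only states that the value is a summation over matched features, so this counting rule is an assumption. The function name, the image identifiers and the feature labels are hypothetical.

```python
def rank_images(candidates, query_features, top_k=10):
    """Sort images by matching value (descending) and keep the top_k."""
    scored = []
    for name, features in candidates.items():
        # Assumed matching value: count of features shared with the query.
        matching_value = sum(1 for f in features if f in query_features)
        scored.append((name, matching_value))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical ontology features attached to three filtered images.
candidates = {
    "img1": {"mammal", "horse"},
    "img2": {"mammal"},
    "img3": {"mammal", "horse", "brown"},
}
ranking = rank_images(candidates, {"mammal", "horse", "brown"}, top_k=2)
```

Because Python's sort is stable, images with equal matching values keep their original order, which gives a deterministic ranking without an extra tie-breaking rule.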

EXPERIMENTAL RESULTS
To evaluate our work, we built a data set of representative mammals. The ontology knowledge base of this data set contains 50 images for each mammal, and there are 25 different mammals, such as Lion, Horse and Deer. Figure 4 depicts a small sample of images from the mammal dataset. To obtain an unbiased estimate for the SVM classifier, we divided the dataset into two thirds for learning and one third for testing. Note that the samples used for learning are completely different from those used for testing. To retrieve relevant semantic images by either text or image, the system interface (SIR) was implemented in MATLAB (Figure 5). The SIR system gives promising results when retrieving semantic images for the tested input images or input texts. In Figure 5, the user operates the SIR interface by browsing for a horse image as input. The QE then constructs the image query in conjunction with the ontology knowledge base. When the response of the SIR is positive, the web image search is skipped and the filtering and ontology processes are carried out directly. The OM ranks the resulting semantic images and shows the top ten images in descending order. If the SIR response is negative, the search takes place among web images, and the filtering process and the process of updating the ontology contents are then carried out to provide the result.
Additionally, relevant images can be retrieved semantically using text as input. In Figure 6, the user operates the SIR interface by typing 'Deer' in the text query field of the SIR system interface. The QE then constructs the corresponding query for 'Deer'. The relevant Deer images are then shown and marked according to the ontology knowledge base and the OM. We use Precision and Recall to evaluate the proposed system; these metrics are commonly used in the information retrieval (IR) community. Precision measures the ability of the system to retrieve only relevant images from all images in the ontology (Eq. 10); it is the fraction of retrieved images that are true positives. On its own, Precision does not give complete information about system performance, since it does not account for all the images that should have been retrieved. Recall (Eq. 11) is the number of relevant images retrieved relative to the total number of relevant images. Note that Recall is sensitive to false negatives and does not consider the irrelevant images that were retrieved. In Figure 7, the Recall and the average Precision of the proposed system are displayed, where the x-axis corresponds to the 415 test images from the 25 classes, and the y-axis gives the average Precision and the Recall, computed using Eq. 10 and Eq. 11, respectively.
$$\mathrm{Precision} = \frac{\text{number of relevant images retrieved}}{\text{total number of images retrieved}} \qquad (10)$$
$$\mathrm{Recall} = \frac{\text{number of relevant images retrieved}}{\text{total number of relevant images}} \qquad (11)$$
From Figure 7, it is observed that the value of Precision/Recall is in the range of 0.94/0.41. Furthermore, the higher values of Precision/Recall lie in the ranges (0.08 over 0.41)/(0.33 over 1.0), respectively. The system can retrieve semantic images with 96.33% retrieval accuracy. To measure the effectiveness of the proposed system, the results obtained have been compared with our previous work, as illustrated in Table 1. In this comparison, our system using the multi-class SVM performs competitively with [27], and its results compare favorably. In addition, both systems were evaluated on a similar dataset with similar experimental setups, so the comparison is meaningful.
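The Precision and Recall measures used in this evaluation can be illustrated with a short Python sketch that assumes the standard information retrieval definitions: Precision is the share of retrieved images that are relevant, and Recall is the share of relevant images that are retrieved. The function name and the image identifiers are hypothetical, and the numbers are not taken from the experiments.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant images that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: four images returned, three images truly relevant.
p, r = precision_recall(["a", "b", "c", "d"], ["a", "b", "e"])
```

In this example two of the four returned images are relevant and two of the three relevant images are found, so Precision is 0.5 and Recall is about 0.67. Averaging the per-query values over all test queries gives curves of the kind described for Figure 7.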

SUMMARY AND CONCLUSION
In this paper, a new approach to semantic image retrieval based on a multi-class support vector machine is proposed. The Zernike moment shape feature provides invariance for objects in silhouette images. Additionally, a set of geometric features describes the object's shape using two measures, rectangularity and circularity. The extracted features are then normalized and passed to the multi-class SVM for both the learning and the retrieving processes. The retrieving process is based on three key tasks: the Query Engine, the Matching Module and the Ontology Manager. Experiments on a benchmark of 25 different mammals with 50 images each were conducted to empirically assess the SIR system. Our experiments on text and image retrieval yield efficient results, with a retrieval accuracy of 96.33%, which compares favorably with previously reported results.

Figure 2: Mapping Task From Complex Low To Simple High Dimensions.

Figure 3: Text and Semantic Image Retrieval Approach.

Figure 4: Sample Learning Images Of Mammal's Database.

Figure 5: Retrieving The Relevant Horse Images Relied On Ontology Base.

Figure 6: Retrieving Relevant Deer Images By The Input 'Deer' In Text Query Bottom.

Table 1: Comparison With Our Previous Work.