# Fast Classification Of Handwritten On Line Arabic Characters

In this work, we present a fast and accurate classification technique for Arabic characters.
Figure~\ref{fig:letters_classifier_learning_flow} gives a high-level overview of the flow of the classification system.
The classifier receives a sequence of points $S=\{p_{i}\}_{i=1}^{n}$ representing the letter trajectory and a letter position $\phi \in \{Ini, Mid, Fin, Iso\}$.
As with the training character trajectory sequences, the query sequence goes through five stages: preprocessing, feature extraction, embedding, dimensionality reduction, and classification.
The simplification step reduces the number of vertices in a piecewise linear curve, given a pre-set tolerance parameter $\varepsilon$, and outputs a simplified curve that consists of a subset of the points defining the original curve.
In this work, the tolerance parameter $\varepsilon$ was empirically set to $\frac{1}{75}$.
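
The simplification step can be illustrated with the Ramer-Douglas-Peucker algorithm, a common method that matches the description above. The text does not name the algorithm actually used, so this choice and the function names \texttt{simplify} and \texttt{point\_segment\_distance} are assumptions made for illustration. A minimal Python sketch:

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, eps):
    """Ramer-Douglas-Peucker (assumed algorithm): returns a subset of
    `points` whose polyline stays within `eps` of the original curve."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    # Find the interior point farthest from the chord first-last.
    idx, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > dmax:
            idx, dmax = i, d
    if dmax <= eps:
        # Every interior point is within tolerance: keep only the endpoints.
        return [first, last]
    # Otherwise keep the farthest point and simplify both halves.
    left = simplify(points[:idx + 1], eps)
    right = simplify(points[idx:], eps)
    return left[:-1] + right  # drop the duplicated split point


stroke = [(0.0, 0.0), (0.5, 0.001), (1.0, 0.0), (1.0, 1.0)]
simplified = simplify(stroke, 1 / 75)
```

In this example the nearly collinear point $(0.5, 0.001)$ lies within $\varepsilon=\frac{1}{75}$ of its neighbours' chord and is removed, while the corner at $(1, 0)$ is kept. A fixed tolerance of this size is only meaningful if the coordinates have been normalized to a common scale beforehand, which is assumed here.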

The simplification process produces a highly angular and non-uniform distribution of points along the stroke trajectory.
The re-sampling step uses spline interpolation to produce an equidistant, smoothed data sequence, given a target number of re-sampled points $R$, which was set to 40.
Given a stroke $S=\{(x_i,y_i)\}_{i=1}^{n}$, let $f_{x}(d)$ and $f_{y}(d)$ be the piecewise quadratic interpolation functions of $\{x_i\}_{i=1}^{n}$ and $\{y_i\}_{i=1}^{n}$, respectively.
$f_{x}(d)$ and $f_{y}(d)$ give the coordinate values as functions of the arc-length distance $d$ from the pattern's starting point.
Let $t_i=i\frac{L}{R}$ for $i=0,\ldots,R$, where $L$ is the arc-length of the pattern.
The re-sampled sequence is given as follows:

\[
\widehat{S}=\{(f_x(t_i),f_y(t_i))\}_{i=0}^{R}
\]
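
The arc-length re-sampling can be sketched in Python as follows. For brevity, this sketch swaps in piecewise \emph{linear} interpolation for the quadratic interpolation functions $f_x$ and $f_y$ described above, so it reproduces the equidistant sampling but not the smoothing. It samples at every $t_i=i\frac{L}{R}$ for $i=0,\ldots,R$, which yields $R+1$ points including both endpoints. The function name \texttt{resample} is hypothetical.

```python
import math
from bisect import bisect_right


def resample(points, R=40):
    """Sample a stroke at arc lengths t_i = i * L / R, i = 0..R.

    Assumption: linear interpolation stands in for the quadratic
    interpolation described in the text. Expects a non-empty stroke.
    """
    # Cumulative arc length d[k] from the starting point to points[k].
    d = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        d.append(d[-1] + math.hypot(x1 - x0, y1 - y0))
    L = d[-1]
    if L == 0.0:
        # Degenerate stroke (a single location): repeat it.
        return [points[0]] * (R + 1)

    out = []
    for i in range(R + 1):
        t = i * L / R
        # Locate the segment k with d[k] <= t <= d[k + 1].
        k = min(bisect_right(d, t) - 1, len(points) - 2)
        seg = d[k + 1] - d[k]
        u = (t - d[k]) / seg if seg > 0 else 0.0
        (x0, y0), (x1, y1) = points[k], points[k + 1]
        out.append((x0 + u * (x1 - x0), y0 + u * (y1 - y0)))
    return out


# An L-shaped stroke of total arc length 2, sampled every 0.5 units.
samples = resample([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)], R=4)
```

Parameterizing by arc length rather than by point index is what makes the output equidistant, regardless of how unevenly the simplified input points are spread along the stroke.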

Figure \ref{fig:before_after_preprocessing} visually demonstrates the resulting sequence after applying each step in the preprocessing stage on a trajectory sequence of the letter \RL{b}.

\begin{figure}
\centering
\subfloat[]{
\label{fig:preprocessing_orig}
\includegraphics[width=0.4\columnwidth]{./figures/preprocessing_orig}
}
\subfloat[]{
\label{fig:preprocessing_norm}
\includegraphics[width=0.4\columnwidth]{./figures/preprocessing_norm}
}\\
\subfloat[]{
\label{fig:preprocessing_simpl}
\includegraphics[width=0.4\columnwidth]{./figures/preprocessing_simpl}
}
\subfloat[]{
\label{fig:preprocessing_resamp}
\includegraphics[width=0.4\columnwidth]{./figures/preprocessing_resamp}
}
\caption{A sample of the letter \RL{b} before preprocessing (a); after normalization (b); after noise elimination (c) and after re-sampling (d).}
\label{fig:before_after_preprocessing}
\end{figure}

\subsection{Feature Extraction}
Two shape descriptors were employed in this work: the \emph{Shape Context} \cite{belongie2002shape} and the \emph{Multi Angular Descriptor} (MAD) \cite{saabni2013multi}.
The shape context descriptor is a point-matching approach that describes a shape in a way that allows the similarity between shapes to be measured.
It has been shown to be an effective feature for shape matching.
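
To make the idea concrete, the following Python sketch computes a shape context in its commonly described form: for each point, a log-polar histogram of the relative positions of all other points. The bin counts (5 radial, 12 angular), the radial range of $\frac{1}{8}$ to $2$ mean distances, the normalization by mean pairwise distance, and the clamping of out-of-range radii are assumptions chosen for illustration, and may not match the configuration used in this work.

```python
import math


def shape_context(points, n_r=5, n_theta=12):
    """One log-polar histogram (n_r x n_theta) per point.

    Assumes at least two distinct points. All parameters are
    illustrative assumptions, not values taken from the text.
    """
    # Mean pairwise distance, used to normalize radii.
    dists = [math.hypot(qx - px, qy - py)
             for i, (px, py) in enumerate(points)
             for j, (qx, qy) in enumerate(points) if i != j]
    mean_d = sum(dists) / len(dists)

    # Log-spaced radial range, in units of the mean distance.
    lo, hi = math.log(0.125), math.log(2.0)

    hists = []
    for i, (px, py) in enumerate(points):
        h = [[0] * n_theta for _ in range(n_r)]
        for j, (qx, qy) in enumerate(points):
            if i == j:
                continue
            r = math.hypot(qx - px, qy - py) / mean_d
            if r <= 0.0:
                continue  # coincident point: no direction defined
            # Radial bin on a log scale; out-of-range radii are clamped
            # to the first or last bin (a simplification).
            r_bin = int((math.log(r) - lo) / (hi - lo) * n_r)
            r_bin = max(0, min(n_r - 1, r_bin))
            # Angular bin over [0, 2*pi).
            theta = math.atan2(qy - py, qx - px) % (2 * math.pi)
            t_bin = min(n_theta - 1, int(theta / (2 * math.pi) * n_theta))
            h[r_bin][t_bin] += 1
        hists.append(h)
    return hists


square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
descriptors = shape_context(square)
```

Because each histogram uses positions relative to its reference point and radii divided by the mean pairwise distance, this variant is unchanged when the shape is translated or uniformly scaled. Two shapes can then be compared by matching points whose histograms are similar.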

MAD captures the angular view of the shape from multi-resolution rings at different heights.
The shape is treated as a two-dimensional set of points, and the rings act as upper viewpoints, placed around the shape centroid at different sizes and heights.
To enable scale and translation invariance, the sizes and heights of these rings are calculated using the...

