Data-Driven Automated Detection of Autism Spectrum Disorder Using Activity Analysis: A Review

Background/Introduction: Autism Spectrum Disorder (ASD) is a neuro-developmental disorder that limits social interactions, cognitive skills, and abilities. Since ASD can last throughout an affected person's entire life, diagnosis at early onset can yield a significant positive impact. The current medical diagnostic systems (e.g., DSM-5/ICD-10) are somewhat subjective and rely purely on the behavioral observation of symptoms; hence, some individuals are misdiagnosed or diagnosed late. Therefore, researchers have focused on developing data-driven automated diagnosis systems with less screening time, low cost, and improved accuracy while significantly reducing professional intervention.


Introduction
Autism Spectrum Disorder (ASD) is a complex developmental disability; signs and symptoms typically appear in early childhood and affect a child's ability to communicate verbally and non-verbally and to interact with others [120]. ASD is defined by a set of behaviors and is a "spectrum condition" that affects individuals in different ways and to varying degrees. Studies have found that, worldwide, 1 in 68 children is affected by ASD [44]. This estimate represents an average scenario, whereas developing countries have a higher number of autistic children than developed countries [38]. Based on epidemiological studies carried out over the last 50 years, the prevalence of ASD appears to be growing globally [71]. To date, no established research confirms a single cause of ASD [8]. Some available scientific research suggests various types of environmental influences and genetic factors [51]. However, increased awareness, early intervention, and access to appropriate services lead to noticeably improved outcomes [127].
It is worth noting that the health-care needs of children with ASD are complex and require a range of integrated services, including health promotion, care, rehabilitation services, and collaboration with other sectors such as education, employment, and social care [43]. Therefore, early detection of ASD is the first and foremost need to ensure their right to be treated with more attention and special care [63]. State-of-the-art ASD detection techniques need expert physician knowledge in this field and a significant amount of time for a single screening, as the whole process is manual and mostly depends on the cooperative behavior of children. Furthermore, very often, parents do not want to accept that their children may have any developmental disorder or abnormality and, therefore, do not take their children to a physician for a diagnosis. Considering all these, developing an automated system to detect ASD is an urgent need. Currently, medical professionals identify ASD patients by following the DSM-5 guidelines [2]. This costly and lengthy process proceeds through several tasks with the active participation of health professionals. It includes observing a patient's behavior and development, interviewing the parents, hearing and vision screening, and genetic and neurological testing. However, DSM-5 is capable of detecting autism only in children of at least four years of age [65]. This delayed detection of ASD can have adverse implications for children with impaired fundamental functional activities, conversational skills, behavior patterns, or activities that are most likely incumbent and repetitive. Appropriate developmental monitoring of children with ASD has the essential precondition of early detection, which is often considered to be within the first 18-24 months of the child's life [13].
In this regard, physicians often use the Modified Checklist for Autism in Toddlers (M-CHAT) [55], a screening test that determines whether a child is at risk of developing ASD. However, further evaluation is required to confirm ASD. According to recent research findings reported in [41], magnetic resonance imaging (MRI) identified 80% of babies who went on to be diagnosed with autism at the age of 2. Though this method allows a much earlier diagnosis of autism, capturing brain images using MRI of younger children or people with a high level of anxiety, hyperactivity, or sensitivity to noise can be very uncomfortable and stressful [103]. To mitigate the gaps in the existing ASD detection techniques, researchers have started looking for ASD screening abnormalities that strongly correlate with the outcome of medical guidelines. To this end, advances in Human Activity Analysis (HAA) [125,97] have already emerged as a potential tool [121] for developing an automated system for ASD detection. HAA can decipher body movement, gesture, or motion by analyzing sensor data to determine the human action in a scene [1]. An autistic child may show abnormal facial expressions [123,18], unusual behavior [84], repetitive actions [23], an atypical walking pattern [124], or an irregular salient region in an image [116]; researchers are analyzing these activities as classification cues for ASD [28]. A cheaper, flexible, and user-friendly system may be developed and used at educational institutions or child etiquette centers to detect children with ASD automatically. This is the ultimate usefulness of human activity analysis combined with computer vision and machine learning in autism research: producing accurate, data-driven, and robust computer-aided algorithms for the diagnostic system. Moreover, it helps doctors develop a quantitative tool to evaluate an autistic child's activity [5].

Related Surveys
Hyde et al. [45] provided an exhaustive overview of existing supervised learning techniques for ASD detection and algorithms for text analysis and classification. In addition, it covered a wide range of behavioral neuroimaging data, genetic data, and sound statistical approaches for mining ASD data, which is a beneficial tool for researchers. Thabtah [107] critically reviewed recent machine learning techniques with different evaluation methods and investigated some available datasets with respect to their feasibility and imbalance. Furthermore, it articulated the content mentioned above and provided a forward path that recommends using machine learning in ASD with regard to implementation, conceptualization, and data mining. Besides, Song et al. [96] presented a comprehensive survey on the use of artificial intelligence in screening ASD. It briefly reviewed existing ASD assessment techniques, facial expression and motor movement data analysis, and their results with different learning algorithms. In another work, Boucenna et al. [12] presented an informative review of information and communication technology (ICT) applications in the treatment of ASD and the scope of using robotic systems for the early development of imitation and joint attention in children with ASD. It analyzed state-of-the-art ICT applications such as interactive virtual environments, touchscreen-based interactive devices, avatar games, and telerehabilitation to treat individuals with ASD.

Contributions
To the best of our knowledge, this paper is the first review of ASD detection through activity analysis. This review is not limited to any particular learning method. Instead, we explore every possible approach to automated ASD detection through activity analysis, state-of-the-art research methods and algorithms, research challenges in this domain with their probable solutions, and existing resources for researchers. Furthermore, we investigate future application pathways and directions for further development. Therefore, this review can serve as a brief guideline for new researchers in this field.

Review Techniques
The existing ASD detection techniques focus substantially more on medical diagnostic approaches than on data-driven ones. This paper corroborates the effectiveness of data-driven HAA approaches for ASD detection using recent research studies conducted in this field. The relevant papers were compiled using appropriate keywords of HAA approaches associated with ASD detection, such as "repetitive behavior," "abnormal gait pattern," and "unusual visual attention." After extensively reviewing these papers and their reference literature, papers with high relevance to our topic were thoroughly studied for the review work. This was followed by analyzing papers related to the HAA system challenges in ASD detection and their probable solutions. This paper accumulates insights from 38 papers concerned with ASD detection using HAA. A detailed inspection of 6 publicly available datasets was added as well. This survey includes only peer-reviewed journal and conference papers to provide readers with knowledge of different perspectives on HAA techniques for ASD detection. The rest of the paper is structured as follows: Section 2 outlines the challenges of conducting research in this domain and the underlying hindrances to practical experimentation. Section 3 describes the state-of-the-art approaches that researchers currently use to develop automated systems for ASD screening. Some of the publicly accessible datasets with high relevance to our topic are discussed in Section 4. Section 5 sheds light on viable experimental environments for data collection. Section 6 is a concise overview of the findings of this review and future scopes with real-life implementation possibilities. Finally, we draw the conclusion in Section 7.

Challenges in this Domain
In many ways, classifying mental conditions is very different from diagnosing the more familiar physiological problems [108]. Besides, precisely detecting the gesture changes and movements of an autistic child with various sensors is more complicated, as they may behave unpredictably, with frequent chances of sudden seizures. In this section, we focus on some of the challenges that arise when conducting research in this field.

Unavailability of Dataset
A major challenge in conducting research in the fields related to autism is the unavailability of datasets. A dataset is considered to be a high-level tool for commencing study in any field [87]. However, working with sensitive subjects like autistic individuals and, more importantly, having them act as required to extract the data needed for a study is quite a big challenge. Even though experiments on people with ASD have been conducted, the datasets are seldom made publicly accessible. Most of the observations involving autistic children are funded by NGOs or research institutes that do not make the datasets public.

Non-homogeneous Symptoms in Subjects
Autism is regarded as a spectrum disorder, as the symptoms can appear in diverse forms ranging from mild to severe [68]. Due to its heterogeneous condition, each individual with autism has a unique profile of symptoms. Responses to a single action can vary widely among different individuals with autism. Even the responses from the same child with ASD can differ from time to time. One study observed that three typically developing pre-schoolers had unvarying responses to a block building task, whereas diverse responses were recorded from individuals with ASD [74]. It was also noticed that adding concrete reinforcements caused the response of an individual with autism to differ from the previous one [119]. Therefore, capturing a dataset that includes all the varying responses is undoubtedly a challenging task.

Complications in Lab Setup and Experimentation
Setting up equipment with accurate orientation and connectivity becomes more difficult because individuals with ASD have to interact with it [95]. Moreover, individuals with ASD may not be able to adapt to a laboratory environment, as their disorder is associated with difficulties in socializing. It is common for autistic children to have impairments in language and cognitive functions, which may increase the difficulty of responding to the instructions of the experiment director [70]. They may be very uncomfortable with certain scents, sounds, patterns, and tastes. Therefore, in many cases, an autistic child may panic at a sudden audio instruction, or the laboratory environment may differ unfavorably from their familiar surroundings [62]. Hence, the experimental setup should be designed and implemented carefully considering the group's physiology and psychology.

Data Annotation
Data annotation is a crucial task in machine-classified activity analysis. DSM-5 is a handbook for evaluating and classifying mental health conditions, including ASD [110]. Implementing DSM-5 in ASD detection needs a handful of skilled physicians, who may need a prolonged period to annotate the results. Sometimes expert DSM-5 screening results vary between physicians [77,106]. Differences in determining the severity level may also create difficulties in annotating the data. In addition, misclassifying 'late learners' as 'autistic individuals' is a common phenomenon in ASD detection [64]. So, annotating data in research like this is undoubtedly a challenging task with numerous possibilities for ambiguity.

Resource Constraints
Arranging a space for conducting experiments with autistic children is one of the main limitations for realizing ASD detection using activity analysis. The machine-based classifier makes its decisions by analyzing data from a diverse set of fields. Access to this kind of data is achieved through dense, continuous feedback from multimodal sensors and advanced machine learning algorithms [113]. These sensors are required to have sufficient memory to continuously store the data and the computations of the software components [66]. Furthermore, the necessary hardware and software tools for the observations require significant funding and expenditure [100].

Frequent Device Calibration
Very often, electronic components like sensors may not work as expected from the outset and need to be calibrated several times to get finer results [115]. For example, visual saliency-based ASD detection approaches use eye trackers that require repeated calibration to capture information about the shapes, light reflection, and refraction properties of the different parts of the eyes; during calibration, participants are asked to look at different positions on the screen [53]. However, an autistic individual is unlikely to cooperate in situations like these.

Privacy Issues
Activity analysis requires tracking specific data that may constitute sensitive user information. Strong encryption with comparatively higher computational power may be required to store and use these data reliably [89]. In particular, analyzing facial expressions needs images of participants, which may violate the privacy of those individuals. Therefore, most of the time, parents do not allow their children to participate in the research due to concerns about their privacy. In contrast, skeleton data and eye-tracking data do not violate privacy to that extent, as they do not require or expose any body parts of the individual [37].

Approaches
Studies on autistic children have found some unusual behaviors and responses in children with ASD [33]. As automated detection aims to automate the whole diagnosis process, the system needs to focus on those identifying characteristics, e.g., repetitive behavior, atypical walking style, and particular visual saliency. In this section, we explore the recent literature on activity analysis-based ASD detection by arranging it in three main groups. Figure 1 shows the overall process of activity analysis-based automated detection, followed by the existing approaches in this domain. The activity analysis-based automated detection of ASD comprises three main steps: data collection, training, and classification. The data collection block shows three core approaches; each of them utilizes different sensors to capture different exclusive characteristics of autistic individuals: repetitive behavior, atypical gait pattern, and unusual visual saliency. Next comes the training phase, which utilizes machine learning or deep learning approaches to learn discriminative features of the data to classify ASD and TD in the final step of the automated detection.

Study of Repetitive Behavior
Repetitive behavior refers to abnormal behavior characterized by repetition, inappropriateness, low adaptability, and rigidity [11]. Repetitive behaviors have been frequently reported by parents of children with ASD as well as of typically developed (TD) children [58]. However, there are noticeable qualitative and quantitative differences between the repetitive behaviors of ASD and TD children [3,85]. It is important to note that repetitive behavior, being a core symptom of ASD and a prominent cue for identifying autism, requires an expert eye for diagnosis. Therefore, machine learning-driven automated detection of ASD based on the repetitive behavior cue has been a promising research avenue.
Goodwin et al. [35] presented a pattern recognition algorithm with three-axis accelerometer data to automatically detect hand flapping and body rocking movements, which achieved approximately 90% accuracy in both classroom and laboratory environments. Therefore, this study is applicable to both controlled and real-world environments. Goncalves et al. [34] proposed automatic detection of hand flapping movement using two systems: a Kinect sensor and a watch with an accelerometer. This study found better results using the watch with an accelerometer attached to the wrist, applying statistical analysis to the collected data. Jazouli et al. [50] focused on real-time automatic identification of five repetitive behaviors, i.e., hand flapping, hand on the face, hands behind back, fingers flapping, and body rocking, in individuals with ASD. In their work, they used a 3D skeleton to characterize repetitive behavior and reported a recognition rate above 93.3% using an Artificial Neural Network (ANN). Later on, Jazouli et al. [49] proposed a point-cloud recognizer method based on the nearest-neighbor classifier with the Euclidean distance function that classifies the five repetitive behaviors of their previous study.
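As an illustrative sketch only (not the implementation of [49]), a nearest-neighbor classifier with the Euclidean distance function can be written as follows. The three-value pose descriptors and the template set are hypothetical and chosen purely for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbor_label(query, templates):
    """Return the label of the template closest to `query`.

    `templates` is a list of (label, vector) pairs.
    """
    best_label, best_dist = None, math.inf
    for label, vec in templates:
        d = euclidean(query, vec)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label


# Hypothetical descriptors, made up for illustration (not data from any study).
templates = [
    ("hand_flapping", [0.9, 0.1, 0.0]),
    ("body_rocking", [0.1, 0.8, 0.1]),
    ("hands_behind_back", [0.0, 0.2, 0.9]),
]
predicted = nearest_neighbor_label([0.8, 0.2, 0.1], templates)
```

In practice, each template would be a descriptor extracted from a recorded skeleton sequence rather than a hand-written vector.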
Jaiswal et al. [48] suggested an end-to-end dynamic deep learning method using 3D analysis of behavior for ASD-TD classification, which attained a classification rate of 96%. Their study used RGB-D data for classification. Rihawi et al. [86] provided the first publicly available 3D skeleton dataset of repetitive behaviors that ASD individuals usually show. A drawback of this study is that the data were recorded by volunteers who were not diagnosed with ASD. However, their study provided a wide range of repetitive actions, including hand moving in front of the face, toe walking, walking in circles, and playing with a toy, which are not common in most studies.
Zunino et al. [126] analyzed hand gestures during a bottle grasping task with four different underlying intentions, i.e., picking, placing, pouring, and passing a bottle. The study found that the execution of this simple task is different between ASD and TD individuals and, therefore, can be effectively utilized to discriminate between these two groups. Their study reported 82% classification accuracy using an LSTM network followed by a VGG-16 architecture. Later on, Tian et al. [109] used SA-B3D with the LSTM network, and Sun et al. [99] used a temporal pyramid network (TPN) on the same dataset [126]. These latter studies reported 87.17% and 95.2% classification accuracy, respectively.

Abnormal Gait Recognition
Abnormal gait is defined as an atypical style of walking that may hinder occupational and a wide range of other daily activities. Numerous studies have reported that analyzing abnormal gait features is a powerful tool for early diagnosis with proper treatment planning for individuals with ASD [54]. In general, gait recognition techniques use temporospatial features like stride length (distance between successive ground contacts of the same foot), step length, step width, cadence (steps per minute), velocity, stance time (the duration of the stance phase of one extremity in a gait cycle), and double support (the period in which both legs are in contact with the ground) [118].
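To make the temporospatial features above concrete, the following minimal sketch computes cadence, step length, stride length, and velocity from a list of heel-strike events. The input format (time, foot, forward position), the function name, and the sample values are assumptions made for illustration; real systems derive these events from motion capture or force-plate data.

```python
from statistics import mean


def gait_features(heel_strikes):
    """Compute simple temporospatial gait features.

    heel_strikes: list of (time_s, foot, x_m) sorted by time, with foot in
    {"L", "R"} and x_m the forward position of the contact (assumed format).
    Feet are assumed to alternate, so successive events are steps.
    """
    if len(heel_strikes) < 3:
        raise ValueError("need at least three heel strikes")
    # Step length: distance between successive contacts of opposite feet.
    steps = [b[2] - a[2] for a, b in zip(heel_strikes, heel_strikes[1:])]
    # Stride length: distance between successive contacts of the same foot.
    last_seen, strides = {}, []
    for _, foot, x in heel_strikes:
        if foot in last_seen:
            strides.append(x - last_seen[foot])
        last_seen[foot] = x
    duration = heel_strikes[-1][0] - heel_strikes[0][0]
    distance = heel_strikes[-1][2] - heel_strikes[0][2]
    return {
        "cadence_steps_per_min": 60.0 * len(steps) / duration,
        "mean_step_length_m": mean(steps),
        "mean_stride_length_m": mean(strides),
        "velocity_m_per_s": distance / duration,
    }


# Hypothetical walk: one step of 0.5 m every 0.5 s.
events = [(0.0, "L", 0.0), (0.5, "R", 0.5), (1.0, "L", 1.0),
          (1.5, "R", 1.5), (2.0, "L", 2.0)]
features = gait_features(events)
```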
Calhoun et al. [14] provided a comprehensive study to compare kinetic and kinematic gait parameters of ASD and TD children. This study found noticeable differences in cadence, peak hip, and ankle kinematics by applying sound statistical approaches and principal component analysis. Hasan et al. [39] proposed an automated technique for classifying the gait patterns of ASD individuals based on kinetic and kinematic features with the aid of advanced machine learning approaches. Here, impactful features were selected statistically using the t-test and the Mann-Whitney U test. This approach suggested Linear Discriminant Analysis (LDA) with kinetic gait features as input, which provided 82.5% classification accuracy.
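The statistical feature selection mentioned above can be sketched as follows. This is an assumption-laden illustration, not the procedure of [39]: it computes only the test statistics (Welch's variant of the t statistic and the Mann-Whitney U statistic) without p-values, and the feature names and values are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples (no p-value).

    Assumes each sample has at least two values and non-zero spread.
    """
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def mann_whitney_u(a, b):
    """U statistic of sample `a`: number of pairs with a_i > b_j, ties count 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)


# Hypothetical per-subject values (group 1, group 2) for two gait features.
features = {
    "feature_1": ([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]),  # well separated
    "feature_2": ([1.0, 5.0, 3.0], [2.0, 4.0, 3.5]),  # overlapping
}
# Rank features by the magnitude of the t statistic, largest first.
ranking = sorted(features, key=lambda name: abs(welch_t(*features[name])),
                 reverse=True)
```

A feature whose groups barely overlap receives a large |t| and an extreme U (close to 0 or to the number of pairs), which is the kind of feature such a selection step would retain.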
In another study, Hasan et al. [40] presented an approach for automatically identifying the gait patterns of ASD using three-dimensional ground reaction forces (3D-GRF). This study performed a binary classification of ASD and TD children, in which time series parameterization was applied to find important gait features and a k-nearest neighbor (KNN) classifier was used, which yielded 83.33% accuracy. Ebrahimi et al. [30] proposed a markerless approach using a 3D sensor to distinguish between tip-toe walking and regular walking. This study used time-domain and frequency-domain features to find the best feature combination for classification, and it obtained 83.4% accuracy with a linear support vector machine (SVM). Ilias et al. [46] used a fusion of temporospatial and kinematic features to classify gait patterns; this approach achieved 95% accuracy with a neural network classifier and with an SVM with a polynomial kernel, individually. The SVM with a polynomial kernel attained 100% sensitivity and 85% specificity, showing the efficacy of their approach in utilizing the SVM to identify autism.
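Since the studies above report accuracy, sensitivity, and specificity, a short sketch of how these quantities follow from predicted and true labels may be useful. The labels below are hypothetical and do not reproduce the results of any cited study; ASD is treated as the positive class.

```python
def classification_metrics(y_true, y_pred, positive="ASD"):
    """Sensitivity, specificity, and accuracy for a binary ASD/TD labeling."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return {
        # Sensitivity: fraction of ASD subjects correctly identified.
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        # Specificity: fraction of TD subjects correctly identified.
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        "accuracy": (tp + tn) / len(pairs),
    }


# Hypothetical labels: all four ASD subjects found, one TD subject misclassified.
y_true = ["ASD"] * 4 + ["TD"] * 4
y_pred = ["ASD"] * 4 + ["TD", "TD", "TD", "ASD"]
metrics = classification_metrics(y_true, y_pred)
```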

Analyzing Gaze Pattern
A gaze pattern refers to the viewing style of an individual towards a scenario or an image. An irregular gaze pattern indicates abnormal activation of the social brain network, an inability in social communication, and an atypical brain response [92]. Individuals with ASD show unusual gaze patterns, which can be determined through eye-tracking metrics such as heatmap, scan path, fixation point, fixation map, and attention time. A heatmap is a visualization tool that shows the general distribution of gaze points in an image. It is illustrated as a color-gradient overlay on the stimulus, e.g., an image. Figure 2 shows the heatmap of an image, where we can notice that TD and ASD individuals differ in their visual attention. In general, the gaze points or fixations (i.e., points in the visual field that are fixated by the two eyes) are distributed more densely towards the area that gets more visual attention than other areas of the image.
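A heatmap of this kind can be approximated by accumulating a Gaussian blob around every fixation, as in the minimal sketch below. The grid size, the sigma value, and the fixation coordinates are assumptions for illustration; eye-tracking software typically applies its own smoothing and weighting.

```python
import math


def gaze_heatmap(fixations, width, height, sigma=1.0):
    """Accumulate one Gaussian per fixation (x, y) on a height x width grid.

    The map is scaled so that its maximum value is 1.
    """
    heat = [[0.0] * width for _ in range(height)]
    for fx, fy in fixations:
        for y in range(height):
            for x in range(width):
                d2 = (x - fx) ** 2 + (y - fy) ** 2
                heat[y][x] += math.exp(-d2 / (2.0 * sigma ** 2))
    peak = max(max(row) for row in heat)
    if peak > 0:
        heat = [[v / peak for v in row] for row in heat]
    return heat


# Hypothetical fixations: three near (2, 2) and one isolated at (7, 6).
fixations = [(2, 2), (2, 2), (3, 2), (7, 6)]
heat = gaze_heatmap(fixations, width=10, height=8, sigma=1.0)
```

The densest cluster of fixations produces the highest values, which corresponds to the warmest color in the overlay.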
Scan paths are also used to visualize an individual's gaze pattern through a series of dots and fine lines. Figure 3 represents scan paths for both TD and ASD children in a social scene. In the figure, fixation points and saccades (rapid movements of the eye between fixation points) are represented by dots and lines, respectively. In addition, the size of each dot represents the duration of the corresponding fixation.
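As a sketch of how a scan path can be turned into numeric features, the code below treats a scan path as an ordered list of fixations and measures saccades as straight-line jumps between consecutive fixations. The tuple layout (x, y, duration in ms), the feature names, and the sample values are assumptions for illustration.

```python
import math


def scanpath_features(fixations):
    """Summarize an ordered scan path of (x_px, y_px, duration_ms) fixations."""
    if not fixations:
        raise ValueError("scan path must contain at least one fixation")
    # Saccade amplitude: distance between consecutive fixation points.
    saccades = [
        math.hypot(x2 - x1, y2 - y1)
        for (x1, y1, _), (x2, y2, _) in zip(fixations, fixations[1:])
    ]
    durations = [d for _, _, d in fixations]
    return {
        "n_fixations": len(fixations),
        "mean_fixation_ms": sum(durations) / len(durations),
        "scanpath_length_px": sum(saccades),
        "mean_saccade_px": sum(saccades) / len(saccades) if saccades else 0.0,
    }


# Hypothetical scan path with three fixations.
scan = [(0, 0, 200), (3, 4, 300), (3, 10, 100)]
summary = scanpath_features(scan)
```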
Wang et al. [116] experimented on the gaze patterns of healthy children and children with ASD between 2 and 5 years of age. The subjects' gaze patterns were converted to probabilistic heatmaps according to their visual attention in a video. Eventually, a heatmap score, calculated from statistical features, was used to differentiate a healthy child from an individual with ASD. In another work, Jiang and Zhao [52] proposed a method based on a Deep Neural Network (DNN) to differentiate the gaze patterns of regular people and ASD individuals. This work selected images based on the Fisher score [81], which ensures that those images allow the learning of discriminative features from image contents. Furthermore, the developed model evaluated multiple metrics, e.g., attention time, pupil movement, and expression change, which are used in diagnostic tests, and it achieved an overall accuracy of 92%.
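For readers unfamiliar with the Fisher score, the sketch below implements one common definition for a single feature: the between-class scatter divided by the within-class scatter. The exact variant used in [52] is not specified in this text, so this formulation and the sample values are assumptions.

```python
from statistics import mean, pvariance


def fisher_score(values_by_class):
    """Fisher score of one feature.

    values_by_class: dict mapping a class label to that class's feature values.
    Score = sum_k n_k (mu_k - mu)^2 / sum_k n_k var_k.
    """
    all_values = [v for vals in values_by_class.values() for v in vals]
    overall = mean(all_values)
    between = sum(len(vals) * (mean(vals) - overall) ** 2
                  for vals in values_by_class.values())
    within = sum(len(vals) * pvariance(vals)
                 for vals in values_by_class.values())
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return between / within


# Hypothetical feature values for two stimuli (illustration only).
separating = fisher_score({"ASD": [1.0, 2.0, 3.0], "TD": [5.0, 6.0, 7.0]})
overlapping = fisher_score({"ASD": [1.0, 5.0, 3.0], "TD": [2.0, 4.0, 3.5]})
```

A higher score means the two groups respond more differently, so stimuli can be ranked by score and the most discriminative ones kept.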
In a related study, Cho et al. [21] used fixation points for ASD detection and reported 93.96% classification accuracy using the KNN algorithm. Startsev et al. [98] reported a comprehensive study on scan path data and fixation maps using the Random Forest algorithm. This study used statistical features for classification and reported 76.5% accuracy. In another study, Chen and Zhao [19] proposed a deep learning approach using an LSTM network followed by a ResNet-52 architecture and achieved 93% classification accuracy. Moreover, this study provided a novel method of ASD/TD classification using a photo-taking task in which the participants were asked to take photos of their region of interest in a natural scene. Wan et al. [114] analyzed the fixation time on six different areas of interest and were able to discriminate ASD from TD with a classification accuracy of 85.1%. Yaneva et al. [122] provided a study on adult ASD individuals by analyzing their capability in tasks such as web browsing and searching. This study investigated the area of interest, fixation time, and fixation points for classification. They reported 75% classification accuracy using simple logistic regression. Nebout et al. [75] designed a coarse-to-fine convolutional neural network (CNN) to predict saliency maps for ASD children, which provides better results than 6 existing saliency models. This study reported that no center bias is applicable to the visual attention of individuals with ASD, which contradicts the findings of other studies in [26,116]. Dris et al. [25] proposed a method of classifying ASD using fixation duration on different regions of interest in the image and obtained 88.6% specificity using an SVM classifier.
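Several of the studies above rely on fixation time per area of interest (AOI). A minimal sketch of this feature, assuming rectangular AOIs and fixations given as (x, y, duration in ms), is shown below; the AOI names and coordinates are hypothetical.

```python
def fixation_time_per_aoi(fixations, aois):
    """Sum fixation durations (ms) per rectangular area of interest.

    fixations: iterable of (x, y, duration_ms)
    aois: dict mapping a name to (x_min, y_min, x_max, y_max), bounds inclusive
    Fixations outside every AOI are collected under "outside".
    """
    totals = {name: 0.0 for name in aois}
    totals["outside"] = 0.0
    for x, y, duration in fixations:
        for name, (x0, y0, x1, y1) in aois.items():
            if x0 <= x <= x1 and y0 <= y <= y1:
                totals[name] += duration
                break  # first matching AOI wins if rectangles overlap
        else:
            totals["outside"] += duration
    return totals


# Hypothetical AOIs on a face image and four fixations.
aois = {"eyes": (40, 30, 120, 60), "mouth": (60, 100, 100, 130)}
fixations = [(50, 40, 200), (70, 110, 350), (10, 10, 150), (110, 55, 100)]
totals = fixation_time_per_aoi(fixations, aois)
```

The resulting per-AOI totals (or their proportions of the total viewing time) can then serve as input features for a classifier.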
Syeda et al. [101] provided a comprehensive study of face scanning and emotion recognition of ASD and TD children. The study found that individuals with ASD show less attention to prime features of faces like eyes, nose, and mouth during face scanning and experience more difficulty in perceiving basic human emotions. Sadira et al. [91] analyzed the face-scanning pattern of ASD and TD people. Their fixation time analysis found that ASD individuals spend more time looking at the mouth rather than eyes or nose in a human face. Arru et al. [4] analyzed the scan path of both ASD and TD children and extracted fixations as well as a center bias to classify them. This study developed a decision tree based classifier and reported 60% accuracy on the test set. Liu et al. [60] proposed a machine learning-based approach to classify ASD/TD children by analyzing their face-scanning pattern in a face recognition task. It reported 88.85% classification accuracy. Recently, Shihab et al. [94] provided a comprehensive study on children and adults with ASD. It analyzed the scan path data with principal component analysis (PCA) and developed an unsupervised classifying method, which gives a sensitivity of 78.6% and specificity of 82.47%. A detailed description of activity analysis based studies of ASD detection is given in Table 1.
Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 19 October 2020 doi:10.20944/preprints202010.0388.v1

Table 1: A detailed description of activity analysis based studies for ASD detection, grouped by core approaches (i.e., study of repetitive behavior, recognition of abnormal gait, and analysis of gaze behavior). The description of each study includes the method (machine learning (ML), deep learning (DL), or statistical analysis (SA)); the name of the specific algorithm and the cue/feature used for learning; followed by details of the dataset (capturing sensor, age range and no. of subjects, total no. of instances) and results in terms of ASD-TD classification accuracy.

Available Datasets
The data-driven approaches mentioned in Section 3 require datasets comprising different ASD-TD activities to train the ML or deep learning-based models. As already mentioned, the datasets related to this research are rarely made open to the research community. In this section, we discuss the datasets publicly available for the detection of ASD using activity analysis.

Skeleton Dataset
Skeletal data encode the human body posture in terms of relative 3D coordinates of different body joints and the joints' orientation angles. Rihawi et al. [86] developed the first publicly available 3D dataset, named '3D-AD', based on ASD subjects using the Kinect-v2 camera. This dataset includes depth maps, which were captured at 33 frames per second. The sequence of skeleton joint features was collected for ten different actions, e.g., hands on the face, hands back, hand moving in front of the face, headbanging (or rocking back and forth), tapping ears, flicking, hands stimming, toe walking, playing with a toy, and walking in circles. This research reported the Dynamic Time Warping (DTW) distance as a distinguishing feature for separating ASD from TD.
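The DTW distance compares two sequences that may differ in speed by finding the lowest-cost alignment between their frames. The sketch below is a textbook dynamic-programming formulation, not the implementation of [86]; the one-dimensional sequences stand in for per-frame joint feature vectors and are hypothetical.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Dynamic Time Warping distance between two sequences of feature vectors.

    Each element is a tuple of floats (e.g., joint coordinates of one frame);
    the local cost is the Euclidean distance between frames.
    """
    n, m = len(seq_a), len(seq_b)
    if n == 0 or m == 0:
        raise ValueError("sequences must be non-empty")
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of the three admissible alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical 1D sequences: `slow` repeats the first frame of `base`.
base = [(0.0,), (1.0,), (2.0,), (3.0,)]
slow = [(0.0,), (0.0,), (1.0,), (2.0,), (3.0,)]
reverse = [(3.0,), (2.0,), (1.0,), (0.0,)]
same_action = dtw_distance(base, slow)
different_action = dtw_distance(base, reverse)
```

Because DTW tolerates differences in timing, the same action performed at a different speed yields a small distance, whereas a different movement yields a larger one.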
In another study, Hasan et al. [39] focused on three-dimensional kinematic data to observe gait abnormality in autistic children using a Vicon Motion Capture System. Thirty healthy children and thirty children diagnosed with ASD were recruited to develop the dataset. All the participants were able to walk independently. The data were recorded at the system's frequency of 100 Hz, and each participant performed an average of 10 trials. The analysis was carried out with the following gait features: minimum pelvic rotation, maximum hip extension, hip extension foot, maximum hip flexion, knee flexion foot, maximum knee flexion, maximum knee abduction, maximum ankle plantarflexion, maximum ankle plantar flexion, knee flexion foot, maximum ankle adduction, and minimum ankle adduction.

Video Dataset
To the best of our knowledge, Zunino et al. [126] developed the only available video dataset for analyzing the action style of an autistic child. The dataset includes activities such as placing, picking, passing, and grasping a bottle of a particular size by an autistic child. A Vicon VUE video camera with a resolution of 1280 x 720 pixels and a frame rate of 100 frames/sec was used for the whole experiment. The study includes 20 TD and 20 ASD children as subjects, whose state of autism was confirmed by the DSM-5 method. All participants were between 7 and 12 years of age.

Gaze Dataset
Eye-tracking refers to the process of measuring visual attention. These measurements are captured using an eye-tracking device that records the positions and movements of the eyes while viewing a scene. Most of the methods in the current literature utilize eye-tracking data in two forms: fixation data and scan path data. To this end, Duan et al. [27] developed an eye movement dataset named 'Saliency4ASD' from 14 ASD and 14 typically developed children. All the participants were between 5 and 12 years of age. During the eye-tracking process, the participants viewed 300 images, including natural scenes, animals, the human body, and objects. This dataset provides fixation points, fixation maps, heat maps, and scan path data of the participants.
Yaneva et al. [122] also developed a gaze dataset, which included autistic adults instead of children. The participants were between 30 and 40 years of age. The experiment included 30 participants, of whom 15 were diagnosed with high-functioning autism or Asperger's syndrome, while the rest were non-autistic individuals. During data collection, the participants were asked to perform visual tasks such as web browsing and searching with a mouse and keyboard according to the given instructions. The study provided the fixation time on the area of interest in images, the duration of the searching task on a web page, and the number of fixation points. The authors used this dataset to classify ASD individuals and healthy people by applying simple logistic regression.
In recent work, Shihab et al. [94] provided a gaze dataset that comprises face-scanning data of adults and children diagnosed with ASD. The participants were between 4 and 60 years of age. Participants were asked to view human face images and movie clips involving social interactions on a laptop screen while the eye-tracking data were recorded using two analog cameras placed in front of the laptop. This research studied the difference in the face-scanning patterns of ASD and TD individuals and classified ASD and TD using PCA. Table 2 presents a summarized description of the available datasets with information about the samples, capturing devices, scenarios, and limitations.
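As a generic illustration of logistic regression on gaze features of the kind listed above, the following Python sketch fits a two-feature model by batch gradient descent. The feature scaling, the toy values, the learning rate, the number of epochs, and the function names are assumptions made for illustration; the sketch is not the pipeline of Yaneva et al. [122] and uses no data from that study.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_regression(X, y, lr=0.5, epochs=5000):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    n_samples, n_features = len(X), len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n_samples for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n_samples
    return w, b


def predict(w, b, x):
    """Return 1 if the predicted probability is at least 0.5, else 0."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


if __name__ == "__main__":
    # Made-up, pre-scaled features: [fixation time on AOI, number of fixations].
    # Labels are arbitrary class indices (0 or 1) for the two toy groups.
    X = [[0.2, 0.3], [0.3, 0.2], [0.25, 0.35],
         [0.8, 0.9], [0.9, 0.7], [0.7, 0.8]]
    y = [0, 0, 0, 1, 1, 1]
    w, b = train_logistic_regression(X, y)
    print([predict(w, b, x) for x in X])
```

In practice, an established library implementation with cross-validation would be preferable to hand-written gradient descent, particularly for datasets as small as those described here.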

Ideal Experimental Setup
Data collection for activity analysis is preferably performed in constrained environments. This minimizes the additional features that different backgrounds or surroundings introduce into the computational model [117], so that the model focuses on the dynamic features of the specific activity to be analyzed and is trained primarily to detect that activity. Further development of this basic model with more complex data can then improve its ability to perform well in different environments. Figure 4 provides experimental setups for capturing skeleton, video, wearable sensor, and eye-tracking data for the detection of ASD.
Figure 4(a) shows a simple setup for skeleton data recording using a single-view Kinect device, where a subject is seen performing some actions in motion. Self-occlusion occurs when the skeleton is extracted with Kinect from one side only [50]. Therefore, to reduce this drawback, the camera needs to be positioned to capture front-view images of the individuals. The quality of the data can be improved further by introducing more cameras into the motion extraction system [88]. The background also needs to be uncluttered and static in this kind of experimental setup.
The setup for video data collection should consider several issues. It requires constant illumination, as variable illumination in a scene adds complexity to the environment [82]. Moreover, a dataset created from a single observational viewpoint tends to capture less detail. In contrast, a multiview video dataset facilitates activity analysis by giving the system better clarity and comprehensibility [90].
In Figure 4(b), an individual performs a specific action for a video clip, and every action must be recorded from a fixed viewpoint against a static background. The entire experimental setup should be arranged so that the autistic children do not feel any discomfort during the process. Apart from these setup complications, a skilled instructor should guide the ASD individuals in performing the desired task.
Sensor-based data collection methods are applicable in different contexts because they allow either wired or wireless data flow from the environment to the data processing units [20]. For the involvement of autistic children, however, a wireless sensor setup is required. The data streamed from the sensors are often fraught with noise and interference, so the data classification pipeline should be designed to account for them appropriately [59]. In Figure 4(c), the wearable sensor network comprises accelerometers, gyroscopes, and EMG (electromyography) sensors. They are placed on different body parts, such as the chest, hand, wrist, torso, kneecap, legs, and feet, of a participant who is instructed to perform certain activities. The sensors track the positional data of those parts during the different activities.
Eye-tracking technologies offer great convenience in data collection for studies whose participants have impaired organ functionality, and they therefore ease different empirical observations for ASD detection using activity analysis [29]. As demonstrated in Figure 4(d), a participant only needs to be seated and follow the instructions to perform a visual task, while the participant's visual attention is captured by eye-tracking devices attached to the screen. Although head-mounted eye-tracking devices with higher data extraction accuracy are available, eye trackers without any body contact are preferred for capturing data from ASD individuals [78].

Discussion
The atypical behavioral patterns of autistic children have significantly supported the characterization of unique features in data-driven ASD detection techniques. An RGB camera or a skeleton tracking device is sufficient for capturing videos or images of such behavior. Gaze abnormalities can also be a source of distinctive attributes for feature modeling. Gaze data protects privacy and has lower computational but higher operational complexity. However, an eye-tracking system cannot track multiple individuals at a time, whereas skeletal or video-based motion detection can.
On the other hand, the experimental setup for skeleton data does not align well with real-life scenarios and has complicated structures and requirements. This reduces the prediction model's compatibility with real-world applications. Nevertheless, a good understanding of the dataset before training the model can mitigate these methodological shortcomings. Since the automated detection of ASD requires an annotated dataset, supervised learning algorithms are preferable. Moreover, for datasets with sequential instances such as videos or images, the deep learning approach is more suitable, as it generally learns temporal or sequential information better than classical machine learning algorithms. Our study found that deep learning methods achieve classification accuracies ranging from 76% to 95% across numerous approaches. On the other hand, as we have seen in Section 3, the classical machine learning algorithms give satisfactory results, with classification accuracies ranging from 60% to 93%, for datasets with fewer action classes and instances.
Here, we presented a thorough analysis of the existing data-driven autism detection methods. Some of the methods can be applied in relevant fields without adding further functionality. The next task, turning these methods from lab or research prototypes into a practical screening tool for ASD, is quite viable [124]. Later improvements of this research are expected to yield a system that can be easily deployed as an annual test program in educational institutions, which may ensure children's early and regular diagnosis.

Conclusion
ASD is a neurological and developmental disorder that affects a person's interaction, communication, and learning. Numerous reports and studies have shown that early treatment of individuals with ASD can cure or suppress its effects to a certain extent, which might allow a person to lead a healthy life altogether. Our work analyzed the scope and potential of data-driven activity analysis in the automated detection of autism. To this end, we reviewed state-of-the-art data-driven approaches for ASD detection through activity analysis, covering repetitive behavior, atypical gait patterns, and unusual visual saliency. In addition, this paper analyzed different machine learning and deep learning algorithms together with the results they obtained in ASD/TD detection. Moreover, possible challenges with probable solutions, available resources, and ideal experimental setups have been described briefly in this work. According to our findings, this technology has already proved its capacity to be an alternative to traditional clinical methods of ASD detection, which usually take a prolonged period and offer minimal certainty of being feasible for the mass population. Nevertheless, some constraints may limit the accuracy and flexibility of automated detection. An incorrect result may misclassify a non-ASD child as autistic; this is a common risk for late learners, who can be misjudged because they need more time to decide and complete tasks. Besides, very recent research is limited to distinguishing between ASD and TD children and cannot detect the severity level of autism. However, advancements in learning algorithms and computational devices will soon pave the way for further improvement and adoption of such data-driven approaches.
Parents who are unwilling to accept that their child is displaying some traits of ASD, and who consequently refuse to take the child to a physician, unintentionally hinder the treatment the child needs. This barrier to detecting ASD in early childhood cannot have a better solution than the techniques discussed throughout the paper. We hope this extensive review will serve as a comprehensive guideline for researchers studying the trends and literature in automated data-driven ASD detection through activity analysis.