Sunday, September 29, 2019
User Authentication Through Mouse Dynamics
IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 8, NO. 1, JANUARY 2013
Chao Shen, Student Member, IEEE, Zhongmin Cai, Member, IEEE, Xiaohong Guan, Fellow, IEEE, Youtian Du, Member, IEEE, and Roy A. Maxion, Fellow, IEEE
Abstract: Behavior-based user authentication with pointing devices, such as mice or touchpads, has been gaining attention. As an emerging behavioral biometric, mouse dynamics aims to address the authentication problem by verifying computer users on the basis of their mouse operating styles. This paper presents a simple and efficient user authentication approach based on a fixed mouse-operation task. For each sample of the mouse-operation task, both traditional holistic features and newly defined procedural features are extracted for accurate and fine-grained characterization of a user's unique mouse behavior. Distance-measurement and eigenspace-transformation techniques are applied to obtain feature components for efficiently representing the original mouse feature space. Then a one-class learning algorithm is employed in the distance-based feature eigenspace for the authentication task. The approach is evaluated on a dataset of 5550 mouse-operation samples from 37 subjects. Extensive experimental results are included to demonstrate the efficacy of the proposed approach, which achieves a false-acceptance rate of 8.74% and a false-rejection rate of 7.69%, with a corresponding authentication time of 11.8 seconds. Two additional experiments are provided to compare the current approach with other approaches in the literature. Our dataset is publicly available to facilitate future research.
Index Terms: Biometric, mouse dynamics, authentication, eigenspace transformation, one-class learning.
Manuscript received March 28, 2012; revised July 16, 2012; accepted September 06, 2012. Date of publication October 09, 2012; date of current version December 26, 2012. This work was supported in part by the NSFC (61175039, 61103240, 60921003, 60905018), in part by the National Science Fund for Distinguished Young Scholars (60825202), in part by 863 High Tech Development Plan (2007AA01Z464), in part by the Research Fund for Doctoral Program of Higher Education of China (20090201120032), and in part by Fundamental Research Funds for Central Universities (2012jdhz08). The work of R. A. Maxion was supported by the National Science Foundation under Grant CNS-0716677. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors, and do not necessarily reflect the views of the National Science Foundation. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Sviatoslav Voloshynovskiy.
C. Shen, Z. Cai, X. Guan, and Y. Du are with the MOE Key Laboratory for Intelligent Networks and Network Security, Xi'an Jiaotong University, Xi'an, Shaanxi, 710049, China. R. A. Maxion is with the Dependable Systems Laboratory, Computer Science Department, Carnegie Mellon University, Pittsburgh, PA 15213 USA.
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TIFS.2012.2223677
I. INTRODUCTION
The quest for a reliable and convenient security mechanism to authenticate a computer user has existed since the inadequacy of conventional password mechanisms was realized, first by the security community, and then gradually by the public [31]. As data are moved from traditional localized computing environments to the new Cloud Computing paradigm (e.g., Box.net and Dropbox), the need for better authentication has become more pressing. Recently, several large-scale password leakages exposed users to an unprecedented risk of disclosure and abuse of their information [47], [48]. These incidents seriously shook public confidence in the security of the current information infrastructure; the inadequacy of password-based authentication mechanisms is becoming a major concern for the entire information society.
Of various potential solutions to this problem, a particularly promising technique is mouse dynamics. Mouse dynamics measures and assesses a user's mouse-behavior characteristics for use as a biometric. Compared with other biometrics such as face, fingerprint and voice [20], mouse dynamics is less intrusive, and requires no specialized hardware to capture biometric information. Hence it is suitable for the current Internet environment. When a user tries to log into a computer system, mouse dynamics only requires her to provide the login name and to perform a certain sequence of mouse operations. Extracted behavioral features, based on mouse movements and clicks, are compared to a legitimate user's profile. A match authenticates the user; otherwise her access is denied. Furthermore, a user's mouse-behavior characteristics can be continually analyzed during her subsequent usage of a computer system for identity monitoring or intrusion detection. Yampolskiy et al. provide a review of the field [45].
Mouse dynamics has attracted more and more research interest over the last decade [2]–[4], [8], [14]–[17], [19], [21], [22], [33], [34], [39]–[41], [45], [46]. Although previous research has shown promising results, mouse dynamics is still a newly emerging technique, and has not reached an acceptable level of performance (e.g., the European standard for commercial biometric technology, which requires a 0.001% false-acceptance rate and a 1% false-rejection rate [10]).
Most existing approaches for mouse-dynamics-based user authentication result in low authentication accuracy or an unreasonably long authentication time. Either of these may limit applicability in real-world systems, because few users are willing to use an unreliable authentication mechanism, or to wait for several minutes to log into a system. Moreover, previous studies have favored using data from real-world environments over experimentally controlled environments, but this realism may cause unintended side-effects by introducing confounding factors (e.g., effects due to different mouse devices) that may affect experimental results. Such confounds can make it difficult to attribute experimental outcomes solely to user behavior, and not to other factors along the long path of mouse behavior, from hand to computing environment [21], [41].
1556-6013/$31.00 © 2012 IEEE
It should also be noted that most mouse-dynamics research used data from both the impostors and the legitimate user to train the classification or detection model. However, in the scenario of mouse-dynamics-based user authentication, usually only the data from the legitimate user are readily available, since the user would choose her specific sequence of mouse operations and would not share it with others. In addition, no datasets were published in previous research, which makes third-party verification of previous work difficult and precludes objective comparisons between different approaches.
A. Overview of Approach
Faced with the above challenges, our study aims to develop a mouse-dynamics-based user authentication approach that can perform user authentication in a short period of time while maintaining high accuracy. By using a controlled experimental environment, we have isolated inherent behavioral characteristics as the primary factors for mouse-behavior analysis. The overview of the proposed approach is shown in Fig. 1.
It consists of three major modules: (1) mouse-behavior capture, (2) feature construction, and (3) training/classification. The first module serves to create a mouse-operation task, and to capture and interpret mouse-behavior data. The second module is used to extract holistic and procedural features to characterize mouse behavior, and to map the raw features into distance-based features by using various distance metrics. The third module, in the training phase, applies kernel PCA on the distance-based feature vectors to compute the predominant feature components, and then builds the user's profile using a one-class classifier. In the classification phase, it determines the user's identity using the trained classifier in the distance-based feature eigenspace.
Fig. 1. Overview of approach.
B. Purpose and Contributions of This Paper
This paper is a significant extension of an earlier and much shorter version [40]. The main purpose and major contributions of this paper are summarized as follows:
• We address the problem of unintended side-effects of inconsistent experimental conditions and environmental variables by restricting users' mouse operations to a tightly-controlled environment. This isolates inherent behavioral characteristics as the principal factors in mouse behavior analysis, and substantially reduces the effects of external confounding factors.
• Instead of the descriptive statistics of mouse behaviors usually adopted in existing work, we propose newly-defined procedural features, such as movement speed curves, to characterize a user's unique mouse-behavior characteristics in an accurate and fine-grained manner. These features could lead to a performance boost both in authentication accuracy and authentication time.
• We apply distance metrics and kernel PCA to obtain a distance-based eigenspace for efficiently representing the original mouse feature space. These techniques partially handle behavioral variability, and make our proposed approach stable and robust to variability in behavior data.
• We employ one-class learning methods to perform the user authentication task, so that the detection model is built solely on the data from the legitimate user. One-class methods are more suitable for mouse-dynamics-based user authentication in real-world applications.
• We present a repeatable and objective evaluation procedure to investigate the effectiveness of our proposed approach through a series of experiments. As far as we know, no earlier work made informed comparisons between different features and results, due to the lack of a standard test protocol. Here we provide comparative experiments to further examine the validity of the proposed approach.
• A public mouse-behavior dataset is established (see Section III for availability), not only for this study but also to foster future research. This dataset contains high-quality mouse-behavior data from 37 subjects. To our knowledge, this study is the first to publish a shared mouse-behavior dataset in this field.
This study develops a mouse-dynamics-based user authentication approach that performs user authentication in a short time while maintaining high accuracy. It has several desirable properties:
1. it is easy to comprehend and implement;
2. it requires no specialized hardware or equipment to capture the biometric data;
3. it requires only about 12 seconds of mouse-behavior data to provide good, steady performance.
The remainder of this paper is organized as follows: Section II describes related work. Section III presents the data-collection process. Section IV describes the feature-construction process. Section V discusses the classification techniques for mouse dynamics. Section VI presents the evaluation methodology. Section VII presents and analyzes experimental results.
Section VIII offers a discussion and possible extensions of the current work. Finally, Section IX concludes.
II. BACKGROUND AND RELATED WORK
In this section, we provide background on mouse-dynamics research, and various applications for mouse dynamics (e.g., authentication versus intrusion detection). Then we focus on applying mouse dynamics to user authentication.
A. Background of Mouse Dynamics
Mouse dynamics, a behavioral biometric for analyzing behavior data from pointing devices (e.g., mouse or touchpad), provides user authentication in an accessible and convenient manner [2]–[4], [8], [14]–[17], [19], [21], [22], [33], [34], [39]–[41], [45], [46]. Since Everitt and McOwan [14] first investigated in 2003 whether users could be distinguished by the use of a signature written by mouse, several different techniques and uses for mouse dynamics have been proposed.
Most researchers focus on the use of mouse dynamics for intrusion detection (sometimes called identity monitoring or reauthentication), which analyzes mouse-behavior characteristics throughout the course of interaction. Pusara and Brodley [33] proposed a reauthentication scheme using mouse dynamics for user verification. This study presented positive findings, but cautioned that the results were only preliminary. Gamboa and Fred [15], [16] were some of the earliest researchers to study identity monitoring based on mouse movements. Later on, Ahmed and Traore [3] proposed an approach combining keystroke dynamics with mouse dynamics for intrusion detection. Then they considered mouse dynamics as a standalone biometric for intrusion detection [2]. Recently, Zheng et al. [46] proposed angle-based metrics of mouse movements for reauthentication systems, and explored the effects of environmental factors (e.g., different machines).
Yet only recently have researchers turned to the use of mouse dynamics for user authentication (sometimes called static authentication), which analyzes mouse-behavior characteristics at particular moments. In 2007, Gamboa et al. [17] extended their approaches in identity monitoring [15], [16] into web-based authentication. Later on, Kaminsky et al. [22] presented an authentication scheme using mouse dynamics for identifying online game players. Then, Bours and Fullu [8] proposed an authentication approach that requires users to trace a maze-like path with the mouse. Most recently, a full survey of the existing work in mouse dynamics pointed out that mouse-dynamics research should focus on reducing authentication time and taking the effect of environmental variables into account [21].
B. User Authentication Based on Mouse Dynamics
The primary focus of previous research has been on the use of mouse dynamics for intrusion detection or identity monitoring. It is difficult to transfer previous work directly from intrusion detection to authentication, however, because a rather long authentication period is typically required to collect sufficient mouse-behavior data to enable reasonably accurate verification. To our knowledge, few papers have targeted the use of mouse dynamics for user authentication, which will be the central concern of this paper.
Hashia et al. [19] and Bours et al. [8] presented some preliminary results on mouse dynamics for user authentication. They both asked participants to perform fixed sequences of mouse operations, and they analyzed behavioral characteristics of mouse movements to authenticate a user during the login stage. Distance-based classifiers were established to compare the verification data with the enrollment data. Hashia et al. collected data from 15 participants using the same computer, while Bours et al.
collected data from 28 subjects using different computers; they achieved equal-error rates of 15% and 28% respectively.
Gamboa et al. [17] presented a web-based user authentication system based on mouse dynamics. The system displayed an on-screen virtual keyboard, and required users to use the mouse to enter a paired username and pin-number. The extracted feature space was reduced to a best subspace through a greedy search process. A statistical model based on the Weibull distribution was built on training data from both legitimate and impostor users. Based on data collected from 50 subjects, the researchers reported an equal-error rate of 6.2%, without explicitly reporting authentication time. The test data were also used for feature selection, which may lead to an overly optimistic estimate of authentication performance [18].
Recently, Revett et al. [34] proposed a user authentication system requiring users to use the mouse to operate a graphical, combination-lock-like GUI. A small-scale evaluation involving 6 subjects yielded an average false-acceptance rate and false-rejection rate of around 3.5% and 4% respectively, using a distance-based classifier. However, experimental details such as experimental apparatus and testing procedures were not explicitly reported. Aksari et al. [4] presented an authentication framework for verifying users based on a fixed sequence of mouse movements. Features were extracted from nine movements among seven squares displayed consecutively on the screen. They built a classifier based on scaled Euclidean distance using data from both legitimate users and impostors. The researchers reported an equal-error rate of 5.9% over 10 users' data collected from the same computer, but authentication time was not reported. It should be noted that the above two studies were performed on a small number of users (only 6 users in [34], and 10 users in [4]), which may be insufficient to evaluate definitively the performance of these approaches.
The results of the above studies have been mixed, possibly due to the realism of the experiments, possibly due to a lack of real differences among users, or possibly due to experimental errors or faulty data. A careful reading of the literature suggests that (1) most approaches have resulted in low performance, or have used a small number of users, but since these studies do not tend to be replicated, it is hard to pin the discrepancies on any one thing; (2) no research group provided a shared dataset. In our study, we control the experimental environment to increase the likelihood that our results will be free from experimental confounding factors, and we attempt to develop a simple and efficient user authentication approach based on mouse dynamics. We also make our data available publicly.
III. MOUSE DATA ACQUISITION
In this study, we collect mouse-behavior data in a controlled environment, so as to isolate behavioral characteristics as the principal factors in mouse behavior analysis. We offer here considerable detail regarding the conduct of data collection, because these particulars can best reveal potential biases and threats to experimental validity [27]. Our dataset is publicly available (see footnote 1).
A. Controlled Environment
In this study, we set up a desktop computer and developed a Windows application as a uniform hardware and software platform for the collection of mouse-behavior data. The desktop was an HP workstation with a Core 2 Duo 3.0 GHz processor and 2 GB of RAM. It was equipped with a 17-inch HP LCD monitor (set at 1280 × 1024 resolution) and a USB optical mouse, and ran the Windows XP operating system. Most importantly, all system parameters relating to the mouse, such as speed and sensitivity configurations, were fixed.
The Windows application, written in C#, prompted a user to conduct a mouse-operation task.
During data collection, the application displayed the task in a full-screen window on the monitor, and recorded (1) the corresponding mouse operations (e.g., mouse-single-click), (2) the positions at which the operations occurred, and (3) the timestamps of the operations. The Windows-event clock was used to timestamp mouse operations [28]; it has a resolution of 15.625 milliseconds, corresponding to 64 updates per second.
When collecting data, each subject was invited to perform a mouse-operation task on the same desktop computer free of other subjects; data collection was performed one by one on the same data-collection platform. These conditions make hardware and software factors consistent throughout the process of data collection over all subjects, thus removing unintended side-effects of unrelated hardware and software factors.
B. Mouse-Operation Task Design
To reduce behavioral variations due to different mouse-operation sequences, all subjects were required to perform the same sequence of mouse operations. We designed a mouse-operation task, consisting of a fixed sequence of mouse operations, and made these operations representative of a typical and diverse combination of mouse operations. The operations were selected according to (1) two elementary operations of mouse clicks: single click and double click; and (2) two basic properties of mouse movements: movement direction and movement distance [2], [39]. As shown in Fig. 2, movement directions are numbered from 1 to 8, and each of them is selected to represent one of eight 45-degree ranges over 360 degrees. In addition, three distance intervals are considered to represent short-, middle- and long-distance mouse movements. Table I shows the directions and distances of the mouse movements used in this study. During data collection, every two adjacent movements were separated by either a single click or a double click.
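As a rough illustration of the two movement properties and the clock resolution described above, the following sketch maps a movement to one of eight 45-degree direction sectors and computes its distance. The sector origin (sector 1 starting at 0 degrees, counted counter-clockwise in mathematical coordinates) and all function names are assumptions made for this example; they are not taken from the paper's data-collection software.

```python
import math

# 64 clock updates per second gives 1000 / 64 = 15.625 ms per tick.
TICK_MS = 1000.0 / 64


def direction_sector(x0, y0, x1, y1):
    """Return a sector index in 1..8 for the movement (x0, y0) -> (x1, y1).

    Assumption: sector 1 covers [0, 45) degrees and sectors are counted
    counter-clockwise with the y axis pointing up. Screen coordinates
    usually have y pointing down, so callers may need to negate dy.
    """
    angle = math.degrees(math.atan2(y1 - y0, x1 - x0)) % 360.0
    return int(angle // 45.0) % 8 + 1


def movement_distance(x0, y0, x1, y1):
    """Straight-line distance in pixels between the two end points."""
    return math.hypot(x1 - x0, y1 - y0)


def quantize_timestamp(t_ms):
    """Round a time in ms down to the most recent clock tick (hypothetical helper)."""
    return math.floor(t_ms / TICK_MS) * TICK_MS
```

Under these assumptions, a mostly rightward movement falls in sector 1, and two events less than 15.625 ms apart can receive the same timestamp.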
As a whole, the designed task consists of 16 mouse movements, 8 single clicks, and 8 double clicks. It should be noted that our task may not be unique. However, the task was carefully chosen to induce users to perform a wide variety of mouse movements and clicks that were both typical and diverse in an individual's repertoire of daily mouse behaviors.
Footnote 1: The mouse-behavior dataset is available from http://nskeylab.xjtu.edu.cn/projects/mousedynamics/behavior-data-set/.
Fig. 2. Mouse movement directions: each sector covers all operations performed with angles in one 45-degree range.
TABLE I. MOUSE MOVEMENTS IN THE DESIGNED MOUSE-OPERATION TASK
C. Subjects
We recruited 37 subjects, many from within our lab, but some from the university at large. Our sample of subjects consisted of 30 males and 7 females. All of them were right-handed users, and had been using a mouse for a minimum of two years.
D. Data-Collection Process
All subjects were required to participate in two rounds of data collection per day, and waited at least 24 hours between collections (ensuring that some day-to-day variation existed within the data). In each round, each subject was invited, one by one, to perform the same mouse-operation task 10 times. A mouse-operation sample was obtained when a subject performed the task one time, in which she first clicked a start button on the screen, then moved the mouse to click subsequent buttons prompted by the data-collection application.
Additionally, subjects were instructed to use only the external mouse device, and they were advised that no keyboard would be needed. Subjects were told that if they needed a break or needed to stretch their hands, they were to do so after they had accomplished a full round. This was intended to prevent artificially anomalous mouse operations in the middle of a task. Subjects were admonished to focus on the task, as if they were logging into their own accounts, and to avoid distractions, such as talking with the experimenter, while the task was in progress. Any error in the operating process (e.g., single-clicking a button that required a double click) caused the current task to be reset, requiring the subject to redo it.
Subjects took between 15 days and 60 days to complete data collection. Each subject accomplished 150 error-free repetitions of the same mouse-operation task. The task took between 6.2 seconds and 21.3 seconds, with an average of 11.8 seconds over all subjects. The final dataset contained 5550 samples from 37 subjects.
IV. FEATURE CONSTRUCTION
In this section, we first extract a set of mouse-dynamics features, and then we use distance-measurement methods to obtain feature-distance vectors for reducing behavioral variability. Next, we utilize an eigenspace transformation to extract principal feature components as classifier input.
A. Feature Extraction
The data collected in Section III are sequences of mouse operations, including left-single-clicks, left-double-clicks, and mouse-movements. Mouse features were extracted from these operations, and were typically organized into a vector to represent the sequence of mouse operations in one execution of the mouse-operation task. Table II summarizes the derived features in this study.
TABLE II. MOUSE DYNAMICS FEATURES
We characterized mouse behavior based on two basic types of mouse operations: mouse click and mouse movement.
Each mouse operation was then analyzed individually, and translated into several mouse features. Our study divided these features into two categories:
• Holistic features: features that characterize the overall properties of mouse behaviors during interactions, such as single-click and double-click statistics;
• Procedural features: features that depict the detailed dynamic processes of mouse behaviors, such as the movement speed and acceleration curves.
Most traditional features are holistic features, which suffice to obtain a statistical description of mouse behavior, such as the mean value of click times. They are easy to compute and comprehend, but they only characterize general attributes of mouse behavior. In our study, the procedural features characterize in-depth procedural details of mouse behavior. This information more accurately reflects the efficiency, agility and motion habits of individual mouse users, and thus may lead to a performance boost for authentication. Experimental results in Section VII demonstrate the effectiveness of these newly-defined features.
B. Distance Measurement
The raw mouse features cannot be used directly by a classifier, because of high dimensionality and behavioral variability. Therefore, distance-measurement methods were applied to obtain feature-distance vectors and to mitigate the effects of these issues.
In the calculation of distance measurement, we first used the Dynamic Time Warping (DTW) distance [6] to compute the distance vector of procedural features. The reasons for this choice are that (1) procedural features (e.g., movement speed curve) of two data samples are not likely to consist of exactly the same number of points, whether these samples are generated by the same or by different subjects; (2) DTW distance can be applied directly to measure the distance between the procedural features of two samples without deforming either or both of the two sequences in order to get an equal number of points.
Next, we applied Manhattan distance to calculate the distance vector of holistic features. The reasons for this choice are that (1) this distance is independent between dimensions, and can preserve the physical interpretation of the features since it is computed as the cumulative absolute difference; (2) previous research in related fields (e.g., keystroke dynamics) reported that the use of Manhattan distance for statistical features could lead to a better performance [23].
1) Reference Feature Vector Generation: We established the reference feature vector for each subject from her training feature vectors. Let $X = \{x_1, x_2, \ldots, x_n\}$ be the training set of feature vectors for one subject, where $x_i$ is a $d$-dimensional mouse feature vector extracted from the $i$th training sample, and $n$ is the number of training samples. Consider how the reference feature vector is generated for each subject:
Step 1: we computed the pairwise distance vector of procedural features and holistic features between all pairs of training feature vectors $x_i$ and $x_j$. We used DTW distance to calculate the distance vector of procedural features $D_{\mathrm{pro}}$ for measuring the similarity between the procedural components of the two feature vectors, and we applied Manhattan distance to calculate the distance vector of holistic features $D_{\mathrm{hol}}$:
$$D_{\mathrm{pro}}(x_i, x_j) = \mathrm{DTW}\left(x_i^{\mathrm{pro}}, x_j^{\mathrm{pro}}\right), \qquad D_{\mathrm{hol}}(x_i, x_j) = \mathrm{Manhattan}\left(x_i^{\mathrm{hol}}, x_j^{\mathrm{hol}}\right) \tag{1}$$
where $i, j = 1, \ldots, n$ with $i \neq j$, $x_i^{\mathrm{pro}}$ represents the procedural components of $x_i$, and $x_i^{\mathrm{hol}}$ represents the holistic components.
Step 2: we concatenated the distance vectors of holistic features and procedural features together to obtain a distance vector for the training feature vectors $x_i$ and $x_j$ by
$$D(x_i, x_j) = \left[ D_{\mathrm{hol}}(x_i, x_j),\; D_{\mathrm{pro}}(x_i, x_j) \right] \tag{2}$$
Step 3: we normalized the distance vector to get a scale-invariant feature vector:
$$\tilde{D}(x_i, x_j) = \frac{D(x_i, x_j) - \mu}{\sigma} \tag{3}$$
where $\mu$ is the mean of all pairwise distance vectors from the training set, and $\sigma$ is the corresponding standard deviation (the division is taken component by component).
Step 4: for each training feature vector, we calculated the arithmetic mean distance between this vector and the remaining training vectors, and found the reference feature vector with minimum mean distance:
$$x_{\mathrm{ref}} = \arg\min_{x_i \in X} \frac{1}{n-1} \sum_{j \neq i} d(x_i, x_j) \tag{4}$$
where $d(x_i, x_j)$ denotes the scalar distance obtained from $\tilde{D}(x_i, x_j)$.
2) Feature-Distance Vector Calculation: Given the reference feature vector for each subject, we then computed the feature-distance vector between a new mouse feature vector and the reference vector. Let $x_{\mathrm{ref}}$ be the reference feature vector for one subject; then for any new feature vector $x_{\mathrm{new}}$ (either from the legitimate user or an impostor), we can compute the corresponding distance vector $\tilde{D}(x_{\mathrm{ref}}, x_{\mathrm{new}})$ by (1), (2) and (3).
In this paper, we used all mouse features in Table II to generate the feature-distance vector. There are 10 click-related features, 16 distance-related features, 16 time-related features, 16 speed-related features, and 16 acceleration-related features, which were taken together and then transformed to a 74-dimensional feature-distance vector that represents each mouse-operation sample.
C. Eigenspace Computation: Training and Projection
It is usually undesirable to use all components in the feature vector as input for the classifier, because much of the data will not provide a significant degree of uniqueness or consistency. We therefore applied an eigenspace-transformation technique to extract the principal components as classifier input.
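The distance-measurement steps of Section IV-B can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the authors' implementation: each sample is assumed to be a dictionary with a list of procedural curves and a list of holistic values, the DTW local cost is assumed to be the absolute difference, the holistic distance vector is assumed to hold one absolute difference per feature, and the scalar used to pick the reference vector is assumed to be the sum of the normalized components. All names are hypothetical.

```python
from itertools import combinations
from statistics import mean, pstdev


def dtw_distance(a, b):
    """Classic dynamic-programming DTW between two curves of any lengths."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(a)][len(b)]


def distance_vector(s, t):
    """Steps 1 and 2: one DTW distance per curve, one absolute difference per statistic."""
    pro = [dtw_distance(p, q) for p, q in zip(s["procedural"], t["procedural"])]
    hol = [abs(p - q) for p, q in zip(s["holistic"], t["holistic"])]
    return pro + hol


def fit_normalizer(samples):
    """Mean and standard deviation of every component over all training pairs."""
    vecs = [distance_vector(s, t) for s, t in combinations(samples, 2)]
    mu = [mean(col) for col in zip(*vecs)]
    sigma = [pstdev(col) or 1.0 for col in zip(*vecs)]  # guard against zero spread
    return mu, sigma


def normalized_distance(s, t, mu, sigma):
    """Step 3: component-wise (D - mu) / sigma."""
    return [(d - m) / sd for d, m, sd in zip(distance_vector(s, t), mu, sigma)]


def reference_index(samples, mu, sigma):
    """Step 4: index of the sample with the smallest mean distance to the others."""
    def mean_dist(i):
        return mean(sum(normalized_distance(samples[i], samples[j], mu, sigma))
                    for j in range(len(samples)) if j != i)
    return min(range(len(samples)), key=mean_dist)
```

A feature-distance vector for a new sample would then be `normalized_distance(samples[ref], new_sample, mu, sigma)`, with `ref` obtained from `reference_index`.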
1) Kernel PCA Training: Kernel principal component analysis (KPCA) [37] is one approach to generalizing linear PCA to nonlinear cases using kernel methods. In this study, the purpose of KPCA is to obtain the principal components of the original feature-distance vectors. The calculation process is illustrated as follows:
For each subject, the training set $Z = \{z_1, z_2, \ldots, z_N\}$ represents a set of feature-distance vectors drawn from her own data. Let $z_i$ be the $i$th feature-distance vector in the training set, and $N$ be the number of such vectors. We first mapped the measured vectors into the hyperdimensional feature space $F$ by the nonlinear mapping $\Phi$. Then we can obtain the mean and sample covariance of such a training set by
$$\bar{\Phi} = \frac{1}{N} \sum_{i=1}^{N} \Phi(z_i) \tag{5}$$
$$C = \frac{1}{N} \sum_{i=1}^{N} \tilde{\Phi}(z_i)\, \tilde{\Phi}(z_i)^{T} \tag{6}$$
Here we centered the mapped point with the corresponding mean as $\tilde{\Phi}(z_i) = \Phi(z_i) - \bar{\Phi}$. The principal components were then computed by solving the eigenvalue problem:
$$\lambda v = C v \tag{7}$$
where $\lambda \geq 0$ and $v \in F \setminus \{0\}$. Then, by defining a kernel matrix
$$K_{ij} = \left\langle \tilde{\Phi}(z_i), \tilde{\Phi}(z_j) \right\rangle \tag{8}$$
we computed an eigenvalue problem for the coefficients $\alpha$, with $v = \sum_{i=1}^{N} \alpha_i \tilde{\Phi}(z_i)$, that is now solely dependent on the kernel function $k$:
$$N \lambda \alpha = K \alpha \tag{9}$$
For details, readers can refer to B. Schölkopf et al. [37].
Generally speaking, the first few eigenvectors correspond to large eigenvalues and carry most of the information in the training samples. Therefore, for the sake of providing the principal components to represent mouse behavior in a low-dimensional eigenspace, and for memory efficiency, we ignored small eigenvalues and their corresponding eigenvectors, using a threshold value $T$:
$$R_m = \frac{\sum_{i=1}^{m} \lambda_i}{\sum_{i=1}^{N} \lambda_i} \geq T \tag{10}$$
where $R_m$ is the accumulated variance of the first $m$ largest eigenvalues with respect to all eigenvalues. In this study, $T$ was chosen as 0.95 for all subjects, with a range from 0 to 1. Note that we used the same $T$ for different subjects, so $m$ may be different from one subject to another. Specifically, in our experiments, we observed that the number of principal components for different subjects varied from 12 to 20, and on average 17 principal components were identified under the threshold of 0.95.
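The component-selection rule above (keep the fewest leading eigenvalues whose accumulated share reaches the threshold) can be illustrated as follows. The function name and the handling of non-positive eigenvalues are assumptions for this sketch; the eigenvalues themselves would come from the kernel eigenvalue problem, which is not solved here.

```python
def num_components(eigenvalues, threshold=0.95):
    """Smallest m such that the m largest eigenvalues reach `threshold`
    of the total eigenvalue sum (accumulated-variance criterion)."""
    if not 0.0 < threshold <= 1.0:
        raise ValueError("threshold must lie in (0, 1]")
    # Assumption: non-positive eigenvalues (numerical noise) are discarded.
    vals = sorted((v for v in eigenvalues if v > 0), reverse=True)
    if not vals:
        raise ValueError("no positive eigenvalues")
    total = sum(vals)
    running = 0.0
    for m, v in enumerate(vals, start=1):
        running += v
        if running / total >= threshold:
            return m
    return len(vals)
```

Because the same threshold is applied to each subject's own eigenvalue spectrum, the returned count can differ between subjects, which matches the 12 to 20 range reported above.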
2) Kernel PCA Projection: For the selected subject, taking the $m$ largest eigenvalues and the associated eigenvectors $v_1, \ldots, v_m$, the transform matrix $W = [v_1, \ldots, v_m]$ can be constructed to project an original feature-distance vector $z$ into a point $y$ in the $m$-dimensional eigenspace:
$$y = W^{T} \tilde{\Phi}(z) \tag{11}$$
As a result, each subject's mouse behavior can be mapped into a manifold trajectory in such a parametric eigenspace. It is well known that $m$ is usually much smaller than the dimensionality of the original feature space. That is to say, eigenspace analysis can dramatically reduce the dimensionality of input samples. In this way, we used the extracted principal components of the feature-distance vectors as input for subsequent classifiers.
V. CLASSIFIER IMPLEMENTATION
This section explains the classifier that we used, and introduces two other widely-used classifiers. Each classifier analyzes mouse-behavior data, and discriminates between a legitimate user and impostors.
A. One-Class Classifier Overview
User authentication is still a challenging task from the pattern-classification perspective. It is a two-class (legitimate user versus impostors) problem. In the scenario of mouse-dynamics-based user authentication, a login user is required to provide the user name and to perform a specific mouse-operation task which would be secret, like a password. Each user would choose her own mouse-operation task, and would not share that task with others. Thus, when building a model for a legitimate user, the only behavioral samples of her specific task are her own; other users' (considered as impostors in our scenario) samples of this task are not readily available. In this scenario, therefore, an appropriate solution is to build a model based only on the legitimate user's data samples, and use that model to detect impostors. This type of problem is known as one-class classification [43] or novelty/anomaly detection [25], [26].
We thus focused our attention on this type of problem, especially because in a real-world situation we would not have impostor renditions of a legitimate user's mouse operations anyway.

B. Our Classifier: One-Class Support Vector Machine

Traditional one-class classification methods are often unsatisfying, frequently missing some true positives and producing too many false positives. In this study, we used a one-class Support Vector Machine (SVM) classifier, introduced by Scholkopf et al. [36], [38]. One-class SVMs have been successfully applied to a number of real-life classification problems, e.g., face authentication, signature verification, and keystroke authentication [1], [23].

In our context, given $n$ training samples belonging to one subject, $x_1, \ldots, x_n$, each sample has $M$ features (corresponding to the principal components of the feature-distance vector for that subject). The aim is to find a hyperplane that separates the data points by the largest margin. To separate the data points from the origin, one needs to solve the following dual quadratic programming problem [36], [38]:

$$\min_{\alpha}\ \frac{1}{2}\sum_{i,j}\alpha_i\alpha_j k(x_i, x_j) \quad \text{subject to} \quad 0 \le \alpha_i \le \frac{1}{\nu n},\ \ \sum_{i=1}^{n}\alpha_i = 1 \qquad (12)$$

where $\alpha$ is the vector of nonnegative Lagrangian multipliers to be determined, $\nu$ is a parameter that controls the trade-off between maximizing the number of data points contained by the hyperplane and the distance of the hyperplane from the origin, and $k$ is the kernel function. We allow for nonlinear decision boundaries. Then the decision function

$$f(x) = \operatorname{sgn}\Big(\sum_{i=1}^{n}\alpha_i k(x_i, x) - \rho\Big) \qquad (13)$$

will be positive for the examples from the training set, where $\rho$ is the offset of the decision function.

In essence, we viewed the user authentication problem as a one-class classification problem. In the training phase, the learning task was to build a classifier based on the legitimate subject's feature samples. In the testing phase, the test feature sample was projected into the same high-dimensional space, and the output of the decision function was recorded. We used a radial basis function (RBF) kernel in our evaluation, after comparative studies of linear, polynomial, and sigmoid kernels based on classification accuracy. The SVM parameter $\nu$ and kernel parameter $\gamma$ (using LibSVM [11]) were set to 0.06 and 0.004 respectively. The decision function should generate "+1" if the authorized user's test set is the input; otherwise a false rejection occurs. Conversely, "-1" should be obtained if the impostors' test set is the input; otherwise a false acceptance occurs.

C. Other Classifiers: Nearest Neighbor and Neural Network

In addition, we compared our classifier with two other widely-used classifiers, KNN and neural network [12].

For KNN, in the training phase, the nearest neighbor classifier estimated the covariance matrix of the training feature samples, and saved each feature sample. In the testing phase, the nearest neighbor classifier calculated the Mahalanobis distance from the new feature sample to each of the samples in the training data. The average distance from the new sample to the nearest $k$ feature samples from the training data was used as the anomaly score. After multiple tests with $k$ ranging from 1 to 5, we selected the best-performing $k$; the results are detailed in Section VII.

For the neural network, in the training phase a network was built with input nodes, one output node, and hidden nodes. The network weights were randomly initialized between 0 and 1. The classifier was trained to produce a 1.0 on the output node for every training feature sample. We trained for 1000 epochs using a learning rate of 0.001. In the testing phase, the test sample was run through the network, and the output of the network was recorded. Denote $s$ to be the output of the network; intuitively, if $s$ is close to 1.0, the test sample is similar to the training samples, and if $s$ is close to 0.0, it is dissimilar.

VI. EVALUATION METHODOLOGY

This section explains the evaluation methodology for mouse behavior analysis. First, we summarize the dataset collected in Section III. Next, we set up the training and testing procedure for our one-class classifiers. Then, we show how classifier performance was calculated. Finally, we introduce a statistical testing method to further analyze experimental results.

A. Dataset

As discussed in Section III, samples of mouse-behavior data were collected when subjects performed the designed mouse-operation task in a tightly-controlled environment. All 37 subjects produced a total of 5550 mouse-operation samples. We then calculated feature-distance vectors, and extracted principal components from each vector as input for the classifiers.

B. Training and Testing Procedure

Consider the scenario described in Section V-A. We started by designating one of our 37 subjects as the legitimate user, and the rest as impostors. We trained the classifier and tested its ability to recognize the legitimate user and impostors as follows:

Step 1: We trained the classifier to build a profile of the legitimate user on a randomly-selected half of the samples (75 out of 150 samples) from that user.

Step 2: We tested the ability of the classifier to recognize the legitimate user by calculating anomaly scores for the remaining samples generated by the user. We designated the scores assigned to each sample as genuine scores.

Step 3: We tested the ability of the classifier to recognize impostors by calculating anomaly scores for all the samples generated by the impostors. We designated the scores assigned to each sample as impostor scores.

This process was then repeated, designating each of the other subjects as the legitimate user in turn. In the training phase, 10-fold cross validation [24] was employed to choose parameters of the classifiers.
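Steps 1 to 3 above can be sketched for a single designated legitimate user as follows. This is only an illustration under stated assumptions: the anomaly scorer is a simple distance-to-centroid stand-in rather than the one-class SVM used in the study, and the names `centroid_distance_scorer`, `evaluate_subject`, and the toy data are hypothetical.

```python
import math
import random


def centroid_distance_scorer(training_samples):
    """Stand-in anomaly scorer (NOT the paper's one-class SVM):
    the score of a sample is its Euclidean distance to the training centroid."""
    dims = len(training_samples[0])
    count = len(training_samples)
    centroid = tuple(sum(s[d] for s in training_samples) / count for d in range(dims))
    return lambda sample: math.dist(tuple(sample), centroid)


def evaluate_subject(samples_by_subject, legitimate, fit_scorer, rng):
    """Run Steps 1-3 for one legitimate user; return (genuine, impostor) scores."""
    own = list(samples_by_subject[legitimate])
    rng.shuffle(own)
    half = len(own) // 2
    score = fit_scorer(own[:half])              # Step 1: train on a random half
    genuine = [score(s) for s in own[half:]]    # Step 2: score the remaining half
    impostor = [score(s)                        # Step 3: score every other subject's samples
                for subject, samples in samples_by_subject.items()
                if subject != legitimate
                for s in samples]
    return genuine, impostor


# Hypothetical toy data: 3 subjects, 10 two-dimensional samples each.
rng = random.Random(0)
centers = {"s1": (0.0, 0.0), "s2": (3.0, 3.0), "s3": (-3.0, 2.0)}
toy_data = {
    name: [(cx + rng.gauss(0, 0.1), cy + rng.gauss(0, 0.1)) for _ in range(10)]
    for name, (cx, cy) in centers.items()
}
genuine_scores, impostor_scores = evaluate_subject(
    toy_data, "s1", centroid_distance_scorer, rng
)
```

Calling `evaluate_subject` once per subject, and repeating the whole loop with fresh random splits, would mirror the rotation over legitimate users described in the text; with the study's sizes the split would be 75 training and 75 test samples per user.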
Since we used a random sampling method to divide the data into training and testing sets, and we wanted to account for the effect of this randomness, we repeated the above procedure 50 times, each time with independently selected samples drawn from the entire dataset.

C. Calculating Classifier Performance

To convert these sets of classification scores of the legitimate user and impostors into aggregate measures of classifier performance, we computed the false-acceptance rate (FAR) and false-rejection rate (FRR), and used them to generate an ROC curve [42]. In our evaluation, for each user, the FAR is calculated as the ratio between the number of false acceptances and the number of test samples of impostors; the FRR is calculated as the ratio between the number of false rejections and the number of test samples of legitimate users. Then we computed the average FAR and FRR over all subjects.

Whether or not a mouse-operation sample generates an alarm depends on the threshold for the anomaly scores. An anomaly score over the threshold indicates an impostor, while a score under the threshold indicates a legitimate user. To make a user authentication scheme deployable in practice, minimizing the possibility of rejecting a true user (lower FRR) is sometimes more important than lowering the probability of accepting an impostor [46]. Thus we adjusted the threshold according to the FRR for the training data. Since calculation of the FRR requires only the legitimate user's data, no impostor data was used for determining the threshold. Specifically, the threshold is treated as a variable over its allowed interval, and is chosen to give a relatively low FRR using 10-fold cross validation on the training data. After multiple tests, we observed that setting the threshold to a value of 0.1 yields a low FRR on average (see footnote 2). Thus, we show results with a threshold value of 0.1 throughout this study.

Footnote 2: Note that for different classifiers, there are different threshold intervals. For instance, the threshold interval for the neural network detector is [0, 1], and the one-class SVM uses a different interval. For uniform presentation, we mapped all of the intervals to a common range.

D. Statistical Analysis of the Results

To evaluate the performance of our approach, we developed a statistical test using the half total error rate (HTER) and confidence-interval (CI) evaluation [5]. The HTER test aims to statistically evaluate the performance for user authentication, and is defined by combining the false-acceptance rate (FAR) and false-rejection rate (FRR):

$$HTER = \frac{FAR + FRR}{2} \qquad (14)$$

Confidence intervals are computed around the HTER as $HTER \pm \sigma \cdot Z_{\alpha/2}$, where $\sigma$ and $Z_{\alpha/2}$ are computed by [5]:

$$\sigma = \sqrt{\frac{FAR(1 - FAR)}{4 \cdot NI} + \frac{FRR(1 - FRR)}{4 \cdot NG}} \qquad (15)$$

$$Z_{\alpha/2} = \begin{cases} 1.645 & \text{for a 90\% CI} \\ 1.960 & \text{for a 95\% CI} \\ 2.576 & \text{for a 99\% CI} \end{cases} \qquad (16)$$

where NG is the total number of genuine scores, and NI is the total number of impostor scores.

VII. EXPERIMENTAL RESULTS AND ANALYSIS

Extensive experiments were carried out to verify the effectiveness of our approach. First, we performed the authentication task using our approach, and compared it with two widely-used classifiers. Second, we examined our primary results concerning the effect of eigenspace transformation methods on classifier performance. Third, we explored the effect of sample length on classifier performance, to investigate the trade-off between security and usability. Two additional experiments are provided to compare our method with other approaches in the literature.

A. Experiment 1: User Authentication

In this section, we conducted a user authentication experiment, and compared our classifier with the two widely-used ones mentioned in Section V-C. The data used in this experiment consisted of 5550 samples from 37 subjects. Fig. 3 and Table III show the ROC curves and average FARs and FRRs of the authentication experiment for each of the three classifiers, with standard deviations in parentheses. Table III also includes the average authentication time, which is the sum of the average time needed to collect the data and the average time needed to make the authentication decision (note that since the latter of these two times is always less than 0.003 seconds in our classifiers, we ignore it in this study).

Fig. 3. ROC curves for the three different classifiers used in this study: one-class SVM, neural network, and nearest neighbor.

TABLE III. FARs and FRRs of user authentication experiment (with standard deviations in parentheses).

Our first observation is that the best performance has a FAR of 8.74% and a FRR of 7.96%, obtained by our approach (one-class SVM). This result is promising and competitive, and the behavioral samples are captured over a much shorter period of time compared with previous work. It should be noted that our result does not yet meet the European standard for commercial biometric technology, which requires near-perfect accuracy of 0.001% FAR and 1% FRR [10]. But it does demonstrate that mouse dynamics could provide valuable information in user authentication tasks. Moreover, with a series of incremental improvements and investigations (e.g., outlier handling), it seems possible that mouse dynamics could be used as, at least, an auxiliary authentication technique, such as an enhancement for conventional password mechanisms.

Our second observation is that our approach has substantially better performance than all other classifiers considered in our study. This may be due to the fact that SVMs can convert the problem of classification into quadratic optimization in the case of relative insufficiency of prior knowledge, and still maintain high accuracy and stability. In addition, the standard deviations of the FAR and FRR for our approach are much smaller than those for other classifiers, indicating that our approach may be more robust to variable behavior data and different parameter selection procedures.

Our third observation is that the average authentication time in our study is 11.8 seconds, which is an acceptable level of performance for a practical application. Some previous approaches may lead to low availability due to a relatively long authentication time. However, an authentication time of 11.8 seconds in our study shows that we can perform mouse-dynamics analysis quickly enough to make it applicable to authentication for most login processes. We conjecture that the significant decrease of authentication time is due to procedural features providing more detailed and fine-grained information about mouse behavior, which could enhance performance.

Finally, we conducted a statistical test, using the HTER and CI evaluation as mentioned in Section VI-D, to statistically evaluate the performance of our approach. Table IV summarizes the results of this statistical evaluation at different confidence levels. The result shows that the proposed approach provides the lowest HTER in comparison with the other two classifiers used in our study; the corresponding 95% confidence interval is given in Table IV.

TABLE IV. HTER performance and confidence interval at different confidence levels.

B. Experiment 2: Effect of Eigenspace Transformation

This experiment examined the effect of eigenspace-transformation methods on classifier performance. The data used were the same as in Experiment 1. We applied a one-class SVM classifier in three evaluations, with the inputs respectively set to be the original feature-distance vectors (without any transformations), the projection of feature-distance vectors by PCA, and the projection of feature-distance vectors by KPCA. Fig. 4 and Table V show the ROC curves and average FARs and FRRs for each of the three feature spaces, with standard deviations in parentheses.

Fig. 4. ROC curves for three different feature spaces: the original feature space, the projected feature space by PCA, and the projected feature space by KPCA.

TABLE V. FARs and FRRs for three different feature spaces (with standard deviations in parentheses).

As shown in Fig. 4 and Table V, the authentication accuracy for the feature space transformed by KPCA is the best, followed by the accuracies for the feature space transformed by PCA and the original one. Specifically, direct classification in the original feature space (without transformations) produces a FAR of 15.45% and FRR of 15.98%. This result is not encouraging compared to results previously reported in the literature. However, as mentioned in Experiment 1, the samples may be subject to more behavioral variability compared with previous work, because previous work analyzed mouse behaviors over a longer period of observation. Moreover, we observe that the authentication results obtained with PCA and with KPCA (Table V) are much better than for direct classification. This result is a demonstration of the effectiveness of the eigenspace transformation in dealing with variable behavior data. Furthermore, we find that the performance of KPCA is slightly superior to that of PCA. This may be due to the nonlinear variability (or noise) existing in mouse behaviors, which KPCA can reduce by using kernel transformations [29]. It is also of note that the standard deviations of FAR and FRR based on the feature spaces transformed by KPCA and PCA are smaller than those of the original feature space (without transformations), indicating that the eigenspace-transformation technique enhances the stability and robustness of our approach.

C. Experiment 3: Effect of Sample Length

This experiment explored the effect of sample length on classifier performance, to investigate the trade-off between security (authentication accuracy) and usability (authentication time). In this study, the sample length corresponds to the number of mouse operations needed to form one data sample. Each original sample consists of 32 mouse operations. To explore the effect of sample length on the performance of our approach, we derived new datasets with different sample lengths by applying bootstrap sampling techniques [13] to the original dataset, so that each derived dataset contains the same number of samples as the original dataset. The new data samples were generated in the form of multiple consecutive mouse samples from the original dataset. In this way, we considered classifier performance as a function of the sample length using all bootstrap samples derived from the original dataset. We conducted the authentication experiment again (using one-class SVM) on six derived datasets, with sample lengths of up to 800 operations.

TABLE VI. FARs and FRRs of different sample lengths.

Table VI shows the FARs and FRRs at varying sample lengths, using a one-class SVM classifier. The table also includes the authentication time in seconds. The FAR and FRR obtained using a sample length of 32 mouse operations are 8.74% and 7.96% respectively, with an authentication time of 11.8 seconds. As the number of operations increases, the FAR and FRR drop to 6.7% and 6.68% for a data sample comprised of 80 mouse operations, corresponding to an authentication time of 29.88 seconds. Therefore, we may conclude that classifier performance almost certainly gets better as the sample length increases. Note that 60 seconds may be an upper bound for authentication time, but the corresponding FAR of 4.69% and FRR of 4.46% are still not low enough to meet the needs of the European standard for commercial biometric technology [10]. We find that after observing 800 mouse operations, our approach can obtain a FAR of 0.87% and a FRR of 0.69%, which is very close to the European standard, but with a corresponding authentication time of about 10 minutes. This long authentication time may limit applicability in real systems. Thus, a trade-off must be made between security and user acceptability, and more investigations and improvements should be performed to secure a place for mouse dynamics in more pragmatic settings.

D. Comparison

User authentication through mouse dynamics has attracted growing interest in the research community. However, there is no shared dataset or baseline algorithm for measuring and determining what factors affect performance. The unavailability of an accredited common dataset (such as the FERET database in face recognition [32]) and of a standard evaluation methodology has been a limitation in the development of mouse dynamics. Most researchers trained their models on different feature sets and datasets, but none of them made informed comparisons among different mouse feature sets and different results. Thus two additional experiments are offered here to compare our approach with those in the literature.

1) Comparison 1: Comparison With Traditional Features: As stated above, we constructed the feature space based on mouse clicks and mouse movements, consisting of holistic features and procedural features. To further examine the effectiveness of the features constructed in this study, we provide a comparative experiment. We chose the features used by Gamboa et al. [17], Aksari and Artuner [4], Hashia et al. [19], Bours and Fullu [8], and Ahmed and Traore [2], because they were among the most frequently cited, and they represented a relatively diverse set of mouse-dynamics features. We then used a one-class SVM classifier to conduct the authentication experiment again on our same dataset with both the feature set defined in our study, and the feature sets used in other studies. Hence, the authentication accuracies of different feature sets can be compared. Fig. 5 and Table VII show the ROC curves and average FARs and FRRs for each of the six feature sets, with standard deviations in parentheses.

Fig. 5. ROC curves for six different feature sets: the feature set in our study, and the feature sets in other studies.

TABLE VII. Results of comparison with some traditional features (with standard deviations in parentheses).

Note to Table VII: The approach in [2] was initially applied to intrusion detection, and we extracted the parts of its features closely related to mouse operations in our dataset. The reason for this decision is that we want to examine whether the features employed in intrusion detection can be used in user authentication.

We can see that the average error rates for the feature set from our approach are much lower than those of the feature sets from the literature. We conjecture that this may be due to the procedural features providing fine-grained information about mouse behavior, but it may also be due, in part, to: (1) partial adoption of features defined in previous approaches because of different data-collection environments; (2) using different types of thresholds on the anomaly scores; (3) using less enrollment data than was used in previous experiments. The improved performance based on using our features also indicates that our features may allow more accurate and detailed characterization of a user's unique mouse behavior than was possible with previously used features. Another thing to note from Table VII is that the standard deviations of error rates for features in our study are smaller than those for traditional features, suggesting that our features might be more stable and robust to variability in behavior data.

One may also wonder how much of the authentication accuracy of our approach is due to the use of procedural features or holistic features. We tested our method using procedural features and holistic features separately, and the set of procedural features was the choice that proved to perform better.
Specifically, we observe that the authentication accuracy obtained with the set of procedural features is much better than that for the set of holistic features, which have a FAR of 19.58% and a FRR of 17.96%. In combination with the result when using all features, it appears that procedural features may be more stable and discriminative than holistic features, which suggests that the procedural features contribute more to the authentication accuracy.

The results here only provide preliminary comparative results and should not be used to conclude that a certain set of mouse features is always better than others. Each feature set has its own unique advantages and disadvantages under different conditions and applications, so further evaluations and comparisons on more realistic and challenging datasets are needed.

2) Comparison 2: Comparison With Previous Work: Most previous approaches have either resulted in poor performance (in terms of authentication accuracy or time), or have used data of limited size. In this section, we show a qualitative comparison of our experimental results and settings against results of previous work (listed in Table VIII).

TABLE VIII. Comparison with previous work.

Notes to Table VIII: Authentication time was not explicitly reported in [4], [8], [17]; instead, they required the user to accomplish a number of mouse operations for each authentication (15 clicks and 15 movements for [17]; 10 clicks and 9 movements for [4]; 18 short movements without pauses for [8]). Authentication time was not explicitly stated in [2]; however, it can be estimated from the data-collection process. For example, it is stated in [2] that an average of 12 hours 55 minutes of data were captured from each subject, representing an average of 45 sessions. We therefore assume that the average session length is (12 h 55 min)/45 ≈ 17.22 minutes ≈ 1033 seconds.

Revett et al. [34] and Aksari and Artuner [4] considered mouse dynamics as a standalone biometric, and obtained an authentication accuracy of EER around 4% and 5.9% respectively, with a relatively short authentication time or small number of mouse operations. But their results were based on a small pool of users (6 users in [34] and 10 users in [4]), which may be insufficient to obtain a good, steady result. Our study relies on an improved user authentication methodology and far more users, leading us to achieve a good and robust authentication performance.

Ahmed and Traore [2] achieved a high authentication accuracy, but as we mentioned before, it might be difficult to use such a method for user authentication since the authentication time or the number of mouse operations needed to verify a user's identity is too high to be practical for real systems.

Additionally, Hashia et al. [19] and Bours and Fullu [8] could perform user authentication in a relatively short time, but they reported unacceptably high error rates (EER of 15% in [19], and EER of 26.8% in [8]). In our approach we can make an authentication decision with a reasonably short authentication time while maintaining high accuracy. We employ a one-class classifier, which is more appropriate for mouse-dynamics-based user authentication. As mentioned in Experiment 3, we can make an authentication decision in less than 60 seconds, with corresponding error rates of 4.69% FAR and 4.46% FRR. Although this result could be improved, we believe that, at our current performance level, mouse dynamics suffice to be a practical auxiliary authentication mechanism.

In summary, Comparison 1 shows that our proposed features outperform some traditional features used in previous studies, and may be more stable and robust to variable behavior data. Comparison 2 indicates that our approach is competitive with existing approaches in authentication time while maintaining high accuracy. More detailed statistical studies on larger and more realistic datasets are desirable for further evaluations.

VIII. DISCUSSION AND EXTENSION FOR FUTURE WORK

Based on the findings from this study, we take away some messages, each of which may suggest a trajectory for future work. Additionally, our work highlights the need for shared data and resources.

A. Success Factors of Our Approach

The presented approach achieved a short authentication time and relatively high accuracy for mouse-dynamics-based user authentication. However, it is quite hard to point out one or two things that may have made our results better than those of previous work, because (1) past work favored realism over experimental control, (2) evaluation methodologies were inconsistent among previous work, and (3) there have been no public datasets on which to perform comparative evaluations.

Experimental control, however, is likely to be responsible for much of our success. Most previous work does not reveal any particulars in controlling experiments, while our work is tightly controlled. We made every effort to control experimental confounding factors to prevent them from having unintended influence on the subject's recorded mouse behavior. For example, the same desktop computer was used for data collection for all subjects, and all system parameters relating to the mouse were fixed. In addition, every subject was provided with the same instructions. These settings suggest strongly that the differences in subjects were due to individually detectable mouse-behavior differences among subjects, and not to environmental variables or experimental conditions. We strongly advocate the control of potential confounding factors in future experiments. The reason is that controlled experiments are necessary to reveal causal connections among experimental factors and classifier performance, while realistic but uncontrolled experiments may introduce confounding factors that could influence experimental outcomes, which would make it hard to tell whether the results of those evaluations actually reflect detectable differences in mouse behavior among test subjects, or differences among computing environments.

We had more subjects (37), more repetitions of the operation task (150), and more comprehensive mouse operations (2 types of mouse clicks, 8 movement directions, and 3 movement distance ranges) than most studies did. Larger subject pools, however, sometimes make things harder; when there are more subjects there is a higher possibility that two subjects will have similar mouse behaviors, resulting in more classification errors.

We proposed the use of procedural features, such as the movement speed curve and acceleration curve, to provide more fine-grained information about mouse behavior than some traditional features. This may allow one to accurately describe a user's unique mouse behavior, thus leading to a performance improvement for mouse-dynamics-based user authentication.

We adopted methods for distance measurement and eigenspace transformation for obtaining principal feature components to efficiently represent the original mouse feature space. These methods not only overcome within-class variability of mouse behavior, but also preserve between-class differences of mouse behavior. The improved authentication accuracies demonstrate the efficacy of these methods.

Finally, we used a one-class learning algorithm to perform the authentication task, which is more appropriate for mouse-dynamics-based user authentication in real applications.

In general, until there is a comparative study that stabilizes these factors, it will be hard to be definitive about the precise elements that made this work successful.

B. Opportunities for Improvement

While previous studies showed promising results in mouse dynamics, none of them have been able to meet the requirement of the European standard for commercial biometric technology. In this work, we determined that mouse dynamics may achieve a pragmatically useful level of accuracy, but with an impractically long authentic