Walking and typing on a smartphone is an extremely common interaction. Previous research has shown that error rates are higher when walking than when stationary. In this paper we analyse the acceleration data logged in an experiment in which users typed whilst walking, and extract the gait phase angle. We find statistically significant relationships between tapping time, error rate and gait phase angle. We then use the gait phase as an additional input to an offset model, and show that this allows more accurate touch interaction for walking users than a model which considers only the recorded tap position.
A significant amount of research effort has been focused on developing innovative ways of improving typing accuracy and speed on mobile devices with touch screens. This is typically achieved by adapting keyboards or generating offset models for particular users or user groups. However, despite it being recognized that performance can be significantly affected by whether the use case is static or dynamic, and that inclusion of movement data can significantly improve the interaction experience, there has been relatively little effort in quantifying the effect of user movement on targeting and typing performance.
The goal of this paper is to reanalyse the accelerometer data logged during an experimental study of typing while walking to explore possible relationships between movement and typing behaviour. In particular, we answer the following research questions:
Does typing behaviour change when users are moving versus when stationary? Does error rate depend on the point in the gait cycle at which the touch is made?
Are there systematic changes in the touch distributions across the keyboard between sitting, standing, and walking? Does this suggest any changes to keyboard design?
Do offset models improve touch input accuracy if given extra information about acceleration or gait phase angle prior to the touch event?
The influence of posture is recognized as an important factor in designing more accurate text entry interfaces since different postures result in different typing patterns. The ContextType system improved text entry accuracy by inferring the interacting fingers. Posture is typically inferred from on-board sensors, which can also be used to extract other useful information. In one such system, gyroscope and swiping motions were used to infer which hand was holding the device and which hand was used for interaction. This technique yielded 84.3 % accuracy for hand posture detection and 99.7 % accuracy for table vs. hand detection.
Movement also affects text entry accuracy. Goel et al. described an accelerometer-based adaptive virtual keyboard named “WalkType” that employed an algorithm inspired by image stabilization for cameras. For two-thumb typing on a mobile phone, the system reduced uncorrected errors in the walking condition by 45.2 % and increased the typing speed by 12.9 %. Baldwin and Chai implemented a dynamic, user-specific adaptation of virtual keyboard key sizes, based on a language model and observed typing behaviour. A possible limitation of such an approach is the inclusion of keyboard-related elements in the model. Mizobuchi et al. found no significant interaction between target size and movement. They also observed no effect of walking speed or input speed on accuracy, although they noted that users walked more slowly while using a mobile phone than at their usual walking speed. In one study, subjects were placed on a treadmill, with the speed varied from 0–160 % of preferred walking speed (PWS). No matter how slowly the users walked, accuracy was reduced compared to the static case. An additional observation was made: walking speed when using a mobile phone was on average about 24 % below PWS, compared with walking without the device. A similar decrease in walking speed (25 %) has been observed in a separate study. Crossan et al. explored the relationship between gait phase and interaction accuracy. Their analysis showed that there was a part of the gait cycle that was about 3 times more likely to be used for tapping, and for which tapping accuracy was higher, with a lower error mean and lower variability. The correlation between the gait phase angle and typing accuracy has also been studied for several typing postures. The authors of that study also examined the influence of walking speed on different performance parameters, such as accuracy, response time, and throughput, as well as offset models.
The so-called Fat Finger Problem, referring to the issue of touching small targets with the ambiguous contact area of a soft finger, has long been studied as a source of error in touch interactions, including interaction on the go. This is compounded by perceptual issues, leading to a systematic offset between the user’s intended and actual touch locations. Henze et al. collected millions of touch examples in two Android games and used polynomials to model the offsets on a device-specific basis. Weir et al. identified that offsets have a random component as well as a systematic one, and used Gaussian Process regression to probabilistically model offsets. In later work, the authors extended this model to typing data and combined it with a statistical language model to perform autocorrection. Bi and Zhai proposed a dual Gaussian explanation for touch accuracy, and used this to derive the Bayesian touch criterion. This was shown to improve selection accuracy for small targets. The vast majority of such models assume either that the use context of the device is not changing, or that it is changing but the change does not affect the user’s targeting behaviour. This motivates our investigation of whether targeting behaviour changes when in motion, and if so whether motion data can be used to improve targeting performance.
Acceleration data can be used to explore possible relationships between tapping accuracy and gait phase as well as relationships between input technique and gait phase. Figure 1 shows how gait phase angle can be physically interpreted: it illustrates the relationship between gait phase zero-crossings and actual movement for a walking user, and shows histograms of right knee angle values (a kinematic parameter recorded using an optoelectronic motion capture system) sampled at the time instances when zero-crossings occurred. The total duration of the recording was 1.5 min. Note the existence of a single peak in all subplots, demonstrating the consistency of our gait phase estimation algorithm. Further analysis is, however, needed to precisely define the relationship between gait phase as recorded by a mobile phone and the kinematic parameters of the gait.
We investigate whether there are certain time instances during the gait cycle at which touch events are more likely, and also whether targeting accuracy varies throughout the gait cycle. We estimate the phase angle from acceleration, using a common approach, the Hilbert transform, which gives the instantaneous phase and amplitude of a signal s(t). The Hilbert transform signal sH(t) allows the construction of the complex signal

$$z(t) = s(t) + i\,s_H(t) = A(t)\,e^{i\phi(t)},$$

where ϕ(t) is the phase at time t, and A(t) is the amplitude of the signal at time t.
Although A(t) and ϕ(t) can be computed for an arbitrary s(t), they are only physically meaningful if s(t) is a narrow-band signal. Following previous work [10, 19], we band-pass filtered the acceleration with a band-pass frequency determined by subjects’ walking pace. However, since we noticed that several subjects had multiple distinct peaks in the raw acceleration frequency spectrum (for all sensitive axes), we slightly modified this approach by using a bank of narrow band-pass filters. The outputs of the individual filters were then summed to obtain the filtered signal. This allows more intricate details of the gait to be captured without violating the narrow-band requirement for the Hilbert transform. The distributions of gait phase angle along each axis at tap events are presented in Fig. 2. The angle is defined on [−π, +π], segmented into 10 bins.
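The phase extraction described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the processing pipeline used in the study: the band-pass step is implemented here by masking FFT bins (the filter type used in the study is not specified), the function names `bandpass_fft`, `analytic_signal` and `gait_phase` are hypothetical, and the sampling rate and band edges in the usage lines are made-up values.

```python
import numpy as np

def bandpass_fft(signal, fs, low_hz, high_hz):
    """Zero-phase band-pass: keep only FFT bins inside [low_hz, high_hz]."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    mask = (freqs >= low_hz) & (freqs <= high_hz)
    return np.fft.irfft(spectrum * mask, n=len(signal))

def analytic_signal(signal):
    """Return s(t) + i*s_H(t) via the standard one-sided-spectrum construction."""
    n = len(signal)
    spectrum = np.fft.fft(signal)
    weights = np.zeros(n)
    weights[0] = 1.0                      # keep the DC term once
    if n % 2 == 0:
        weights[n // 2] = 1.0             # keep the Nyquist term once
        weights[1:n // 2] = 2.0           # double the positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

def gait_phase(accel, fs, bands):
    """Sum the outputs of a bank of band-pass filters, then extract
    instantaneous phase (in [-pi, pi]) and amplitude.
    `bands` is a list of non-overlapping (low_hz, high_hz) pairs (assumed)."""
    filtered = sum(bandpass_fft(accel, fs, lo, hi) for lo, hi in bands)
    z = analytic_signal(filtered)
    return np.angle(z), np.abs(z)

# Usage on a synthetic signal (illustrative values only).
fs = 100.0                                # assumed sampling rate in Hz
t = np.arange(1000) / fs
accel = np.cos(2 * np.pi * 2.0 * t) + 0.5 * np.cos(2 * np.pi * 10.0 * t)
phase, amplitude = gait_phase(accel, fs, bands=[(1.5, 2.5)])
```

In practice the band edges would be chosen per subject around the peaks of the acceleration spectrum, as described in the text.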
We performed our analysis on a dataset from a previous study for which acceleration data were available. The study evaluated a novel autocorrection technique combining touch and language models. Ten participants transcribed phrases from the Enron Mobile Email dataset on a custom soft keyboard, implemented on a Galaxy S3 Mini smartphone running Android 4.0.
Participants entered text while sitting, standing and walking. Error rates in the walking condition were significantly higher than in either of the static conditions. However, although accelerometer data were captured in this study, the correction technique did not make use of them. The goal of our gait phase analysis is to identify factors contributing to the observed differences in error rate when walking. Note that we discarded the data for one subject because the accelerometer data were incomplete.
In the original study, error rates were reported using Character Error Rate (CER), defined as the number of substitutions, transpositions, insertions and deletions needed to transform the transcribed text into the stimulus, divided by the length of the stimulus. Here, however, we are interested primarily in the touch dynamics, rather than the corrections made possible by the language model. Only substitution errors (e.g. ‘thr’ instead of ‘the’) can be directly corrected by a touch model. We handle other errors in the following way:
Transpositions (e.g. ‘taht’ instead of ‘that’) are not counted as errors since the correct keys were hit, just in the wrong order.
Insertions (e.g. ‘thaat’ instead of ‘that’) are not considered errors if the inserted character(s) are repeats of the previous correct character. In other cases (e.g. ‘thast’) the inserted characters are considered as additional substitution errors.
Deletions (e.g. ‘tht’ instead of ‘that’) are not considered errors since all successful touches hit the correct keys.
Some potential causes of these errors include: 1) hand-leg synchronization interfering with two-thumb typing, which could cause small timing disturbances and in turn lead to transposed letters or additional characters; 2) feedback-related issues caused by user movement, which could interfere with screen visibility and/or haptic sensations; and 3) cognitive issues. Additional testing, under more controlled conditions, is needed to positively identify the error source(s).
The end result of the above discussion is that we use an alternative definition of error rate in our analysis: we manually mark the intended key for all touches and compute error as the number of substitution errors divided by the length of the stimulus. Our reported error rates are therefore lower than those in the original study.
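As a small illustration of this definition, the sketch below computes the substitution-only error rate from manually labelled touches. The data layout (a list of intended/hit key pairs) and the function name `substitution_error_rate` are assumptions made for the example, not the format of the original dataset.

```python
def substitution_error_rate(touches, stimulus_length):
    """Number of substitution errors divided by the stimulus length.

    touches: list of (intended_key, hit_key) pairs, where intended_key
    has been marked manually for every touch (hypothetical layout).
    """
    if stimulus_length <= 0:
        raise ValueError("stimulus_length must be positive")
    substitutions = sum(1 for intended, hit in touches if intended != hit)
    return substitutions / stimulus_length

# 'thr' typed for 'the': one substitution over a stimulus of length 3.
thr = [("t", "t"), ("h", "h"), ("e", "r")]
rate_thr = substitution_error_rate(thr, stimulus_length=3)

# 'thaat' typed for 'that': the repeated 'a' hit the key that was aimed at,
# so it is not counted as a substitution.
thaat = [("t", "t"), ("h", "h"), ("a", "a"), ("a", "a"), ("t", "t")]
rate_thaat = substitution_error_rate(thaat, stimulus_length=4)
```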
Figure 3 shows the variation of baseline error rates in the three conditions. One-way repeated measures ANOVA showed no statistically significant effect of condition (p = 0.0914, F = 2.79, df = 2). Mean values were 9.24 % (±2.92 %), 6.78 % (±2.12 %) and 9.39 % (±4.59 %) for the sitting, standing and walking conditions, respectively. Note that the inclusion of all error types in the original study did lead to statistically significant differences: walking led to about 5 % more errors overall compared to sitting and standing. This is a potentially interesting result as it might suggest that it is primarily non-substitution errors that increase in frequency when walking, rather than the substitution errors studied here. This might be explained by the division of attention between the phone and the surroundings while walking, or by one of the other sources described in the previous subsection.
Figure 4 gives an impression of the accuracy over the full keyboard in the three experimental conditions, for Subject 7. The touch variability is aggregated to a standard character key in Fig. 5 across all test subjects. The walking condition appears the most diffuse, followed by sitting. Interestingly, the standing touches appear to be the most accurate. This is consistent with earlier findings. It remains unclear why this should be the case.
Gait phase analysis
Figure 2 shows the mean number of taps as a function of inferred x, y, z phase angles, averaged over all subjects. z corresponds to vertical motion, y is the direction of travel for the user, and x is lateral (left/right) movement. Looking at individual subjects’ tapping distributions against the y-axis inferred phase angle, all subjects apart from 4 and 9 have statistically significant deviations (at the α=0.001 level) from mean tapping distributions, based on a multinomial significance test, with correction for multiple testing performed using the False Discovery Rate (FDR) control approach.
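For readers unfamiliar with FDR control, the following sketch shows the Benjamini-Hochberg step-up procedure, which is the most widely used FDR method. The text does not state which FDR procedure was applied, so treating it as Benjamini-Hochberg is an assumption, and the p-values in the usage lines are invented for illustration rather than taken from the study.

```python
def benjamini_hochberg(p_values, alpha):
    """Return a list of booleans: True where the hypothesis is rejected
    under the Benjamini-Hochberg step-up procedure at level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_k = 0
    # Find the largest rank k such that p_(k) <= alpha * k / m.
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            largest_k = rank
    rejected = [False] * m
    # Reject every hypothesis ranked at or below that k.
    for idx in order[:largest_k]:
        rejected[idx] = True
    return rejected

# Hypothetical per-subject p-values (not the values from the study).
example = benjamini_hochberg([0.01, 0.04, 0.03, 0.5], alpha=0.05)
step_up = benjamini_hochberg([0.02, 0.03, 0.04], alpha=0.05)
```

The second call illustrates the step-up behaviour: the smallest p-value fails its own threshold but is still rejected because a larger-ranked p-value passes.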
The variation in key error rates by phase was not statistically significant, because of the relatively small number of errors. We made a further comparison by tightening the accuracy criterion, looking at error rates for a virtual key of the same shape as the original keys but with side lengths reduced to 50 % of the original. At this level, error counts were such that Subjects 1, 3, 6 and 7 showed statistically significant gait phase related error rate variation at α=0.05 after compensating for multiple comparisons using the FDR approach.
Inspection of the data suggested that the y axis acceleration was most reliable for inference of the gait phase angle. Figure 6 shows the raw acceleration values in all three axes over a three second window for a single subject, and the gait phase angle as extracted from the y acceleration. A more detailed depiction of the y axis acceleration and extracted gait phase angle is presented in Fig. 7. The peak in tapping density in bins 3–4 for the y axis in Fig. 2 corresponds to −112 to −36 degrees. Additional insight into the gait phase dependency of the interaction can be obtained if the number of taps and error rates are examined across the gait phase bins, as depicted in Fig. 8.
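A per-bin summary of this kind can be computed along the following lines. This is a sketch under assumptions: bins are taken to be of equal width over [−π, +π] and are indexed from zero here, which may not match the bin numbering used in the figures, and the `(phase, is_error)` data layout is hypothetical.

```python
import math

def phase_bin(phase, n_bins=10):
    """Map a phase angle in [-pi, pi] to an equal-width bin index 0..n_bins-1."""
    width = 2 * math.pi / n_bins
    index = int((phase + math.pi) // width)
    # Clamp so that phase == +pi falls into the last bin.
    return min(max(index, 0), n_bins - 1)

def per_bin_stats(taps, n_bins=10):
    """taps: iterable of (phase, is_error) pairs.
    Returns (tap counts per bin, error rate per bin or None if the bin is empty)."""
    counts = [0] * n_bins
    errors = [0] * n_bins
    for phase, is_error in taps:
        b = phase_bin(phase, n_bins)
        counts[b] += 1
        errors[b] += int(is_error)
    rates = [e / c if c else None for e, c in zip(errors, counts)]
    return counts, rates

# Illustrative taps only.
counts, rates = per_bin_stats([(-1.5, False), (-1.5, True), (0.1, False)])
```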
Adaptive offset models
Following previous work [15, 16], we trained offset models using Gaussian Process (GP) regression. A GP is determined by its mean and covariance functions. We choose a zero mean function; that is, we predict no offset in the absence of training data. As more calibration data are used, an offset function is learned.
For the covariance function, we use a standard squared exponential kernel with automatic relevance determination (ARD). The ARD and noise variance hyperparameters are set on a user specific basis by optimising the marginal likelihood on the training data. We train two GPs: one for predicting x offsets and one for y offsets.
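The model structure described above can be illustrated with a compact GP regression sketch. It is not the implementation used in the study: the hyperparameters below are fixed by hand rather than optimised via the marginal likelihood, the function names `ard_se_kernel` and `gp_predict_mean` are hypothetical, and the training data are invented.

```python
import numpy as np

def ard_se_kernel(A, B, lengthscales, signal_var):
    """Squared exponential kernel with one lengthscale per input dimension (ARD)."""
    A = np.asarray(A, dtype=float) / lengthscales
    B = np.asarray(B, dtype=float) / lengthscales
    sq_dist = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0))

def gp_predict_mean(X_train, y_train, X_test, lengthscales, signal_var, noise_var):
    """Posterior mean of a zero-mean GP at X_test."""
    K = ard_se_kernel(X_train, X_train, lengthscales, signal_var)
    K = K + noise_var * np.eye(len(X_train))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, np.asarray(y_train, dtype=float)))
    K_star = ard_se_kernel(X_test, X_train, lengthscales, signal_var)
    return K_star @ alpha

# Invented touch positions (normalised) and offsets in pixels.
X = [[0.2, 0.2], [0.8, 0.3], [0.5, 0.9]]
dx = [1.0, -2.0, 0.5]    # x offsets -> first GP
dy = [3.0, 2.5, 4.0]     # y offsets -> second GP
hyper = dict(lengthscales=[0.3, 0.3], signal_var=4.0, noise_var=1e-6)

pred_dx = gp_predict_mean(X, dx, X, **hyper)
pred_dy = gp_predict_mean(X, dy, X, **hyper)
far_away = gp_predict_mean(X, dx, [[100.0, 100.0]], **hyper)
```

Because the mean function is zero, the prediction far from any training touch falls back to no offset, which matches the behaviour described in the text.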
The results above motivate the use of acceleration information as an additional input to the GP to improve the performance of our offset models for walking users. We investigated two methods of incorporating motion-sensitivity into the basic GP. First, we used the n filtered acceleration values leading up to a touch event as additional inputs. Since we found that the y signal gave the most reliable way of extracting gait phase, we used only that signal and ignored x and z. We experimented with a range of values of n, and found that models using more than 5 samples did not perform significantly differently from those with 5 samples, and so only report on 5 sample models here. The second approach was to use the y phase angle at the time of the touch event as an additional input.
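The three input representations compared here can be written down explicitly. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the n samples "leading up to" a touch are the n samples strictly before the touch sample, which the text does not specify.

```python
def position_only(touch_xy):
    """Baseline GP input: the recorded touch position."""
    return list(touch_xy)

def with_acceleration(touch_xy, filtered_y_accel, touch_index, n=5):
    """Position plus the n filtered y-acceleration samples before the touch.
    Assumes the sample at touch_index itself is excluded."""
    if touch_index < n:
        raise ValueError("not enough acceleration samples before the touch")
    window = filtered_y_accel[touch_index - n:touch_index]
    return list(touch_xy) + list(window)

def with_phase(touch_xy, y_phase_at_touch):
    """Position plus the y-axis gait phase angle at the time of the touch."""
    return list(touch_xy) + [y_phase_at_touch]

# Illustrative values only.
accel_features = with_acceleration((10, 20), [0, 1, 2, 3, 4, 5, 6, 7], touch_index=7)
phase_features = with_phase((10, 20), 0.5)
```

Note that a raw phase angle is discontinuous at ±π; whether this matters for the kernel in practice is not addressed in the text.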
As a baseline, we compared these augmented GPs to models using only the position as input. In each case, we trained our models on a user-specific basis using 300 training examples selected at random from all available touches. We tested the predictive performance of each model on the remaining points. The number of test points varied by user between 332 and 1866 (mean = 1251), since participants typed for a fixed time rather than a fixed length of text. We averaged the performance over 10 restarts of this process, using a different random training set each time.
Figure 9 shows our results. The box plots show the distribution across users of the mean error rate over the 10 random training sets. As before, we measure error rate as the proportion of touches falling outside the visual boundary of the intended key.
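The evaluation protocol can be sketched as a repeated random hold-out. This is a schematic under assumptions: `evaluate` is a hypothetical callable that fits a model on the training touches and returns its error rate on the test touches, keys are assumed to be axis-aligned rectangles, and the seed handling is illustrative.

```python
import random

def outside_key(x, y, key_rect):
    """True if (x, y) lies outside the key's visual boundary.
    key_rect is (left, top, right, bottom) in screen coordinates (assumed)."""
    left, top, right, bottom = key_rect
    return not (left <= x <= right and top <= y <= bottom)

def error_rate(points, key_rects):
    """points: list of (x, y, intended_key). Proportion falling outside the key."""
    misses = sum(outside_key(x, y, key_rects[key]) for x, y, key in points)
    return misses / len(points)

def repeated_holdout(touches, evaluate, n_train=300, n_restarts=10, seed=0):
    """Average evaluate(train, test) over random splits with n_train training touches."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_restarts):
        shuffled = touches[:]
        rng.shuffle(shuffled)
        train, test = shuffled[:n_train], shuffled[n_train:]
        scores.append(evaluate(train, test))
    return sum(scores) / len(scores)

# Illustrative usage with a single made-up key.
key_rects = {"a": (0, 0, 10, 10)}
baseline = error_rate([(5, 5, "a"), (12, 5, "a")], key_rects)
```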
The baseline is the error between the positions recorded by the device and the targets. All three GP models produce a significantly lower error rate than this baseline (t-test, p<0.05). In addition, both motion-augmented models offer a small but statistically significant further improvement of around 1 % on average compared to the position-only model (t-test, p<0.05). The two augmented models were not significantly different from each other.
This paper studied the relationship between gait phase and typing behaviour for users walking and using a smartphone. We found that users tended to type more frequently in certain parts of the gait cycle than others, and that their accuracy varied depending on which part of the gait cycle they were in when typing. Variations in accuracy over the gait cycle were particularly pronounced for very small targets.
We demonstrated that incorporation of motion-related information into user-specific offset models can improve the accuracy of typing input. Including gait phase or acceleration data as an additional input to a Gaussian Process offset model gave a 1 % absolute improvement in error rate over a position-only model.
Our results motivate a number of possible directions for future work. Existing touch models have assumed a single usage context, or considered coarse differences between stationary and walking users but ignored finer details. Commercial models can be quite basic: for example, early versions of the Android keyboard simply moved all touches up by ten pixels. Our results motivate a more nuanced approach. Gait phase angle, or other local movement predictions, could be extracted at an operating system level and used to improve the input accuracy for all touches, not just those related to typing. Based on our results, this could have a related impact on the use of small form factor devices like smartwatches. In general, our analysis reinforces the need for developers to consider the behaviour of users in multiple contexts, and tune their input techniques accordingly. This also points to another potential research area: accurate detection of these usage contexts and smooth adjustment of corrective behaviour.
In addition, we found that after removing non-substitution errors, there were no statistically significant differences in baseline typing error rates between sitting, standing, and walking. This is in contrast to the original study, which included all errors. This potentially suggests that it is non-substitution errors (hitting keys in the wrong order, typing extra characters, or missing characters altogether) which increase when moving and typing. The exact causes of these phenomena are not known, but we speculate they could be in the domain of touch/motor interaction or even cognition. Thus, it may make sense to investigate whether language models should be given greater weight in the text correction process when users are walking, as the language models can correct these errors when touch models cannot.
Goel M, Findlater L, Wobbrock J (2012) Walktype: using accelerometer data to accommodate situational impairments in mobile touch screen text entry. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 2687–2696. ACM, New York, NY, USA.
Azenkot S, Zhai S (2012) Touch behavior with different postures on soft smartphone keyboards. In: Proceedings of the 14th International Conference on Human-computer Interaction with Mobile Devices and Services, 251–260. ACM, New York, NY, USA.
Goel M, Jansen A, Mandel T, Patel SN, Wobbrock JO (2013) Contexttype: using hand posture information to improve mobile touch screen text entry. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 2795–2798. ACM, New York, NY, USA.
Goel M, Wobbrock J, Patel S (2012) Gripsense: using built-in sensors to detect hand posture and pressure on commodity mobile phones. In: Proceedings of the 25th Annual ACM Symposium on User Interface Software and Technology, 545–554. ACM, New York, NY, USA.
Baldwin T, Chai J (2012) Towards online adaptation and personalization of key-target resizing for mobile devices. In: Proceedings of the 2012 ACM International Conference on Intelligent User Interfaces, 11–20. ACM, New York, NY, USA.
Mizobuchi S, Chignell M, Newton D (2005) Mobile text entry: relationship between walking speed and text input task difficulty. In: Proceedings of the 7th International Conference on Human Computer Interaction with Mobile Devices & Services, 122–128. ACM, New York, NY, USA.
Bergstrom-Lehtovirta J, Oulasvirta A, Brewster S (2011) The effects of walking speed on target acquisition on a touchscreen interface. In: Proceedings of the 13th International Conference on Human Computer Interaction with Mobile Devices and Services, 143–146. ACM, New York, NY, USA.
Schildbach B, Rukzio E (2010) Investigating selection and reading performance on a mobile phone while walking. In: Proceedings of the 12th International Conference on Human Computer Interaction with Mobile Devices and Services, 93–102. ACM, New York, NY, USA.
Musić J, Murray-Smith R (2015) Nomadic input on mobile devices: the influence of touch input technique and walking speed on performance and offset modeling. Hum Comput Interaction: 1–52. doi:10.1080/07370024.2015.1071195.
Siek KA, Rogers Y, Connelly KH (2005) Fat finger worries: how older and younger users physically interact with pdas. In: Human-Computer Interaction-INTERACT 2005, 267–280. Springer-Verlag, Berlin, Heidelberg.
Holz C, Baudisch P (2010) The generalized perceived input point model and how to double touch accuracy by extracting fingerprints. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 581–590. ACM, New York, NY, USA.
Henze N, Rukzio E, Boll S (2011) 100,000,000 taps: analysis and improvement of touch performance in the large. In: Proceedings of the 13th International Conference on Human Computer Interaction with Mobile Devices and Services, 133–142. ACM, New York, NY, USA.
Weir D, Rogers S, Murray-Smith R, Löchtefeld M (2012) A user-specific machine learning approach for improving touch accuracy on mobile devices. In: Proceedings of the 25th Annual ACM Symposium on User Interface Software and Technology, 465–476. ACM, New York, NY, USA.
Weir D, Pohl H, Rogers S, Vertanen K, Kristensson PO (2014) Uncertain text entry on mobile devices. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI ’14, 2307–2316. ACM, New York, NY, USA.
Bi X, Zhai S (2013) Bayesian touch: a statistical criterion of target selection with finger touch. In: Proceedings of the 26th Annual ACM Symposium on User Interface Software and Technology, 51–60. ACM, New York, NY, USA.
Murray-Smith R, Ramsay A, Garrod S, Jackson M, Musizza B (2007) Gait alignment in mobile phone conversations. In: MobileHCI ’07: Proceedings of the 9th International Conference on Human Computer Interaction with Mobile Devices and Services, 214–221. ACM, New York, NY, USA. doi:10.1145/1377999.1378009.
Vertanen K, Kristensson PO (2011) A versatile dataset for text entry evaluations based on genuine mobile emails. In: Proceedings of the 13th International Conference on Human Computer Interaction with Mobile Devices and Services, 295–298. ACM, New York, NY, USA.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Musić, J., Weir, D., Murray-Smith, R. et al. Modelling and correcting for the impact of the gait cycle on touch screen typing accuracy.
mUX J Mob User Exp 5, 1 (2016). https://doi.org/10.1186/s13678-016-0002-3