Facial Expression: An Under-Utilized Tool for the Assessment of Welfare in Mammals

Received July 16, 2016; Accepted February 1, 2017; Epub February 8, 2017; doi:10.14573/altex.1607161

Animal Welfare Act of 1966, last amended 2013²; the UK Animal Welfare Act of 2006³, and the New Zealand Animal Welfare Act, 1999, last amended 2015⁴. The issue of animal welfare is particularly pertinent for the biosciences, where there is both an ethical and legal duty to minimize the impact of experimentation on animal models through refinement (e.g., EU Directive 2010/63/EU⁵; UK Animals (Scientific Procedures) Act of 1986, consolidated 2014⁶), although such legislation does not cover all experimental animal models. For example, in the US, rats, mice, birds and farm animals used in bioscience are not covered by the US Animal Welfare Act². This duty also extends beyond

Stereotypies and self-directed behavior may develop as coping mechanisms, and therefore individuals that perform these may experience better welfare states than those in comparable environments that lack coping strategies (Mason and Latham, 2004; Mohiyeddini and Semple, 2013). Furthermore, animals that perform stereotypies are resistant to behavioral extinction and therefore the existence of stereotypies does not necessarily indicate current welfare state (Mason and Latham, 2004; Mason, 2006). As a further complication, diverse animal internal states may manifest behaviorally in similar ways. For example, a reduced behavioral reaction to repeated stimuli may indicate either desensitization or learned helplessness, with polar opposite ramifications for interpreting welfare (Overall, 2013). In summary, behavior is essential for the assessment of welfare in animals but some limitations exist in terms of accurately interpreting internal states, or indicating triggering stimuli.
One observational tool that may strengthen the assessment of welfare by complementing current behavioral or other measures is the use of facial expressions (Tab. 1). In humans, facial expressions have been extensively studied as a measure of the psychological and emotional experience (Darwin, 1872; Ekman, 1993; Hole and Bourne, 2010). Despite this, the systematic use of facial expression in animal welfare science is rare, with the notable exception of emerging research on pain indicators (e.g., Langford et al., 2010; Leach et al., 2011; Gleerup et al., 2015a). Facial expressions in mammals are widespread with many facial movements conserved across species (Darwin, 1872; Diogo, 2009; Waller and Micheletta, 2013). Facial expressions have the potential to reliably indicate psychological and emotional experiences in animals, and can provide information on temporal or stimulus-specific reactions. Facial expressions also have social and reproductive functions (e.g., Moehlman, 1998; Parr et al., 2005) and can therefore be more broadly relevant to welfare assessment than exclusively as indicators of affective state. Facial expressions can determine generalized, species-specific patterns, as well as accommodate individual variation, and reliable systems for the recording and measurement of facial expressions with high validity already exist for several taxa (e.g., Parr et al., 2010; Wathan et al., 2015). Humans have an innate observational bias to focus on the facial region, even when instructed to monitor other body areas (Leach et al., 2011), which may facilitate the use of facial expressions in welfare monitoring programs. Moreover, animals appear to have less voluntary control over facial expressions in comparison to motor behavior, although the current evidence is restricted to primate species (Jürgens, 2009; Hopkins et al., 2011). This is similar to the amount of volition over vocalizations (Jürgens, 2009). In humans, voluntary control of facial expression weakens as emotional intensity heightens, leading to "emotional leakage", suggesting that facial expressions in animals may, at least in some circumstances, be "honest" signals of welfare states, and useful as adjunct measures alongside existing indicators.
In this paper, we review the current literature on facial expression function and modulation in mammalian species, and discuss potential applications to the empirical determination of welfare. Only mammals are included in this review due to the

experimental protocols to include all aspects of the laboratory animal's life, including transportation, housing and husbandry (Rennie and Buchanan-Smith, 2006). In biomedical research, it is also critical that high welfare standards are maintained, including the minimization or prevention of pain, as data validity may be compromised when taken from animal models with impaired welfare (Würbel, 2001; Poole, 1997; Everds et al., 2013; Hall et al., 2015).
Promoting animal welfare is generally considered by society as a moral duty, with the expectation that those who use animals will protect their welfare as far as possible. For example, society is more accepting of animal use in biomedical research when it is considered humane, as outlined in a recent MORI poll, where 69% of people surveyed accepted animal research "as long as there is no unnecessary suffering to the animals and there is no alternative" (Leaman et al., 2014, p. 6). Welfare states also impact the quality of service that animals provide to humans. For example, in agricultural industries poor health and stress can reduce livestock meat quality, and in biomedical science stress may contribute to the collection of unreliable or unrealistic data from animal models (Würbel, 2001; Klumpp et al., 2006; Ferguson and Warner, 2008; Schwartzkopf-Genswein et al., 2012; Hall et al., 2015). Therefore, animal-oriented industries can also benefit directly from good animal welfare.
The assessment and improvement of animal welfare depends on reliable and valid measurement tools, which may include behavioral, physiological, clinical and psychological indicators (Mason and Mendl, 1993; Dawkins, 2004; Mormède et al., 2007; Mendl et al., 2009). No single indicator can yield a completely accurate picture of an animal's welfare state, and multiple indicators may not result in agreement (Mason and Mendl, 1993). Behavioral measures such as activity, attention and vocalizations are valuable and commonly used indicators of welfare state, as they are immediate, non-invasive and require a relatively short training period for observers (Mason and Latham, 2004; Manteuffel et al., 2004; Bethell, 2015).
Animals show individualized responses to their internal, external and social environments, including variables that are introduced to improve welfare, such as socialization, training and enrichment (Izzo et al., 2011; Coleman, 2012). Individual responses may be predicted by factors such as age, sex and life history while others may be more aligned to variables such as temperament (Izzo et al., 2011; Coleman, 2012). It follows that achieving good welfare in animals requires understanding of predictable and generalized patterns, as well as modifications to account for the experiences and needs of the individual. Traditionally, welfare assessment has focused on the adequacies of physical resources (e.g., nutrition, space); however, it is now well recognized that animal welfare is intrinsically linked to psychological wellbeing. Unfortunately, the psychological experiences of non-human animals and the behavioral manifestations of these experiences are still not well understood, making them challenging to identify. For instance, stereotypy performance, self-directed behavior, and reproductive failure may indicate poor welfare states; however, they also lack temporal or stimulus specificity and so cannot easily be attributed to a direct cause (Mason, 2006; von Borell et al., 2007; Novak and Meyer, 2009).

used to study social communication, particularly amongst primate species, leading to key insights about animal cognition (e.g., Parr et al., 1998). Conflict between facial expression as a communicative tool and as an expression of emotion (Fridlund, 1991) may contribute to its underutilization in animal welfare science, although we argue that both are useful for the interpretation of welfare states. Therefore, in this review, the relevance of facial expression to welfare assessment will be discussed in the context of communication as well as in relation to affective states, with each providing explanatory power to identify the internal animal experience. Finally, measuring methods for facial expressions will be outlined, and potential challenges of using facial expression as a welfare indicator will be discussed.

Eye region
The adjustment of eyelid aperture is a common element in emotional display, with increasing aperture and eye white visibility associated with negative emotion in both humans and other animals (Sandem et al., 2002, 2006; Lee et al., 2013). Eyelid aperture is predominantly controlled by elevation of the upper eyelid from the levator palpebrae superioris muscle, found in the facial structure of most mammals (Spencer and Porter, 2006). In humans, eyelid aperture increases in the fear, anger and surprise expressions (Williams, 2002; Waller et al., 2008a). Widening of the eyes improves the peripheral visual field resulting in greater sensory intake and more effective vigilance (Susskind and Anderson, 2008). In sheep (Ovis aries), eyelid aperture increases in aversive contexts (e.g., isolation from the social group) and negatively correlates with cardiac measures of parasympathetic nervous system activation (Reefmann et al., 2009a, 2010). Similarly, increased eyelid aperture, along with panting, is a sign of anxiety in dogs (Canis familiaris) during intravenous catheter placement, and was reduced by a sedative (acepromazine), an analgesic (oxymorphone), a placebo, and by restraint (Light et al., 1993), although pharmacological muscular relaxation may have contributed to some of these effects. Increased visibility of eye white sclera may present alongside widened eyes in fearful and/or stressful situations in humans, horses (Equus caballus), and cows (Sandem et al., 2002, 2004; Whalen et al., 2004; Sandem and Braastad, 2005; Sandem et al., 2006; von Borstel et al., 2009), and the administration of the anti-anxiety drug diazepam reduces this response in cows (Sandem et al., 2006). Exposure of the sclera is caused by movement of the eyeball within the eye socket and so may present independently of changes in eyelid aperture (Wathan et al., 2015).
Eyebrow raising through activation of the medial portion of the frontalis muscle is associated with the negative states of surprise and fear in humans (Waller et al., 2008b). Primates, horses, and dogs also have the capacity for a similar expression (Parr et al., 2010; Caeiro et al., 2012; Gleerup et al., 2015a). There is some evidence that brow raising is activated by pain states in horses (Gleerup et al., 2015a), although this action is caused by activation of the levator anguli oculi medialis muscle in this species (Wathan et al., 2015). This facial action increases the perceived size of the eye region, although it does not increase the actual aperture of the eyes. Proportionally large eyes are infantile characteristics in many mammals, and induce a care-giving response from humans (Glocker et al., 2009; Archer and Monton, 2011). In line with this, shelter dogs that display high rates of eyebrow raising are re-homed sooner than those that do so at a lower rate. This suggests that this facial movement may result in improved fitness through social recruitment.
In contrast to eye widening, mice (Mus musculus) in aggressive social situations may "tighten" their eyes by reducing eyelid aperture in combination with ear flattening, and nose and cheek swelling (Defensor et al., 2012). This constricted expression is

Affective state, welfare and facial expressions
It is increasingly accepted in the general and scientific communities that animals lead emotional lives, despite the inherent difficulties of measuring affective components in animals (Désiré et al., 2002; Mendl et al., 2010; Panksepp et al., 2011). Emotions are "unlearned response systems" that are experienced as "intense but short-living affective responses to an event" (de Waal, 2011). Emotions are considered to serve an adaptive function because they reinforce behavior that enhances fitness (Dawkins, 1990; Fredrickson, 2004; Fraser and Duncan, 1998; Panksepp, 1998). Moods are long-term responses arising from the cumulative experience of short-term emotional responses, and both moods and emotions are encompassed in the term "affective state" (Mendl et al., 2010).
Affective states are often described in terms of a valence/intensity model, with valence ranging between negative and positive, and intensity referring to the level of arousal (Désiré et al., 2002). Conscious affective states are integral to individual experience and central to understanding animal welfare (Boissy and Erhard, 2014). Within an affective state framework, adequate animal welfare can be defined as the absence of long-term or severe negative emotions or moods, in combination with the opportunity to experience positive emotions and moods (Boissy et al., 2007). In humans, conscious emotional states ("feelings") can be self-reported using language (e.g., Au et al., 1994). In animals, vocalizations may differ depending on affect (e.g., ultrasonic vocalizations in rodents: Knutson et al., 1998; Portfors, 2007); however, the reliability of these measures is in some doubt (Jourdan et al., 2001; Wallace et al., 2005). Although there are other methods that can be used with animals in order to determine preferences or needs of individuals (e.g., conditioned place preferences, Bardo and Bevins, 2000), a self-report comparable with humans is impossible. Therefore, assessment of affective states in animals is reliant on measurable proxy indicators.
Facial expressions are temporally relevant, measurable and sensitive indicators of emotional valence (Dimberg and Thunberg, 1998). This is true even in response to subliminal triggering stimuli, or when attempts are made to suppress the emotional response (Dimberg et al., 2000, 2002; Porter and ten Brinke, 2008). For these reasons, the observation of facial expressions in animals has significant potential for the assessment of internal states, and therefore welfare, of animals.

Can facial expression indicate negative affective states?
The avoidance of long-term negative affect is a defining requirement of adequate animal welfare (Boissy et al., 2007). In humans, negative emotional states have prototypical facial configurations (Ekman and Rosenberg, 2005; Waller et al., 2008a). From a social context, negative facial expressions convey adaptive advantages to both signalers and observers. They draw more attention than positive expressions and interrupt task performance in observers (Vuilleumier et al., 2001; Eastwood et

pression of a "bulging lip face" has been found in chimpanzees, and an open mouth with a direct stare is used to signal threat in rhesus macaques (Macaca mulatta) (Partan, 2002; Parr et al., 2007; Waller et al., 2008b).
"Disgust" expressions are reflexive behaviors present even in neonates. They occur in response to aversive tastes or visual or emotive stimuli, and are important for individual and group fitness (Steiner et al., 2001;Erickson and Schulkin, 2003;Chapman et al., 2009). Lip retraction as a disgust response is common to both humans and non-human primates, as are other facial responses of mouth gaping and downward tongue extension (Vrana, 1993;Steiner et al., 2001). Disgust in other species has been less frequently studied although it is known that rats (Rattus norvegicus) show facial expressions in response to taste, with the valence of the expression dependent on satiety, innate taste preferences and learned experiences (Garcia et al., 1974;Grill and Norgren, 1978;Pelchat et al., 1983;Cabanac and Lafrance, 1990). Taste aversion in rats is demonstrated by mouth opening (gaping) into a triangle shape along with forward protrusion of the head (Grill and Norgren, 1978;Cabanac and Lafrance, 1990).
Many animals (including humans) also perform mouth movements as displacement activities (behavior apparently irrelevant in the context performed that may offer insight into the internal state; Maestripieri et al., 1992). Displacement activities appear when conflicting motivations are experienced simultaneously or when an animal is frustrated in performing a motivated action (Maestripieri et al., 1992). Displacement activities may present as a wide range of actions including licking, yawning, chewing and mouth twisting (Baker and Aureli, 1997; De Marco et al., 2010; Vick and Paukner, 2010; Mohiyeddini and Semple, 2013). Displacement yawning is broadly recognized to increase with anxiety or social conflict in primates (e.g., Macaca nigra, Hadidian, 1980; M. mulatta, Graves and Wallen, 2006; Pan troglodytes, Vick and Paukner, 2010) but has also been observed in other species including non-mammals, e.g., ostriches (Struthio camelus, Sauer and Sauer, 1967), dogs (Buttner and Strasser, 2014), fish (Microspathodon chrysurus, Rasa, 1971), and horses (Fureix et al., 2011). In horses, the frequency of yawning correlates positively with the performance of stereotypic behavior (Fureix et al., 2011). Like displacement behaviors, stereotypies appear functionless in the context in which they occur, but are "repetitive behaviours induced by frustration, repeated attempts to cope, and/or central nervous system dysfunction" (Mason, 2006, p. 326). Oral stereotypies occur across many mammal species including giraffe (Giraffa camelopardalis tippelskirchi, Fernandez et al., 2008), cows (Bos taurus, Redbo, 1998), bears (Helarctos malayanus, Tan et al., 2013), walruses (Odobenus rosmarus, Bergeron et al., 2006), primates (e.g., Macaca silenus, Mallapur et al., 2005), and horses (Fureix et al., 2011), and can result in serious oral injuries (Mason et al., 2007; Mason, 2010). Oral stereotypies manifest as a variety of mouth behaviors. In the horse, for example, these may include lip snapping, crib-biting, and chewing of inedible substrates (Bergeron et al., 2006; Benhajali et al., 2010). In primates, oral stereotypies commonly present as repetitive mouth movements, lip smacking, tongue thrusting, coprophagy, or regurgitation (Lewis et al., 1990; Bour-

observed in resident mice exposed to intruding conspecifics and is assumed to protect sensitive areas of the face from attack, a hypothesis supported by differences in attack style between residents and intruders. Resident mice received more bites to their face and intruders (who do not exhibit the constricted face) received more bites to the back and flank (Defensor et al., 2012). In humans, eyelid aperture reduction is associated with anger and may signal dominance or impending threat (Waller et al., 2008a; Shariff and Tracy, 2011). Threat signaling in some species (e.g., primates and canids) incorporates a fixed stare (Fox, 1970; Partan, 2002; Oettinger et al., 2007). Facial expressions that are precursors of agonistic encounters are highly relevant to welfare assessment because poor welfare can lead to increased aggression; and conversely, social instability can lead to psychological and/or physiological stress (Broom et al., 1995; Beerda et al., 1999; Tamashiro et al., 2005; Broom, 2008). This will be further discussed in Section 6.

Nose and cheek region
In humans, several nose and cheek actions contribute to negative emotional expressions. Nose wrinkling (procerus contraction) is a component of disgust and engagement of the cheek's zygomatic minor muscle is used in sadness expressions, commonly resulting in a deepening of the nasolabial furrow (Vrana, 1993; Waller et al., 2008a). As many species are equipped with the relevant facial musculature (Diogo et al., 2009), it seems likely that contraction of muscles in the nose and cheek regions may also indicate negative affect in some other mammals, although it is infrequently mentioned in the literature. Nose and cheek swelling in mice was noted in combination with tightened eyes as a protective mechanism in aggressive encounters and a similar expression occurs when experiencing pain states (discussed in more detail in Section 4) (Langford et al., 2010; Defensor et al., 2012).

Mouth and jaw region
Many mammalian species frequently engage mouth and jaw movements in displays of affective states, in social communication, and as displacement or stereotypical behaviors; all of these are useful for determining welfare states. Fearful expressions in humans are sometimes accompanied by lip stretching, in chimpanzees (Pan troglodytes) by lip corner pulling (zygomatic major; a similar retraction of lip corners may be generated by contraction of the platysma in some species), lip parting and funneling, in horses by upper lip elongation, and in dogs by extended tongue and snout licking (Beerda et al., 1997; Williams, 2002; Casey, 2007; Parr et al., 2007; Waller et al., 2008a; Leiner and Fendt, 2011). In social communication, a fearful expression may act as an appeasement signal to mitigate conflict; however, fear experiences are also associated with increased performance of aggressive behavior, which may be characterized by or combined with other facial components (Hsu and Sun, 2010; Bloom and Friedman, 2013; Beisner and McCowan, 2014; Ley et al., 2016). Dogs, for example, may raise the lips, expose the teeth and gape the jaw to indicate a threat (Fox, 1970; Goodwin et al., 1997). Pursing of the lips by funneling, tightening and pressing are associated with anger in humans, while an analogous ex-

this underutilization is that pain, like any internal state, can be challenging to recognize in animals (Sneddon et al., 2014). This is unsurprising when it is impossible to directly measure any internal state (Bateson, 1991; Flecknell et al., 2011). However, we pragmatically assume animals experience pain, as demonstrated by the implementation of animal protection and welfare legislation, e.g., in the UK⁸. In humans, pain is routinely assessed using self-report (e.g., visual analogue scale, McGill pain questionnaire (Hawker et al., 2011)), an option not currently available for the communication of animal pain experience to caregivers. Consequently, the assessment of pain in animals is reliant on proxy pain indices, with many advances in the development and validation of such measures (see Rutherford, 2002; Weary et al., 2006; Sneddon et al., 2014).
Pain assessment indices have limitations to their efficacy in assessing animal pain, including a lack of specificity in identifying pain over other negative internal states, a requirement for expertise on species-specific behavior, innate biases of observers, and in some cases, being time consuming to develop and implement (Weary et al., 2006; Rutherford, 2002; Leach et al., 2011; Sneddon et al., 2014). For humans that are unable to verbally or diagrammatically express their pain (e.g., pre-lingual children and patients with dementia) proxy assessment measures using facial expression are routinely used (Williams, 2002). Humans have a prototypical "pain face" (Fig. 1) that changes with aging but is generally characterized by a lowered brow, raised cheeks, tightened eyelids, wrinkled nose, raised upper lip and closed eyes (Prkachin, 2009). Recent advances in this area have identified facial expressions associated with pain in several mammalian species. Grimace scales (scales comprising different expressions that are considered to be associated with pain) (Fig. 1) have been developed to identify when animals are in pain and to potentially assess its severity in mice (Langford et al., 2010), rats (Sotocinal et al., 2011), rabbits (Oryctolagus cuniculus, Keating et al., 2012), horses (Dalla Costa et al., 2014; Gleerup et al., 2015a), cows (Gleerup et al., 2015b) and sheep (McLennan et al., 2016).
The study of Langford et al. (2010) in laboratory mice was the first to systematically demonstrate that mouse facial expression changes in response to noxious stimuli that are potentially painful. This culminated in the development of the Mouse Grimace Scale (MGS), which comprises five facial configurations: orbital tightening, nose bulge, cheek bulge, ear position, and whisker change (Langford et al., 2010). An important potential feature of the MGS is that it can identify not only the presence or absence of pain but also the intensity of the pain experienced, with more extreme pain experiences correlating with more intense facial configurations. This seminal study has led to the development of similar scales for other species. The Rat Grimace Scale was developed by Sotocinal et al. (2011), with further validation by Oliver et al. (2014), and incorporates four facial configurations: orbital tightening, nose/cheek flattening, ear changes and whisker changes. The Rabbit Grimace Scale incor-

geois and Brent, 2005; Bloomsmith et al., 2007; Hill, 2009). Stereotypies are commonly used as indicators of welfare; however, they lack specificity to causal variables, resist modification once established, and act as a coping mechanism to facilitate better welfare states in challenging environments (Mason, 2006).

Ear movements
In animals with mobile ears, ear position is an important indicator for both social communication and internal states (Andrew, 1963; Parr et al., 2005; Diogo et al., 2009; Defensor et al., 2012; Wathan and McComb, 2014). As ear position is controlled by the facial muscles, movement of the ears is classified as a facial expression. In horses, backward ears are associated with fear or a non-specific negative affective state, and forward-facing ears may represent arousal or attention; however, both backward and forward ear postures have been observed during agonistic encounters, indicating a need for further study to differentiate these responses (McDonnell, 2003; Waring, 2003; Kaiser et al., 2006; von Borstel et al., 2009; Reefmann et al., 2009b; Boissy et al., 2011). A study on positive and negative reinforcement training found that horses exposed to negative reinforcement training used the ears back position more commonly than those that were positively reinforced for behavior (Briefer Freymond et al., 2014). Negative emotional experiences in sheep are expressed by ear position, with backward positioned ears performed in negative situations over which the sheep has no control (Boissy et al., 2011). In negative, but controllable contexts the ears are pointed up (hypothesized by the authors to represent anger) and in situations when the animals were exposed to unexpected stimuli the ears were up but asymmetrical (Boissy et al., 2011). In some species (e.g., chimpanzees and mice) flattened ears are associated with the performance or anticipation of aggressive behavior (Parr et al., 2005; Defensor et al., 2012). Canids (e.g., foxes (Vulpes vulpes) and domestic dogs) hold their ears in a low position during anxious or fearful emotional states (Fox, 1970; Beerda et al., 1997).

mammals and many facial movements are evolutionarily conserved across mammalian species, including humans (Diogo, 2009; Waller and Micheletta, 2013).
The consequence of this may be that facial expressions are easier for humans to identify and score due to a degree of universality/generalizability. Facial expressions provide a means for studying the affective component of pain in animals over nociception. From humans, it is known that the affective pain experience has a significant impact on welfare, and is expressed through prototypical facial configurations, and this is likely also to be true for animals (Williams, 2002). In human studies, lesioning of the rostral anterior insula (associated with the affective component of pain) can result in pain asymbolia: the disassociation of the unpleasant experience and the nociceptive response to pain (e.g., Berthier et al., 1987). In the study by Langford et al. (2010), the lesioning of the rostral anterior insula in mice eliminated performance of the "pain face", but not behavioral reactions, e.g., abdominal writhing. Although this study was conducted with a small number of animals (n = 6), the results suggest that the pain face may be representative of the affective component of pain in this species (Langford et al., 2010). Despite significant advances in identifying "pain faces" in several species of mammals, the use of facial expression scales for the assessment of pain has limitations. There is the potential for false positives (i.e., indicating pain when none is present) in animals that are asleep, sedated or anesthetized (e.g., Langford et al., 2010; Sotocinal et al., 2011; Miller et al., 2015). In mice, some of the individual facial actions in the MGS have been observed during agonistic encounters, indicating they are not pain specific (Defensor et al., 2012). In order to apply grimace scales in a clinical context we need to better understand what a normal or non-pain facial expression looks like, and there is evidence in mice that this is influenced by strain and sex (Miller and Leach, 2015b).
Therefore, facial expressions should only be used to assess pain in animals that are awake, caution

porates five facial configurations: orbital tightening, cheek flattening, nose shape, whisker position, and ear position (Keating et al., 2012). The Horse Grimace Scale incorporates six facial configurations: stiffly backward ears, orbital tightening, tension above the eye area as determined by visibility of the temporal crest bone, prominent chewing muscles, strained mouth with a prominent chin, and strained nostrils with flattening of the profile (Dalla Costa et al., 2014). Prior to the Horse Grimace Scale, several studies suggested individual features in the horse were associated with pain: lip curling and an "abnormal facial expression" in synovitis (Bussières et al., 2008); lip curling in colic (Jöchle, 1989); and nostril flaring in the respiratory disease heaves (Couroucé-Malblanc et al., 2008). Recently, the Sheep Pain Facial Expression Scale (SPFES) was developed to assess pain responses to footrot and mastitis (McLennan et al., 2016). The SPFES uses six facial changes: orbital tightening, cheek tightening, rotation of the ear, lip and jaw profile changes, and shortening and narrowing of the philtrum (McLennan et al., 2016). Lip curling has also been reported in response to castration, where it intermittently occurred in pain states but was absent in control lambs and those treated with analgesia (Molony et al., 2002). In cows, facial configurations associated with pain include a tense ear position in a backwards or low profile, a tense stare or a withdrawn appearance, furrow lines above the eyes, muscle tension on the side of the head, strained nostrils, dilated nostrils, "lines" above the nostrils, and increased tonus of the lips (Gleerup et al., 2015b).
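As an illustration of how the scales described above might be applied in practice, the following Python sketch combines per-unit observer scores into a single grimace score for one image. The facial configurations are those named in the text for the mouse, rat and rabbit scales. The 0 to 2 scoring of each configuration, the averaging across configurations, and the names GRIMACE_SCALES, VALID_SCORES and grimace_score are assumptions made for this example and are not details given in this review.

```python
# Facial configurations for each scale, as named in the text.
GRIMACE_SCALES = {
    "mouse": ("orbital tightening", "nose bulge", "cheek bulge",
              "ear position", "whisker change"),
    "rat": ("orbital tightening", "nose/cheek flattening",
            "ear changes", "whisker changes"),
    "rabbit": ("orbital tightening", "cheek flattening", "nose shape",
               "whisker position", "ear position"),
}

# Assumption (not stated in the text): each configuration is scored
# 0 = not present, 1 = moderately present, 2 = obviously present.
VALID_SCORES = {0, 1, 2}


def grimace_score(species, unit_scores):
    """Return the mean score across all configurations of one scale."""
    units = GRIMACE_SCALES[species]
    unknown = set(unit_scores) - set(units)
    missing = [u for u in units if u not in unit_scores]
    if unknown or missing:
        raise ValueError(
            f"unknown units: {sorted(unknown)}; missing units: {missing}"
        )
    for unit, score in unit_scores.items():
        if score not in VALID_SCORES:
            raise ValueError(f"{unit}: score must be 0, 1 or 2, got {score!r}")
    return sum(unit_scores.values()) / len(units)


# Hypothetical scores for a single image of one awake mouse.
example = {
    "orbital tightening": 2,
    "nose bulge": 1,
    "cheek bulge": 1,
    "ear position": 0,
    "whisker change": 1,
}
print(grimace_score("mouse", example))
```

Requiring a score for every configuration is a deliberate simplification; a working protocol would also need a rule for configurations that cannot be seen in a given image, and a score of this kind would be interpreted alongside other pain indices rather than alone.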
Facial grimace scales may have advantages over the use of other proxy measures of pain in animals. Grimaces are comprised of a few key indicators, resulting in a potentially more practical scale for implementation even in real-time application (Leung et al., 2016). Furthermore, the grimace scale indicators are concentrated in the facial area and exploit the tendency of human observers to focus on animal faces (Leach et al., 2011). Facial expressions are widespread in

Held and Špinka, 2011). Play behavior varies in its expression between species (Bekoff and Byers, 1998; Spinka et al., 2001) with many mammals using a play face: a ritualized facial expression that communicates a playful intent (e.g., canids, Fox, 1970; Rooney et al., 2001; chimpanzees, Parr and Waller, 2006; rhesus macaques, Yanagi and Berman, 2014; humans, Young and Décarie, 1977). Play faces are used with both conspecifics and heterospecifics, for example between dogs and their owners (Rooney et al., 2001), and may help others to interpret gross motor behavior as playful, because play can be rough and may resemble some aspects of aggression (Shyan et al., 2003).
In addition to specific facial configurations, generalized facial relaxation may also indicate positive affect. In humans, contentment is characterized by a relaxed facial expression (Burton and Crossley, 2003). Similarly, the play face in many primate species has been generally described as a relaxed expression with an open mouth (Andrew, 1963;Parr et al., 2005;Waller and Dunbar, 2005;Judge and Bachmann, 2013). In the horse, relaxation of the muzzle, upper eyelids and ears has been described as indicating a "well state" (Gleerup et al., 2015a).

Eye region
A reduction in eyelid aperture is associated with some negative emotions, however it is also associated with positive affect or playful situations in humans (Fig. 2), cats (Felis catus), and canids (Fox, 1970;Ekman et al., 1990;Ley, 2016). However, in humans the narrowing of the eyes seen in negative and positive situations is quantitatively different and this difference is perceivable by observers (Ekman et al., 1990;Waller et al., 2008a;Meletti et al., 2012). In some positive situations (e.g., happiness) eye narrowing can involve raising of the infraorbital area, while in others a relaxed or contented state can lead to relaxation or contraction of the eye area or the eyelids (Hietanen, 1998;Waller et al., 2008a). This is absent in the eye-narrowing should be used in their interpretation with respect to the environmental context, and they currently should be used alongside other validated indices of pain assessment (e.g., Dalla Costa et al., 2014) to ensure they are not specific to one type of pain or painful procedure and to minimize the potential for false negatives or positives in detecting pain states. Facial expressions of pain may also only indicate pain of a particular severity or duration, and be less useful, for example, in the identification of very acute or chronic pain (Langford et al., 2010;Miller and Leach, 2015a). These aspects should be incorporated into future studies on facial expressions of pain.

Can facial expression indicate positive affective states?
In the study of animal emotions and animal welfare, positive states have received less empirical attention than negative ones; however, awareness of the importance of positive experiences is increasing, as is characterization of what constitutes a positive experience for an animal (Burgdorf and Panksepp, 2006; Boissy et al., 2007; Mellor and Beausoleil, 2015).
Play behavior is generally accepted to indicate positive affect (Panksepp, 2005; Burgdorf and Panksepp, 2006; Bekoff, 2015) as it reduces in frequency when conditions are challenging, energetic availability is low, or as a consequence of poor health, deprivation or reduced parental care (Loy, 1970; Lawrence, 1987; Thornton and Waterman-Pearson, 2002; Krachun et al., 2010; Held and Špinka, 2011). Play behavior is intrinsically rewarding (Boissy et al., 2007) and has been described as an "opioid-mediated pleasurable emotional experience" (Held and Špinka, 2011, p. 891). Play has both immediate and future benefits for psychological and long-term fitness, and as a contagious behavior can promote welfare at the group level (Bekoff, 2001;  (Langner et al., 2010) in a wide range of mammals including marsupials (e.g., wombat, Lasiorhinus latifrons, Gaughwin, 1979), ungulates (e.g., horse, Weeks et al., 2002; Arabian camel, Camelus dromedarius, Fatnassi et al., 2014), primates (e.g., mandrill, Mandrillus sphinx, Charpentier et al., 2013), and felids (e.g., puma, Puma concolor, Allen et al., 2014). Recognizing sexual motivation by flehmen expression may also assist with the interpretation of other behavioral changes that occur during reproduction or courting, such as increased locomotion or aggression, which can confound interpretations of welfare (Morgan et al., 2004). It is important to note that equids may show a similar expression when in pain (Pritchett et al., 2003), highlighting the need for multi-modal tools that allow for different welfare states to be differentiated, for example using facial expressions to complement behavioral or physiological measures. Increased investigation of these signals may improve differentiation between similar expressions performed in different contexts.

Ears
Ear position may be a useful indicator of emotional valence or intensity in animals with ear mobility; for example, relaxed ears correspond with a neutral emotional state in sheep and a positive one in cows (Schmied et al., 2008; Boissy et al., 2011). In horses, front-oriented, pricked ears indicate attention or alertness, and although this is commonly considered to indicate positive emotional valence, this has not yet been empirically determined (Innes and McBride, 2008; Heleski et al., 2009; Proctor and Carder, 2014). However, in animals with mobile ears the neutral ear position can vary both between and within species, and therefore it is important that a baseline position be established for each species, and that individual differences are also taken into account (Andrew, 1963; Wathan et al., 2015). As with negative welfare states, ear position may provide important information on positive states in animals but further research is needed to classify ear position responses in detail.

Can facial expression as a social signal indicate welfare?
In many species, the opportunity for positive social interaction is a key component of maintaining good captive welfare (Mason, 1991; Olsson and Westlund, 2007). Communication between conspecifics is an important component of social stability, particularly in gregarious animal societies such as primates (Sussman et al., 2005). Group communication and social stability have health benefits for individuals within those groups (Silk et al., 2009, 2010; Nunez et al., 2014). Communication is multi-modal and may contain auditory, visual or olfactory components, dependent on context and distance between signaler and receiver (Parr et al., 2005; Burrows, 2008; da Cunha and Byrne, 2009). Signaling is important for social information transfer, and facilitates affiliation, spacing, agonistic intent, or predator avoidance (Partan, 2002; da Cunha and Byrne, 2009; Kiriazis and Slobodchikoff, 2006; Micheletta et al., 2013). Facial expressions are most important for communicating in close range interactions and may indicate signaler configuration performed in negative situations (e.g., anger), which arises from contraction of the eyelids and sometimes by lowering of the eyebrow (Waller et al., 2008a). However, to what extent this might also apply to non-human animals has yet to be examined.

Nose and cheek region
In humans, raising of the cheeks, which leads to changes in the eye area (see Section 3.1), is associated with positive emotions and can differentiate between "enjoyment" and "social" smiles (Ekman et al., 1990;Waller et al., 2008a). This facial movement has not previously been reported as an indicator of positive affect in other species.

Mouth and jaw region
In humans, happiness is expressed via relaxed facial muscles and the affiliative facial expressions of laughing and smiling, which configure as a lip corner retraction caused by action of the zygomatic major muscles (Ekman et al., 1990; Ruch, 1995; Waller et al., 2008a). Analogous expressions occur in primate species such as chimpanzees. These are also characterized by lip corner retraction, but occur either with the upper teeth covered and lower lip relaxed ("relaxed open mouth expression"), often used as a "play face", or with the lips retracted from the teeth ("bared teeth display"), although the latter is also used as an appeasement signal (van Hooff, 1972; Preuschoft and van Hooff, 1997; de Waal, 2003). The mouth and jaw region are common components in play face configuration, and have also been observed in non-primates such as in canids (e.g., C. aureus, C. lupus), equids (e.g., Equus quagga), mustelids (Mustela putorius), and domestic cats (Poole, 1978; Martin, 1984; Schilder et al., 1984; Feddersen-Petersen, 1991). Common features include an open mouth, relaxed or stretched jaw, teeth covered to varying degrees and some lip corner retraction (Darwin, 1872; Fox, 1970; Schilder et al., 1984; Rooney, 2001). Some features may resemble aggression (e.g., nose wrinkling and teeth baring in wolves, Canis lupus) but are distinguishable when combined with other signals such as posture or body tension (Feddersen-Petersen, 1991). Mouth movements are also made in response to pleasant taste stimuli such as sweet foods. In rats, this behavior is seen as licking or movement of the upper lip and tongue protrusion (Grill and Norgren, 1978; Cabanac and LaFrance, 1990). Humans and non-human primates also protrude the tongue in response to sweet foods and may also smack their lips (Steiner et al., 2001).
Sexual motivation may also be indicated by some facial expressions and is relevant to welfare assessment as reproduction can be suppressed when welfare is poor (Broom, 2008). One such facial expression is flehmen, characterized by movement around the mouth, jaw and nose, and thought to be functional in monitoring estrous cycles of females from their urine, although it may also serve other communicative functions (Stahlbaum and Houpt, 1989;Weeks et al., 2002). In the donkey (Equus asinus), for example, this has been described as "raising the head with the muzzle pointed toward the sky, the upper lip drawn back extensively and puckered, with the upper teeth and gums exposed, and nostrils wrinkled into a longitudinal and closed position" (Moehlman, 1998, p. 136). Flehmen has been observed petting, and are performed more towards familiar people than those who are unfamiliar. Therefore, facial communication can provide insight into internal states in mammals and allow for interpretations on welfare and environmental effects.
The contingency of using social signals as an indicator of welfare is dependent upon the "honesty" of the signaler (Krebs and Dawkins, 1984;Fridlund, 1991;Weary and Fraser, 1995). In some cases, a given signal may be actively deceptive in that the signaler actively attempts to mislead the observer, or passively deceptive where a genuine signal is suppressed by the presence of an observer. If expression of a signal increases an animal's vulnerability, for example, pain vocalization in a prey species, the signal may be suppressed. In this case it would be incorrect to assume that a lack of signal implies a lack of need. The hiding of pain responses is considered prevalent by vets (Fenwick et al., 2014), however pain behavior may also serve as a strategy to recruit altruistic assistance from others (Langford et al., 2006;de Waal, 2008), and concealment when assistance could be expected would be maladaptive. Signal suppression is most likely to occur in the presence of either a threat or a competitor, and has direct relevance to human-animal interactions including but not limited to observer effects and learned helplessness (Overmier and Seligman, 1967;Seligman and Maier, 1967;Weary and Fraser, 1995;Jack et al., 2008;Crofoot et al., 2010). Signals are most likely to be honest when the signaler and receiver are related, the animals have shared interests compared to competing interests, the degree or intensity of the signal varies with the need, and the production of the signal has a cost to the signaler (Weary and Fraser, 1995). However, these issues are not specific to the study of facial expression but are true of all animal signals including vocalizations and posture, and strategies that avoid behavioral alteration from observer or competitor effects may be equally applied to facial displays as to other behavior. 
In fact, evidence from human studies suggests that facial expressions are subject to "emotional leakage" when suppression is attempted, and in some cases, are more reliable indicators of internal states and motivations than body motor movements (Craig et al., 1991;Williams, 2002;Ekman and Rosenberg, 2005;ten Brinke et al., 2012). This suggests that facial expressions can be a useful and honest measure that can be applied to the identification of underlying affective states in animals.

Methods of measurement
Facial expression has been measured using both "bottom-up" and "top-down" techniques. Facial action coding systems (FACS) are a bottom-up method of identifying and recording facial expressions based on the underlying facial musculature and muscle movement (Ekman and Friesen, 1978). Rather than categorizing gestalt expressions associated with one specific context, FACS documents all the observable facial movements for a species, accommodating all potential facial configurations and making this method suitable for use across a wide range of settings. The original FACS was developed for use in humans (Ekman and Rosenberg, 2005), and this framework has intent and impending behavior to the receiver (Partan, 2002; Parr and Waller, 2006; Oettinger et al., 2007).
The relevance of facial signaling to animal welfare assessment is illustrated by the use of facial displays to replace or precede aggressive intent. Threatening facial expressions benefit both aggressor and receiver by allowing direct aggression and its potential consequences to be avoided (Judge and de Waal, 1993). Aggression, which may include facial signals, can also indicate perceived threat by the signaler, which may be directed towards within-group conspecifics, humans, or heterospecifics (e.g., neophobia) (Mitchell et al., 1992;Partan, 2002;Leonardi et al., 2010;Peiman et al., 2010). Aggressive behavior is associated with fearful or anxious affective states and stress (Galac and Knol, 1997;Boissy, 1995;Honess and Marin, 2006) that are relevant within a welfare framework. Agonistic facial expressions in reaction to ambiguous stimuli may also be useful as an indicator of cognitive bias, a measure of the animal's perceptual valence that ranges from an optimistic to a pessimistic bias (Bar-Haim et al., 2007;Bethell et al., 2012). Rates of agonistic and submissive facial expressions can indicate changes in social dynamics or escalation of aggressive interactions, which are normal in a natural context but are undesirable at elevated frequencies or intensities because of the potential for injury and distress (Kikusui and Mori, 2009;Akre et al., 2011). In golden-bellied mangabeys (Cercocebus galeritus), for example, aggressive facial displays were measured in a zoo setting (Mitchell et al., 1992). It was found that zoo visitor numbers had a significant effect on the frequency of facial displays; lower visitor numbers were associated with fewer aggressive facial displays both towards humans and conspecifics. Although the authors regarded these changes as within the parameters of normal behavior, it supports the premise that facial displays can reflect environmental conditions. 
Although the majority of studies incorporating facial expression in non-primate species use few facial features, one study in donkeys detailed expressions under numerous contexts (Moehlman, 1998), suggesting that more comprehensive accounts of situational facial configurations are achievable in other species. Donkeys use an open-mouth face as a social threat (Moehlman, 1998). In males, a protruding and downward pulled upper lip is displayed when courting a female, and occasionally in response to threats by another male (Moehlman, 1998). A jawing mouth movement (repetitive opening and closing of the jaw) is displayed by females during copulation as well as by males when mounted by other males, or when approached by a more dominant animal (Moehlman, 1998).
Appeasement and affiliative signaling can similarly indicate internal state, social conflict, and the presence or perception of threat, all of which are relevant for animal welfare. In chimpanzees, for example, affiliation is characterized by a silent bared-teeth display as a signal of benign intention or submission (Preuschoft and van Hooff, 1997;Waller and Dunbar, 2005). Fearful expressions serve a communicative role in appeasing potential or actual threat from conspecifics (Marsh et al., 2005a;Shariff and Tracy, 2011). In dogs, for example, appeasement and "stress" signals include panting, lip or nose licking, and tongue flicking (Kuhne et al., 2012). These signals increase when dogs are exposed to uncomfortable situations such as inappropriate review of both body and facial lateralization in response to emotional stimuli concluded that across the vertebrates, a generalized pattern exists for processing negative emotional contexts (e.g., fear, aggression) with the right cerebral hemisphere and positively associated experiences (e.g., food rewards) with the left (Leliveld et al., 2013). More empirical evidence is needed in a range of species to determine generalized patterns specifically in facial lateralization that have the potential to be applied to welfare contexts.

When is facial expression not a reliable indicator for welfare?
The reliability of using facial expression as a welfare indicator is reliant on several assumptions.
Firstly, the species of interest must have a facial structure that allows sufficient facial mobility to generate observable expressions (Chevalier-Skolnikoff, 1973; Cooke, 2015). The use of facial signals by mammals varies with taxon and is most developed in gregarious species with intricate social environments, a factor which is thought to have adaptively increased facial muscle structure and facial expression use (Byrne and Whiten, 1985; Burrows, 2008). It may be that, as the capacity to generate facial expression becomes more complex, it can be used for greater specificity in the detection of emotions, while in less social or less visual mammals it may only be reliable in indicating either valence (negative/positive) or intensity.
Secondly, changes in facial expression must be observable. Overt or sustained expressions may be noted by direct observation, however subtle or fleeting facial changes are captured more easily using technology. Still images from video footage have been used with success in grimace pain scales (Sotocinal et al., 2011;Leach et al., 2012;Miller et al., 2015), and advances in technology yield high quality still and video footage. Stills are taken when the face is clearly visible, and coding is then conducted on a random selection of this pool of images (Miller et al., 2015). Live coding of grimace scales has been attempted with some success (Leung et al., 2016), however in other studies the results were found to be significantly different to those obtained by still images (Miller and Leach, 2015b). Both photographs and video have been used for FACS, however this method of fine-grained measurement can be challenging and time-consuming (Ekman and Rosenberg, 2005;Vick et al., 2007;Parr et al., 2010). Video footage allows movement to be detected, which facilitates detection of facial changes. For ease and accuracy of FACS style coding, close range, high quality, high definition video is necessary. Poor filming conditions and the physical appearance of the animal or human may also affect how observable facial configurations are (Marsh et al., 2005b;Dalla Costa et al., 2014). For example, rhesus macaques have individual differences in brow size that may contribute to an open-eyed "surprised" appearance, or a lowered-brow "angry" appearance, and therefore an accurate neutral expression should be obtained prior to coding of muscle activation. 
Shadows can also be cast on the face during different head positions and this may mimic the changes in appearance resulting from muscle since been applied to a number of different nonhuman primates and domesticated species, i.e., chimpanzees, orangutans (Pongo pygmaeus, Caeiro et al., 2012), rhesus macaques (Parr et al., 2010), gibbons and siamangs (Hylobatidae, Waller et al., 2012), horses (Wathan et al., 2015), dogs and cats. This methodology allows direct comparisons using identical techniques across species with a different facial morphology (e.g., Waller et al., 2014). Frequencies and intensities of individual action units and configurations for multiple muscle actions can be analyzed. Grimace scales for pain identification use a simplified version of the FACS approach, with muscle movements defined by changes in appearance of key facial features occurring during pain states (e.g., Sotocinal et al., 2011; Dalla Costa et al., 2014). These appearance changes may be created by individual or grouped muscle actions and grimace scales often incorporate a 3-point intensity scale to better assess pain intensity.
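As an illustration of how a 3-point grimace scale might be summarized for a single image, the following minimal sketch averages the scores given to each facial feature. It assumes that each feature is scored 0 (absent), 1 (moderately present) or 2 (obviously present) and that features which cannot be seen are omitted from the mean; these conventions, the function name `grimace_score` and the feature names are illustrative assumptions and are not taken from any of the published scales cited above.

```python
from statistics import mean

# Assumed 3-point scale: 0 = absent, 1 = moderately present, 2 = obviously present.
VALID_SCORES = {0, 1, 2}


def grimace_score(action_units):
    """Return the mean score across the facial features of one image.

    `action_units` maps a (hypothetical) feature name to its 0-2 score.
    Features that could not be seen are given as None and are left out
    of the mean.
    """
    scored = {name: s for name, s in action_units.items() if s is not None}
    if not scored:
        raise ValueError("no facial feature could be scored")
    invalid = {name: s for name, s in scored.items() if s not in VALID_SCORES}
    if invalid:
        raise ValueError(f"scores outside the 3-point scale: {invalid}")
    return mean(scored.values())


# Hypothetical scores for one still image.
example = {
    "orbital_tightening": 2,
    "cheek_tightening": 1,
    "ear_position": 1,
    "nostril_strain": None,  # not visible in this still
}
print(grimace_score(example))
```

Averaging keeps the summary on the same 0 to 2 range regardless of how many features a given scale uses, but a study could equally report a sum or analyze each feature separately.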
In contrast to FACS, facial expressions used in social communication research are categorized according to multiple simultaneous muscle movements that have commonly accepted configurations such as "fear grimace" and "relaxed open mouth display" (Parr et al., 2005;Waller and Dunbar, 2005;Parr and Waller, 2006;De Marco et al., 2008). This is a "top-down" system of coding, with expressions then counted or timed for analysis. This protocol is useful for characterizing social communication in well-studied species such as primates, however, pre-determined labels risk becoming misnomers when applied to studies of emotion or welfare, and may thus incorrectly guide interpretation in a welfare context. For example, the "fear grimace" in primates may not necessarily reflect an internal fearful state but has other communicative functions such as submission, appeasement or affiliation (de Waal, 2003;Waller and Dunbar, 2005;Beisner and McCowan, 2014).
An alternative method of assessing emotion or welfare by facial expression is by measuring laterality in expression production (Fernández-Carriba et al., 2002;Wallez and Vauclair, 2012). The phenomenon of laterality, or asymmetry, in motor activity, auditory processing, and visual attention is widespread across vertebrates and is caused by an imbalanced contribution of the cerebral hemispheres to cognitive processing (Rogers, 2014). Presence or strength of lateralization is affected by variables such as species and individual differences, however it has also been proposed as a useful welfare indicator by Rogers (2010) because stressed animals can become more active in their right hemisphere, correlating ipsilaterally to greater dominance on the left side of the body. An alternate hypothesis is that strength in laterality is less affected by emotional valence and more by level of arousal or emotional intensity. In humans, for example, the production of emotional facial expressions is stronger on the left side of the face (Sackeim et al., 1978) and dogs show more left facial activation when reunited with their owners than when reacting to strangers (Nagasawa et al., 2013). Asymmetrical ear position may indicate pain in horses (Gleerup et al., 2015a) and a startle response in sheep (Boissy et al., 2011). Rhesus macaques exhibit some asymmetry in the production of facial expression and vocalizations although this is thought to be unrelated to emotional valence (Hauser and Akre, 2001). A recent their production and are also accompanied by a higher blink rate (Porter and ten Brinke, 2008;ten Brinke et al., 2012). In practice, a combination of facial expression and somatic movement is likely to provide the most reliable indicator of internal states. However, further research into signal honesty and audience effects on production is required to assess the potential impact of these factors on the reliability and validity of facial expressions as a welfare measure.
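The laterality measures discussed in this section are often summarized with a simple index. The sketch below computes one commonly used form, (L - R) / (L + R), from counts of left-biased and right-biased expressions. The formula, the sign convention (positive values meaning a left-side bias) and the name `laterality_index` are assumptions made for illustration and are not specified in the studies cited above.

```python
def laterality_index(left_count, right_count):
    """Return (L - R) / (L + R) for counts of left- and right-biased expressions.

    The result ranges from -1 (all right-biased) to +1 (all left-biased);
    0 indicates no side bias. The sign convention is an assumption.
    """
    if left_count < 0 or right_count < 0:
        raise ValueError("counts must be non-negative")
    total = left_count + right_count
    if total == 0:
        raise ValueError("at least one lateralized expression is required")
    return (left_count - right_count) / total


# Hypothetical counts for one animal: 14 left-biased and 6 right-biased expressions.
print(laterality_index(14, 6))
```

An index of this kind captures the direction and strength of asymmetry together; its absolute value could be used where only the strength of lateralization is of interest.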

Summary and conclusions
Identification of the internal state of animals has inherent challenges that impair our ability to measure welfare states, and restrict opportunities for experimental refinement when animal models are utilized. Although facial expressions are infrequently used as a measure of welfare in animals, evidence suggests that such expressions, in mammals at least, may provide important insights into internal states. Facial expressions can potentially indicate psychological and emotional experiences in animals, as well as temporal and stimulus-specific reactions. Robust, objective systems for the recording and measurement of facial expressions already exist for several species, and may take advantage of the innate human observational bias to focus on the facial area. Furthermore, evidence from primates suggests that facial expression may be a more honest signal of internal state than general behavior. While facial displays cannot replace other behavioral or physiological indicators of welfare, emotion or health, they are a largely untapped resource with considerable potential to enhance our understanding of affective states and experiences in animals and subsequently to underpin improvements in applied animal welfare.

References
Akre, A. K., Bakken, M., Hovland, A. L. et al. (2011)

Thirdly, different affective states must be sufficiently differentiated in the face, or contribute significantly to the interpretation of gross level behavior. In the development of the MGS it was observed that sleeping and sick mice show similarities in some of the grimace muscle actions (Langford et al., 2010). Similarly, Defensor et al. (2012) described a similar facial expression in mice that were exposed to intruders in their territory. In horses, an upper lip curl can be due to both flehmen (Stahlbaum and Houpt, 1989) and abdominal pain (Pritchett et al., 2003), and in nonhuman primates yawning indicates both threat and displacement behavior (Andrew, 1963; Vick and Paukner, 2010). These examples suggest that facial expressions of similar appearance may derive from multiple etiologies. However, these may be differentiated by closer examination of facial changes, or by combining this information with other behavior, vocalizations and context. Displacement activities are often fragmented, incomplete versions of the "source" behavior (Russell and Russell, 1985; Maestripieri et al., 1992) and this may assist in distinguishing between similar behaviors with different functions. For example, Vick and Paukner (2010) demonstrated that displacement yawning in chimpanzees could be differentiated from other yawning types by facial configuration, intensity, and by the succeeding behavior. Alternatively, in some circumstances it may only be possible to identify valence, without further specification.
An additional methodological consideration is that the production of vocalization results in facial muscle actions. Facial and vocal communications are motivationally linked, and are combined for multi-modal social expression (Andrew, 1963;Chevalier-Skolnikoff, 1973;Lehner, 1978;Partan, 2002;Micheletta et al., 2013). Both may be important measures of welfare and facial expressions created in the production of sound should be differentiated rather than disregarded. In rhesus macaques, the mouth creates fixed movements when producing vocalizations, while non-vocal mouth expressions are more flexible in movement and shape (Partan, 2002), and again subtleties or multimodal information may assist in differentiating affect or motivation (e.g., Slocombe et al., 2011).
Finally, interpreting facial expression or behavior as a signal of welfare state relies on the honesty of that signal in reflecting the internal condition. Animals may suppress honest signals when it is advantageous, for example when they are vulnerable to attack or to protect available resources (Weary and Fraser, 1995). The animal's affective state may also be influenced by external circumstances. For example, environmental and social conditions modulate pain experiences in rodents (Rivat et al., 2007; Sorge et al., 2014). In humans, facial expressions can be voluntarily generated or suppressed, which can result in observer deception (Bartlett et al., 2014). However, voluntary and genuine facial expressions (e.g., smiling) in humans have subtle defining differences, and suppression of expression, for example hiding of the pain face, is often incomplete (Craig et al., 1991; Ekman, 1992; Erickson and Schulkin, 2003). In humans, falsified facial expressions tend to be more inconsistent in