How useful are psychometric scales when measuring personality constructs?

11 Mar

Psychometric measures of personality constructs have long been held in high regard by psychologists and by organisations, which may rely on well-established or modified versions of personality tests to gain insight into individuals. That insight can then inform decisions such as who to employ. Similarly, within health psychology, personality test results can be used as predictors of health and wellbeing (Maltby, 2010).

Whilst personality tests have shown some value in describing a person’s overall persona, it is debatable whether personality is a stable, measurable feature of individuals at all (Little, 2016). Can we therefore rely on a test to determine personality when personality itself is malleable, shaped by socialisation and experience?

Personality tests commonly use self-report techniques, and issues such as between-person variance raise questions about the reliability of self-report measures. Dang et al. (2020) argue that self-report and behavioural measures of the same construct, for example Extraversion, are only weakly correlated, owing to the poor reliability of many behavioural measures and the distinct response processes involved.

Response bias and the mood of the participant at the time of the test can both alter the outcome. Not only that, people readily identify with positive descriptions and agree with appealing items more than with negative statements, so what we refer to as the ‘Barnum effect’ can bias the overall results (Dickson & Kelly, 1985).

There are also concerns about the reliability and validity of psychometric measures, including construct validity (whether a test measures what it is claimed to measure). When I completed the Eysenck Personality Questionnaire (EPQ), which was created to measure three basic personality dimensions, I found that some of the P scale items were ambiguous. Additionally, I was not confident in answering some items honestly for fear of what the result could be.

This might lead us to question whether personality tests require an individual to have an element of self-awareness and a readiness to answer honestly, as some questions might unearth feelings of guilt or shame. In Jung’s model of the psyche (1947), these “darker elements” of the personality (which some of us may try to repress) are referred to as the “shadow-self”: an area of personality that people try not to acknowledge, either because they dislike the traits or because the traits are a personality blind spot. They would therefore prefer to suppress these traits rather than openly discuss them.

Returning to the earlier point about ambiguous EPQ items, consider ‘were you ever greedy by helping yourself to more than your share of anything?’ I was not clear whether this question was asking if I was greedy enough to help myself to an extra slice of chocolate cake this evening instead of sharing it with my son, or whether it referred to something that could be viewed as immoral, e.g. not distributing waiting staff tips between colleagues.

Whilst Eysenck’s trait theory has been advantageous to psychology, there are flaws with the EPQ. The most notable flaw derives from the hierarchical factor analytic studies on which it rests, which can suffer from a lack of historical perspective, a lack of scientific sophistication and a limited understanding of the particular problem which the factor analyst is trying to solve (Eysenck, 1953).

The first dimension of the Eysenck Personality Questionnaire (EPQ) concerns sociability, with Extraversion at one end of the continuum and Introversion at the other. According to Eysenck, extraverts are excitable, impulsive and sociable and seek external fulfilment, whilst introverts tend to be quiet, introspective and focused on their inner reality.

The second dimension measures Neuroticism/Stability. According to Eysenck’s theory, which is based on activation thresholds in the sympathetic nervous system or visceral brain, high Neuroticism is characterised by depression, feelings of guilt, moodiness, shyness and irrationality. However, Eysenck recognised that these two dimensions did not capture every aspect of personality. This led him to add a third dimension, psychoticism (the P dimension), high scores on which are supposedly common amongst the prison population (Maltby, 2010).

In 1985 a revised version of the EPQ, the EPQ-R, was described. It has 100 yes/no questions in its full version and 48 yes/no questions in its short scale version. A long-standing psychometric principle holds that greater length enhances reliability (Lord & Novick, 1968), so one may assume that the long EPQ-R is the more reliable version. However, long tests have disadvantages, for example participant fatigue and difficulties with clinical application due to their length.
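One common way of expressing the link between test length and reliability in classical test theory is the Spearman-Brown prophecy formula. The sketch below is only an illustration: the starting reliability of 0.70 is a made-up figure rather than a published EPQ-R value, the formula assumes that any added items are equivalent in quality to the existing ones, and treating the whole questionnaire as a single scale is a simplification, since the EPQ-R is scored as separate scales.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predict reliability after changing test length by `length_factor`.

    Spearman-Brown prophecy formula:
        r_new = (k * r) / (1 + (k - 1) * r)
    where r is the current reliability and k is the ratio of new
    length to old length. Assumes the added items are parallel to
    (i.e. as good as) the existing items.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1.0 + (length_factor - 1.0) * reliability)


# Item counts taken from the text above.
short_items = 48
long_items = 100

# Hypothetical reliability for the short form (an assumption for illustration).
assumed_short_reliability = 0.70

predicted = spearman_brown(assumed_short_reliability, long_items / short_items)
print(round(predicted, 2))  # 0.83
```

Under these assumptions, roughly doubling the number of items lifts the predicted reliability from 0.70 to about 0.83. The formula says nothing about fatigue or distraction, which is exactly where a longer test can lose what it gains on paper.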

Reflecting on my personal experience of the questionnaire, I completed both the short and the long version. Whilst completing the long version, when I reached question 39, I became distracted by my home environment and began to overthink each question before moving on to the next one.

There were some questions that I did not answer honestly, and I knew at the time of answering that this was the case, so I do believe I was trying to be socially desirable in my responses. For example, one of the items asked, “Do you occasionally have thoughts and ideas that you would not like other people to know about?”

My immediate thought was yes, I have thoughts I wouldn’t want others to know about: for example, what I think of my mother’s twenty-year inability to heal following her divorce from my father, or what I may think about a handsome stranger walking by. Whilst such thoughts may be common, the question made me instinctively worry about the stigma that might be attached if I selected “yes”.

As a result, I scored 6 out of 9 on the “lie” scale, which measures how socially desirable I was trying to be in my answers. Those who score 5 or more on this scale are assumed to be trying to make themselves look good and therefore to be not totally honest in their responses.

Ferrando (2008) reports that the EPQ’s E items were least affected by social desirability (SD), though the direction of the impact depended on the type of item. As expected, for the N and P scales the relations obtained were consistently negative, but the strength of the SD impact also depended considerably on the type of item. The P scale was the most problematic in terms of convergent and discriminant validity, suggesting that the P scale is largely independent of the other factors in the EPQ, which can be seen as a weakness of the psychometric test overall.

In addition to these findings, Paulhus (2011) found that Lie items are typically statements about attitudes and practices that are socially undesirable but common, such as minor dishonesties, bad thoughts and weaknesses of character. Reflecting on my earlier concern about the item on thoughts I wouldn’t want others to know about, I am aware that it is common for humans to have private thoughts we would prefer to keep to ourselves. Yet, knowing it is not always socially desirable to share personal musings, I hesitated over my response for fear of returning a result I wouldn’t like.

As in the case of N, negative correlations are generally obtained between P scale scores and SD measures, which Helmes (1980) proposed is to be expected since the behaviours described in the P items are socially undesirable. One such item is ‘would you feel sorry for an animal caught in a trap?’

Furnham and Henderson’s (2001) research found response bias in completing the EPQ. Subjects were asked either to give a good impression, to give a bad impression, to give an impression of mental instability or to respond honestly. Subjects who faked good had significantly higher Extraversion, Lie and Social Desirability scores and the lowest Neuroticism, Psychoticism and Social Anxiety scores. Subjects who faked bad had significantly lower Extraversion and higher Psychoticism and Social Anxiety scores.

These findings lead us to question the overall accuracy of the test and how valid it may be as a true indicator of personality. Furnham and Henderson’s (2001) research suggests it is possible to ‘fake’ a result that leads to a desirable outcome, particularly if there are rewards to be gained that depend upon the outcome of the test, e.g. employment opportunities.

To expand on this point, when I completed the shorter version of the EPQ my result showed a Phlegmatic personality characteristic, with the scores plotted nearer to the outside of the circle. However, I had responded in a socially desirable manner. When I completed the longer version of the EPQ (which I attempted to answer honestly), my result showed mostly Melancholic characteristics, with scores plotted nearer to the centre of the graph.

The variation in my results backs up Furnham and Henderson’s (2001) research, in that I could tell which responses were likely to generate the more ‘desirable’ result and responded accordingly.

Kline (1986) proposes that a well-designed psychometric test avoids items that mean different things to different respondents, as these can lead to indeterminate results. For example, to the P scale item ‘should people always respect the law’ I answered no, yet to another P scale item, ‘is it better to follow society’s rules rather than to go your own way’, I answered yes.

The responses seem contradictory, as the items are similar. Additionally, ‘should people always respect the law’ made me think of Jim Crow segregation and of Rosa Parks, whose breaking of the ‘law’ of the time sparked the Montgomery Bus Boycott (Schudson, 2012). Would change have come about if the laws of the time had been respected?

On the other hand, I do believe it is healthy to follow society’s ‘rules’ (where they are morally acceptable) to prevent the collapse of civilisation.

Because I could only select yes or no, without any varying degrees of response, I do not feel the personality test provided a true account of who I am, which is a further criticism of the ‘yes/no’ format. Furthermore, who the responder is will determine their interpretation of an item, for example through nature and nurture influences on the socialisation process across the lifespan.

Costa et al. (2019) suggest that the normative life experiences and events that many of us go through, such as education and career choice, shape our personality. Likewise, the socio-economic background of the individual can influence personality across the lifespan.

Might I have responded differently to the items on respecting the law and following society’s rules had I not been aware of the civil rights changes that came about through law breaking? And how might my answers compare with those of someone who is oblivious to the injustices overcome by challenging the policies of an era?

Furthermore, had I completed the EPQ during my adolescent years, I believe I would have responded differently, because I had not yet acquired the pragmatic and, to some extent, critical thinking skills I have in adulthood. Though Eysenck’s personality theory is largely underpinned by genetic influences, personality can change across the lifespan irrespective of genetic heritability.

Costa et al. (2019) consider that people stay within similar boundaries of personality scores because of genetic influences. Twin studies provide evidence for this, such as the Minnesota Study of Twins Reared Apart, which looked at monozygotic and dizygotic twins separated early in life. Bouchard and McGue (1981) reported a correlation of r = 0.40 for neuroticism between monozygotic twins who were reared apart.

Nevertheless, environmental factors, including non-shared ones such as peer friendships, can cause personality changes, as pointed out by Baker and Daniels (1990). This suggests that genetics is not wholly responsible for personality, as Eysenck proposed. Similarly, my own informal ‘test-retest’ of the EPQ produced varying results. Perhaps the shorter version did not allow enough insight into my behaviour to provide an accurate outcome, or is it possible that personality is constantly changing and adapting to stimuli in the environment?

Srivastava et al. (2003) examined the biological view of Five-factor theory, which, like the EPQ, has roots in the factor analytic tradition. This view includes the plaster hypothesis, under which all personality traits stop changing by age 30. In contrast, contextualist perspectives propose that changes should be more varied and persist throughout adulthood.

They found that Conscientiousness changed substantially throughout early and middle adulthood, with the strongest effects in earlier adulthood, which is commonly when adults enter careers and form committed partnerships. Agreeableness showed the largest changes somewhat later, when adults are typically caring for children.

From this research we might infer that, because personality can change across the lifespan and in response to life experiences, psychometric tests are useful only to an extent. Psychometricians could find it beneficial to consider the situational factors that affect personality and to design questions that take these changes into account, to enhance the reliability of the tests.

In conclusion, Eysenck’s research is supported by cross-cultural and longitudinal evidence that the personality dimensions are stable and consistent amongst adults, children and twins across varying cultures, providing confirmatory evidence of a genetic basis for primary personality types. Nevertheless, Eysenck failed to theorise on how the environment might interrupt personality development, which is reflected in the lack of items that address environmental influence.

References

Alexopoulos, D. S., & Kalaitzidis, I. (2004). Psychometric properties of Eysenck Personality Questionnaire-Revised (EPQ-R) Short Scale in Greece. Personality and Individual Differences, 37(6), 1205–1220. https://doi.org/10.1016/j.paid.2003.12.005

Aluja, A., García, Ó., & García, L. F. (2003). A psychometric analysis of the revised Eysenck Personality Questionnaire short scale. Personality and Individual Differences, 35(2), 449–460. https://doi.org/10.1016/s0191-8869(02)00206-4

Dang, J., King, K. M., & Inzlicht, M. (2020). Why Are Self-Report and Behavioural Measures Weakly Correlated? Trends in Cognitive Sciences, 24(4), 267–269. https://doi.org/10.1016/j.tics.2020.01.007

Dickson, D. H., & Kelly, I. W. (1985). The “Barnum Effect” in Personality Assessment: A Review of the Literature. Psychological Reports, 57(2), 367–382. https://doi.org/10.2466/pr0.1985.57.2.367

Eysenck, H. J. (1992). The definition and measurement of psychoticism. Personality and Individual Differences, 13(7), 757–785. https://doi.org/10.1016/0191-8869(92)90050-y

Ferrando, P. J. (2008). The impact of social desirability bias on the EPQ-R item scores: An item response theory analysis. Personality and Individual Differences, 44(8), 1784–1794. https://doi.org/10.1016/j.paid.2008.02.005

Helmes, E. (1980). A Psychometric Investigation of the Eysenck Personality Questionnaire. Applied Psychological Measurement, 4(1), 43–55. https://doi.org/10.1177/014662168000400106

Journal Psyche | Exploring the nature of consciousness. (2014, January 1). Journalpsyche.org. http://journalpsyche.org/

‌Little, B. (2016). Who are you, really? The puzzle of personality. In TED. https://www.ted.com/talks/brian_little_who_are_you_really_the_puzzle_of_personality?language=en

‌Maltby, J., Day, L., & Macaskill, A. (2013). Personality, individual differences, and intelligence. Pearson.

‌Schudson, M. (2012). Telling Stories about Rosa Parks. Contexts, 11(3), 22–27. https://doi.org/10.1177/1536504212456177

Srivastava, S., John, O. P., Gosling, S. D., & Potter, J. (2003). Development of personality in early and middle adulthood: Set like plaster or persistent change? Journal of Personality and Social Psychology, 84(5), 1041–1053. https://doi.org/10.1037/0022-3514.84.5.1041

Jodie Jasmin

JodiePsychology.org