A rating scale is a set of categories designed to elicit information about a quantitative or a qualitative attribute. In the social sciences, common examples are the Likert scale and 1-10 rating scales in which a person selects the number which is considered to reflect the perceived quality of a product.

Background

A rating scale is an instrument that requires the rater to assign the rated object to one of a set of categories that have numerals assigned to them.

Types of Rating Scales

All rating scales can be classified into one of the following four levels of measurement:

  1. Some data are measured at the nominal level. That is, any numbers used are mere labels: they express no mathematical properties. Examples are SKU inventory codes and UPC bar codes.
  2. Some data are measured at the ordinal level. Numbers indicate the relative position of items, but not the magnitude of difference. One example is a Likert scale:
Statement: I could not live without my computer.
Response options:
  • 1. Strongly Disagree
  • 2. Disagree
  • 3. Agree
  • 4. Strongly Agree
  3. Some data are measured at the interval level. Numbers indicate the magnitude of difference between items, but there is no absolute zero point. Examples are attitude scales and opinion scales.
  4. Some data are measured at the ratio level. Numbers indicate magnitude of difference and there is a fixed zero point. Ratios can be calculated. Examples include: age, income, price, costs, sales revenue, sales volume, and market share.
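
The practical consequence of the four levels is the set of summary statistics each one supports. The following is a minimal Python sketch of that idea, not part of the original text: the PERMISSIBLE table follows the conventional reading of the levels, and the function name and the response data are invented for illustration. It summarizes answers to the four-point Likert item above using only order-based statistics.

```python
from collections import Counter
from statistics import median_low, mode

# Labels of the four-point Likert item shown above.
LIKERT_LABELS = {1: "Strongly Disagree", 2: "Disagree", 3: "Agree", 4: "Strongly Agree"}

# Statistics conventionally treated as meaningful at each level.
# Each level inherits everything allowed at the level below it.
_NOMINAL = ["frequency counts", "mode"]
_ORDINAL = _NOMINAL + ["median", "percentiles"]
_INTERVAL = _ORDINAL + ["mean", "standard deviation"]
_RATIO = _INTERVAL + ["ratios", "coefficient of variation"]
PERMISSIBLE = {
    "nominal": _NOMINAL,
    "ordinal": _ORDINAL,
    "interval": _INTERVAL,
    "ratio": _RATIO,
}


def summarize_ordinal(responses):
    """Summarize ordinal response codes without computing a mean."""
    counts = Counter(responses)
    return {
        "counts": {LIKERT_LABELS[code]: counts.get(code, 0) for code in sorted(LIKERT_LABELS)},
        # median_low always returns an actual category, never a value between two.
        "median": LIKERT_LABELS[median_low(responses)],
        "mode": LIKERT_LABELS[mode(responses)],
    }


# Invented responses, for illustration only.
responses = [1, 2, 2, 3, 3, 3, 3, 4, 4]
summary = summarize_ordinal(responses)
print(summary)
```

The summary reports a median and a mode as category labels rather than as numbers, which keeps the result at the ordinal level.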

More than one rating scale is required to measure an attitude or perception, because the polytomous Rasch model for ordered categories (Andrich, 1978) requires statistical comparisons between the categories. In terms of classical test theory, more than one question is required to obtain an index of internal reliability such as Cronbach's alpha (Cronbach, 1951), which is a basic criterion for assessing the effectiveness of a rating scale and, more generally, of a psychometric instrument.
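
To make the dependence on several questions concrete, here is a small Python sketch of Cronbach's alpha using the standard formula alpha = k/(k-1) * (1 - (sum of item variances) / (variance of total scores)), where k is the number of items. The function name and the scores are invented for illustration, and sample variances are assumed throughout.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Return Cronbach's alpha.

    scores: one list per respondent, each holding that respondent's
    score on every one of the k items.
    """
    k = len(scores[0])
    if k < 2:
        # With a single item the k / (k - 1) factor is undefined.
        raise ValueError("Cronbach's alpha needs at least two items")
    items = list(zip(*scores))  # transpose: one tuple of scores per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Invented data: five respondents answering three items.
scores = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [3, 3, 4],
    [5, 4, 5],
]
alpha = cronbach_alpha(scores)
print(round(alpha, 3))
```

For these invented scores the item variances sum to 4.9 and the variance of the totals is 13.5, giving an alpha of about 0.956. With only one item per respondent the function raises an error, which mirrors the point that a single question yields no index of internal reliability.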

Rating scales used online

Rating scales are used widely online in an attempt to provide indications of consumer opinions of products. Examples of sites which employ rating scales are IMDb, Epinions.com, Internet Book List, Yahoo! Movies, Amazon.com, BoardGameGeek, TV.com and Ratings.net. The Criticker website uses a rating scale from 0 to 100 in order to obtain "personalised film recommendations".

In almost all cases, online rating scales only allow one rating per user per product, though there are exceptions such as Ratings.net, which allows users to rate products in relation to several qualities. Most online rating facilities also provide few or no qualitative descriptions of the rating categories, although again there are exceptions, such as Yahoo! Movies, which labels each of the categories between F and A+, and BoardGameGeek, which provides explicit descriptions of each category from 1 to 10. Often, only the top and bottom categories are described, such as on IMDb's online rating facility.

With each user rating a product only once, for example in a category from 1 to 10, there is no means for evaluating internal reliability using an index such as Cronbach's alpha. It is therefore impossible to evaluate the validity of the ratings as measures of viewer perceptions. Establishing validity would require establishing both reliability and accuracy (i.e. that the ratings represent what they are supposed to represent).

Another fundamental issue is that online ratings usually involve convenience sampling much like television polls, i.e., they represent only the conglomeration of those inclined to submit ratings.

Sampling is one factor which can lead to results which have a specific bias or are only relevant to a specific subgroup. To illustrate the importance of such factors, consider an example. Suppose that a film's marketing strategy and reputation are such that 90% of its audience are attracted to the particular kind of film; i.e. it does not appeal to a broad audience. Suppose also that the film is very popular among the audience that does see the film and, in addition, that those who feel most strongly about the film are inclined to rate the film online. This combination may lead to very high ratings of the film which do not generalize beyond the people who actually see the film (or possibly even beyond those who actually rate it).
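
The film example can be worked through with numbers. The Python sketch below is illustrative only: the three audience segments, their sizes, their average ratings and the fraction of each segment that submits a rating are all invented, chosen so that 90% of viewers are fans of the genre and the most enthusiastic viewers are the most likely to rate.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    viewers: int        # people in this segment who saw the film
    mean_rating: float  # average rating the segment would give (1 to 10)
    submit_rate: float  # fraction of the segment that rates the film online


def mean_of_submitted(segments):
    """Average of the ratings that actually reach the website."""
    submitted = sum(s.viewers * s.submit_rate for s in segments)
    weighted = sum(s.viewers * s.submit_rate * s.mean_rating for s in segments)
    return weighted / submitted


def mean_of_all_viewers(segments):
    """Average rating if every viewer had submitted one."""
    viewers = sum(s.viewers for s in segments)
    return sum(s.viewers * s.mean_rating for s in segments) / viewers


# Hypothetical audience of 1000 viewers, 900 of them fans of the genre.
segments = [
    Segment("enthusiastic fans", 300, 9.5, 0.30),
    Segment("satisfied fans", 600, 8.0, 0.05),
    Segment("other viewers", 100, 5.0, 0.02),
]

online = mean_of_submitted(segments)
audience = mean_of_all_viewers(segments)
print(f"online average: {online:.2f}, audience average: {audience:.2f}")
```

Under these assumptions the online average is noticeably higher than the average across everyone who saw the film, and neither figure says anything about people outside the film's self-selected audience.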

Qualitative description of categories is an important feature of a rating scale. For example, if only the points 1-10 are given without description, some people may select 10 rarely whereas others may select the category often. If, instead, "10" is described as "near flawless", the category is more likely to mean the same thing to different people. This applies to all categories, not just the extreme points.

These issues are also compounded when aggregated statistics such as averages are used for lists and rankings of products. User ratings are at best ordinal categorizations. While it is not uncommon to calculate averages or means for such data, doing so cannot be justified, because calculating an average assumes that equal intervals between categories represent equal differences in perceived quality. The key issues with aggregate data based on the kinds of rating scales commonly used online are therefore those outlined above: averages are calculated for data that are at best ordinal, the reliability and validity of the ratings usually cannot be evaluated, the categories are rarely described, and only users inclined to submit a rating do so.
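
The objection to averaging ordinal ratings can be shown with a short Python sketch. The two products, their ratings and the alternative coding are invented for illustration. Both codings preserve the order of the five categories, so both are equally defensible for ordinal data, yet they rank the products differently by mean while the ranking by median stays the same.

```python
from statistics import mean, median_low

# Invented ratings on a five-category scale.
product_a = [3, 3, 3, 3]
product_b = [2, 2, 2, 5]

# Two order-preserving ways to attach numbers to the same five categories.
equal_spacing = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5}
stretched_top = {1: 1, 2: 2, 3: 3, 4: 4, 5: 10}


def winner(stat, coding):
    """Return which product ranks higher under the given statistic and coding."""
    a = stat([coding[r] for r in product_a])
    b = stat([coding[r] for r in product_b])
    if a > b:
        return "A"
    if b > a:
        return "B"
    return "tie"


by_mean = [winner(mean, equal_spacing), winner(mean, stretched_top)]
by_median = [winner(median_low, equal_spacing), winner(median_low, stretched_top)]
print("ranking by mean:  ", by_mean)
print("ranking by median:", by_median)
```

With equal spacing, product A has the higher mean (3 against 2.75); with the stretched top category, product B does (4 against 3). The median favors A under both codings, because it depends only on the order of the categories.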

More developed methodologies include Choice Modelling or Maximum Difference methods, the latter being related to the Rasch model due to the connection between Thurstone's law of comparative judgement and the Rasch model.

The above information uses material from Wikipedia and is licensed under the GNU Free Documentation License.