Behaviorally Anchored Rating Scale (BARS)

A performance appraisal method that combines numerical rating scales with specific behavioral examples at each scale point, reducing subjectivity by anchoring each rating level to observable, job-relevant behaviors identified through systematic job analysis.

What Is a Behaviorally Anchored Rating Scale (BARS)?

Key Takeaways

  • BARS is a performance evaluation method that attaches specific behavioral examples (anchors) to each point on a rating scale, replacing vague labels like 'good' or 'excellent' with descriptions of what actual performance looks like at each level.
  • Smith and Kendall developed BARS in 1963 to address the subjectivity problems inherent in graphic rating scales.
  • Each BARS dimension requires job analysis, critical incident collection, and expert review to identify the behavioral examples that define each performance level.
  • BARS is reported to reduce common rating biases (halo effect, central tendency, leniency) by 28% compared to standard graphic rating scales (Landy & Farr).
  • The primary trade-off is development cost: building a BARS system takes 6-8 weeks per job family, compared to hours for a graphic rating scale.

Think of BARS as a graphic rating scale that shows its work. Instead of asking a manager to rate an employee's 'customer service' on a scale of 1 to 5, BARS tells the manager exactly what customer service looks like at each level. Level 1 might read: 'Ignores customer inquiries for more than 24 hours and provides inaccurate information when responding.' Level 3: 'Responds to all customer inquiries within 8 hours and accurately resolves 80% of issues on first contact.' Level 5: 'Anticipates customer needs before they arise, maintains a 98%+ first-contact resolution rate, and receives unsolicited positive feedback from customers monthly.'

Now the manager isn't interpreting what a '3' means. They're matching the employee's observed behavior against a specific description. Two different managers evaluating the same employee are much more likely to arrive at the same rating because they're comparing against the same behavioral benchmark.

This precision is why BARS is often treated as a gold standard for evaluation accuracy. It's also why it's expensive to build. Those behavioral anchors don't write themselves. They come from systematic job analysis, interviews with subject matter experts, and iterative refinement.

1963: Year Smith and Kendall introduced BARS in their foundational research paper
5-9: Typical number of performance dimensions evaluated using BARS per role
28%: Reduction in rater bias compared to standard graphic rating scales (Landy & Farr)
6-8 weeks: Average time to develop a BARS scale for a single job family

How to Build a BARS System: Step-by-Step

Developing BARS is a structured process that typically takes 6-8 weeks per job family and involves job experts, HR professionals, and managers.

Step 1: Identify performance dimensions

Conduct a job analysis to determine the 5-9 key performance dimensions for the role. For a customer service representative, dimensions might include: response timeliness, problem resolution accuracy, communication clarity, product knowledge, and escalation judgment. Each dimension should be distinct (not overlapping) and observable (not an internal trait like 'attitude'). Involve experienced employees and managers in defining these dimensions to ensure job relevance.

Step 2: Collect critical incidents

Gather 50-100+ specific examples of effective and ineffective behavior for each performance dimension. Use structured interviews with subject matter experts, review incident documentation, and analyze customer feedback data. For 'response timeliness,' examples might range from 'left a voicemail unreturned for three business days' (poor) to 'called the customer back within 15 minutes during peak volume' (excellent). The more real incidents you collect, the stronger your anchors will be.

Step 3: Assign incidents to scale points (retranslation)

A separate group of subject matter experts (not the ones who provided the incidents) independently assigns each behavioral example to a scale point. If you're using a 7-point scale, each rater places every incident at the level they believe it represents. Keep only the incidents where raters show strong agreement (standard deviation below 1.5 on a 7-point scale). Discard ambiguous examples. This quality control step is what gives BARS its accuracy advantage.
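The agreement check in this step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the incident ratings are made-up sample data, the function name filter_incidents is hypothetical, the sample standard deviation is used (the text does not say whether sample or population SD is intended), and the rounded mean is used as the assigned scale point.

```python
from statistics import mean, stdev

# Hypothetical retranslation data: each incident maps to the scale points
# (1-7) that independent SME raters assigned to it.
retranslation_ratings = {
    "Called the customer back within 15 minutes during peak volume": [7, 6, 7, 7, 6],
    "Left a voicemail unreturned for three business days": [1, 2, 1, 1, 2],
    "Replied to the customer by email instead of phone": [2, 6, 4, 3, 7],
}

SD_THRESHOLD = 1.5  # agreement cutoff from the text, for a 7-point scale


def filter_incidents(ratings_by_incident, threshold=SD_THRESHOLD):
    """Keep only incidents whose rater standard deviation is below the threshold."""
    retained = {}
    for incident, ratings in ratings_by_incident.items():
        spread = stdev(ratings)  # assumption: sample SD across raters
        if spread < threshold:
            retained[incident] = {
                "scale_point": round(mean(ratings)),  # assumption: rounded mean
                "sd": round(spread, 2),
            }
    return retained


retained = filter_incidents(retranslation_ratings)
for incident, stats in retained.items():
    print(stats["scale_point"], stats["sd"], incident)
```

In this sample, the third incident is dropped because the raters disagree widely about where it belongs, which is exactly the kind of ambiguous example the retranslation step is meant to remove.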

Step 4: Select final anchors

Choose 1-2 behavioral anchors per scale point per dimension. These should be the examples with the highest inter-rater agreement and the clearest, most specific language. A 7-point BARS for one dimension will have 7-14 behavioral statements. Across 6 dimensions, you'll have 42-84 total anchors. Write each anchor in present tense, using observable behavior: 'Completes quality checks on all outgoing orders before shipping' rather than 'Is quality-conscious.'
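As a rough sketch of this selection step, the Python below keeps the one or two incidents with the tightest rater agreement at each scale point and reproduces the anchor-count arithmetic. The candidate list, SD values, and function names are hypothetical; only the first incident text comes from the example above.

```python
from collections import defaultdict

# Hypothetical retained incidents for one dimension:
# (incident text, assigned scale point, rater standard deviation).
candidates = [
    ("Completes quality checks on all outgoing orders before shipping", 6, 0.45),
    ("Checks most outgoing orders but skips checks during rush periods", 4, 0.80),
    ("Re-inspects flagged orders and records the defect type", 6, 0.60),
    ("Spot-checks one order per shift", 4, 1.10),
    ("Audits a weekly sample of shipped orders", 6, 0.90),
    ("Ships orders without any quality check", 1, 0.30),
]


def select_anchors(incidents, per_point=2):
    """Pick up to `per_point` incidents with the lowest SD at each scale point."""
    by_point = defaultdict(list)
    for text, point, sd in incidents:
        by_point[point].append((sd, text))
    return {
        point: [text for _sd, text in sorted(items)[:per_point]]
        for point, items in sorted(by_point.items())
    }


def anchor_count_range(scale_points, dimensions, per_point=(1, 2)):
    """Minimum and maximum number of anchors for the whole scale."""
    return tuple(scale_points * n * dimensions for n in per_point)


anchors = select_anchors(candidates)
print(anchors[6])                 # the two level-6 incidents with the lowest SD
print(anchor_count_range(7, 1))   # one 7-point dimension
print(anchor_count_range(7, 6))   # six 7-point dimensions
```

Ranking by standard deviation covers only the agreement criterion; clarity and specificity of the wording still need human review.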

Step 5: Pilot test and refine

Have managers use the draft BARS to rate current employees. Compare their BARS ratings to other performance indicators (output metrics, customer feedback scores, peer evaluations) to check for convergent validity. If BARS ratings don't correlate with objective performance measures, the anchors need revision. Collect manager feedback on clarity and usability, then refine the scales before full deployment.
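One simple way to run the convergent validity check is a Pearson correlation between pilot BARS ratings and an objective metric. The sketch below assumes first-contact resolution rate as the metric and uses invented pilot data; the function name pearson_r is illustrative, and what counts as an acceptable correlation is left to the team.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length (at least 2 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sample has no variance")
    return cov / sqrt(var_x * var_y)


# Hypothetical pilot data for seven employees on the 'Problem Resolution' dimension.
bars_ratings = [2, 3, 4, 4, 5, 6, 7]                                   # manager BARS ratings
first_contact_resolution = [0.55, 0.70, 0.80, 0.85, 0.88, 0.95, 0.98]  # objective metric

r = pearson_r(bars_ratings, first_contact_resolution)
print(f"Correlation between BARS ratings and first-contact resolution: {r:.2f}")
```

A weak or negative correlation on real pilot data would point to anchors that need revision, as described above. With only a handful of pilot ratings the estimate is noisy, so it is better read as a screening signal than a verdict.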

BARS Example: Customer Service Representative

Here's what a completed BARS dimension looks like for the 'Problem Resolution' competency in a customer service role.

Rating | Behavioral Anchor
7 (Outstanding) | Identifies systemic issues from individual complaints, proposes process changes that prevent recurrence, and achieves first-contact resolution on 98%+ of cases including edge cases
6 (Excellent) | Resolves non-standard issues without escalation by creatively applying policy exceptions within authority limits, maintaining a 95% first-contact resolution rate
5 (Above Average) | Accurately diagnoses the root cause of common and moderately complex issues, resolves them within established timeframes, and follows up with the customer to confirm satisfaction
4 (Average) | Handles routine issues according to established procedures, occasionally needs guidance on non-standard cases, and meets the department's 85% first-contact resolution target
3 (Below Average) | Resolves simple issues but frequently misdiagnoses moderately complex problems, resulting in repeat contacts and customer frustration
2 (Poor) | Applies incorrect solutions to common issues, fails to ask clarifying questions, and escalates cases that should be resolved at first contact
1 (Unacceptable) | Provides inaccurate information that worsens customer problems, fails to document case details, and has a first-contact resolution rate below 50%

Advantages of BARS Over Other Appraisal Methods

BARS offers measurable improvements in evaluation quality that justify the development investment for roles where accuracy matters most.

  • Reduced subjectivity. Behavioral anchors define what each rating level means, so managers compare behavior against a standard rather than relying on personal interpretation.
  • Higher inter-rater reliability. Two managers evaluating the same employee using BARS are more likely to agree than with graphic rating scales, because the anchors constrain interpretation.
  • Better feedback quality. Instead of telling an employee they're 'a 3 on communication,' the manager can point to the specific anchor: 'Your communication matches this description at level 4, and here's what level 5 looks like.'
  • Stronger legal defensibility. Job-related behavioral anchors developed through systematic job analysis demonstrate that evaluations are based on legitimate, non-discriminatory criteria.
  • Targeted development. When an employee rates at level 4 on a dimension, the level 5 anchor tells them exactly what behavior they need to demonstrate to improve.
  • Reduced halo effect. Because each dimension has unique, specific anchors, managers are less likely to rate all dimensions the same based on an overall impression.

Limitations of BARS

BARS isn't the right choice for every organization. Understanding the drawbacks helps you make an informed decision.

Development cost and time

Building BARS requires significant upfront investment. The job analysis, critical incident collection, retranslation exercise, and pilot testing typically take 6-8 weeks per job family and involve 10-20 subject matter experts. For an organization with 50 distinct job families, a full BARS implementation is a multi-year project. This is why most organizations use BARS selectively for high-impact roles rather than enterprise-wide.
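The "multi-year" claim follows from simple arithmetic, sketched below. The function name and the parallel_teams parameter are assumptions added for illustration; the 6-8 week range and the 50 job families come from the text.

```python
from math import ceil


def development_weeks(job_families, weeks_per_family=(6, 8), parallel_teams=1):
    """Estimate the (low, high) calendar weeks to build BARS for every job family.

    Assumes each team works on one job family at a time.
    """
    low, high = weeks_per_family
    batches = ceil(job_families / parallel_teams)
    return batches * low, batches * high


one_team = development_weeks(50)
three_teams = development_weeks(50, parallel_teams=3)

for label, (low, high) in [("1 team", one_team), ("3 teams", three_teams)]:
    print(f"{label}: {low}-{high} weeks (about {low / 52:.1f}-{high / 52:.1f} years)")
```

Even with several development teams working in parallel, an enterprise-wide rollout stays in multi-year territory under these assumptions, which is consistent with the advice to use BARS selectively.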

Maintenance burden

Jobs evolve. Technologies change. Customer expectations shift. The behavioral anchors that accurately described excellent performance in 2024 may be outdated by 2026. BARS requires periodic updates (ideally every 2-3 years per role) to keep anchors current. Organizations that build BARS once and never update it end up with a sophisticated system measuring the wrong behaviors.

Role specificity limits comparability

Because BARS is developed for specific job families, you can't directly compare ratings across different roles. A '5' on a customer service BARS means something different than a '5' on an engineering BARS. This complicates talent review processes that need to compare performance across functions. Some organizations address this by using BARS for within-function evaluations and a simpler scale for cross-functional talent reviews.

BARS vs. BOS: Two Behavioral Approaches Compared

Behavioral Observation Scales (BOS) are often confused with BARS. Both use behavioral descriptions, but they work differently.

Feature | BARS | BOS
What the rater does | Matches employee behavior to the closest behavioral anchor on the scale | Rates how frequently the employee demonstrates each listed behavior
Scale type | Behavioral anchors at each scale level | Frequency scale (Almost Never to Almost Always) for each behavior
Number of items | 1-2 anchors per scale point per dimension | 5-10 behaviors per dimension, each rated separately
Rating task | Choose the best-matching anchor | Rate frequency of each behavior (more items to complete)
Development complexity | High (retranslation exercise required) | Moderate (behaviors listed but no anchoring process)
Best for | Roles where the quality of behavior matters most | Roles where frequency of desired behaviors is the key indicator

Tips for Successful BARS Implementation

These practical recommendations come from organizations that have successfully deployed BARS systems at scale.

  • Start with high-impact roles, not the entire organization. Customer-facing roles, safety-critical positions, and leadership positions benefit most from BARS precision. Administrative or support roles may not justify the development investment.
  • Use existing critical incident data. If your managers already document performance incidents (formally or informally), mine those records for behavioral examples. It significantly reduces the SME interview burden.
  • Limit dimensions to 5-7 per role. More dimensions means more development time, longer evaluation forms, and rater fatigue. Focus on the dimensions that most differentiate high performers from average performers.
  • Train raters on the BARS system before deployment. A 90-minute session explaining how anchors work, how to select the best-matching level, and how to handle situations where the employee's behavior falls between two anchors improves consistency.
  • Build reviews into the planning calendar. Schedule a BARS update review every 24 months per role family. Assign a subject matter expert to flag anchors that no longer reflect current job requirements.
  • Combine BARS with other data sources. BARS is strong on evaluation accuracy, but adding self-assessment and peer feedback provides a more complete performance picture.

BARS Research and Adoption Statistics [2026]

Research data on the effectiveness and adoption of behaviorally anchored rating systems.

28%: Reduction in rater bias using BARS vs. standard graphic scales (Landy & Farr meta-analysis)
6-8 weeks: Typical development time for BARS per job family (SHRM, 2024)
23%: Share of Fortune 500 companies that use BARS for at least some roles (Mercer, 2024)
0.78: Average inter-rater reliability coefficient for BARS, vs. 0.56 for graphic scales (Borman, 1991)

Frequently Asked Questions

How is BARS different from a regular rating scale with descriptions?

The key difference is development rigor. A graphic rating scale with descriptions has anchors written by HR or a consultant based on their judgment of what good and poor performance looks like. BARS anchors are derived from actual critical incidents collected from people who do or supervise the job, then validated through a retranslation exercise where independent raters confirm that each anchor belongs at its assigned scale point. This empirical development process is what gives BARS higher reliability and validity. It's the difference between an expert's guess and a research-validated measurement instrument.

Can we use BARS for all job types?

You can, but the return on investment varies. BARS delivers the most value for roles where behavioral consistency matters: customer service, healthcare, safety-critical operations, sales, and leadership. For highly individualized roles where success depends on novel problem-solving (research scientists, creative directors), the behavioral anchors can feel constraining because there isn't one right way to perform well. For roles with very clear, quantifiable output (warehouse picking, data entry), management by objectives (MBO) or output-based metrics may be more appropriate than behavioral evaluation.

How often should BARS scales be updated?

Review and update every 2-3 years per role, or sooner if the job changes significantly due to technology adoption, reorganization, or strategic shifts. The review process is faster than the initial development because you're modifying existing anchors rather than building from scratch. Have 3-5 subject matter experts review each dimension's anchors and flag any that no longer represent current job requirements. Replace outdated anchors with new critical incidents that reflect current performance expectations.

Is BARS worth the investment for small companies?

For companies with fewer than 100 employees, full BARS development is often hard to justify. The development cost is high and the scales serve a small number of people. Small companies can capture some of BARS' benefits at lower cost by adding 2-3 behavioral examples to their existing graphic rating scale for the most important dimensions. This 'BARS-lite' approach isn't as rigorous as true BARS but significantly improves rating quality over unanchored scales.

Can BARS be used for self-assessment?

Yes, and it's particularly effective. When employees rate themselves against specific behavioral anchors rather than vague trait labels, the self-assessment becomes more honest and more useful. Employees can see exactly where their behavior falls on the scale and identify what they'd need to do differently to reach the next level. Research suggests that self-assessment accuracy improves when behavioral anchors are present: employees match their behavior against concrete descriptions instead of deciding whether they're 'good' or 'excellent' at a trait, a judgment that tends to inflate self-ratings.

How does BARS handle employees whose behavior falls between two anchors?

This is common. An employee's behavior may match level 4 in some aspects and level 5 in others. Train raters to select the level whose anchor best describes the employee's typical behavior during the evaluation period. If the behavior genuinely falls between two levels, assign the lower level and note in comments what the employee would need to demonstrate consistently to earn the higher rating. This conservative approach prevents ratings inflation and gives the employee a clear development target.
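The "assign the lower level and note the target" rule is easy to express in code, for example in a review tool. This is a minimal sketch: the function name and the shape of the anchors dictionary are hypothetical, and the two anchor texts are taken from the Problem Resolution example earlier in this article.

```python
# Anchors for two adjacent levels of the 'Problem Resolution' dimension.
anchors = {
    4: ("Handles routine issues according to established procedures, occasionally "
        "needs guidance on non-standard cases, and meets the department's 85% "
        "first-contact resolution target"),
    5: ("Accurately diagnoses the root cause of common and moderately complex issues, "
        "resolves them within established timeframes, and follows up with the "
        "customer to confirm satisfaction"),
}


def resolve_between_anchors(lower, upper, anchors):
    """Return the lower level plus a comment describing the higher-level target."""
    if upper != lower + 1:
        raise ValueError("this rule only applies to two adjacent scale levels")
    note = f"To reach level {upper}, consistently demonstrate: {anchors[upper]}"
    return lower, note


level, note = resolve_between_anchors(4, 5, anchors)
print(level)
print(note)
```

The generated comment doubles as the development target mentioned above, since it quotes the next anchor verbatim.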

Written by Adithyan RK
Fact-checked by Surya N
Published on: 25 Mar 2026