Standardized assessments given to job candidates before hiring to evaluate their skills, cognitive ability, personality, or physical fitness for a role.
Pre-employment testing refers to any standardized assessment given to a job candidate before a hiring decision is made. These tests measure abilities, skills, traits, or behaviors that predict how well someone will perform in the role. The category is broad: it includes cognitive ability tests, personality assessments, skills tests (typing speed, coding challenges, Excel proficiency), physical ability tests, integrity tests, emotional intelligence assessments, and job simulations.

The purpose is to add objective data to a hiring decision that would otherwise rely on resumes (which are self-reported and often inflated) and interviews (which are subject to interviewer bias). A 2016 meta-analysis by Frank Schmidt, published in the Annual Review of Psychology, found that cognitive ability tests have a predictive validity of 0.54 for job performance. That's higher than unstructured interviews (0.38), years of experience (0.18), and education level (0.10). It's the single strongest predictor available to employers.

SHRM's 2024 survey found that 82% of companies use some form of pre-employment assessment. The growth has been driven by tighter labor markets, higher hiring costs (average cost-per-hire is $4,129, according to SHRM), and the availability of online assessment platforms that make testing scalable.
Resumes tell you what someone claims to have done. Interviews tell you how well they present themselves. Neither reliably predicts job performance. A candidate who interviews brilliantly might struggle with the actual work. A candidate who's nervous in interviews might be your top performer. Pre-employment tests close that gap by measuring the specific abilities and traits that matter for the role. They also create a level playing field. Every candidate takes the same test under the same conditions, producing scores that can be compared objectively. This reduces the influence of interviewer bias, networking connections, and pedigree-based filtering.
Screening (background checks, reference calls, drug tests) verifies factual information about a candidate. Testing evaluates capability and fit. Screening asks "Is what this person told us true?" Testing asks "Can this person do the job well?" Both are pre-hire activities, but they serve different purposes and use different methods. Most hiring processes include both, with testing earlier in the funnel (after application, before interview) and screening later (after conditional offer).
Each test type measures a different dimension of candidate capability. The right combination depends on the role's requirements and the competencies that predict success.
| Test Type | What It Measures | Predictive Validity | Best For | Example Platforms |
|---|---|---|---|---|
| Cognitive Ability | Problem-solving, logical reasoning, numerical ability, verbal comprehension | 0.51-0.54 | Roles requiring learning speed and analytical thinking | Wonderlic, Criteria Cognitive Aptitude Test (CCAT) |
| Personality Assessment | Behavioral tendencies, work style, interpersonal patterns (Big Five model) | 0.22-0.31 | Culture fit, leadership potential, team-based roles | Hogan, SHL OPQ, 16PF |
| Skills Test | Job-specific technical abilities (coding, writing, Excel, typing) | 0.40-0.55 | Technical roles, administrative positions, content creation | HackerRank, TestGorilla, Vervoe |
| Situational Judgment | Decision-making in realistic work scenarios | 0.26-0.34 | Customer service, management, any role requiring judgment | SHL, Cappfinity, Custom-built |
| Integrity Test | Honesty, reliability, rule-following tendencies | 0.32-0.41 | Retail, finance, roles with access to assets or sensitive data | Hogan Reliability Scale, PSI |
| Physical Ability | Strength, endurance, coordination, physical task performance | Varies by test | Manufacturing, warehousing, emergency services, trades | Employer-designed, WorkSTEPS |
| Emotional Intelligence | Self-awareness, empathy, social skills, emotion regulation | 0.20-0.30 | Leadership, client-facing, healthcare, coaching roles | EQ-i 2.0, MSCEIT |
| Job Simulation | Performance on realistic tasks that mirror actual job duties | 0.35-0.50 | Any role where task performance can be simulated | Pymetrics (now Harver), Arctic Shores |
Where you place the test in your hiring funnel, which test you choose, and how you communicate it to candidates all affect whether testing improves your outcomes or just adds friction.
The most common placement is after the application screen but before the first interview. This position filters out candidates who look good on paper but lack the required abilities, saving interview time for both the hiring team and the candidate. For high-volume roles (customer service, warehouse, retail), testing can be the first step after application, reducing a pool of 500 applicants to 50 qualified candidates before any human review. For senior roles, testing typically comes after an initial recruiter screen, because executives may balk at taking a test before they've had a conversation. There's no universal right answer. Test too early and you may deter good candidates who see it as impersonal. Test too late and you've already invested interview time in candidates who would have been screened out.
Start with a job analysis. What skills, abilities, and behaviors predict success in this specific role? A software engineering position might call for a coding test (skills) plus a cognitive ability test (learning speed). A sales role might use a personality assessment (extraversion, resilience) plus a situational judgment test (objection handling). A warehouse associate role might require a physical ability test plus an integrity test. Don't test for things that don't matter. If the role doesn't require advanced math, don't include a numerical reasoning section just because it's part of a standard battery. Every irrelevant test question adds friction and increases candidate dropout.
A cut score is the minimum test result required to advance. Setting it too high eliminates good candidates. Setting it too low makes the test meaningless. There are two approaches. The norm-referenced approach compares candidates against each other and advances the top 30 to 50%. The criterion-referenced approach sets a minimum score based on the results of current high performers in the same role. The criterion-referenced approach is more defensible legally, because it's directly tied to job performance data. Whatever method you use, document the rationale for the cut score. If the test is ever challenged, you'll need to show that the threshold is job-related and consistent with business necessity.
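The criterion-referenced approach described above can be sketched in a few lines. This is an illustrative convention, not a prescribed method: the helper name and the choice of "one standard deviation below the high-performer mean" are assumptions, and all scores are hypothetical.

```python
from statistics import mean, stdev

def criterion_referenced_cut(high_performer_scores, sds_below=1.0):
    """Set a cut score relative to current high performers' test results.

    Assumed convention: the threshold sits `sds_below` standard
    deviations under the high-performer mean, so the test screens out
    candidates unlikely to reach that performance band.
    """
    mu = mean(high_performer_scores)
    sigma = stdev(high_performer_scores)
    return mu - sds_below * sigma

# Hypothetical scores from eight current top performers on the same test
scores = [78, 85, 81, 90, 76, 88, 83, 79]
print(round(criterion_referenced_cut(scores), 1))  # → 77.6
```

Whatever convention you pick, the point is that the number comes from performance data you can document, not from intuition.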
Pre-employment testing is regulated in most jurisdictions. In the United States, the EEOC enforces Title VII of the Civil Rights Act, which prohibits tests that disproportionately screen out protected groups unless the test is job-related and consistent with business necessity.
A test has adverse impact if the pass rate for any protected group (by race, sex, age, etc.) is less than four-fifths (80%) of the rate for the group with the highest pass rate. For example, if 60% of white applicants pass a test and only 40% of Black applicants pass, the selection ratio is 40/60 ≈ 0.67, below the 0.80 threshold. This doesn't automatically make the test illegal, but it shifts the burden to the employer to prove the test is valid and job-related. The EEOC's Uniform Guidelines on Employee Selection Procedures (1978) provide the framework for this analysis. Regularly audit your test results by demographic group. If adverse impact exists, consider whether alternative tests with less impact could serve the same purpose.
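The four-fifths check is simple enough to automate as part of a regular audit. A minimal sketch, using the example pass rates from the text (the function name and group labels are mine):

```python
def adverse_impact_check(pass_rates):
    """Apply the EEOC four-fifths rule to per-group pass rates.

    pass_rates: dict mapping group label -> pass rate (0..1).
    Returns (ratios, flagged): each group's selection ratio relative to
    the group with the highest pass rate, and the groups whose ratio
    falls below the 0.80 threshold.
    """
    benchmark = max(pass_rates.values())
    ratios = {group: rate / benchmark for group, rate in pass_rates.items()}
    flagged = [group for group, r in ratios.items() if r < 0.80]
    return ratios, flagged

# The example from the text: 60% vs 40% pass rates
ratios, flagged = adverse_impact_check({"group_a": 0.60, "group_b": 0.40})
print(flagged)  # → ['group_b'], since 0.40 / 0.60 ≈ 0.67 < 0.80
```

A flag from this check is a signal to investigate and validate, not a legal verdict on its own.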
The Americans with Disabilities Act (ADA) requires employers to provide reasonable accommodations for candidates with disabilities who need them to take a pre-employment test. This might include extended time, larger font sizes, screen readers, a separate testing room, or alternative test formats. The ADA also prohibits pre-offer medical examinations. Physical ability tests that are medical in nature (blood pressure, strength measurements correlated with medical conditions) can only be administered after a conditional offer of employment. Non-medical physical tests (like carrying a 40-pound box up stairs) can be given pre-offer, as long as all candidates for the same role are tested equally.
Over 35 states and 150 cities in the US have "ban the box" laws that restrict when employers can ask about criminal history. While these don't directly regulate skills or cognitive tests, they're part of the broader legal environment around pre-employment screening. Some jurisdictions also regulate personality tests, particularly those that probe into medical or psychological conditions. Know your local regulations before implementing any assessment. Illinois' Artificial Intelligence Video Interview Act and New York City's Local Law 144 (regulating AI-based employment decisions) represent the next wave of regulation that may affect automated assessment tools.
Tests that are too long, irrelevant, or poorly communicated drive away qualified candidates. Balancing assessment rigor with candidate experience is a design challenge every employer faces.
Data from assessment platforms consistently shows that candidate completion rates drop significantly after 30 minutes. TestGorilla reports an 84% completion rate for assessments under 20 minutes, dropping to 62% for assessments between 30 and 45 minutes, and below 50% for anything over an hour. For most roles, 15 to 30 minutes is the sweet spot. If you need more assessment data, break it into two shorter sessions rather than one long marathon.
Candidates who understand why they're being tested are more likely to complete the assessment and view it positively. Before the test, explain what type of assessment it is, how long it will take, how results are used in the hiring decision, and that accommodations are available upon request. After the test, share results (even summary-level feedback) when possible. Candidates who receive feedback report higher satisfaction with the hiring process, even when they don't get the job. Transparency reduces the perception that testing is a gatekeeping exercise.
Over 60% of job seekers apply via mobile devices (Indeed, 2024). If your assessment doesn't work on a phone, you're losing candidates. Modern assessment platforms (TestGorilla, Criteria, Harver) offer mobile-optimized test experiences. Verify that your assessment is fully functional on iOS and Android before deploying it. Text that's too small, buttons that are too close together, or timed sections that penalize small-screen users will create adverse experiences and potentially introduce bias (candidates with access to computers have an advantage).
A test is only useful if it actually predicts job performance (validity) and produces consistent results (reliability). Many commercially available tests have weak evidence for one or both.
Criterion validity measures how well test scores predict actual job performance (usually assessed by correlating test scores with performance ratings or turnover data). Content validity means the test covers a representative sample of the knowledge, skills, and abilities required for the job. A typing test for an administrative assistant has high content validity. A personality quiz for the same role has low content validity. Construct validity means the test measures the psychological construct it claims to measure (intelligence, conscientiousness, emotional stability). This is established through research, not through a single study. When evaluating a vendor's test, ask for criterion validity studies conducted with sample sizes of 100 or more, across multiple organizations and job types.
Be skeptical of vendors who can't provide peer-reviewed validity data, claim their test predicts performance for all job types without role-specific validation, use proprietary scoring algorithms they won't explain, don't report adverse impact statistics by demographic group, or market their tool as "AI-powered" without explaining what the AI actually does. The assessment industry includes both rigorously validated tools (SHL, Hogan, Wonderlic) and questionable products with little scientific backing. Due diligence matters. Ask for a technical manual, validation studies, and adverse impact analyses before signing a contract.
Quantifying the return on pre-employment testing requires comparing the cost of testing against the cost of bad hires and turnover. The math usually works in testing's favor.
The assessment industry is shifting toward AI-powered, game-based, and skills-first approaches. Here's what's changing.
Platforms like Pymetrics (now Harver) and Arctic Shores use neuroscience-based games instead of traditional questionnaires. Candidates play short games that measure cognitive traits (attention, memory, risk tolerance, pattern recognition) without feeling like a test. These tools claim to reduce adverse impact because they're less influenced by educational background and language proficiency. Early research is promising, but long-term validity data is still limited compared to traditional cognitive tests.
AI is being used to score open-ended responses (essays, case studies, video interviews) that traditionally required human evaluators. HireVue, Criteria, and Vervoe all use machine learning models to evaluate candidate responses. The regulatory environment is catching up: New York City's Local Law 144 (effective 2023) requires annual bias audits of AI-powered employment decision tools. Illinois and Maryland have also passed laws regulating AI in hiring. Before adopting AI-scored assessments, understand your jurisdiction's requirements and ensure the vendor conducts regular bias audits.
The shift toward skills-based hiring (dropping degree requirements in favor of demonstrated competency) is driving demand for short, targeted skills assessments. Instead of a 60-minute comprehensive battery, candidates complete a 10-minute assessment focused on one specific skill (Python proficiency, financial modeling, customer email writing). These micro-assessments reduce candidate friction and can be stacked: start with a short screen, then give passing candidates a longer, deeper assessment. TestGorilla, Toggl Hire, and Filtered are leading this approach.