When everything gets 4 or 5 stars, nothing is truly important. Ranking questions force respondents to choose - and that’s where real priorities surface.
Rating scales measure intensity. Ranking measures priority. If you need trade-offs, ratings won’t give them to you. People cluster their ratings at the top and you end up with ten items all scored 4.2 out of 5. Ranking makes them pick, and picking is where the data gets honest.
When to Use
Use ranking questions when you need to know what comes first, not just what’s good:
- Prioritize features - “Which improvements matter most to you?”
- Test messaging - “Which value proposition resonates best?”
- Inform roadmaps - “Rank these problems by impact on your work”
- Compare ideas - “Which concepts do you prefer?”
- Allocate resources - “Where should we invest next?”
- Segment audiences - Different groups rank differently, and that difference is the insight
Ranking Methods
Four methods. Each works for a different list size and context. The system picks one based on your option count, or you choose manually.
Drag & Drop (Sort)
Respondents drag items into their preferred order. Simple, familiar, fast.
Use it for 5 or fewer items. Beyond that, people carefully place the top 2-3, then drag the rest into random spots just to finish. The middle of a long drag-and-drop list is noise, not signal.
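The scoring behind drag-and-drop isn't spelled out here, but one common way to aggregate orderings is to average each item's position across respondents - a lower average means higher priority. A minimal sketch with made-up responses:

```python
from collections import defaultdict

# Hypothetical submitted orderings: each inner list is one respondent's
# top-to-bottom drag-and-drop order.
responses = [
    ["Faster search", "Better onboarding", "Dark mode", "Export to CSV"],
    ["Better onboarding", "Faster search", "Export to CSV", "Dark mode"],
    ["Faster search", "Export to CSV", "Better onboarding", "Dark mode"],
]

positions = defaultdict(list)
for ordering in responses:
    for position, item in enumerate(ordering, start=1):
        positions[item].append(position)

# Lower average position = ranked higher overall.
average_position = {item: sum(p) / len(p) for item, p in positions.items()}
for item, avg in sorted(average_position.items(), key=lambda kv: kv[1]):
    print(f"{item}: average position {avg:.2f}")
```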
Pairwise Comparison
Two items at a time. Pick the better one. Repeat.
This mirrors how people naturally decide things. Comparing two options is fast and feels effortless. Behind the scenes, each option gets a win rate based on how often it was chosen, producing a full ranking from simple binary choices.
You choose between “complete” pairwise (every possible pair shown) or “partial” (each respondent sees a sample). With 10 options there are 45 pairs; with 20 there are 190. Partial pairwise spreads the work across respondents so nobody votes on all combinations.
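A quick sketch of both ideas - the pair count and the win-rate scoring - using made-up option names and votes. The exact scoring the system applies may differ:

```python
from itertools import combinations
from collections import Counter

options = [f"Option {i}" for i in range(1, 11)]  # 10 options

# Complete pairwise shows every unordered pair: n * (n - 1) / 2.
all_pairs = list(combinations(options, 2))
print(len(all_pairs))  # 45 pairs for 10 options; 190 for 20

# Hypothetical recorded votes: (winner, loser) tuples from respondents.
votes = [
    ("Option 1", "Option 2"),
    ("Option 1", "Option 3"),
    ("Option 2", "Option 3"),
    ("Option 3", "Option 1"),
]

wins = Counter(winner for winner, _ in votes)
appearances = Counter()
for winner, loser in votes:
    appearances[winner] += 1
    appearances[loser] += 1

# Win rate: how often an option was chosen when it was shown.
win_rates = {opt: wins[opt] / appearances[opt] for opt in appearances}
for opt, rate in sorted(win_rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{opt}: {rate:.0%}")
```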
Use it for 6-15 options. It can feel repetitive if one person sees too many comparisons - keep the number of pairs per respondent reasonable.
MaxDiff (Best-Worst Scaling)
Respondents see small subsets (3-5 options at a time) and pick the most and least important from each set. After several rounds with different combinations, votes are scored into a full ranking.
Each screen gives you two data points - a “best” and a “worst” signal - so MaxDiff extracts twice the information per screen compared to pairwise. It handles long lists because respondents never see the full set at once.
Raw scores are based on best and worst selections, then normalized for comparison.
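As a rough illustration of that scoring, here is a common count-based approach - best picks minus worst picks, divided by how often the item was shown - with made-up screens. The exact formula the system uses may differ:

```python
from collections import Counter

# Hypothetical MaxDiff screens: each respondent saw a subset and
# picked one "best" and one "worst".
screens = [
    {"shown": ["A", "B", "C", "D"], "best": "A", "worst": "D"},
    {"shown": ["B", "C", "E", "F"], "best": "C", "worst": "F"},
    {"shown": ["A", "D", "E", "F"], "best": "A", "worst": "F"},
]

best = Counter()
worst = Counter()
shown = Counter()
for screen in screens:
    shown.update(screen["shown"])
    best[screen["best"]] += 1
    worst[screen["worst"]] += 1

# Count-based score: (best picks - worst picks) / times shown.
raw_scores = {item: (best[item] - worst[item]) / shown[item] for item in shown}
for item, score in sorted(raw_scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{item}: {score:+.2f}")
```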
Use it for 8+ options. Sweet spot is 10-30 items with 3-5 shown per screen.
The catch: respondents must be familiar with all the options. Picking the best only requires recognizing one strong option. Picking the worst requires understanding all of them. If awareness varies, respondents guess - and guesses turn into noise.
Budget Allocation
Give respondents a fixed pool of points and let them distribute it. Also known as constant sum.
This is the only method that captures magnitude. Pairwise and MaxDiff tell you respondents prefer A over B. Budget allocation tells you they’d spend 40 points on A and 5 on B - they care about A eight times more.
The tradeoff: people tend to dump most points into one item and scatter the rest without much thought. Don’t over-interpret small differences between low-scoring items. Keep the list under 8 options - splitting 100 points across 15 items is more arithmetic than research.
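A minimal sketch of reading a constant-sum response - checking it uses the full pool and comparing point ratios - with hypothetical options and numbers:

```python
# Hypothetical constant-sum response: 100 points split across options.
POOL = 100
allocation = {
    "Faster search": 40,
    "Better onboarding": 25,
    "Dark mode": 20,
    "Export to CSV": 10,
    "Keyboard shortcuts": 5,
}

assert sum(allocation.values()) == POOL, "Allocation must use the full pool"

# Unlike pairwise or MaxDiff, the points carry magnitude directly:
# 40 vs 5 means the respondent weights one item 8x more than the other.
top = max(allocation, key=allocation.get)
for item, points in sorted(allocation.items(), key=lambda kv: kv[1], reverse=True):
    ratio = points / allocation[top]
    print(f"{item}: {points} points ({ratio:.0%} of the top item)")
```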
Which Method Should I Pick?
| Method | Best for | Options | Respondent effort |
|---|---|---|---|
| Drag & Drop | Quick ranking of short lists | 2-5 | Low |
| Pairwise | Medium lists, mobile-friendly | 6-15 | Low per vote |
| MaxDiff | Long lists, research projects | 8-30+ | Moderate |
| Budget | When magnitude matters | 3-8 | Higher |
Simple rules: fewer than 6 items, use drag-and-drop. Between 6 and 15, use pairwise. Above that, use MaxDiff - unless you need to know how much more one option matters than another, in which case use budget with a short list.
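Those rules of thumb, expressed as a small helper. This mirrors the guidance above, not the system's auto-selection logic:

```python
def pick_method(option_count: int, need_magnitude: bool = False) -> str:
    """Rule of thumb from the table above, not the product's auto algorithm."""
    if need_magnitude and option_count <= 8:
        return "budget"
    if option_count < 6:
        return "drag_and_drop"
    if option_count <= 15:
        return "pairwise"
    return "maxdiff"

print(pick_method(4))                        # drag_and_drop
print(pick_method(12))                       # pairwise
print(pick_method(25))                       # maxdiff
print(pick_method(6, need_magnitude=True))   # budget
```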
Configuration Options
- Algorithm - Leave on `auto` unless you have a reason to override. The system picks the method that fits your list size and minimizes respondent fatigue
- Mode (Pairwise) - `full` shows every possible pair to each respondent. `partial` samples a subset and spreads coverage across respondents. Defaults to `partial`, which is what you want for most surveys
- Items per screen (MaxDiff) - How many options appear in each subset, 3-10. Defaults to `min(5, list size)`. Lower values are easier for respondents; higher values extract more data per screen
- Target views per item (MaxDiff, Pairwise partial) - How many times each option appears across all screens, 1-10. Defaults to 3. The system reduces this to 2 automatically if the resulting survey would exceed 20 screens
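Taken together, a ranking question configured with these options might look something like the sketch below. The field names mirror the settings above, but the shape is illustrative, not an actual API schema:

```python
# Illustrative only: field names mirror the options described above,
# not a real configuration format.
ranking_question_config = {
    "algorithm": "auto",             # or "sort", "pairwise", "maxdiff", "budget"
    "pairwise": {
        "mode": "partial",           # "full" shows every pair to each respondent
    },
    "maxdiff": {
        "items_per_screen": 5,       # 3-10; default is min(5, list size)
        "target_views_per_item": 3,  # 1-10; dropped to 2 if over 20 screens
    },
}
```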
Interpreting Results
Collecting rankings is half the job. Reading them correctly is the other half.
All methods produce different raw scores - win rates, best-worst differences, point totals, position order. To make results comparable regardless of method, we normalize everything to a 0-100 scale. An item scoring 85 means the same thing whether it came from pairwise, MaxDiff, or budget allocation. This lets you switch methods between surveys or compare results across different ranking questions without recalibrating how you read the numbers.
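The exact normalization isn't described here, but a simple min-max rescale to 0-100 captures the idea - scores are relative to the other items in the list:

```python
def normalize_to_100(raw_scores: dict) -> dict:
    """Min-max rescale to 0-100 (one simple normalization; the exact
    formula the product uses may differ)."""
    lo, hi = min(raw_scores.values()), max(raw_scores.values())
    if hi == lo:
        return {item: 50.0 for item in raw_scores}  # no spread to rescale
    return {
        item: 100 * (score - lo) / (hi - lo)
        for item, score in raw_scores.items()
    }

# Works the same whether the raw scores are win rates, best-worst
# differences, or point totals.
print(normalize_to_100({"A": 0.67, "B": 0.50, "C": 0.33}))
```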
- Small gaps between middle-ranked items are often statistical noise. Large gaps at the top or bottom are where decisions should focus.
- Scores are relative, not absolute. A score of 85 means that option ranked highly compared to the others in your list. It does not mean 85% of people want it. Change the list and the scores shift
- Budget results show magnitude, others don’t. Pairwise and MaxDiff tell you the order. Budget tells you how much more one item matters than another. A 10-point gap between two items in budget allocation is meaningful. The same gap in normalized pairwise scores may not be
- Segment before you conclude. Overall rankings often hide the real story. Your power users and casual users may rank the same list in opposite order. Break results down by audience segment before making decisions
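A minimal sketch of that segment breakdown, assuming you have per-respondent scores and a segment label for each response:

```python
from collections import defaultdict

# Hypothetical per-respondent scores plus an audience segment label.
responses = [
    {"segment": "power_user", "scores": {"API access": 90, "Onboarding": 20}},
    {"segment": "power_user", "scores": {"API access": 80, "Onboarding": 30}},
    {"segment": "casual",     "scores": {"API access": 15, "Onboarding": 85}},
    {"segment": "casual",     "scores": {"API access": 25, "Onboarding": 75}},
]

by_segment = defaultdict(lambda: defaultdict(list))
for response in responses:
    for item, score in response["scores"].items():
        by_segment[response["segment"]][item].append(score)

# Average per segment, then rank - opposite orders are the insight.
for segment, items in by_segment.items():
    averages = {item: sum(s) / len(s) for item, s in items.items()}
    ranked = sorted(averages.items(), key=lambda kv: kv[1], reverse=True)
    print(segment, ranked)
```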
Best Practices
Write Comparable Options
All items should be the same type of thing. Mixing features with bug fixes and business goals gives you a ranking that nobody can act on. “Faster search” and “Better onboarding” are comparable. “Faster search” and “Fix login bug” are not.
Keep Lists Focused
Every option you add costs respondent attention and adds noise to results. Only include items you’d actually act on. If an option ranks last and you still wouldn’t cut it, leave it out.
Watch for Position Bias
In drag-and-drop, items at the top tend to stay there. Randomize the starting order so position doesn’t become a hidden variable in your data.
Match Method to Device
Pairwise and MaxDiff work well on phones. Drag-and-drop is harder on small screens - dragging on mobile is fiddly. Budget allocation requires concentration that a phone-on-the-bus respondent won’t give you.