uniq Command Guide
The uniq command filters out repeated lines in a file. Learn how to find unique entries and count duplicates.
Last updated: 2024
Dai Aoki
CEO at init, Inc. / CTO at US & JP startups / Creator of WebTerm
Quick Reference
Basic

- `uniq file` removes adjacent dups
- `sort file | uniq` removes all dups
- `uniq -c file` counts occurrences

Options

- `-d` shows only duplicates
- `-u` shows only unique lines
- `-i` ignores case

Common

- `sort | uniq -c` counts each line
- `sort | uniq -c | sort -rn` shows top counts
- `sort -u file` is an alternative dedup
Basic Usage
uniq removes consecutive duplicate lines. The input must be sorted first.
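A quick sketch of why sorting matters, using inline printf data: uniq compares each line only to the one directly before it.

```bash
# Unsorted: the second "a" is not adjacent to the first, so it survives
printf 'a\nb\na\n' | uniq          # prints: a, b, a (three lines)

# Sorted first: duplicates become adjacent and are removed
printf 'a\nb\na\n' | sort | uniq   # prints: a, b (two lines)
```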
```bash
# Remove duplicates (input must be sorted)
sort file.txt | uniq
```

Warning
uniq only removes consecutive duplicates. Always use sort first to group duplicates together.

Common Options
uniq Options
| Option | Description |
| --- | --- |
| `-c` | Prefix lines with occurrence count |
| `-d` | Only print duplicate lines |
| `-u` | Only print unique lines |
| `-i` | Ignore case when comparing |
| `-f N` | Skip first N fields |
| `-s N` | Skip first N characters |
| `-w N` | Compare only first N characters |
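The three most common flags, shown side by side on a tiny inline dataset (a sketch using printf instead of a file):

```bash
printf 'apple\nbanana\napple\n' | sort | uniq -c   # counts: 2 apple, 1 banana
printf 'apple\nbanana\napple\n' | sort | uniq -d   # prints: apple
printf 'apple\nbanana\napple\n' | sort | uniq -u   # prints: banana
```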
Counting Occurrences
```bash
# Count occurrences of each line
sort file.txt | uniq -c

# Output example:
#   3 apple
#   1 banana
#   2 cherry
```

Finding Duplicates
```bash
# Show only lines that appear more than once
sort file.txt | uniq -d

# Show only lines that appear exactly once
sort file.txt | uniq -u
```

Case-Insensitive
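Case differences hide duplicates from plain uniq, so `-i` is paired with `sort -f`, which folds case so mixed-case duplicates end up adjacent. A quick sketch on inline data (which spelling survives in the output depends on sort's tie-breaking):

```bash
# "Apple" and "apple" collapse into a single line
printf 'Apple\nbanana\napple\n' | sort -f | uniq -i
```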
```bash
# Treat "Apple" and "apple" as duplicates
sort -f file.txt | uniq -i
```

Practical Examples
Find most common lines
```bash
sort file.txt | uniq -c | sort -rn | head -10
```

Count unique visitors from log
```bash
# Extract IPs and count unique
awk '{print $1}' access.log | sort | uniq | wc -l

# Most frequent visitors
awk '{print $1}' access.log | sort | uniq -c | sort -rn | head -10
```

Find duplicate files by checksum
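The `-d` form below prints one representative per duplicate group. GNU uniq also has `-D` (`--all-repeated`), which prints every member of each group; this is a GNU extension that may be missing from BSD/macOS uniq. A sketch, assuming GNU coreutils:

```bash
# Print all files whose checksums repeat, not just one per group
md5sum * | sort | uniq -D -w32
```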
```bash
# An MD5 hash is 32 hex characters, so compare only the hash field
md5sum * | sort | uniq -d -w32
```

Count word frequency
```bash
tr '[:upper:]' '[:lower:]' < file.txt | tr -s ' ' '\n' | sort | uniq -c | sort -rn
```

Find commands you use most
```bash
history | awk '{print $2}' | sort | uniq -c | sort -rn | head -10
```

Remove duplicate lines from file
```bash
sort file.txt | uniq > unique.txt

# Or use sort -u
sort -u file.txt > unique.txt
```

Find lines in file1 but not in file2
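One standard approach uses comm, which performs this set difference directly on sorted input: column 2 holds lines only in file2 and column 3 lines common to both, so suppressing them with `-23` leaves lines unique to file1 (the process substitution here assumes bash or zsh):

```bash
comm -23 <(sort file1.txt) <(sort file2.txt)
```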
```bash
# Listing file2 twice guarantees every file2 line is a duplicate,
# so only lines exclusive to file1 survive uniq -u
sort file1.txt file2.txt file2.txt | uniq -u
```

Skip Fields or Characters
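A concrete sketch of `-f` first, using inline data in a hypothetical log format: the timestamps in field 1 differ, but the message from field 2 onward repeats, so `-f1` treats the lines as duplicates. The input is already grouped, so no sort is needed here:

```bash
# Compare from the second field onward
printf '09:00 login\n09:05 login\n09:10 logout\n' | uniq -f1
# prints: 09:00 login
#         09:10 logout
```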
```bash
# Skip first field (compare from second field)
sort -k2 data.txt | uniq -f1

# Skip first 10 characters
sort data.txt | uniq -s10

# Compare only first 20 characters
sort data.txt | uniq -w20
```

uniq vs sort -u
sort -u is often simpler for basic deduplication:
```bash
# These are equivalent for basic deduplication
sort file.txt | uniq
sort -u file.txt

# But uniq has more features
sort file.txt | uniq -c   # Count occurrences
sort file.txt | uniq -d   # Show only duplicates
```

Without Sorting
If you need to remove duplicates while preserving order, use awk:
```bash
# Remove duplicates, preserve order
awk '!seen[$0]++' file.txt
```

Tip
The awk method preserves the original order and catches non-consecutive duplicates, but it keeps every unique line in memory, so it can be costly on very large files.
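A quick sketch of the order-preserving behavior on inline data:

```bash
# First occurrence of each line is kept, in original order
printf 'b\na\nb\nc\na\n' | awk '!seen[$0]++'   # prints: b, a, c
```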
Summary
uniq is essential for data deduplication. Key takeaways:
- Always sort before uniq
- Use `uniq -c` to count occurrences
- Use `uniq -d` to find duplicates
- Use `uniq -u` to find unique lines
- Use `sort -u` for simple deduplication