uniq Command Guide
The uniq command filters out repeated lines in a file. Learn how to find unique entries and count duplicates.
Last updated: 2024
Dai Aoki
CEO at init, Inc. / CTO at US & JP startups / Creator of WebTerm
Quick Reference
Basic

- `uniq file` removes adjacent dups
- `sort file | uniq` removes all dups
- `uniq -c file` counts occurrences

Options

- `-d` shows only duplicates
- `-u` shows only unique lines
- `-i` ignores case

Common

- `sort | uniq -c` counts each line
- `sort | uniq -c | sort -rn` shows top counts
- `sort -u file` is an alternative dedup
Basic Usage
uniq removes consecutive duplicate lines. The input must be sorted first.
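A quick sketch of why sorting matters, using inline printf data: uniq compares each line only to the one directly before it.

```bash
# Unsorted: the second "a" is not adjacent to the first, so it survives
printf 'a\nb\na\n' | uniq          # prints: a, b, a (three lines)

# Sorted first: duplicates become adjacent and are removed
printf 'a\nb\na\n' | sort | uniq   # prints: a, b (two lines)
```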
```bash
# Remove duplicates (input must be sorted)
sort file.txt | uniq
```

Warning
uniq only removes consecutive duplicates. Always use sort first to group duplicates together.

Common Options
uniq Options
| Option | Description |
| --- | --- |
| `-c` | Prefix lines with occurrence count |
| `-d` | Only print duplicate lines |
| `-u` | Only print unique lines |
| `-i` | Ignore case when comparing |
| `-f N` | Skip first N fields |
| `-s N` | Skip first N characters |
| `-w N` | Compare only first N characters |
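The three most common flags, shown side by side on a tiny inline dataset (a sketch using printf instead of a file):

```bash
printf 'apple\nbanana\napple\n' | sort | uniq -c   # counts: 2 apple, 1 banana
printf 'apple\nbanana\napple\n' | sort | uniq -d   # prints: apple
printf 'apple\nbanana\napple\n' | sort | uniq -u   # prints: banana
```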
Counting Occurrences
```bash
# Count occurrences of each line
sort file.txt | uniq -c

# Output example:
#   3 apple
#   1 banana
#   2 cherry
```

Finding Duplicates
```bash
# Show only lines that appear more than once
sort file.txt | uniq -d

# Show only lines that appear exactly once
sort file.txt | uniq -u
```

Case-Insensitive
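Case differences hide duplicates from plain uniq, so `-i` is paired with `sort -f`, which folds case so mixed-case duplicates end up adjacent. A quick sketch on inline data (which spelling survives in the output depends on sort's tie-breaking):

```bash
# "Apple" and "apple" collapse into a single line
printf 'Apple\nbanana\napple\n' | sort -f | uniq -i
```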
```bash
# Treat "Apple" and "apple" as duplicates
sort -f file.txt | uniq -i
```

Practical Examples
Find most common lines
```bash
sort file.txt | uniq -c | sort -rn | head -10
```

Count unique visitors from log
```bash
# Extract IPs and count unique
awk '{print $1}' access.log | sort | uniq | wc -l

# Most frequent visitors
awk '{print $1}' access.log | sort | uniq -c | sort -rn | head -10
```

Find duplicate files by checksum
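The `-d` form below prints one representative per duplicate group. GNU uniq also has `-D` (`--all-repeated`), which prints every member of each group; this is a GNU extension that may be missing from BSD/macOS uniq. A sketch, assuming GNU coreutils:

```bash
# Print all files whose checksums repeat, not just one per group
md5sum * | sort | uniq -D -w32
```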
```bash
# An MD5 hash is 32 hex characters, so compare only the hash field
md5sum * | sort | uniq -d -w32
```

Count word frequency
```bash
tr '[:upper:]' '[:lower:]' < file.txt | tr -s ' ' '\n' | sort | uniq -c | sort -rn
```

Find commands you use most
```bash
history | awk '{print $2}' | sort | uniq -c | sort -rn | head -10
```

Remove duplicate lines from file
```bash
sort file.txt | uniq > unique.txt

# Or use sort -u
sort -u file.txt > unique.txt
```

Find lines in file1 but not in file2
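One standard approach uses comm, which performs this set difference directly on sorted input: column 2 holds lines only in file2 and column 3 lines common to both, so suppressing them with `-23` leaves lines unique to file1 (the process substitution here assumes bash or zsh):

```bash
comm -23 <(sort file1.txt) <(sort file2.txt)
```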
```bash
# Listing file2 twice guarantees every file2 line is a duplicate,
# so only lines exclusive to file1 survive uniq -u
sort file1.txt file2.txt file2.txt | uniq -u
```

Skip Fields or Characters
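A concrete sketch of `-f` first, using inline data in a hypothetical log format: the timestamps in field 1 differ, but the message from field 2 onward repeats, so `-f1` treats the lines as duplicates. The input is already grouped, so no sort is needed here:

```bash
# Compare from the second field onward
printf '09:00 login\n09:05 login\n09:10 logout\n' | uniq -f1
# prints: 09:00 login
#         09:10 logout
```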
```bash
# Skip first field (compare from second field)
sort -k2 data.txt | uniq -f1

# Skip first 10 characters
sort data.txt | uniq -s10

# Compare only first 20 characters
sort data.txt | uniq -w20
```

uniq vs sort -u
sort -u is often simpler for basic deduplication:
```bash
# These are equivalent for basic deduplication
sort file.txt | uniq
sort -u file.txt

# But uniq has more features
sort file.txt | uniq -c   # Count occurrences
sort file.txt | uniq -d   # Show only duplicates
```

Without Sorting
If you need to remove duplicates while preserving order, use awk:
```bash
# Remove duplicates, preserve order
awk '!seen[$0]++' file.txt
```

Tip
The awk method preserves the original order and catches non-consecutive duplicates, but it keeps every unique line in memory, so it can be costly on very large files.
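A quick sketch of the order-preserving behavior on inline data:

```bash
# First occurrence of each line is kept, in original order
printf 'b\na\nb\nc\na\n' | awk '!seen[$0]++'   # prints: b, a, c
```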
Summary
uniq is essential for data deduplication. Key takeaways:
- Always sort before uniq
- Use `uniq -c` to count occurrences
- Use `uniq -d` to find duplicates
- Use `uniq -u` to find unique lines
- Use `sort -u` for simple deduplication