Search for duplicate files
When browsing a filesystem, the file browser can show the checksum or
hash value of each file on demand in the last column, making it possible
to identify binary identical files, which have the same checksum/hash
value.
Clicking the name of the function (in the context menu, "File tools"
group) displays the hash or checksum value for all (or selected) files.
Clicking "Find duplicates" displays the size and the hash or checksum
value only for duplicate files (the same binary identical content found
in two or more distinct files) and reports the number of non-unique
files. In both cases, sorting by the CRC column groups all files (in the
same folder, or matching the same search filter) with identical hash or
checksum.
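The grouping described above can be sketched in a few lines of Python. This is only an illustration of the general technique, not PeaZip's own implementation: the function names `file_digest` and `find_duplicates` are hypothetical, SHA-256 is an arbitrary choice of algorithm, and comparing file sizes first is an assumed optimization to avoid hashing files that cannot have a duplicate.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_duplicates(root, algorithm="sha256"):
    """Return lists of paths under root that share the same size and digest."""
    # Step 1: bucket regular files by size (cheap, no file content is read).
    by_size = defaultdict(list)
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Step 2: hash only the files whose size is shared with another file.
    by_digest = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        for path in paths:
            by_digest[(size, file_digest(path, algorithm))].append(path)

    # Step 3: keep only the groups that contain two or more files.
    return [sorted(group) for group in by_digest.values() if len(group) > 1]
```

Calling `find_duplicates("some/folder")` returns one list per set of identical files; the total length of those lists corresponds to the number of non-unique files.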
The verification function can be set in the main application's menu:
Checksum/hash). A wide selection of algorithms is available, ranging
from simple checksum functions such as Adler32 and the CRC family
(CRC16/24/32 and CRC64), to hash functions like eDonkey/eMule, MD4, and
MD5, and to cryptographically strong hashes such as Ripemd160, SHA-1,
SHA-2 (SHA224/256/384 and SHA512), and Whirlpool512.
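To illustrate the difference between these families, the following sketch computes two short checksums and several hash values for the same data using only Python's standard library. It is not related to PeaZip's code; the helper name `summarize` is hypothetical, and only algorithms that ship with standard Python builds are shown (Ripemd160 and Whirlpool depend on the local OpenSSL build and are left out).

```python
import hashlib
import zlib


def summarize(data: bytes) -> dict:
    """Return checksum and hash values of data as lowercase hex strings."""
    return {
        # 32-bit checksums: fast, good against accidental corruption only.
        "adler32": format(zlib.adler32(data) & 0xFFFFFFFF, "08x"),
        "crc32": format(zlib.crc32(data) & 0xFFFFFFFF, "08x"),
        # Hash functions: longer outputs, far fewer accidental collisions.
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
        "sha512": hashlib.sha512(data).hexdigest(),
    }


if __name__ == "__main__":
    for name, value in summarize(b"hello").items():
        print(f"{name:8} {value}")
```

The output length grows from 8 hex digits for the checksums to 128 for SHA512, which is one reason longer hashes are preferred when accidental or deliberate collisions are a concern.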
When browsing an archive, this on-demand verification is not
available, but some archive types provide the same integrity-checking
information by saving, for each archived object, a pre-computed
checksum or hash value that depends on the archive format and on the
archival settings employed (e.g. CRC32 in ZIP archives). This
allows sorting the archive content by the CRC column to group identical
files and find duplicates.
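For ZIP archives, the stored CRC32 can be read without extracting anything. The sketch below uses Python's standard `zipfile` module to group archive members by stored CRC32 and uncompressed size; the function name `zip_duplicates` is hypothetical. Since CRC32 is only 32 bits wide, a match should be treated as a strong hint rather than proof of identical content.

```python
import zipfile
from collections import defaultdict


def zip_duplicates(archive_path):
    """Group member names in a ZIP archive by stored (CRC32, size)."""
    groups = defaultdict(list)
    with zipfile.ZipFile(archive_path) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue  # directory entries carry no file content
            # CRC and file_size come from the archive's metadata;
            # no member is decompressed here.
            groups[(info.CRC, info.file_size)].append(info.filename)
    return [names for names in groups.values() if len(names) > 1]
```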
Find similar images
When browsing a filesystem, PeaZip can display image thumbnails: from the
menu, under Organize, check "Show picture thumbnails", or select a file
browser preset style that shows thumbnails.
While checksum/hash based inspection finds exactly identical
files (and images), thumbnails allow the user to find similar images
(e.g. the same picture or graphic saved in different formats, or with
different color depth or compression settings, or scaled to different
sizes), to help in deciding whether the (pseudo)duplication is
acceptable, and which copy to keep.
Verify multiple checksum and hash values at once
The "Check files" utility in the
"File tools" submenu (context menu) compares multiple hash and
checksum values of multiple files at once. Employing
multiple functions, and relying on cryptographically
strong hash algorithms such as Ripemd, SHA-2, and Whirlpool, can
identify even deliberate attempts at forging identical-looking files.
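The idea of checking several functions at once can be sketched as follows. This is an assumption-laden illustration rather than the utility's actual code: the names `multi_digest` and `same_by_all` are hypothetical, and SHA-256 and SHA-512 stand in for the SHA-2 family because they are always available in Python's `hashlib`. Each file is read once and every hasher is fed the same chunks.

```python
import hashlib


def multi_digest(path, algorithms=("sha256", "sha512"), chunk_size=1 << 20):
    """Compute several digests of one file in a single pass over its data."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def same_by_all(paths, algorithms=("sha256", "sha512")):
    """Return True only if every file matches the first under all algorithms."""
    digests = [multi_digest(path, algorithms) for path in paths]
    return all(d == digests[0] for d in digests[1:])
```

Requiring a match under every selected algorithm means a forged file would have to collide under all of them simultaneously, which is far harder than defeating a single function.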
Alternative method: byte-to-byte comparison
The "File tools" submenu also offers a byte-to-byte comparison between
two files; unlike the checksum/hash method it is not subject to
collisions, and it can report which bytes differ, so it tells not only
whether two files are identical, but also what changes were made
between the two versions.
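A minimal sketch of such a comparison is shown below. It is not PeaZip's implementation; the function name `compare_bytes` and the `max_diffs` limit are hypothetical, and the sketch assumes regular files on disk, so that both files are read in equally sized chunks until one of them ends.

```python
def compare_bytes(path_a, path_b, max_diffs=10, chunk_size=1 << 16):
    """Compare two files byte by byte.

    Returns (identical, diffs), where diffs lists up to max_diffs tuples of
    (offset, byte_a, byte_b); None marks a position past the end of a file.
    """
    diffs = []
    identical = True
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if not a and not b:
                break  # both files are exhausted
            if a != b:
                identical = False
                for i in range(max(len(a), len(b))):
                    x = a[i] if i < len(a) else None
                    y = b[i] if i < len(b) else None
                    if x != y and len(diffs) < max_diffs:
                        diffs.append((offset + i, x, y))
            offset += max(len(a), len(b))
    return identical, diffs
```

Unlike a hash comparison, the result cannot be a false positive, at the cost of needing both files available at the same time.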