WHY DATA DEDUPLICATION?
Data quality depends on every aspect of the data being handled correctly. Duplicated data cause problems both for the people who work with the software and in the form of unexpected expenses. In the specific field of address deduplication, for example, duplicates lead to double mailings of advertising material or official notifications to the same company, which not only wastes money but also harms the sender’s image.
When sending correspondence, a correct, complete, validated and duplicate-free database is a sign of professionalism, which is why integrating automatic validation software brings considerable benefits. When several databases need to be merged, deduplication can be applied to both natural persons and legal entities. It relies on the “match-code” method, which makes it possible to eliminate duplicates, identify families and enhance internal data.
- Data correlation
- Integration of the registry data, records, addresses and archives
- Unification of data and documents about a single subject
- Management of a single position for the subject and assignment of a PIN
- Management of duplicated data
MATCH CODE AND DUPLICATE CHECK
The output of deduplication is the match-code: a string of letters or digits in which selected elements of the registry data (surname, enterprise name, name, sex, location, postcode, DUG, street, number) are recorded or encoded. The match-code identifies groups of duplicated records very reliably.
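As an illustration of the idea, the sketch below builds a match-code by normalizing a few registry fields and concatenating fixed-width fragments of each. The field layout, the widths, the "_" padding character and the names normalize, MATCH_CODE_LAYOUT and match_code are all assumptions made for this example; the text above does not specify how the actual match-code is composed.

```python
import unicodedata


def normalize(value: str) -> str:
    """Uppercase, strip accents, and keep only letters and digits."""
    decomposed = unicodedata.normalize("NFKD", value)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return "".join(ch for ch in ascii_only.upper() if ch.isalnum())


# Hypothetical layout: (field name, number of characters kept in the code).
MATCH_CODE_LAYOUT = [
    ("surname", 4),
    ("name", 2),
    ("sex", 1),
    ("postcode", 5),
    ("street", 4),
    ("number", 3),
]


def match_code(record: dict) -> str:
    """Concatenate fixed-width fragments of the normalized fields."""
    parts = []
    for field, width in MATCH_CODE_LAYOUT:
        token = normalize(str(record.get(field, "")))[:width]
        parts.append(token.ljust(width, "_"))  # pad short or missing values
    return "".join(parts)


a = {"surname": "Rossi", "name": "Mario", "sex": "M",
     "postcode": "20121", "street": "Manzoni", "number": "12"}
b = {"surname": "ROSSI ", "name": "mário", "sex": "m",
     "postcode": "20121", "street": "manzoni", "number": "12"}

print(match_code(a))                   # ROSSMAM20121MANZ12_
print(match_code(a) == match_code(b))  # True
```

Two records that differ only in case, accents or stray spaces produce the same code, so they can be grouped by a simple equality test on the match-code.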
Operationally, Duplicate Check involves:
- Database deduplication
- Parameter management for keys and rules
- Managing probable duplicates
- Group management
- Creating match-codes/phonetic keys
- Database matching
- Relationships and ties management
- Converging databases
- Personal data deduplication
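Two of the steps listed above, creating phonetic keys and managing groups, can be sketched as follows. The source does not say which phonetic algorithm Duplicate Check uses; this example swaps in classic Soundex, which was designed for English surnames, purely as a stand-in. The function names soundex and duplicate_groups and the key format (phonetic key plus postcode) are assumptions for the example.

```python
from collections import defaultdict

# Soundex digit for each consonant; vowels, Y, H and W have no digit.
CODES = {}
for letters, digit in [("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")]:
    for letter in letters:
        CODES[letter] = digit


def soundex(name: str) -> str:
    """Return the 4-character Soundex key of a name ('' if it has no letters)."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    key = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            key += code
        prev = code  # a vowel resets prev, so a repeated code is kept
    return (key + "000")[:4]


def duplicate_groups(records: list) -> dict:
    """Group records by (phonetic surname key, postcode); keep groups of 2+."""
    groups = defaultdict(list)
    for rec in records:
        key = soundex(rec["surname"]) + "|" + rec["postcode"]
        groups[key].append(rec)
    return {k: v for k, v in groups.items() if len(v) > 1}


records = [
    {"surname": "Rossi", "postcode": "20121"},
    {"surname": "Rosi", "postcode": "20121"},
    {"surname": "Bianchi", "postcode": "00184"},
]
print(duplicate_groups(records))
```

Here "Rossi" and "Rosi" share the key R200 and the same postcode, so they land in one group of probable duplicates for review, while "Bianchi" stays on its own. A production system would likely use a phonetic scheme tuned to the language of the data and further rules before merging records.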