Data and Postal Address Deduplication Software
Data deduplication and postal address deduplication (duplicate check) identify certain or presumed duplicate addresses in a database, caused by inaccuracies or inconsistencies, so that the duplicates can be reduced to a single record.
Egon, our data and address deduplication software, is designed specifically to validate databases and deduplicate data and postal addresses. You select criteria such as name, surname, enterprise name, VAT code and address, and Egon identifies all potentially redundant data.
Each validated element is assigned a unique match code, which is analysed and evaluated by means of precise rules. This enables extremely reliable identification of duplicate records, which are highlighted in the output report.
In the same process, the module can manage and compare several archives, even when their record layouts are not uniform.
If your database needs to be analysed to eliminate duplicated data, or you need a tool that integrates with your software and supports data entry by flagging superfluous input (because the record is already archived), then try EGON.
Why data deduplication?
Data quality requires every aspect to be handled correctly, and duplicated data are a problem both for human-software interaction and because of the unexpected expenses they cause. In the specific field of address deduplication, for example, double mailings of advertising material or official notifications to the same company not only waste money but also harm the company’s image.
When sending correspondence, a correct, complete, validated and duplicate-free database is a sign of professionalism, which is why integrating automatic validation software brings considerable benefits. If several databases need to be merged, deduplication can be applied to both natural persons and legal entities. It uses the “match-code” method, which makes it possible to eliminate duplicates, identify families and enhance internal data:
- Data correlation
- Integration of the registry data, records, addresses and archives
- Unification of data and documents about a single subject
- Management of a single position for the subject and assignment of a PIN
- Management of duplicated data
Match code and duplicate check
The deduplication output is the match-code: a string of characters or numbers in which some elements of the registry data (surname, enterprise name, name, sex, location, postcode, DUG (street type), street, number) are recorded or coded. Records that share a match-code form a group of duplicates, which makes the match-code an extremely reliable way of identifying duplicated record groups.
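To make the idea of a match-code more concrete, here is a minimal Python sketch. It is not Egon's actual algorithm: the field names, their order, the prefix lengths in MATCH_CODE_LAYOUT and the helper functions normalize, match_code and duplicate_groups are all assumptions chosen for illustration. Each selected field is normalized, cut to a fixed width and concatenated, and records that produce the same string are grouped together.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(value: str) -> str:
    """Uppercase, strip accents and drop everything except letters and digits."""
    decomposed = unicodedata.normalize("NFKD", value or "")
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^A-Z0-9]", "", ascii_only.upper())


# Hypothetical field order and prefix lengths, chosen for this example only.
MATCH_CODE_LAYOUT = [
    ("surname", 6),
    ("name", 3),
    ("postcode", 5),
    ("street", 6),
    ("number", 4),
]


def match_code(record: dict) -> str:
    """Concatenate fixed-width, normalized prefixes of the selected fields."""
    parts = []
    for field, width in MATCH_CODE_LAYOUT:
        token = normalize(record.get(field, ""))[:width]
        parts.append(token.ljust(width, "_"))  # pad short values with "_"
    return "".join(parts)


def duplicate_groups(records: list[dict]) -> list[list[dict]]:
    """Return only the groups of records that share the same match-code."""
    groups = defaultdict(list)
    for record in records:
        groups[match_code(record)].append(record)
    return [group for group in groups.values() if len(group) > 1]
```

With this layout, two records that differ only in letter case, accents or stray spaces receive the same code and end up in the same group. A real product would add rules on top of this, for example to tolerate misspellings or swapped fields.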
Operationally, Duplicate Check involves:
- Database deduplication
- Keys and rules parameter management
- Managing probable duplicates
- Group management
- Creating match-codes/phonetic keys
- Database matching
- Relationships and ties management
- Converging databases
- Personal data deduplication
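Two of the activities listed above, creating phonetic keys and managing probable duplicates, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of how Egon works: the phonetic key used here is classic Soundex, which was designed for English surnames, and the function classify_pair and its 0.85 similarity threshold are hypothetical choices.

```python
from difflib import SequenceMatcher

# Soundex consonant classes; vowels, H, W and Y have no code.
SOUNDEX_CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for letter in letters:
        SOUNDEX_CODES[letter] = digit


def soundex(word: str) -> str:
    """Return the four-character Soundex key of a word ('' if it has no letters)."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    key = letters[0]
    previous = SOUNDEX_CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "HW":
            continue  # H and W do not separate letters with the same code
        code = SOUNDEX_CODES.get(c, "")
        if code and code != previous:
            key += code
        previous = code
    return (key + "000")[:4]


def classify_pair(a: str, b: str, threshold: float = 0.85) -> str:
    """Label two names as 'certain', 'probable' or 'distinct' duplicates."""
    left, right = a.strip().upper(), b.strip().upper()
    if left == right:
        return "certain"
    same_sound = soundex(left) == soundex(right)
    similarity = SequenceMatcher(None, left, right).ratio()
    if same_sound and similarity >= threshold:
        return "probable"  # candidates for manual review
    return "distinct"
```

Names that are identical after trimming and uppercasing are treated as certain duplicates, while names that sound alike and are spelled almost the same (such as "Rossi" and "Rosi") are only flagged as probable, so that an operator or a further rule can decide.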