Data Cleaning

As well as identifying data quality issues, Spotless can clean dirty data, replacing invalid fields with corrected data.

Supported rules are:

  • Fixing dates and times to a specific format
  • Correcting numbers to a given number of decimal places
  • Setting unique foreign key references that have been truncated or corrupted
  • Adding missing data using our machine learning lookalike model
  • Updating common spelling mistakes with a lookup rule
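
As a rough illustration of what three of these rules involve (date formatting, decimal places and a spelling lookup), here is a minimal Python sketch. It is not the Spotless API: the function names, the accepted input date formats, the sample lookup table and the row field names are all assumptions made for the example.

```python
from datetime import datetime
from decimal import Decimal, InvalidOperation, ROUND_HALF_UP

# Assumed input formats and lookup entries; a real rule set would be configured per data set.
DATE_INPUT_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")
SPELLING_LOOKUP = {"Untied Kingdom": "United Kingdom", "Frnace": "France"}


def fix_date(value, target_format="%Y-%m-%d"):
    """Return the date rewritten in target_format, or None if no input format matches."""
    for fmt in DATE_INPUT_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).strftime(target_format)
        except ValueError:
            continue
    return None


def fix_number(value, places=2):
    """Return the number as a string rounded to `places` decimal places, or None if invalid."""
    quantum = Decimal(1).scaleb(-places)  # e.g. Decimal("0.01") when places=2
    try:
        return str(Decimal(value.strip()).quantize(quantum, rounding=ROUND_HALF_UP))
    except InvalidOperation:
        return None


def fix_spelling(value):
    """Replace a known misspelling via the lookup table; leave other values unchanged."""
    cleaned = value.strip()
    return SPELLING_LOOKUP.get(cleaned, cleaned)


def clean_row(row):
    """Apply the three rules to a row with hypothetical 'date', 'amount', 'country' fields."""
    return {
        "date": fix_date(row["date"]),
        "amount": fix_number(row["amount"]),
        "country": fix_spelling(row["country"]),
    }


if __name__ == "__main__":
    print(clean_row({"date": "03/03/2017", "amount": "3.14159", "country": "Frnace"}))
```

Values that cannot be repaired come back as None in this sketch, so that a later step can flag them rather than silently keeping invalid data. Foreign key repair and machine-learning-based filling of missing data are not covered here.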

    Blog posts about Data Cleaning

    March 3, 2017, 10:41 a.m.
    The amount of data in need of cleaning to a high quality is set to rocket due to the explosion in IoT gadgets. The internet of things (IoT) has been described as the infrastructure of the information society. All these "things" connect to the Internet via wifi and then "talk" to each other. Gartner estimates that by 2020 over 26 billion of these devices will be connected ...
    April 7, 2017, 7:46 a.m.
    Data Cleaning is a prerequisite for an effective Single Customer View. Try our Machine Learning Filters to say good-bye to your Rogue Data now! A single customer view (SCV) is a data management concept where a single page containing all the information about a particular customer is gathered together so that it can be easily retrieved and reviewed by anyone within your business or organisat...
    March 20, 2017, 12:45 p.m.
    Anyone who wants to be in charge of their data should consider using Spotless Data's Machine Learning Filters data cleaning solution. Who would benefit from using the Spotless data quality API? The simple answer is that any business or organisation that has to deal with any data would benefit, whether it is a new start-up or a multi-billion dollar business. However, we have identified thr...
    Jan. 13, 2017, 7:58 a.m.
    Whether sending through the post or via email, your address database needs to be spotlessly clean. Addresses, which include both email and postal addresses, have never been more important than today. Many companies, charities and other organisations hold large quantities of data in the form of postal and email addresses. They find that duplications and other errors and inaccuracies reduce th...
    Dec. 30, 2016, 9:55 a.m.
    High-quality data is the first step towards actually protecting the private information of your customers. Importance of compliance regarding PIIs The requirements demanded by authorities for all businesses who have users with personally identifiable information (PII) throughout the world to comply with their ever more demanding regulations have been one of the news stories of 2016. As cas...
    Dec. 19, 2016, 8:54 a.m.
    The concept of refining data perfectly illustrates how important data quality is in the 21st century. Data as the new oil If data are the new oil, then the data scrubbers, ensuring that your data have data integrity, are the new oil refineries. Crude oil is the basic ingredient of a whole range of products and yet of itself has little value until it has been refined into the products we a...
    May 30, 2017, 6:48 a.m.
    Data integration is one of four examples where using Spotless Data's machine learning filters can ensure you have a seamless process. Spotless Data is a unique web-based Data Quality API solution to ensure the data cleaning of your data so that they have the data quality you can trust to ensure your company stands out among its rivals. As a part of our ongoing blogs designed to explain h...
    May 12, 2017, 6:51 a.m.
    Dirty or rogue data will always have a profound effect on any organisation. Spotless Data have recognised five fundamentally different types of dirty-data cleansing that can be done using our unique web-based data quality API solution. 1. Regex Regex are regular expressions, which define search patterns and identify particular strings found within data sets. For instance, if you k...
    Nov. 15, 2016, 9:50 a.m.
    User-generated content is one area where data quality has never been so important, which is why using Spotless Data to ensure data integrity makes so much sense. Corrupted data At Spotless Data we estimate that 5% of overall data held by companies is corrupted and lacking in data integrity, though a recent report estimated that manually entered data could contain an error rate of anywhere ...
    Oct. 14, 2016, 10:23 a.m.
    At the heart of any successful data management is ensuring the quality of the data. Recent research from SAS suggests that a large number of companies are falling behind in data management, and many of those that are "laggards" believe their data management is just as effective as those who are leading. SAS describe the difference as people who have "a clear approach to dat...