Data Cleaning

As well as identifying data quality issues, Spotless can clean dirty data by replacing invalid fields with corrected data.

Supported rules are:

  • Fixing dates and times to a specific format
  • Correcting numbers to a given number of decimal places
  • Setting unique foreign key references that have been truncated or corrupted
  • Adding missing data using our machine learning lookalike model
  • Updating common spelling mistakes with a lookup rule
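
To give a feel for what rules like these do, here is a minimal Python sketch of three of them: date formatting, decimal places and a spelling lookup. It uses only the Python standard library and is not the Spotless API. The function names (fix_date, fix_decimal, fix_spelling), the accepted input date formats and the example lookup table are all assumptions made for illustration.

```python
from datetime import datetime
from decimal import Decimal, InvalidOperation, ROUND_HALF_UP

# Hypothetical helpers for illustration only; not part of the Spotless API.

def fix_date(value, input_formats=("%d/%m/%Y", "%Y-%m-%d"), output_format="%Y-%m-%d"):
    """Reformat a date string to output_format, or return None if it cannot be parsed."""
    for fmt in input_formats:
        try:
            return datetime.strptime(value.strip(), fmt).strftime(output_format)
        except ValueError:
            continue  # try the next assumed input format
    return None

def fix_decimal(value, places=2):
    """Round a numeric string to a fixed number of decimal places, or return None if invalid."""
    quantum = Decimal(10) ** -places  # e.g. Decimal('0.01') when places=2
    try:
        return str(Decimal(value.strip()).quantize(quantum, rounding=ROUND_HALF_UP))
    except InvalidOperation:
        return None

def fix_spelling(value, lookup):
    """Replace a known misspelling using a lookup table; unknown values pass through unchanged."""
    return lookup.get(value.strip().lower(), value)

# Example lookup table (assumed data).
SPELLING_LOOKUP = {"untied kingdom": "United Kingdom"}

print(fix_date("09/03/2018"))
print(fix_decimal("3.14159"))
print(fix_spelling("Untied Kingdom", SPELLING_LOOKUP))
```

In this sketch a value that cannot be parsed comes back as None rather than being guessed, so that a field which cannot be corrected stays flagged as a data quality issue.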

    Blog posts about Data Cleaning

    March 9, 2018, 7:23 a.m.
    The key to success with data is to get as many automated processes in place as possible, and the best place to start is by getting Spotless machine learning filters to validate, clean and integrate the data before they enter your platform. IBM has recently predicted that there will be 700,000 job vacancies for data scientists in 2020. Given that such a quantity of data experts simply doesn...
    Feb. 23, 2018, 10:44 a.m.
    When you clean all your big data with Spotless Data's machine learning filters API solution, you have only valid and integrated data in your data platforms. We have developed the Spotless Data API solution for all your dirty or rogue data issues, realising that the analogy of data as the new oil and data cleaners as the new refineries is only partly correct. While oil refineries are c...
    Feb. 16, 2018, 12:05 p.m.
    Achieving data quality you can trust, where everything works seamlessly, requires data refining just as surely as crude oil does. Use the Spotless API solution so that rogue data does not destroy your platforms. Data is increasingly being seen as the new oil. It is unfortunate that the concept of refining these data, which are increasingly big data, does not have in the publ...
    Dec. 27, 2017, 11:11 a.m.
    Cleaning and ensuring the integration of all your data is now more efficient and profound than ever with the new release Spotless version 17. We are delighted to announce the new release Version 17 of Spotless' data validation solution for all your rogue data issues. While the main focus of this release has been various bug fixes, we have also added one new feature, which is that Spotl...
    Jan. 26, 2018, 9:31 a.m.
    Data cleaning is the first step towards ensuring data quality in six key sectors of the economy that are having to deal with the new challenges and opportunities which modern data represent. At Spotless we have identified six sectors that are dealing with the new challenges of data. They are manufacturing, retail, transport, healthcare, finance and media. Data has always been around but the q...
    Dec. 4, 2017, 9:33 a.m.
    A sign taken from a ski slope in the Alps illustrates the new and improved Spotless version 16 release. We are delighted to announce the release of the new improved version 16 of Spotless Data's solution to all your rogue data issues through a data cleaning API easily accessible through your web browser. There are four new areas where we have improved our machine learning filters whic...
    Nov. 14, 2017, 11:40 a.m.
    Miniature cleaners cleaning a laptop symbolises the importance of data cleaning. The need for data cleaning has never been so important in a world dominated by big data, which are of no use at all unless they have been suitably cleaned, and by artificial intelligence, which is about as smart as a wet blanket if it is working with data which have not undergone a thorough data cleaning process...
    Dec. 18, 2017, 7:15 a.m.
    When doctors and healthcare IT specialists have clean, quality data to work with, the data can help save lives and cure illnesses. Modern technologies that involve large quantities of data are increasingly used in medicine and healthcare to help us all live longer and healthier lives, while helping cure intractable and horrible illnesses like cancer and Parkinson's. However, getting ...
    Dec. 11, 2017, 7:27 a.m.
    By using big data in manufacturing processes, such as a tablet monitoring what is happening in a factory, manufacturers ensure greater efficiency and success. Manufacturing has always needed large quantities of data to produce the complicated goods typical of a modern industrial society while using data for the time management of workers and general efficiency within a factory stretches back...
    Dec. 1, 2017, 7:17 a.m.
    Data quality is now achievable for the financial sector at minimum effort with Spotless Data's data cleaning solution. The financial sector has always needed to ingest and analyse large quantities of both structured and unstructured data to make the often complex decisions required within highly competitive markets where millions can be made or lost depending on the decisions taken. The c...