Data Cleaning

As well as identifying data quality issues, Spotless can clean dirty data by replacing invalid fields with corrected data.

Supported rules are:

  • Fixing dates and times to a specific format
  • Correcting numbers to a given number of decimal places
  • Setting unique foreign key references that have been truncated or corrupted
  • Adding missing data using our machine learning lookalike model
  • Updating common spelling mistakes with a lookup rule
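
To make the rules above concrete, here is a minimal Python sketch of four of them (date formatting, decimal places, foreign key repair and spelling lookup). It is only an illustration under stated assumptions, not Spotless's actual API or implementation: the function names, the list of accepted input date formats and the prefix-matching strategy for truncated keys are all hypothetical. The machine learning lookalike rule is not covered here.

```python
from datetime import datetime
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical list of input formats to try; not taken from Spotless.
DATE_INPUT_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d.%m.%Y")


def fix_date(value, out_format="%Y-%m-%d"):
    """Re-render a date string in out_format, or return None if unparseable."""
    for fmt in DATE_INPUT_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).strftime(out_format)
        except ValueError:
            continue  # try the next candidate format
    return None


def fix_decimal(value, places=2):
    """Round a numeric string to a fixed number of decimal places."""
    quantum = Decimal(1).scaleb(-places)  # places=2 gives Decimal("0.01")
    return str(Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP))


def fix_foreign_key(value, valid_keys):
    """Repair a truncated key only when exactly one valid key starts with it.

    Prefix matching is an assumed strategy chosen for this sketch.
    """
    if not value:
        return None
    if value in valid_keys:
        return value
    matches = [key for key in valid_keys if key.startswith(value)]
    return matches[0] if len(matches) == 1 else None


def fix_spelling(value, lookup):
    """Replace a known misspelling using a lookup table; otherwise keep the value."""
    return lookup.get(value.strip().lower(), value)


if __name__ == "__main__":
    spelling = {"untied kingdom": "United Kingdom"}
    customers = {"CUST-0017", "ORD-0042"}
    print(fix_date("29/09/2017"))
    print(fix_decimal("3.14159"))
    print(fix_foreign_key("CUST-00", customers))
    print(fix_spelling("Untied Kingdom", spelling))
```

Returning None for a value that cannot be repaired unambiguously (an unparseable date, or a truncated key that matches several candidates) keeps the sketch from guessing; a real pipeline would decide separately how to report such fields.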

    Blog posts about Data Cleaning

    Sept. 29, 2017, 5:16 a.m.
    Spotless Data's machine learning filters creating Artificial Intelligence for Data Quality. What inspired us to set up Spotless Data as a new start-up, in November 2015, using machine learning filters for data cleaning to reach the holy grail of data quality you can trust, and which can do not merely what they were designed to do but new things as well, has been seeing time and again wit...
    Sept. 15, 2017, 4:27 a.m.
    Spotless is a web-based API that filters data coming into your systems so rogue data can never get into your data platforms. What is rogue data? Also known as dirty data, rogue data are essentially any corrupted, mismatched or inaccurate data which, if they get into your data platforms, will affect either your final product and thus are used and/or viewed by your customers, or will affect...
    Sept. 15, 2017, 6:45 a.m.
    Our machine learning filters eliminating rogue data thus ensuring you have spotless data quality. A simple definition of filters is that they filter out information or data which are not wanted. A more complex definition is that filters take one list of information/data, made of one or several columns, and convert it into another modified list. It does this by examining the content of the li...
    Aug. 18, 2017, 7:59 a.m.
    Recruiting a data science team to manage big data is no easy task but easier when you already use Spotless Data's machine learning filters. Managing your big data is no longer a task that only tech companies specialising in artificial intelligence require. Until recently big data managers only really existed in big IT companies with billions of dollars of resources while simple data mana...
    Aug. 4, 2017, 9:04 a.m.
    The best way to trust your data again is to ensure they are quality data by using Spotless machine learning Filters. At the heart of the Spotless Data Quality API solution lie our newly developed machine learning filters for data cleaning, filtering out the dirty data containing mismatches, duplications, and other corruptions. Many of these dirty data issues are predictable and are caused by a m...
    July 21, 2017, 7:31 a.m.
    Spotless Data version 11 introduces our new machine learning filters for guaranteeing data quality. Spotless are delighted to announce the release of version 11 of our data quality solution to data cleaning your dirty and mismatching data problems at the speed of business, which focusses on introducing the concept of machine learning filters, applying the latest in artificial intelligence te...
    Aug. 11, 2017, 9:03 a.m.
    The best way to ensure transparency in artificial intelligence and machine learning is through having quality data. The recent spat between Elon Musk and Mark Zuckerberg brought the alleged dangers of artificial intelligence and machine learning to the world's attention. Musk evoked old fears of Artificial Intelligence, explored previously by the science-fiction writer Isaac Asimov and in t...
    March 7, 2017, 6:42 a.m.
    Spotless Data's version 8 release focusses on pricing and subscription packages, including 500Mb of free data cleaning. Having been in beta since the beginning of 2016, Spotless Data is finally live! Get ready for your data to have data integrity so that you can trust in them! We have just come out of beta, where we have been working since the end of 2015 on our unique web-based API, ...
    March 14, 2017, 7:43 a.m.
    High-quality data using our machine learning filters is the fundamental pre-requisite for effective rather than misleading data analysis. The concept of data analysis. The concept sounds easy enough. Get all your data in one location, known as a data lake, buy some expensive data analysis software tool, run it through your data and hey presto you now have the data mining and business intel...
    Feb. 10, 2017, 7:38 a.m.
    High-quality data so spotlessly clean is the goal of all data-driven businesses which is why we have developed our Machine Learning filters to remove rogue data. Part 2 of this blog. Spotless Data has identified 14 different causes of poor data quality, which is data which your company and your customers are unable to trust. Given that poor quality data can cost businesses 20-35% of their...