Data Quality

Spotless measures the quality of datasets against a specification expressed as a set of business rules. The business rules are generated automatically using machine learning algorithms and can be customised to meet your business requirements.
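
As a rough illustration of what validating a dataset against business rules can look like, the following Python sketch checks each record against per-column rules and reports a simple quality score. It does not use the Spotless API: the names RULES, validate and quality_score, the columns and the rules themselves are all hypothetical and hand-written here, whereas Spotless generates its rules using machine learning.

```python
import re

# Hypothetical business rules: one predicate per column.
# These are hand-written for illustration only.
RULES = {
    "id": lambda v: v.isdigit(),
    "email": lambda v: re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", v) is not None,
    "country": lambda v: v in {"GB", "FR", "US"},
}


def validate(rows, rules):
    """Split rows into valid rows and a list of (row_index, column) failures."""
    valid, failures = [], []
    for index, row in enumerate(rows):
        # A missing column is treated as an empty string, which fails every rule here.
        bad_columns = [col for col, rule in rules.items() if not rule(row.get(col, ""))]
        if bad_columns:
            failures.extend((index, col) for col in bad_columns)
        else:
            valid.append(row)
    return valid, failures


def quality_score(rows, rules):
    """Fraction of rows that pass every rule (1.0 for an empty dataset)."""
    if not rows:
        return 1.0
    valid, _ = validate(rows, rules)
    return len(valid) / len(rows)


if __name__ == "__main__":
    sample = [
        {"id": "1", "email": "a@b.com", "country": "GB"},
        {"id": "x", "email": "bad", "country": "GB"},
    ]
    print(validate(sample, RULES))
    print(quality_score(sample, RULES))
```

Keeping each rule as a small predicate makes it easy to customise the rule set for one feed without touching the validation loop.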

Typical applications include:

  • Integrating datasets from different providers - Spotless will validate that each feed is well formed and consistent with other feeds
  • Loading data into a data lake - although full transformation is not required on data ingested into a data lake, simple validation using Spotless to ensure the data is well formed and well referenced saves significant time downstream
  • Cleansing data scraped from the internet - web scraped data is notoriously irregular, and Spotless will ensure that only information conforming to the expected specifications is loaded
  • Integrating internal data from different platforms - you can define specific business rules to provide consistent and valid data whenever it is loaded
  • Truncating or removing overlapping sessions; extending or filling sessions with gaps between them
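
The session repairs in the last item above can be sketched in Python as follows. This is only an assumed interpretation, not Spotless's implementation: sessions are taken to be (start, end) pairs, an overlap is resolved by cutting a session short where the next one begins, and a gap is filled by extending a session until the next one starts. The function names are hypothetical.

```python
def truncate_overlaps(sessions):
    """Cut each session's end back to the next session's start.

    Sessions are (start, end) pairs with comparable values (numbers or
    datetimes). A session left with no duration is removed.
    """
    ordered = sorted(sessions)
    result = []
    for i, (start, end) in enumerate(ordered):
        if i + 1 < len(ordered):
            end = min(end, ordered[i + 1][0])
        if end > start:
            result.append((start, end))
    return result


def fill_gaps(sessions):
    """Extend each session's end forward to the next session's start.

    Existing overlaps are left untouched, so run truncate_overlaps first
    if the input may contain them.
    """
    ordered = sorted(sessions)
    result = []
    for i, (start, end) in enumerate(ordered):
        if i + 1 < len(ordered):
            end = max(end, ordered[i + 1][0])
        result.append((start, end))
    return result


if __name__ == "__main__":
    # Times as plain minutes for readability.
    raw = [(0, 10), (5, 15), (20, 30)]
    cleaned = truncate_overlaps(raw)
    print(cleaned)
    print(fill_gaps(cleaned))
```

Whether to truncate the earlier session or the later one, and whether a gap should be filled at all, are business decisions; the sketch fixes one choice for each only to keep the example short.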

    Blog posts about Data Quality

    Feb. 10, 2017, 7:38 a.m.
    High-quality, spotlessly clean data is the goal of all data-driven businesses, which is why we have developed our Machine Learning filters to remove rogue data. Part 2 of this blog. Spotless Data has identified 14 different causes of poor data quality, that is, data which your company and your customers are unable to trust. Given that poor quality data can cost businesses 20-35% of their...
    March 3, 2017, 10:41 a.m.
    The amount of data in need of cleaning to a high quality is set to rocket due to the explosion in IoT gadgets. The internet of things (IoT) has been described as the infrastructure of the information society. All these "things" connect to the Internet via wifi and then "talk" to each other. Gartner estimates that by 2020 over 26 billion of these devices will be connected ...
    Feb. 3, 2017, 7:36 a.m.
    A handshake as a symbol of trust is never more important than with data quality. Can you trust the quality of your data? Nobody doubts that in 2017 most companies need to exploit their data, including their dark data, to use them to their maximum potential. The question Spotless Data wants to ask you is, "can you trust your data?" The fundamental definition of data quality is th...
    Jan. 20, 2017, 7:57 a.m.
    High-quality data is at the heart of any modern marketing campaign. Data-driven marketing. Marketing your services and products is at the heart of any successful business, more often than not making the difference between an outstanding and an ineffective company. It is widely recognised that it is not enough to simply hire some marketers and then say, "well, just go for it". A gre...
    March 20, 2017, 12:45 p.m.
    Anyone who wants to be in charge of their data should consider using Spotless Data's Machine Learning Filters data cleaning solution. Who would benefit from using the Spotless data quality API? The simple answer is that any business or organisation that has to deal with any data would benefit, whether it is a new start-up or a multi-billion dollar business. However, we have identified thr...
    Jan. 3, 2017, 8:56 a.m.
    Spotless Data version 6 release focusses on bug fixes and allowing unlimited records. We've just launched version 6 of our unique data quality web-based API solution for data you can trust, scrubbing up your dirty data and leaving them spotlessly clean at the speed of business, which includes: Bug fix in reference rules Used to validate foreign keys or check natural keys bet...
    Jan. 6, 2017, 7:58 a.m.
    When dark data are of a high quality they can greatly help your business succeed, which is why cleaning them with the Spotless API solution makes so much sense. What are dark data? Dark data are those data that a company stores but which do not appear directly useful. Using the analogy of an iceberg, those data that a company uses are the visible tip of the iceberg, and the dark data are...
    Dec. 22, 2016, 7:54 a.m.
    A hype cycle is a perfect visual illustration of Amara's Law. Amara's Law is a computing adage which states: We tend to overestimate the effect of a technology in the short run and underestimate the effect in the long run. While this law, coined by the then president of the US-based Institute for the Future, Roy Amara, has been associated with both cyber attacks and nanotechnology, ...
    Dec. 30, 2016, 9:55 a.m.
    High-quality data is the first step towards actually protecting the private information of your customers. Importance of compliance regarding PII. The demand by authorities throughout the world that all businesses whose users have personally identifiable information (PII) comply with ever more demanding regulations has been one of the news stories of 2016. As cas...
    Dec. 19, 2016, 8:54 a.m.
    The concept of refining data perfectly illustrates how important data quality is in the 21st century. Data as the new oil. If data are the new oil, then the data scrubbers, ensuring that your data have data integrity, are the new oil refineries. Crude oil is the basic ingredient of a whole range of products and yet of itself has little value until it has been refined into the products we a...