The following is an article written by Trellance’s Associate Vice President of Data Consulting, Merrill Albert. The article originally appeared on CUInsight.com.
Data quality is a big problem. But often, small things are the culprits.
And that, in itself, is another problem. Lots of small errors in your data accumulate into an overwhelming pile of issues. The pile may feel impossible to tackle, and it can be hard to know where to start.
But there’s another side to that coin: small steps can yield big results.
To start small when tackling data quality, first look for the most common problems. Your early successes will position you to more easily take on larger efforts. Here’s where to start.
Maintaining accurate addresses sounds simple enough, yet without concerted attention, addresses become a source of headaches for the credit union and its members. Erroneous addresses come in many forms, and each one closes off an important communication channel with members and prospects.
Do you have a street but no city? Does the address on file not exist? Has the member moved to a new development not yet recognized? Do you have addresses from different countries and need to understand how different countries construct their addresses? “Addressing” your address issues will help ensure important communications reach their intended destination.
Though part of an address, zip codes go wrong so often they deserve their own category. Problems can arise when frontline staff aren’t trained on the importance of complete and accurate data. When pressed for time, they may not ask for a zip code, or they may just enter something in the required field, such as 11111, 12345, or 99999.
Another common problem is zip codes that begin with “0.” If the data rules aren’t constructed properly (for example, if the zip code is stored as a number rather than as text), the leading 0 may be removed, leaving a zip code that is no longer valid. Zip code problems happen frequently enough that when many credit unions look under the hood, they find a large member population with addresses invalidated by bad zip codes.
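The zip code checks described above can be sketched in a few lines of Python. This is a minimal illustration, not a complete validator: the function names, the placeholder list (taken from the examples in the text), and the assumption that a short all-digit value has lost its leading zeros are all ours. It only checks the shape of a US zip code, not whether the code actually exists.

```python
import re

# Placeholder values named in the article; a real list would likely be longer (assumption).
PLACEHOLDER_ZIPS = {"11111", "12345", "99999"}

# Five digits, optionally followed by a ZIP+4 extension.
ZIP_PATTERN = re.compile(r"\d{5}(-\d{4})?")


def normalize_zip(raw):
    """Return the zip as text, restoring leading zeros lost to numeric storage."""
    text = str(raw).strip()
    # Assumption: an all-digit value shorter than 5 characters lost its leading zeros.
    if text.isdigit() and len(text) < 5:
        text = text.zfill(5)
    return text


def zip_problem(raw):
    """Return a short label for the problem found, or None if the zip looks plausible."""
    if raw is None or str(raw).strip() == "":
        return "missing"
    text = normalize_zip(raw)
    if not ZIP_PATTERN.fullmatch(text):
        return "malformed"
    if text[:5] in PLACEHOLDER_ZIPS:
        return "placeholder"
    return None
```

A value flagged as a placeholder is best routed for review rather than rejected outright, since a filler-looking value can occasionally be a member's real zip code.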
We recommend all credit unions develop a consistent format for dates and communicate it widely. At Trellance, our preference is YYYYMMDD because it not only makes for easy math but also helps avoid confusion between other global formats such as DD/MM/YYYY and MM/DD/YYYY.
Without consistency, a best-case scenario will require reformatting the dates every time you need to use the data. Worse, the data could be used incorrectly if different formats aren’t recognized.
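To illustrate why an unrecognized format is dangerous, here is a small Python sketch that converts dates to YYYYMMDD. The helper names are hypothetical, and the sketch assumes you know which format each source system uses, since the same text can mean two different days.

```python
from datetime import datetime


def to_yyyymmdd(value, source_format):
    """Convert a date string in a known source format to YYYYMMDD text."""
    return datetime.strptime(value, source_format).strftime("%Y%m%d")


def parse_yyyymmdd(text):
    """Turn a YYYYMMDD string into a date object for date arithmetic."""
    return datetime.strptime(text, "%Y%m%d").date()


# The same text read under two different conventions gives two different days.
as_day_first = to_yyyymmdd("03/04/2021", "%d/%m/%Y")    # 3 April 2021
as_month_first = to_yyyymmdd("03/04/2021", "%m/%d/%Y")  # 4 March 2021

# YYYYMMDD strings sort chronologically with a plain text sort.
in_order = sorted([as_day_first, as_month_first])
```

Once every date is stored the same way, comparisons and sorting need no per-source reformatting, and `parse_yyyymmdd` gives you real dates when you need to count days between events.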
Gender and Salutation
Determine if your credit union will collect gender, salutation, or ideally, both. Avoid guessing at the salutation based on the gender or the name – your guess won’t always be accurate. Gender and salutation are unique data elements and one can’t always be derived from the other. If you historically haven’t collected salutation data and don’t have a way to update your database with accurate information, forgo rather than guess. This can be particularly difficult for those who consider salutations polite – but you’ll quickly erode that goodwill with the wrong one!
Most credit union data contains “indicators” or yes/no fields. Here, missing values can prevent you from effectively using data. Take a citizenship indicator for example. If your credit union has business processes that vary based on citizenship, a blank field can’t tell you if the value wasn’t collected, was refused, or doesn’t apply. Missing indicator values become an impediment to efficient processes.
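One way to keep a blank from hiding the reason a value is missing is to give each reason its own code. The Python sketch below is an assumption-laden illustration: the code values ("NC", "RF", "NA") and the names `Indicator`, `parse_indicator`, and `has_answer` are invented for this example, not an established standard.

```python
from enum import Enum


class Indicator(Enum):
    """Possible states of a yes/no field (hypothetical codes)."""
    YES = "Y"
    NO = "N"
    NOT_COLLECTED = "NC"
    REFUSED = "RF"
    NOT_APPLICABLE = "NA"
    UNKNOWN = ""  # legacy blank: the reason it is missing was never recorded


def parse_indicator(raw):
    """Map a stored value to an Indicator; unrecognized codes raise ValueError."""
    text = "" if raw is None else str(raw).strip().upper()
    return Indicator(text)


def has_answer(value):
    """True only when the member actually answered yes or no."""
    return value in (Indicator.YES, Indicator.NO)
```

With explicit codes, a process that depends on the indicator can treat "refused" differently from "not collected" and can report how many legacy blanks still need follow-up.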
Your database may use sets of code values, such as product type codes. These values will likely change over time, so it can be helpful to qualify them with dates that communicate which values are valid when.
Adding date fields can help better prepare for future code value changes. Using the product type example, new product launches can be entered with the future dates they will become available so the lists can coordinate across your website, program code, and help text. The alternative is to have a team member add information to your database at the moment of change, which is far more prone to errors and hiccups.
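Date-qualified code values can be sketched as follows. The product codes, descriptions, and dates are made up for illustration, and the field names `effective_from` and `effective_to` are assumptions rather than a prescribed schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class CodeValue:
    code: str
    description: str
    effective_from: date
    effective_to: Optional[date] = None  # None means no end date has been set

    def is_valid_on(self, day):
        """True if this code is in effect on the given day (both ends inclusive)."""
        if day < self.effective_from:
            return False
        return self.effective_to is None or day <= self.effective_to


def valid_codes(values, day):
    """Return the codes in effect on a given day, in their listed order."""
    return [v.code for v in values if v.is_valid_on(day)]


# Hypothetical product type list, including a retired code and a future launch.
PRODUCT_TYPES = [
    CodeValue("CHK", "Checking", date(2020, 1, 1)),
    CodeValue("CD12", "12-month certificate", date(2020, 1, 1), date(2024, 12, 31)),
    CodeValue("HYS", "High-yield savings", date(2025, 7, 1)),
]
```

Because the future launch is already in the list with its start date, every consumer that calls `valid_codes` with today's date picks it up on the same day without a manual update.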
When it comes to ID fields, problems can arise from the best of intentions. Embedding logic into an ID field may seem like a time-saving technique that eliminates having to look up other data. But in reality, if the ID field is made of other fields, the supporting processes often aren’t there to keep the data accurate. For example, if your ID contains the zip code of the member residence, it will have to be changed each time a member moves. Because the ID is used throughout databases, changing it isn’t an easy or recommended approach. Instead, maintain IDs as a unique number without embedded logic.
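The difference between an ID with embedded logic and a plain unique number can be shown with a short Python sketch. The `Member` class, the in-memory counter, and the zip-based ID format are all hypothetical; a real system would typically let the database generate the number.

```python
import itertools
from dataclasses import dataclass

# Simple in-memory sequence standing in for a database-generated key (assumption).
_id_sequence = itertools.count(1)


@dataclass
class Member:
    member_id: int  # carries no meaning beyond identifying the member
    zip_code: str


def new_member(zip_code):
    """Create a member with the next unique ID."""
    return Member(member_id=next(_id_sequence), zip_code=zip_code)


def embedded_logic_id(zip_code, sequence_number):
    """The pattern to avoid: an ID built from the member's zip code."""
    return f"{zip_code}-{sequence_number}"
```

When a member moves, updating `zip_code` leaves `member_id` untouched, so nothing that references the ID has to change. An ID produced by `embedded_logic_id` would either go stale or need to be rewritten everywhere it is used.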
Because data quality has such a meaningful impact on decision quality, the most important “small” step a credit union can take is making the commitment to identifying and addressing the issues on an ongoing basis. Then, every small step is a step in the right direction.
Merrill Albert is the associate vice president of data consulting at Trellance.