Demystifying the DNI numbers

September 2005
Updated: July and August 2011
Translated: October 2013

© 2013 Josep Portella Florit

Introduction

At some point all Spaniards have noticed the mysterious characters on the back of their DNI (the Spanish ID card):

``````IDESPABC123456012345678Z<<<<<<
7410150M0903226ESP<<<<<<<<<<<9
DE<TAL<Y<CUAL<<FULANITO<<<<<<<
``````

There is a myth that states that the last digit of the second line represents the number of persons with the same name and last names as the holder. The purpose of this article is to demonstrate that this is not true.

This zone of the DNI is composed of OCR characters, i.e., it is prepared to be read by machines. This digit in particular is just a check digit used to verify the correct reading of the data.

Below, each of the OCR data zone fields will be identified and the check digit calculation algorithm will be explained.

Identification of the fields

The OCR data zone of the electronic DNI can be divided into several fields:

``````1.[ID] 2.[ESP] 3.[ABC123456] 4.[0] 5.[12345678Z] 6.[<<<<<<]
7.[741015] 8.[0] 9.[M] 10.[090322] 11.[6] 12.[ESP] 13.[<<<<<<<<<<<] 14.[9]
15.[DE<TAL<Y<CUAL<<FULANITO<<<<<<<]
``````
1. Document type
2. Nation
3. Hardware serial number of the card
4. Field 3 check digit
5. DNI number
7. Date of birth (`YYMMDD`)
8. Field 7 check digit
9. Sex (`M`/`F`)
10. Expiration date
11. Field 10 check digit
12. Nationality
14. Check digit of the concatenation of the fields 3, 4, 5, 7, 8, 10 and 11
15. Name

The traditional DNI has different fields:

``````1.[ID] 2.[ESP] 3.[12345678Z] 4.[3] 5.[<<<<<<<<<<<<<<<]
6.[741015] 7.[0] 8.[M] 9.[090322] 10.[6] 11.[ESP] 12.[<<<<<<<<<<<] 13.[4]
14.[DE<TAL<Y<CUAL<<FULANITO<<<<<<<]
``````
1. Document type
2. Nation
3. DNI number
4. Field 3 check digit
6. Date of birth (`YYMMDD`)
7. Field 6 check digit
8. Sex (`M`/`F`)
9. Expiration date
10. Field 9 check digit
11. Nationality
13. Check digit of the concatenation of the fields 3, 4, 6, 7, 9 and 10
14. Name
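
Because both layouts use fixed-width lines of 30 characters, the fields that feed the last check digit can be picked out by offset. The following is only a sketch and not part of the original research: the `span` type, the two offset tables and the `composite_input` function are hypothetical names, and the offsets are simply counted from the sample lines above.

```c
#include <stddef.h>
#include <string.h>

/* A field location: OCR line (0 or 1), start offset and length. */
struct span { int line; int start; int len; };

/* Electronic DNI: fields 3, 4, 5, 7, 8, 10 and 11 (offsets assumed
 * from the sample above). */
static const struct span electronic_spans[] = {
    {0, 5, 9}, {0, 14, 1}, {0, 15, 9},  /* serial, its check digit, DNI number */
    {1, 0, 6}, {1, 6, 1},               /* date of birth and its check digit */
    {1, 8, 6}, {1, 14, 1}               /* expiration date and its check digit */
};

/* Traditional DNI: fields 3, 4, 6, 7, 9 and 10. */
static const struct span traditional_spans[] = {
    {0, 5, 9}, {0, 14, 1},              /* DNI number and its check digit */
    {1, 0, 6}, {1, 6, 1},               /* date of birth and its check digit */
    {1, 8, 6}, {1, 14, 1}               /* expiration date and its check digit */
};

/* Concatenates the given spans of the two lines into out.
 * Returns the number of characters written, or -1 if a line is too
 * short or out is too small. */
int
composite_input(const char *lines[2], const struct span *spans, size_t count,
                char *out, size_t outsize)
{
    size_t i, used = 0;

    for (i = 0; i < count; i++) {
        const struct span *f = &spans[i];

        if (strlen(lines[f->line]) < (size_t)(f->start + f->len))
            return -1;
        if (used + (size_t)f->len + 1 > outsize)
            return -1;
        memcpy(out + used, lines[f->line] + f->start, (size_t)f->len);
        used += (size_t)f->len;
    }
    out[used] = '\0';
    return (int)used;
}
```

For the electronic sample this yields `ABC123456012345678Z74101500903226`, and for the traditional sample `12345678Z374101500903226`. These are the strings whose check digits are printed at the end of the second line.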

Calculation of the check digits

The check digits are calculated by applying a simple algorithm to other fields. First the characters must be separated, e.g., if the field value is `12345678Z`:

``````1 2 3 4 5 6 7 8 Z
``````

If any of the characters is a letter, it must be replaced by its numeric value:

``````A 0   F 5   K 10   P 15   U 20   Z 25
B 1   G 6   L 11   Q 16   V 21
C 2   H 7   M 12   R 17   W 22
D 3   I 8   N 13   S 18   X 23
E 4   J 9   O 14   T 19   Y 24
``````

So we have:

``````1 2 3 4 5 6 7 8 25
``````

The weights `7-3-1` must be applied to those numbers. This means each number is multiplied by `7`, by `3` or by `1` depending on its position, with the pattern repeating every three positions:

``````1  2  3  4  5  6  7  8  25
7  3  1  7  3  1  7  3   1
--------------------------
7  6  3 28 15  6 49 24  25
``````

Then the results of the multiplications are summed:

``````7 + 6 + 3 + 28 + 15 + 6 + 49 + 24 + 25 = 163
``````

The check digit is the last digit of the sum (that is, the sum modulo 10), `3` in this case.

Algorithm implementation

This is an implementation of the algorithm in the C programming language.

The `check_digit` function defined below receives a parameter of type `char *` that must point to a string containing digits and/or letters. It returns an `int` between `0` and `9`, the check digit of the string. If an invalid character is found, `-1` is returned.

``````#include <ctype.h>

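/* Returns the 7-3-1 check digit (0-9) of s, or -1 if s contains a
 * character that is neither a digit nor a letter. */
int
check_digit(char *s)
{
    static int m[3] = { 7, 3, 1 };  /* weights, repeated every 3 positions */
    int i, n;

    for (i = n = 0; s[i] != '\0'; i++) {
        /* <ctype.h> functions need a value representable as unsigned char */
        unsigned char c = (unsigned char)s[i];

        if (isdigit(c))
            n += (c - '0') * m[i % 3];           /* '0'..'9' -> 0..9 */
        else if (isalpha(c))
            n += (toupper(c) - 'A') * m[i % 3];  /* 'A'..'Z' -> 0..25 */
        else
            return -1;
    }
    return n % 10;
}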
``````
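
As a quick check, the algorithm can be run against the sample data used throughout this article. The snippet below is a sketch: `field_is_consistent` is a hypothetical helper, and `check_digit_copy` merely repeats the function above so that the snippet compiles on its own.

```c
#include <ctype.h>

/* Same algorithm as check_digit above, repeated here so that the
 * snippet is self-contained. */
static int
check_digit_copy(const char *s)
{
    static const int m[3] = { 7, 3, 1 };
    int i, n;

    for (i = n = 0; s[i] != '\0'; i++) {
        unsigned char c = (unsigned char)s[i];

        if (isdigit(c))
            n += (c - '0') * m[i % 3];
        else if (isalpha(c))
            n += (toupper(c) - 'A') * m[i % 3];
        else
            return -1;
    }
    return n % 10;
}

/* Hypothetical helper: returns 1 if the printed check digit matches
 * the one computed from field, 0 otherwise. */
int
field_is_consistent(const char *field, char printed)
{
    int d = check_digit_copy(field);

    return d >= 0 && printed == (char)('0' + d);
}
```

With the sample values, `field_is_consistent("12345678Z", '3')`, `field_is_consistent("741015", '0')` and `field_is_consistent("090322", '6')` all return `1`. So do the concatenations: `"12345678Z374101500903226"` with `'4'` for the traditional DNI, and `"ABC123456012345678Z74101500903226"` with `'9'` for the electronic one.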

Proof of concept

If you have an electronic DNI, take the hardware serial number of your DNI, your DNI number, your date of birth and the expiration date of your DNI as they appear on the back of the card, and compute the check digits as described above. They should match the printed ones.

On the other hand, if you have a traditional DNI, do the same with your DNI number, your date of birth and the expiration date of your DNI as they appear on the back of the card.
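
The same check can be done offline with a small program. What follows is a minimal sketch under stated assumptions: `verify_dni`, `field_ok` and `check_digit_n` are hypothetical names, the offsets are taken from the samples in this article, and both lines are assumed to be exactly 30 characters long.

```c
#include <ctype.h>
#include <string.h>

/* 7-3-1 check digit over the first len characters of s, or -1 if an
 * invalid character is found. */
static int
check_digit_n(const char *s, size_t len)
{
    static const int m[3] = { 7, 3, 1 };
    size_t i;
    int n = 0;

    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)s[i];

        if (isdigit(c))
            n += (c - '0') * m[i % 3];
        else if (isalpha(c))
            n += (toupper(c) - 'A') * m[i % 3];
        else
            return -1;
    }
    return n % 10;
}

/* Returns 1 if s[len] is the check digit of s[0..len-1]. */
static int
field_ok(const char *s, size_t len)
{
    int d = check_digit_n(s, len);

    return d >= 0 && s[len] == '0' + d;
}

/* l1 and l2 are the first two OCR lines; electronic is nonzero for the
 * electronic layout. Returns 1 if every check digit matches. */
int
verify_dni(const char *l1, const char *l2, int electronic)
{
    char all[40];
    size_t n = electronic ? 19 : 10;  /* checked characters after "IDESP" */
    int d;

    if (strlen(l1) != 30 || strlen(l2) != 30)
        return 0;
    /* Serial number (electronic) or DNI number (traditional). */
    if (!field_ok(l1 + 5, 9))
        return 0;
    /* Date of birth and expiration date. */
    if (!field_ok(l2, 6) || !field_ok(l2 + 8, 6))
        return 0;
    /* Last digit: first-line fields, then both dates with their digits. */
    memcpy(all, l1 + 5, n);
    memcpy(all + n, l2, 7);
    memcpy(all + n + 7, l2 + 8, 7);
    d = check_digit_n(all, n + 14);
    return d >= 0 && l2[29] == '0' + d;
}
```

Called with the two sample cards from this article, `verify_dni` returns `1` for both layouts; changing any check digit or checked field makes it return `0`.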

The next time you hear the urban legend that states that the number on the back of the DNI refers to the number of persons with the same name and last names as you, tell them the truth!

History

In 2005, tired of hearing the myth of the DNI numbers, I decided to search for an answer to the enigma: if it’s not the number of persons with your name and last names, what is it then?

Before knowing anything about the origin of the check digits, or even knowing with certainty that they were check digits, I already thought that what people said about that digit was unlikely. It didn’t seem practical to include a piece of information that depends on so many factors external to the card holder, since it could easily become obsolete and would then have no value. Ideally, if they really needed to know that information, they could query a database.

I read about the possibility of them being check digits on a weblog and it seemed reasonable to me. The digits after the birth date and after the expiration date seemed most likely to be a function of the date, and my investigation was based on that.

Thanks to my friends, I collected some samples to compare. I tried to apply several common check digit algorithms, but I had no luck.

One day I was able to compare two dates where the only difference was the second digit. Guessing that the algorithm worked with a system of weights (numbers by which the values are multiplied depending on their positions), a sum and the extraction of the last digit, I deduced that the weight for the second position was `3`.

I decided to address the problem with brute force, since it seemed doable. There were only 10 possible values for each of the 6 positions (1,000,000 candidates in total), so I wrote a script that tried every combination of weights while checking the results against all the samples I had. It worked and I got the weights, `7-3-1`.
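
The original script is not reproduced here. The following is a hypothetical reconstruction of such a search in C, where `sample`, `weights_fit` and `search_weights` are made-up names. It assumes each weight is a single digit from 0 to 9 and that the check digit is the last digit of the weighted sum.

```c
#include <stddef.h>

/* A sample: six date digits and the check digit printed after them. */
struct sample { const char *digits; int check; };

/* Returns 1 if the six weights reproduce the check digit of every
 * sample, 0 otherwise. */
int
weights_fit(const int w[6], const struct sample *samples, size_t count)
{
    size_t s;
    int i, sum;

    for (s = 0; s < count; s++) {
        for (i = sum = 0; i < 6; i++)
            sum += (samples[s].digits[i] - '0') * w[i];
        if (sum % 10 != samples[s].check)
            return 0;
    }
    return 1;
}

/* Tries all 10^6 weight tuples and returns how many fit the samples.
 * The first fitting tuple, if any, is stored in first[]. */
long
search_weights(const struct sample *samples, size_t count, int first[6])
{
    long code, found = 0;
    int w[6], i;

    for (code = 0; code < 1000000L; code++) {
        long c = code;

        /* Decode the counter into six decimal digits, one per weight. */
        for (i = 5; i >= 0; i--) {
            w[i] = (int)(c % 10);
            c /= 10;
        }
        if (weights_fit(w, samples, count)) {
            if (found == 0)
                for (i = 0; i < 6; i++)
                    first[i] = w[i];
            found++;
        }
    }
    return found;
}
```

With only the two sample dates from this article (`741015` with check digit `0` and `090322` with check digit `6`), more than one tuple fits, for example both `7 3 1 7 3 1` and `8 3 4 7 3 1`. A larger set of samples is needed to narrow the result down.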

So I knew how to get the check digit for the dates, but there was a problem applying it to the digit next to the DNI number on the first line. This number had a letter, therefore I had to discover how to get its numeric value. After a few tests I found the answer: `A=0`, `B=1`, …, `Z=25`.

Only the last digit was left, the one that motivated me to begin this research. But its origin was not obvious because it was separated from the other data.

I spent some time testing without getting an exact result. Finally, I read a reference to a certain document, ICAO 9303, that allegedly explained all this. I couldn’t access that document, since I couldn’t find it online at the time, but I found documents explaining the `7-3-1` weight system and its application to passports, citing the 9303 document as their source. The OCR data on the passport has a different format than the OCR data on the DNI, so I didn’t find the solution there, but I got a clue: the last check digit of the OCR data of a passport is calculated like the other check digits explained in this document, but its input is a selection of the previous data. I read that the sex and nationality didn’t affect this check digit, and then I realized why I hadn’t found the answer: in all the tests I had done, I had always included the sex character. So after trying with just the check digits and their related fields, I managed to match all samples.

Six years later I still didn’t have an electronic DNI, but I could analyze some kindly contributed samples. Comparing an electronic DNI with a traditional DNI, I saw that the first line of the OCR data had a different format: there was an addition before the DNI number, which I identified as the serial number on the front of the card, followed by its check digit. Also, the DNI number no longer had a check digit of its own. Then I did a couple of tests and found the new way to calculate the last check digit.