Where magic lives

Friday, November 17, 2006

The Guardian does it again

Exclusive: We can Read! - Why did Steve Boggan and a friendly computer expert find it so exciting to state the obvious?


Thursday, August 10, 2006

Analysis of HSBC Vulnerability

The news this morning is full of reports about a "security flaw" in HSBC online banking.

Being an HSBC account holder myself (not that I actually use the bank; they offer pathetic interest rates), I was encouraged to investigate further. I was put at ease the moment I saw that each article hinted at the researchers having assumed that every target is already infected with a keylogger. That is a rather unreasonable assumption if you ask me, and I think at this point it stops being "news". The vulnerability itself, however, is quite interesting...

When you log on to HSBC online banking you are asked for your date of birth and for three digits from your security number. The three digits you are asked for are randomly selected by HSBC, but the selection only seems to change after a successful login. The instructions that tell you which digits to enter are sent over HTTPS, so we will assume they are invisible to the attacker. Now for the important part: the digits are always requested in the order they appear in the security number. For example, you might be asked for digits 1, 2 and 3 in that order, but you would never be asked for digits 3, 2 and 1 in that order. This leads to the vulnerability...
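The selection rule can be sketched in a few lines of Python. This is only a model of the behaviour described above: the function names are made up for illustration, and the uniform random choice of positions is an assumption, since HSBC's real selection logic is not known.

```python
import random


def partial_code(security_number: str, rng: random.Random) -> str:
    """Model one login challenge: three positions, always asked in ascending order."""
    # Assumption: positions are drawn uniformly at random without repetition.
    positions = sorted(rng.sample(range(len(security_number)), 3))
    return "".join(security_number[p] for p in positions)


def is_subsequence(partial: str, full: str) -> bool:
    """True if the digits of `partial` appear in `full` in the same relative order."""
    remaining = iter(full)
    # `digit in remaining` consumes the iterator up to the first match,
    # so each later digit has to be found after the previous one.
    return all(digit in remaining for digit in partial)


if __name__ == "__main__":
    rng = random.Random(2006)
    for _ in range(5):
        print(partial_code("4921576876", rng))
```

Because the positions are sorted before the digits are read off, every partial code is a subsequence of the full security number. That ordering information is what a keylogger ends up collecting.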

Let us use a random example. Assume that an HSBC customer uses the security number 4921576876, that we have a keylogger running on his machine, and that we have now watched him log in to HSBC 22 times, seeing the following partial security codes: 416, 458, 496, 286, 925, 976, 487, 476, 157, 987, 476, 576, 217, 915, 178, 976, 491, 476, 428, 915, 917 and 176.

From the data above we can estimate how many times each digit appears in the user's security code. We would expect each digit position of the security code to show up in our data a total of (|dataset| x |partialcode|) / |availabledigits| = (22 x 3) / 10 = 6.6 times. For example, we saw the digit 6 ten times in total, so we would expect it to appear in the security code 10 / 6.6 = about 1.5 times, which rounds to 2. Using this strategy we can deduce how often each digit occurs in the security code: 0 and 3 do not appear at all; 1, 2, 4, 5, 8 and 9 appear once each; 6 and 7 appear twice each. This statistical analysis has introduced some uncertainty, and we may need to come back to these estimates if the procedure below leads to errors.
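For anyone who wants to check the arithmetic, here is a minimal Python sketch of the estimate for this worked example. The names are illustrative, and it reads |availabledigits| as the length of the security number (10), which is an interpretation of the formula above.

```python
import math
from collections import Counter

# The 22 partial codes captured in the worked example.
OBSERVED = [
    "416", "458", "496", "286", "925", "976", "487", "476", "157", "987", "476",
    "576", "217", "915", "178", "976", "491", "476", "428", "915", "917", "176",
]


def estimate_digit_counts(partials, code_length=10):
    """Estimate how many times each digit 0-9 occurs in the full code."""
    seen = Counter("".join(partials))
    # Expected sightings of any single position: (22 x 3) / 10 = 6.6 here.
    per_position = len(partials) * len(partials[0]) / code_length
    # Round half up, so 10 / 6.6 (about 1.5) becomes 2.
    return {d: math.floor(seen[d] / per_position + 0.5) for d in "0123456789"}


if __name__ == "__main__":
    for digit, count in estimate_digit_counts(OBSERVED).items():
        print(digit, count)
```

On this data the estimates add up to exactly ten digits. With fewer observed logins the rounding could easily go wrong, which is why the estimates may need revisiting.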

Now we can start to piece together the original code. Let's start with the digits that only appear once. The code contains a single 1: 1. It contains a single 2, and the partial 217 tells us that the 2 comes before the 1: 21. Similarly there is a single 4, and we know from 416 that it is before the 1 and from 428 that it is before the 2: 421. There is a single 5, and the same method tells us that it comes after the 1: 4215. Similarly we can deduce the positions of the single 8 and 9: 492158.

Now we need to deal with the sixes and sevens. Some uncertainty is introduced here, but the state space stays manageably small. There is definitely a 7 after the 8 (because of 487): 4921587. The other 7 comes either immediately before or immediately after the 5, but we cannot tell which. The first 6 could appear anywhere after the 9 (from 496), and the second 6 could appear anywhere after the 1 (from 416), but if you chart all the possible locations they can be seen to be statistically more likely to appear after the 57/75, so let us assume this.
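The first half of this step, ordering the digits that appear only once, amounts to collecting "comes before" pairs and sorting them. A rough Python sketch for this worked example follows. It uses the standard-library `graphlib` module (Python 3.9 or later); the function name is made up for illustration, and it takes the once-only digits 1, 2, 4, 5, 8 and 9 from the frequency estimate as given.

```python
from graphlib import TopologicalSorter
from itertools import combinations

# The 22 partial codes captured in the worked example.
OBSERVED = [
    "416", "458", "496", "286", "925", "976", "487", "476", "157", "987", "476",
    "576", "217", "915", "178", "976", "491", "476", "428", "915", "917", "176",
]


def order_single_digits(partials, singles):
    """Order the digits believed to occur exactly once in the full code."""
    sorter = TopologicalSorter()
    for digit in singles:
        sorter.add(digit)
    for partial in partials:
        # Ignore the repeated digits (6 and 7 here); their order is ambiguous.
        kept = [d for d in partial if d in singles]
        for earlier, later in combinations(kept, 2):
            sorter.add(later, earlier)  # `earlier` must precede `later`
    # static_order() returns one valid order; it is the only one when the
    # observations pin down every adjacent pair, as they do in this example.
    return "".join(sorter.static_order())


if __name__ == "__main__":
    print(order_single_digits(OBSERVED, "124589"))
```

If the frequency estimate were wrong and a "single" digit really occurred twice, the pairs could contradict each other, and `static_order()` would raise `graphlib.CycleError`. That would be the signal to go back and revisit the estimates.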

Based on the above (which assumes our frequency distribution to be correct) we claim that the code begins with 4921, is then followed by 57 or 75, and is then followed by 6876, 8766, 8676, 6687, 8667 or 6867 (all of the possible arrangements of the sixes at the end of the code). This gives us only 12 possible codes, and the list does indeed contain the correct code: 4921576876.
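The count of 12 is easy to double-check by listing the candidates. The short Python sketch below hard-codes the conclusions of this worked example (prefix 4921, then 57 or 75, then the digits 8, 7, 6, 6 with the 8 before the 7); the function name is illustrative, and nothing here is derived from new data.

```python
from itertools import permutations


def candidate_codes():
    """List every full code consistent with the deductions in the worked example."""
    prefix = "4921"
    middles = ("57", "75")  # the first 7 sits immediately after or before the 5
    # Arrangements of 8, 7, 6, 6 in which the 8 comes before the 7; the set
    # removes duplicates caused by the two identical sixes.
    tails = sorted(
        {"".join(p) for p in permutations("8766") if p.index("8") < p.index("7")}
    )
    return [prefix + middle + tail for middle in middles for tail in tails]


if __name__ == "__main__":
    codes = candidate_codes()
    print(len(codes))
    print("4921576876" in codes)
```

Two middles times six tails gives the 12 candidates, and the real security number is among them.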

I have chosen to publish a worked example rather than general code because it is easier and won't get me accused of publishing working exploit code, but it can be seen how the above procedure can be generalised. At this point you could debate whether the subject, as well as not being newsworthy, is not academic research either but just simple maths. We'll see where their research gets published "later in the year"!
