This post marks the beginning of the end: the end of the Free Code Camp bonfires. This time, we are going to tackle the first bonfire in the advanced (or upper advanced) section. It deals with regular expressions, which can range from incredibly simple to monstrously complex. This bonfire won’t be too bad: the Validate US Telephone Numbers challenge asks us to do exactly that, validate the provided telephone number (a string) and return true or false for valid or invalid phone numbers. I’m going to take you through the creation of a single regular expression that will do the job for us.
Let’s get started. These are the accepted telephone number formats:
- 555-555-5555
- (555)555-5555
- (555) 555-5555
- 555 555 5555
- 5555555555
- 1 555 555 5555
Anything with a different shape is considered invalid. Taking 1 555-555-5555 as an example, the phone number can be dissected into the following pieces:
- Country code: 1
- Area code: 555
- Phone number: 555-5555
The country code is optional this time around, but, if provided, we must make sure that it is equal to 1 (US country code). Both area code and phone number are required.
Now that we know the pieces, let’s use regular expressions to check them one by one to keep it simple. I recommend that you use this page to construct your regular expressions; it’s very easy to use and will help a ton!
I’m going to start by creating a variable where the regular expression will be stored; we’ll get to the actual function later. First, let’s make sure that if we get a country code, it’s equal to 1. To achieve this, I’m going to use the ^ symbol, which matches the beginning of a string. Then, we type in 1, since that’s what we actually want at that position. Since the country code is optional, we add the ? symbol after it, which matches zero or one occurrence of the preceding element:
var phone = /^1?/;
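To see what this first piece does on its own, here is a small sketch. The matchedPrefix helper is a hypothetical name used only for illustration; it returns the text that the partial pattern consumed:

```javascript
// Partial pattern from the step above: an optional leading 1.
var countryCode = /^1?/;

// Hypothetical helper (not part of the challenge): returns the matched text.
// Because the 1 is optional, the pattern always matches, possibly with an empty string.
function matchedPrefix(str) {
  return countryCode.exec(str)[0];
}

console.log(JSON.stringify(matchedPrefix("1 555 555 5555"))); // "1"
console.log(JSON.stringify(matchedPrefix("555 555 5555")));   // ""
console.log(JSON.stringify(matchedPrefix("2 555 555 5555"))); // ""
```

Note that this piece alone does not reject a country code of 2; it simply matches nothing there. The rejection only happens once the rest of the pattern has to line up with the remaining characters.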
Next, we have the area code, but before it we may sometimes get a space, so let’s use the ? symbol with a preceding space:
var phone = /^1? ?/; // There is a space between the two ? symbols!
Now, onto the area code. It can show up in two different shapes: 555 or (555). We are going to use a capturing group (written with parentheses) and the OR operator (the | symbol) together. This way, we can match either one option or the other, like so: (555|(555)).
It’s actually not that simple: we’ll need to match any digit and also escape the parentheses for (555), or the engine will confuse the capturing group’s parentheses with the ones that we want to match in the string. To escape a character, we simply place a backslash (\) in front of it.
To match any digit, we can use this token: \d. If we place three of these in succession (\d\d\d), it will match any 3 digits in a row. We can also use this notation: \d{3}. This means that the previous token will repeat 3 times. Let’s use this knowledge and implement it in our regular expression:
var phone = /^1? ?(\d{3}|\(\d{3}\))/;
It may seem quite confusing, but (\d{3}|\(\d{3}\)) just means: three digits in a row, or an opening parenthesis, three digits and a closing parenthesis.
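The difference the backslashes make can be checked in isolation. In this sketch the area code group is wrapped in ^ and $ so it is tested on its own; those anchors and the variable names are assumptions made for the demonstration, not part of the final pattern at this stage:

```javascript
// Escaped version: \( and \) match literal parentheses in the string.
var areaCode = /^(\d{3}|\(\d{3}\))$/;

// Unescaped version: the inner parentheses are just another group,
// so both alternatives mean "three digits".
var unescaped = /^(\d{3}|(\d{3}))$/;

console.log(areaCode.test("555"));    // true
console.log(areaCode.test("(555)"));  // true
console.log(areaCode.test("(555"));   // false, closing parenthesis missing
console.log(areaCode.test("555)"));   // false, opening parenthesis missing
console.log(unescaped.test("(555)")); // false, literal parentheses never match
```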
We only have the actual phone number left, but first we need to account for the possibility of a space or dash (-) character between the area code and the phone number. We will use a character set, written with square brackets ([ and ]), for this. The square brackets match any one of the characters inside, but only one. For example, [abc] will match a, b or c, but not two or three of them together. If we want to match a space, a dash or nothing at all, we need to use the following: [ -]? (remember what the ? symbol did!). Let’s put it into the phone RegExp:
var phone = /^1? ?(\d{3}|\(\d{3}\))[ -]?/;
Good! Almost there! Now for the phone number: we will always get seven digits, but there might be a space or dash between the first three and the last four digits. Sounds familiar, doesn’t it? We just did it a few seconds ago! Let’s match three digits, an optional space or dash and another four digits: \d{3}[ -]?\d{4}. This is what it looks like all together:
var phone = /^1? ?(\d{3}|\(\d{3}\))[ -]?\d{3}[ -]?\d{4}/;
Since we want these last digits to be the end of the match, we’ll append a dollar sign ($) at the end of the regular expression. It’s the counterpart of the ^ symbol we placed at the very beginning: it matches the end of the string.
var phone = /^1? ?(\d{3}|\(\d{3}\))[ -]?\d{3}[ -]?\d{4}$/;
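A quick way to follow the pattern as it grows is to look at how much of each input it has consumed so far. This is only a sketch; partial and consumed are hypothetical names introduced for the example:

```javascript
// Pattern built so far: optional 1, optional space, area code, optional separator.
var partial = /^1? ?(\d{3}|\(\d{3}\))[ -]?/;

// Hypothetical helper: returns the consumed prefix, or null if nothing matches.
function consumed(str) {
  var result = partial.exec(str);
  return result === null ? null : result[0];
}

console.log(JSON.stringify(consumed("(555) 555-5555"))); // "(555) "
console.log(JSON.stringify(consumed("555-555-5555")));   // "555-"
console.log(JSON.stringify(consumed("5555555555")));     // "555"
console.log(JSON.stringify(consumed("1 555 555 5555"))); // "1 555 "
console.log(JSON.stringify(consumed("555--555-5555")));  // "555-" (only one separator is taken)
```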
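The effect of the dollar sign is easiest to see with a number that has one digit too many. A minimal comparison, using variable names chosen just for this example:

```javascript
// Same pattern with and without the end-of-string anchor.
var withoutEnd = /^1? ?(\d{3}|\(\d{3}\))[ -]?\d{3}[ -]?\d{4}/;
var withEnd = /^1? ?(\d{3}|\(\d{3}\))[ -]?\d{3}[ -]?\d{4}$/;

var tooLong = "555-555-55555"; // five digits at the end instead of four

console.log(withoutEnd.test(tooLong)); // true, the extra digit is simply ignored
console.log(withEnd.test(tooLong));    // false, the match must end with the string
```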
It may seem somewhat confusing, but if you go step by step, it will make sense. We are not done though! Let’s use the test method to actually validate the phone numbers provided to us!
function telephoneCheck(str) { var phone = /^1? ?(\d{3}|\(\d{3}\))[ -]?\d{3}[ -]?\d{4}$/; return phone.test(str); }
And that’s it! We can even skip adding the phone variable and be done in a single line:
function telephoneCheck(str) { return /^1? ?(\d{3}|\(\d{3}\))[ -]?\d{3}[ -]?\d{4}$/.test(str); }
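As a sanity check, the function can be run against every accepted format from the list at the top, plus a few malformed inputs. The invalid samples below are my own picks for illustration, not the official test cases of the challenge:

```javascript
function telephoneCheck(str) {
  return /^1? ?(\d{3}|\(\d{3}\))[ -]?\d{3}[ -]?\d{4}$/.test(str);
}

// The six accepted formats listed at the start of the post.
var valid = [
  "555-555-5555",
  "(555)555-5555",
  "(555) 555-5555",
  "555 555 5555",
  "5555555555",
  "1 555 555 5555"
];

// Hand-picked invalid samples (assumptions for the demo).
var invalid = [
  "555-5555",        // area code missing
  "2 555 555 5555",  // country code is not 1
  "(555-555-5555",   // unbalanced parenthesis
  "555)555-5555",    // unbalanced parenthesis
  "1 555 555 55555"  // one digit too many
];

console.log(valid.every(telephoneCheck)); // true
console.log(invalid.some(telephoneCheck)); // false
```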
If in doubt, you can always contact me via email or Twitter, or by posting a comment below. Happy coding!