Validate Google Form Data with Regex Statements

In an earlier post I gave an overview of how to validate user input on Google Forms. As I said there, you can often avoid regular expressions by using other options such as Contains or Does Not Contain.

Sometimes, though, a regex (or regular expression, to give it its full title) is the only option. Learning how to write a regex can be difficult and time-consuming, so here I hope to give some background on what the various parts of a regex signify, and offer a few examples you can use as they are or modify for your own use. (Please note that I don’t guarantee these to be bulletproof; you may find a few valid data examples that fail the test, or a few erroneous ones that pass.)

.       Matches any single character.
[ ]     Matches a single character contained within the square brackets. 
[^ ]    Matches a single character not contained within the square brackets.
^       Matches the beginning of the string. Referred to as an anchor.
$       Matches the end of the string. Referred to as an anchor.
*       Matches 0 or more of the previous item.
?       Matches 0 or 1 of the previous item.
+       Matches 1 or more of the previous item.
{ }     Matches {this many} of the previous item, e.g. {3} for exactly three.
|       The OR operator. 
        Matches either the expression before or the expression after the |
\       The escape character. 
        Allows you to use one of these metacharacters for your match.
( )     Groups characters into substrings.
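
To see these metacharacters in action, here is a small sketch using Python's re module. Google Forms uses its own regex engine, so treat this as an illustration of the basic syntax rather than of Forms itself; the sample strings are made up for the purpose.

```python
import re

# Each entry: (pattern, a string that should match, a string that should not).
# re.fullmatch requires the whole string to match the pattern.
examples = [
    (r"c.t",      "cat",    "ct"),       # .    any single character
    (r"[bc]at",   "bat",    "rat"),      # [ ]  one character from the set
    (r"[^bc]at",  "rat",    "bat"),      # [^ ] one character not in the set
    (r"ab*",      "a",      "b"),        # *    zero or more of the previous item
    (r"colou?r",  "color",  "colouur"),  # ?    zero or one of the previous item
    (r"go+gle",   "google", "ggle"),     # +    one or more of the previous item
    (r"[0-9]{3}", "123",    "12"),       # {3}  exactly three of the previous item
    (r"cat|dog",  "dog",    "cow"),      # |    either side of the bar
    (r"3\.14",    "3.14",   "3x14"),     # \    escape: a literal full stop
    (r"(ab)+",    "abab",   "aba"),      # ( )  a group, repeated as a unit
]

results = [
    (pattern, bool(re.fullmatch(pattern, good)), bool(re.fullmatch(pattern, bad)))
    for pattern, good, bad in examples
]

for pattern, good_ok, bad_ok in results:
    print(f"{pattern:10} matches good: {good_ok}, matches bad: {bad_ok}")

# ^ and $ only matter when the pattern may match part of the string,
# which is what re.search does.
digit_anywhere = re.search(r"[0-9]", "abc1") is not None   # a digit appears somewhere
digit_at_start = re.search(r"^[0-9]", "abc1") is not None  # but not at the start
print(digit_anywhere, digit_at_start)
```

Every pattern in the list should accept its first sample and reject its second, which is a quick way to check your own understanding of each symbol.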

That’s a lot to take in, so let’s try a simple example: a regex that allows only a whole number (a string of digits). First, each character must be a digit, 0 through 9:


And we’ll allow any number of such digits in our string:


There must be nothing else before these characters, so we use the beginning anchor:


And there must be nothing else after these digits, so we add the end anchor:


Hey presto.
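
The build-up above can be sketched in Python as follows. The patterns below are a reconstruction of each step, and the quantifier is an assumption: I have used + (one or more) so that an empty answer is rejected, whereas * would accept an empty string as well.

```python
import re

# Step-by-step patterns; the "+" quantifier is an assumption (see above).
steps = {
    "one digit":      r"[0-9]",
    "one or more":    r"[0-9]+",
    "anchored start": r"^[0-9]+",
    "anchored both":  r"^[0-9]+$",
}

samples = ["2024", "x2024", "2024x", "20 24", ""]

def accepted(pattern, text):
    # re.search looks for the pattern anywhere in the text, so only the
    # anchors force the match to cover the start and end of the string.
    return re.search(pattern, text) is not None

for name, pattern in steps.items():
    passing = [s for s in samples if accepted(pattern, s)]
    print(f"{name:15} {pattern:10} accepts {passing}")
```

Without the anchors, any answer that merely contains a digit gets through; only the fully anchored pattern narrows the list down to the pure string of digits.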

Here are some other popular examples, as promised:

Whole numbers (digits)

Regular text with spaces, commas and full stops (periods)

US zip code

Only file names ending .jpg, .gif or .png

Email address

Character limit of 140 (such as an SMS or tweet)
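
For the examples listed above, here is a minimal sketch of patterns that fit each description, checked in Python. These are illustrative assumptions rather than definitive patterns: the email check in particular is deliberately loose, the zip code pattern assumes five digits with an optional four-digit extension, and the dictionary keys and the helper is_valid are names made up for this sketch.

```python
import re

# Illustrative patterns only (assumptions, not definitive validators).
patterns = {
    "whole number": r"^[0-9]+$",
    "plain text":   r"^[a-zA-Z ,.]+$",                 # letters, space, comma, full stop
    "us zip":       r"^[0-9]{5}(-[0-9]{4})?$",         # 12345 or 12345-6789
    "image file":   r"^.+\.(jpg|gif|png)$",            # lower-case extensions only
    "email":        r"^[^@ ]+@[^@ ]+\.[a-zA-Z]{2,}$",  # loose sanity check
    "max 140":      r"^.{1,140}$",                     # 1 to 140 characters on one line
}

# (pattern name, sample input, whether it should be accepted)
checks = [
    ("whole number", "12345", True),
    ("whole number", "12.5", False),
    ("plain text", "Hello, world.", True),
    ("plain text", "Hello!", False),
    ("us zip", "90210", True),
    ("us zip", "90210-1234", True),
    ("us zip", "9021", False),
    ("image file", "holiday.png", True),
    ("image file", "holiday.pdf", False),
    ("email", "someone@example.com", True),
    ("email", "someone@example", False),
    ("max 140", "a" * 140, True),
    ("max 140", "a" * 141, False),
]

def is_valid(kind, text):
    # Hypothetical helper: True when the text matches the named pattern.
    return re.search(patterns[kind], text) is not None

for kind, text, expected in checks:
    outcome = is_valid(kind, text)
    print(f"{kind:12} {text[:20]!r:24} -> {outcome}")
```

Trying both a passing and a failing sample for each pattern, as the checks list does, is a good habit before you paste a regex into a form.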

^(0[1-9]|[12][0-9]|3[01])[- /.](0[1-9]|1[012])[- /.]\d\d\d\d$
UK date, dd/mm/yyyy

^(0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])[- /.]\d\d\d\d$
US date, mm/dd/yyyy
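
Patterns like these only check the shape of a date, so something like 31/02/2020 still passes. The sketch below, which assumes Python and its standard datetime module, tries both patterns (anchored at both ends) and then adds a calendar check for the UK format; the helper names looks_like and is_real_uk_date are hypothetical.

```python
import re
from datetime import datetime

# dd/mm/yyyy, anchored at both ends.
UK_DATE = r"^(0[1-9]|[12][0-9]|3[01])[- /.](0[1-9]|1[012])[- /.]\d\d\d\d$"
# mm/dd/yyyy: the same pieces with day and month swapped.
US_DATE = r"^(0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])[- /.]\d\d\d\d$"

def looks_like(pattern, text):
    # Shape check only: does the text match the pattern?
    return re.search(pattern, text) is not None

def is_real_uk_date(text):
    # Hypothetical helper: shape check first, then a calendar check.
    if not looks_like(UK_DATE, text):
        return False
    day, month, year = re.split(r"[- /.]", text)
    try:
        datetime(int(year), int(month), int(day))
    except ValueError:
        # The date has the right shape but does not exist in the calendar.
        return False
    return True

for sample in ["25/12/2023", "12/25/2023", "31.02.2020", "29-02-2024"]:
    print(sample, looks_like(UK_DATE, sample), looks_like(US_DATE, sample),
          is_real_uk_date(sample))
```

A form's validation cannot run a calendar check like this, so if impossible dates matter to you, plan to catch them when you process the responses.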

If you can improve on these, find any bugs, or have any more to add, please comment here or in Google+!