Tuesday, May 12, 2015

Twain

Problem:

TopCoder Problem Statement - Twain
Single Round Match 169 Round 1 - Division II, Level Two

Overview:

Perform a series of manipulations to a String and return the result.

Java Source:
     1 public class Twain {
     2 
     3     public String getNewSpelling(int year, String phrase) {
     4 
     5         if (year >= 1)  {
     6             // If a word starts with x, replace it with a z
     7             phrase = phrase.replaceAll("\\bx","z");
     8 
     9             phrase = phrase.replaceAll("x","ks");
    10         }
    11 
    12         if (year >= 2)  {
    13             phrase = phrase.replaceAll("y", "i");
    14         }
    15 
    16         if (year >= 3)  {
    17             /*
    18             * If a 'c' is followed immediately by an 'e' or
    19             * an 'i', then replace the 'c' with an 's'.
    20             */
    21             phrase = phrase.replaceAll("c([ei])", "s$1");
    22         }
    23 
    24         if (year >= 4)  {
    25 
    26             /*
    27             * Remove repeating 'c's that end with a 'k'
    28             * with just a 'k'.
    29             */
    30             phrase = phrase.replaceAll("c+k", "k");
    31         }
    32 
    33         if (year >= 5)  {
    34 
    35             /*
    36             * Replace 'sch' at the beginning of a word with
    37             * 'sk'
    38             */
    39             phrase = phrase.replaceAll("\\bsch", "sk");
    40 
    41             phrase = phrase.replaceAll("chr", "kr");
    42 
    43             /*
    44             * Replace all 'c' s that are not followed immediately
    45             * by a 'h' with a 'k'
    46             */
    47             phrase = phrase.replaceAll("c(?!h)", "k");
    48 
    49         }
    50 
    51         if (year >= 6)  {
    52 
    53             /*
    54             * Replace all words that start with 'kn'
    55             * with just 'n'
    56             */
    57             phrase = phrase.replaceAll("\\bkn", "n");
    58         }
    59 
    60         if (year >= 7)  {
    61 
    62             /*
    63             * Replace all repeating consonants, with just the
    64             * single character.
    65             */
    66             phrase = phrase.replaceAll("([b-df-hj-np-tv-z])\\1+", "$1");
    67         }
    68 
    69         return phrase;
    70  }
    71 }
Notes:

It looks like a lot of coders had a tough time with this problem. The solution given here seems like a lot of work. Fortunately, if you're up on your regular expressions, you can knock this out easily. If you're rusty with regular expressions, this is a great opportunity to exercise these important skills.

The first rule under year one uses the \\b character which matches with the start of a word boundary. \\bx matches any x at the start of a word. The second rule trivially replaces all remaining occurrences of "x" with "ks".

Year two is also trivial. we simply replace all occurrences of letter "y" with an "i". Nothing difficult there.

The rule for year three is a little more difficult. We need to replace any "c" followed immediately by an "e" or "i" with an "s". The rule "c([ei])" matches on the "c" followed by either an "e" or an "i". By putting the parenthesis around "[ei]" the value it matches is stored in a variable. We access the variable with $1. The result "s$1" replaces the "c" with an "s" and then tacks on whatever the "[ei]" matched.

The rule for year four simply replaces 1 or more occurrences of the letter "c" ending with a "k" with just a "k".

We've already seen the first two rules for year 5. The final rule is a bit tricky. It matches a "c" that is not immediately followed by an "h" with a "k".

The rule for year six is basically the same as that the first rule of year 1.

The rule for year seven is probably the most difficult. All consonants are defined by the ranges contained within the square brackets. Note that the vowels are missing here. The group is also enclosed within parenthesis, so we can refer to what it matches later using the variable $1. The \\1+ means that we want to match any occurrence of 1 or more consonants. We then replace this with just the single consonant.

I hope this gives some idea of how simple the problem can be solved using regular expressions. My goal is not to give an precise definition for any of the constructs used above, but rather just to show you how they could be used in this example.

No comments:

Post a Comment