VÜlgÅr. A language generator

What is RegEx?

RegEx (short for regular expression) is a syntax for matching patterns in text. In Vulgar, you can use RegEx in the custom orthography option to mimic the spelling idiosyncrasies of real-life languages more closely.

Replace an IPA symbol at the beginning of a word only

RegEx uses the caret symbol ^ to signify the beginning of a line of text. The pattern ^d will match a 'd' phoneme at the beginning of a word, but not in the middle or at the end. An orthography rule such as ^d > D would change /dog/ into Dog, while /god/ would remain unchanged.
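To see the anchor in isolation, here is a minimal Python sketch of the rule ^d > D. It uses Python's re module as a stand-in for Vulgar's own rule engine, which is not shown on this page, and the function name spell_initial_d is made up for illustration.

```python
import re

def spell_initial_d(word: str) -> str:
    """Hypothetical stand-in for the orthography rule `^d > D`."""
    # ^ anchors the match to the start of the string, so only a leading d changes.
    return re.sub(r"^d", "D", word)

print(spell_initial_d("dog"))  # Dog
print(spell_initial_d("god"))  # god
print(spell_initial_d("dad"))  # Dad
```

Note that in dad only the first d is capitalised, because the second one is not at the beginning of the word.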

Replace an IPA symbol at the end of a word only

The dollar sign $ is used to match characters at the end of a line. The rule dʒ$ > dge would spell the word /wedʒ/ as wedge, while /dʒail/ would remain unchanged. You could combine this with a second rule such as ^dʒ > j to turn /dʒail/ into jail.
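The two rules above can be sketched together in Python. This is only an approximation under the assumption that each rule is a plain RegEx find-and-replace applied to one word at a time; the names RULES and spell are hypothetical.

```python
import re

# The two rules from the text, as (pattern, replacement) pairs.
RULES = [
    (r"dʒ$", "dge"),  # dʒ$ > dge : only at the end of a word
    (r"^dʒ", "j"),    # ^dʒ > j   : only at the beginning of a word
]

def spell(word: str) -> str:
    """Apply each rule in turn to a single word."""
    for pattern, replacement in RULES:
        word = re.sub(pattern, replacement, word)
    return word

print(spell("wedʒ"))   # wedge
print(spell("dʒail"))  # jail
```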

Replace multiple IPA symbols

Square brackets [] are used to match any one of the characters inside them. For instance, [ɐɑ] > a means both /ɐ/ and /ɑ/ turn into a. You can put as many characters as you want inside the square brackets, but each one is treated as an individual character. If you want both /ɔ/ and /əʊ/ to turn into o, you can use the vertical line symbol | to form an 'or' expression, for example ɔ|əʊ > o. Alternatively, you can simply list rules separately:

ɔ > o
əʊ > o
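The difference between square brackets and the | symbol is easy to miss, so here is a small Python sketch of it. Python's re module stands in for Vulgar's engine, and the function names are invented for this example.

```python
import re

def merge_a(word: str) -> str:
    # [ɐɑ] > a : either single character becomes a
    return re.sub(r"[ɐɑ]", "a", word)

def merge_o(word: str) -> str:
    # ɔ|əʊ > o : either ɔ, or the two-character sequence əʊ, becomes o
    return re.sub(r"ɔ|əʊ", "o", word)

def merge_o_with_brackets(word: str) -> str:
    # [ɔəʊ] > o : brackets treat ə and ʊ as separate characters
    return re.sub(r"[ɔəʊ]", "o", word)

print(merge_a("pɐtɑ"))                # pata
print(merge_o("bəʊt"))                # bot
print(merge_o("dɔg"))                 # dog
print(merge_o_with_brackets("bəʊt"))  # boot
```

The last line shows why brackets are the wrong tool for a diphthong: ə and ʊ are each replaced, giving two o's instead of one.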

Replace an IPA symbol with itself + something else

Say you wanted to put a z after every vowel. The rule [aeiou] > z is no good because that would just replace every vowel with z. But you can retain what's matched by using $& on the right (replacement) side of the rule. For example, [aeiou] > $&z will retain every vowel it finds and add z after it. Now combine this with other features and you can start to see how powerful RegEx can be. The pattern [aeiou]$ > $&z will put a z after any vowel at the end of a word only. Very French.
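The same idea can be tried in Python, with one caveat: Python's re module does not understand $&, so this sketch uses \g<0>, which is Python's way of writing "the whole match". The function names are hypothetical and the code only approximates how the rules behave.

```python
import re

def replace_vowels(word: str) -> str:
    # [aeiou] > z : every vowel is lost
    return re.sub(r"[aeiou]", "z", word)

def z_after_vowels(word: str) -> str:
    # [aeiou] > $&z : \g<0> plays the role of $& here
    return re.sub(r"[aeiou]", r"\g<0>z", word)

def z_after_final_vowel(word: str) -> str:
    # [aeiou]$ > $&z : only a vowel at the end of the word
    return re.sub(r"[aeiou]$", r"\g<0>z", word)

print(replace_vowels("bela"))        # bzlz
print(z_after_vowels("bela"))        # bezlaz
print(z_after_final_vowel("bela"))   # belaz
print(z_after_final_vowel("belan"))  # belan
```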

Order of rules

The order of your custom orthography rules matters. Vulgar applies the first orthography rule to a word as a find-and-replace, then applies the next rule to the result of the previous one. This can be a problem if an IPA symbol from an earlier rule appears again as part of a consonant cluster in a later rule. For instance, the following rules are problematic:

ʃ > sh
tʃ > ch

The intent is for /tʃ/ to change to ch. However, in a word such as /tʃar/, the first rule will find /ʃ/ and change the orthography to tshar. Then, when it moves to the second rule, it will fail to find /tʃ/. A solution to this is to flip the rules:

tʃ > ch
ʃ > sh

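Alternatively, you can use negative matching syntax, which is signified by a caret inside square brackets. The rule [^t]ʃ > sh translates to: replace /ʃ/ with sh, unless there's a /t/ before it. Be aware that in standard RegEx [^t] matches a character of its own, so the character before /ʃ/ becomes part of the match and is replaced too, and a word-initial /ʃ/ (with nothing before it) is not matched. Flipping the rule order is usually the simpler fix.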
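The ordering problem can be reproduced with a few lines of Python. This is a sketch that assumes rules are applied one after another as plain RegEx replacements; apply_rules, PROBLEMATIC and FLIPPED are made-up names. The second half shows how [^t]ʃ behaves in Python's re module, next to a negative lookbehind, (?<!t)ʃ. Whether Vulgar's own engine accepts lookbehind syntax is not stated on this page, so treat that part as an assumption.

```python
import re

def apply_rules(word, rules):
    """Apply (pattern, replacement) rules in order, each on the previous result."""
    for pattern, replacement in rules:
        word = re.sub(pattern, replacement, word)
    return word

PROBLEMATIC = [("ʃ", "sh"), ("tʃ", "ch")]
FLIPPED = [("tʃ", "ch"), ("ʃ", "sh")]

print(apply_rules("tʃar", PROBLEMATIC))  # tshar
print(apply_rules("tʃar", FLIPPED))      # char
print(apply_rules("ʃip", FLIPPED))       # ship

# [^t] consumes the character before ʃ, so that character is replaced as well.
print(re.sub(r"[^t]ʃ", "sh", "aʃa"))     # sha
# A word-initial ʃ has no preceding character, so nothing matches.
print(re.sub(r"[^t]ʃ", "sh", "ʃip"))     # ʃip

# A negative lookbehind checks for t without consuming anything.
print(re.sub(r"(?<!t)ʃ", "sh", "aʃa"))   # asha
print(re.sub(r"(?<!t)ʃ", "sh", "ʃip"))   # ship
print(re.sub(r"(?<!t)ʃ", "sh", "tʃar"))  # tʃar
```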

Other tricks

Replace with nothing

Creating a rule with nothing on the right side of the > symbol will simply delete whatever the left side of the rule matches. Thus, the rule [aeiou] > will replace every vowel listed inside the brackets with nothing. All orthography can be deleted using the dot symbol . which signifies any character. The rule . > translates to: take any character and replace it with nothing.
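In Python terms, a rule with an empty right side corresponds to replacing with an empty string. The sketch below assumes that reading; the function names are invented for illustration.

```python
import re

def drop_vowels(word: str) -> str:
    # [aeiou] >   : delete every listed vowel
    return re.sub(r"[aeiou]", "", word)

def drop_everything(word: str) -> str:
    # . >   : the dot matches any character, so the whole word is deleted
    return re.sub(r".", "", word)

print(drop_vowels("banana"))            # bnn
print(repr(drop_everything("banana")))  # ''
```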

Non-Latin alphabets

Custom orthography also supports non-Latin alphabets and scripts, such as Japanese, Chinese, Cyrillic and Georgian symbols, just to name a few.

Create katakana orthography:

ka > カ
ki > キ
ku > ク
ke > ケ
ko > コ
sa > サ
si > シ
su > ス
se > セ
so > ソ
ta > タ
ti > チ
tu > ツ
te > テ
to > ト
na > ナ
ni > ニ
nu > ヌ
ne > ネ
no > ノ
ha > ハ
hi > ヒ
hu > フ
he > ヘ
ho > ホ
ma > マ
mi > ミ
mu > ム
me > メ
mo > モ
ja > ヤ
ju > ユ
jo > ヨ
ra > ラ
ri > リ
ru > ル
re > レ
ro > ロ
wa > ワ
wi > ヰ
we > ヱ
wo > ヲ
a > ア
i > イ
u > ウ
e > エ
o > オ
n > ン
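The katakana list puts the two-letter syllables before the bare vowels and n, which matters because rules are applied in order. Here is a rough Python sketch using a small subset of the rules above; it assumes each rule is a plain find-and-replace, and the names apply_rules, SYLLABLES_FIRST and VOWELS_FIRST are hypothetical.

```python
import re

def apply_rules(word, rules):
    """Apply (pattern, replacement) rules in order, each on the previous result."""
    for pattern, replacement in rules:
        word = re.sub(pattern, replacement, word)
    return word

# A subset of the list above, in the same order: syllables, then vowels, then n.
SYLLABLES_FIRST = [("ka", "カ"), ("ta", "タ"), ("na", "ナ"), ("a", "ア"), ("n", "ン")]

# The same rules with the single-letter ones moved to the front.
VOWELS_FIRST = [("a", "ア"), ("n", "ン"), ("ka", "カ"), ("ta", "タ"), ("na", "ナ")]

print(apply_rules("katana", SYLLABLES_FIRST))  # カタナ
print(apply_rules("kan", SYLLABLES_FIRST))     # カン
print(apply_rules("katana", VOWELS_FIRST))     # kアtアンア
```

With the single-letter rules first, every a and n is converted before the syllable rules get a chance to match, leaving stray k and t behind.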

Created and designed in Sydney, Australia.
Vulgarlang.com © 2017.