Sound changes

Sound change rules (also known as phonological rules) can be applied to all words using the Sound Change field. This notation can also be used in affix rules, Custom Spelling, and the Illegal combinations field.

A basic rule like a > o means change every a to o. Try it yourself:

Input Rule Output
alabama
a > o

You can also specify that the change should only happen in a particular environment: a > o / _b means change every a to o if it comes before a b. The / means "in the environment of" and _ marks the position of the a.

Input Rule Output
alabama
a > o / _b
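
As a rough illustration of what these two rules do, here is a minimal Python sketch that approximates them with regular expressions. It is not how Vulgar works internally; the helper name change_before is hypothetical and only mirrors the a > o / _b rule.

```python
import re

def change_before(word: str, target: str, replacement: str, following: str) -> str:
    """Replace `target` with `replacement` only when `following` comes next."""
    # (?=...) is a lookahead: it checks the environment without consuming it,
    # which plays the role of the text after the _ in the rule.
    pattern = re.escape(target) + "(?=" + re.escape(following) + ")"
    return re.sub(pattern, replacement, word)

# a > o (no environment): every a changes.
unconditional = "alabama".replace("a", "o")
# a > o / _b: only the a directly before b changes.
conditional = change_before("alabama", "a", "o", "b")

print(unconditional)  # olobomo
print(conditional)    # alobama
```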

Be aware that letters without diacritics are treated as different phonemes to letters with diacritics. Therefore, t > p does not change tʰ:

Input Rule Output
tatʰat
t > p
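
One way to picture this behaviour is to split a word into phonemes first, gluing each diacritic onto the letter before it, and only then compare whole phonemes. The sketch below does that in Python under a simplifying assumption: anything in the Unicode categories Lm or Mn counts as a diacritic. The function names are hypothetical and this is not Vulgar's actual tokenizer.

```python
import unicodedata

def phonemes(word: str) -> list[str]:
    """Split a word into phonemes, attaching diacritics to the preceding letter."""
    result: list[str] = []
    for ch in word:
        # Assumption: Lm (modifier letters such as ʰ) and Mn (combining marks)
        # are diacritics. Stress and length marks are also Lm, so a real
        # tokenizer would need a more careful list.
        is_diacritic = unicodedata.category(ch) in ("Lm", "Mn")
        if is_diacritic and result:
            result[-1] += ch
        else:
            result.append(ch)
    return result

def replace_phoneme(word: str, target: str, replacement: str) -> str:
    """Replace only phonemes that are exactly `target`, diacritics included."""
    return "".join(replacement if p == target else p for p in phonemes(word))

print(phonemes("tatʰat"))                   # ['t', 'a', 'tʰ', 'a', 't']
print(replace_phoneme("tatʰat", "t", "p"))  # patʰap
```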

Word boundaries

# signifies a word boundary. Notice how the following pattern only changes the first a.

Input Rule Output
alabama
a > o / #_

Sets

Capture a set of phonemes inside curly brackets: {b,m} matches b or m.

Input Rule Output
alabama
{b,m} > p
alabama
a > o / _{b,m}

Note: Curly brackets cannot be used on the replacement side of the > symbol. Instead, the bar symbol | creates a random choice between replacement options: b > m | n | b changes b to m or n or b. All three options are equally probable, but can be weighted with the * symbol: b > m*3 | n*2 | b is the same as b > m | m | m | n | n | b.
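
The weighting can be thought of as a weighted random draw. The following Python sketch parses a replacement such as m*3 | n*2 | b into options and weights and then picks one. The function names parse_options and pick are hypothetical, and the parsing is an assumption about the notation, not Vulgar's own parser.

```python
import random

def parse_options(replacement: str) -> tuple[list[str], list[int]]:
    """Turn 'm*3 | n*2 | b' into (['m', 'n', 'b'], [3, 2, 1])."""
    options, weights = [], []
    for part in replacement.split("|"):
        text, _, weight = part.strip().partition("*")
        options.append(text.strip())
        # An option without *N counts once.
        weights.append(int(weight) if weight else 1)
    return options, weights

def pick(replacement: str, rng: random.Random) -> str:
    """Choose one option at random, respecting the weights."""
    options, weights = parse_options(replacement)
    return rng.choices(options, weights=weights, k=1)[0]

options, weights = parse_options("m*3 | n*2 | b")
print(options, weights)  # ['m', 'n', 'b'] [3, 2, 1]
print(pick("m*3 | n*2 | b", random.Random()))  # m, n or b
```

With these weights, m is chosen about half of the time (3 out of 6), n a third of the time, and b the rest.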

C can be used for "any consonant" and V for "any vowel":

Input Rule Output
alabama
V > o / C_
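
For readers who know regular expressions, sets and the C and V classes behave much like character classes. The Python sketch below approximates the three rules from this section. The VOWELS and CONSONANTS strings are simplified stand-ins that only cover plain Latin letters; Vulgar's own classes cover IPA symbols.

```python
import re

# Assumption: tiny stand-ins for Vulgar's V and C classes.
VOWELS = "aeiou"
CONSONANTS = "bcdfghjklmnpqrstvwxyz"

# {b,m} > p : a set on the match side is like a character class.
set_match = re.sub("[bm]", "p", "alabama")
# a > o / _{b,m} : a set in the environment becomes a lookahead.
set_environment = re.sub("a(?=[bm])", "o", "alabama")
# V > o / C_ : a preceding environment becomes a lookbehind.
after_consonant = re.sub(f"(?<=[{CONSONANTS}])[{VOWELS}]", "o", "alabama")

print(set_match)        # alapapa
print(set_environment)  # aloboma
print(after_consonant)  # alobomo
```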

Additionally, X is "any phoneme", D is any IPA letter (does not capture diacritics) and (superscript D) captures any diacritic symbol. To capture other classes of phonemes, such as stops, nasals, etc., you should use distinctive feature notation.

Optional patterns

Round brackets make a pattern optional:

Input Rule Output
llama hat
a > o / _(C)#

Adding a star symbol * means zero or more instances of the pattern:

Input Rule Output
llama hats
a > o / _(C)*#
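
The difference between (C) and (C)* is the same as the difference between ? and * in regular expressions. Here is a minimal Python sketch of both rules; CONSONANTS is a simplified stand-in for the C class, and \b stands in for the # word boundary.

```python
import re

# Assumption: a simplified stand-in for Vulgar's C class.
CONSONANTS = "bcdfghjklmnpqrstvwxyz"

# a > o / _(C)#  : at most one consonant between a and the word end.
once = f"a(?=[{CONSONANTS}]?\\b)"
# a > o / _(C)*# : any number of consonants between a and the word end.
many = f"a(?=[{CONSONANTS}]*\\b)"

print(re.sub(once, "o", "llama hat"))   # llamo hot
print(re.sub(once, "o", "llama hats"))  # llamo hats
print(re.sub(many, "o", "llama hats"))  # llamo hots
```

The word hats is only affected by the starred version, because two consonants separate its a from the end of the word.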

Deletion

The null symbol ∅ signifies nothing and can be used to delete phonemes:

Input Rule Output
alabama
C > ∅

You can also literally write nothing: C >. Note that the null character (U+2205) should not be confused with the Scandinavian letter Ø (U+00D8).

Insertion

The null symbol can be useful for inserting infixes into a certain position of the word:

Input Rule Output
burro
∅ > it / _V#
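
Deletion and insertion are two sides of the same idea: one replaces something with nothing, the other replaces nothing with something. The Python sketch below approximates C > ∅ and ∅ > it / _V# with regular expressions. The class strings are simplified stand-ins, and a zero-width lookahead plays the role of the ∅ on the match side.

```python
import re

# Assumption: tiny stand-ins for Vulgar's V and C classes.
VOWELS = "aeiou"
CONSONANTS = "bcdfghjklmnpqrstvwxyz"

# C > ∅ : replace every consonant with the empty string.
deleted = re.sub(f"[{CONSONANTS}]", "", "alabama")

# ∅ > it / _V# : match the empty position right before a word-final vowel
# and put "it" there.
inserted = re.sub(f"(?=[{VOWELS}]\\b)", "it", "burro")

print(deleted)   # aaaa
print(inserted)  # burrito
```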

Exceptions to rules

The ! symbol signifies an exception to a rule: m > n / !_e means m changes to n everywhere except before e. Similarly, m > n / !e_ means m changes to n everywhere except after e.

Exception rules can also be added to the end of a normal rule, e.g. m > n / _e !s_ means m changes to n before e, except after s.
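
In regular-expression terms, an exception behaves like a negative lookahead or lookbehind. The following Python sketch approximates the three rules described in this section. The input words (mema, mesme) are made-up test strings, not examples from Vulgar.

```python
import re

# m > n / !_e : everywhere except before e (negative lookahead).
not_before_e = re.sub("m(?!e)", "n", "mema")
# m > n / !e_ : everywhere except after e (negative lookbehind).
not_after_e = re.sub("(?<!e)m", "n", "mema")
# m > n / _e !s_ : before e, except after s.
before_e_not_after_s = re.sub("(?<!s)m(?=e)", "n", "mesme")

print(not_before_e)          # mena
print(not_after_e)           # nema
print(before_e_not_after_s)  # nesme
```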

Distinctive features

Square brackets are used to specify the presence (+) or absence (-) of distinctive features of a phoneme. The following rule nasalizes a vowel before another nasal:

Input Rule Output
bon
V > [+nasal] / _[+nasal]

[+nasal] matches both nasal vowels and nasal consonants. But you could specify that you only want it to affect nasal consonants with [C +nasal]:

Input Rule Output
bon boõ
V > [+nasal] / _[C +nasal]

Various features can be used to narrow down the match. The following rule deletes voiceless stops, which in this case is only t (d is a stop but is voiced, and h is voiceless but not a stop):

Input Rule Output
tahad
[+stop -voice] > ∅
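
Feature matching can be pictured as set membership: a phoneme matches when it has every + feature and none of the - features. The Python sketch below applies that idea to the rule above. The FEATURES table is a tiny hypothetical one covering only the four phonemes in tahad; it is not Vulgar's feature data.

```python
# Assumption: a hand-written feature table for just the phonemes in "tahad".
FEATURES = {
    "t": {"consonant", "stop"},
    "d": {"consonant", "stop", "voice"},
    "h": {"consonant", "fricative"},
    "a": {"vowel", "voice"},
}

def matches(phoneme: str, present: set[str], absent: set[str]) -> bool:
    """True if the phoneme has all `present` features and none of `absent`."""
    features = FEATURES[phoneme]
    return present <= features and not (absent & features)

def delete_matching(word: str, present: set[str], absent: set[str]) -> str:
    """Approximate '[+present -absent] > ∅' for single-character phonemes."""
    return "".join(p for p in word if not matches(p, present, absent))

# [+stop -voice] > ∅
print(delete_matching("tahad", {"stop"}, {"voice"}))  # ahad
```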

Here is a complete list:

Feature Meaning
[+affricate] affricate consonants
[+alveolar] alveolar consonants
[+approx] approximants
[+back] back vowels
[+click] click consonants
C or [+consonant] consonants
[+cons] consonantals (different to consonants - see Wikipedia)
[+cont] continuants
[+dorsal] dorsals
[+fricative] fricative consonants
[+front] front vowels
[+glottal] glottal consonants
[+high] high vowels
[+implosive] implosive consonants
[+labial] labial consonants and round vowels
[+laryngeal] laryngeal consonants
[+lat] lateral consonants
[+liquid] liquid consonants
[+long] long vowels or long (geminate) consonants
[+low] low vowels
[+nasal] nasal consonants or nasal vowels
[C +nasal] nasal consonants
[V +nasal] nasal vowels
[+palatal] palatal consonants
[+retroflex] retroflex consonants
[+round] round vowels
[+son] sonorants
[+stop] stop consonants (a.k.a. plosives)
[V +stress] stressed vowels
[+tap] tap consonants
[+trill] trill consonants
[+uvular] uvular consonants
[+velar] velar consonants
[+voice] voiced consonant or vowel (unless specifically a voiceless vowel)
[C +voice] voiced consonant
[V +voice] voiced vowels
V or [+vowel] vowels

Syllables

% signifies a syllable boundary, and sigma σ matches a whole syllable. The difference between the two is that a syllable boundary also includes word boundaries. So a rule such as a > e / _% would match a at the end of a word, whereas a > e / _σ wouldn't, because it's looking for a whole syllable after it.

Stress

You can use the feature [V ±stress]. For example, [V -stress] > ə mimics English's tendency to turn unstressed vowels into ə. However, you cannot make rules that modify the word's stress location (we're still working on this feature!). You can also target an individual vowel: [ɪ +stress] > i.

Additionally, [V ±stress] does not work for spelling rules. (The generator can't tell where syllables begin and end once it's dealing with your unique spelling system, and not pure IPA.) Instead, you must check the "Make spelling rules sensitive to stress symbol" option and make rules that account for the ˈ stress symbol. The following rule looks for the stress mark and any optional consonants before the vowel. A separate rule will be needed to remove the stress symbols:

Input Rule Output
ˈama ˈdrama draˈma
a > á / ˈ(C)*_
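
The two-step approach described above (mark the stressed vowel, then strip the stress symbol) can be sketched in Python as follows. This is only an approximation with regular expressions; CONSONANTS is a simplified stand-in for the C class, and the second step stands in for the separate rule that removes the stress symbols.

```python
import re

# Assumption: a simplified stand-in for Vulgar's C class.
CONSONANTS = "bcdfghjklmnpqrstvwxyz"

def accent_stressed_a(text: str) -> str:
    """Approximate 'a > á / ˈ(C)*_' by capturing the stress mark and consonants."""
    # Python lookbehinds must be fixed-width, so the variable-length
    # environment is captured as group 1 and written back unchanged.
    return re.sub(f"(ˈ[{CONSONANTS}]*)a", r"\1á", text)

def strip_stress(text: str) -> str:
    """The separate clean-up step: remove every ˈ stress symbol."""
    return text.replace("ˈ", "")

accented = accent_stressed_a("ˈama ˈdrama draˈma")
print(accented)                # ˈáma ˈdráma draˈmá
print(strip_stress(accented))  # áma dráma dramá
```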

Same phoneme in a set

There may be scenarios where you want to match two of the same phonemes from a class. To do this, use a matching subscript number (₁₂₃₄₅₆₇₈₉) after the uppercase class. For example, V₁V₁ means two of the same vowel in a row:

Input Rule Output
taotii
t > d / _V₁V₁

Different numbers can signify that they are different phonemes. The following pattern matches two consonants in a row and swaps them:

Input Rule Output
ask
C₁C₂ > C₂C₁

If subscripts are left out, the program will make certain assumptions about what the rule means:

Input Rule Output
ask
CC > CiC

Subscripts can also be used after distinctive features: [C +nasal]₁; or sets: {n,m}₁. Multi-patterns can be captured inside curly brackets too:

Input Rule Output
tada
{CV}₁{CV}₂ > {CV}₂{CV}₁

For ease of typing, regular numbers also work: C1C2 > C2C1 is the same as C₁C₂ > C₂C₁
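
Subscripts work much like capture groups and backreferences in regular expressions. The Python sketch below approximates t > d / _V₁V₁ and C₁C₂ > C₂C₁. The class strings are simplified stand-ins, and unlike the description above, this regex version of C₁C₂ would also match two identical consonants.

```python
import re

# Assumption: tiny stand-ins for Vulgar's V and C classes.
VOWELS = "aeiou"
CONSONANTS = "bcdfghjklmnpqrstvwxyz"

# t > d / _V₁V₁ : \1 must repeat exactly what the first group matched.
same_vowel = re.sub(f"t(?=([{VOWELS}])\\1)", "d", "taotii")

# C₁C₂ > C₂C₁ : capture both consonants and write them back swapped.
swapped = re.sub(f"([{CONSONANTS}])([{CONSONANTS}])", r"\2\1", "ask")

print(same_vowel)  # taodii
print(swapped)     # aks
```

Only the second t in taotii changes, because ao is two different vowels while ii is the same vowel twice.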

Reduplication

Reduplication is a process in some languages where the whole word, or part of the word, is repeated exactly (or with a slight change). It usually plays a grammatical role (such as forming the plural of a word) rather than acting as a global sound change.

A quick-and-dirty rule to do a full word reduplication is X(X)* > @@, where X matches at least one phoneme, (X)* matches any number of any remaining phonemes, and @ replaces the whole match with itself (doubled @@):

Input Rule Output
bye
X(X)* > @@

Partial reduplication comes in many forms, and there are a few strategies for expressing them as rules. The Pangasinan language (Philippines) may reduplicate the first CVC pattern:

Input Rule Output
baley
#CVC > @@

However, in a more complicated example, it repeats only the first consonant and the first vowel. You could express this as being inserted at the beginning of the word with subscript numbers:

Input Rule Output
plato
∅ > C₁V₁ / #_C₁C₂V₁
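
The three reduplication rules in this section can be approximated in Python with regular expressions, where \g<0> plays the role of @ (the whole match). This is a sketch for intuition only; the class strings are simplified stand-ins for Vulgar's C and V.

```python
import re

# Assumption: tiny stand-ins for Vulgar's V and C classes.
VOWELS = "aeiou"
CONSONANTS = "bcdfghjklmnpqrstvwxyz"

# X(X)* > @@ : double the whole word.
whole = re.sub(r"\w+", r"\g<0>\g<0>", "bye")

# #CVC > @@ : double the word-initial CVC sequence.
first_cvc = re.sub(f"\\b[{CONSONANTS}][{VOWELS}][{CONSONANTS}]", r"\g<0>\g<0>", "baley")

# ∅ > C₁V₁ / #_C₁C₂V₁ : copy the first consonant and first vowel to the front.
# The lookahead captures them without consuming anything.
first_cv = re.sub(f"^(?=([{CONSONANTS}])[{CONSONANTS}]([{VOWELS}]))", r"\1\2", "plato")

print(whole)      # byebye
print(first_cvc)  # balbaley
print(first_cv)   # paplato
```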

First and last match

In some cases, you may want to only change the first or last match in the word. Use << for first match and >> for last match:

Input Rule Output
alabama
C >> x

Input Rule Output
alabama
C << x
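
As an illustration of the first-match and last-match behaviour described above, here is a minimal Python sketch. The function names are hypothetical and CONSONANTS is a simplified stand-in for the C class.

```python
import re

# Assumption: a simplified stand-in for Vulgar's C class.
CONSONANTS = "bcdfghjklmnpqrstvwxyz"

def first_match(word: str, replacement: str) -> str:
    """Change only the first consonant (count=1 stops after one match)."""
    return re.sub(f"[{CONSONANTS}]", replacement, word, count=1)

def last_match(word: str, replacement: str) -> str:
    """Change only the last consonant: the greedy (.*) skips as far as it can."""
    return re.sub(f"(.*)[{CONSONANTS}]", lambda m: m.group(1) + replacement, word, count=1)

print(first_match("alabama", "x"))  # axabama
print(last_match("alabama", "x"))   # alabaxa
```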

Conditional statements

There may be scenarios where you want to create sound changes based on whether a pattern exists in a word, and apply a different change if the pattern doesn't exist. Use the key words IF this pattern is found THEN apply this sound change ELSE apply a different sound change. The following rule tests if the word starts with a. If it doesn't, it applies a totally different sound change:

Input Rule Output
alabama
IF #a THEN a > o ELSE a > e

The ELSE condition is optional and can be omitted. Multiple IF statements are permitted:

Input Rule Output
mars
IF #s THEN #s > z IF s# THEN s# > z
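
The logic of these conditional rules maps onto ordinary if/else statements. The Python sketch below mirrors both examples. It assumes that multiple IF statements are evaluated independently and in order, which is a guess about Vulgar's behaviour, and the word banana is a made-up input used to show the ELSE branch.

```python
import re

def if_then_else(word: str) -> str:
    """Approximate 'IF #a THEN a > o ELSE a > e'."""
    if re.search("^a", word):
        return word.replace("a", "o")
    return word.replace("a", "e")

def chained_ifs(word: str) -> str:
    """Approximate 'IF #s THEN #s > z IF s# THEN s# > z'."""
    # Assumption: each IF is checked on its own, one after the other.
    if re.search("^s", word):
        word = re.sub("^s", "z", word)
    if re.search("s$", word):
        word = re.sub("s$", "z", word)
    return word

print(if_then_else("alabama"))  # olobomo
print(if_then_else("banana"))   # benene
print(chained_ifs("mars"))      # marz
```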

Non-Latin alphabets

Custom spelling supports all Unicode alphabets and scripts, such as Japanese, Chinese, Cyrillic and even Unicode Emojis.

Order of rules

The order of rules matters. Vulgar applies the first rule to a word as a find-and-replace, then applies the next rule on top of the result. This can be a problem if an IPA symbol appears in multiple patterns. For instance, the following ordering is problematic:

ʊ > u
aʊ > ow

The intent here is for the diphthong aʊ (as in the word "cow") to change to ow. However, the previous rule has already changed every ʊ (as in the word "put") to u, which means aʊ has already changed to au. The easiest solution is to reverse the order of the rules:

aʊ > ow
ʊ > u

Another solution is to use the exception ! symbol: ʊ > u / !a_.
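
The effect of rule order can be reproduced with a few lines of Python that apply each rule in turn to the result of the previous one. This is a sketch of the idea only: the apply_rules helper is hypothetical, the rules are written as regular expressions rather than Vulgar notation, and kaʊ and pʊt are rough transcriptions of "cow" and "put".

```python
import re

def apply_rules(word: str, rules: list[tuple[str, str]]) -> str:
    """Apply each (pattern, replacement) rule to the output of the previous one."""
    for pattern, replacement in rules:
        word = re.sub(pattern, replacement, word)
    return word

problematic = [("ʊ", "u"), ("aʊ", "ow")]
reordered = [("aʊ", "ow"), ("ʊ", "u")]
# ʊ > u / !a_ : the exception becomes a negative lookbehind.
with_exception = [("(?<!a)ʊ", "u"), ("aʊ", "ow")]

print(apply_rules("kaʊ", problematic))     # kau
print(apply_rules("kaʊ", reordered))       # kow
print(apply_rules("pʊt", reordered))       # put
print(apply_rules("kaʊ", with_exception))  # kow
```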