Sound changes

Sound change rules (also known as phonological rules) can be applied to all words using the Sound Change field. This notation can also be used in affix rules, Custom Spelling, and the Illegal combinations field.

A basic rule like a > o means change every a to o:

Input Rule Output
alabama
a > o
olobomo

You can also specify that the change should only happen in a particular environment: a > o / _b means change every a to o if it comes before a b. The / means "in the environment of" and _ marks the position of the a.

Input Rule Output
alabama
a > o / _b
alobama
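
For readers who know regular expressions, the environment syntax behaves much like a lookahead. The following Python sketch is only an analogy using the standard re module, not a description of how Vulgar works internally; the variable names are made up for illustration.

```python
import re

# a > o : replace every a, with no environment
plain = re.sub(r"a", "o", "alabama")

# a > o / _b : replace a only when the next character is b
before_b = re.sub(r"a(?=b)", "o", "alabama")

print(plain)     # olobomo
print(before_b)  # alobama
```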

Be aware that letters without diacritics are treated as different phonemes to letters with diacritics. Therefore, t > p does not change tʰ:

Input Rule Output
tatʰat
t > p
patʰap
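
One way to picture this behaviour is a replacement that refuses to touch a t when a diacritic follows it. This Python sketch assumes ʰ is the only diacritic in play and is an analogy only, not Vulgar's own logic.

```python
import re

# t > p, but leave aspirated tʰ alone: skip any t that is followed by ʰ
result = re.sub(r"t(?!ʰ)", "p", "tatʰat")

print(result)  # patʰap
```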

Word boundaries

# signifies a word boundary. Notice how the following pattern only changes the first a.

Input Rule Output
alabama
a > o / #_
olabama

You must use ## for rules that go across word boundaries:

Input Rule Output
ooh la la
a > o / _##l
ooh lo la

Similarly, a > ## changes every a to a space; ## > a changes every space to a.
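
As a rough analogy in Python (assuming words are separated by single spaces, and not claiming this is how Vulgar processes rules), the two boundary examples above could be mimicked like this:

```python
import re

# a > o / #_ : a at the start of a word (nothing but whitespace before it)
first = re.sub(r"(?<!\S)a", "o", "alabama")

# a > o / _##l : a, then a space, then l at the start of the next word
across = re.sub(r"a(?= l)", "o", "ooh la la")

print(first)   # olabama
print(across)  # ooh lo la
```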

Classes and sets

C matches any consonant and V matches any vowel:

Input Rule Output
alabama
V > o / C_
alobomo

Additionally, X matches any phoneme at all, with or without diacritics on it; D matches any IPA letter (it does not capture diacritics); and (superscript D) captures any diacritic symbol.

Capture a custom set of phonemes inside curly brackets: {b,m} matches b or m.

Input Rule Output
alabama
{b,m} > p
alapapa

Input Rule Output
alabama
a > o / _{b,m}
aloboma
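
Classes and sets map loosely onto regular-expression character classes. In the Python sketch below, the strings C and V are simplified stand-ins that only cover plain Latin letters (an assumption for illustration; Vulgar's real classes cover IPA).

```python
import re

# Simplified stand-ins for the C and V classes (assumption: ASCII letters only)
C = "bcdfghjklmnpqrstvwxyz"
V = "aeiou"

# V > o / C_ : any vowel directly after a consonant
after_consonant = re.sub(f"(?<=[{C}])[{V}]", "o", "alabama")

# {b,m} > p : a custom set behaves like a character class
to_p = re.sub(r"[bm]", "p", "alabama")

# a > o / _{b,m} : a before either member of the set
before_set = re.sub(r"a(?=[bm])", "o", "alabama")

print(after_consonant, to_p, before_set)  # alobomo alapapa aloboma
```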

To capture other classes of phonemes with common features, such as stops, nasals, etc., you should use distinctive feature notation.

Note: Curly brackets cannot be used on the replacement side of the > symbol. Instead, the bar symbol | creates a random choice between replacement options: b > m | n | b changes b to m or n or b. All three options are equally probable, but they can be weighted with the * symbol: b > m*3 | n*2 | b is the same as b > m | m | m | n | n | b.
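
The weighting can be illustrated with a small Python sketch. The helper parse_options is hypothetical and only mirrors the description above; it is not Vulgar's parser.

```python
import random

def parse_options(replacement):
    """Split 'm*3 | n*2 | b' into (['m', 'n', 'b'], [3, 2, 1])."""
    options, weights = [], []
    for part in replacement.split("|"):
        symbol, _, weight = part.strip().partition("*")
        options.append(symbol.strip())
        weights.append(int(weight) if weight else 1)  # no *N means weight 1
    return options, weights

options, weights = parse_options("m*3 | n*2 | b")

# Writing each option out as many times as its weight gives the unweighted form
expanded = [o for o, w in zip(options, weights) for _ in range(w)]

# Draw a few replacements; the seed only makes the demo repeatable
rng = random.Random(0)
picks = rng.choices(options, weights=weights, k=6)
```

Here expanded is the list m, m, m, n, n, b, which matches the equivalence stated in the note.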

Optional patterns

Round brackets make a pattern optional:

Input Rule Output
llama hat
a > o / _(C)#
llamo hot

Adding a star symbol * after the closing bracket means zero or more instances of the pattern:

Input Rule Output
llama hats
a > o / _(C)*#
llamo hots

Finally, the star symbol without brackets means one or more instances of the pattern. E.g. C* matches one or more consonants.
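
In regular-expression terms, (C) is close to an optional item and (C)* to a repeated optional item. The Python sketch below is an analogy under the assumption of a simplified consonant class and space-separated words. Note that the bare C* described above corresponds to a regex +, not a regex *.

```python
import re

C = "bcdfghjklmnpqrstvwxyz"  # simplified consonant class (assumption)

# a > o / _(C)# : a, then at most one consonant, then the end of the word
optional = re.sub(rf"a(?=[{C}]?(?:\s|$))", "o", "llama hat")

# a > o / _(C)*# : a, then any number of consonants, then the end of the word
zero_or_more = re.sub(rf"a(?=[{C}]*(?:\s|$))", "o", "llama hats")

print(optional)      # llamo hot
print(zero_or_more)  # llamo hots
```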

Deletion

The null symbol ∅ signifies nothing and can be used to delete phonemes:

Input Rule Output
alabama
C > ∅
aaaa

You can also literally write nothing: C >. Note that the null character ∅ (U+2205) should not be confused with the Scandinavian letter Ø (U+00D8).

Insertion

The null symbol can be useful for inserting infixes into a certain position of the word:

Input Rule Output
burro
∅ > it / _V#
burrito
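
Deletion and insertion can both be pictured as ordinary substitutions, one with an empty replacement and one with an empty match. This Python sketch is an analogy with simplified letter classes (an assumption), not Vulgar's implementation.

```python
import re

C = "bcdfghjklmnpqrstvwxyz"  # simplified consonant class (assumption)
V = "aeiou"                  # simplified vowel class (assumption)

# C > ∅ : replace every consonant with nothing
deleted = re.sub(f"[{C}]", "", "alabama")

# ∅ > it / _V# : match the empty position before a word-final vowel
inserted = re.sub(f"(?=[{V}]$)", "it", "burro")

print(deleted)   # aaaa
print(inserted)  # burrito
```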

Exceptions to rules

The ! symbol signifies an exception to a rule: m > n / !_e means m changes to n everywhere except before e. Similarly, m > n / !e_ means m changes to n everywhere except after e.

Exception rules can also be added to the end of a normal rule, e.g. m > n / _e !s_ means m changes to n before e, except after s.
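
Exceptions behave much like negative lookarounds in regular expressions. In the Python sketch below, the test words mema and mesme are invented for illustration and the code is an analogy only.

```python
import re

# m > n / !_e : everywhere except before e
not_before_e = re.sub(r"m(?!e)", "n", "mema")

# m > n / !e_ : everywhere except after e
not_after_e = re.sub(r"(?<!e)m", "n", "mema")

# m > n / _e !s_ : before e, except after s
combined = re.sub(r"(?<!s)m(?=e)", "n", "mesme")

print(not_before_e, not_after_e, combined)  # mena nema nesme
```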

Distinctive features

Square brackets are used to specify the presence (+) or absence (-) of distinctive features of a phoneme. The following rule nasalizes a vowel before another nasal:

Input Rule Output
bon
V > [+nasal] / _[+nasal]
bõn

[+nasal] matches both nasal vowels and nasal consonants. But you could specify that you only want it to affect nasal consonants with [C +nasal]:

Input Rule Output
bon boõ
V > [+nasal] / _[C +nasal]
bõn boõ

Various features can be used to narrow down the match. The following rule deletes voiceless stops, which in this case is only t (d is a stop but is voiced, and h is voiceless but not a stop):

Input Rule Output
tahad
[+stop -voice] > ∅
ahad
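
Feature matching can be thought of as a lookup in a table of phoneme properties. The table below is a tiny hypothetical one covering only the four phonemes in the example, and the function names are made up; Vulgar's real feature data is far larger.

```python
# Tiny hypothetical feature table; a real one would cover the whole IPA
FEATURES = {
    "t": {"stop": True, "voice": False},
    "d": {"stop": True, "voice": True},
    "h": {"stop": False, "voice": False},
    "a": {"stop": False, "voice": True},
}

def matches(phoneme, wanted):
    """True if the phoneme has every requested feature value."""
    feats = FEATURES[phoneme]
    return all(feats[name] == value for name, value in wanted.items())

def delete_matching(word, wanted):
    """Apply '[features] > ∅' to a word written one character per phoneme."""
    return "".join(p for p in word if not matches(p, wanted))

# [+stop -voice] > ∅
result = delete_matching("tahad", {"stop": True, "voice": False})
print(result)  # ahad
```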

Here is a complete list of features:

Feature Meaning
[+affricate] affricate consonants
[+alveolar] alveolar consonants
[+approx] approximants
[+back] back vowels
[+click] click consonants
C or [+consonant] consonants
[+cons] consonantals (different to consonants - see Wikipedia)
[+cont] continuants
[+dorsal] dorsals
[+fricative] fricative consonants
[+front] front vowels
[+glottal] glottal consonants
[+high] high vowels
[+implosive] implosive consonants
[+labial] labial consonants and round vowels
[+laryngeal] laryngeal consonants
[+lat] lateral consonants
[+liquid] liquid consonants
[+long] long vowels or long (geminate) consonants
[+low] low vowels
[+nasal] nasal consonants or nasal vowels
[C +nasal] nasal consonants
[V +nasal] nasal vowels
[+palatal] palatal consonants
[+retroflex] retroflex consonants
[+round] round vowels
[+son] sonorants
[+stop] stop consonants (a.k.a. plosives)
[V +stress] stressed vowels
[+tap] tap consonants
[+trill] trill consonants
[+uvular] uvular consonants
[+velar] velar consonants
[+voice] voiced consonant or vowel (unless specifically a voiceless vowel)
[C +voice] voiced consonant
[V +voice] voiced vowels
V or [+vowel] vowels

Assimilation

Assimilation is the process where a phoneme takes on a distinctive feature of another phoneme. An example of this in English can be seen in the prefix in-, which gives a word the opposite meaning, e.g. tolerant becomes intolerant. However, if the word begins with m or p, notice how the prefix becomes im-, as in immovable and impossible. This is because the nasal consonant n assimilates to the "place of articulation" of the following phoneme; m and p are labial (pronounced with a closure of the lips), so n (alveolar nasal) becomes m (labial nasal). Meanwhile, n and t are both alveolar, so in- stays unchanged before t. And although it is not reflected in the spelling, the word incomplete is actually pronounced with a ŋ [iŋkomplit]; n assimilates to the velar place of k.

It would be possible to capture this process with two separate rules: n > [+labial] / _[+labial] and n > [+velar] / _[+velar]. However, we can write this as a single rule, n > [@place] / _[@place]:

Input Rule Output
inpossible inkomplete
n > [@place] / _[@place]
impossible iŋkomplete

This is currently an experimental feature, and works for @place, @manner and @voice for consonants, and @height and @backness for vowels.
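
The idea behind @place can be sketched in Python as copying the place of the following consonant onto n. The two lookup tables are hypothetical and deliberately small, and the function only handles n; it is meant to show the concept, not Vulgar's actual algorithm.

```python
# Hypothetical, very small place-of-articulation table
PLACE = {"p": "labial", "b": "labial", "m": "labial",
         "t": "alveolar", "d": "alveolar", "n": "alveolar",
         "k": "velar", "g": "velar"}

# The nasal consonant found at each place
NASAL_AT = {"labial": "m", "alveolar": "n", "velar": "ŋ"}

def assimilate_n(word):
    """n > [@place] / _[@place], for the consonants listed in PLACE."""
    out = list(word)
    for i in range(len(out) - 1):
        place = PLACE.get(out[i + 1])  # None for vowels and unlisted letters
        if out[i] == "n" and place is not None:
            out[i] = NASAL_AT[place]
    return "".join(out)

print(assimilate_n("inpossible"))  # impossible
print(assimilate_n("inkomplete"))  # iŋkomplete
print(assimilate_n("intolerant"))  # intolerant
```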

Syllables

% signifies a syllable boundary, and sigma σ matches a whole syllable. The difference between the two is that a syllable boundary also includes word boundaries. So a rule such as a > e / _% would match a at the end of a word, whereas a > e / _σ wouldn't, because it's looking for a whole syllable after it.

Stress

You can use the feature [V ±stress]. For example, [V -stress] > ə mimics English's tendency to turn unstressed vowels into ə. You can also target an individual vowel: [ɪ +stress] > i.

To change the stress location of a word, you must target the location of the vowel, e.g. V > [+stress] / _(C)*# stresses the last vowel in a word: _(C)*# means before any number of consonants followed by the word boundary.

Additionally, [V ±stress] does not work for spelling rules. (The generator can't tell where syllables begin and end once it's dealing with your unique spelling system, and not pure IPA.) Instead, you must check the "Make spelling rules sensitive to stress symbol" option and make rules that account for the ˈ stress symbol. The following rule looks for the stress mark and any optional consonants before the vowel. A separate rule will be needed to remove the stress symbols:

Input Rule Output
ˈama ˈdrama draˈma
a > á / ˈ(C)*_
ˈáma ˈdráma draˈmá
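
As an analogy, the two-step approach (mark the stressed vowel, then strip the stress symbol) might look like this in Python. The consonant class is a simplified assumption, and the clean-up step stands in for the separate rule ˈ > ∅.

```python
import re

C = "bcdfghjklmnpqrstvwxyz"  # simplified consonant class (assumption)

# a > á / ˈ(C)*_ : keep the stress mark and consonants, swap the vowel
marked = re.sub(f"(ˈ[{C}]*)a", r"\1á", "ˈama ˈdrama draˈma")

# Separate clean-up rule, ˈ > ∅ : remove the stress symbols afterwards
cleaned = marked.replace("ˈ", "")

print(marked)   # ˈáma ˈdráma draˈmá
print(cleaned)  # áma dráma dramá
```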

Same phoneme in a set

There may be scenarios where you want to match two of the same phoneme from a class. To do this, use a matching subscript number (₁₂₃₄₅₆₇₈₉) after the uppercase class. For example, V₁V₁ means two of the same vowel in a row:

Input Rule Output
taotii
t > d / _V₁V₁
taodii

Different numbers can signify that they are different phonemes. The following pattern matches two consonants in a row and swaps them:

Input Rule Output
ask
C₁C₂ > C₂C₁
aks

If subscripts are left out, the program will make certain assumptions about what the rule means:

Input Rule Output
ask
CC > CiC
asik

Subscripts can also be used after distinctive features: [C +nasal]₁; or sets: {n,m}₁. Multi-patterns can be captured inside curly brackets too:

Input Rule Output
tada
{CV}₁{CV}₂ > {CV}₂{CV}₁
data

For ease of typing, regular numbers also work: C1C2 > C2C1 is the same as C₁C₂ > C₂C₁.
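
Subscripts play a role similar to capture groups and backreferences in regular expressions. The Python sketch below reproduces the four examples in this section with simplified letter classes (an assumption); it is an analogy, not Vulgar's matching engine.

```python
import re

C = "bcdfghjklmnpqrstvwxyz"  # simplified consonant class (assumption)
V = "aeiou"                  # simplified vowel class (assumption)

# t > d / _V₁V₁ : the second vowel must repeat the first
same_vowel = re.sub(rf"t(?=([{V}])\1)", "d", "taotii")

# C₁C₂ > C₂C₁ : swap two consonants in a row
swapped = re.sub(rf"([{C}])([{C}])", r"\2\1", "ask")

# CC > CiC : keep both consonants in order and put i between them
broken_up = re.sub(rf"([{C}])([{C}])", r"\1i\2", "ask")

# {CV}₁{CV}₂ > {CV}₂{CV}₁ : swap two consonant-vowel pairs
pairs = re.sub(rf"([{C}][{V}])([{C}][{V}])", r"\2\1", "tada")

print(same_vowel, swapped, broken_up, pairs)  # taodii aks asik data
```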

Reduplication

Reduplication is a process in some languages where the whole word, or part of the word, is repeated exactly (or with a slight change). It often plays a grammatical role (such as forming the plural of a word) rather than acting as a global sound change.

A quick-and-dirty rule to do a full word reduplication is X* > __ where X matches any phoneme, * matches any number of X, and the underscore _ replaces the whole match with itself (doubled __):

Input Rule Output
bye
X* > __
byebye

Partial reduplication comes in many forms, and there are a few strategies for expressing them as rules. The Pangasinan language (Philippines) may reduplicate the first CVC pattern:

Input Rule Output
baley
#CVC > __
balbaley

However, in a more complicated example, it repeats only the first consonant and the first vowel. You could express this as being inserted at the beginning of the word with subscript numbers:

Input Rule Output
plato
∅ > C₁V₁ / #_C₁C₂V₁
paplato
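
The three reduplication rules can be mimicked with capture groups. This Python sketch assumes single-word inputs and simplified letter classes, and is an analogy rather than Vulgar's own method.

```python
import re

C = "bcdfghjklmnpqrstvwxyz"  # simplified consonant class (assumption)
V = "aeiou"                  # simplified vowel class (assumption)

# X* > __ : double the whole word
full = re.sub(r"^(.+)$", r"\1\1", "bye")

# #CVC > __ : double the first consonant-vowel-consonant sequence
cvc = re.sub(rf"^([{C}][{V}][{C}])", r"\1\1", "baley")

# ∅ > C₁V₁ / #_C₁C₂V₁ : copy the first consonant and first vowel to the front
cv = re.sub(rf"^([{C}])([{C}])([{V}])", r"\1\3\1\2\3", "plato")

print(full, cvc, cv)  # byebye balbaley paplato
```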

First and last match

In some cases, you may want to only change the first or last match in the word. Use << for first match and >> for last match:

Input Rule Output
alabama
C >> x
alabaxa

Input Rule Output
alabama
C << x
axabama
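
In Python terms, a first-match rule resembles a substitution limited to one replacement, and a last-match rule resembles a greedy prefix that pushes the match as far right as possible. This is an analogy with a simplified consonant class (an assumption) and single-word input.

```python
import re

C = "bcdfghjklmnpqrstvwxyz"  # simplified consonant class (assumption)

# C << x : change only the first consonant
first = re.sub(f"[{C}]", "x", "alabama", count=1)

# C >> x : the greedy (.*) consumes as much as it can, leaving the last consonant
last = re.sub(f"^(.*)[{C}]", r"\1x", "alabama")

print(first)  # axabama
print(last)   # alabaxa
```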

Conditional statements

There may be scenarios where you want to create sound changes based on whether a pattern exists in a word, and apply a different change if the pattern doesn't exist. Use the key words IF this pattern is found THEN apply this sound change ELSE apply a different sound change. The following rule tests if the word starts with a. If it doesn't, it applies a totally different sound change:

Input Rule Output
alabama
IF #a THEN a > o ELSE a > e
olobomo

The ELSE condition is optional and can be omitted. Multiple IF statements are permitted:

Input Rule Output
mars
IF #s THEN #s > z IF s# THEN s# > z
marz

In Vulgar, secondary IF statements are technically what other programming languages call “ELSE IF” statements. That is, they don’t get tested if a previous IF condition is satisfied. To create a truly independent IF statement, you need to break the rules up with a semicolon ; at the end of the first IF statement.
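
The difference between chained and independent IF statements can be sketched in Python as elif versus two separate if blocks. The function names and the test word sums are invented for illustration, and the code reflects the description above rather than Vulgar's internals.

```python
import re

def if_else(word):
    """IF #a THEN a > o ELSE a > e"""
    if word.startswith("a"):
        return word.replace("a", "o")
    return word.replace("a", "e")

def chained(word):
    """IF #s THEN #s > z IF s# THEN s# > z (second IF acts as ELSE IF)"""
    if word.startswith("s"):
        return re.sub(r"^s", "z", word)
    elif word.endswith("s"):
        return re.sub(r"s$", "z", word)
    return word

def independent(word):
    """The same two rules separated by ; so both conditions are tested."""
    if word.startswith("s"):
        word = re.sub(r"^s", "z", word)
    if word.endswith("s"):
        word = re.sub(r"s$", "z", word)
    return word

print(if_else("alabama"))   # olobomo
print(chained("mars"))      # marz
print(chained("sums"))      # zums (second IF skipped)
print(independent("sums"))  # zumz (both IFs tested)
```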

Non-Latin alphabets

Custom spelling supports all Unicode alphabets and scripts, such as Japanese, Chinese, Cyrillic and even Unicode emoji.

Order of rules

The order of rules matters. Vulgar applies the first rule to a word as a find-and-replace, then applies the next rule on top of the result. This can be a problem if an IPA symbol appears in multiple patterns. For instance, the following ordering is problematic:

ʊ > u
aʊ > ow

The intent here is for the diphthong aʊ (as in the word "cow") to change to ow. However, the previous rule has already changed every ʊ (as in the word "put") to u, which means aʊ has already changed to au. The easiest solution is to reverse the order of the rules:

aʊ > ow
ʊ > u

Another solution is to use the exception ! symbol: ʊ > u / !a_.
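
The effect of rule order can be reproduced with a small Python sketch that applies substitutions one after another. The helper apply_rules is hypothetical, and kaʊ and pʊt are rough IPA spellings of "cow" and "put" chosen for illustration.

```python
import re

def apply_rules(word, rules):
    """Apply (pattern, replacement) pairs one after another, in order."""
    for pattern, replacement in rules:
        word = re.sub(pattern, replacement, word)
    return word

# Problematic order: ʊ is rewritten before the diphthong rule can see it
wrong = apply_rules("kaʊ pʊt", [("ʊ", "u"), ("aʊ", "ow")])

# Reversed order: the diphthong is handled first
right = apply_rules("kaʊ pʊt", [("aʊ", "ow"), ("ʊ", "u")])

# Exception instead: ʊ > u / !a_ leaves the diphthong for the next rule
guarded = apply_rules("kaʊ pʊt", [(r"(?<!a)ʊ", "u"), ("aʊ", "ow")])

print(wrong)    # kau put
print(right)    # kow put
print(guarded)  # kow put
```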