My a priori conlang

Table of Contents

  1. Table of Contents
  2. The Generator
  3. About the Language
    1. Phonetics
      1. Consonants
      2. Vowels
    2. Writing Systems
    3. Grammar
      1. Overview
      2. Parts of Speech
      3. Word Order

The Generator

There is a Python script that generates a list of syllables. Its first lines define the phonemes of the language (consonants and vowels) and the names of the consonant groups. Next come the functions that generate the different files, followed by the main body with command-line parsing.

usage: generate_all_syllables.py [-h] [--table] [--list] [--markdown] [--output OUTPUT]

Print matrix in different formats

options:
  -h, --help       show this help message and exit
  --table          Print as table
  --list           Print as list
  --markdown       Print as markdown
  --output OUTPUT  Output file

There are four possible output files: list, table, Markdown list and Markdown table. The Markdown version of the list loads symbol definitions from the .txt file. The current algorithm for loading the definitions is not efficient, but it runs fast enough.
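The help text above can be reproduced with a small argparse setup. This is only a sketch reconstructed from the usage message, not the code of generate_all_syllables.py itself; the helper names build_parser and chosen_format, the fallback to "list" when no layout flag is given, and the file name out.md are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose options match the usage message shown above."""
    parser = argparse.ArgumentParser(
        prog="generate_all_syllables.py",
        description="Print matrix in different formats",
    )
    parser.add_argument("--table", action="store_true", help="Print as table")
    parser.add_argument("--list", action="store_true", help="Print as list")
    parser.add_argument("--markdown", action="store_true", help="Print as markdown")
    parser.add_argument("--output", help="Output file")
    return parser


def chosen_format(args: argparse.Namespace) -> str:
    """Name one of the four outputs (assumption: 'list' is the default layout)."""
    layout = "table" if args.table else "list"
    return f"markdown {layout}" if args.markdown else layout


# Hypothetical invocation; out.md is a made-up file name.
args = build_parser().parse_args(["--markdown", "--table", "--output", "out.md"])
print(chosen_format(args), "->", args.output)  # prints: markdown table -> out.md
```

Combining --markdown with --table or --list is one plausible way to reach four outputs with three flags; the real script may resolve the flags differently.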

About the Language

Phonetics

All possible syllables are displayed in this file.

Consonants

There are 16 consonants, which can be classified by two traits: a place of articulation and a manner of articulation. There are 4 places of articulation: back (Velar), up (Palatal), down (Alveolar) and front (Bilabial).

  • The first group is back, as these consonants are pronounced closest to the lungs;
  • Next comes up, pronounced with the body of the tongue raised;
  • Then down, with the body of the tongue lowered and the tip raised;
  • Finally front, pronounced with the frontmost part of the mouth: the lips.

There are 4 manners of articulation:

  • plosive: it completely blocks the airflow,
  • nasal: it allows airflow through the nose,
  • fricative: it allows airflow through the mouth,
  • approximant: it represents the phonemes that could be considered semi-vowels.

A consonant phoneme lies at the intersection of two groups of these two classifications. Let's list all the consonant phonemes:

| Letter | Place | Manner | IPA | English | Occurs in words |
|---|---|---|---|---|---|
| k | back | plosive | k | k | staCKer |
| g | back | nasal | ŋ or ŋɡ | ng | stiNG |
| h | back | fricative | x or h | kh, h | Scottish loCH, Russian Хурма (khurma) or Hot |
| j | back | approximant | j | y | Yellow |
| c | up | plosive | or t̠ʃ | ch, ć | Polish pisaĆ, Russian Через (cherez) or CHill |
| y | up | nasal | ɲ or ɲc | ny, ñ | espaÑol, caNYon or Russian коНЬКи (kon'ki) |
| x | up | fricative | ɕ or ʃ | sh, ś | Sure, Russian Щепка (sh'epka), Polish coŚ or SHame |
| l | up | approximant | ʎ or l | ll, l | miLLion or Lamb |
| t | down | plosive | t | t | Tall |
| n | down | nasal | n or nd | n, nd | Neck or eND |
| s | down | fricative | s or θ | s, th | Sad or THink |
| r | down | approximant | ɹ or r | r, rr | Ray or Spanish peRRo |
| p | front | plosive | p | p | Put |
| m | front | nasal | m or mb | m, mb | Mold or laMB |
| f | front | fricative | ɸ or f | f | Japanese Fuhai or Fall |
| w | front | approximant | w or ʋ | w | Well or Indian Vine |

As you can see, some variation is allowed, and not all phonemes match their standard IPA definitions. Namely, "j" is the back approximant here, and the trill "r" is allowed as the down approximant. This is by design, so that more people can pronounce every phoneme easily (I doubt that everyone can pronounce the real velar approximant). It is also worth noting that nasal phonemes may be pronounced as two-consonant clusters, namely ng, n'k', nd and mb, as not every language has a full inventory of nasal consonants. These clusters should be relatively easy to pronounce and distinguish in speech.
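The place-by-manner classification can be written down as a small lookup grid. The sketch below only restates the consonant table in Python; the names PLACES, MANNERS, GRID, CONSONANTS and classify are chosen for illustration and are not taken from generate_all_syllables.py.

```python
PLACES = ["back", "up", "down", "front"]
MANNERS = ["plosive", "nasal", "fricative", "approximant"]

# One string per place (same order as PLACES); letters follow MANNERS.
GRID = ["kghj", "cyxl", "tnsr", "pmfw"]

# (place, manner) -> letter, i.e. the intersection of the two classifications.
CONSONANTS = {
    (place, manner): GRID[row][col]
    for row, place in enumerate(PLACES)
    for col, manner in enumerate(MANNERS)
}


def classify(letter: str) -> tuple[str, str]:
    """Return the (place, manner) pair of a consonant letter."""
    for key, value in CONSONANTS.items():
        if value == letter:
            return key
    raise ValueError(f"{letter!r} is not a consonant of the language")


print(len(CONSONANTS))              # 16
print(CONSONANTS["back", "nasal"])  # g
print(classify("x"))                # ('up', 'fricative')
```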

Vowels

Vowels, like consonants, have two classifications: openness and roundness. There are 5 vowels: two rounded and three unrounded; two open, two close and one middle. Let's list all the vowel phonemes:

| Letter | Openness | Roundness | IPA | Occurs in words |
|---|---|---|---|---|
| a | open | unrounded | a or ɑ | Polish jAjo or fAther |
| e | middle | unrounded | or ɛ | Spanish bEbé or bEd |
| i | close | unrounded | i | frEE |
| u | close | rounded | u | bOOt |
| o | open | rounded | or ɔ | Spanish tOdo or bOy |

Generally the vowels are clear, as in Spanish, and don't form diphthongs. It is advised to put a glottal stop between vowels, but it should not sound like the "k" consonant phoneme.

Writing Systems

The main writing system is the Latin alphabet, which is used on this page (you can see it in the Phonetics section). However, if you wish, you can create your own script, or use mine. It is relatively easy to create a syllabary or abugida for the language, as consonants and vowels are already classified, and you only need to assign some graphemes to the classes.

In my custom script, there are four basic letters: <, c, b and ɛ. The set above is for the back consonants (it faces left).

  • Shape < is for plosives, as it's sharp and pointy, just like plosives;
  • Shape c (or u for down) is for nasals, as it's closed (rounded) on one side and open on the other, just like the mouth is closed and the nose is open;
  • Shape b (d for up, p for down and q for front) is for fricatives, as it looks like an obstruction in the way, or like a whistle;
  • Shape ɛ (w for down, m for up) is for approximants, as it consists of two parts approaching each other.

Facing left (<) is for back consonants (k, g, h, j), facing up (m) is for up (c, y, x, l), facing down is for down (t, n, s, r), and facing right is for front (p, m, f, w). For the vowels, you add (or don't add) diacritics:

  • - for unrounded,
  • v for rounded,
  • on top - for open
  • on bottom - for close.
  • For "e" you don't add any diacritics.

Example: the word kelanuifor (you don't need to know what it means, we're discussing only phonetics here) would look like <m̄ṷ'q̱w̌. Its syllables are ke, la, nu, if and or; to separate forward (CV) syllables from backward (VC) ones we can use '.
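The kelanuifor example can be traced with a short transliteration sketch. It is illustrative only: GLYPHS contains just the shapes named in the text above (the remaining consonants are left out rather than guessed), and the Unicode combining marks in VOWEL_MARKS are an assumption chosen to match how the example looks. The function name transliterate is also made up here.

```python
import unicodedata

# Base glyph per consonant; only shapes named in the text are listed.
GLYPHS = {
    "k": "<",                                # plosive: back
    "g": "c", "n": "u",                      # nasal: back, down
    "h": "b", "x": "d", "s": "p", "f": "q",  # fricative: back, up, down, front
    "j": "ɛ", "l": "m", "r": "w",            # approximant: back, up, down
}

# Assumed combining marks: "-" as a macron, "v" as a caron above or a
# circumflex below; top = open, bottom = close, nothing for "e".
VOWEL_MARKS = {
    "a": "\u0304",  # open unrounded: macron above
    "o": "\u030c",  # open rounded: caron above
    "i": "\u0331",  # close unrounded: macron below
    "u": "\u032d",  # close rounded: circumflex below
    "e": "",
}


def transliterate(word: str) -> str:
    """Write a word of two-letter syllables in the custom script."""
    if len(word) % 2:
        raise ValueError("a word must consist of two-letter syllables")
    out = []
    seen_vc = False
    for i in range(0, len(word), 2):
        first, second = word[i], word[i + 1]
        if first in VOWEL_MARKS:       # backward (VC) syllable
            vowel, consonant = first, second
            if not seen_vc:
                out.append("'")        # separator between CV and VC parts
                seen_vc = True
        else:                          # forward (CV) syllable
            consonant, vowel = first, second
        if consonant not in GLYPHS or vowel not in VOWEL_MARKS:
            raise KeyError(f"no glyph listed for syllable {first + second!r}")
        out.append(GLYPHS[consonant] + VOWEL_MARKS[vowel])
    return unicodedata.normalize("NFC", "".join(out))


print(transliterate("kelanuifor"))
```

Syllables whose consonant shape is not given in the text (for example "pa") raise a KeyError instead of producing an invented glyph.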

Grammar

Overview

The language has oligosynthetic grammar. Oligosynthesis = oligo (few) + synthesis (combination) = combination of few. There are 4 * 4 * 5 * 2 = 160 primes in the language, which combine into different words (4 places of articulation, 4 manners, 5 vowels and two possible orders, CV and VC).

The word structure is [CV]ⁿ[VC]ᵐ where n ≥ 1 and m ≥ 1. The stress, or tone change, falls in the middle of the word, where the CV syllables end and the VC syllables begin. So, the word sounds like "tatatAEpepep". Alternatively, you can pronounce CV syllables with a rising tone and VC syllables with a falling tone, like "ta/ta/ta/et\et\et\". These measures help the listener hear where the semantic part ends and the grammatical part begins.

CV syllables at the beginning of the word are semantic: they carry the meaning of the word. VC syllables at the end of the word are grammatical: they modify the meaning or tell the role of the word in a sentence. For example: hecaohas = heca + ohas, heca = he (perceive) + ca (eye) = see, ohas = oh (do, verb) + as (past tense) = did in the past, hecaohas = saw.

Grammatical VC syllables can transform one part of speech into another (for example, a verb into a noun), modify the meaning (ka-person + il-all -> kail - all people, humanity), and indicate tense, case, part of speech, conditions, questions and much more. In that regard, you can say that the language has agglutinative features: there are many affixes (building blocks) with predictable meanings that modify the word. An oligosynthetic language differs from an agglutinative one in that the number of these building blocks is fixed and usually small (160 in our case).

The root of the word consists of its own building blocks: in the "hecaohas" example the root "heca" (see) can be analyzed as "perceive with an eye". This is a feature of analytic languages, such as Chinese or Vietnamese. In such languages you usually don't invent words on the go by combining smaller words; there are stable idiomatic ways of saying things. The same is true for this language: you don't need to think every time about how to combine small parts to get the needed meaning, and the listener doesn't have to analyze every root. Instead, once you've learned (or invented) a complex word, you reuse it as is.

The fact that roots consist of small building blocks makes words easier to learn: when you see a new word, you might already know what it means without looking it up in a dictionary. However, if you can't guess the meaning, or can't be sure you've guessed it right, or need to say some word and don't remember it, you can consult the dictionary to see an explication of the root you need. Most probably, the explication will use the building blocks from the word (like "to see" = "to perceive the light from objects through your eyes"). Explications only use one-syllable CV roots, so once you've learned all 80 of them (and the grammar, of course), you can understand any word in the dictionary. This is the concept of semantic primes: every complex meaning can be explained through universal indivisible building blocks.

Usually the root starts with the most important syllable, followed by syllables that narrow the meaning until it's precise enough. So "to see" is "heca", not "cahe", because seeing is a process of perceiving, done by means of an eye. "cahe" would mean something like an eye that can perceive, which could be useful in some situations, but that's definitely not a synonym of "see".
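The prime count and the word structure described above can be checked with a few lines of Python. This is a sketch under the stated rules only (a run of CV syllables followed by a run of VC syllables, at least one of each); the names CV_PRIMES, VC_PRIMES, WORD and split_word are illustrative and not taken from the generator script.

```python
import re
from itertools import product

CONSONANTS = "kghjcyxltnsrpmfw"  # 4 places x 4 manners
VOWELS = "aeiuo"

CV_PRIMES = [c + v for c, v in product(CONSONANTS, VOWELS)]  # semantic
VC_PRIMES = [v + c for v, c in product(VOWELS, CONSONANTS)]  # grammatical

# One or more CV syllables, then one or more VC syllables.
WORD = re.compile(
    rf"((?:[{CONSONANTS}][{VOWELS}])+)((?:[{VOWELS}][{CONSONANTS}])+)"
)


def split_word(word: str) -> tuple[list[str], list[str]]:
    """Split a word into its semantic CV syllables and grammatical VC syllables."""
    match = WORD.fullmatch(word)
    if match is None:
        raise ValueError(f"{word!r} does not have the CV...VC structure")
    root, suffixes = match.groups()

    def pairs(part: str) -> list[str]:
        return [part[i:i + 2] for i in range(0, len(part), 2)]

    return pairs(root), pairs(suffixes)


print(len(CV_PRIMES), len(CV_PRIMES) + len(VC_PRIMES))  # 80 160
print(split_word("hecaohas"))  # (['he', 'ca'], ['oh', 'as'])
print(split_word("kail"))      # (['ka'], ['il'])
```

The boundary between the two returned lists is also the place where the stress or tone change falls.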

Parts of Speech

Three different parts of speech can be identified: verb, noun and adjective. Verbs tell what action is being performed. Nouns tell what objects are involved in this action: who does it, who receives it, who helps, who watches, where and when, and much more. Adjectives describe verbs and nouns: what color the object is, how big it is, how fast and strong the action is, etc. The CV* part of a word without any suffixes is a Base, which belongs to no part of speech. The part of speech a word belongs to can be seen from its ending (the last suffix). Suffixes can:

  • assign a part of speech to a Base (a CV* part without VC suffixes). Example: case suffixes, which say that an object of some class is the subject or the object in the sentence: home-at = home (base) -> at home (noun, where? locative case);
  • change one part of speech into another (for example, Adjective -> Verb: happy-quality-be = happiness (base) -> happy (adjective) -> to be happy (verb));
  • preserve the part of speech of the word and only modify the meaning (like question, negation or tense suffixes: move-do-future = move (base) -> to move (verb) -> will move (still a verb, but in the future));
  • return a word from a part of speech back to the Base (example: run-do-"er"-be = run (base) -> to run (verb) -> runner (base, because it has no case suffix) -> to be a runner (verb once again)).

What each suffix does to each part of speech is described in its definition in the dictionary.
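The four kinds of suffix behavior can be pictured as transitions between parts of speech. The sketch below uses the English glosses from the examples above (at, quality, do, be, future, er) as stand-ins; they are not the actual VC suffixes, whose real behavior is defined in the dictionary, and the names TRANSITIONS and trace are hypothetical.

```python
# (part of speech before, suffix gloss) -> part of speech after.
# Glosses are placeholders taken from the examples, not real suffixes.
TRANSITIONS = {
    ("base", "at"): "noun",            # assign: home-at
    ("base", "quality"): "adjective",  # assign: happy-quality
    ("base", "do"): "verb",            # assign: run-do, move-do
    ("base", "be"): "verb",            # assign: runner-be
    ("adjective", "be"): "verb",       # change: happy-quality-be
    ("verb", "future"): "verb",        # preserve: move-do-future
    ("verb", "er"): "base",            # return to Base: run-do-"er"
}


def trace(glosses: list[str]) -> list[str]:
    """List the part of speech after each suffix, starting from a Base."""
    pos = "base"
    history = [pos]
    for gloss in glosses:
        pos = TRANSITIONS[pos, gloss]
        history.append(pos)
    return history


print(trace(["do", "er", "be"]))  # ['base', 'verb', 'base', 'verb']
print(trace(["quality", "be"]))   # ['base', 'adjective', 'verb']
```

The last entry of each trace is the part of speech of the whole word, which mirrors the rule that the final suffix decides it.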

Word Order

Although all words are marked with their role in a sentence, it is advised to follow a head-final word order. Head-final means that the most important word in a phrase goes last. But what is the most important word in a phrase, you might ask. It's not just the word you currently think is most important, or some new word, or the most beautiful one. The most important word (the head) in a sentence is the verb. The head in a noun phrase is the noun. The head in a verb phrase is the verb. So, the order should be: modifier-of-subject subject (modifier-of-object object)* adverb verb. (smth)* means that smth is optional, and that there can be many of these smth.

Modifiers of nouns can be sentences themselves. In that case these subsentences start with placeholders for the modified noun, and the verbs in them are marked.

If there are many words that share a role in a sentence (for example, many subjects, or verbs), then the first one is not marked, and all subsequent ones are marked. These twins should go in a row:

adj adj-and subj adj subj-and adj subj-and adj obj-case1 adj obj-case1-and adj obj-case1-and adj obj-case2 adj adj-and adj-and obj-case2-and adv verb adv adv-and verb-and

Two sentences can be connected (for example, if something then something). What is said with particles like "if", "then", "more than" etc. in English can be said with grammatical VC suffixes in the word. So, an "if-then" structure would look like this: subj obj verb-if - subj' obj' verb'-then. (The order of these two parts can be changed if that is more natural or logical.)

The order of objects of different cases (with different suffixes) is free: if you want to say the direct object of a verb before the place where the action happens, you can do so. But if you want to say it the other way around, you are free to do that too. So, "subj obj-direct obj-location verb" is the same as "subj obj-location obj-direct verb"; only the emphasis may differ.
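The ordering rules can be sketched as a small function that lays out placeholder tokens. It works with the same abstract labels as the text (adj, subj, obj-case1, adv, verb) rather than real words, and the names coordinate, role_group and clause are invented for this example.

```python
def coordinate(items):
    """Leave the first item unmarked and mark every following one with '-and'."""
    return [item if i == 0 else f"{item}-and" for i, item in enumerate(items)]


def role_group(phrases):
    """Lay out phrases that share one role; each phrase is (head, modifiers).

    Modifiers precede their head (head-final), and every head after the
    first one carries the '-and' marker.
    """
    tokens = []
    for i, (head, modifiers) in enumerate(phrases):
        tokens += coordinate(modifiers)
        tokens.append(head if i == 0 else f"{head}-and")
    return tokens


def clause(subjects, object_groups, verbs):
    """Order a clause as: subjects, then each object group, then verbs."""
    tokens = role_group(subjects)
    for group in object_groups:  # the order of the groups is free
        tokens += role_group(group)
    tokens += role_group(verbs)
    return " ".join(tokens)


print(clause(
    [("subj", ["adj"])],
    [[("obj-direct", [])], [("obj-location", [])]],
    [("verb", ["adv"])],
))  # adj subj obj-direct obj-location adv verb
```

Swapping the two object groups in the call gives the other equally valid order, "adj subj obj-location obj-direct adv verb".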