Morfa for developers

This is the start of the documentation for making Morfa work for new languages. The goal is to build a demo of Morfa-C with two or three task templates and the lemmas from Leksa. Here you can see how the North Sámi Morfa-S and Morfa-C work. Choose the instruction language in the right margin before you choose the program.

Tasks for Morfa-C

Now the fun part of the work begins. Using the semantic classes in the lexicon, you can make tasks for the students. They get a lemma in base form, and their task is to inflect it. Which kinds of tasks you make depends on the language. Your tools are the semantic sets of FAMILY and PEOPLE nouns, HUMAN_A adjectives, PLACE nouns, MOVEMENT_V verbs, ACTIVITY verbs, FOOD and DRINK nouns and FOOD_A adjectives.

The semantic classes in the lexicon make it possible to write templates such as these:

  • Where does your FAMILY live? My FAMILY lives in ______ (PLACE).
  • What does PEOPLE drink? PEOPLE drinks ______ (DRINK).
  • What does PEOPLE eat? PEOPLE eats ______ (FOOD).
  • Who are in PLACE? ______ (PEOPLE Pl) are.
  • What does your FAMILY do today? Today my FAMILY _______ (ACTIVITY-verb).
  • What did ADJ PEOPLE do yesterday? Yesterday PEOPLE _______ (ACTIVITY-verb).
  • How are PEOPLE Pl? PEOPLE Pl are ______ (HUMAN-adj).
  • Where does PEOPLE MOVEMENT-verb to? PEOPLE MOVEMENT-verb _______ (to PLACE).

Of course, in many places you can use a pronoun instead of the variable in the answer.
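To make the idea concrete, here is a minimal sketch in Python of how a template of this kind could be turned into a task. It is not the Morfa-C code: the toy lexicon, the English stand-in words and the function make_task are all assumptions made for illustration. The student gets the lemma, and the blank marks where the inflected form goes.

```python
import random

# Toy lexicon: semantic class -> lemmas. English stand-ins, not real Leksa data.
LEXICON = {
    "PEOPLE": ["teacher", "girl", "fisherman"],
    "FOOD": ["bread", "fish", "porridge"],
}


def make_task(question, answer, task_class, lexicon, rng):
    """Fill the class names in a question-answer pair (hypothetical helper).

    Every class gets one random lemma. In the answer, the task class is
    replaced by a blank, and its lemma is returned for the student to inflect.
    """
    picks = {cls: rng.choice(words) for cls, words in lexicon.items()}
    for cls, word in picks.items():
        question = question.replace(cls, word)
        if cls != task_class:
            answer = answer.replace(cls, word)
    answer = answer.replace(task_class, "______")
    return {"question": question, "answer": answer, "lemma": picks[task_class]}


task = make_task(
    "What does PEOPLE eat?",
    "PEOPLE eats FOOD.",
    task_class="FOOD",
    lexicon=LEXICON,
    rng=random.Random(0),
)
print(task["question"])
print(task["answer"], "(" + task["lemma"] + ")")
```

The same word is used for PEOPLE in the question and in the answer, which is what keeps the pair coherent.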

Question-answer-pair templates in xml-format

The template contains both variables and constants. This is an example of a question-answer-pair template in North Saami. The first text is "What did ADJ SUBJ do yesterday?", the next one is "Yesterday SUBJ MAINV". The line with game="morfa" means that the student's task is to write the correct word form of MAINV.

  <q id="prtSg">
    <qtype>PRT</qtype>
    <question>
      <text>Maid ADJ SUBJ barggai ikte</text>
      <element id="ADJ">
        <sem class="HUMAN_A"/>
        <grammar tag="A+Attr"/>
      </element>
      <element id="SUBJ">
        <sem class="PEOPLE"/>
        <grammar tag="N+Sg+Nom"/>
      </element>
    </question>
    <answer>
      <text>Ikte SUBJ MAINV</text>
      <element game='morfa' id="MAINV" task="yes">
        <grammar tag="V+Ind+Prt+Sg3"/>
        <sem class="ACTIVITY"/>
      </element>
    </answer>
  </q>
  
 

This first template generates tasks like "What did the old teacher do yesterday?" "bake" "Yesterday the teacher ____". As you can see, the variable ADJ is there only to add variation. The qtype is PRT, for inflecting verbs in the past tense. The qtype becomes useful once you have so many templates that you want to sort them into options for the student, but we will not use it in the demo. Every template has an id, which makes it easier to remove individual templates.
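If you want to check how a template is read, the following sketch parses the template above with Python's standard xml.etree.ElementTree module and lists its variables. The function describe_template is a hypothetical helper for illustration, not part of Morfa; it only assumes the structure visible in the example (q, qtype, question, answer, element, sem, grammar).

```python
import xml.etree.ElementTree as ET

TEMPLATE = """
<q id="prtSg">
  <qtype>PRT</qtype>
  <question>
    <text>Maid ADJ SUBJ barggai ikte</text>
    <element id="ADJ">
      <sem class="HUMAN_A"/>
      <grammar tag="A+Attr"/>
    </element>
    <element id="SUBJ">
      <sem class="PEOPLE"/>
      <grammar tag="N+Sg+Nom"/>
    </element>
  </question>
  <answer>
    <text>Ikte SUBJ MAINV</text>
    <element game="morfa" id="MAINV" task="yes">
      <grammar tag="V+Ind+Prt+Sg3"/>
      <sem class="ACTIVITY"/>
    </element>
  </answer>
</q>
"""


def describe_template(xml_text):
    """Collect the texts, the variables and the task variable of one template."""
    q = ET.fromstring(xml_text.strip())
    elements = {}
    task_id = None
    for el in q.iter("element"):
        sem = el.find("sem")
        grammar = el.find("grammar")
        elements[el.get("id")] = {
            "sem": sem.get("class") if sem is not None else None,
            "tag": grammar.get("tag") if grammar is not None else None,
        }
        # The element marked task="yes" is the one the student must inflect.
        if el.get("task") == "yes":
            task_id = el.get("id")
    return {
        "id": q.get("id"),
        "qtype": q.findtext("qtype"),
        "question": q.findtext("question/text").strip(),
        "answer": q.findtext("answer/text").strip(),
        "elements": elements,
        "task": task_id,
    }


info = describe_template(TEMPLATE)
print(info["task"], info["elements"][info["task"]])
```

Here MAINV is the task variable: a lemma from the ACTIVITY class that should be inflected as V+Ind+Prt+Sg3, while ADJ and SUBJ only vary the question.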

An example with inflecting adjectives in the plural:

  <q id="AdjPred">
    <qtype>PRED</qtype>
    <question>
      <text>Makkárat SUBJ leat</text>
      <element id="SUBJ">
        <sem class="PEOPLE"/>
        <grammar tag="N+Pl+Nom"/>
      </element>
    </question>
    <answer>
      <text>Sii leat ADJ</text>
      <element game='morfa' id="ADJ" task="yes">
        <grammar tag="A+Pl+Nom"/>
        <sem class="HUMAN_A"/>
      </element>
    </answer>
  </q>
  

The first text is "How are the SUBJ?", the next one is "They are ADJ". The line with game="morfa" means that the student's task is to write the correct word form of ADJ. This template generates tasks like "How are the girls?" "clever" "They are ______ (plural form in North Saami)".

An example with inflecting nouns in the illative:

  <q id="nill">
    <qtype>N-ILL</qtype>
    <question>
      <text>Gosa du SUBJ MAINV</text>
      <element id="SUBJ">
        <grammar tag="N+Sg+Nom"/>
        <sem class="FAMILY"/>
      </element>
      <element id="MAINV">
        <grammar tag="V+Ind+Prs+Sg3"/>
        <sem class="MOVEMENT_V"/>
      </element>
    </question>
    <answer>
      <text>SUBJ MAINV N-ILL</text>
      <element game='morfa' id="N-ILL" task="yes">
        <grammar tag="N+Sg+Ill"/>
        <sem class="PLACE"/>
      </element>
    </answer>
  </q>
	
  

The first text is "Where does your SUBJ MAINV to?", the next one is "SUBJ MAINV N-ILL". The line with game="morfa" means that the student's task is to write the correct word form of N-ILL. This template generates tasks like "Where does your sister run to?" "school" "Sister runs ______ (illative case in Saami)".

For the demo we keep it simple. For example, we don't use pronouns in the answer. That is easy in principle, but then we would have to take into account that some languages have grammatical gender. There is also an agreement function, e.g. between SUBJ and MAINV, which one could use.
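As an illustration of what agreement between SUBJ and MAINV means, here is a small sketch that derives the verb tag from the subject tag. This is not the actual agreement function: the name verb_tag_for_subject, the pronoun tag format and the rule itself (nouns take third person, personal pronouns keep their own person and number) are assumptions for the example.

```python
import re


def verb_tag_for_subject(subject_tag, tense="Prs"):
    """Return an indicative verb tag that agrees with the subject tag.

    Hypothetical helper: assumes tags are '+'-separated, that personal
    pronouns carry a combined piece such as Sg1 or Pl2, and that nouns
    carry only Sg or Pl and always take third person.
    """
    parts = subject_tag.split("+")
    for part in parts:
        if re.fullmatch(r"(Sg|Du|Pl)[123]", part):
            return f"V+Ind+{tense}+{part}"
    for part in parts:
        if part in ("Sg", "Pl"):
            return f"V+Ind+{tense}+{part}3"
    raise ValueError(f"no number found in subject tag: {subject_tag}")


print(verb_tag_for_subject("N+Sg+Nom"))
print(verb_tag_for_subject("N+Pl+Nom", tense="Prt"))
print(verb_tag_for_subject("Pron+Pers+Sg1+Nom"))
```

With a rule like this, a template would not need a fixed verb tag for MAINV when the number of SUBJ varies.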

Save the templates in a file named with the language code, e.g. questions.sme.xml.
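Before saving, it can help to run a quick sanity check over the templates. The sketch below is only a suggestion, not a Morfa tool. It assumes that the templates sit inside one root element (called questions here, a made-up name), that each template should have exactly one element with task="yes", and that every element id should occur as a word in one of the texts.

```python
import xml.etree.ElementTree as ET


def check_templates(xml_text):
    """Return a list of readable problems found in a templates file."""
    root = ET.fromstring(xml_text.strip())
    problems = []
    seen_ids = set()
    for q in root.iter("q"):
        qid = q.get("id")
        if qid in seen_ids:
            problems.append(f"{qid}: duplicate template id")
        seen_ids.add(qid)
        tasks = [el for el in q.iter("element") if el.get("task") == "yes"]
        if len(tasks) != 1:
            problems.append(f"{qid}: expected 1 task element, found {len(tasks)}")
        # Variables are matched as whole whitespace-separated words in the texts.
        words = " ".join(t.text or "" for t in q.iter("text")).split()
        for el in q.iter("element"):
            if el.get("id") not in words:
                problems.append(f"{qid}: element {el.get('id')} is not used in any text")
    return problems


SAMPLE = """
<questions>
  <q id="ok">
    <question><text>Maid SUBJ barggai ikte</text>
      <element id="SUBJ"><sem class="PEOPLE"/></element>
    </question>
    <answer><text>Ikte SUBJ MAINV</text>
      <element game="morfa" id="MAINV" task="yes"><sem class="ACTIVITY"/></element>
    </answer>
  </q>
  <q id="broken">
    <question><text>Makkárat SUBJ leat</text>
      <element id="SUBJ"><sem class="PEOPLE"/></element>
    </question>
    <answer><text>Sii leat</text>
      <element id="ADJ"><sem class="HUMAN_A"/></element>
    </answer>
  </q>
</questions>
"""

problems = check_templates(SAMPLE)
for problem in problems:
    print(problem)
```

In the sample, the second template is reported twice: it has no task element, and ADJ is missing from its answer text.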

Word forms

The lexicon contains lemmas. The Morfa-C tasks need word forms, which are generated with an FST. For that you need files such as v_paradigm.txt that list which forms to generate.

V+Inf
V+Ind+Prs+Sg1
V+Ind+Prs+Sg2
V+Ind+Prs+Sg3
V+Ind+Prs+Pl1
..  
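As a rough sketch of how such a paradigm file is used, the following Python snippet combines a lemma with each tag line to build the strings one would send to the generator FST. The lemma+tags input format and the function generator_inputs are assumptions for illustration; the actual generation step is not shown, and only the tags listed above are included.

```python
# The first lines of a verb paradigm file, as listed above.
PARADIGM = """\
V+Inf
V+Ind+Prs+Sg1
V+Ind+Prs+Sg2
V+Ind+Prs+Sg3
V+Ind+Prs+Pl1
"""


def generator_inputs(lemma, paradigm_text):
    """Build one 'lemma+tags' string per non-empty paradigm line.

    Hypothetical helper: assumes the generator expects the lemma followed
    by '+' and the tag string.
    """
    inputs = []
    for line in paradigm_text.splitlines():
        tag = line.strip()
        if not tag:
            continue
        inputs.append(f"{lemma}+{tag}")
    return inputs


# bargat is the North Saami verb used in the first template (barggai).
for item in generator_inputs("bargat", PARADIGM):
    print(item)
```

Each printed string is a request for one word form; the generator's answers are what the Morfa tasks are built on.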

Read more about generation of word forms.

Oahpa installation

Morfa-S

Morfa-S is based on generated word forms. You have to decide what options there should be. For ideas, take a look at the North Saami Morfa-S.

Morfa-S offers help if the student wants to know how to make the word form. The help information is generated from the combination of the morphological tags and the task itself. This will be explained here.