Spanish gender assignment in an analogical framework


 
 


David Eddington

Mississippi State University
 
 
 

Version 12-2-98
 


Abstract


 


The present study is carried out within the framework of Analogical Modeling of Language (AML) (Skousen 1989, 1992, 1995). AML is a model of language usage that is founded on the premise that all known words are stored as wholes in the mental lexicon. When the need arises to determine the behavior of an unknown word, the lexicon itself is accessed. A search is conducted for the words most similar to the unknown word. The behavior of the word(s) most similar to the word in question generally predicts the behavior of the word in question.

Spanish gender assignment is discussed in terms of analogy. The 1739 most common Spanish nouns serve as an approximation of a Spanish speaker's mental lexicon. Analogies are made based on the phonemic make-up of the nouns studied. AML is shown to correctly assign gender to Spanish words in about 95% of the cases. Misassigned words are explained as exceptional in that they have one gender, but bear more similarities to other words of the opposite gender.

Contrary to what is commonly assumed, a noun's final phoneme is not the best indicator of gender. The best predictions occur when the elements of a noun's final syllable are taken into consideration. Further empirical support for this finding comes from a questionnaire in which Spanish speakers were asked to determine the gender of antiquated words. The subjects' gender preferences are best predicted when the entire final syllable is taken into account.

The AML account of gender assignment is also applied to words which were adopted into Spanish from other languages (Zamora 1975). Analogy does a respectable job of predicting the gender of these borrowings.
 

0. Introduction. Previous studies on Spanish gender fall into one of three major categories. In the pedagogical approach (e.g. Bergen 1978; Teschner and Russell 1984), the emphasis is placed on dividing Spanish nouns into masculine and feminine groups based on their final phonemes, and listing the most common exceptions to those groups. This categorization is made to facilitate the acquisition of Spanish as a second language. Teschner (1983) and Rosenblat's (1952) approach, on the other hand, is mainly descriptive. Their interest is in finding systematic correspondences between nominal gender and phonological patterns in Spanish words. Generative analyses (e.g. Harris 1985, 1991; Klein 1983, 1989) strive to describe gender in terms of a rule system that derives a word's final phoneme(s) given the word's inherent gender and a set of abstract assumptions about the word's underlying structure.

Although each of these analyses is valid in its own realm of inquiry, none of them claims to determine how native speakers go about assigning gender when syntactic clues such as gender-marked determiners and adjectives are absent. Given the mentalistic vocabulary often employed in the generative literature, one could assert that such analyses do explain how native speakers assign gender. Generative accounts may be elegant representations of linguistic structure; however, their status as psychological mechanisms that play a role in actual language usage is dubious (Eddington 1996).

An additional difficulty with the generative approach is that the rules are designed to assign word-final phonemes to words whose gender is previously 'known.' That is, they cannot apply in reverse and assign gender based on the phonemic make-up of an unknown word. Consider the feminine word cama, and the masculine word drama. In spite of their gender differences, they are both -a final. According to Harris (1991), this is explained by the fact that they have a different underlying structure. However, this underlying structure is not visible to a native speaker who needs to determine their gender, and only has the 'surface structure' to work with.

The object of this paper, then, is to discuss gender assignment within an analogical framework, more specifically within the framework of Analogical Modeling of Language (henceforth AML) (Skousen 1989, 1992, 1995). I will begin by briefly outlining this model, and by describing the database and variables used. I will present analogy as a psychologically plausible model of Spanish gender assignment. Exactly what factors are relevant in making gender assignment is a question that has not been dealt with directly, but the evidence from the database utilized in the present study suggests that gender assignment is made on the basis of a word's final syllable. An experiment with native speakers confirms these findings. In addition, AML is found to mirror quite closely the gender assigned to words which were adopted into Spanish from other languages.
 

1. Analogical Modeling of Language. AML is presented as an alternative to both rule models and connectionist models.(1) Most rule models claim to be relevant to linguistic competence and linguistic structure; in general, they do not claim to mirror the psychological mechanisms responsible for language comprehension and production (cf. Kiparsky 1975:198; Chomsky and Halle 1968:117; Bradley 1980:38). AML, in contrast, is presented as a model of linguistic cognition. It is founded on the premise that all known words are stored as wholes in the mental lexicon. When the need arises to determine the behavior of an unknown word, the lexicon itself is accessed. A search is conducted for the words most similar to the unknown word.(2) The behavior of the word(s) most similar to the word in question generally predicts the behavior of the word in question, although the behavior of less similar words has a small chance of applying as well.
 
 
 

The probability that a word will be chosen as an analog is dependent on three derived properties (Skousen 1995:217):

(1) proximity: the more similar the example is to the word in question, the greater the chances of that example being selected as the analogical model;

(2) gang effect: if the example is surrounded by other examples having the same behavior, then the probability of selecting these similarly behaving examples is substantially increased;

(3) heterogeneity: an example cannot be selected as the analogical model if there are intervening examples, with different behavior, closer to the word in question.

These derived properties are important since they constrain what examples can constitute analogs, as well as deciding between competing analogs. These are precisely the factors that traditional appeals to analogy lack.

According to AML, all of the words that are found to be similar in the course of the search constitute the analogical set. It is the words from this set that can serve as analogical models for the word in question. There are two ways in which the contents of the analogical set can influence the outcome (Skousen 1989:82). The first is that a word can be randomly selected from among those in the analogical set, and the outcome for that word applied to that of the word in question. The other possibility is to determine which outcome is most frequent among the words in the set and assign that outcome to the word in question. The amount of influence that a word, or group of words, will have on the word in question is expressed in terms of the probability that the word in question will adopt the behavior of one group or another. In the case of Spanish gender, masculine is one behavior and feminine another. The reader is referred to Skousen (1989) for specific details of the model, and the algorithm it employs, which are beyond the scope of the present paper.
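The two ways of using the analogical set can be illustrated with a minimal sketch. It assumes the analogical set has already been computed; the words are taken from this paper, but the weights attached to them are invented for illustration, and the weighting of the random draw is an assumption rather than a statement of Skousen's algorithm.

```python
import random
from collections import Counter

# Hypothetical analogical set for some unknown noun: (word, gender, weight).
# The weights are invented; AML derives them from the database itself.
analogical_set = [
    ("cama", "F", 6),
    ("hierba", "F", 5),
    ("drama", "M", 1),
]

def outcome_probabilities(aset):
    """Probability of each outcome = its share of the total weight in the set."""
    totals = Counter()
    for _word, outcome, weight in aset:
        totals[outcome] += weight
    grand_total = sum(totals.values())
    return {outcome: w / grand_total for outcome, w in totals.items()}

def random_selection(aset, rng=random):
    """First option: draw one member at random and copy its outcome."""
    weights = [weight for _word, _outcome, weight in aset]
    _word, outcome, _weight = rng.choices(aset, weights=weights, k=1)[0]
    return outcome

def selection_by_plurality(aset):
    """Second option: assign the most frequent outcome in the set."""
    probs = outcome_probabilities(aset)
    return max(probs, key=probs.get)

probs = outcome_probabilities(analogical_set)
```

Under random selection the minority outcome (here masculine) still has a small chance of applying, whereas selection by plurality always returns the majority outcome.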
 

2. Database and Variables. The database for the present study included the 1739 most frequent nouns in the Spanish language taken from LEXESP(3) (Sebastián, Cuetos, and Carreiras, in preparation). The most frequent nouns were chosen since psycholinguistic experiments have shown that frequent words are more easily retrieved from memory and are retrieved with fewer errors (e.g. Allen et al. 1992; MacKay 1982; Scarborough et al. 1977). Therefore, they are assumed to exert more influence on language usage.

The singular form of all 1739 nouns was encoded in terms of 11 variables. In those instances in which both the singular and the plural form appeared in the database, only the singular was included. Dual-gender words such as mar 'sea', and estudiante 'male or female student' were excluded from the database. For most words in the database, 11 variables are sufficient to encode the phonemic make-up of the word, and the syllabic structure of the final two syllables. This means that some longer words are not fully specified. For these longer words, some phonemes at the beginning of the word are not included. For example, the first three phonemes of principio, /pri/, are not included. This is actually less problematic than it seems since the most relevant part of a noun, as far as gender assignment is concerned, is at the end of the word. This is consistent with the principle of proximity (Skousen 1989:52-3).

The nouns were encoded in this manner: Variables 1-5 incorporate the final syllable. Variables 6-9 incorporate the nucleus and rhyme of the penult syllable. Variables 10-11 incorporate any phonemes preceding the nucleus of the penult syllable, but not the syllable structure of the antepenult syllable.

The algorithm for deriving the variables centers on the syllable nuclei in variables (2) and (9). Starting from the nucleus and working outward, the next tautosyllabic phoneme is assigned to the next variable. If there is none, the next variable is given the syllable boundary symbol '0'. Any other missing segments are marked with the null symbol '-'.
 


Variable Assignment

(1) The rhyme of the final syllable, or the syllable boundary marker '0' if there is none.

(2) The nucleus of the final syllable.

(3) The tautosyllabic phoneme preceding (2) or the syllable boundary symbol '0' if the final syllable has no onset.

(4) The tautosyllabic phoneme preceding (3), or the syllable boundary symbol '0' if there is only one phoneme in the onset, else the non-specification marker '-'.

(5) The tautosyllabic phoneme preceding (4), or the syllable boundary symbol '0' if the onset contains two phonemes, else the non-specification marker '-'.

(6) The syllable boundary marker '0' if two tautosyllabic phonemes follow the nucleus of the penult syllable in (9) else '-'

(7) The second tautosyllabic phoneme following the syllable nucleus of the penult syllable (9), or the syllable boundary marker '0' if only one phoneme follows (9), else '-'.

(8) The first tautosyllabic phoneme following the syllable nucleus of the penult syllable (9), or the syllable boundary marker '0' if the syllable is open, else '-'.

(9) The syllable nucleus of the penult syllable, or '0' if the noun is monosyllabic.

(10) The phoneme preceding (9) or '0' if there is none.

(11) The phoneme preceding (10), or '-' if there is none.
 
 
 
 
 

Examples:      Gender:   Variables:(4)

                         10987654321

tiempo         M         tiem0--0po0
ojo            M         -0o0---0xo0
origen         M         ori0---0xen
consecuencia   F         kuen0-0&ia0
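The variable assignment can be sketched in code. This is only an illustration under stated assumptions: the input format (a list of onset, nucleus, coda triples with one character per phoneme, prenuclear glides counted as part of the onset) and the function name `encode` are hypothetical, only the first coda phoneme of the final syllable is kept, and the treatment of monosyllables beyond variable (9) is a guess, since the rules above do not spell it out.

```python
def encode(syllables):
    """Encode a syllabified noun as an 11-character string (variables 11..1).

    `syllables` is a list of (onset, nucleus, coda) strings, one character
    per phoneme. Prenuclear glides are treated as part of the onset.
    """
    onset, nucleus, coda = syllables[-1]
    v1 = coda[0] if coda else "0"   # assumption: only the first coda phoneme is kept
    v2 = nucleus
    v3 = onset[-1] if onset else "0"
    v4 = onset[-2] if len(onset) >= 2 else ("0" if len(onset) == 1 else "-")
    v5 = onset[-3] if len(onset) >= 3 else ("0" if len(onset) == 2 else "-")

    if len(syllables) == 1:
        # Assumption: the rules only state that variable 9 is '0' for
        # monosyllables; the remaining penult variables are left unspecified.
        v6 = v7 = v8 = v10 = v11 = "-"
        v9 = "0"
    else:
        p_onset, p_nucleus, p_coda = syllables[-2]
        v9 = p_nucleus
        v8 = p_coda[0] if p_coda else "0"
        v7 = p_coda[1] if len(p_coda) >= 2 else ("0" if len(p_coda) == 1 else "-")
        v6 = "0" if len(p_coda) == 2 else "-"
        # Phonemes preceding the penult nucleus, regardless of syllable boundaries.
        preceding = "".join(o + n + c for o, n, c in syllables[:-2]) + p_onset
        v10 = preceding[-1] if preceding else "0"
        v11 = preceding[-2] if len(preceding) >= 2 else "-"

    return "".join([v11, v10, v9, v8, v7, v6, v5, v4, v3, v2, v1])

# The four examples given above, syllabified by hand.
tiempo = encode([("ti", "e", "m"), ("p", "o", "")])
ojo = encode([("", "o", ""), ("x", "o", "")])
origen = encode([("", "o", ""), ("r", "i", ""), ("x", "e", "n")])
consecuencia = encode([("k", "o", "n"), ("s", "e", ""), ("ku", "e", "n"), ("&i", "a", "")])
```

With these inputs the sketch reproduces the four codes listed in the examples, which suggests that the reading of the rules adopted here is at least consistent with them.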
 

3.0 Elements Relevant to Spanish Gender Assignment. The relationship between the gender of a noun and its phonological shape is by no means straightforward. The lack of one-to-one correspondence led Harris to conclude that in spite of the many generalizations that exist, "the correlation between word marker and grammatical gender is random and arbitrary" (1985:37). I do not intend to rehash the correspondences between gender and phonology which have been adequately dealt with elsewhere (Bergen 1978; Rosenblat 1952; Teschner 1983; Teschner and Russell 1984). Instead, I will explore this question: How much of a noun must be taken into consideration in order to make a gender determination?

Since most words ending in -o are masculine, and most ending in -a and -d are feminine, it appears that only the final phoneme comes into play in assigning gender. With z-final words, on the other hand, one would have to consider at least the last two phonemes; most words ending in -az and -uz are masculine, while those ending in -ez are generally feminine (Teschner 1983). However, words ending in -iz would require an analysis of the entire final syllable in order to make a gender determination; those ending in -briz, -driz, and -triz are feminine, while most others with other final syllable onsets are masculine. Words ending in -ón and -is cannot be correctly assigned gender without considering the third to the last phoneme. Those ending in -ión are mostly feminine, while those in -ón are generally masculine. Similarly, -sis and -tis words are mainly feminine, while those ending in another consonant plus -is are masculine. At times, it appears that elements belonging to the penult syllable are also germane to gender assignment. About 89% of -e final nouns are masculine (Teschner and Russell 1984). However, there is a substantial number of high-frequency bisyllabic e-final nouns whose penult nucleus is /a/, and whose gender is feminine. It should be evident from this brief sketch that it is unclear which factors are most relevant to gender assignment.
 

3.1. Evidence from the Corpus. In order to determine which noun-final elements are most important in gender assignment, five different experimental conditions were established according to how much of the noun's phonemic and syllable structure was included: 1) the word's final phoneme,(5) 2) the rhyme of the word's final syllable, 3) the word's final syllable, 4) the word's penult rhyme and final syllable, 5) all 11 variables, which include the word's final two syllables, and in some cases elements of the antepenult syllable as well. In each condition, the 1739 words in the database were removed one at a time. Each word's gender was then determined on the basis of the similarities it bears to other words in the database, according to AML's algorithm. Table 1 contains some sample outcomes calculated by AML when the elements of the final syllable are used as the variables. It indicates that mapa and catástrofe are among those words incorrectly assigned gender by AML.
 


Word          Actual Gender   Probability of Masculine   Probability of Feminine
tabaco        masc.           1.000                      0
grito         masc.           .992                       .008
edición       fem.            0                          1.000
hierba        fem.            0                          1.000
ideal         masc.           .768                       .231
*mapa         masc.           .125                       .875
presente      masc.           .821                       .179
*catástrofe   fem.            .667                       .333

Table 1. Sample Outcomes of Gender Probability Based on AML.

Table 2 specifies the success rate in assigning gender in each of the experimental conditions. Analogy proves to be quite adept at correctly assigning gender in all five conditions. More importantly, the data in Table 2 demonstrate that the phonemes in the final syllable are the best predictors of a noun's gender. However, this finding is only valid as far as the items in the database are concerned, and is not necessarily relevant to native speakers' perceptions. For this reason, a study involving native Spanish speakers was carried out.
 
 
 

Experimental Condition                # of Resulting Errors   % Correctly Assigned Gender
1) Final Phoneme                      119                     93.2
2) Final Rhyme                        106                     93.9
3) Final Syllable                     87                      95.0
4) Penult Rhyme and Final Syllable    94                      94.6
5) All 11 Variables                   103                     94.0

Table 2. Outcome of the Five Experimental Conditions.
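The leave-one-out procedure behind Table 2 can be sketched as follows. Several things here are assumptions and not part of the study: the function names, the way the final-phoneme condition is derived from the 11-character code (footnote 5 only says the variables were changed), and above all the classifier `nearest_vote`, which is a simple majority vote among the most similar items and merely stands in for AML's algorithm.

```python
from collections import Counter

def features(code, condition):
    """Select the part of an 11-character code (variables 11..1) used in a condition."""
    if condition == 1:   # final phoneme (assumed: coda if present, else nucleus)
        return code[-1] if code[-1] != "0" else code[-2]
    if condition == 2:   # final rhyme: variables 1-2
        return code[-2:]
    if condition == 3:   # final syllable: variables 1-5
        return code[-5:]
    if condition == 4:   # penult rhyme and final syllable: variables 1-9
        return code[-9:]
    if condition == 5:   # all 11 variables
        return code
    raise ValueError("condition must be 1-5")

def overlap(a, b):
    """Number of positions at which two feature strings agree."""
    return sum(x == y for x, y in zip(a, b))

def nearest_vote(target, neighbours):
    """Stand-in classifier (NOT AML): majority gender among the most similar items.
    Ties are broken arbitrarily by the order of the database."""
    best = max(overlap(target, feats) for feats, _ in neighbours)
    votes = Counter(g for feats, g in neighbours if overlap(target, feats) == best)
    return votes.most_common(1)[0][0]

def leave_one_out(database, condition, classify=nearest_vote):
    """Remove each (code, gender) pair in turn, predict it from the rest, count errors."""
    errors = 0
    for i, (code, gender) in enumerate(database):
        rest = [(features(c, condition), g)
                for j, (c, g) in enumerate(database) if j != i]
        if classify(features(code, condition), rest) != gender:
            errors += 1
    return errors

def percent_correct(n_words, n_errors):
    """Success rate as reported in the last column of Table 2."""
    return 100 * (n_words - n_errors) / n_words
```

For example, `percent_correct(1739, 87)` gives roughly 95.0, the figure reported for the final-syllable condition. Any other classifier with the same signature, including a full AML implementation, could be passed as `classify`.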

3.2.0. Gender Assignment Task.(6) According to the outcome of the study on the database, the final syllable appears to contain the most relevant information on which to make gender assignment. The purpose of the gender assignment task is to determine if the same holds true when native speakers are faced with the task of assigning gender to novel words.
 

3.2.1. Stimulus Materials. 118 nouns were extracted from Diccionario de la lengua española (Real Academia Española 1995). Each of these words is considered antiquated and of infrequent use (see Appendix). Therefore, they were highly unlikely to be known by the subjects, which also means that their gender would be unknown. Words were chosen that ended in phonemes other than o and a. In this way, the more obvious gender/phoneme correspondences were eliminated, and the subjects had to make gender assignments on the more ambiguous cases.
 

3.2.2. Subjects. 31 literate native Spanish speakers from Spain participated in the study, 18 women and 13 men. The average age of the subjects was 33.4.

3.2.3. Procedure. The 118 test items were presented in the form of a written questionnaire. The subjects were asked to circle either the feminine article la or the masculine article el which appeared before each test item. They were instructed to choose the article that was most appropriate for the word that followed.
 

3.2.4. Results and Discussion. The percentage of questionnaire responses in which the masculine article was preferred was tabulated. In addition, the 118 test items were also assigned gender based on analogy with the items in the database. The five different conditions described in section 3.1 were applied, and the probability that each test item would be assigned masculine gender was calculated for each of the five conditions. Pearson correlations were then calculated between the results of the questionnaire, and the five conditions calculated by AML.
 


Condition                             Correlation
1) Final Phoneme                      .294
2) Final Rhyme                        .403
3) Final Syllable                     .502
4) Penult Rhyme and Final Syllable    .441
5) All 11 Variables                   .419

Table 3. Correlations Between AML and Questionnaire Outcomes in 5 Experimental Conditions.

These results mirror those obtained when AML assigned gender to the members of the corpus themselves. It is clear that the highest correlation between the gender assignments made by the subjects, and those that result when gender is assigned by AML is obtained when the phonemes of the final syllable are considered. Once again, it appears that the final syllable is the best predictor of a noun's gender.
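The correlations in Table 3 compare, item by item, the proportion of masculine responses on the questionnaire with the probability of masculine gender calculated by AML. A minimal sketch of that calculation is given below; the function name is an assumption, and the five data points are hypothetical values invented for illustration, not figures from the study.

```python
import math

def pearson(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values for five test items (NOT data from the study).
questionnaire_masc = [0.90, 0.15, 0.60, 0.35, 0.80]  # share of 'el' responses
aml_prob_masc = [0.95, 0.05, 0.70, 0.50, 0.60]       # AML probability of masculine

r = pearson(questionnaire_masc, aml_prob_masc)
```

In the study, the calculation would be run once per condition over all 118 test items, giving the five coefficients in Table 3.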
 

4. The Gender of Lexical Borrowings. Now that it is clear that a noun's final syllable contains the most relevant information for gender assignment, the analogical approach can be applied to interesting questions of language usage. One of these is the gender that is assigned to borrowed words. Poplack, Pousada and Sankoff (1982), Zamora (1975) and Zamora and Béjar (1987) studied borrowings into Spanish and determined the factors that best predict the gender a word takes when adopted into Spanish. Among these are the phonemic structure of the word, and the gender of the human referent. Zamora (1975) discusses words of Native American and English origin to which AML may be applied.

Applying AML to the words of Native American origin is somewhat problematic since the majority of these were adopted during the colonization of the Americas. The database used in the present study purports to be an approximation of the most frequent nouns in a contemporary Spanish speaker's mental lexicon. Therefore, using this database to approximate the lexicon of a typical Spanish speaker during the colonial period is somewhat flawed from the outset. Nevertheless, the benefits of testing these items seemed to outweigh the obvious disadvantages. Therefore, I encoded the final syllable(7) of the 67 Native American words whose gender was unambiguous according to Zamora's study. They were then assigned gender by AML.

60 of the words were correctly assigned gender, while 7 masculine words were incorrectly assigned feminine gender: cazabe 'type of tortilla,' jagüey 'well,' maíz 'corn,' tatuán,(8) inca 'Inca,' lucumba,(8) and yanacona 'sharecropper.' Two of the misassigned words, both of which end in -a, have masculine human referents (inca, yanacona). In these cases, the human referent factor seems to have overridden the phonemic variables in gender assignment. This is consistent with the finding of Poplack et al. (1982) that the gender of a human referent outweighs all other factors. In any event, the success rate of AML is about 90% in spite of the fact that a modern database was used to predict gender assignments made 400 to 500 years ago.

Zamora also studied borrowings from English into Puerto Rican Spanish. He asked 13 bilingual speakers to determine the gender of 20 English words that are commonly used in Puerto Rican Spanish. Statistics for each individual item are not available. However, words whose Spanish rendition ends in -a were assigned feminine gender in 98.5% of the responses. The sole word ending in -o was unanimously assigned masculine gender. All other words were assigned masculine gender in 97% of the responses. Table 4 contains the gender probabilities for these 20 borrowings as calculated by AML, and the gender assignments made by Zamora's subjects.
 


English       P.R. Spanish   Gender   Probability of Masc.   Probability of Fem.
market        /marketa/      F        .019                   .981
appointment   /apoinmen/     M        1.000                  0
lunch         /lon/          M        1.000                  0
furniture     /furnitura/    F        0                      1.000
driveway      /drajwej/      M        .580                   .420
break/brake   /brejk/        M        .686                   .314
*ride         /raj/          M        0                      1.000
building      /bildin/       M        1.000                  0
ice cream     /ajskrin/      M        .974                   .026
hole          /xol/          M        1.000                  0
basement      /bejsmen/      M        1.000                  0
yard          /yar/          M        .993                   .007
nursery       /nurseria/     F        .063                   .938
safety can    /safakon/      M        1.000                  0
coat          /kow/          M        1.000                  0
mop           /mapo/         M        1.000                  0
ticket        /tike/         M        1.000                  0
bluff         /blof/         M        1.000                  0
freezer       /friser/       M        1.000                  0
bill          /bil/          M        1.000                  0

Table 4. AML Results for English Borrowings.

The AML assignments correspond quite closely to those of the subjects. The major disagreement is the gender assigned to ride /raj/. Driveway /drajwej/ is a borderline case in that it has the highest probability of taking masculine gender, but only by a small margin. In any event, the success rate of AML is 95%. It is important to view this in light of the fact that Zamora's subjects did not unanimously agree on the gender of the borrowed words. They agreed on the gender of the words in about 97-98% of the cases.
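The 95% figure can be checked directly against Table 4. The sketch below assumes selection by plurality, that is, each word is assigned whichever gender has the higher AML probability; the variable names are illustrative, and the data are copied from the table.

```python
# (English word, gender chosen by Zamora's subjects, P(masc.), P(fem.)) from Table 4.
table4 = [
    ("market", "F", 0.019, 0.981), ("appointment", "M", 1.000, 0.0),
    ("lunch", "M", 1.000, 0.0), ("furniture", "F", 0.0, 1.000),
    ("driveway", "M", 0.580, 0.420), ("break/brake", "M", 0.686, 0.314),
    ("ride", "M", 0.0, 1.000), ("building", "M", 1.000, 0.0),
    ("ice cream", "M", 0.974, 0.026), ("hole", "M", 1.000, 0.0),
    ("basement", "M", 1.000, 0.0), ("yard", "M", 0.993, 0.007),
    ("nursery", "F", 0.063, 0.938), ("safety can", "M", 1.000, 0.0),
    ("coat", "M", 1.000, 0.0), ("mop", "M", 1.000, 0.0),
    ("ticket", "M", 1.000, 0.0), ("bluff", "M", 1.000, 0.0),
    ("freezer", "M", 1.000, 0.0), ("bill", "M", 1.000, 0.0),
]

def predicted_gender(p_masc, p_fem):
    """Selection by plurality: take the gender with the higher probability."""
    return "M" if p_masc > p_fem else "F"

# Words for which the AML prediction differs from the subjects' choice.
mismatches = [word for word, gender, p_m, p_f in table4
              if predicted_gender(p_m, p_f) != gender]
success_rate = (len(table4) - len(mismatches)) / len(table4)
```

Only ride is misassigned, so 19 of the 20 borrowings agree with the subjects' choice.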
 

5. The Adequacy of AML as a Model of Linguistic Cognition. The fact that the success rate of AML in assigning Spanish gender does not reach close to 100% could be construed as evidence against it. One could argue that rule-based models would be able to correctly assign gender to all nouns. The difficulty with this criticism is that rule-based models are not intended to be models of language usage, nor can the formal mechanisms they employ be considered to mirror psychological mechanisms. In fact, they are not even designed to assign gender on the basis of surface forms.

AML, on the other hand, is presented as a model that relates to actual mental processes. It is based on the assumption that speakers do not need to make any sort of tacit or explicit generalizations or rules about language data. Analogy yields rule-like behavior without utilizing rules. It is based on the fairly uncontroversial idea that words are stored in the mind and accessed as needed. The idea that groups of words can affect the behavior of other words that bear resemblances to the members of the group has been abundantly attested to in the psycholinguistic literature (e.g. Bybee and Slobin 1982; Jared et al. 1990; Stemberger and MacWhinney 1986). In addition, the variables used in the present study--phonemes and syllables--have been shown to be relevant to linguistic cognition(9) (Jaeger 1980; Sebastián-Gallés 1996).

What of the errors made by AML? First of all, it is important to remember that all known nouns are assumed to be stored in the mental lexicon, and their gender known. Given perfect memory, AML will correctly predict the gender of a known word at a rate of 100%. An exceptional masculine word such as drama, which ends in a generally feminine ending -a, will be misassigned feminine gender when the model is asked to treat it as if it were unknown. What this says, in effect, is that drama is an 'exception to the rule,' although no rule is assumed to exist, and without having to mark it as an exception. Errors of this sort are common among students learning Spanish as a second language.
 

6. Conclusions. In the preceding pages, AML has been presented as a model of linguistic cognition. When applied to the question of Spanish gender assignment, it does a formidable job of assigning gender based on the surface properties of Spanish words. In addition, AML predicts the gender of Spanish words that were adopted from other languages with a high degree of accuracy. With AML as a tool of linguistic analysis, it was determined that the final syllable is the most relevant part of Spanish nouns in determining gender assignment.
 
 



Appendix

Stimulus Materials
abarraz
acates
acemite
acordación
acumen
afer
afice
alancel
alcaduz
alcalifaje
alcamiz
alinde
alioj
alizace
amarillor
anascote
arrafiz
asperez
atarfe
avarientez
azcón
azoche
balizaje
barrunte
beudez
bitumen
bocacín
botor
broznedad
cabción
cabrial
cafiz
calicud
calonge
cambil
candelor
canez
carauz
ceción
celtre
cifaque
cipión
cobil
coce
cocadriz
compage
consuetud
consulaje
copanete
cotrofe
crenche
criazón
crochel
chivitil
delate
desdón
deslate
destín
disfrez
egestión
elébor
emiente
entalle
entenzón
epiglosis
escambrón
escorche
escrocón
esgambete
esguarde
esledor
estipe
estruz
evagación
fabledad
fenestraje
fluxión
folguín
fosal
gafez
gagate
garifalte
grasor
gubilete
guiaje
ingre
jusente
lailán
lande
lavajal
lerdez
linamen
mandrial
mansuetud
másticis
menge
meridión
merode
nacre
orebce
palude
panol
peraile
pernicie
pólex
primaz
pujés
realme
rebalaj
riste
senojil
sorce
sozprior
tabelión
trascol
velambre
venadriz
venderache
 

References
 


Allen, Phillip, Michelle McNeal, and Donna Kvak. 1992. Perhaps the lexicon is coded as a function of word frequency. Journal of Memory and Language 31.826-44.

Bergen, John J. 1978. A simplified approach for teaching the gender of Spanish nouns. Hispania 61.865-876.

Bradley, Diane. 1980. Lexical representation of derivational relation. Juncture, ed. by Mark Aronoff and Mary-Louise Kean, 37-55. Saratoga, Cal.: Anma Libri.

Bybee, Joan L. and Dan I. Slobin. 1982. Rules and schemas in the development and use of the English past tense. Language 58.265-89.

Chomsky, Noam, and Morris Halle. 1968. The sound pattern of English. New York: Harper and Row.

Eddington, David. 1996. The psychological status of phonological analyses. Linguistica 36.17-37.

Harris, James W. 1985. Spanish word markers. Current issues in Hispanic phonology and morphology, ed. by Frank H. Nuessel Jr. Bloomington, IN: Indiana Linguistics Club.

________. 1991. The exponence of gender in Spanish. Linguistic Inquiry 22.27-62.

Jared, Debra, Ken McRae, and Mark S. Seidenberg. 1990. The basis of consistency effects in word naming. Journal of Memory and Language 29.687-715.

Jaeger, Jeri J. 1980. Testing the psychological reality of phonemes. Language and Speech 23.233-253.

Jones, Daniel. 1996. Analogical natural language processing. London: UCL Press.

Kiparsky, Paul. 1975. What are phonological theories about? Testing linguistic hypotheses, ed. by David Cohen and Jessica R. Wirth, 187-209. Washington, D.C.: Hemisphere.

Klein, Philip W. 1983. Spanish gender morphology. Papers in Romance 5.57-64.

________. 1989. Spanish 'gender' vowels and lexical representation. Hispanic Linguistics 3.147-162.

MacKay, Donald G. 1982. The problems of flexibility, fluency, and speed-accuracy trade-off in skilled behavior. Psychological Review 89.60-94.

Poplack, Shana, Alicia Pousada, and David Sankoff. 1982. Un estudio comparativo de la asignación de género a préstamos nominales. El español caribe, ed. by Orlando Alba, 239-269. Santiago, Dominican Republic: Universidad Católica Madre y Maestra.

Real Academia Española. 1995. Diccionario de la lengua española. CD ROM version of the 21st edition. Madrid: Espasa Calpe.

Rosenblat, Angel. 1952. Género de los sustantivos en -e y en consonante. Estudios dedicados a Menéndez-Pidal, vol. 3, 159-202. Madrid: Consejo Superior de Investigaciones Científicas.

Scarborough, Don L., Charles Cortese, and Hollis S. Scarborough. 1977. Frequency and repetition effects in lexical memory. Journal of Experimental Psychology 3.1-17.

Sebastián-Gallés, Núria. 1996. Speech perception in Catalan and Spanish. Language processing in Spanish, ed. by Manuel Carreiras, José E. García Albea, and Núria Sebastián-Gallés, 1-17. Mahwah, N.J.: Erlbaum.

Sebastián, Núria, Fernando Cuetos, and Manuel Carreiras. In preparation. LEXESP: Creación de una base de datos informatizada de español. Report, Universitat de Barcelona.

Skousen, Royal. 1989. Analogical modeling of language. Dordrecht: Kluwer Academic.

________. 1992. Analogy and structure. Dordrecht: Kluwer Academic.

________. 1995. Analogy: A non-rule alternative to neural networks. Rivista di Linguistica 7.213-232.

Stemberger, Joseph Paul and Brian MacWhinney. 1986. Frequency and the lexical storage of regularly inflected forms. Memory and Cognition 14.17-26.

Teschner, Richard V. 1983. Spanish gender revisited: -Z words as illustrating the need for expanded phonological and morphological analysis. Hispania 66.252-256.

Teschner, Richard V. and William M. Russell. 1984. The gender patterns of Spanish nouns: An inverse dictionary-based analysis. Hispanic Linguistics 1.115-132.

Zamora, Juan Clemente. 1975. Morfología bilingüe: La asignación de género a los préstamos. Bilingual Review 2.239-247.

Zamora Munné, Juan C. and Eduardo Béjar. 1987. El género de los préstamos. Revista española de lingüística 17.131-137.

1. See Jones (1996) for a comparison of Connectionism and AML.

2. In this study, the phonemic attributes of words are assumed to be the relevant variables. However, AML can also incorporate other variables such as sociolinguistic variables: age, sex, social class etc. (Skousen 1989:97-100).

3. The current study is based on LEXESP, which is a morphologically tagged frequency dictionary of Spanish of about 2.5 million words. A more recent version, LEXESP II, is based on a 5 million word corpus. It is available at: psico.psi.ub.es/pub/lexesp/frectken.zip.

4. ASCII characters represent certain phonemes: R=/rr/, $=//, &=/θ/, %=/8/.

5. In this condition, the variables in the database had to be changed so that word final vowels were compared to word final consonants.

6. I am most indebted to Milagros Malo Fernández and Elías Álvarez Ortigosa who took charge of administering the questionnaires for this study.

7. The cluster -tl- is perhaps the only cluster in Spanish whose syllabification is disputed. Two borrowed words contained this cluster, cocolistle and tanatle. Both possible final syllables, -le and -tle, were tested, and in both cases the words were given masculine gender.

8. Meaning unknown.

9. The syllable is pertinent to Spanish language processing, but its relevance to processing other languages is unclear.