Recoding ASL with a spoken phonology

last substantial update 22 June 2009; last change 23 May 2014


Contents: Introduction; Which types of phones are what?; Non-manual elements; Handshapes; Orientation; Places; Place modifications; Motion endpoints; Two-handed signs; Leftovers; Examples; Rounding off some corners


The aims of the project I describe on this page are to produce a mapping from the phonology of American Sign Language to a spoken phonology, which

  1. carries related manual articulations to related spoken articulations;
  2. is able to reproduce at least the great majority of meaning-bearing distinctions in ASL; and
  3. yields a spoken language whose phonology is not too unnatural in and of itself (taking distributional markedness of its phonemes into account).

I've followed another principle in the main, that

  4. it should be possible to convert an ASL word to its spoken form mechanically, and in a fashion dependent only on the form of that word.

This principle is what one would want if, say, one were designing an engelang meant to have isomorphic spoken and signed realisations. Indeed it's primarily an interest in multi-modal languages that led me to undertake this project. Transplanting a natural language in one mode to the other was meant to give me some experience with what kinds of features I can expect to go over smoothly in the translation, and for which it's harder to set up good analogues; and I chose to go from signed to spoken because signed seemed the higher bandwidth of the two.

Principle 4 eliminates the possibility of selectively choosing to mark or ignore certain features when transcoding an ASL word, according to whether they're necessary to retain contrasts with another word. But one might want to allow this selectivity; only requiring spoken coding of what's necessary to reconstruct a word would yield a leaner and in all expectations more natural spoken language, albeit one with less fidelity. Some thoughts in this direction constitute my last section.

This project has much in common with producing a transcription system for ASL. In transcription systems, there is plenty of prior art: a few examples for ASL are Stokoe notation and Thomas Stone's ASL Sign Jotting, and within the larger bailiwick of signed languages in general, HamNoSys, SignWriting, and David Peterson's Sign Language IPA. Now, a system meeting my goals, when written out in IPA, would yield a sort of transcription scheme for ASL, but a poor one. The choices of symbols would seem unmotivated, phonetics being a poor motivation by any external standard; and I have made many concessions to obtaining a half-sensible spoken phonology which would be out of place in a devoted transcription system. Ultimately I do want something speakable.

By "speakable" I don't necessarily mean 'comfortable for English speakers' — even as English speakers would be the most likely adopters of any language based on ASL; that's too practical a concern for me ;-). I'm simply aiming for a language within the natlang ambit of possibility. (That said, if you really wanted something convenient for English speakers it shouldn't be too hard to make a couple of phoneme substitutions to get closer to the mark: perhaps throw in some epenthetic vowels, chuck the uvulars and other awkward consonants, maybe bring in voicing to replace them... but that I leave to you.)

I don't have any command of ASL myself. I'm very grateful to Sai (or should I call him /katʃ/?) for answering my ASL questions and informing me of his speaker's intuition on several points. Thanks also to David J Peterson for feedback on an earlier incarnation of this project, and those who commented on the CONLANG list.

Send your comments, questions, etc. to me, Alex Fink. I use gmail with the username 000024.

Which types of phones are what?

David Perlmutter, in Sonority and syllable structure in American Sign Language, Linguistic Inquiry 23 no. 3 (1992), 407–442, observed several structural and distributional constraints on ASL signs as composed of places and motions and handshapes. The SLIPA page has a summary of these constraints. In light of them, Perlmutter drew the analogy that places and motions and handshapes are to signed languages what consonants and vowels and tones are to spoken languages, respectively. These equivalences would thus seem to be the natural equivalences to impose in my scheme as well. I've deviated from them, though, for the following reasons.

I order these elements so that a typical (Place-)Motion(-Place) word has a maximal structure

[place or motion endpoint 1] [orientation 1] [handshape 1] [motion modifiers] [place or motion endpoint 2] [orientation 2] [handshape 2].

The handshape slots are the vowels. I allow many of the consonantal components to render one of their highly unmarked values (or their absence) as zero. But in order to allow places unchanged from the previous place to be rendered as zero, for parsimony in situations where only handshape or orientation or such have changed, I avoid rendering any single place as zero (though I will still exploit /ʔ/).
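
As a minimal sketch of how the slots of this template concatenate, the snippet below joins them in order, with an empty string standing for a zero rendering. The function name `assemble` and its parameter names are invented for illustration; the segment values are taken from three of the example words further down the page.

```python
def assemble(place1="", orient1="", hand1="", modifiers="",
             place2="", orient2="", hand2=""):
    """Concatenate the slots of a (Place-)Motion(-Place) word in template order.

    An empty string stands for a slot rendered as zero. The first handshape
    is required, since every word must contain at least one vowel.
    """
    if not hand1:
        raise ValueError("every word needs at least one handshape vowel")
    return "".join([place1, orient1, hand1, modifiers, place2, orient2, hand2])


# Ear /k/, default orientation, 1 handshape /ɛ/, hop inward /s/.
print(assemble(place1="k", hand1="ɛ", place2="s"))
# Lips /x/, B /i/, then neutral space with arm out /h/ and palm up /ɫ/.
print(assemble(place1="x", hand1="i", place2="h", orient2="ɫ"))
# Temple /q/, bent-B /uj/, off and down /m/, ending in a Y /jã/.
print(assemble(place1="q", hand1="uj", place2="m", hand2="jã"))
```

This only covers the plain one-handed template; place modifications, two-handed prefixes and reduplication would need further slots.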

For vowels, having a zero option also seems awkward, and so I don't allow one. Fortunately every sign has at least one handshape, so that every word ends up with at least one vowel. If there's no change of handshape, the word may have just the one vowel. I allow the vowel corresponding to a handshape to be repeated in subsequent handshape slots if it's unchanged, or not. Sometimes repetition may be necessary, for instance if there is a long succession of places and motions with the same handshape; consonant clusters cannot be allowed to form since these have special meanings.

In Perlmutter's scheme a word that consists of only a Place, with no motions, is like a consonant-only word; he justifies this with the observation that such words necessarily have some secondary articulation, such as tapping. In my scheme such words will still contain a vowelled syllable:

[place] [orientation] [handshape] [secondary articulation].

Tapping seems pretty unmarked to me, and I represent it as zero. Other secondary articulations, like wiggling, will be rendered as consonants.

Non-manual elements

I haven't actually given non-manual elements any more thought yet than is recorded above.

It would give me somewhere to start if I knew enough prosodic phonology to say which particular prosodies are common in natlangs to mark topics, WH-questions, etc. But I don't.


Handshapes

Among the handshapes of ASL, seven are sufficiently unmarked that they can occur as the handshape of the resting non-dominant hand, namely 5, B, 1, A, S, O, and C (Baker-Shenk & Cokely, American sign language: a teacher's resource text on grammar and culture, p. 82). On the other end of the markedness scale, my impression is that D, M, R, T occur only as letters, i.e. in (possibly nativised) fingerspellings and initialised signs, 7 only in numeric signs involving that digit, and W=6 only in one or the other, so they're among the most marked handshapes.

I've analysed handshapes as having the following dimensions of variation.

So the task is to map these features onto vowels.

I summarise the assignments below. Each entry in the table gives a handshape, its description in the featural analysis above, and its assigned vowel. Many of the gaps are systematically fillable with other handshapes; perhaps even a few of them should be included that I'm unaware of.

Handshape   Features   Vowel
S           0-..-      /a/
thumbs-up   0-..+      /ã/
A           0-..a      /aj/
I           0i..-      /ja/
Y           0i..+      /jã/
1           1--.-      /ɛ/
L           1--.+      /ɛ̃/
X           1-x.-      /ɛw/
horns       1i-.-      /jɛ/
ILY         1i-.+      /jɛ̃/
                       /ɔ/
baby-O      1-h.o      /ɔ̃/
bent-L      1-c.+      /ɔ̃j/
U           2----      /e/
close-3     2---+      /ẽ/
claw-U      2-x--      /ew/
V           2--+-      /je/
3           2--++      /jẽ/
claw-V      2-x+-      /jew/
claw-3      2-x++      /jẽw/
K           2-k.-      /ɘj/
H           2-h--      /oj/
bent-V      2-h+-      /joj/
B           4----      /i/
thumby-B    4---+      /ĩ/
E           4-x--      /iw/
4           4--+-      /ji/
5           4--++      /jĩ/
claw-5      4-x++      /jĩw/
8           4-k+o      /ɨ̃/
open-8      4-k++      /ɨ̃j/
C           4-c-c      /u/
O                      /ũ/
bent-B
open-F      4-c+c      /ju/
F           4-c+o      /jũ/
T           0---t      ?
R           2--r-      ?
D

Legend for the featural analyses:

Character 1, fingers raised: 0 = none; 1 = index; 2 = index + middle; 3 = index + middle + ring; 4 = all four.
Character 2, extra pinky?: - = no; i = yes.
Character 3, finger disposition: - = straight; c = curved; h = bent; x = clawed; k = straight, middle bent.
Character 4, finger spreading: - = unspread; + = spread; r = crossed.
Character 5, thumb disposition: - = unextended; a = as in A; + = to the side; c = to the front; o = loop; t = behind index.
General, in any position: . = feature is irrelevant; ? = I don't know.
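
To make the positional reading of the codes concrete, here is a small sketch that decodes a five-character analysis using the legend. The names `LEGEND`, `GENERAL` and `decode` are invented for illustration, and "strt" in the legend is read as "straight", which is an assumption.

```python
# One (feature name, value table) pair per character position, from the legend.
LEGEND = [
    ("fingers raised", {"0": "none", "1": "index", "2": "index + middle",
                        "3": "index + middle + ring", "4": "all four"}),
    ("extra pinky", {"-": "no", "i": "yes"}),
    # "strt" in the legend is read here as "straight" (an assumption).
    ("finger disposition", {"-": "straight", "c": "curved", "h": "bent",
                            "x": "clawed", "k": "straight, middle bent"}),
    ("finger spreading", {"-": "unspread", "+": "spread", "r": "crossed"}),
    ("thumb disposition", {"-": "unextended", "a": "as in A",
                           "+": "to the side", "c": "to the front",
                           "o": "loop", "t": "behind index"}),
]

# Symbols that may appear in any position.
GENERAL = {".": "irrelevant", "?": "unknown"}


def decode(code):
    """Turn a five-character featural analysis into a feature dictionary."""
    if len(code) != len(LEGEND):
        raise ValueError("expected exactly five characters")
    features = {}
    for char, (name, values) in zip(code, LEGEND):
        if char in GENERAL:
            features[name] = GENERAL[char]
        elif char in values:
            features[name] = values[char]
        else:
            raise ValueError(f"{char!r} is not a valid value for {name}")
    return features


print(decode("2--+-"))   # the analysis given for V
print(decode("4-c+o"))   # the analysis given for F
```

Note that the same symbol can mean different things in different positions: `+` is "spread" in position 4 but "to the side" in position 5, and `c` is "curved" in position 3 but "to the front" in position 5.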

So we've got nine vowels. And for that, most any phonologist seeing the sounds in the table above would collapse at least /ɔ o/ and /ɘ ɨ/ as allophones, though the first at least are technically separated by the handshapes in NO. As markedness goes, it's unfortunate that the unmarked 5 /jĩ/ has turned out worse than the rarer thumby-B /ĩ/ and 4 /ji/.
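
The count of nine can be checked mechanically. The sketch below lists the vowels that appear in the table, strips the /j/ and /w/ glides and the nasalisation tilde, and counts the distinct nuclei that remain. The helper name `nucleus` is invented, and treating glides and nasalisation as separable from the vowel quality is my reading of the table rather than something stated outright.

```python
import unicodedata

# Every vowel that appears in the handshape table.
TABLE_VOWELS = [
    "a", "ã", "aj", "ja", "jã",
    "ɛ", "ɛ̃", "ɛw", "jɛ", "jɛ̃",
    "ɔ", "ɔ̃", "ɔ̃j",
    "e", "ẽ", "ew", "je", "jẽ", "jew", "jẽw",
    "ɘj", "oj", "joj",
    "i", "ĩ", "iw", "ji", "jĩ", "jĩw",
    "ɨ̃", "ɨ̃j",
    "u", "ũ", "ju", "jũ",
]

COMBINING_TILDE = "\u0303"


def nucleus(vowel):
    """Strip glides and nasalisation, leaving the bare vowel quality."""
    # NFD splits precomposed letters such as ã into a + combining tilde.
    decomposed = unicodedata.normalize("NFD", vowel)
    return "".join(ch for ch in decomposed
                   if ch not in ("j", "w", COMBINING_TILDE))


nuclei = {nucleus(v) for v in TABLE_VOWELS}
print(len(nuclei))   # 9
```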


Orientation

Orientations appear in my scheme as the last elements of onset clusters. So I'll code them as the sort of segments that are most at home in such positions, approximants, especially liquids.

My take on orientation markedness is that the zero-marked value is the most natural way for the hand to be situated given that it's touching a given point in space (which might be a neutral space pseudo-point). In ASL phonotactics there are also constraints on modes of contact each handshape allows, and it may be that a better coding system would consider these; but I don't know them.

Regardless of the disposition of the fingers, I take the orientation frame of the hand to be defined relative to the base of the palm, and name orientations in terms of the sides of an open palm, since I don't want orientation codings or names to change when you only move the fingers around. For instance, touching a point with the tips of the fingers of a flat-O is a palm-side contact, since that's what it would be if you opened the fingers flat without moving anything else.

The fact that spheres aren't flat (and not only that, they don't even support a nonvanishing continuous tangent vector field) got in the way of my first couple attempts to make a nice and orthogonal coding for orientation, with for instance elbow behaviour and wrist behaviour coded separately. To illustrate the problems, hold your hand comfortably in neutral space in front of you, so that your forearm is pointing upwards. Now sweep your arm at the elbow 90° forward away from you, then 90° to the inside parallel to the ground, then 90° back up. This puts your forearm back where it was, but you'll notice that your hand has rotated at the wrist 90°. Now suppose one tried to code orientations in such a way that side-to-side twisting at the wrist was separated out from other features. The angle of this twisting changed over the course of these three arm motions, so one of them individually must have carried a twist; but none of the three individual motions seems like a compelling place to posit one.
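
The three-sweep experiment can be replayed numerically. The sketch below tracks two vectors, the direction of the forearm and the direction the palm faces, through three exact 90° rotations. The coordinate axes (x toward the inside, y forward, z up), the starting palm direction, and all names are assumptions chosen for illustration.

```python
def apply(matrix, vector):
    """Multiply a 3x3 matrix (tuple of rows) by a 3-vector."""
    return tuple(sum(m * v for m, v in zip(row, vector)) for row in matrix)


# Assumed axes: x = toward the inside, y = forward, z = up.
# Each matrix is an exact 90-degree rotation about one coordinate axis.
SWEEP_FORWARD = ((1, 0, 0), (0, 0, 1), (0, -1, 0))   # about x: up -> forward
SWEEP_INWARD = ((0, 1, 0), (-1, 0, 0), (0, 0, 1))    # about z: forward -> inside
SWEEP_UP = ((0, 0, -1), (0, 1, 0), (1, 0, 0))        # about y: inside -> up

forearm = (0, 0, 1)   # forearm pointing up
palm = (0, 1, 0)      # palm facing forward (an assumed starting orientation)

for sweep in (SWEEP_FORWARD, SWEEP_INWARD, SWEEP_UP):
    forearm = apply(sweep, forearm)
    palm = apply(sweep, palm)

print(forearm)   # (0, 0, 1): the forearm is back where it started
print(palm)      # (1, 0, 0): the palm now faces the inside, a 90-degree twist
```

None of the three matrices contains any twist about the forearm's own axis, yet their composition does, which is the difficulty described above.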

In lieu of such a scheme, here are the orientation segments I'm using. Orientation is broken down into which side of the dominant hand is touching the point (or pseudopoint) of contact and how the rest of the hand is rotated. I sequence the segments in that order. For the point of contact:

For the rotation:

My motivation for suggesting to collapse the orientations that I do in this list is just to keep the number of phonemes down.


Places

We start our consideration of places with the most genuine ones, those which actually are locations on the body. For these I use obstruents, which feel like the most consonanty of consonants.

My basic place assignments are derived from a relatively coarse identification of the ASL-relevant points of the body, close to Stokoe's: even that is a fair number of obstruents. Among points not on the arm, there is a feature of centrality: they are either sagittally central or off to the side. We'll mark this by fricativity, and co-opt the same distinction to mark different sides of the arm for points there. Sai suggests noncentral points are more common, so they get to be the less marked stops, while central points are fricatives. As for height of the points, I assign POAs from the back to the front of the mouth in top to bottom order, which has the nice effect that the arm gets the coronals. There's a break between the labials and the coronals where I jump back to the chest, but I don't mind, since that gap strikes me as impressionistically one of the larger inter-column gaps on the IPA chart anyways. (It looks like I'm not going to end up using voicing contrastively, given that I haven't used it here.)

This gives the following assignments.

forehead, eyes       /χ/     temple               /q/
chin, below lips     /x/     cheek                /k/
elbow                /ʃ/     shoulder             /tʃ/
(underside of wrist  /ʂ/)    (back of wrist       /tʂ/)
palm                 /s/     back of hand         /ts/
fingers              /θ/     thumb side of hand   /t/
center of chest      /f/     side of chest        /p/
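
For anyone wanting to experiment, the table can be written down as a lookup. This is only a sketch: the names `PLACES`, `FRICATIVES` and `manner` are invented, and the affricates /tʃ/, /tʂ/ and /ts/ are grouped with the stops simply because they sit in the stop column of the table.

```python
# Basic body places and their consonants, copied from the table.
PLACES = {
    "forehead, eyes": "χ",      "temple": "q",
    "chin, below lips": "x",    "cheek": "k",
    "elbow": "ʃ",               "shoulder": "tʃ",
    "underside of wrist": "ʂ",  "back of wrist": "tʂ",
    "palm": "s",                "back of hand": "ts",
    "fingers": "θ",             "thumb side of hand": "t",
    "center of chest": "f",     "side of chest": "p",
}

# The left-hand column of the table: central points, or one side of the arm.
FRICATIVES = {"χ", "x", "ʃ", "ʂ", "s", "θ", "f"}


def manner(consonant):
    """Classify a place consonant by which column of the table it sits in."""
    return "fricative" if consonant in FRICATIVES else "stop or affricate"


for place, consonant in PLACES.items():
    print(f"{place:20} /{consonant}/  {manner(consonant)}")
```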

The elbow and shoulder aren't a very close pair, but I suspect that more refined distinction of basic points near either of these is unnecessary. I'm considering omitting the wrist points as basic too.

Places not among the basic inventory, and places more precise than the basic inventory provides when that's necessary, can be specified as modifications of basic places. These places default to the dominant side of the body, except those on the arm which default to the nondominant arm, since that is easier touched by the dominant arm. To specify points on the other side also requires a modification.

I have a couple of different categories of renderings of points not touching the body. When these points can be easily and correctly specified relatively as endpoints of a motion, that is likely preferable to what I describe below. I'm not entirely happy with the duplication of strategies here, but it seems to be a good thing to do as regards markedness.

Positions in neutral space are the most unmarked positions of all, so I've given them minimally obtrusive consonants, namely glottals, which can be regarded as just phonation. I distinguish a few such positions according to the disposition of the forearm; I'm not actually sure all of them are necessary.

For the forearm pointing up I take the unmarked orientation to be palm outward, so that e.g. fingerspelling is performed with /ʔ/ onset and mostly in the unmarked orientation. For pointing forward I take it to be palm down. For the side position I haven't decided.

Again, there are classes of modifications to specify more refined points not touching the body, including some which lift points on the body out to nearby points.

There is also a special series of points for pronominal locations. These are construed as clusters with an initial nasal, which I chose with an eye to the other value of nasals. The nasal is allowed to assimilate in place to the following consonant (though I may write it overprecisely), and to take a syllabic realisation. The second element of the cluster is the representation of a (possibly modified) place on the body close to the relevant pronoun. I haven't yet defined these associations precisely, but, for instance, /nt/ is a sensible name for a default pronominal location off to one side at neutral height, and /ɴχ/ for a pronoun sagittally central and above the head. For the location of one's interlocutor I use /mf/.

Place modifications

Most of the place modifications serve to specify a point slightly displaced from its canonical location. A modification of a point coded C is coded as a cluster MC, the initial segment M giving the details. Modifications can stack. Values of M are given in the following list.

Here and subsequently, I use these four logical directions for motion along the body surface even where physically they aren't really a good fit: thus e.g. even if my palm is sitting face up in neutral space, moving towards the fingers is moving downwards.

The association of stops to directions is chosen to have them roughly line up with their association to body parts; this association will recur. There is no particularly good reason for /f/ or /x/. For completeness I point out that /s/ and /ʃ/ can also occur in this cluster-initial position, but have entirely different meanings: they denote signs with both hands active.

Motion endpoints

Some motion endpoints are represented by nasals, the last common manner of articulation I've got left. However, to keep proliferation of consonants to a minimum, obstruents are reused for motion endpoints as well. Nasals serve for motion endpoints not touching the body, obstruents for those that do touch. The rule for disambiguating this use of obstruents from their use as genuine places is the following:

The first obstruent (or cluster thereof) in a word is always a genuine place, as is any obstruent (cluster) if the previous cluster ends in (or is) a nasal or a glottal. Other obstruents (or clusters thereof) are motion endpoints, unless they're preceded in a cluster by /ʔ/ in which case they're genuine places.

Markedness justifies making the motion endpoint senses simpler, since words with motions best specified relatively in terms of them outnumber words with motions between any two old points.
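
The disambiguation rule is mechanical enough to sketch in code. The function below takes a word as a list of consonant clusters, each a list of segments, and labels every obstruent-bearing cluster as a genuine place or a motion endpoint. It makes several simplifying assumptions of my own: vowels and orientation liquids are already stripped out, affricates are single segments, and place modifiers and the two-handed prefixes /s/ and /ʃ/ are ignored. All names are invented.

```python
OBSTRUENTS = {"χ", "q", "x", "k", "ʃ", "tʃ", "ʂ", "tʂ",
              "s", "ts", "θ", "t", "f", "p"}
NASALS = {"m", "n", "ɲ", "ŋ", "ɴ"}
GLOTTALS = {"ʔ", "h"}


def classify_clusters(clusters):
    """Label each cluster 'place', 'endpoint', or None (no obstruent in it)."""
    labels = []
    seen_obstruent = False
    previous = None
    for cluster in clusters:
        # Position of the first obstruent in this cluster, if any.
        first = next((i for i, seg in enumerate(cluster)
                      if seg in OBSTRUENTS), None)
        if first is None:
            labels.append(None)
        elif not seen_obstruent:
            # The first obstruent in a word is always a genuine place.
            labels.append("place")
        elif previous and previous[-1] in NASALS | GLOTTALS:
            # So is one whose previous cluster ends in a nasal or glottal.
            labels.append("place")
        elif "ʔ" in cluster[:first]:
            # So is one preceded by the glottal stop within its own cluster.
            labels.append("place")
        else:
            labels.append("endpoint")
        if first is not None:
            seen_obstruent = True
        previous = cluster
    return labels


print(classify_clusters([["k"], ["s"]]))          # as in /kɛs/
print(classify_clusters([["s"], ["ʔ", "q"]]))     # as in /sjĩʔqlũ/
print(classify_clusters([["k"], ["n"], ["s"]]))   # a made-up skeleton
```

On the first input the /s/ comes out as an endpoint, on the second the /q/ as a genuine place, matching the descriptions of those two words among the examples.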

In particular, in a word whose first two consonants (or clusters) are a nasal and an obstruent, the obstruent is a genuine place. It's okay to rule out the sequence of an endpoint off the body followed by a relative point on the body, since endpoints on the body are always specified with respect to other points on the body (i.e. I don't have 'inward to contact').

Motion endpoints are usually interpreted relative to the nearest genuine place to their left, but if there are none such they're interpreted relative to the nearest genuine place to their right. The latter only ever happens in the inward-moving case of the last paragraph. The default orientation for a motion endpoint is the actual orientation (not the default one!) of the genuine place it's interpreted with respect to.
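
A sketch of that lookup, assuming the clusters of a word have already been labelled as "place" or "endpoint" (the labels and the function name `reference_place` are invented for illustration):

```python
def reference_place(labels, index):
    """Return the position of the genuine place an endpoint is relative to.

    Searches leftward from `index` first, then rightward; returns None
    if the word contains no genuine place at all.
    """
    for i in range(index - 1, -1, -1):
        if labels[i] == "place":
            return i
    for i in range(index + 1, len(labels)):
        if labels[i] == "place":
            return i
    return None


# An endpoint after a place refers back to it.
print(reference_place(["place", "endpoint"], 1))
# A word-initial endpoint (the inward-moving case) refers to the place after it.
print(reference_place(["endpoint", "place"], 0))
```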

The place of articulation assignments of these endpoints use the same directional scheme as the place modifications. For nasals we have

If an obstruent specifies an endpoint, then the previous place must be on the body. I use the stop-fricative distinction to show whether the motion rubs along the body (stops) or hops off it (fricatives):

All of these can be preceded by a place modification to nuance their positioning.

Finally, I've included an exceptional consonant which specifies a complex motion. The trill /r/ by itself specifies large circles in the sagittal plane. When following a consonant indicating relative motion it indicates circling back to repeat that motion.

Two-handed signs

There are two basic kinds of two-handed sign, if we exclude iconic signs in which each hand can assume a classifier handshape and act freely: these are those where the nondominant hand serves as a static base, and those where it copies the dominant hand in disposition and in motion. The motion can be copied either as a reflection or in parallel or lapping by half a revolution of a circle.

In signs with the non-dominant hand static, this hand acts merely as another set of available Places: the relevant Place consonants are /ʂ/, /tʂ/, /s/, /ts/, /θ/, and /t/. For these signs the only data that must be specified for the static hand are its handshape and orientation. The general rule is that this information is coded as a special first liquid + vowel syllable, with the same coding as for the dominant hand, and the word stress falls in this case on the second syllable; this distinguishes these from one-handed signs where stress always falls on the first syllable.

Orientations are taken with respect to the default that the contacted place faces up. For handshapes I also have defaults which are assumed if there's no initial unstressed vowel. The default handshape is B (/i/) if the contact point is the front or back of the palm or the fingers, and S (/a/) if the contact point is the thumbside or the wrist. (I'm not sure if this is a good choice for the wrist.) If the vowel is omitted, the initial approximant may be taken syllabic.
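
The handshape default can be sketched as a small lookup keyed on the contact-point consonant. The function name is invented, and I have read "the front or back of the palm" as the palm /s/ and the back of the hand /ts/, which is an assumption.

```python
# Contact points on the static hand, grouped by their default handshape.
DEFAULT_B = {"s", "ts", "θ"}    # palm, back of hand, fingers
DEFAULT_S = {"t", "ʂ", "tʂ"}    # thumb side, underside and back of wrist


def default_nondominant_vowel(contact):
    """Vowel assumed for the static hand when no initial vowel is given."""
    if contact in DEFAULT_B:
        return "i"   # handshape B
    if contact in DEFAULT_S:
        return "a"   # handshape S
    raise ValueError(f"/{contact}/ is not a place on the nondominant hand")


print(default_nondominant_vowel("s"))   # i
print(default_nondominant_vowel("t"))   # a
```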

Signs where the non-dominant hand copies the dominant hand are indicated by two special types of cluster, those with initial /s/ and /ʃ/, prefixed to the first consonant in the description of the dominant hand. Generally /s/ means that the non-dominant hand is to follow the dominant hand as a sagittal reflection, the more common behaviour, and /ʃ/ that it is to follow it in parallel. However, if there is no side-to-side motion, these two behaviours would be collapsed, so in this case I interpret the prefixes differently: /ʃ/ means that the non-dominant hand acts oppositely to the dominant in some direction, for instance staying opposite it in a circle, leaving /s/ for cases where the motion is a true reflection. (As this shows, I regard reflection as the more natural relation than parallelling.)
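
The context-dependent reading of the two prefixes amounts to a small decision table, sketched below. The function name and the wording of the returned labels are mine.

```python
def nondominant_behaviour(prefix, side_to_side_motion):
    """Interpret a two-handed prefix, given whether the sign moves sideways."""
    if prefix not in ("s", "ʃ"):
        raise ValueError("the two-handed prefixes are /s/ and /ʃ/")
    if side_to_side_motion:
        # Reflection and parallel motion are distinguishable here.
        return "sagittal reflection" if prefix == "s" else "parallel"
    # Without side-to-side motion the two would collapse, so /ʃ/ is
    # reinterpreted as acting oppositely in some direction.
    return "true reflection" if prefix == "s" else "opposite"


print(nondominant_behaviour("s", True))    # sagittal reflection
print(nondominant_behaviour("ʃ", False))   # opposite
```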

What about those signs where both hands move independently? It would be a plausible generalisation of the resting nondominant hand machinery to code the nondominant and dominant hands' motions in succession, using the primary stress to show where one switches hands.


Leftovers

I haven't given a lot of attention to any significant phoneme classes in ASL not treated so far. For instance, I don't even know how many secondary articulations it's necessary to distinguish. One of them, I suppose, is tapping, which I analyse as zero. Another is wiggling, i.e. continuous small changes of orientation. Since the lateral liquid is my most frequent orientation consonant I'm of a mind to make wiggling the also lateral /ɬ/. This is slightly unfortunate, being a fricative, but I don't really have any plausible unused manners of articulation at this point. As for ASL's suprasegmentals, for motions performed specially slowly or quickly I mean to use the iconic representations, namely lengthened and shortened vowels respectively. (I should look into a list of aspects.)

When an ASL sign has reduplication I code this by reduplication of a corresponding sequence in the spoken realisation. (This may demand buffer vowel insertion to prevent formation of new clusters.)


Examples

Below, as examples, are renderings into my system of the vocabulary words from Bill Vicars' ASL lesson #1.

/hɫujs/: the dominant hand starts in neutral space with the arm extended out (/h/), palm up (/ɫ/), in a bent-B (/uj/), and arcs over to touch the nondominant palm (/s/) with its palm side (zero). The nondominant hand is in its default position, B with palm up (zero). Relative-position alternative, /ɲɫujs/.
/kɛs/: the hand touches the ear (/k/ I think is good enough), palm in (zero), in a 1 (/ɛ/), and hops inward (/s/) to the side of the mouth. The old sign on that page might be rendered /tʃkɛstli/, doing something underhanded with cluster-initial /s/.
/fxʋɛrfxʋɛr/: the hand sits in front of (/f/) the mouth (/x/), palm in but fingertips to the side (/ʋ/), in a 1 (/ɛ/), and makes a circle (/r/) twice (reduplication; the cluster is innocuous). I'm put off by the simultaneous presence of /f-/ and /r/, given that it seems unlikely that a sign involve circles which touch the body, or at least that it should be analysed as such rather than brushing. If I made a special-case rule to drop the former in this situation the sign would be /xʋɛrxʋɛr/.
/sjĩʔqlũ/: the dominant hand starts touching the opposite palm (/s/), palmside in (zero), in a 5 (/jĩ/, with a little curvature that seems ignorable), and moves to contact another point (/ʔ/), the side of the forehead (/q/), thumbside in (/l/), and having closed up to a flat-O (/ũ/). The nondominant hand is all default (zero). The casual version is /sjĩŋũ/.
LIKE, the verb
/fjĩnɨ̃/: the hand starts on the chest (/f/), palm in (zero), in a 5 (/jĩ/), then moves off (/n/) into an 8 (/ɨ̃/). The negative would be /fjĩnɨ̃mljĩ/; I'm concerned whether /l/ is informational enough there.
/lsɹeɹʋ/: the hand touches the palm (/s/) with its fingertips (/ɹ/) in a V (/e/) and then, still touching with the fingertips (/ɹ/), rotates at the wrist (/ʋ/). The nondominant hand is rotated to have the palm to the side (/l/) but its handshape is default.
/sɲɛs/: with the nondominant hand mirroring (/s/), the dominant hand stands out to its side (/ɲ/), palmside in (zero), in a 1 (/ɛ/), and moves in to touch it on the palmside (/s/). Absolute position alternative, /sʔlɛs/. The indexical version "I meet you" given there, converted narrowly, becomes /mfɫɛˈfɫɛs/, which is a mess, but perhaps I can't expect much better unless I special-case some indexicals.
/eˈptle/, or perhaps just /eˈtle/: the dominant hand taps the other hand's fingers on the outside, which are downward of (/p/) the thumb (/t/), pinky side in (/l/), in a U (/e/). The nondominant hand is also in a U (/e/). That's the verb; the noun is reduplicated /eˈptleptle/.
/sip/: the dominant hand touches the other hand's palm (/s/) with its palm (zero) in a B (/i/) and slides off it, (logically) downwards (/p/). Nondominant hand is default.
/ʔoõ/: sitting in neutral space, arm up (/ʔ/) with palm out (zero), the hand moves from an H with the thumb out in front (/o/) to a two-fingered O (/õ/). (An earlier version of the scheme allowed a relative place to be used here, so you could have /noõ/! Pity that changed.) Indexed "he told me no", as portrayed there, /ntʃɫoftʃõ/ or some such fiasco.
/ʃʔlɛrʃʔlɛr/: with the nondominant hand following 180° out of phase (/ʃ/), the dominant hand starts in neutral space (/ʔ/), palm more or less to the side (/l/), in a 1 (/ɛ/), and makes circles (/r/; actually it changes orientation as it circles but I don't know how to say that), a few times (reduplication). I suppose if my system had a name this would be it, Š'leř or however one might spell it. (Maybe not like that. That apostrophe there grosses me out, it looks far too much like a gratuitous alien name apostrophe.)
/tsik/: the dominant hand touches the back of the other hand (/ts/) with its palm (zero) in a B (/i/) and slides off it, upwards (/k/). Nondominant hand is default.
/sfqũn/: both hands (/s/), starting in a space near (/f/) the temples (/q/), palm side in (zero), in flat-O:s (/ũ/), move outward (/n/).
AGENT, the suffix
/-(s)hlim/, in its least reduced form: wherever the hands may have been (/s/ to catch them both), they go to neutral space with arms out (/h/), palms toward each other (/l/), in B:s (/i/), and move a little down (/m/). (Vicars only introduces this in STUDENT and TEACHER.)
/xihɫ/: the hand starts on the lips (/x/), palm in (zero), in a B (/i/), and moves to neutral space, arm out (/h/), palm up (/ɫ/). Relative-position alternative, /ximɹ/. I've gone for absolute because if one wanted to have a reversal of direction for negation on this sign, absolute position lets one just write /xih/ whereas in this instance relative position copes only awkwardly (perhaps /ximɹɫ/, but the liquid cluster is not pretty). The fact that GOOD /xiʔsɫ/ and BAD /xiʔs/ canonically alight on the nondominant hand but alternate with forms where they don't seems to support this too.
/fqɛw.ɛ/: the hand sits in space near (/f/) the temple (/q/) with palm facing in (zero) and opens from an X (/ɛw/) to a 1 (/ɛ/).
/shɫjĩɲshɫjĩɲ/ plus whatever renders the WH eyebrows: both hands moving oppositely (/s/) in neutral space, arm out (/h/), palm up (/ɫ/), in (loose) 5s (/jĩ/), and move out to the side (/ɲ/) back and forth (reduplication).
/ʔɛʋʔɛʋ/ plus the WH marking: the hand sits in neutral space, arm up (/ʔ/), palm out (zero), in a 1 (/ɛ/), and rolls at the wrist (/ʋ/) back and forth (reduplication).
/xlɛ̃ɔ̃jxlɛ̃ɔ̃j/ plus the WH marking: the hand sits on the chin (/x/), thumbside in (/l/), in an L (/ɛ̃/), which switches to a bent-L (/ɔ̃j/), twice (reduplication). The old sign would be something like /(k)xɛ.../; I haven't decided on a coding for that kind of circling.
/qujmjã/ plus the WH marking: the hand starts at the temple (/q/), palm in (zero), in a bent-B (/uj/), and moves off and down (/m/) into a Y (/jã/).
/ʔaɹʔaɹ/: the hand sits in neutral space (/ʔ/), more or less palm out (zero), in an S (/a/), and bends at the wrist to point the finger side out (/ɹ/), twice (reduplication).
At the foot of the list come two pages of pronouns and indexical forms. These have many similarities in shape, so I treat them at less length.
3rd person singular pronouns
Several forms, depending on the indexed location. A fairly unmarked one is /ntɹɛ/.
3rd person plural pronouns
One swept one is /ntɹɛɲ/.
These use /i/ for a B hand with neutral orientation: MY /fi/, YOUR /mfi/, one 3rd person /nti/, etc.
THAT, unindexed
THAT, indexed
/ntjã/, with the same variants as third-person pronouns
/ntɹwjẽɲr/, for circling, or /ntɹwjẽɲntɹwjẽɲ/, reduplication for the shaking. Sadly both have an entirely disgusting initial cluster, which will be a theme for the next several pronouns.
WE THREE, inclusive
/hɹwjẽɲr/, which I've coded simply as in neutral space to distinguish it from the second-person analogue
WE TWO, inclusive
/hlɘjnhlɘjn/, ditto

Rounding off some corners