Message 2004-10-0075: Re: Phylogenetic Notation

Tue, 14 Sep 2004 19:00:51 +0200

Date: Tue, 14 Sep 2004 19:00:51 +0200
From: [unknown]
To: PML <phylocode@ouvaxa.cats.ohiou.edu>
Subject: Re: Phylogenetic Notation

----- Original Message -----
From: "T. Michael Keesey" <mightyodinn@yahoo.com>
Sent: Tuesday, September 14, 2004 12:53 AM

> > At this point in history, I don't think it's a noticeable exaggeration to
> > claim that every scientist speaks English.
>
> Yes, but what about Latin?

This seems to be a compromise to the botanists, who are still obliged to
write short diagnoses of newly described taxa in Latin... except if those
taxa are fossil, AFAIK.

> And will definitions written in either of these always be unambiguous?

If the registration database administrator is vigilant enough, probably they
will... :-}

> > I'll try to retrieve the system I proposed a month or so ago (it's simpler,
> > uses non-ASCII characters but only such that occur in iso-8859-1 "Western
> > European", and is not capable of expressing the more complex of your
> > examples), to see if I could find something about your system to quibble
> > about... :-)
>
> I'd be very interested to see that. Perhaps it can be instructive in making the
> system more accessible.

Here it is, updated from my post from June 15th (and another from the 18th
that seemingly didn't get through). In fact, I think all characters are
ASCII after all.
-------------------------------------------

A through G are taxa, M is an apomorphy. Parentheses indicate optional
additions, such as more than two specifiers.

Node-based:
{A(, B, C...) + D}
        "{}" used instead of "Clade()" because it's shorter, already used on
a few websites, language-free, and avoids confusion with the method to write
a tree -- (A + (B + C)). It keeps definition and description apart. The
identity to the brackets used for mathematical sets is a fortunate
coincidence.
        "+" used instead of "and" because it's shorter, in widespread use
(abstract booklet!) and language-free.
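
What a node-based definition picks out can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy tree, the node names (n1, n2, root) and the function names are all hypothetical, and the reading used is the usual one, "the most recent common ancestor of the specifiers and all its descendants".

```python
# Toy rooted tree stored as child -> parent links. All names are hypothetical.
PARENT = {"A": "n1", "B": "n1", "n1": "n2", "D": "n2", "n2": "root", "G": "root"}

def ancestors(node):
    """Path from node up to the root, starting with the node itself."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def mrca(specifiers):
    """Most recent common ancestor of all specifiers."""
    first, *rest = specifiers
    others = [set(ancestors(s)) for s in rest]
    for candidate in ancestors(first):  # walks from tip toward root
        if all(candidate in o for o in others):
            return candidate
    raise ValueError("specifiers share no ancestor")

def members(clade_root):
    """The clade root plus every node descending from it."""
    every_node = set(PARENT) | set(PARENT.values())
    return {n for n in every_node if clade_root in ancestors(n)}

def node_based(*specifiers):
    """{A(, B, C...) + D} read as: MRCA of the specifiers and all descendants."""
    return members(mrca(specifiers))
```

On this toy tree, node_based("A", "D") contains A, B, D and the two internal nodes n1 and n2, but not G.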

Stem-based:
{A(, B, C...) # D(, E, F...)}
        "#" used instead of "not" because it's shorter and language-free;
instead of ">" or "<--" because the direction of the arrow would confuse
people either way, and because "<--" is painfully ugly, unless replaced by a
real arrow; instead of "¬" because this (the mathematical "not" sign) is
poorly known and poorly available on keyboards. My English teacher used "#"
for "opposite", probably because it's similar to the mathematical "unequal"
sign ("=" with one instead of two "/" through it). Its use for "number"
seems to be restricted to English-speaking countries and is not understood
elsewhere.
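
A stem-based definition can be sketched the same way. The code below is a minimal illustration, assuming the usual reading "the most inclusive clade that contains every internal specifier and no external one"; the toy tree, its node names and the function names are hypothetical, and an internal node here stands for the branch leading to it together with everything above that branch.

```python
# Toy rooted tree stored as child -> parent links. All names are hypothetical.
PARENT = {"A": "n1", "B": "n1", "n1": "n2", "D": "n2", "n2": "root", "G": "root"}

def ancestors(node):
    """Path from node up to the root, starting with the node itself."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def members(clade_root):
    """The clade root plus every node descending from it."""
    every_node = set(PARENT) | set(PARENT.values())
    return {n for n in every_node if clade_root in ancestors(n)}

def stem_based(internal, external):
    """{A(, B, C...) # D(, E, F...)}: largest clade with all internal, no external."""
    blocked = set()
    for e in external:
        blocked.update(ancestors(e))  # any clade rooted here would include e
    best = None
    for candidate in ancestors(internal[0]):  # walk from tip toward root
        if candidate in blocked:
            break
        best = candidate
    # The name applies to nothing if some internal specifier falls outside.
    if best is None or not all(best in ancestors(i) for i in internal):
        return None
    return members(best)
```

For example, stem_based(["A"], ["D"]) gives A, B and n1, while stem_based(["A", "G"], ["D"]) returns None because no clade on this tree holds both A and G without also holding D.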

Apomorphy-based (should those be allowed):
{M @ A (+ B, C...)}
        "@" is the chemical "in" sign (e.g.
http://gaus90.chem.yale.edu/window.html), probably because it looks so
_en_circled. Should this be deemed not straightforward enough, we could
spell "in" out; "in" is Latin, English, German and more, so some
internationality would be retained this way.
        Hey, wait!!! Actually we don't need _any_ mention of "in" here. We
could just write {M A (+ B, C...)}, couldn't we? :-)
        (The apomorphy itself would still have to be written in a language.
Theoretically, this could be used as an argument to ban apomorphy-based
definitions -- but if the apomorphy is well enough described and figured,
_this_ shouldn't produce any problem in the real world.)

One kind of qualifying clause:
{[...] \ G}
        "\" is the mathematical "without" sign, and exists on every computer
keyboard. It does not work for Art. 11.9 Example 1, but it does for Example 2:
*Lepidosauriformes* = {*Lacerta agilis* + *Crocodylus niloticus* \ *Youngina
capensis*}.
        (Should math be preferred, this could be "{*Lacerta agilis* +
*Crocodylus niloticus*} \ {*Youngina capensis*}" instead; however, this can
make it confusing to tell how many definitions there are or where it ends.)

Another kind of qualifying clause:
{[...] | [condition]}
        "|" is the mathematical sign that is used in a similar way. Let's
see... it works for Art. 11.9 Example 1: *Pinnipedia* = {*Otaria byronia*,
*Odobenus rosmarus* + *Phoca vitulina* | flippers @ *Otaria byronia*,
*Odobenus rosmarus*, *Phoca vitulina*}. More examples will need to be tested
to see if this notation can become confusing.
        (Another question is if this is needed at all, even if
apomorphy-based definitions will be allowed. For example, despite the
emphasis on the apomorphy, *Pinnipedia* is a crown-group here; it would be
_the very same clade_ if it were defined {*Otaria*, *Odobenus* + *Phoca*
[your favorite terrestrial Carnivora]}.)
        Several conditions could be separated with ";", for example.

Stem-modified crown definition (Note 9.4.1):
{¥ A # B}
        ¥ is the symbol for "crown-group". Totally straightforward. It
depicts a cladogram with a node that is marked by double underlining. =8-)
=8-) =8-) Disadvantage: Not available on German keyboards, at least.
Advantage: Seems to be ASCII.
        Perhaps this could be shortened to {A ¥ B} -- if this is not too
confusing (A is the internal, B is the external specifier).
        (I have only just noticed that such definitions, too, can
self-destruct, namely if A is extinct; then there's a possibility that there
is nothing alive that's closer to A than to B.)
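
The self-destruct case can be made concrete with a small sketch. It assumes one possible reading of {¥ A # B}, namely "the crown group of everything closer to A than to B", computed as the most recent common ancestor of the living members of that stem clade plus all its descendants. The toy tree, the EXTANT set and the function names are hypothetical.

```python
# Toy rooted tree stored as child -> parent links. All names are hypothetical.
PARENT = {"A": "n1", "C": "n1", "n1": "n2", "F": "n2", "n2": "root", "B": "root"}
EXTANT = {"A", "B", "C"}  # F is treated as a fossil tip

def ancestors(node):
    """Path from node up to the root, starting with the node itself."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def members(clade_root):
    """The clade root plus every node descending from it."""
    every_node = set(PARENT) | set(PARENT.values())
    return {n for n in every_node if clade_root in ancestors(n)}

def stem_clade(internal, external):
    """Everything closer to `internal` than to `external`."""
    blocked = set(ancestors(external))
    best = None
    for candidate in ancestors(internal):  # walk from tip toward root
        if candidate in blocked:
            break
        best = candidate
    return members(best) if best is not None else set()

def crown_of_stem(internal, external):
    """Crown group of the stem clade, or None if nothing in it is alive."""
    living = sorted(stem_clade(internal, external) & EXTANT)
    if not living:
        return None  # the definition "self-destructs"
    shared = set.intersection(*(set(ancestors(t)) for t in living))
    crown_root = next(n for n in ancestors(living[0]) if n in shared)
    return members(crown_root)
```

Here crown_of_stem("A", "B") leaves out the fossil F, and crown_of_stem("F", "A") returns None because nothing alive is closer to F than to A.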

Apomorphy-modified crown definition:
{¥ M @ A}

Ancestor-based definition (like "*Homo sapiens* and all its descendants"):
{A}
        A is the ancestor. The format is straightforward because a species
or specimen cannot by itself constitute a clade if it has any descendants.
        Not applicable for Panbiota/Nominata/Nominanda*; its definition
would have to be interpreted as apomorphy-based, {life @ *Homo sapiens*}
respectively {life *Homo sapiens*}.

* Jon's abstract said *Panbiota*. His talk said *Nominata*, the "named
ones". I wonder if he meant "those that are to be named/those we will name",
which would be "nominanda".

Never used so far, but node-modified crown definitions are imaginable:
{¥ A + B}
Would mean "the crown group of the clade (A + B)", implying that A is extant
and B fossil. But this type of definition would probably be
indistinguishable from a stem-modified one with the same specifiers (A
internal, B external). Hmmm... it would be _exactly_ identical. Right?

And now the big test: Can I manage to express the definition of
*Ichthyornis*?
{*Ichthyornis dispar* # *Struthio camelus*, *Tinamus major*, *Vultur
gryphus* | amphicoelous cervical vertebrae, [rest of the list] @
*Ichthyornis dispar*}
        I think this works. Does it?
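
To check that the pieces compose mechanically, here is a minimal sketch of a formatter that assembles a definition string from its parts. The function name `render` and its parameters are hypothetical, and the joining rules (", " between specifiers, " + " before the last node-based specifier, ";" between conditions) are only my reading of the examples in this post.

```python
def render(internal, external=(), excluded=(), conditions=()):
    """Assemble a definition string in the notation sketched in this post."""
    if external:
        # Stem-based: internal specifiers # external specifiers
        body = ", ".join(internal) + " # " + ", ".join(external)
    elif len(internal) > 1:
        # Node-based: the last specifier is attached with "+"
        body = ", ".join(internal[:-1]) + " + " + internal[-1]
    else:
        # Ancestor-based: a single specifier on its own
        body = internal[0]
    if excluded:
        body += " \\ " + ", ".join(excluded)  # "without" qualifying clause
    if conditions:
        body += " | " + "; ".join(conditions)  # other qualifying clauses
    return "{" + body + "}"

ichthyornis = render(
    ["*Ichthyornis dispar*"],
    external=["*Struthio camelus*", "*Tinamus major*", "*Vultur gryphus*"],
    conditions=[
        "amphicoelous cervical vertebrae, [rest of the list] @ *Ichthyornis dispar*"
    ],
)
print(ichthyornis)
```

The printed string is the *Ichthyornis* definition above on a single line; the conditions are passed through as free text, so the formatter does not check that they are well formed.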


  
