Wed, 15 Sep 2004 00:31:29 -0700 (PDT)

From: **[unknown]**

To: **Mailing List - PhyloCode <phylocode@ouvaxa.cats.ohiou.edu>**

Cc: **David Marjanovic <david.marjanovic@gmx.at>**

Re: Phylogenetic Notation

--- David Marjanovic <david.marjanovic@gmx.at> wrote:

> > > And will definitions written in either of these always be unambiguous?
>
> If the registration database administrator is vigilant enough, probably they
> will... :-}

Not good enough, says I.

> > > I'll try to retrieve the system I proposed a month or so ago (it's
> > > simpler, uses non-ASCII characters but only such that occur in
> > > iso-8859-1 "Western European", and is not capable of expressing the
> > > more complex of your examples), to see if I could find something about
> > > your system to quibble about... :-)
> >
> > I'd be very interested to see that. Perhaps it can be instructive in
> > making the system more accessible.
>
> Here it is, updated from my post from June 15th (and another from the 18th
> that seemingly didn't get through). In fact, I think all characters are
> ASCII after all.
> -------------------------------------------
>
> A through G are taxa, M is an apomorphy. Parentheses indicate optional
> additions, such as more than two specifiers.
>
> Node-based:
> {A(, B, C...) + D}
> "{}" used instead of "Clade()" because it's shorter, already used on
> a few websites, language-free, and avoids confusion with the method to write
> a tree -- (A + (B + C)). It keeps definition and description apart. The
> identity to the brackets used for mathematical sets is a fortunate
> coincidence.

Or not-so-fortunate, in my eye. Because you've appropriated them for clades,
now they can't be used for other types of sets of organisms. The shorthand in
the current draft of PhyloCode does specify "clade", but I think the real
issue here is grammar, not vocabulary. Vocabulary (e.g., the word "clade") can
be defined rigorously and independently in any language. Grammar cannot, and
that is what I was trying to overcome by using mathematical notation.

> Stem-based:
> {A(, B, C...) # D(, E, F...)}
[snipped]
> Apomorphy-based (should those be allowed):
> {M @ A (+ B, C...)}
[snipped]

These (and node-based clades) are already provided with shorthand notations
(which are also ASCII-friendly) in the current draft of PhyloCode.

> Hey, wait!!! Actually we don't need _any_ mention of "in" here. We
> could just write {M A (+ B, C...)}, couldn't we? :-)

I suppose ... somewhat discombobulating, though.

> (The apomorphy itself would still have to be written in a language.
> Theoretically, this could be used as an argument to ban apomorphy-based
> definitions -- but if the apomorphy is well enough described and figured,
> _this_ shouldn't produce any problem in the real world.)

As mentioned before, I can't think of any way around this, either. A necessary
evil, unless you don't view apomorph-based clades as necessary.

> One kind of qualifying clause:
> {[...] \ G}
> "\" is the mathematical "without" sign, and exists on every computer
> keyboard. Does not work for Art. 11.9 Example 1, but for Example 2:
> *Lepidosauriformes* = {*Lacerta agilis* + *Crocodylus niloticus* \
> *Youngina capensis*}.

_C. niloticus_ and _Y. capensis_ should be switched there, right? Didn't know
about that usage of the "backslash".

> (Should math be preferred, this could be "{*Lacerta agilis* +
> *Crocodylus niloticus*} \ {*Youngina capensis*}" instead; however, this can
> make it confusing to tell how many definitions there are or where it ends.)

I don't think that's a problem, since there is clearly an operator bridging
the two expressions.

> Another kind of qualifying clause:
> {[...] | [condition]}
> "|" is the mathematical sign that is used in a similar way. Let's
> see... it works for Art. 11.9 Example 1: *Pinnipedia* = {*Otaria byronia*,
> *Odobenus rosmarus* + *Phoca vitulina* | flippers @ *Otaria byronia*,
> *Odobenus rosmarus*, *Phoca vitulina*}. More examples will need to be tested
> to see if this notation can become confusing.

"|" is usually translated orally to "such that" or "where". But it seems to me
what you really want here is a conditional, usually written as an arrow and
orally translated as "if X, then Y", e.g.

{"flippers" @ _Otaria byronia_ + _Odobenus rosmarus_ + _Phoca vitulina_} != Ø
-> _Pinnipedia_ = {_Otaria byronia_ + _Odobenus rosmarus_ + _Phoca vitulina_}

("!= Ø" should be read as "not equal to a null set")

> (Another question is if this is needed at all, even if
> apomorphy-based definitions will be allowed. For example, despite the
> emphasis on the apomorphy, *Pinnipedia* is a crown-group here; it would be
> _the very same clade_ if it were defined {*Otaria*, *Odobenus* + *Phoca*
> [your favorite terrestrial Carnivora]}.)

No, it could, in theory, still be a crown clade not including any other extant
carnivorans ("fissipeds") AND have an ancestor that did not possess flippers.

> Several conditions could be separated with ";", for example.
>
> Stem-modified crown definition (Note 9.4.1):
> {¥ A # B}
> ¥ is the symbol for "crown-group". Totally straightforward. It
> depicts a cladogram with a node that is marked by double underlining. =8-)
> =8-) =8-)

Hehe ... international traders might disagree.

> Disadvantage: Not available on German keyboards, at least.

or North American

> Advantage: Seems to be ASCII.
> Perhaps this could be shortened to {A ¥ B} -- if this is not too
> confusing (A is the internal, B is the external specifier).
> (I have only just noticed that such definitions, too, can
> self-destruct, namely if A is extinct; then there's a possibility that there
> is nothing alive that's closer to A than to B.)

Good point, although I don't think there's anything wrong with
self-destructing names. (Nor do you, judging from your abstract.)

> Apomorphy-modified crown definition:
> {¥ M @ A}
>
> Ancestor-based definition (like "*Homo sapiens* and all its descendants"):
> {A}
> A is the ancestor.
> The format is straightforward because a species
> or specimen cannot by itself constitute a clade if it has any descendants.

Here's where I really dislike this notation, because it looks like "the set
of A".

> Not applicable for Panbiota/Nominata/Nominanda*; its definition
> would have to be interpreted as apomorphy-based, {life @ *Homo sapiens*}
> respectively {life *Homo sapiens*}.

I've provided a method of notating the definition without apomorphs. You seem
to be equating "life" with ancestor-descendant relationships, which is
probably a good idea, but possibly presumptuous?

[snipped]

> And now the big test: Can I manage to express the definition of
> *Ichthyornis*?
> {*Ichthyornis dispar* # *Struthio camelus*, *Tinamus major*, *Vultur
> gryphus* | amphicoelous cervical vertebrae, [rest of the list] @
> *Ichthyornis dispar*}
> I think this works. Does it?

Nope. We know that the characters appear in _I. dispar_; the question is how
far back they go. The actual prose definition is worded not so much as a
definition with a qualifying clause, but as an intersection of two clades.
Rendering this is not really possible in your notation or the shorthand
proposed in PhyloCode.

It seems to me there are not enough really good single ASCII character
approximations for expression of set notation. Boolean notation could get by
to some extent on the symbols used in C-based computer code:

&: "and"
|: "or"
~: "not"

But some of these conflict with other symbols ("&" can appear in citations,
and "|" is important in set notation, as discussed above).

Perhaps a better ASCII solution, then, would be words marked off by otherwise
unused characters, such as backslashes (\).

\member of\ (lower case epsilon)
\not member of\ (lower case epsilon with slash)
\for all\ (upside-down A)
\exists\ (backwards E)
\union\ (U-like curve)
\intersection\ (U-like curve upside-down)
\not\ (¬; it is ASCII, but in the interest of consistency....)
\and\ (angle pointing up)
\or\ (angle pointing down)
\subset of\ (c-like curve with line underneath)
\proper subset of\ (c-like curve)
\not subset of\ (c-like curve with slash)
\unequal\ (equals sign with slash)

And some can be approximated pretty well:

-> (right arrow: "if ... then ...")
<- (left arrow: reversed "if ... then ...")
<-> (double arrow: "... if and only if ...")
Ø ("null set")
| (straight line: "such that" or "where")
' (prime tick)

Giving this a try on the example definitions (into which I've since also
incorporated PhyloCode's currently proposed shorthand):

_Pinnipedia_ = nodeClade(Specifiers) <-
    Specifiers = {_Otaria byronia_ de Blainville 1820, _Odobenus rosmarus_
    Linnaeus 1758, _Phoca vitulina_ Linnaeus 1758}
    /and/ nodeClade(Specifiers) /subset of/ apomorphClade({“flippers”},
    Specifiers)

_Lepidosauriformes_ = Content <-
    Content = clade(_Lacerta agilis_ Linnaeus 1758 not _Youngina capensis_
    Broom 1914)
    /and/ Content /subset of/ Sauria
    /and/ Sauria = clade(_Lacerta agilis_ Linnaeus 1758 and _Crocodylus
    niloticus_ Laurenti 1768)

_Halecostomi_ = clade(_Amia calva_ Linnaeus 1766, _Perca fluviatilis_
    Linnaeus 1758 not _Lepisosteus osseus_ Linnaeus 1758)

_Dinosauria_ = clade(_Iguanodon bernissartensis_ Boulenger in van Beneden
    1881 and _Megalosaurus bucklandi_ von Meyer 1832 and _Hylaeosaurus
    armatus_ Mantell 1833)

_Nominata_ = clade(firstAncestors({_Homo sapiens_ Linnaeus 1758}))

_Saurischia_ = clade(_Megalosaurus bucklandi_ von Meyer 1832 not _Iguanodon
    bernissartensis_ Boulenger in van Beneden 1881)

_Panaves_ = panstemClade(_Aves_) <-
    _Aves_ = clade(_Struthio camelus_ Linnaeus 1758 and _Tetrao major_
    Gmelin 1789 and _Vultur gryphus_ Linnaeus 1758)

_Predentata_ = clade("predentary bone" in _Iguanodon bernissartensis_
    Boulenger in van Beneden 1881)

_Ichthyornis_ = apomorphClade(SelectedDiagnosticCharacters, Specifiers)
    /intersection/ stemClade(Specifiers, AvesSpecifiers) <-
    SelectedDiagnosticCharacters = {
        "cervical vertebrae: amphicoelous or ‘biconcave’",
        "bicipital crest on humerus with pit-shaped fossa for muscular
        attachment located directly at the distal end of the bicipital
        crest",
        "dimensions of the ulna’s dorsal condyle such that the length of the
        trochlear surface along the posterior surface of the distal ulna is
        approximately equal to the width of the trochlear surface taken
        across its distal end",
        "oval scar located on the posteroventral surface of the distal
        radius, in the center of a depression",
        "large tubercle developed close to the articular surface for the
        first phalanx of the second digit where the deep tendinal groove for
        the m. extensor digitorum communis ends"}
    /and/ Specifiers = {_Ichthyornis dispar_ Marsh 1872b}
    /and/ AvesSpecifiers = {_Struthio camelus_ Linnaeus 1758, _Tetrao major_
    Gmelin 1789, _Vultur gryphus_ Linnaeus 1758}

_Ichthyornis dispar_ Marsh 1872b = species(YPM 1450)

And on some of the derived functions:

e /member of/ Organisms ->
    ancestors(e) = {x /member of/ Organisms | x /member of/ parents(e) /or/
    /exists/ y /member of/ parents(e): x /member of/ ancestors(y)}

/for all/ x /member of/ S: x /member of/ Organisms /union/ Specimens /union/
Species ->
    specifiedOrganisms(S) = {x /member of/ Organisms | /exists/ y /member of/
    S: (y /member of/ Organisms /and/ x = y) /or/
    (y /member of/ Specimens /and/ x = organism(y)) /or/
    (y /member of/ Species /and/ x = organism(type(y)))}

S' = specifiedOrganisms(S) ->
    commonAncestors(S) = {x /member of/ Organisms | /exists/ y /member of/
    S': x /member of/ ancestors(y)}

S' = specifiedOrganisms(S) /and/ S' /member of/ AncestralSets ->
    ancestorClade(S) = S' /union/ {x /member of/ Organisms | /exists/ y
    /member of/ S': x /member of/ descendants(y)}

I' = specifiedOrganisms(I) /and/ E' = specifiedOrganisms(E) ->
    stemClade(I, E) = ancestorClade({x /member of/ Organisms |
    x /member of/ commonAncestors(I')
    /and/ x /not member of/ lineage(E')})

Illegible? More legible than the mathematical symbols? Equally difficult? You
be the judge because I need some sleep.

=====
=====> T. Michael Keesey <http://dino.lm.com/contact>
=====> The Dinosauricon <http://dinosauricon.com>
=====> Instant Messenger <Ric Blayze>
=====
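
The derived functions above map fairly directly onto executable set operations. What follows is only a minimal Python sketch and not part of the proposal itself. The toy `PARENTS` table, the snake_case function names, and the meaning given to `lineage` (the organisms plus all their ancestors) are assumptions made for illustration. The sketch also reads commonAncestors as "ancestors shared by every specifier", which is an assumed "for all" reading; with a single specifier it coincides with the /exists/ wording above.

```python
# Hypothetical toy pedigree (an assumption for illustration): each organism
# maps to the set of its parents. "root" has no recorded parents.
PARENTS = {
    "root": set(),
    "a": {"root"},
    "b": {"root"},
    "a1": {"a"},
    "a2": {"a"},
    "b1": {"b"},
}


def ancestors(e):
    """Parents of e plus, recursively, the ancestors of those parents."""
    found = set()
    for parent in PARENTS[e]:
        found.add(parent)
        found |= ancestors(parent)
    return found


def descendants(e):
    """Every organism that counts e among its ancestors."""
    return {x for x in PARENTS if e in ancestors(x)}


def lineage(organisms):
    """Assumed meaning of lineage(): the organisms plus all their ancestors."""
    found = set(organisms)
    for o in organisms:
        found |= ancestors(o)
    return found


def common_ancestors(specifiers):
    """Ancestors shared by every specifier (the assumed 'for all' reading)."""
    ancestor_sets = [ancestors(s) for s in specifiers]
    return set.intersection(*ancestor_sets) if ancestor_sets else set()


def ancestor_clade(founders):
    """The founders together with all of their descendants."""
    clade = set(founders)
    for f in founders:
        clade |= descendants(f)
    return clade


def stem_clade(internal, external):
    """Clade founded by ancestors of the internal specifiers that lie
    outside the lineage of the external specifiers."""
    founders = common_ancestors(internal) - lineage(external)
    return ancestor_clade(founders)


print(sorted(stem_clade({"a1"}, {"b1"})))  # ['a', 'a1', 'a2']
```

In this toy pedigree the only ancestor of `a1` that is not also in the lineage of `b1` is `a`, so the stem clade comes out as `a` and its descendants. A real implementation would also need the specifiedOrganisms step (resolving species and specimens to organisms), which the sketch skips by using organisms directly.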

