Naming expansin genes
Let's say you have identified a putative expansin gene sequence. How do you know whether it is a new expansin and, if it is new, how do you assign it a gene number that will not conflict with existing or future expansin gene designations?

Is it a new expansin? Here are the steps:

• Run BLASTP using your protein sequence as the query against the nonredundant sequence database at NCBI or its mirrors. If you find an identical or nearly identical sequence in the same species and with an expansin name, then your search is done: the sequence has already been identified and annotated. As long as the name conforms to the updated nomenclature rules, you must use this name. If the name does not conform, then open a dialog with the author (depositor) to get the name updated to current nomenclature rules. If necessary, contact D Cosgrove for advice.
• If your sequence is from a species with a lot of public genomic data, it may already be named even though the sequence and gene name do not show up in GenBank. Species that have already been deeply examined include Arabidopsis and rice, and these sequences should be in GenBank. Other species with nearly complete genomes include Populus, Physcomitrella, and Selaginella; these sequences are not yet in GenBank, but are found in specialized databases that must be queried individually. Other species with extensive genomic data include maize, tomato, wheat, soybean, and Medicago, and the list is continually growing. There is a high likelihood that the expansins in these species have already been identified and named. Publications may be in press already. Check the expansin gene tables to see if your sequence is already publicly registered. If you think it is new, check with D Cosgrove to reserve a name for your sequence.
• If your sequence is from a species with little genomic data and your GenBank search comes up empty, then it is probably a new sequence and should be named following the standard nomenclature rules. It is best to publish the sequence in GenBank as soon as possible, to prevent others from using the same name for a different sequence, or assigning a different name to the identical sequence.
• Although in most cases it is straightforward to determine whether a gene sequence is already represented in the public databases, sometimes it becomes more complicated, e.g., for highly similar but not identical sequences when different cultivars of the same species are involved or when the species is polyploid. In those cases, careful analysis is needed to avoid naming collisions or confusion, such as assigning multiple gene names to different alleles of the same gene.
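
The decision logic in the steps above can be sketched in a few lines of Python. This is only an illustration: the `Hit` record, the `triage` function, and the identity and coverage cutoffs (99%, 95%, 90%) are hypothetical choices made for this sketch and are not part of the nomenclature rules. It assumes you have already run BLASTP and reduced each hit to its species, whether it carries an expansin name, its percent identity, and its query coverage.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hit:
    """One BLASTP hit, reduced to the fields this triage needs (hypothetical record)."""
    species: str
    has_expansin_name: bool   # True if the database record is annotated as an expansin
    percent_identity: float   # percent identity of the alignment
    query_coverage: float     # percent of the query covered by the alignment


def triage(query_species: str, hits: List[Hit],
           identical_at: float = 99.0,   # assumed cutoff for "identical or nearly identical"
           similar_at: float = 95.0,     # assumed cutoff for "highly similar"
           min_coverage: float = 90.0) -> str:
    """Classify a putative expansin as 'existing', 'ambiguous', or 'candidate_new'."""
    # Only near-full-length hits from the same species bear on the naming decision.
    same_species = [
        h for h in hits
        if h.species.lower() == query_species.lower()
        and h.query_coverage >= min_coverage
    ]

    # Identical or nearly identical, same species, expansin name: reuse that name.
    if any(h.percent_identity >= identical_at and h.has_expansin_name
           for h in same_species):
        return "existing"

    # Highly similar but not identical (possible allele, cultivar variant, or
    # homeolog in a polyploid): needs careful manual analysis before naming.
    if any(h.percent_identity >= similar_at for h in same_species):
        return "ambiguous"

    # Nothing close in the same species: possibly new, but still check the
    # expansin gene tables and species-specific databases before naming.
    return "candidate_new"


if __name__ == "__main__":
    hits = [Hit("Oryza sativa", True, 96.2, 100.0)]
    print(triage("Oryza sativa", hits))
```

A result of "candidate_new" is only a prompt to continue with the remaining checks (gene tables, specialized genome databases, reserving a name); it does not by itself establish that the sequence is new.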

