Our objective is to automatically identify emergent technical areas from the full text of scientific papers, including their citations. Achieving this goal will require significant advances in our understanding of both the dynamics of emergence and the indicators by which it can be recognized. We will build a large collection of full-text features and test them in large-scale predictive probabilistic models. For example, we will use the context of citations to see what authors say about the papers they cite: is the cited paper described as "promising", "established", or "disputed"? Did the citing paper use its method, its data, or nothing from it? How well do such features predict the emergence of new sub-fields? We will also model the birth, growth, and death of scientific micro-communities, using novel algorithms that cluster millions of scientific papers by their citations.