The Penn Treebank Project ... The Penn Treebank Project annotates naturally occurring text for linguistic structure. Most notably, we produce skeletal parses showing rough syntactic and semantic information -- a bank of linguistic trees.
www.cis.upenn.edu/~treebank/
bin/sed -f # Sed script to produce Penn Treebank tokenization on arbitrary raw text. # Yeah, sure. # expected input: raw text with ONE SENTENCE TOKEN PER ...
www.cis.upenn.edu/~treebank/tokenizer.sed
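The tokenizer described in the entry above takes raw text with one sentence per line. The following is a minimal Python sketch of a few Treebank-style splitting rules (punctuation, a sentence-final period, and English clitics such as n't and 's). It is an illustration under stated assumptions, not a port of tokenizer.sed: the function name `treebank_tokenize` and the specific rule set are hypothetical, and the real script covers more cases.

```python
import re


def treebank_tokenize(sentence: str) -> list[str]:
    """Split ONE sentence into tokens with a few Treebank-style rules.

    Illustrative only: this is not the rule set of tokenizer.sed.
    """
    text = sentence
    # Put spaces around common punctuation so it becomes its own token.
    text = re.sub(r'([,;:?!()"])', r" \1 ", text)
    # Split only a sentence-final period; periods inside the sentence
    # (for example in "Mr.") stay attached to their word.
    text = re.sub(r"\.\s*$", " .", text)
    # Split the negative clitic: "don't" -> "do n't", "can't" -> "ca n't".
    text = re.sub(r"n't\b", " n't", text, flags=re.IGNORECASE)
    # Split other clitics: 's, 're, 've, 'll, 'd, 'm.
    text = re.sub(r"'(s|re|ve|ll|d|m)\b", r" '\1", text, flags=re.IGNORECASE)
    return text.split()


if __name__ == "__main__":
    print(treebank_tokenize("They don't like Mr. Smith's dog, do they?"))
```

Because the period rule only fires at the end of the input, the one-sentence-per-line input assumption matters: running the function on multi-sentence text would leave interior sentence-final periods attached to the preceding word.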
S - simple declarative clause, i.e. one that is not introduced by a (possibly empty) subordinating conjunction or a wh-word and ... Penn Treebank II Tags ... Note: This information comes from "Bracketing Guidelines for Treebank II Style Penn Treebank Project" - part of the documentation that comes with the Penn Treebank.
bulba.sdsu.edu/jeanette/thesis/PennTags.html
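Clause labels such as S appear as node labels in bracketed parses. As a rough illustration, the sketch below reads a bracketed string into nested `(label, children)` tuples and recovers the words at the leaves. The function names `parse_bracketed` and `leaves` and the example sentence are hypothetical, and the sketch assumes every bracket carries a label, so it does not handle the unlabelled outer bracket that some distributed files use.

```python
def parse_bracketed(text):
    """Parse a string like "(S (NP (NN dog)) ...)" into (label, children).

    Children are either nested (label, children) tuples or word strings.
    Assumes every opening bracket is followed by a label.
    """
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read_node():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError(f"expected '(' at token {pos}")
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1  # consume the closing bracket
        return (label, children)

    tree = read_node()
    if pos != len(tokens):
        raise ValueError("unexpected material after the tree")
    return tree


def leaves(tree):
    """Return the words of a parsed tree in left-to-right order."""
    _, children = tree
    words = []
    for child in children:
        if isinstance(child, tuple):
            words.extend(leaves(child))
        else:
            words.append(child)
    return words


if __name__ == "__main__":
    tree = parse_bracketed("(S (NP (DT The) (NN dog)) (VP (VBZ barks)))")
    print(tree[0], leaves(tree))
```

Nested tuples keep the example short; a fuller reader would also need to deal with unbalanced brackets, which here surface as an `IndexError`.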
Penn Treebank Online ... This is a tgrep interface to several Penn Treebank parsed corpora. To use this interface, you need to know the tgrep query syntax and be familiar with the tgrep options. See the short introduction or complete documentation for more information.
www.ldc.upenn.edu/ldc/online/treebank/
The Penn Treebank (PTB) project selected 2,499 stories from a three-year Wall Street Journal (WSJ) collection of 98,732 stories for syntactic annotation. These 2,499 stories have been distributed in both Treebank-2 (LDC95T7) and Treebank-3 (LDC99T42) releases of PTB. Treebank-2 includes the raw text for each story.
www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC99T42
Penn Treebank Tagset ... Here are the most important tags. ... See also: M. Marcus, Beatrice Santorini and M.A. Marcinkiewicz: Building a large annotated corpus of English: The Penn Treebank. In Computational Linguistics, volume 19, number 2, pp. 313-330.
www.mozart-oz.org/mogul/doc/lager/brill-tagger/penn.html
The tagset used in tagging the demo corpus available here is the Penn Treebank Tag set, described for example in Mitchell P. Marcus, Beatrice Santorini, and Mary Ann Marcinkiewicz: Building a Large Annotated Corpus of English: The Penn Treebank, in Computational Linguistics, Volume 19, Number 2 (June 1993), pp. 313-
www.ims.uni-stuttgart.de/projekte/CorpusWorkbench/CQP-HTMLDemo/PennTreebankTS.html
Listed alphabetically below are the standard tags used in the Penn Treebank. Each tag has examples of the tokens that were annotated with that tag. ... The examples are taken directly from the Penn Treebank lexicon that is supplied with Eric Brill's Transformation-Based Part-of-Speech Tagger. This is the tagger that is...
www.comp.leeds.ac.uk/amalgam/tagsets/upenn.html
Alphabetical list of part-of-speech tags used in the Penn Treebank Project: ... 8. JJR Adjective, comparative...
www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html
Penn Treebank Project 1. Principal authors: Ann Bies, Mark Ferguson, Karen Katz, and Robert MacIntyre. Major contributors: Victoria Tredinnick, Grace Kim, ...
ftp.cis.upenn.edu/pub/treebank/doc/manual/root.ps.gz