PET road map, resulting from discussion at the Paris meeting of DELPH-IN

First of all, the list of all completed items:

  • merge the main and cm branches into one new main branch
  • code style recommendations fixed
  • poll for functionality, esp. in input part, has been done
  • no need for a ticketing system other than trac
  • sub-group worked out a list of deprecated things (done by email)
  • got rid of ECL as much as possible (completely)
    • remove input modes that rely on ECL, complete MRS treatment as far as necessary
  • obsolete/deprecated code was deleted

This is a list of possible topics to discuss in this session. Please feel free to add any topics that are missing.

  • now: refactoring the code
    • encourage developers to remove obsolete code and merge similar functionality where possible
    • refactoring into new code style if we see that the version is stable
    • documentation of modules, especially new ones:
      • add scripts to illustrate use
      • necessary resources (data to train / models, training scripts, etc)
    • remove as much of the load on "item" as possible
  • GUI: which one, desired functionality?
  • XML-RPC / API enhancements
  • put the road map and TODO into the PET trac
  • implement the Lohuizen unifier to enable multi-threaded operation
  • release plans / procedures (LOGON?)
  • include C++ REPP implementation (RebeccaDridan has working code; it is not yet 100% compatible with the LISP implementation, but the differences are not necessarily bugs)


  • generation
    • MRS equivalence test
    • MRS rewriting (for trigger rules), also used in idiom checking
  • unique output separation for standard-io server mode (requested many times in the past)

PET API proposal, worked out at the Fefor meeting of DELPH-IN

The idea is to replace the cheap executable (with 'pipe' over stderr...) with a library. The library in turn could then be wrapped in an executable (if someone really still needs this) or, better, in an XML-RPC or other socket server. The library could also be called directly from Java (via JNI) or from scripting languages like Python.
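As a rough illustration of the "library wrapped in an XML-RPC server" idea, here is a minimal sketch using Python's standard `xmlrpc.server` module. The `parse` function and its `max_readings` parameter are hypothetical placeholders for a call into the future library, not an existing PET interface.

```python
from xmlrpc.server import SimpleXMLRPCServer


def parse(text, max_readings=1):
    """Hypothetical placeholder for a call into the parser library."""
    return ["<reading %d for %r>" % (i, text) for i in range(1, max_readings + 1)]


def make_server(host="127.0.0.1", port=0):
    """Create an XML-RPC server exposing parse(); port 0 lets the OS pick a free port."""
    server = SimpleXMLRPCServer((host, port), logRequests=False)
    server.register_function(parse, "parse")
    return server
```

Calling `make_server(port=8000).serve_forever()` would then serve requests until interrupted; a client could use `xmlrpc.client.ServerProxy` to call `parse` remotely.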

The main motivation is to have a more flexible configuration. Currently, the cheap binary has to be restarted (including time-consuming grammar loading) whenever an option changes, for example the number of readings to return. Using the new API, this option could be set for each parse individually.
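The intended usage pattern might look roughly like the sketch below: the grammar is loaded once and options are passed per parse. The `Parser` class, the `max_readings` option, and the `grammar.grm` path are all hypothetical, and the parse results are placeholders rather than real analyses.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Parser:
    """Hypothetical library handle: load the grammar once, vary options per parse."""
    grammar_path: str
    load_count: int = 0  # counts how often the expensive load step has run

    @classmethod
    def load(cls, grammar_path: str) -> "Parser":
        # Stands in for the time-consuming grammar loading step.
        return cls(grammar_path=grammar_path, load_count=1)

    def parse(self, text: str, max_readings: int = 1) -> List[str]:
        # Placeholder results; a real parser would return analyses here.
        return ["reading %d of %r" % (i, text) for i in range(1, max_readings + 1)]


parser = Parser.load("grammar.grm")                  # grammar loaded once
first = parser.parse("Kim sleeps.", max_readings=1)
more = parser.parse("Kim sleeps.", max_readings=5)   # same instance, no reload
print(len(first), len(more), parser.load_count)      # prints "1 5 1"
```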

Moreover, the different configuration sources (command line, pseudo TDL config file, partly true TDL definitions in the grammar) should be unified.
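One possible way to present several configuration sources as a single lookup is a layered mapping, sketched here with Python's `collections.ChainMap`. The option names and the precedence order (command line over config file over grammar definitions) are assumptions for illustration only; the proposal does not fix either.

```python
from collections import ChainMap


def unified_config(command_line, config_file, grammar_defs):
    """Return one read view over all sources; earlier mappings shadow later ones."""
    return ChainMap(command_line, config_file, grammar_defs)


# Hypothetical option values from the three sources.
config = unified_config(
    command_line={"max_readings": 5},
    config_file={"max_readings": 1, "verbosity": 0},
    grammar_defs={"start_symbol": "root", "verbosity": 2},
)

print(config["max_readings"])  # 5, the command line wins
print(config["verbosity"])     # 0, the config file shadows the grammar definition
print(config["start_symbol"])  # root, only defined in the grammar
```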

Generally, no new features will be added; the implicit interface will only be made explicit and streamlined. For example, it will not be possible to swap the grammar at runtime, as this would require fundamental changes in the code (and is not considered necessary).

Elements of the API

  • basic configuration, such as which grammar image to load and the cheap command line options that cannot be modified at runtime
  • cheap command line options that can be modified at runtime
  • MRS globals (shared with LKB, readable also from file via pseudo TDL parser; cf. JerezTop discussion)
  • parse from preprocessor (FSC) input or raw text input
  • retrieve result chart in different formats
    • MRS
    • MRS-XML
    • tree
    • HTML (?)
    • typed FS (FS-XML?), with feature path to sub-FS
    • fragments
  • access to type hierarchy
    • sub/supertype
    • type subsumption
    • type name and code
    • FS prototype (FS-XML?), but no feature structure unification
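The type hierarchy queries in the list above (sub/supertype, subsumption, name and code) could be exposed roughly as in the following sketch. The `TypeHierarchy` class, its method names, and the toy hierarchy are hypothetical; integer codes are simply assigned in insertion order here, which need not match how PET numbers its types.

```python
from typing import Dict, List, Set


class TypeHierarchy:
    """Hypothetical read-only view of a type hierarchy."""

    def __init__(self, parents: Dict[str, List[str]]):
        # Map each type name to its immediate supertypes.
        self._parents = parents
        # Assign integer codes in insertion order (an assumption of this sketch).
        self._codes = {name: code for code, name in enumerate(parents)}
        self._names = {code: name for name, code in self._codes.items()}

    def code(self, name: str) -> int:
        return self._codes[name]

    def name(self, code: int) -> str:
        return self._names[code]

    def supertypes(self, name: str) -> Set[str]:
        """All proper supertypes of a type, collected transitively."""
        seen: Set[str] = set()
        stack = list(self._parents[name])
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self._parents[current])
        return seen

    def subtypes(self, name: str) -> Set[str]:
        """All proper subtypes of a type."""
        return {t for t in self._parents if name in self.supertypes(t)}

    def subsumes(self, general: str, specific: str) -> bool:
        """True if `general` is equal to or a supertype of `specific`."""
        return general == specific or general in self.supertypes(specific)


# Toy hierarchy for illustration only.
hierarchy = TypeHierarchy({
    "*top*": [],
    "sign": ["*top*"],
    "word": ["sign"],
    "phrase": ["sign"],
})

print(hierarchy.subsumes("sign", "word"))       # True
print(hierarchy.subsumes("word", "phrase"))     # False
print(hierarchy.name(hierarchy.code("word")))   # word
```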

Feature requests

  • configurable start symbol for parsing (FrancisBond)
  • change lexicon during runtime (not possible for the built-in lexicon, but probably for lexDB)
  • change model during runtime
Last modified on 09/09/11 17:41:00