Haskell Libraries The Haskell Libraries Mailing List
libraries@haskell.org
Introduction

This document constitutes part of a proposal for an extension to the Haskell 98 language. The full proposal has several parts:

- A modest language extension to Haskell 98 that adds the character . to the lexical syntax for a module name, allowing a hierarchical module namespace where a module name is a sequence of components separated by periods.
- An allocation of the new module namespace to existing and non-existent libraries, people, organisations, and local use.
- A policy and procedure for allocating new parts of the namespace.
- A set of libraries which are under the control of the community, have reference implementations kept in a standard place, and conform to a set of guidelines and policies set out in this document. We shall call this set of libraries the core libraries.

In addition, this document also describes:

- Guidelines and conventions for organising the hierarchy.
- Our policy with respect to the design and evolution of library APIs, versioning of library APIs, and maintenance of the reference implementation.
- A set of conventions for coding style and portability within the core libraries.

How to contribute

This project is driven by the Haskell community, so contributions of all kinds are welcome. The first step is to join the Haskell libraries mailing list, and maybe browse the list archives. Some of the ways you can contribute are:

- By donating code: for libraries in the core set which don't yet have a reference implementation, or for new contributions to the core set, code is always welcome. Code that conforms to the style guidelines (which aren't very strict, see Programming Conventions) and comes with documentation (see Documentation) and a test suite (see Testing) is better, but these aren't essential. As a library progresses through the stability scale (see Library Stability) these things become more important, but for an experimental library we're not going to worry too much about this stuff.
- By porting code for an existing library to a new compiler or architecture. A library is classed as portable if it should be available regardless of which compiler/platform combination you're using; however, many libraries are non-portable for one reason or another (see Portability Considerations), and broadening the scope of these libraries is always welcome.
- By becoming a library maintainer: if you have a particular interest in and/or knowledge about a certain library, and have the time to spare, and the library in question doesn't already have a maintainer, then you may be a suitable maintainer for the library. The responsibilities of library maintainers are given in Library Maintainers.
- By participating in the design process for new libraries, and suggesting improvements to existing libraries. Everyone on the Haskell libraries mailing list is invited to participate in the design process, so get involved!

The hierarchy layout

We first classify each node in the hierarchy according to one of the following terms:

Allocated
    Nodes in the hierarchy can be allocated to a library (whether the library actually exists or not). The currently allocated nodes are specified in the section The hierarchy.

User
    The User hierarchy is reserved for users: a user may always use the portion of the hierarchy which is formed from his/her email address as follows: replace the @ by a ., reverse the order of the components, capitalise the first letter of each component, and prepend User.. For example, simonmar@microsoft.com becomes User.Com.Microsoft.Simonmar.

Organisation
    The Org hierarchy is reserved for organisations. Any organisation with a DNS domain name owns a unique space in the hierarchy formed by reversing the components of the domain, capitalising the first character of each component, and prepending Org.. ToDo: I don't like this very much, any better ideas?

Local
    The Local hierarchy is reserved for libraries which are local to the current site. Libraries which are to be distributed outside the current site should not be placed in the Local hierarchy.

Top-level
    All top-level names (i.e. module names that don't contain a .) that are otherwise unallocated are available for use by the program. Note that for compatibility with Haskell 98, some modules in this namespace are reserved (e.g. Directory, IO, Time etc.).

Unallocated
    Any node which doesn't belong to any of the above categories is currently unallocated, and is not available for use.

A node in the hierarchy may be both a specific library and a parent node for a number of child nodes. For example, Foreign is a library, and so is Foreign.Ptr.

Hierarchy design guidelines

Module Naming Conventions

The hierarchy

The currently allocated top-level names are:

Prelude
    Haskell 98 Prelude (mostly just re-exports other parts of the tree).

Control
    Libraries which provide functions, types or classes whose purpose is primarily to express control structure.

Data
    Libraries which provide data types, operations over data types, or type classes, except for libraries for which one of the other more specific categories is appropriate.

Database
    Libraries for providing access to or operations for building databases.

Debug
    Support for debugging Haskell programs.

Edison
    The Edison data structure library.

FileFormat
    Support for reading and/or writing various file formats (except: programming language source code which lives in Language, database formats which live in Database, and textual file formats which are catered for in Text).

Foreign
    Interaction with code written in a foreign programming language.

Graphics
    Libraries for producing graphics or providing graphical user interfaces.

Language
    Libraries for operating on or generating source code in various programming languages, including parsers, pretty printers, abstract syntax definitions etc.

Local
    Available for site-local use.

Numeric
    Functions and classes which provide operations over numeric data.

Network
    Libraries for communicating over a network, including implementations of network protocols.

Org
    Allocated to organisations on a domain-name basis (see The hierarchy layout).

System
    Libraries for communication with the system on which the Haskell program is running (including the runtime system).

Text
    Libraries for parsing and generating data in a textual format (including structured textual formats such as XML, HTML, but not including programming language source, which lives in Language).

GHC
    Libraries specific to the GHC/GHCi system.

NHC
    Libraries specific to the NHC compiler.

Hugs
    Libraries specific to the Hugs system.

User
    Allocated to individual users, using email addresses (see The hierarchy layout).

Licensing

Following some discussion on the mailing list related to how we should license the libraries, the viewpoint that was least offensive to all involved seems to be the following:

We wish to accommodate source code from different contributors, and with different licenses. However, a library of modules where each module is released under a different license, and where the dependencies between modules aren't clear, isn't workable (it's too hard for a user of the library to tell whether they're violating the terms of each license or not).

So the solution is as follows: code under different licenses will be clearly separate in the repository (i.e. in separate subdirectories), and compilers are expected to present packages of modules where all modules in a package fall under the same license, and where the dependencies between packages are clear.

It was decided that certain essential functionality should be available under a BSD style license. Hence, the BSD part of the repository will contain implementations of at least the following modules: Prelude, Foreign (ToDo: what else?).

ToDo: include a prototype BSD license here.

Versioning

Library Stability

The stability of a library relates primarily to its API. Stability provides an indication of how often the API is likely to change (or whether it may even go away entirely). The stability scale is also a measure of how strictly the conventions in this document are applied to the library: an experimental library isn't subject to any restrictions regarding coding style and documentation, but a stable library is expected to adhere to the guidelines, and come with full documentation and tests.

To help with the stability issue, library maintainers are allowed to mark functions, types or classes as deprecated, which means simply that the feature will be removed at a later date. (Compilers may have extra support for warning about the use of a deprecated feature, for example GHC's DEPRECATED pragma.) Just how long the feature will stick around depends on the stability category of the library (see below). A feature is marked as deprecated in the documentation for the library, and optionally in an implementation-dependent way which enables the system to warn about the use of deprecated features.

The current stability categories are:

experimental
    An experimental library is unrestricted in terms of API changes: the API may change between minor revisions and there is no requirement to retain old interfaces for compatibility. Documentation and tests aren't required for an experimental library.

provisional
    A provisional library is moving towards stability, and the rate of change of the API is slower. API changes between minor revisions must be accompanied by deprecated versions of the old features where possible. API changes between major versions are unrestricted. The library should come with at least rudimentary documentation.

stable
    A stable library has an essentially fixed API. Additions to the API may be made for a minor release, deprecated features must be retained for at least one major revision, and only small changes may be made to the existing API semantics for a major revision. A stable library is expected to include full documentation and tests.
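The deprecation mechanism described above can be sketched with GHC's DEPRECATED pragma, which is one implementation-dependent way of warning about deprecated features. This is only an illustration under stated assumptions: the Set type and the functions listToSet and mkSet are hypothetical names invented for this example and are not part of any core library.

```haskell
-- A minimal sketch, assuming GHC. Set, listToSet and mkSet are
-- hypothetical names chosen for illustration only.
import Data.List (nub, sort)

-- A toy set represented as a sorted list without duplicates.
newtype Set a = Set [a]
  deriving (Eq, Show)

-- The new, preferred entry point.
listToSet :: Ord a => [a] -> Set a
listToSet = Set . sort . nub

-- The old entry point, retained so that existing clients keep compiling.
-- GHC warns about uses of mkSet in importing modules; uses inside the
-- defining module are not reported.
{-# DEPRECATED mkSet "Use listToSet instead" #-}
mkSet :: Ord a => [a] -> Set a
mkSet = listToSet
```

Under the categories above, a provisional library would ship a deprecated mkSet alongside a minor-revision API change where possible, and a stable library would retain it for at least one major revision before removing it.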
Portability Considerations

The portability status of a library affects which platforms and compilers the library will be available on. Haskell implementations are expected to provide all of the portable core libraries, and those non-portable core libraries which are appropriate for that particular platform/compiler implementation.

The precise meanings of the terms portable and non-portable for our purposes are given below:

Portable
    A portable library may use only Haskell 98 features plus approved extensions (see ), and may not use any platform-specific features. It may make use of other portable libraries only.

Non-portable
    A non-portable library may be non-portable for one or more of the following reasons:

    Requires extensions
        A library which uses non-approved language extensions.

    Requires nonportable libraries
        A library which depends (directly or indirectly) on other non-portable libraries.

    OS-specific, Platform-specific
        A library which depends on features or APIs particular to a certain OS or platform is non-portable for that reason.

Library Maintainers

This is a collaborative project, so we like to devolve control of the design and implementation of libraries to those with an interest or appropriate expertise (or maybe just the time!).

A maintainer isn't necessarily a single person - for example, the listed maintainer for most of the core libraries is libraries@haskell.org, indicating that the library is under the control of the community as a whole. The maintainer for the Foreign hierarchy is ffi@haskell.org, the mailing list for discussion of the Haskell FFI standard.

The responsibilities of a library maintainer include:

- Most importantly: act as a single point of contact for issues relating to the library API and its implementation.
- Manage any discussion related to the library (which can take place on libraries@haskell.org if necessary), and summarise the results. Make final decisions, and implement them.
- Maintain the implementation, including: fixing bugs, updating to keep up with changes in other libraries, porting to new compilers/platforms, and integrating code from other contributors. The maintainer is expected to be the only person/group to make functional changes to the source code (non-functional or trivial changes don't count).
- Maintain/write the documentation and tests.

If you can't maintain the library any more for whatever reason, tell libraries@haskell.org and we'll revert the maintainer status of the library to the default.

The Core Team

The core team is responsible for making final decisions about the project as a whole and resolving disputes where necessary. We expect that needing to invoke the core team will be a rare occurrence.

The core team is also responsible for approving maintainership requests.

Currently, the core team consists of one person from each of the compiler camps, and these are also the people that will primarily be maintaining the library framework for their respective compiler projects:

- Simon Marlow simonmar@microsoft.com (GHC representative)
- Malcolm Wallace Malcolm.Wallace@cs.york.ac.uk (NHC representative)
- Andy Gill andy@galconn.com (Hugs representative)

Documentation

Testing

Migration path

How compatible will a compiler using the new libraries be with code written for Haskell 98 or older library systems (such as the hslibs suite and GHC's package system), and for how long will compatibility be maintained?

Our current plan for GHC is as follows: by default, with the flag, you'll get access to the core libraries. Compatibility with Haskell 98 code will be maintained using a separate package of wrappers presenting interfaces for the Haskell 98 libraries (IO, Ratio, Directory, etc.). The Haskell 98 compatibility package will be enabled by default, but we plan to add an option to disable it if necessary.
For code that uses -package lang, we could also provide a compatibility wrapper package (so -package lang will continue to work as before and present the same library interfaces), but this may prove too much work to maintain - we haven't decided whether to do this or not. It is unlikely that compatibility wrappers for any of the other hslibs packages will be provided.

Programming Conventions

Standard Module Header

The following module header will be used for all core libraries, and we recommend using it for library source code in general:

    -----------------------------------------------------------------------------
    --
    -- Module      :  module
    -- Copyright   :  (c) author year
    -- License     :  license
    --
    -- Maintainer  :  libraries@haskell.org | email-address
    -- Stability   :  experimental | provisional | stable
    -- Portability :  portable | non-portable (reason(s))
    --
    -- $Id: libraries.sgml,v 1.1 2001/06/28 14:15:04 simonmar Exp $
    --
    -- Description
    -----------------------------------------------------------------------------

where:

$Id: libraries.sgml,v 1.1 2001/06/28 14:15:04 simonmar Exp $
    Optional, but usually included if the module is under CVS or RCS control.

module
    The fully qualified module name of the module.

author/year
    The primary author and copyright holder of the module, and the year in which copyright is claimed.

license
    Specifies the license on the file (see Licensing).

email-address
    The email address of the maintainer, or maintainers, of the library (see Library Maintainers).

reason(s)
    The reasons for non-portability must be listed (see Portability Considerations).

description
    A short description of the module.

Naming Conventions

These naming conventions are pulled straight from the hslibs documentation. They were formed after lengthy discussions and are heavily based on an initial suggestion from Marcin Kowalczyk qrczak@knm.org.pl.

Note that the conventions are not mutually exclusive, e.g. should the function creating a set from a list of elements have the name set or listToSet? (Alas, it currently has neither name.)

The following nomenclature is used:

- Pure, i.e. non-monadic, functions are simply called, well, functions.
- Monadic functions, i.e. functions having a type ... -> m a for some Monad m, are called actions.

Module names

- A module defining a data type or type class X has itself the name X, e.g. StablePtr.
- A module which re-exports the modules in a subtree of the hierarchy has the same name as the root of that subtree, e.g. Foreign re-exports Foreign.Ptr, Foreign.MarshalUtils etc.
- If a subtree of the hierarchy contains several modules which provide similar functionality (e.g. there are several pretty-printing libraries under Text.PrettyPrinter), then the module at the root of the subtree generally re-exports just one of the modules in the subtree (possibly the most popular or commonly-used alternative).
- In Haskell you sometimes publish two interfaces to your libraries; one for users, and one for library writers or advanced users who might want to extend things. Typically the advanced users need to be able to see past certain abstractions. The current proposal is that, for a module named M, the advanced version would be named M.Internals, e.g.

      import Text.Html           -- The library
      import Text.Html.Internals -- The non-abstract library (for building other libs)

Constructor names

- Empty values of type X have the name emptyX, e.g. emptySet.
- Actions creating a new empty value of type X have the name newEmptyX, e.g. newEmptyMVar.
- Functions creating an arbitrary value of type X have the name X itself (with the first letter downcased), e.g. array. (TODO: This often collides with the xToY convention, how should this be resolved?)
- Actions creating new arbitrary values of type X have the name newX, e.g. newIORef.

Accessor names

- Functions getting an attribute of a value or a part of it have the name of the attribute itself, e.g. length, bounds.
- Actions accessing some kind of reference or state have the name getX, where X is the type of the contents or the name of the part being accessed, e.g. getChar, getEnv. An alternative naming scheme is readY, where Y is the type of the reference or container, e.g. readIORef.
- Functions or actions getting a value via a pointer-like type X should be named deRefX, e.g. deRefStablePtr, deRefWeak.

Modifier names

- Functions returning a value with attribute X set to a new value should be named setX. (TODO: Add Examples.)
- Actions setting some kind of reference or state have the name putX, where X is the type of the contents or the name of the part being accessed, e.g. putChar. An alternative naming scheme is writeY, where Y is the type of the reference or container, e.g. writeIORef.
- Actions in the IO monad setting some global state X are traditionally named setX, too, although putX would be more appropriate, e.g. setReadlineName.
- Actions modifying a container X by a function of type a -> a have the name modifyX, e.g. modifySTRef.

Predicate names

- Predicates, both non-monadic and monadic, testing a property X have the name isX.

Names for conversions

- Functions converting a value of type X to a value of type Y have the name XToY with all leading uppercase characters of X converted to lower case, e.g. stToIO.
- Overloaded conversion functions of type C a => a -> X have the name toX, e.g. toInteger.
- Overloaded conversion functions of type C a => X -> a have the name fromX, e.g. fromInteger.

Miscellaneous naming conventions

- An action that is identical to another one called X, but discards the return value, has the name X_, e.g. mapM and mapM_.
- Functions and actions which are potentially dangerous to use and leave some kind of proof obligation to the programmer have the name unsafeX, e.g. unsafePerformIO.
- There are two conventions for binary and N-ary variants of an associative operation: one convention uses an operator or a short name for the binary operation and a long name for the N-ary variant, e.g. (+) and sum, max and maximum. The other convention suffixes the N-ary variant with Many. (TODO: Add Examples.)
- If possible, names are chosen such that either plain application or arg1 `operation` arg2 is correct English, e.g. isPrefixOf is good for use in backquotes.

Library design conventions

- Actions setting and modifying a kind of reference or state return (); getting the value is separate, e.g. writeIORef and modifyIORef both return (), only readIORef returns the value in an IORef.
- When a function or action takes some kind of state and returns a pair consisting of a result and a new state, the result is the first element of the pair and the new state is the second, see e.g. Random.
- When the type Either is used to encode an error condition and a normal result, Left is used for the former and Right for the latter, see e.g. MonadEither.
- A module corresponding to a class (e.g. Bits) contains the class definition, perhaps some auxiliary functions, and all sensible instances for Prelude types, but nothing more. Other modules containing types for which an instance for the class in question makes sense contain the code for the instance itself.
- Record-like C bit fields or structs have a record-like interface, i.e. pure getting and setting of fields. (TODO: Clarify a little bit. Add examples.)
- Although the possibility of partial application suggests the type attr -> object -> object for functions setting an attribute or value, infix notation with backquotes implies object -> attr -> object. (TODO: Add Examples.)

Coding style conventions

Changes to standard Haskell 98 libraries

Some changes have been made to the standard Haskell 98 libraries in the new library scheme, both in the names of the modules themselves and in their exported interfaces.
Below is a summary of those changes - at this time, the new libraries are marked as provisional and are maintained by libraries@haskell.org, so changes in the interfaces are all up for discussion.

    modules with interface changes
    ------------------------------
    Array     -> Data.Array
        added instance Typeable (Array ix a)

    Char      -> Data.Char
        no interface changes (should have instance Typeable?)

    Complex   -> Data.Complex
        added instance Typeable (Complex a)

    IO        -> System.IO
        added hPutBuf :: Handle -> Ptr a -> Int -> IO ()
              hGetBuf :: Handle -> Ptr a -> Int -> IO Int
              fixIO   :: (a -> IO a) -> IO a

    List      -> Data.List
        exports [](..)

    Numeric   -> ????
        not placed in hierarchy yet

    System    -> System.Exit, System.Environment, System.Cmd
        split into three modules

    just renamed, no interface changes:
    -----------------------------------
    CPUTime   -> System.CPUTime
    Directory -> System.IO.Directory
    Ix        -> Data.Ix
    Locale    -> System.Locale
    Monad     -> Data.Monad
    Random    -> System.Random
    Ratio     -> Data.Ratio
    Time      -> System.Time
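As an illustration of the renamings summarised above, the following sketch imports a few of the relocated modules under their new hierarchical names. It deliberately uses only four entries from the table (Data.Char, Data.List, Data.Ratio and System.IO) and assumes a compiler that provides them; the other entries are still up for discussion and may end up elsewhere. The functions shout, extremes, half and warn are hypothetical helpers invented for this example.

```haskell
-- A sketch assuming a compiler that provides the hierarchical names
-- listed above. Under Haskell 98 these imports would have been spelled
-- Char, List, Ratio and IO.
import Data.Char (toUpper)
import Data.List (sort)
import Data.Ratio (Ratio, (%))
import System.IO (hPutStrLn, stderr)

-- Hypothetical helper: upper-case a string using Data.Char.
shout :: String -> String
shout = map toUpper

-- Hypothetical helper: smallest and largest element, using Data.List.
extremes :: Ord a => [a] -> Maybe (a, a)
extremes xs = case sort xs of
  []              -> Nothing
  sorted@(lo : _) -> Just (lo, last sorted)

-- Hypothetical helper: an exact fraction built with Data.Ratio.
half :: Ratio Int
half = 1 % 2

-- Hypothetical helper: report a message on standard error via System.IO.
warn :: String -> IO ()
warn msg = hPutStrLn stderr (shout msg)
```

Only the import lines differ from a Haskell 98 program; for the modules marked as just renamed, the code that uses the imported names is unaffected.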