From d31d3e069e7f0811a17942414e5eb4f28c04bf5e Mon Sep 17 00:00:00 2001 From: simonmar Date: Mon, 5 Aug 2002 11:11:46 +0000 Subject: [PATCH] [project @ 2002-08-05 11:11:44 by simonmar] - Update the old coding style document - Move it to the commentary under the "RTS & libs" section --- docs/coding-style.html | 333 +++++++++---------- ghc/docs/comm/index.html | 1 + ghc/docs/comm/rts-libs/coding-style.html | 517 ++++++++++++++++++++++++++++++ 3 files changed, 674 insertions(+), 177 deletions(-) create mode 100644 ghc/docs/comm/rts-libs/coding-style.html diff --git a/docs/coding-style.html b/docs/coding-style.html index 4f39cbf..250ab88 100644 --- a/docs/coding-style.html +++ b/docs/coding-style.html @@ -5,11 +5,14 @@ -

Coding suggestions for GHC/Hugs related code

+

GHC Style guidelines for C code

Comments

-NB These are just suggestions. They're not set in stone. Some of +

These coding style guidelines are mainly intended for use in +ghc/rts and ghc/includes. + +

NB These are just suggestions. They're not set in stone. Some of them are probably misguided. If you disagree with them, feel free to modify this document (and make your commit message reasonably informative) or mail someone (eg. +

  • +The C99 standard. One reasonable reference is here. + +

  • Writing Solid Code, Microsoft Press. (Highly recommended. Possibly -the only Microsoft Press book that's worth reading. SimonPJ has a -copy.) +the only Microsoft Press book that's worth reading.)

  • -Autoconf documentation (which doesn't seem to be on the web). +Autoconf documentation. See also The autoconf macro archive and Cyclic Software's description @@ -50,57 +57,57 @@ Hill C Style and Coding Standards.

    Portability issues

    +

  • Don't declare and initialize variables at the same time. +Separating the declaration and initialization takes more lines, but +make the code clearer.

  • Use inline functions instead of macros if possible - they're a lot @@ -414,6 +378,28 @@ in the library.

  • +Don't define macros that expand to a list of statements. +You could just use braces as in: + +
    +  #define ASSIGN_CC_ID(ccID)              \
    +        {                                 \
    +        ccID = CC_ID;                     \
    +        CC_ID++;                          \
    +        }
    +
    + +(but it's usually better to use an inline function instead - see above). + +

  • +Don't even write macros that expand to 0 statements - they can mess you +up as well. Use the doNothing macro instead. +
    +  #define doNothing() do { } while (0)
    +
    + + +

  • This code
     int* p, q;
    @@ -486,36 +472,30 @@ and by keeping it to 80 columns we can ensure that code looks OK on
     everyone's screen.  Long lines are hard to read, and a sign that the
     code needs to be restructured anyway.
     
    +

  • When commenting out large chunks of code, use #ifdef 0 +... #endif rather than /* ... */ because C doesn't +have nested comments. + +

  • When declaring a typedef for a struct, give the struct a name +as well, so that other headers can forward-reference the struct name +and it becomes possible to have opaque pointers to the struct. Our +convention is to name the struct the same as the typedef, but add a +leading underscore. For example: + +
    +  typedef struct _Foo {
    +    ...
    +  } Foo;
    +
    + +

  • Do not use ! instead of explicit comparison against +NULL or '\0'; the latter is much clearer. +

  • We don't care too much about your indentation style but, if you're modifying a function, please try to use the same style as the rest of the function (or file). If you're writing new code, a tab width of 4 is preferred. -

    -On which note: -Hugs related pieces of code often start with the line: -

    -  /* -*- mode: hugs-c; -*- */
    -
    -which helps Emacs mimic the indentation style used by Mark P Jones -within Hugs. Add this to your .emacs file. -
    -  (defun hugs-c-mode ()
    -    "C mode with adjusted defaults for use with Hugs (based on linux-c-mode)"
    -    (interactive)
    -    (c-mode)
    -    (setq c-basic-offset 4)
    -    (setq indent-tabs-mode nil) ; don't use tabs to indent
    -    (setq c-recognize-knr-r nil)  ; no K&R here - don't pay the price
    -    ; no: (setq tab-width 4)
    -
    -    (c-set-offset 'knr-argdecl-intro    0)
    -    (c-set-offset 'case-label           0)
    -    (c-set-offset 'statement-case-intro '++)
    -    (c-set-offset 'statement-case-open  '+)
    -    )
    -
    -

    CVS issues

    @@ -526,12 +506,11 @@ Don't be tempted to reindent or reorganise large chunks of code - it generates large diffs in which it's hard to see whether anything else was changed.

    -If you must reindent or reorganise, don't do anything else in that commit -and give advance warning that you're about to do it in case anyone else -is changing that file. +If you must reindent or reorganise, don't include any functional +changes that commit and give advance warning that you're about to do +it in case anyone else is changing that file. - diff --git a/ghc/docs/comm/index.html b/ghc/docs/comm/index.html index 441a644..a3362b0 100644 --- a/ghc/docs/comm/index.html +++ b/ghc/docs/comm/index.html @@ -76,6 +76,7 @@

    RTS & Libraries

      +
    • Coding Style Guidelines
    • Spineless Tagless C
    • Primitives
    • Prelude Foundations diff --git a/ghc/docs/comm/rts-libs/coding-style.html b/ghc/docs/comm/rts-libs/coding-style.html new file mode 100644 index 0000000..b723657 --- /dev/null +++ b/ghc/docs/comm/rts-libs/coding-style.html @@ -0,0 +1,517 @@ + + + + + The GHC Commentary - Style Guidelines for RTS C code + + + +

      The GHC Commentary - Style Guidelines for RTS C code

      + +

      Comments

      + +

      These coding style guidelines are mainly intended for use in +ghc/rts and ghc/includes. + +

      NB These are just suggestions. They're not set in stone. Some of +them are probably misguided. If you disagree with them, feel free to +modify this document (and make your commit message reasonably +informative) or mail someone (eg. The GHC mailing list) + +

      References

      + +If you haven't read them already, you might like to check the following. +Where they conflict with our suggestions, they're probably right. + + + + +

      Portability issues

      + +
        +

      • We try to stick to C99 where possible. We use the following +C99 features relative to C89, some of which were previously GCC +extensions (possibly with different syntax): + +
          +

        • Variable length arrays as the last field of a struct. GCC has +a similar extension, but the syntax is slightly different: in GCC you +would declare the array as arr[0], whereas in C99 it is +declared as arr[]. + +

        • Inline annotations on functions (see later) + +

        • Labeled elements in initialisers. Again, GCC has a slightly +different syntax from C99 here, and we stick with the GCC syntax until +GCC implements the C99 proposal. + +

        • C++-style comments. These are part of the C99 standard, and we +prefer to use them whenever possible. +
        + +

        In addition we use ANSI-C-style function declarations and +prototypes exclusively. Every function should have a prototype; +static function prototypes may be placed near the top of the file in +which they are declared, and external prototypes are usually placed in +a header file with the same basename as the source file (although there +are exceptions to this rule, particularly when several source files +together implement a subsystem which is described by a single external +header file). + +

      • We use the following GCC extensions, but surround them with +#ifdef __GNUC__: + +
          +

        • Function attributes (mostly just no_return and +unused) +

        • Inline assembly. +
        + +

      • +char can be signed or unsigned - always say which you mean + +

      • Our POSIX policy: try to write code that only uses POSIX (IEEE +Std 1003.1) interfaces and APIs. We used to define +POSIX_SOURCE by default, but found that this caused more +problems than it solved, so now we require any code that is +POSIX-compliant to explicitly say so by having #include +"PosixSource.h" at the top. Try to do this whenever possible. + +

      • Some architectures have memory alignment constraints. Others +don't have any constraints but go faster if you align things. These +macros (from config.h) tell you which alignment to use + +
        +  /* minimum alignment of unsigned int */
        +  #define ALIGNMENT_UNSIGNED_INT 4
        +
        +  /* minimum alignment of long */
        +  #define ALIGNMENT_LONG 4
        +
        +  /* minimum alignment of float */
        +  #define ALIGNMENT_FLOAT 4
        +
        +  /* minimum alignment of double */
        +  #define ALIGNMENT_DOUBLE 4
        +
        + +

      • Use StgInt, StgWord and StgPtr when +reading/writing ints and ptrs to the stack or heap. Note that, by +definition, StgInt, StgWord and StgPtr are +the same size and have the same alignment constraints even if +sizeof(int) != sizeof(ptr) on that platform. + +

      • Use StgInt8, StgInt16, etc when you need a +certain minimum number of bits in a type. Use int and +nat when there's no particular constraint. ANSI C only +guarantees that ints are at least 16 bits but within GHC we assume +they are 32 bits. + +

      • Use StgFloat and StgDouble for floating +point values which will go on/have come from the stack or heap. Note +that StgDouble may occupy more than one StgWord, but +it will always be a whole number multiple. + +

        +Use PK_FLT(addr), PK_DBL(addr) to read +StgFloat and StgDouble values from the stack/heap, +and ASSIGN_FLT(val,addr) / +ASSIGN_DBL(val,addr) to assign StgFloat/StgDouble values +to heap/stack locations. These macros take care of alignment +restrictions. + +

        +Heap/Stack locations are always StgWord aligned; the +alignment requirements of an StgDouble may be more than that +of StgWord, but we don't pad misaligned StgDoubles +because doing so would be too much hassle (see PK_DBL & +co above). + +

      • +Avoid conditional code like this: + +
        +  #ifdef solaris_HOST_OS
        +  // do something solaris specific
        +  #endif
        +
        + +Instead, add an appropriate test to the configure.in script and use +the result of that test instead. + +
        +  #ifdef HAVE_BSD_H
        +  // use a BSD library
        +  #endif
        +
        + +

        The problem is that things change from one version of an OS to another +- things get added, things get deleted, things get broken, some things +are optional extras. Using "feature tests" instead of "system tests" +makes things a lot less brittle. Things also tend to get documented +better. + +

      + +

      Debugging/robustness tricks

      + + +Anyone who has tried to debug a garbage collector or code generator +will tell you: "If a program is going to crash, it should crash as +soon, as noisily and as often as possible." There's nothing worse +than trying to find a bug which only shows up when running GHC on +itself and doesn't manifest itself until 10 seconds after the actual +cause of the problem. + +

      We put all our debugging code inside #ifdef DEBUG. The +general policy is we don't ship code with debugging checks and +assertions in it, but we do run with those checks in place when +developing and testing. Anything inside #ifdef DEBUG should +not slow down the code by more than a factor of 2. + +

      We also have more expensive "sanity checking" code for hardcore +debugging - this can slow down the code by a large factor, but is only +enabled on demand by a command-line flag. General sanity checking in +the RTS is currently enabled with the -DS RTS flag. + +

      There are a number of RTS flags which control debugging output and +sanity checking in various parts of the system when DEBUG is +defined. For example, to get the scheduler to be verbose about what +it is doing, you would say +RTS -Ds -RTS. See +includes/RtsFlags.h and rts/RtsFlags.c for the full +set of debugging flags. To check one of these flags in the code, +write: + +

      +  IF_DEBUG(gc, fprintf(stderr, "..."));
      +
      + +would check the gc flag before generating the output (and the +code is removed altogether if DEBUG is not defined). + +

      All debugging output should go to stderr. + +

      +Particular guidelines for writing robust code: + +

        +

      • +Use assertions. Use lots of assertions. If you write a comment +that says "takes a +ve number" add an assertion. If you're casting +an int to a nat, add an assertion. If you're casting an int to a char, +add an assertion. We use the ASSERT macro for writing +assertions; it goes away when DEBUG is not defined. + +

      • +Write special debugging code to check the integrity of your data structures. +(Most of the runtime checking code is in rts/Sanity.c) +Add extra assertions which call this code at the start and end of any +code that operates on your data structures. + +

      • +When you find a hard-to-spot bug, try to think of some assertions, +sanity checks or whatever that would have made the bug easier to find. + +

      • +When defining an enumeration, it's a good idea not to use 0 for normal +values. Instead, make 0 raise an internal error. The idea here is to +make it easier to detect pointer-related errors on the assumption that +random pointers are more likely to point to a 0 than to anything else. + +
        +typedef enum
        +    { i_INTERNAL_ERROR  /* Instruction 0 raises an internal error */
        +    , i_PANIC           /* irrefutable pattern match failed! */
        +    , i_ERROR           /* user level error */
        +
        +    ...
        +
        + +

      • Use #warning or #error whenever you write a +piece of incomplete/broken code. + +

      • When testing, try to make infrequent things happen often. + For example, make a context switch/gc/etc happen every time a + context switch/gc/etc can happen. The system will run like a + pig but it'll catch a lot of bugs. + +
      + +

      Syntactic details

      + +
        +

      • Important: Put "redundant" braces or parens in your code. +Omitting braces and parens leads to very hard to spot bugs - +especially if you use macros (and you might have noticed that GHC does +this a lot!) + +

        +In particular: +

          +

        • +Put braces round the body of for loops, while loops, if statements, etc. +even if they "aren't needed" because it's really hard to find the resulting +bug if you mess up. Indent them any way you like but put them in there! +
        + +

      • +When defining a macro, always put parens round args - just in case. +For example, write: +
        +  #define add(x,y) ((x)+(y))
        +
        +instead of +
        +  #define add(x,y) x+y
        +
        + +

      • Don't declare and initialize variables at the same time. +Separating the declaration and initialization takes more lines, but +make the code clearer. + +

      • +Use inline functions instead of macros if possible - they're a lot +less tricky to get right and don't suffer from the usual problems +of side effects, evaluation order, multiple evaluation, etc. + +
          +

        • Inline functions get the naming issue right. E.g. they + can have local variables which (in an expression context) + macros can't. + +

        • Inline functions have call-by-value semantics whereas macros + are call-by-name. You can be bitten by duplicated computation + if you aren't careful. + +

        • You can use inline functions from inside gdb if you compile with + -O0 or -fkeep-inline-functions. If you use macros, you'd better + know what they expand to. +
        + +However, note that macros can serve as both l-values and r-values and +can be "polymorphic" as these examples show: +
        +  // you can use this as an l-value or an l-value
        +  #define PROF_INFO(cl) (((StgClosure*)(cl))->header.profInfo)
        +
        +  // polymorphic case
        +  // but note that min(min(1,2),3) does 3 comparisions instead of 2!!
        +  #define min(x,y) (((x)<=(y)) ? (x) : (y))
        +
        + +

      • +Inline functions should be "static inline" because: +
          +

        • +gcc will delete static inlines if not used or theyre always inlined. + +

        • + if they're externed, we could get conflicts between 2 copies of the + same function if, for some reason, gcc is unable to delete them. + If they're static, we still get multiple copies but at least they don't conflict. +
        + +OTOH, the gcc manual says this +so maybe we should use extern inline? + +
        +   When a function is both inline and `static', if all calls to the
        +function are integrated into the caller, and the function's address is
        +never used, then the function's own assembler code is never referenced.
        +In this case, GNU CC does not actually output assembler code for the
        +function, unless you specify the option `-fkeep-inline-functions'.
        +Some calls cannot be integrated for various reasons (in particular,
        +calls that precede the function's definition cannot be integrated, and
        +neither can recursive calls within the definition).  If there is a
        +nonintegrated call, then the function is compiled to assembler code as
        +usual.  The function must also be compiled as usual if the program
        +refers to its address, because that can't be inlined.
        +
        +   When an inline function is not `static', then the compiler must
        +assume that there may be calls from other source files; since a global
        +symbol can be defined only once in any program, the function must not
        +be defined in the other source files, so the calls therein cannot be
        +integrated.  Therefore, a non-`static' inline function is always
        +compiled on its own in the usual fashion.
        +
        +   If you specify both `inline' and `extern' in the function
        +definition, then the definition is used only for inlining.  In no case
        +is the function compiled on its own, not even if you refer to its
        +address explicitly.  Such an address becomes an external reference, as
        +if you had only declared the function, and had not defined it.
        +
        +   This combination of `inline' and `extern' has almost the effect of a
        +macro.  The way to use it is to put a function definition in a header
        +file with these keywords, and put another copy of the definition
        +(lacking `inline' and `extern') in a library file.  The definition in
        +the header file will cause most calls to the function to be inlined.
        +If any uses of the function remain, they will refer to the single copy
        +in the library.
        +
        + +

      • +Don't define macros that expand to a list of statements. +You could just use braces as in: + +
        +  #define ASSIGN_CC_ID(ccID)              \
        +        {                                 \
        +        ccID = CC_ID;                     \
        +        CC_ID++;                          \
        +        }
        +
        + +(but it's usually better to use an inline function instead - see above). + +

      • +Don't even write macros that expand to 0 statements - they can mess you +up as well. Use the doNothing macro instead. +
        +  #define doNothing() do { } while (0)
        +
        +
      + +

    • +This code +
      +int* p, q;
      +
      +looks like it declares two pointers but, in fact, only p is a pointer. +It's safer to write this: +
      +int* p;
      +int* q;
      +
      +You could also write this: +
      +int *p, *q;
      +
      +but it is preferrable to split the declarations. + +

    • +Try to use ANSI C's enum feature when defining lists of constants of +the same type. Among other benefits, you'll notice that gdb uses the +name instead of its (usually inscrutable) number when printing values +with enum types and gdb will let you use the name in expressions you +type. + +

      +Examples: +

      +    typedef enum { /* N.B. Used as indexes into arrays */
      +     NO_HEAP_PROFILING,		
      +     HEAP_BY_CC,		
      +     HEAP_BY_MOD,		
      +     HEAP_BY_GRP,		
      +     HEAP_BY_DESCR,		
      +     HEAP_BY_TYPE,		
      +     HEAP_BY_TIME		
      +    } ProfilingFlags;
      +
      +instead of +
      +    # define NO_HEAP_PROFILING	0	/* N.B. Used as indexes into arrays */
      +    # define HEAP_BY_CC		1
      +    # define HEAP_BY_MOD	2
      +    # define HEAP_BY_GRP	3
      +    # define HEAP_BY_DESCR	4
      +    # define HEAP_BY_TYPE	5
      +    # define HEAP_BY_TIME	6
      +
      +and +
      +    typedef enum {
      +     CCchar    = 'C',
      +     MODchar   = 'M',
      +     GRPchar   = 'G',
      +     DESCRchar = 'D',
      +     TYPEchar  = 'Y',
      +     TIMEchar  = 'T'
      +    } ProfilingTag;
      +
      +instead of +
      +    # define CCchar    'C'
      +    # define MODchar   'M'
      +    # define GRPchar   'G'
      +    # define DESCRchar 'D'
      +    # define TYPEchar  'Y'
      +    # define TIMEchar  'T'
      +
      + +

    • Please keep to 80 columns: the line has to be drawn somewhere, +and by keeping it to 80 columns we can ensure that code looks OK on +everyone's screen. Long lines are hard to read, and a sign that the +code needs to be restructured anyway. + +

    • When commenting out large chunks of code, use #ifdef 0 +... #endif rather than /* ... */ because C doesn't +have nested comments. + +

    • When declaring a typedef for a struct, give the struct a name +as well, so that other headers can forward-reference the struct name +and it becomes possible to have opaque pointers to the struct. Our +convention is to name the struct the same as the typedef, but add a +leading underscore. For example: + +
      +  typedef struct _Foo {
      +    ...
      +  } Foo;
      +
      + +

    • Do not use ! instead of explicit comparison against +NULL or '\0'; the latter is much clearer. + +

    • We don't care too much about your indentation style but, if +you're modifying a function, please try to use the same style as the +rest of the function (or file). If you're writing new code, a +tab width of 4 is preferred. + +
    + +

    CVS issues

    + +
      +

    • +Don't be tempted to reindent or reorganise large chunks of code - it +generates large diffs in which it's hard to see whether anything else +was changed. +

      +If you must reindent or reorganise, don't include any functional +changes that commit and give advance warning that you're about to do +it in case anyone else is changing that file. +

    + + + + -- 1.7.10.4