+ Here are the assumptions GHC makes about address space layout.
+ Broadly, it thinks there are three sections, plus a catch-all (USER):
+
+ CODE Read-only. Contains code and read-only data (such as
+ info tables)
+ Also called "text"
+
+ DATA Read-write data. Contains static closures (and on some
+ architectures, info tables too)
+
+ HEAP Dynamically-allocated closures
+
+ USER None of the above. The only way USER things arise right
+ now is when GHCi allocates a constructor info table, which
+ it does by mallocing them.
+
+ These macros identify the areas:
+ IS_DATA_PTR(p), IS_USER_PTR(p), HEAP_ALLOCED(p)
+
+ HEAP_ALLOCED is called FOR EVERY SINGLE CLOSURE during GC.
+ It needs to be FAST.
+
+ Implementation of HEAP_ALLOCED
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ Concerning HEAP, most of the time (certainly under [Static] and [GHCi]),
+ we ensure that the heap is allocated above some fixed address HEAP_BASE
+ (defined in MBlock.h). In this case we set TEXT_BEFORE_HEAP, and we
+ get a nice fast test.
+
+ Sometimes we can't be quite sure. For example, on Windows we can't
+ fix where our heap address space comes from. In this case we un-set
+ TEXT_BEFORE_HEAP. That makes it more expensive to test whether a pointer
+ comes from the HEAP section, because we need to look at the allocator's
+ address maps (see the HEAP_ALLOCED macro).
+
+ Implementation of CODE and DATA
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ Concerning CODE and DATA, there are three main regimes:
+
+ [Static]  Totally statically linked.
+           The segments are contiguous, and laid out exactly as above.
+
+ [GHCi]    Static, except for GHCi.
+           GHCi may load new modules, but it knows the address map, so
+           for any given address it can still tell which section it
+           belongs to.
+
+ [DLL]     OS-supported dynamic loading.
+           Chunks of CODE and DATA may be mixed in the address space,
+           and we can't tell how
+
+
+ For the [Static] case, we assume memory is laid out like this
+ (in order of increasing addresses)
+
+ Start of memory
+ CODE section
+ TEXT_SECTION_END_MARKER (usually _etext)
+ DATA section
+ DATA_SECTION_END_MARKER (usually _end)
+ USER section
+ HEAP_BASE
+ HEAP section
+
+ For the [GHCi] case, we have to consult the address maps of GHCi's
+ dynamic linker, which is done by the macros
+ is_dynamically_loaded_code_or_rodata_ptr
+ is_dynamically_loaded_rwdata_ptr
+
+ For the [DLL] case, IS_DATA_PTR is really not usable at all.
+ */
+
+
+#undef TEXT_BEFORE_HEAP
+#if !defined(mingw32_TARGET_OS) && !defined(cygwin32_TARGET_OS)
+#define TEXT_BEFORE_HEAP 1
+#endif
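
Review note: the two HEAP_ALLOCED strategies that TEXT_BEFORE_HEAP selects between can be sketched as follows. This is only an illustration under stated assumptions, not the RTS implementation: every `sketch_`/`SKETCH_` name is hypothetical, the base address and block size are made-up values, and addresses are taken to be 32 bits wide so that a flat one-byte-per-block map is small enough to declare statically.

```c
#include <stdint.h>

/* Hypothetical values, for illustration only (not the real HEAP_BASE). */
#define SKETCH_HEAP_BASE    0x50000000u
#define SKETCH_BLOCK_SHIFT  20            /* assume 1MB allocation blocks */
#define SKETCH_MAP_ENTRIES  4096          /* 2^32 / 2^20 blocks */

/* Fast test (TEXT_BEFORE_HEAP set): the heap is known to live above a
   fixed address, so a single comparison is enough. */
static int sketch_heap_alloced_fast(uint32_t p)
{
    return p >= SKETCH_HEAP_BASE;
}

/* Slow test (TEXT_BEFORE_HEAP unset): consult an address map that the
   allocator fills in whenever it obtains a block from the OS. */
static unsigned char sketch_block_map[SKETCH_MAP_ENTRIES];

static void sketch_mark_block(uint32_t p)
{
    sketch_block_map[p >> SKETCH_BLOCK_SHIFT] = 1;
}

static int sketch_heap_alloced_slow(uint32_t p)
{
    return sketch_block_map[p >> SKETCH_BLOCK_SHIFT] != 0;
}
```

The slow variant costs a shift and a memory load per query instead of one comparison, which matters for a test that runs for every closure during GC.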
+
+extern void* TEXT_SECTION_END_MARKER_DECL;
+extern void* DATA_SECTION_END_MARKER_DECL;
+
+#ifdef darwin_TARGET_OS
+extern unsigned long macho_etext;
+extern unsigned long macho_edata;
+#define IS_CODE_PTR(p) ( ((P_)(p) < (P_)macho_etext) \
+ || is_dynamically_loaded_code_or_rodata_ptr((char *)(p)) )
+#define IS_DATA_PTR(p) ( ((P_)(p) >= (P_)macho_etext && \
+ (P_)(p) < (P_)macho_edata) \
+ || is_dynamically_loaded_rwdata_ptr((char *)(p)) )
+#define IS_USER_PTR(p) ( ((P_)(p) >= (P_)macho_edata) \
+ && is_not_dynamically_loaded_ptr((char *)(p)) )
+#else
+/* Take into account code sections in dynamically loaded object files. */
+#define IS_DATA_PTR(p) ( ((P_)(p) >= (P_)&TEXT_SECTION_END_MARKER && \
+ (P_)(p) < (P_)&DATA_SECTION_END_MARKER) \
+ || is_dynamically_loaded_rwdata_ptr((char *)(p)) )
+#define IS_USER_PTR(p) ( ((P_)(p) >= (P_)&DATA_SECTION_END_MARKER) \
+ && is_not_dynamically_loaded_ptr((char *)(p)) )
+#endif
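
Review note: for the [Static] regime, the layout described in the comment above amounts to a chain of range comparisons. A minimal sketch, assuming the three boundaries are passed in as plain integers; the `sketch_` names are hypothetical stand-ins for the linker-provided markers and HEAP_BASE, and the dynamic-linker lookups used in the [GHCi] regime are deliberately left out.

```c
#include <stdint.h>

typedef enum { SEC_CODE, SEC_DATA, SEC_USER, SEC_HEAP } sketch_section;

typedef struct {
    uintptr_t text_end;   /* stands in for &TEXT_SECTION_END_MARKER */
    uintptr_t data_end;   /* stands in for &DATA_SECTION_END_MARKER */
    uintptr_t heap_base;  /* stands in for HEAP_BASE */
} sketch_layout;

/* Classify an address by walking the boundaries in increasing order:
   CODE < text_end <= DATA < data_end <= USER < heap_base <= HEAP. */
static sketch_section sketch_classify(const sketch_layout *l, uintptr_t p)
{
    if (p < l->text_end)  return SEC_CODE;
    if (p < l->data_end)  return SEC_DATA;
    if (p < l->heap_base) return SEC_USER;
    return SEC_HEAP;
}
```

This only holds when the segments really are contiguous and ordered as assumed, which is exactly what fails in the [DLL] regime.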
+
+/* --------------------------------------------------------------------------
+ Macros for distinguishing data pointers from code pointers
+ --------------------------------------------------------------------------
+
+ Specification
+ ~~~~~~~~~~~~~
+ The garbage collector needs to make some critical distinctions between pointers.
+ In particular we need
+
+ LOOKS_LIKE_GHC_INFO(p)    p points to an info table
+
+ For this macro, p is
+ *either* a pointer to a closure (static or heap allocated)
+ *or* a return address on the (Haskell) stack
+
+ (Return addresses are in fact info-pointers, so that the Haskell stack
+ looks very like a chunk of heap.)
+
+ The garbage collector uses LOOKS_LIKE_GHC_INFO when walking the stack, as it
+ walks over the "pending arguments" on its way to the next return address.
+ It is called moderately often, but not as often as HEAP_ALLOCED.
+
+ ToDo: LOOKS_LIKE_GHC_INFO(p) does not return True when p points to a
+ constructor info table allocated by GHCi. We should really rename
+ LOOKS_LIKE_GHC_INFO to LOOKS_LIKE_GHC_RETURN_INFO.
+
+ Implementation
+ ~~~~~~~~~~~~~~
+ LOOKS_LIKE_GHC_INFO is more complicated because of the need to distinguish
+ between static closures and info tables. It's a known portability problem.
+ We have three approaches:
+
+ Plan A: Address-space partitioning.
+ Keep static closures in the (single, contiguous) data segment, and test
+ with IS_DATA_PTR(p).
+
+ Plan A can fail for two reasons:
+ * In many environments (e.g. dynamic loading),
+ text and data aren't in a single contiguous range.
+ * When we compile through vanilla C (no mangling) we sometimes
+ can't guarantee to put info tables in the text section. This
+ happens e.g. on MacOS, where the C compiler refuses to put const
+ data in the text section if it has any code pointers in it
+ (which info tables do *only* when we're compiling without
+ TABLES_NEXT_TO_CODE).
+
+ Hence, Plan B: (compile-via-C-with-mangling, or native code generation)
+ Put a zero word before each static closure.
+ When compiling to native code, or via C-with-mangling, info tables
+ are laid out "backwards" from the address specified in the info pointer
+ (the entry code goes forward from the info pointer). Hence, the word
+ before the one referenced by the info pointer is part of the info table,
+ and is guaranteed non-zero.
+
+ For reasons nobody seems to fully understand, the statically-allocated tables
+ of INTLIKE and CHARLIKE closures can't have this zero word, so we
+ have to test separately for them.
+
+ Plan B fails altogether for the compile-through-vanilla-C route, because
+ info tables aren't laid out backwards.
+
+
+ Hence, Plan C: (unregisterised, compile-through-vanilla-C route only)
+ If we didn't manage to get info tables into the text section, then
+ we can distinguish between a static closure pointer and an info
+ pointer as follows: the first word of an info table is a code pointer,
+ and therefore in text space, whereas the first word of a closure pointer
+ is an info pointer, and therefore not. Shazam!
+*/
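
Review note: the Plan B test described above (a zero word in front of every static closure, a non-zero info-table word in front of every info pointer) can be sketched with two mock objects. The arrays and the `sketch_` names are hypothetical and their contents are arbitrary; only the position of the zero word matters.

```c
#include <stdint.h>

typedef uintptr_t sketch_word;

/* Plan B test: treat p as a static closure if the word immediately
   before it is zero. */
static int sketch_has_zero_prefix(const sketch_word *p)
{
    return p[-1] == 0;
}

/* Mock static closure: the zero prefix word, then the closure itself
   (element 1 onwards). */
static const sketch_word sketch_static_closure[] = { 0, 1001, 42 };

/* Mock "backwards" info table: non-zero table words precede the address
   an info pointer would refer to (element 2 here). */
static const sketch_word sketch_info_table[] = { 7, 3, 9 };
```

With these mocks, `sketch_has_zero_prefix(&sketch_static_closure[1])` holds and `sketch_has_zero_prefix(&sketch_info_table[2])` does not. The test reads memory just below the pointer, so it is only safe for pointers already known to be one of the two kinds.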
+
+
+/* When working with Win32 DLLs, static closures are identified by
+ being prefixed with a zero word. This is needed so that we can
+ distinguish between pointers to static closures and (reversed!)
+ info tables.
+
+ This 'scheme' breaks down for closure tables such as CHARLIKE,
+ so we catch these separately.
+
+ LOOKS_LIKE_STATIC_CLOSURE()
+ - discriminates between static closures and info tables
+ (needed by LOOKS_LIKE_GHC_INFO() below - [Win32 DLLs only.])
+ LOOKS_LIKE_STATIC()
+ - distinguishes between static and heap allocated data.
+ */
+#if defined(ENABLE_WIN32_DLL_SUPPORT)
+ /* definitely do not enable for mingw DietHEP */
+#define LOOKS_LIKE_STATIC(r) (!(HEAP_ALLOCED(r)))
+
+/* Tiresome predicates needed to check for pointers into the closure tables */
+#define IS_CHARLIKE_CLOSURE(p) \
+ ( (P_)(p) >= (P_)stg_CHARLIKE_closure && \
+ (char*)(p) <= ((char*)stg_CHARLIKE_closure + \
+ (MAX_CHARLIKE-MIN_CHARLIKE) * sizeof(StgIntCharlikeClosure)) )
+#define IS_INTLIKE_CLOSURE(p) \
+ ( (P_)(p) >= (P_)stg_INTLIKE_closure && \
+ (char*)(p) <= ((char*)stg_INTLIKE_closure + \
+ (MAX_INTLIKE-MIN_INTLIKE) * sizeof(StgIntCharlikeClosure)) )
+
+#define LOOKS_LIKE_STATIC_CLOSURE(r) (((*(((unsigned long *)(r))-1)) == 0) || IS_CHARLIKE_CLOSURE(r) || IS_INTLIKE_CLOSURE(r))
+
+#elif defined(darwin_TARGET_OS) && !defined(TABLES_NEXT_TO_CODE)
+
+#define LOOKS_LIKE_STATIC(r) (!(HEAP_ALLOCED(r)))
+#define LOOKS_LIKE_STATIC_CLOSURE(r) (IS_DATA_PTR(r) && !LOOKS_LIKE_GHC_INFO(r))
+
+#else
+
+#define LOOKS_LIKE_STATIC(r) IS_DATA_PTR(r)
+#define LOOKS_LIKE_STATIC_CLOSURE(r) IS_DATA_PTR(r)
+
+#endif
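
Review note: the IS_CHARLIKE_CLOSURE and IS_INTLIKE_CLOSURE predicates above are inclusive range checks over a table of fixed-size closures, where the upper bound is the address of the last element. A minimal sketch with a hypothetical table; the struct, the bounds and the `sketch_` names are assumptions, not the RTS definitions.

```c
#include <stddef.h>

/* Hypothetical fixed-size closure and table bounds. */
typedef struct { void *header; long payload; } sketch_closure;

#define SKETCH_MIN_LIKE 0
#define SKETCH_MAX_LIKE 255

static sketch_closure sketch_table[SKETCH_MAX_LIKE - SKETCH_MIN_LIKE + 1];

/* True if p lies between the first and the last table element, inclusive.
   Note: comparing pointers into unrelated objects is not strictly portable
   C; the RTS macros rely on a flat address space. */
static int sketch_in_table(const void *p)
{
    const char *lo = (const char *)sketch_table;
    const char *hi = lo + (SKETCH_MAX_LIKE - SKETCH_MIN_LIKE) * sizeof(sketch_closure);
    return (const char *)p >= lo && (const char *)p <= hi;
}
```

Because the bound is `(MAX - MIN) * sizeof(...)` rather than `(MAX - MIN + 1) * sizeof(...)`, the comparison has to be `<=` for the last element to be included.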
+
+
+/* -----------------------------------------------------------------------------
+ Macros for distinguishing infotables from closures.
+
+ You'd think it'd be easy to tell an info pointer from a closure pointer:
+ closures live on the heap and infotables are in read-only memory. Right?
+ Wrong! Static closures live in read-only memory and Hugs allocates
+ infotables for constructors on the (writable) C heap.