From: rjmccall@gmail.com
Date: Sun, 17 Sep 2006 00:36:41 +0000 (+0000)
Subject: Modify toArgs to parse quotes/escapes like /bin/sh
X-Git-Url: http://git.megacz.com/?p=ghc-hetmet.git;a=commitdiff_plain;h=3d9f1290f071b0d2e48060a160995af7608d1045

Modify toArgs to parse quotes/escapes like /bin/sh

Addresses ticket #197, which asks for escape sequences to be supported
directly (i.e. not only in dquoted strings) on :load commands in GHCI.
Fix modifies the toArgs function to parse its input like /bin/sh does, i.e.
recognizing escapes anywhere and treating quoted strings as atomic chunks.
Thus:
  :load a\ b c\"d e" "f
would parse with three arguments, namely 'a b', 'c"d', and 'e f'.

toArgs is used to parse arguments for both :load and :main, but doesn't
appear to be used elsewhere. I see no harm in modifying both to be
consistent -- in fact, the functionality is probably more useful for :main
than for :load.
---

diff --git a/compiler/utils/Util.lhs b/compiler/utils/Util.lhs
index 4ce14ab..522f795 100644
--- a/compiler/utils/Util.lhs
+++ b/compiler/utils/Util.lhs
@@ -757,29 +757,45 @@ looksLikeModuleName (c:cs) = isUpper c && go cs
     go (c:cs) = (isAlphaNum c || c == '_') && go cs
 \end{code}
 
-Akin to @Prelude.words@, but sensitive to dquoted entities treating
-them as single words.
+Akin to @Prelude.words@, but acts like the Bourne shell, treating
+quoted strings and escaped characters within the input as solid blocks
+of characters. Doesn't raise any exceptions on malformed escapes or
+quoting.
 
 \begin{code}
 toArgs :: String -> [String]
 toArgs "" = []
 toArgs s =
-  case break (\ ch -> isSpace ch || ch == '"') (dropWhile isSpace s) of -- "
-    (w,aft) ->
-       (\ ws -> if null w then ws else w : ws) $
-       case aft of
-         [] -> []
-         (x:xs)
-           | x /= '"' -> toArgs xs
-           | otherwise ->
-             case lex aft of
-               ((str,rs):_) -> stripQuotes str : toArgs rs
-               _ -> [aft]
+  case dropWhile isSpace s of -- drop initial spacing
+    [] -> [] -- empty, so no more tokens
+    rem -> let (tok,aft) = token rem [] in tok : toArgs aft
  where
-    -- strip away dquotes; assume first and last chars contain quotes.
-   stripQuotes :: String -> String
-   stripQuotes ('"':xs) = init xs
-   stripQuotes xs = xs
+   -- Grab a token off the string, given that the first character exists and
+   -- isn't whitespace. The second argument is an accumulator which has to be
+   -- reversed at the end.
+   token [] acc = (reverse acc,[]) -- out of characters
+   token ('\\':c:aft) acc -- escapes
+     = token aft ((escape c) : acc)
+   token (q:aft) acc | q == '"' || q == '\'' -- open quotes
+     = let (aft',acc') = quote q aft acc in token aft' acc'
+   token (c:aft) acc | isSpace c -- unescaped, unquoted spacing
+     = (reverse acc,aft)
+   token (c:aft) acc -- anything else goes in the token
+     = token aft (c:acc)
+
+   -- Get the appropriate character for a single-character escape.
+   escape 'n' = '\n'
+   escape 't' = '\t'
+   escape 'r' = '\r'
+   escape c = c
+
+   -- Read into accumulator until a quote character is found.
+   quote qc =
+     let quote' [] acc = ([],acc)
+         quote' ('\\':c:aft) acc = quote' aft ((escape c) : acc)
+         quote' (c:aft) acc | c == qc = (aft,acc)
+         quote' (c:aft) acc = quote' aft (c:acc)
+     in quote'
 \end{code}
 
 -- -----------------------------------------------------------------------------
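
For readers who want to try the new behaviour outside the GHC tree, here is a
minimal standalone sketch of the tokenizer described in the commit message. It
is an illustration rather than the committed code: the layout is slightly
rearranged, the local name `rest` replaces `rem`, and `quote` is written as a
three-argument function. The only import it assumes is `isSpace` from
`Data.Char`.

```haskell
import Data.Char (isSpace)

-- Split a command line into arguments roughly the way /bin/sh would:
-- backslash escapes are honoured anywhere, and single- or double-quoted
-- runs are absorbed into the surrounding token.
toArgs :: String -> [String]
toArgs s =
  case dropWhile isSpace s of
    []   -> []                                   -- nothing left to tokenize
    rest -> let (tok, aft) = token rest [] in tok : toArgs aft
  where
    -- Accumulate one token in reverse; return it with the unconsumed input.
    token [] acc = (reverse acc, [])
    token ('\\':c:aft) acc = token aft (escape c : acc)
    token (q:aft) acc
      | q == '"' || q == '\'' =
          let (aft', acc') = quote q aft acc in token aft' acc'
    token (c:aft) acc
      | isSpace c = (reverse acc, aft)           -- unquoted space ends the token
      | otherwise = token aft (c : acc)

    -- Single-character escapes; unknown escapes yield the character itself.
    escape 'n' = '\n'
    escape 't' = '\t'
    escape 'r' = '\r'
    escape c   = c

    -- Copy characters into the accumulator until the closing quote qc.
    -- An unterminated quote simply runs to the end of the input.
    quote _  [] acc = ([], acc)
    quote qc ('\\':c:aft) acc = quote qc aft (escape c : acc)
    quote qc (c:aft) acc
      | c == qc   = (aft, acc)
      | otherwise = quote qc aft (c : acc)
```

Loaded into GHCi, evaluating `toArgs "a\\ b c\\\"d e\" \"f"` (the commit
message example written as a Haskell string literal) gives
`["a b","c\"d","e f"]`, matching the three arguments listed above.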