License | BSD-style |
---|---|
Maintainer | Vincent Hanquez <vincent@snarc.org> |
Stability | experimental |
Portability | portable |
Safe Haskell | None |
Language | Haskell2010 |
Foundation.String
Description
Opaque packed String encoded in UTF8.
The type is an instance of IsString and IsList, which allows OverloadedStrings
for string literals, and fromList
to convert a [Char] (Prelude String) to a packed
representation:
{-# LANGUAGE OverloadedStrings #-}
s = "Hello World" :: String
s = fromList ("Hello World" :: Prelude.String) :: String
Each Unicode code point is represented by a variable-length encoding of 1 to 4 bytes.
For more information about UTF8: https://en.wikipedia.org/wiki/UTF-8
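To make the 1-to-4-byte rule concrete, here is a minimal sketch that uses only `base`. The names `utf8Length` and `utf8Size` are hypothetical helpers written for this illustration and are not part of Foundation.

```haskell
import Data.Char (ord)

-- | Number of bytes UTF-8 uses to encode one code point.
-- Hypothetical helper for illustration; not a Foundation function.
-- Surrogates (U+D800..U+DFFF) are not valid in UTF-8; this sketch does
-- not reject them and simply reports the 3-byte range they fall in.
utf8Length :: Char -> Int
utf8Length c
  | n < 0x80    = 1  -- U+0000..U+007F (ASCII)
  | n < 0x800   = 2  -- U+0080..U+07FF
  | n < 0x10000 = 3  -- U+0800..U+FFFF
  | otherwise   = 4  -- U+10000..U+10FFFF
  where
    n = ord c

-- | Size in bytes of the UTF-8 encoding of a list of characters.
utf8Size :: [Char] -> Int
utf8Size = sum . map utf8Length
```

One consequence is that the byte size of a packed String can differ from its number of characters: `utf8Size "h\xE9"` is 3 for a 2-character input.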
- data String
- data Encoding
- = ASCII7
- | UTF8
- | UTF16
- | UTF32
- | ISO_8859_1
- fromBytes :: Encoding -> UArray Word8 -> (String, Maybe ValidationFailure, UArray Word8)
- fromBytesLenient :: UArray Word8 -> (String, UArray Word8)
- fromBytesUnsafe :: UArray Word8 -> String
- toBytes :: Encoding -> String -> UArray Word8
- data ValidationFailure
- lines :: String -> [String]
- words :: String -> [String]
Documentation
Opaque packed array of characters in the UTF8 encoding
Instances
- IsList String
- Eq String
- Data String
- Ord String
- Show String
- IsString String
- Monoid String
- Buildable String
- InnerFunctor String
- Collection String
- Sequential String
- Zippable String
- Hashable String
- type Item String
- type Element String
- type Mutable String
- type Step String
Constructors
- ASCII7
- UTF8
- UTF16
- UTF32
- ISO_8859_1
fromBytes :: Encoding -> UArray Word8 -> (String, Maybe ValidationFailure, UArray Word8)
Convert a byte array to a String, assuming a specific encoding.
It returns a 3-tuple of:
- The string that has been successfully converted without any error
- An optional validation error
- The remaining buffer that hasn't been processed (either as a result of an error, or because the encoded sequence is not fully available)
When a stream of data is fetched chunk by chunk, an encoded sequence may straddle a chunk boundary. When converting chunks, if the error is Nothing and the remaining buffer is not empty, then this buffer needs to be prepended to the next chunk before that chunk is converted.
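The chunk-by-chunk procedure can be sketched as a small loop that carries the remaining buffer forward. This is a sketch under assumptions, not Foundation code: `decodeChunks` and `toyDecode` are hypothetical names, and the decoder is passed in as a parameter so that anything shaped like `fromBytes enc` should fit (String and UArray Word8 would both need Monoid instances; the source lists one for String, the one for UArray is assumed).

```haskell
-- | Hypothetical helper (not part of Foundation): feed chunks to a decoder
-- shaped like @fromBytes enc@, prepending the unprocessed bytes of each
-- chunk to the next one. Stops at the first validation error.
decodeChunks
  :: (Monoid s, Monoid b)
  => (b -> (s, Maybe e, b))  -- ^ decoder, for example @fromBytes UTF8@
  -> [b]                     -- ^ chunks in arrival order
  -> (s, Maybe e, b)         -- ^ decoded text, first error if any, leftover
decodeChunks decode = go mempty mempty
  where
    -- End of input: a non-empty leftover here means the stream stopped in
    -- the middle of a sequence; the caller decides how to treat that.
    go acc leftover []       = (acc, Nothing, leftover)
    go acc leftover (c : cs) =
      case decode (leftover `mappend` c) of
        (s, Nothing, rest)  -> go (acc `mappend` s) rest cs
        (s, Just err, rest) -> (acc `mappend` s, Just err, rest)

-- | Toy stand-in decoder for trying the helper without Foundation: every
-- "sequence" is exactly two elements long, so an odd trailing element is
-- reported as incomplete input rather than as an error.
toyDecode :: [Int] -> ([(Int, Int)], Maybe (), [Int])
toyDecode (a : b : rest) =
  let (s, e, r) = toyDecode rest in ((a, b) : s, e, r)
toyDecode rest = ([], Nothing, rest)
```

With the toy decoder, `decodeChunks toyDecode [[1,2,3],[4,5],[6]]` pairs the elements across chunk boundaries and ends with an empty leftover. Appending to the accumulator on every step keeps the sketch short; a real implementation would more likely collect the pieces and concatenate once.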
fromBytesLenient :: UArray Word8 -> (String, UArray Word8)
Convert a UTF8 array of bytes to a String.
If the stream contains invalid sequences, replacement bytes are automatically inserted in their place.
If a sequence is split across two chunks, the remaining buffer is meant to be prepended to the next chunk before parsing resumes.
fromBytesUnsafe :: UArray Word8 -> String
Convert a byte array representing UTF8 data directly to a String, without checking for UTF8 validity.
If the input contains invalid sequences, it will trigger asynchronous runtime errors when the data is processed.
When in doubt, use fromBytes.
toBytes :: Encoding -> String -> UArray Word8
Convert a String to a byte array in a specific encoding.
If the encoding is UTF8, the underlying buffer is returned without extra allocation or any processing.
For any other encoding, some allocation and processing are required to perform the conversion.
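To illustrate why a non-UTF8 target needs per-character work, here is a minimal sketch of producing UTF-16 code units from a code point, using only `base`. It is not Foundation's implementation: `utf16Units` is a hypothetical name, and the sketch stops at 16-bit code units, leaving out the byte order that a real byte-array conversion would have to pick.

```haskell
import Data.Char (ord)
import Data.Word (Word16)

-- | UTF-16 code units for one code point.
-- Hypothetical helper for illustration; not a Foundation function.
utf16Units :: Char -> [Word16]
utf16Units c
  | n < 0x10000 = [fromIntegral n]  -- Basic Multilingual Plane: one unit
  | otherwise   = [hi, lo]          -- above U+FFFF: a surrogate pair
  where
    n  = ord c
    m  = n - 0x10000                -- 20-bit offset split into two halves
    hi = fromIntegral (0xD800 + m `div` 0x400)
    lo = fromIntegral (0xDC00 + m `mod` 0x400)
```

Every character has to be decoded from the UTF8 buffer and re-encoded like this into a freshly allocated array, which is the cost the UTF8 case avoids by returning the underlying buffer directly.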