Class std::Str
Implements Sequence<Str>, Iterable<Str>, Comparable<Str>, Multipliable<Int, Str>
Instances of the Str class are immutable strings. A string is a potentially empty sequence of characters. A character is logically represented by an integer code between 0 and 65535, inclusive. Alore has no special data type for characters — characters are represented as strings of length 1. Character codes can be queried using the Ord function, and strings with specific character codes can be created using the Chr function.
String objects may contain data in various encodings. String methods that interpret string contents (e.g. upper) assume that strings are encoded in the 16-bit Unicode encoding or any subset of Unicode, such as ASCII or Latin 1. Some other modules can work with arbitrary narrow strings, i.e. string objects with only 8-bit characters (character codes between 0 and 255, inclusive).
Characters in a string can be accessed using integer indices starting from 0 (the first character). Alternatively, negative indices can be used to refer to characters from the end of the string: -1 refers to the last character, -2 to the second to last character, etc.
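As an illustrative sketch of the indexing rules and of Ord and Chr described above (the literal "cat" is chosen only for this example; the results follow from the description and from the ASCII code of "c"):

```alore
"cat"[0]              -- Result: "c" (first character)
"cat"[-1]             -- Result: "t" (last character)
"cat"[-2]             -- Result: "a" (second to last character)
Ord("c")              -- Result: 99
Chr(99)               -- Result: "c"
"cat"[0] == Chr(99)   -- Result: True (a character is a string of length 1)
```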
See also: The string and re modules contain useful functions for dealing with strings. The encodings module provides conversions between different character encodings.
- class Str(x)
- Construct an object of the Str type. Call the _str() method of the argument and return a value equal to the result, provided that it is a string. Objects of all primitive types (except Str) and the standard collection types provide a _str method.
See also: The function string::IntToStr and the format method of Str are alternative ways of converting objects to strings.
Methods
- length() as Int
- Return the length of the string.
- lower() as Str
- Return a copy of the string with upper case characters converted into lower case.
- upper() as Str
- Return a copy of the string with lower case characters converted into upper case.
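A minimal sketch of lower and upper, assuming a small Main function as the entry point (the variable name s and the sample text are made up for this example). Because strings are immutable, both methods return new strings and leave the original unchanged:

```alore
def Main()
  var s = "Alore 1.0"
  WriteLn(s.upper())   -- Prints: ALORE 1.0
  WriteLn(s.lower())   -- Prints: alore 1.0
  WriteLn(s)           -- Prints: Alore 1.0 (the original is unchanged)
end
```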
- find(substring as Str[, start as Int]) as Int
- Return the index of the first occurrence of a substring in the string, or -1 if the substring cannot be found. The returned index is the index of the start of the match. If the argument start is present, only occurrences at index start or higher are considered.
- index(substring as Str) as Int
- Return the index of the first occurrence of a substring in the string, or raise ValueError if the substring cannot be found. The returned index is the index of the start of the match.
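The difference between find and index can be sketched as follows. This is only an illustration under the behavior described above; the sample string and the printed message are chosen for the example:

```alore
def Main()
  var s = "a black cat"
  WriteLn(s.find("a"))       -- Prints: 0
  WriteLn(s.find("a", 1))    -- Prints: 4 (only index 1 or higher is considered)
  WriteLn(s.find("dog"))     -- Prints: -1
  WriteLn(s.index("cat"))    -- Prints: 8
  try
    s.index("dog")           -- Raises ValueError instead of returning -1
  except ValueError
    WriteLn("not found")     -- Prints: not found
  end
end
```

Use find when a missing substring is an expected case, and index when it should be treated as an error.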
- replace(old as Str, new as Str[, max as Int]) as Str
- Return a copy of the string with occurrences of old replaced with new, starting from the beginning of the string. If the max argument is present, only replace up to max instances of old. Examples:
"x..x.".replace("x", "yy") -- Result: "yy..yy."
"x..x.".replace("x", "yy", 1) -- Result: "yy..x."
- split([separator as Str[, max as Int]]) as Array<Str>
- Split the string into fields separated by the separator or, if the separator is not specified or is nil, by a run of whitespace characters. In the latter case, whitespace characters at the start and the end of the string are not included in the fields.
Return an array containing the fields. If the separator is given, the result always contains at least a single field, which may be empty. The optional max parameter specifies the maximum number of splits. The rest of the string will be returned as the last element in the array. Examples:
" a black cat ".split() -- Result: ["a", "black", "cat"]
"a,black, cat".split(",") -- Result: ["a", "black", " cat"]
"a,b,c".split(",", 1) -- Result: ["a", "b,c"]
- join(sequence as Sequence<Str>) as Str
- Concatenate the strings in a sequence. Use the string as the separator. Examples:
" ".join(["a", "black", "cat"]) -- Result: "a black cat"
"".join(["a", "b", "cd"]) -- Result: "abcd"
", ".join(["cat"]) -- Result: "cat"
- count(substring as Str) as Int
- Return the number of times a substring occurs in the string (without overlapping).
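To illustrate what "without overlapping" means, here is a short sketch. It assumes matches are searched from left to right, which the description above does not state explicitly:

```alore
"banana".count("an")   -- Result: 2
"aaaa".count("aa")     -- Result: 2 (matches at indices 0 and 2; the overlapping match at 1 is not counted)
"banana".count("x")    -- Result: 0
```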
- iterator() as Iterator<Str>
- Return an iterator object that can be used to sequentially iterate over the characters in the string, starting from the first character.
- strip() as Str
- Return a copy of the string with leading and trailing whitespace characters removed. Only ASCII space, tab, CR and LF characters are removed.
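A small illustration of strip, using the Tab constant that also appears in the Repr example on this page (the sample strings are chosen for this example):

```alore
("  cat" + Tab).strip()    -- Result: "cat"
" a black cat ".strip()    -- Result: "a black cat" (inner spaces are kept)
"cat".strip()              -- Result: "cat"
```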
- format(...) as Str
- When this method is called on a format string object, return a string constructed according to the format string and the optional arguments. Most characters in the format string are returned unmodified:
"foo bar".format() -- Result: "foo bar"
Empty brace expressions are replaced with method arguments converted to strings:
"{} and {}".format(1, "2") -- Result: "1 and 2"
Brace characters can be added to the result by duplicating them in the format:
"{{ and }}".format() -- Result: "{ and }"
The contents of brace expressions may optionally be prefixed with a field width specifier, an integer followed by a colon. The replacement is padded with spaces to have at least as many characters as the absolute value of the width. If the width is positive, the result is right-aligned, otherwise left-aligned:
"{4:}/{-3:}".format("ab", "c") -- Result: "  ab/c  "
Brace expressions for numeric arguments may contain an additional format template that specifies the format of the result. A fractional format template contains one or more zeroes, optionally followed by a dot, a (potentially empty) run of zeroes and a (potentially empty) run of hash (#) characters. The zeroes specify the minimum number of digits in the integer part and the fraction, and the hash characters specify optional fraction digits that are only included if they are non-zero:
"{0000}".format(12) -- Result: "0012"
"{0.00}".format(12.345) -- Result: "12.35"
"{0.0####}".format(1.23) -- Result: "1.23"
"{5:0.0}".format(1.2) -- Result: "  1.2"
A scientific format template contains a zero, optionally followed by a dot and a run of zeroes and a run of hash characters; and an exponent template. The exponent template contains 'e' or 'E', an optional '+' and a non-empty run of zeroes. The dot and the following zeroes and hash characters specify the number of decimals shown in the coefficient; the exponent template specifies the minimum number of digits and the type of the exponent:
"{0.0e0}".format(1234) -- Result: "1.2e3"
"{0.0E0}".format(1234) -- Result: "1.2E3"
"{0.###e+00}".format(1000) -- Result: "1e+03"
"{0.###e+00}".format(1200) -- Result: "1.2e+03"
"{0.00##e0}".format(0.1) -- Result: "1.00e-1"
"{0e0}".format(15) -- Result: "2e1"
- startsWith(prefix as Str) as Boolean
- Return a boolean indicating whether the string starts with the prefix.
- endsWith(suffix as Str) as Boolean
- Return a boolean indicating whether the string ends with the suffix.
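For instance, with a made-up file name:

```alore
"report.txt".startsWith("report")   -- Result: True
"report.txt".endsWith(".txt")       -- Result: True
"report.txt".endsWith(".pdf")       -- Result: False
```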
- decode(encoding as Encoding[, mode as Constant]) as Str
- Decode the string to 16-bit Unicode using the given character encoding. The mode argument may be encodings::Strict (the default) or encodings::Unstrict. Use this to convert strings in 8-bit binary encodings to Unicode so that you can use them with operations such as lower() that expect Unicode strings.
Example:
"\u00c3\u00a4".decode(encodings::Utf8) -- Decode "ä" in UTF-8 to 16-bit Unicode
See also: Module encodings
- encode(encoding as Encoding[, mode as Constant]) as Str
- Encode the string (interpreted as 16-bit Unicode) using the given character encoding. The mode argument may be encodings::Strict (the default) or encodings::Unstrict. Calling this method is equivalent to encoding.encoder([mode]).encode(str).
Example:
"\u20ac".encode(encodings::Utf8) -- Encode the Euro sign using UTF-8
See also: Module encodings
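The following sketch shows how encode and decode might be combined in a round trip. It assumes a Main entry point; the variable names euro and utf8 and the printed message are made up for this example. The length of 3 follows from the UTF-8 encoding of the Euro sign (U+20AC), which takes three bytes:

```alore
import encodings

def Main()
  var euro = "\u20ac"                        -- A single 16-bit character
  var utf8 = euro.encode(encodings::Utf8)    -- Narrow string holding the UTF-8 bytes
  WriteLn(utf8.length())                     -- Prints: 3
  if utf8.decode(encodings::Utf8) == euro
    WriteLn("round trip ok")                 -- Prints: round trip ok
  end
end
```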
Operations
Str objects support the following operations:
- str[n] (Str[Int] ⇒ Str; Str[Pair<Int, Int>] ⇒ Str)
- If the index n is an integer, return the character at the specified index as a string of length 1. If the index value is out of bounds, raise an IndexError exception.
If the index is a pair x : y, return a slice containing the indices x, x + 1, ..., y - 1. If the left value of the pair is omitted or nil, it is assumed to be 0. If the right value is omitted or nil, the result is a substring extending to the end of the string. Invalid indices in range bounds are clipped to lie within the string.
"hello"[2] -- "l"
"hello"[1:3] -- "el"
"hello"[3:] -- "lo"
"hello"[:-1] -- "hell"
- substr in str (Str in Str ⇒ Boolean)
- Test whether a string contains a substring. Return a boolean value.
- for ch in str (for Str in Str)
- The characters in a string can be iterated with a for loop, starting from the first character.
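A minimal sketch of iterating over a string, assuming a Main entry point; each ch is a string of length 1, and Str is used to convert the character code for printing:

```alore
def Main()
  for ch in "cat"
    WriteLn(ch + " " + Str(Ord(ch)))
  end
end
-- Prints:
--   c 99
--   a 97
--   t 116
```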
- x + y (Str + Str ⇒ Str)
- Return the concatenation of two strings.
- str * n (Str * Int ⇒ Str)
- n * str (Int * Str ⇒ Str)
- A string can be repeated any number of times by multiplying it with an integer. The integer must not be negative. Multiplying a string with zero results in an empty string.
"foo" * 3 -- "foofoofoo"
"x" * 0 -- ""
- x == y (Str == Object ⇒ Boolean)
- Strings can be compared for equality.
- x < y (Str < Str ⇒ Boolean)
- x > y (Str > Str ⇒ Boolean)
- Strings can be compared in lexicographic order. Order comparisons are based on the numeric values of characters.
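Because ordering is based on character codes rather than on dictionary rules, upper case ASCII letters sort before lower case ones. A few illustrative comparisons (the sample words are chosen for this example):

```alore
"apple" < "banana"   -- Result: True ("a" precedes "b")
"Zebra" < "apple"    -- Result: True (Ord("Z") is 90, Ord("a") is 97)
"cat" > "car"        -- Result: True (the first difference is "t" against "r")
"cat" == "cat"       -- Result: True
```

If a case-insensitive ordering is needed, one option is to compare the results of lower() on both strings.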
- Repr(str)
- Return a string representing the string using Alore string literal syntax, and only using printable ASCII characters. Characters other than printable ASCII characters are represented using \uNNNN escape sequences.
WriteLn(Repr("""foo" + Tab + "\uffff")) -- Print """foo\u0009\uffff"
- Int(str)
- Convert a string to an integer.
- Float(str)
- Convert a string to a float.
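As a brief, hedged illustration of these conversions (the input strings are chosen for this example, and the float result is shown as a value rather than as exact printed output):

```alore
Int("42") + 1             -- Result: 43
Float("2.5") + 0.5        -- Result: 3.0
Str(Int("42")) == "42"    -- Result: True (Str converts the integer back to a string)
```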
- Hash(str)
- Return the hash value of a string.