String

General
1. Indexing and slicing
2. String is a sequence
3. Reversing
4. Sorting
5. Operators
Escape characters and raw strings
1. Escape characters
2. Raw strings
Built-in functions
1. ascii()
2. chr()
3. format()
4. input()
5. ord()
6. print()
7. str()
Methods
1. Case conversion
2. Search and replace
3. Classification
4. Formatting
5. Conversion
6. Mappings
f-strings
1. Formatting
2. expression
3. conversion
4. format
string module
io.StringIO

General

Strings are immutable!!!
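
A minimal sketch of what immutability means in practice: item assignment fails, and every "modifying" operation returns a new string. The variable names are only illustrative.

```python
s = 'abc'

try:
    s[0] = 'X'           # strings do not support item assignment
except TypeError as exc:
    error = type(exc).__name__   # error = 'TypeError'

t = 'X' + s[1:]          # build a new string instead; t = 'Xbc'
u = s.upper()            # methods return new strings; u = 'ABC'
# s is still 'abc'
```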

Indexing and slicing:

s = 'abcdefgh'
t = type(s)    # t = <class 'str'>
a = s[1]       # a = 'b'
b = s[-2]      # b = 'g'
c = len(s)     # c = 8
d = s[1:-2:2]  # d = 'bdf'

A string is a sequence:

l = list('abcdefg')         # l = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
m = [i for i in 'abcdefg']  # m = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
# see str.split()

Reversing:

s = 'abcdefgh'
r = s[::-1]               # r = 'hgfedcba'
# or
r = ''.join(reversed(s))  # r = 'hgfedcba'

Sorting:

s = 'tbyjdfasertgert'
r = ''.join(sorted(s, reverse=True))  # r = 'ytttsrrjgfeedba'

Operators

# +
a, b = '123', 'abc'
c = a + b   # c = '123abc'  
# see str.join()
# *
a = 'a'
b = a * 10  # b = 'aaaaaaaaaa'
# in
a = ('bc' in 'abcd')   # a = True

Escape characters and raw strings

Escape characters

Escape sequences:

\<newline>  Backslash and newline ignored
\\          Backslash (\)
\'          Single quote (')
\"          Double quote (")
\a          ASCII Bell (BEL)
\b          ASCII Backspace (BS)
\f          ASCII Formfeed (FF)
\n          ASCII Linefeed (LF)
\r          ASCII Carriage Return (CR)
\t          ASCII Horizontal Tab (TAB)
\v          ASCII Vertical Tab (VT)
\ooo        Character with octal value ooo
\xhh        Character with hex value hh

Escape sequences only recognized in string literals:

\N{name}    Character named name in the Unicode database
\uxxxx      Character with 16-bit hex value xxxx
\Uxxxxxxxx  Character with 32-bit hex value xxxxxxxx
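
As a small illustration of the escapes listed above, the following sketch writes the same character in several ways. The euro sign is just an example character chosen here.

```python
a = '\N{EURO SIGN}'   # by name in the Unicode database
b = '\u20ac'          # 16-bit hex value
c = '\U000020ac'      # 32-bit hex value
same = (a == b == c == '€')   # same = True

d = '\x41'            # hex value 41   -> d = 'A'
e = '\101'            # octal value 101 -> e = 'A'
```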

Unicode codes and characters...

Universal newlines:

\n          Line Feed                     Unix
\r          Carriage Return               Macintosh
\r\n        Carriage Return + Line Feed   Windows
\v or \x0b  Line Tabulation
\f or \x0c  Form Feed
\x1c        File Separator
\x1d        Group Separator
\x1e        Record Separator
\x85        Next Line (C1 Control Code)
\u2028      Line Separator
\u2029      Paragraph Separator

import string
h = string.whitespace
# h = ' \t\n\r\x0b\x0c'

Raw strings

A raw string treats backslashes (\) as literal characters.
Unless an 'r' or 'R' prefix is present, escape sequences in strings are interpreted according to the rules given above.

s = 'abc\tdef\nghi'   # s = 'abc\tdef\nghi'
print(s)
#output:
# abc	def
# ghi
s = r'abc\tdef\nghi'  # s = 'abc\\tdef\\nghi'
print(s)
#output
# abc\tdef\nghi

s1 = r'abc\tdef\nghi' 
s2 = 'abc\\tdef\\nghi'
a = s1 == s2  # a = True
b = s1 is s2  # b is an implementation detail (often True within one script); compare strings with ==

a = len('\n')   # a = 1
b = len(r'\n')  # b = 2

A raw string cannot end with an odd number of backslashes !!!

s = r'\\\'  # SyntaxError: unterminated string literal (before Python 3.10: EOL while scanning string literal)

Raw strings are useful for low-level Windows path handling:

path = 'c:\user\task\new'  
# SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \uXXXX escape

correct:

path = r'c:\user\task\new'
# or
path = 'c:\\user\\task\\new'

Be careful with \ at the end !!!
Raw strings with repr():

s = 'abc\tdef\nghi' 
v = repr(s)  # v = "'abc\\tdef\\nghi'"

See also f-strings for f'string' and bytes for b'string'

Built-in functions

ascii()

Returns a printable representation of an object, like repr(), but with non-ASCII characters escaped.

t = """ąĄĆć
Ńńżź"""
s = ascii(t)  # s = "'\\u0105\\u0104\\u0106\\u0107\\n\\u0143\\u0144\\u017c\\u017a'"
print(s)
# output: '\u0105\u0104\u0106\u0107\n\u0143\u0144\u017c\u017a'
q = ascii('∰')  # q = "'\\u2230'"
w = ascii('a')   # w = "'a'"

chr()

Returns the one-character string for a Unicode code point (an integer).

a = chr(65)       # a = 'A'
b = chr(0x104)    # b = 'Ą'
c = chr(0x2230)   # c = '∰'

format()

See f-strings...
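
For orientation, a short sketch of the built-in format(value, format_spec). It takes the same format specification that f-strings use after the colon. The values are arbitrary examples.

```python
a = format(255, 'x')        # a = 'ff'
b = format(3.14159, '.2f')  # b = '3.14'
c = format('abc', '>6')     # c = '   abc'
d = format(42)              # d = '42' (empty spec behaves like str())
```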

input()

Reads a line from input and returns it as a string (without the trailing newline).

s = input('>')
#> 46
# s = '46'

ord()

Returns the Unicode code point of a given character.

a = ord('A')  # a = 65
b = ord('Ą')  # b = 260  '0x104'
c = ord('€')  # c = 8364 '0x20ac'

print()

Print objects to the text stream file, separated by sep and followed by end. All non-keyword arguments are converted to strings.

print()  # prints only 'end', which is \n by default
print(1, 2, 3, 4)  # 1 2 3 4\n
print(1, 2, 3, 4, sep=',', end='...')  # 1,2,3,4...

file keyword:

import sys
print()
# is equivalent to
print(file=sys.stdout)

with open('text.txt', 'w') as f:
    print('test...', file=f)
# text.txt: test...

flush keyword:
If true, the stream is forcibly flushed after printing.

str()

Returns a string representation of an object. -> __str__()

a = str()    # a = ''
b = str(12)  # b = '12'

For bytes:

s = 'ĄĄĄĄ'
a = s.encode('utf-8')
# !! or !!
a = bytes('ĄĄĄĄ', encoding='utf-8')  # a = b'\xc4\x84\xc4\x84\xc4\x84\xc4\x84'
b = str(a)                           # b = "b'\\xc4\\x84\\xc4\\x84\\xc4\\x84\\xc4\\x84'"
c = str(a, encoding='utf-8')         # c = 'ĄĄĄĄ'
d = str(a, encoding='ascii', errors='ignore')  # d = ''
# !! or !!
e = a.decode('utf-8')                # e = 'ĄĄĄĄ'

Methods

Case conversion

capitalize()
lower()
swapcase()
title()
upper()
casefold() - more aggressive than lower()
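
A quick sketch of the case conversion methods listed above. The sample strings are illustrative only.

```python
s = 'hello WORLD'
a = s.capitalize()  # a = 'Hello world'
b = s.lower()       # b = 'hello world'
c = s.swapcase()    # c = 'HELLO world'
d = s.title()       # d = 'Hello World'
e = s.upper()       # e = 'HELLO WORLD'

# casefold() is meant for caseless comparison
g = 'Straße'.lower()     # g = 'straße'
h = 'Straße'.casefold()  # h = 'strasse'
```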

Search and replace

count()

s = 'ababababababababababa'
v = s.count('ab')          # v = 10 
v = s.count('ab', 10, -2)  # v = 4

endswith()
find() - returns the lowest index of the substring, or -1 if it is not found; to only check for presence, use 'in'
index() - like find(), but raises ValueError when the substring is not found
rfind()
rindex()
replace()
startswith()
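
The search and replace methods above could be exercised roughly like this. The sample string is an assumption made for the example.

```python
s = 'hello world'
a = s.find('o')    # a = 4
b = s.rfind('o')   # b = 7
c = s.find('z')    # c = -1 (not found)
d = s.rindex('l')  # d = 9

try:
    s.index('z')   # index() raises instead of returning -1
except ValueError:
    missing = True

e = s.replace('o', '0')         # e = 'hell0 w0rld'
f = s.replace('o', '0', 1)      # f = 'hell0 world' (at most one replacement)
g = s.startswith(('he', 'wo'))  # g = True (a tuple of prefixes is allowed)
h = s.endswith('ld')            # h = True
```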

Classification

isalnum()
isalpha()
isdigit()
isidentifier()
islower()
isprintable()
isspace() - whitespace
istitle()
isupper()
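
A brief sketch of the classification methods. Each returns a bool, and most return False for an empty string.

```python
a = 'abc123'.isalnum()       # a = True
b = 'abc123'.isalpha()       # b = False (contains digits)
c = '123'.isdigit()          # c = True
d = 'my_var'.isidentifier()  # d = True
e = '2x'.isidentifier()      # e = False (starts with a digit)
f = 'abc'.islower()          # f = True
g = 'a\tb'.isprintable()     # g = False (tab is not printable)
h = ' \t\n'.isspace()        # h = True
i = 'Hello World'.istitle()  # i = True
j = 'ABC'.isupper()          # j = True
k = ''.isalnum()             # k = False (empty string)
```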

Formatting

format()
format_map()
center()
expandtabs()

s = '1\t2\t3'
v = s.expandtabs(4)  # v = '1   2   3'

ljust()
lstrip()
rjust()
rstrip()
strip()

s = '   abc   '
v = s.strip()  # v = 'abc'
s = '# .............. abceef #. qwert........#.'
v = s.strip('# .')  # v = 'abceef #. qwert'
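
The padding and one-sided stripping methods listed above (center, ljust, rjust, lstrip, rstrip) can be sketched as follows. The widths and fill characters are arbitrary.

```python
s = 'abc'
a = s.center(9, '*')  # a = '***abc***'
b = s.ljust(6, '.')   # b = 'abc...'
c = s.rjust(6)        # c = '   abc' (default fill is a space)

t = '  abc  '
d = t.lstrip()        # d = 'abc  '
e = t.rstrip()        # e = '  abc'
```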

zfill()

v = '34'.zfill(5)   # v = '00034'
v = '-34'.zfill(5)  # v = '-0034'

removeprefix() - since Python 3.9
removesuffix() - since Python 3.9
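
A small sketch, assuming Python 3.9 or newer, of how removeprefix() and removesuffix() differ from the strip family. They remove one exact substring, while strip(), lstrip() and rstrip() remove any characters from a set.

```python
name = 'happy.py'
a = name.removesuffix('.py')  # a = 'happy'
b = name.rstrip('.py')        # b = 'ha' (strips any of '.', 'p', 'y')

c = 'test_case'.removeprefix('test_')  # c = 'case'
d = 'test_case'.removeprefix('xyz')    # d = 'test_case' (no match, unchanged)
```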

Conversion

join()
Return a string which is the concatenation of the strings in iterable.

l = ['a', 'b', 'c']
j = '_'
s = j.join(l)  # s = 'a_b_c'

partition()
Split the string at the first occurrence of sep, and return a 3-tuple containing the part before the separator, the separator itself, and the part after the separator. If the separator is not found, return a 3-tuple containing the string itself, followed by two empty strings.

s = 'asd asd d asda XXX dfgg d dfgd XXX df XXX'
v = s.partition('XXX')  # v = ('asd asd d asda ', 'XXX', ' dfgg d dfgd XXX df XXX')

rpartition() - from right side
rsplit() - from right side
split()
Return a list of the words in the string, using sep as the delimiter string.

s = 'ab cd ef'
v = s.split()  # v = ['ab', 'cd', 'ef']
s = 'ab<>cd<>ef<>gh'
v = s.split('<>', maxsplit=2)  # v = ['ab', 'cd', 'ef<>gh']
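
For the right-hand variants, rpartition() and rsplit(), a short sketch with an illustrative path-like string:

```python
path = 'a/b/c.txt'
a = path.rpartition('/')          # a = ('a/b', '/', 'c.txt')
b = path.rsplit('/', maxsplit=1)  # b = ['a/b', 'c.txt']
c = path.split('/', maxsplit=1)   # c = ['a', 'b/c.txt']
d = path.rpartition('#')          # d = ('', '', 'a/b/c.txt') (separator not found)
```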

splitlines()
Return a list of the lines in the string, breaking at line boundaries. Line breaks are not included in the resulting list unless keepends is given and true.
See list of universal newlines.

s = 'ab c\n\nde fg\rkl\r\n'
v = s.splitlines()  # v = ['ab c', '', 'de fg', 'kl']
v = s.splitlines(keepends=True)  # v = ['ab c\n', '\n', 'de fg\r', 'kl\r\n']

encode()
Return an encoded version of the string as a bytes object. Default encoding is 'utf-8'.

s = 'ąćźabc'
b = s.encode(encoding='utf-8')  # b = b'\xc4\x85\xc4\x87\xc5\xbaabc'
b = s.encode(encoding='ascii')
# UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)
b = s.encode(encoding='ascii', errors='replace')  # b = b'???abc'

Possible values of the errors keyword:
strict, ignore, replace, xmlcharrefreplace, backslashreplace, namereplace
Standard encodings
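
To compare the error handlers side by side, here is a sketch that encodes one non-ASCII character to ASCII with each of them. The sample string is illustrative.

```python
s = 'ąbc'
a = s.encode('ascii', errors='ignore')             # a = b'bc'
b = s.encode('ascii', errors='replace')            # b = b'?bc'
c = s.encode('ascii', errors='xmlcharrefreplace')  # c = b'&#261;bc'
d = s.encode('ascii', errors='backslashreplace')   # d = b'\\u0105bc'
e = s.encode('ascii', errors='namereplace')
# e = b'\\N{LATIN SMALL LETTER A WITH OGONEK}bc'
```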

Mappings

translate()
maketrans()
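
A minimal sketch of how maketrans() and translate() work together. maketrans() builds a translation table and translate() applies it character by character. The mappings are made up for the example.

```python
# Three-argument form: map 'a'->'x', 'b'->'y', 'c'->'z' and delete '!'
table = str.maketrans('abc', 'xyz', '!')
a = 'aabbcc!'.translate(table)  # a = 'xxyyzz'

# Dict form: keys are single characters (or code points)
table2 = str.maketrans({'ą': 'a', 'ć': 'c', 'ź': 'z'})
b = 'ąćź'.translate(table2)     # b = 'acz'
```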

f-strings

Formatting

Other (older) methods of string formatting:

str.format()
%-formatting
string.Template
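
For comparison, the older methods next to an f-string, all producing the same text. The names and values are illustrative.

```python
from string import Template

name, value = 'x', 1
a = '{} is {}'.format(name, value)   # str.format()
b = '%s is %d' % (name, value)       # %-formatting
c = Template('$name is $value').substitute(name=name, value=value)
d = f'{name} is {value}'             # f-string
# a == b == c == d == 'x is 1'
```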

f-strings are usually faster than the older methods.
They can be multiline and nested.
The expression part can't be empty and, before Python 3.12, can't contain a backslash (\).

general format:

print(f'{expression!conversion:format}')

expression

variable, object, expression

conversion

!s - str() (with no conversion, format() is applied to the value, which for most objects matches str())
!r - repr()
!a - ascii()

format

:[[<fill>]<align>][<sign>][#][0][<width>][<group>][.<prec>][<type>]

format specification...
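
A few hedged examples of the general format in use, combining an expression, an optional conversion and a format specification. The variable names are arbitrary.

```python
pi = 3.14159
n = 1234567
s = 'ąb'

a = f'{pi:.2f}'    # a = '3.14'       precision and type
b = f'{42:>6}'     # b = '    42'     align right, width 6
c = f'{42:06d}'    # c = '000042'     zero padding
d = f'{n:,}'       # d = '1,234,567'  grouping
e = f'{255:#x}'    # e = '0xff'       alternate form, hex
g = f'{0.25:.0%}'  # g = '25%'        percentage
h = f'{s:*^6}'     # h = '**ąb**'     fill and center
i = f'{s!r}'       # i = "'ąb'"       repr() conversion
j = f'{s!a}'       # j = "'\\u0105b'" ascii() conversion
k = f'{n + 1}'     # k = '1234568'    arbitrary expression
```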

string module

Constants:

import string
a = string.ascii_letters
# a = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'; lowercase + uppercase
b = string.ascii_lowercase
# b = 'abcdefghijklmnopqrstuvwxyz'
c = string.ascii_uppercase
# c = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
d = string.digits
# d = '0123456789'
e = string.hexdigits
# e = '0123456789abcdefABCDEF'
f = string.octdigits
# f = '01234567'
g = string.punctuation
# g = '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
h = string.whitespace
# h = ' \t\n\r\x0b\x0c'; \x0b - \v - line tabulation; \x0c - \f - form feed
i = string.printable
# i = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~ \t\n\r\x0b\x0c'
# digits + ascii_letters + punctuation + whitespace

And...

class Formatter -> str.format()
class Template
capwords()
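
As a sketch, string.capwords() splits on whitespace, capitalizes each word and joins the words with a single space. This differs from str.title() around apostrophes. The sample strings are illustrative.

```python
import string

a = string.capwords('hello   wide  world')  # a = 'Hello Wide World'
b = "they're here".title()                  # b = "They'Re Here"
c = string.capwords("they're here")         # c = "They're Here"
d = string.capwords('a-b-c', sep='-')       # d = 'A-B-C'
```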

io.StringIO

StringIO(initial_value='', newline='\n')
A text stream using an in-memory text buffer.
Methods and attributes:
'close', 'closed', 'detach', 'encoding', 'errors', 'fileno', 'flush', 'getvalue', 'isatty', 'line_buffering', 'newlines', 'read', 'readable', 'readline', 'readlines', 'seek', 'seekable', 'tell', 'truncate', 'writable', 'write', 'writelines'

getvalue() - returns the entire contents of the buffer as a string.
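
A minimal usage sketch of io.StringIO as an in-memory text file. Note that the stream position starts at 0 even when an initial value is given, so this example seeks to the end before writing.

```python
import io

buf = io.StringIO('first\n')
buf.seek(0, io.SEEK_END)    # move to the end so writes append
buf.write('second\n')
print('third', file=buf)    # print() can target any text stream

text = buf.getvalue()       # text = 'first\nsecond\nthird\n'

buf.seek(0)                 # rewind before reading
line = buf.readline()       # line = 'first\n'
rest = buf.readlines()      # rest = ['second\n', 'third\n']
buf.close()
```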
