rics.strings

Utility functions that act on or produce strings.

Functions

camel_to_snake(s)

Naive camelCase or PascalCase to snake_case conversion.

format_bytes(n, *[, binary, long, decimals])

Format bytes as a string.

format_kwargs(kwargs, *[, max_value_length, ...])

Format keyword arguments.

format_perf_counter(start, *[, end, full])

Format performance counter output.

format_seconds(t, *[, allow_negative, full])

Format a duration given in seconds.

snake_to_camel(s, *[, lower])

Naive snake_case to camelCase conversion.

str_as_bool(s)

Convert a string s to a boolean value.

Classes

ReprFormatter(*[, max_value_length, ...])

Alternative repr() implementation.

format_bytes(n: int, *, binary: bool = True, long: bool = False, decimals: int = 2) -> str

Format bytes as a string.

Parameters:
  • n – Number of bytes. Must be positive.

  • binary – Output binary prefixes if True, use metric (SI) prefixes otherwise.

  • long – Output the full unit and prefix if True, use abbreviated versions otherwise.

  • decimals – Number of decimals to include. Ignored when n < base.

Returns:

Formatted number of bytes.

Examples

Formatting on prefix bounds

The jump to the next prefix is made at base / 2, where base is 1024 by default and 1000 when binary=False.

>>> format_bytes(512 * 1024)
'512.00 KiB'
>>> format_bytes(512 * 1024 + 1)
'0.50 MiB'

This rule does not apply when n <= base.

>>> format_bytes(1024, long=True)
'1024 bytes'
>>> format_bytes(1024 + 1)
'1.00 KiB'
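
The prefix-selection rule described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the library's implementation: the name sketch_format_bytes and the shortened unit tables (including the "kB" abbreviation) are hypothetical, and only the abbreviated form (long=False) is covered.

```python
def sketch_format_bytes(n: int, *, binary: bool = True, decimals: int = 2) -> str:
    """Hypothetical re-creation of the base / 2 jump rule (not the rics source)."""
    base = 1024 if binary else 1000
    # Assumed unit tables; the real function supports more prefixes than these.
    units = ["KiB", "MiB", "GiB", "TiB"] if binary else ["kB", "MB", "GB", "TB"]

    if n <= base:
        # The jump rule does not apply here, and no decimals are shown.
        return f"{n} bytes"

    value = n / base
    index = 0
    # Move to the next prefix once the value exceeds half the base.
    while value > base / 2 and index < len(units) - 1:
        value /= base
        index += 1
    return f"{value:.{decimals}f} {units[index]}"


print(sketch_format_bytes(512 * 1024))              # 512.00 KiB
print(sketch_format_bytes(512 * 1024 + 1))          # 0.50 MiB
print(sketch_format_bytes(20190511, binary=False))  # 20.19 MB
```

Comparing against base / 2 rather than base is what makes 512 KiB the last value shown in KiB.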

Output flags

>>> format_bytes(20190511, binary=False, long=False)
'20.19 MB'
>>> format_bytes(20190511, binary=False, long=True)
'20.19 megabytes'
>>> format_bytes(20190511, binary=True, long=False)
'19.26 MiB'
>>> format_bytes(20190511, binary=True, long=True)
'19.26 mebibytes'

Large outputs

Metric and binary have different upper limits.

>>> format_bytes(21**21, binary=True)
'2416.44 YiB'
>>> format_bytes(21**21, binary=True, long=True)
'2416.44 yobibytes'
>>> format_bytes(21**21, binary=False)
'5.84 RB'
>>> format_bytes(21**21, binary=False, long=True)
'5.84 ronnabytes'

If you ever see output like this, please let me know so that I can brag that someone important is using my little library.

format_perf_counter(start: float, *, end: float | None = None, full: bool = False) -> str

Format performance counter output.

This function formats performance counter output based on the time elapsed. This is a thin wrapper around the format_seconds() function.

Parameters:
  • start – Start time.

  • end – End time. Retrieved using time.perf_counter() if None.

  • full – If True, show all non-zero components above four hours.

Returns:

A formatted performance counter time.

Examples

Basic usage.

>>> import time
>>> start = time.perf_counter()
>>> time.sleep(1219.0)
>>> format_perf_counter(start)
'20m 19s'

With no end argument given, the current time is retrieved using time.perf_counter().
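
The defaulting of end can be illustrated with a small stand-alone helper. This is a sketch of the idea only: elapsed_seconds is a hypothetical name, and the value it returns is what a wrapper of this kind would presumably hand to format_seconds().

```python
import time
from typing import Optional


def elapsed_seconds(start: float, end: Optional[float] = None) -> float:
    """Hypothetical helper: elapsed time between two perf_counter readings."""
    if end is None:
        # Compare against None so that an explicit end of 0.0 is respected.
        end = time.perf_counter()
    return end - start


start = time.perf_counter()
sum(range(100_000))  # stand-in for the work being timed
print(f"took {elapsed_seconds(start):.6f} seconds")
print(elapsed_seconds(10.0, end=12.5))  # 2.5
```

Passing end explicitly is useful when the same reading should be reused for several messages.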

format_seconds(t: float, *, allow_negative: bool = False, full: bool = False) -> str

Format a duration given in seconds.

Parameters:
  • t – Time in seconds.

  • allow_negative – If True, format negative t with a leading minus sign.

  • full – If True, show all non-zero components above four hours.

Returns:

A formatted performance counter time.

Examples

Basic usage.

>>> format_seconds(0.0000154)
'15 μs'
>>> format_seconds(0.154)
'154 ms'
>>> format_seconds(31.39)
'31.4 sec'

Clock units are used for t > 60 seconds.

>>> format_seconds(59.99)
'60.0 sec'
>>> format_seconds(60.00)
'60.0 sec'
>>> format_seconds(60.01)
'1m'
>>> format_seconds(309623.49)
'3d 14h'

Large intervals are rounded by default. Set full=True to show the full output.

>>> format_seconds(309623.49)
'3d 14h'
>>> format_seconds(309633.51, full=True)
'3d 14h 0m 34s'

Raises:

ValueError – If t < 0 and allow_negative=False (the default).
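
The clock-unit output with full=True can be reproduced by repeated division. The helper below is a sketch under stated assumptions: split_clock_units is a hypothetical name, and rounding to whole seconds before splitting is assumed from the example output rather than taken from the library source.

```python
from typing import Tuple


def split_clock_units(t: float) -> Tuple[int, int, int, int]:
    """Hypothetical helper: split seconds into (days, hours, minutes, seconds)."""
    if t < 0:
        raise ValueError("t must be non-negative in this sketch")
    total = round(t)  # whole seconds, assumed rounding
    days, rest = divmod(total, 86_400)
    hours, rest = divmod(rest, 3_600)
    minutes, seconds = divmod(rest, 60)
    return days, hours, minutes, seconds


days, hours, minutes, seconds = split_clock_units(309633.51)
print(f"{days}d {hours}h {minutes}m {seconds}s")  # 3d 14h 0m 34s
```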

camel_to_snake(s: str) -> str

Naive camelCase or PascalCase to snake_case conversion.

Parameters:

s – A string to convert.

Returns:

A snake_case string.

Raises:

IndexError – If string is empty.

Examples

Converting camel case strings.

>>> camel_to_snake("ClassName")
'class_name'
>>> camel_to_snake("variableName")
'variable_name'

Proper snake_case strings will not be changed.

>>> camel_to_snake("already_snake_case")
'already_snake_case'

Notes

Passing SCREAMING_SNAKE_CASE strings is not supported.
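
One common way to write a naive conversion of this kind uses a single regular expression. The sketch below is an assumption about the general technique, not the library's code; sketch_camel_to_snake is a hypothetical name, and unlike camel_to_snake() it does not raise IndexError for empty strings.

```python
import re


def sketch_camel_to_snake(s: str) -> str:
    """Hypothetical naive conversion; not the rics implementation."""
    # Insert an underscore before every uppercase letter except the first character.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", s).lower()


print(sketch_camel_to_snake("ClassName"))           # class_name
print(sketch_camel_to_snake("variableName"))        # variable_name
print(sketch_camel_to_snake("already_snake_case"))  # already_snake_case
print(sketch_camel_to_snake("SCREAMING_SNAKE"))     # mangled: underscores between the capitals
```

The last call shows why all-uppercase input is a poor fit for a naive approach: every capital letter is treated as the start of a new word.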

snake_to_camel(s: str, *, lower: bool = True) -> str

Naive snake_case to camelCase conversion.

Parameters:
  • s – A string to convert.

  • lower – If False, return PascalCase instead of camelCase.

Returns:

A camelCase string.

Raises:

IndexError – If string is empty.

Examples

Converting snake case strings.

>>> snake_to_camel("snake_case")
'snakeCase'

Passing SCREAMING_SNAKE_CASE strings is supported.

>>> snake_to_camel("SCREAMING_SNAKE_CASE")
'screamingSnakeCase'

Set lower=False to convert to PascalCase or UpperCamelCase.

>>> snake_to_camel("SCREAMING_SNAKE_CASE", lower=False)
'ScreamingSnakeCase'

Notes

Passing camelCase strings is not supported.
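
A naive conversion consistent with the examples above can be sketched by lowercasing, splitting on underscores and capitalizing each word. The name sketch_snake_to_camel is hypothetical and the code is an illustration, not the library's implementation.

```python
def sketch_snake_to_camel(s: str, *, lower: bool = True) -> str:
    """Hypothetical naive conversion; not the rics implementation."""
    # Lowercasing first is what makes SCREAMING_SNAKE_CASE input work.
    first, *rest = s.lower().split("_")
    if not lower:
        first = first.capitalize()
    return first + "".join(word.capitalize() for word in rest)


print(sketch_snake_to_camel("snake_case"))                         # snakeCase
print(sketch_snake_to_camel("SCREAMING_SNAKE_CASE"))               # screamingSnakeCase
print(sketch_snake_to_camel("SCREAMING_SNAKE_CASE", lower=False))  # ScreamingSnakeCase
print(sketch_snake_to_camel("camelCase"))                          # camelcase
```

In this sketch, the lowercasing step also explains why camelCase input is unsupported: the existing capital letters are lost.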

str_as_bool(s: str) -> bool

Convert a string s to a boolean value.

The output is determined by the content of s, as per the mapping shown below.

Keys:
  • False: ('0', 'false', 'no', 'off', 'disable', 'disabled')

  • True: ('1', 'true', 'yes', 'on', 'enable', 'enabled')

Matching is case-insensitive.

Parameters:

s – A string.

Returns:

A bool value.

Raises:
  • TypeError – If s is not a string.

  • ValueError – If s cannot be converted to bool using the keys above.

Examples

Basic usage.

>>> str_as_bool("true"), str_as_bool("false")
(True, False)

The input is cleaned and normalized.

>>> str_as_bool(" TRUE"), str_as_bool("False")
(True, False)

Input strings are normalized using str.strip() and str.lower().

Notes

Using bool(<str>) is equivalent to len(<str>) != 0, so any non-empty string (including "false") is truthy.
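
The key lookup and normalization described above can be sketched as follows. The function name sketch_str_as_bool and the error messages are hypothetical; the key tuples are copied from the mapping listed earlier in this entry.

```python
FALSE_KEYS = ("0", "false", "no", "off", "disable", "disabled")
TRUE_KEYS = ("1", "true", "yes", "on", "enable", "enabled")


def sketch_str_as_bool(s: str) -> bool:
    """Hypothetical re-creation of the documented behaviour; not the rics source."""
    if not isinstance(s, str):
        raise TypeError(f"expected a string, got {type(s).__name__}")
    key = s.strip().lower()  # normalize whitespace and case
    if key in TRUE_KEYS:
        return True
    if key in FALSE_KEYS:
        return False
    raise ValueError(f"cannot convert {s!r} to bool")


print(sketch_str_as_bool(" TRUE"), sketch_str_as_bool("False"))  # True False
print(bool("false"))  # True, because the string is non-empty
```

The final line shows why a dedicated conversion is needed: the built-in bool() only checks whether the string is empty.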

format_kwargs(kwargs: Mapping[str, Any], *, max_value_length: int = 120, prefix_classname: bool = False, include_module: bool = False) -> str

Format keyword arguments.

Parameters:
  • kwargs – Arguments to format.

  • prefix_classname – If True, prepend the class name if a value belongs to a class.

  • include_module – If True, prepend the public module (see misc.get_public_module()).

  • max_value_length – Replace value with the class name above this limit. 0=no limit.

Returns:

A string of the form 'key0=repr(value0), key1=repr(value1)'.

Raises:

ValueError – For keys in kwargs that are not valid Python argument names.

Examples

Basic usage.

>>> format_kwargs({"an_int": 1, "a_string": "Hello!"})
"an_int=1, a_string='Hello!'"

Notes

Uses ReprFormatter to format values.
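
The basic output format can be reproduced with plain repr(). This sketch ignores max_value_length, prefix_classname and include_module, and it assumes that str.isidentifier() is an adequate stand-in for the "valid Python argument name" check; sketch_format_kwargs is a hypothetical name.

```python
from typing import Any, Mapping


def sketch_format_kwargs(kwargs: Mapping[str, Any]) -> str:
    """Hypothetical simplified version; not the rics implementation."""
    for key in kwargs:
        # Assumed validity check; the real function may be stricter.
        if not (isinstance(key, str) and key.isidentifier()):
            raise ValueError(f"not a valid argument name: {key!r}")
    return ", ".join(f"{key}={value!r}" for key, value in kwargs.items())


print(sketch_format_kwargs({"an_int": 1, "a_string": "Hello!"}))
# an_int=1, a_string='Hello!'
```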

class ReprFormatter(*, max_value_length: int = 120, prefix_classname: bool = False, include_module: bool = False, module_aliases: Mapping[str, str] | None = None)

Bases: object

Alternative repr() implementation.

Values above max_value_length characters are replaced by stylized class names.

Parameters:
  • max_value_length – Use class name above this length. 0=no limit, -1=force class name.

  • prefix_classname – If True, prepend the class name if a value belongs to a class.

  • include_module – If True, prepend the public module (see misc.get_public_module()).

  • module_aliases – A mapping of module replacements, e.g. {"pandas": "pd"}. Default is DEFAULT_MODULE_ALIASES. Trailing dots are added automatically. Ignored when include_module is False.

See also

The format_kwargs(), misc.tname(), and misc.get_public_module() functions.

DEFAULT_MODULE_ALIASES: Mapping[str, str] = {'matplotlib.pyplot': 'plt', 'numpy': 'np', 'pandas': 'pd', 'polars': 'pl', 'tensorflow': 'tf'}

format_value(value: Any) -> str

Convert any value to string.

format_ndim_array(value: Any) -> str

Format shaped types, i.e. values with a shape attribute such as pandas.DataFrame.shape.
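
The effect of max_value_length on ReprFormatter can be illustrated with a length-limited repr(). This is a sketch under stated assumptions: sketch_format_value is a hypothetical name, the bare class name stands in for the "stylized class name" whose exact style is not shown here, and any negative limit is treated like -1.

```python
from typing import Any


def sketch_format_value(value: Any, *, max_value_length: int = 120) -> str:
    """Hypothetical length-limited repr; not the rics implementation."""
    text = repr(value)
    if max_value_length == 0:
        return text  # 0 means no limit
    if max_value_length < 0 or len(text) > max_value_length:
        # Assumed stand-in for the stylized class name.
        return type(value).__name__
    return text


print(sketch_format_value([1, 2, 3]))                     # [1, 2, 3]
print(sketch_format_value(list(range(1000))))             # list
print(sketch_format_value("short", max_value_length=-1))  # str
```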