rics.translation
Translation of IDs with flexible formatting and name matching.
Classes
Translator | Translate IDs to human-readable labels.
TranslatorFactory | Create a Translator from TOML inputs.
- class Translator(fetcher: Union[Fetcher[SourceType, IdType], TranslationMap[NameType, SourceType, IdType], Dict[SourceType, PlaceholderTranslations], Dict[SourceType, Union[PlaceholderTranslations, DataFrame, Dict[str, Sequence[Any]]]]], fmt: Union[str, Format] = '{id}:{name}', mapper: Optional[Mapper[NameType, SourceType, None]] = None, default_fmt: Optional[Union[str, Format]] = None, default_translations: Optional[Union[Dict[str, Union[Dict[KT, VT], Dict[OKT, Dict[KT, VT]]]], InheritedKeysDict[OKT, KT, VT]]] = None)
Bases: Generic[Translatable, NameType, SourceType, IdType]
Translate IDs to human-readable labels.
The Translator is the main entry point for all translation tasks. Simplified translation process steps:
1. The map_to_sources method performs name-to-source mapping (see DirectionalMapping).
2. The fetch method extracts IDs to translate and retrieves data (see TranslationMap).
3. Finally, the translate method applies the translations and returns to the caller.
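The three steps can be sketched in plain Python. This is only an illustration of the flow, not the library's implementation: the functions `map_names`, `fetch_ids` and `apply_translations` and the `SOURCES` dict are hypothetical stand-ins, and name-to-source mapping is reduced to exact matching.

```python
from typing import Dict, List, Optional

# Hypothetical in-memory data; none of these names come from rics itself.
SOURCES: Dict[str, Dict[int, str]] = {
    "animals": {0: "Tarzan", 1: "Morris", 2: "Simba"},
    "people": {1991: "Richard", 1999: "Sofia", 1904: "Fred"},
}


def map_names(names: List[str]) -> Dict[str, str]:
    """Step 1: bind each name to a source (here: exact match only)."""
    return {name: name for name in names if name in SOURCES}


def fetch_ids(
    data: Dict[str, List[int]], name_to_source: Dict[str, str]
) -> Dict[str, Dict[int, str]]:
    """Step 2: extract the IDs to translate and look up their labels."""
    return {
        source: {i: SOURCES[source][i] for i in data[name] if i in SOURCES[source]}
        for name, source in name_to_source.items()
    }


def apply_translations(
    data: Dict[str, List[int]],
    name_to_source: Dict[str, str],
    fetched: Dict[str, Dict[int, str]],
    fmt: str = "{id}:{name}",
) -> Dict[str, List[Optional[str]]]:
    """Step 3: replace every ID by its formatted translation, or None if unknown."""
    return {
        name: [
            fmt.format(id=i, name=fetched[source][i]) if i in fetched[source] else None
            for i in data[name]
        ]
        for name, source in name_to_source.items()
    }


data = {"animals": [0, 2], "people": [1991, 5]}
name_to_source = map_names(list(data))
fetched = fetch_ids(data, name_to_source)
result = apply_translations(data, name_to_source, fetched)
print(result)
```

The unknown ID 5 is left as None, which mirrors the default behaviour described for untranslatable IDs.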
- Parameters
fetcher – A Fetcher or ready-to-use translations.
fmt – String Format specification for translations.
mapper – A Mapper instance for binding names to sources.
default_fmt – Alternative Format to use instead of fmt for fallback translation of unknown IDs.
default_translations – Shared and/or source-specific default placeholder values for unknown IDs. See InheritedKeysDict.make() for details.
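How shared and source-specific defaults might combine can be sketched with ordinary dicts. The helper `resolve_defaults` is hypothetical, and the assumption that source-specific values override shared ones is inferred from the description of default_translations, not taken from the InheritedKeysDict implementation. The `default` and `specific` keys follow the default_translations example in this section.

```python
from typing import Any, Dict


def resolve_defaults(default_translations: Dict[str, Any], source: str) -> Dict[str, Any]:
    """Merge shared defaults with the defaults given for one source.

    Assumption: values under 'specific' win over values under 'default'.
    """
    shared = default_translations.get("default", {})
    specific = default_translations.get("specific", {}).get(source, {})
    return {**shared, **specific}


default_translations = {
    "default": {"is_nice": "Maybe?", "name": "Bob"},
    "specific": {"animals": {"name": "Fido"}},
}

animal_defaults = resolve_defaults(default_translations, "animals")
people_defaults = resolve_defaults(default_translations, "people")
print(animal_defaults)
print(people_defaults)
```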
Notes
Untranslatable IDs will be None by default if neither default_fmt nor default_translations is given. Adding the maximal_untranslated_fraction option to translate() will raise an exception if too many IDs are left untranslated. Note however that this verification step may be expensive.
Examples
A minimal example. For a more complete use case, see the DVD Rental Database example. Assume that we have data for people and animals as in the table below:
people:                       animals:
  id  | name    | gender       id | name   | is_nice
------+---------+--------     ----+--------+---------
 1991 | Richard | Male          0 | Tarzan | false
 1999 | Sofia   | Female        1 | Morris | true
 1904 | Fred    | Male          2 | Simba  | true
In most real cases we'd fetch this table from somewhere. In this case, however, there's so little data that we can simply enumerate the components needed for translation ourselves to create a MemoryFetcher.
>>> from rics.translation import Translator
>>> translation_data = {
...     'animals': {'id': [0, 1, 2], 'name': ['Tarzan', 'Morris', 'Simba'], 'is_nice': [False, True, True]},
...     'people': {'id': [1999, 1991, 1904], 'name': ['Sofia', 'Richard', 'Fred']},
... }
>>> translator = Translator(translation_data, fmt='{id}:{name}[, nice={is_nice}]')
>>> data = {'animals': [0, 2], 'people': [1991, 1999]}
>>> for key, translated_table in translator.translate(data).items():
...     print(f'Translations for {repr(key)}:')
...     for translated_id in translated_table:
...         print(f'    {repr(translated_id)}')
Translations for 'animals':
    '0:Tarzan, nice=False'
    '2:Simba, nice=True'
Translations for 'people':
    '1991:Richard'
    '1999:Sofia'
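The output above suggests that the bracketed part of fmt is optional: it is rendered for animals, where is_nice is available, and left out for people. A minimal sketch of such a format, assuming that an optional block is dropped whenever one of its placeholders is missing, could look as follows. The `render` function is hypothetical and is not the library's Format class.

```python
import re
from typing import Any, Dict

_OPTIONAL_BLOCK = re.compile(r"\[([^\[\]]*)\]")
_PLACEHOLDER = re.compile(r"\{(\w+)\}")


def render(fmt: str, values: Dict[str, Any]) -> str:
    """Render fmt, keeping a [bracketed] block only if all its placeholders are known."""
    parts = []
    position = 0
    for match in _OPTIONAL_BLOCK.finditer(fmt):
        # Text between optional blocks is required and always rendered.
        parts.append(fmt[position:match.start()].format(**values))
        block = match.group(1)
        if all(key in values for key in _PLACEHOLDER.findall(block)):
            parts.append(block.format(**values))
        position = match.end()
    parts.append(fmt[position:].format(**values))
    return "".join(parts)


fmt = "{id}:{name}[, nice={is_nice}]"
animal = render(fmt, {"id": 0, "name": "Tarzan", "is_nice": False})
person = render(fmt, {"id": 1991, "name": "Richard"})
print(animal)
print(person)
```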
Handling unknown IDs.
>>> default_translations = dict(
...     default={'is_nice': 'Maybe?', 'name': "Bob"},
...     specific={'animals': {'name': 'Fido'}},
... )
>>> useless_database = {
...     'animals': {'id': [], 'name': []},
...     'people': {'id': [], 'name': []}
... }
>>> translator = Translator(useless_database, default_translations=default_translations,
...                         fmt='{id}:{name}[, nice={is_nice}]')
>>> data = {'animals': [0], 'people': [0]}
>>> for key, translated_table in translator.translate(data).items():
...     print(f'Translations for {repr(key)}:')
...     for translated_id in translated_table:
...         print(f'    {repr(translated_id)}')
Translations for 'animals':
    '0:Fido'
Translations for 'people':
    '0:Bob'
Since we didn't give an explicit default_fmt, the regular fmt is used instead. Formats can be plain strings, in which case translation will never explicitly fail unless the name itself fails to map and Mapper.unmapped_values_action is set to ActionLevel.RAISE.
- classmethod from_config(path: Union[str, bytes, PathLike], extra_fetchers: Iterable[str] = (), /, fetcher_factory: Callable[[str, Dict[str, Any]], AbstractFetcher] = <function default_fetcher_factory>, mapper_factory: Callable[[Dict[str, Any], bool], Optional[Mapper]] = <function default_mapper_factory>) → Translator
See TranslatorFactory.
- copy(share_fetcher: bool = True, **overrides: Any) → Translator
Make a copy of this Translator.
- Parameters
share_fetcher – If True, the returned instance uses the same Fetcher.
overrides – Keyword arguments to use when instantiating the copy. Options that aren't given will be taken from the current instance. See the Translator class documentation for possible choices.
- Returns
A copy of this Translator with overrides applied.
- Raises
NotImplementedError – If share_fetcher=False.
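The copy-with-overrides behaviour described above can be illustrated with a small stand-in class. `MiniTranslator` is hypothetical and only carries three of the constructor options; it shows the pattern of taking unspecified options from the current instance, not the actual rics code.

```python
from typing import Any, Optional


class MiniTranslator:
    """Hypothetical stand-in used to illustrate copy(); not the rics Translator."""

    def __init__(self, fetcher: Any, fmt: str = "{id}:{name}", default_fmt: Optional[str] = None) -> None:
        self.fetcher = fetcher
        self.fmt = fmt
        self.default_fmt = default_fmt

    def copy(self, share_fetcher: bool = True, **overrides: Any) -> "MiniTranslator":
        if not share_fetcher:
            # Mirrors the documented behaviour: unshared fetchers are not supported.
            raise NotImplementedError("share_fetcher=False is not supported.")
        # Start from the current options, then let the overrides win.
        kwargs = {"fetcher": self.fetcher, "fmt": self.fmt, "default_fmt": self.default_fmt}
        kwargs.update(overrides)
        return type(self)(**kwargs)


original = MiniTranslator(fetcher=object(), default_fmt="{id} not found")
clone = original.copy(fmt="{name}")
print(clone.fmt, clone.default_fmt, clone.fetcher is original.fetcher)
```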
- translate(translatable: Translatable, names: Optional[Union[NameType, Iterable[NameType], Callable[[NameType], bool]]] = None, ignore_names: Optional[Union[NameType, Iterable[NameType], Callable[[NameType], bool]]] = None, inplace: bool = False, override_function: Optional[Callable[[NameType, Set[SourceType], List[IdType]], Optional[Union[SourceType, Dict[SourceType, List[IdType]]]]]] = None, maximal_untranslated_fraction: float = 1.0, reverse: bool = False) → Optional[Translatable]
Translate IDs to human-readable strings.
- Parameters
translatable – A data structure to translate.
names – Explicit names to translate. Will try to derive from translatable if not given. May also be a predicate which indicates (returns True for) derived names to keep.
ignore_names – Names not to translate. Always takes precedence over names, both explicit and derived. May also be a predicate which indicates (returns True for) names to ignore.
inplace – If True, translation is performed in-place and this function returns None.
override_function – A callable with inputs (value, candidates, ids) that returns either None, the source to use, or a split mapping {source: [ids_for_source..]} which forces IDs to be fetched from different sources in spite of being labelled with the same name. Used only for name-to-source mapping.
maximal_untranslated_fraction – The maximum fraction of IDs for which translation may fail before an error is raised. 1=disabled. Ignored in reverse mode.
reverse – If True, perform reverse translations back to IDs instead. Offline mode only.
- Returns
A copy of translatable with IDs replaced by translations if inplace=False, otherwise None.
- Raises
UntranslatableTypeError – If translatable is not translatable using any standard IOs.
AttributeError – If names are not given and cannot be derived from translatable.
MappingError – If required (explicitly given) names fail to map to a source.
ValueError – If maximal_untranslated_fraction is not a valid fraction.
TooManyFailedTranslationsError – If translation fails for more than maximal_untranslated_fraction of IDs.
ConnectionStatusError – If reverse=True while the Translator is online.
UnknownSourceError – If override_function returns a source which is not known to the Translator.
See also
The Mapper.apply() function, which performs both placeholder and name-to-source mapping.
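The maximal_untranslated_fraction check described for translate() can be sketched as below. The function `verify_translations` and the `TooManyFailures` error are hypothetical stand-ins. The sketch assumes that a failed translation shows up as None, and it treats a fraction of exactly 1 as "disabled" as stated in the parameter description.

```python
from typing import Optional, Sequence


class TooManyFailures(Exception):
    """Hypothetical stand-in for the library's own error type."""


def verify_translations(
    translated: Sequence[Optional[str]], maximal_untranslated_fraction: float = 1.0
) -> None:
    """Raise if more than the allowed fraction of translations failed."""
    if not 0.0 <= maximal_untranslated_fraction <= 1.0:
        raise ValueError(f"Not a valid fraction: {maximal_untranslated_fraction}")
    if maximal_untranslated_fraction == 1.0 or not translated:
        return  # Disabled (or nothing to check): skip the scan entirely.

    # This scan touches every translated value, which is why the check may be expensive.
    failed = sum(value is None for value in translated)
    fraction = failed / len(translated)
    if fraction > maximal_untranslated_fraction:
        raise TooManyFailures(f"{fraction:.0%} of IDs were left untranslated.")


translated = ["0:Tarzan", None, "2:Simba", "1:Morris"]
verify_translations(translated, maximal_untranslated_fraction=0.25)  # 25% failed: accepted
try:
    verify_translations(translated, maximal_untranslated_fraction=0.2)
except TooManyFailures as error:
    print(f"Rejected: {error}")
```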
- map_to_sources(translatable: Translatable, names: Optional[Union[NameType, Iterable[NameType], Callable[[NameType], bool]]] = None, ignore_names: Optional[Union[NameType, Iterable[NameType], Callable[[NameType], bool]]] = None, override_function: Optional[Callable[[NameType, Set[SourceType], List[IdType]], Optional[Union[SourceType, Dict[SourceType, List[IdType]]]]]] = None) → Optional[DirectionalMapping]
Map names to translation sources.
- Parameters
translatable – A data structure to map names for.
names – Explicit names to translate. Will try to derive from translatable if not given. May also be a predicate which indicates (returns True for) derived names to keep.
ignore_names – Names not to translate. Always takes precedence over names, both explicit and derived. May also be a predicate which indicates (returns True for) names to ignore.
override_function – A callable with inputs (value, candidates, ids) that returns either None, the source to use, or a split mapping {source: [ids_for_source..]} which forces IDs to be fetched from different sources in spite of being labelled with the same name. Used only for name-to-source mapping.
- Returns
A mapping of names to translation sources. Returns None if mapping failed but success was not required.
- Raises
AttributeError – If names are not given and cannot be derived from translatable.
MappingError – If required (explicitly given) names fail to map to a source.
UnknownSourceError – If override_function returns a source which is not known to the Translator.
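An override_function with the three documented return shapes might look like the sketch below. The names `owner_id` and `member_id`, and the rule that IDs from 1000 upwards belong to people, are invented for illustration. Only the argument order (value, candidates, ids) and the three kinds of return value come from the parameter description.

```python
from typing import Dict, List, Optional, Set, Union

Override = Optional[Union[str, Dict[str, List[int]]]]


def override_function(name: str, candidates: Set[str], ids: List[int]) -> Override:
    """Hypothetical override: decide which source(s) a name should use."""
    if name == "owner_id" and "people" in candidates:
        # Return a single source: every ID under this name is fetched from it.
        return "people"
    if name == "member_id":
        # Return a split mapping: IDs under one name go to different sources.
        split = {
            "animals": [i for i in ids if i < 1000],
            "people": [i for i in ids if i >= 1000],
        }
        return {source: source_ids for source, source_ids in split.items() if source_ids}
    # Return None: no opinion, leave the decision to regular mapping.
    return None


print(override_function("owner_id", {"people", "animals"}, [1991]))
print(override_function("member_id", {"people", "animals"}, [0, 1991, 2]))
print(override_function("color", {"people", "animals"}, [3]))
```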
- fetch(translatable: Translatable, name_to_source: DirectionalMapping[NameType, SourceType], data_structure_io: Optional[Type[DataStructureIO]] = None) → TranslationMap
Fetch translations.
- Parameters
translatable – A data structure to translate.
name_to_source – Mappings of names in translatable to translation sources known to the fetcher.
data_structure_io – Data Structure IO class used to extract IDs from translatable. Derive if None.
- Returns
- Raises
ConnectionStatusError – If disconnected from the fetcher, i.e. not online.
- property fetcher: Fetcher[SourceType, IdType]
Return the Fetcher instance used to retrieve translations.
- property mapper: Mapper[NameType, SourceType, None]
Return the Mapper instance used for name-to-source binding.
- property cache: TranslationMap[NameType, SourceType, IdType]
Return a TranslationMap of cached translations.
- classmethod restore(path: Union[str, bytes, PathLike]) → Translator
Restore a serialized Translator.
- Parameters
path – Path to a serialized Translator.
- Returns
A Translator.
- Raises
TypeError – If the object at path is not a Translator.
See also
The Translator.store() method.
- store(translatable: Optional[Translatable] = None, names: Optional[Union[NameType, Iterable[NameType], Callable[[NameType], bool]]] = None, ignore_names: Optional[Union[NameType, Iterable[NameType], Callable[[NameType], bool]]] = None, delete_fetcher: bool = True, path: Optional[Union[str, bytes, PathLike]] = None) → Translator
Retrieve and store translations in memory.
- Parameters
translatable – Data from which IDs to fetch will be extracted. Fetch all IDs if None.
names – Explicit names to translate. Will try to derive from translatable if not given. May also be a predicate which indicates (returns True for) derived names to keep.
ignore_names – Names not to translate. Always takes precedence over names, both explicit and derived. May also be a predicate which indicates (returns True for) names to ignore.
delete_fetcher – If True, invoke Fetcher.close() and delete the fetcher after retrieving data. The Translator will still function, but new data cannot be retrieved.
path – If given, serialize the Translator to disk after retrieving data.
- Returns
Self, for chained assignment.
- Raises
ForbiddenOperationError – If the fetcher does not permit the FETCH_ALL operation (only when translatable is None).
MappingError – If a translatable is given, but no names to translate could be extracted.
Notes
The Translator is guaranteed to be serializable once offline. Fetchers often aren't, as they require things like database connections to function. Serializability can be tested using the rics.utility.misc.serializable() function.
See also
The Translator.restore() method.
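The idea behind the note above, that offline data serializes while live resources do not, can be sketched with the standard pickle module. Whether the library itself uses pickle is an assumption here, and `is_serializable`, `store` and `restore` are hypothetical helpers that only mimic the documented behaviour, including the TypeError raised when the restored object has the wrong type. A plain dict stands in for cached translations and a `threading.Lock` stands in for something like a database connection.

```python
import pickle
import tempfile
import threading
from pathlib import Path
from typing import Any


def is_serializable(obj: Any) -> bool:
    """Return True if obj survives pickling."""
    try:
        pickle.dumps(obj)
    except Exception:
        return False
    return True


def store(obj: Any, path: Path) -> None:
    """Serialize obj to disk."""
    path.write_bytes(pickle.dumps(obj))


def restore(path: Path, expected_type: type) -> Any:
    """Load an object from disk and verify its type."""
    obj = pickle.loads(path.read_bytes())
    if not isinstance(obj, expected_type):
        raise TypeError(f"Expected {expected_type.__name__}, got {type(obj).__name__}.")
    return obj


offline_cache = {"animals": {0: "Tarzan", 2: "Simba"}}  # plain data: serializable
live_resource = threading.Lock()  # stand-in for a connection: not serializable

print(is_serializable(offline_cache), is_serializable(live_resource))

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "cache.pkl"
    store(offline_cache, path)
    restored = restore(path, dict)
```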
- class TranslatorFactory(file: Union[str, bytes, PathLike], extra_fetchers: Iterable[Union[str, bytes, PathLike]], fetcher_factory: Callable[[str, Dict[str, Any]], AbstractFetcher], mapper_factory: Callable[[Dict[str, Any], bool], Optional[Mapper]])
Bases: Generic[NameType, SourceType, IdType]
Create a Translator from TOML inputs.
- Parameters
file – Path to a TOML file, or a pre-parsed dict.
extra_fetchers – Paths to TOML files defining additional fetchers. Useful for fetching from multiple sources or kinds of sources, for example locally stored files in conjunction with one or more databases. The fetchers are ranked by input order, with the fetcher defined in file being given the highest priority (rank 0).
fetcher_factory – A Fetcher instance, or a callable taking (name, kwargs) which returns an AbstractFetcher.
mapper_factory – A Mapper instance, or a callable taking (kwargs) which returns a Mapper. Used for both Translator and Fetcher mapper initialization.
See also
The Translator Configuration Files page.
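The ranking rule for extra_fetchers can be sketched as follows. The file names are made up, and both helper functions are hypothetical. In particular, the idea that the lowest-ranked fetcher wins when several fetchers know the same source is an assumption about how ranks are used, not something stated on this page.

```python
from typing import Dict, Optional, Sequence, Set


def rank_fetchers(file: str, extra_fetchers: Sequence[str]) -> Dict[str, int]:
    """Rank by input order; the fetcher defined in `file` gets rank 0."""
    return {path: rank for rank, path in enumerate([file, *extra_fetchers])}


def pick_fetcher(
    source: str, known_sources: Dict[str, Set[str]], ranks: Dict[str, int]
) -> Optional[str]:
    """Assumed rule: among fetchers that know `source`, the lowest rank wins."""
    candidates = [path for path, sources in known_sources.items() if source in sources]
    return min(candidates, key=ranks.__getitem__, default=None)


ranks = rank_fetchers("main.toml", ["db.toml", "files.toml"])
known_sources = {
    "main.toml": {"people"},
    "db.toml": {"people", "animals"},
    "files.toml": {"animals"},
}

print(ranks)
print(pick_fetcher("people", known_sources, ranks))
print(pick_fetcher("animals", known_sources, ranks))
```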
- create() → Translator
Create a Translator from a TOML file.
- Returns
A Translator object.
- Raises
exceptions.ConfigurationError – If the config is invalid.
Modules
Integration for insertion and extraction of IDs and translations to and from various data structures.
General errors for the translation suite.
Factory functions for translation classes.
Translation using external sources.
Offline (in-memory) translation classes.
Types used for translation.