Utilities

The utilities module includes all the utility methods used throughout KGX.

graph_utils

Utility methods for working with graphs.

kgx.utils.graph_utils.curie_lookup(curie: str) → Optional[str]

Given a CURIE, find its label.

This method first does a lookup in predefined maps. If no label is found, it uses CurieLookupService to look for the CURIE in a set of preloaded ontologies.

Parameters

curie (str) – A CURIE

Returns

The label corresponding to the given CURIE

Return type

Optional[str]

kgx.utils.graph_utils.get_ancestors(graph: networkx.classes.multidigraph.MultiDiGraph, node: str, relations: Optional[List[str]] = None) → List[str]

Return all ancestors of specified node, filtered by relations.

Parameters
  • graph (networkx.MultiDiGraph) – Graph to traverse

  • node (str) – node identifier

  • relations (List[str]) – list of relations

Returns

A list of ancestor nodes

Return type

List[str]

kgx.utils.graph_utils.get_category_via_superclass(graph: networkx.classes.multidigraph.MultiDiGraph, curie: str, load_ontology: bool = True) → Set[str]

Get the category for a given CURIE by tracing its superclasses through the subclass_of hierarchy and picking the most appropriate category based on the superclass.

Parameters
  • graph (networkx.MultiDiGraph) – Graph to traverse

  • curie (str) – Input CURIE

  • load_ontology (bool) – Determines whether to load an ontology, based on the CURIE prefix, or to rely only on the subclass_of hierarchy in the graph

Returns

A set containing one or more categories for the given CURIE

Return type

Set[str]

kgx.utils.graph_utils.get_parents(graph: networkx.classes.multidigraph.MultiDiGraph, node: str, relations: Optional[List[str]] = None) → List[str]

Return all direct parents of a specified node, filtered by relations.

Parameters
  • graph (networkx.MultiDiGraph) – Graph to traverse

  • node (str) – node identifier

  • relations (List[str]) – list of relations

Returns

A list of parent node(s)

Return type

List[str]
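The difference between direct parents and all ancestors can be illustrated with a minimal sketch. It does not use KGX or networkx: the edge-list representation, the `parents` and `ancestors` helper names, and the choice to follow outgoing edges are assumptions made for illustration only.

```python
from typing import List, Optional, Tuple

# Hypothetical edge representation: (subject, relation, object).
Edge = Tuple[str, str, str]


def parents(edges: List[Edge], node: str,
            relations: Optional[List[str]] = None) -> List[str]:
    """Return objects of edges leaving `node` whose relation is allowed."""
    found: List[str] = []
    for subject, relation, obj in edges:
        if subject != node:
            continue
        if relations is not None and relation not in relations:
            continue
        if obj not in found:
            found.append(obj)
    return found


def ancestors(edges: List[Edge], node: str,
              relations: Optional[List[str]] = None) -> List[str]:
    """Return every node reachable by repeatedly following parent edges."""
    seen: List[str] = []
    stack = parents(edges, node, relations)
    while stack:
        current = stack.pop()
        if current == node or current in seen:
            continue  # guard against cycles
        seen.append(current)
        stack.extend(parents(edges, current, relations))
    return seen


edges = [
    ("EX:3", "subclass_of", "EX:2"),
    ("EX:2", "subclass_of", "EX:1"),
    ("EX:3", "related_to", "EX:9"),
]
direct = parents(edges, "EX:3", ["subclass_of"])
everything = ancestors(edges, "EX:3", ["subclass_of"])
```

Here `direct` holds only `EX:2`, while `everything` also includes `EX:1`; the `related_to` edge is ignored because it is not in the relation filter.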

kgx.utils.graph_utils.remap_edge_property(graph: networkx.classes.multidigraph.MultiDiGraph, edge_label: str, old_property: str, new_property: str) → None

Remap the value of an edge's old_property attribute with the value from its new_property attribute.

Parameters
  • graph (networkx.MultiDiGraph) – The graph

  • edge_label (string) – edge_label referring to edges whose property needs to be remapped

  • old_property (string) – Old property name whose value needs to be replaced

  • new_property (string) – New property name from which the value is pulled

kgx.utils.graph_utils.remap_node_identifier(graph: networkx.classes.multidigraph.MultiDiGraph, category: str, alternative_property: str, prefix=None) → networkx.classes.multidigraph.MultiDiGraph

Remap a node’s ‘id’ attribute with the value from the node’s alternative_property attribute.

Parameters
  • graph (networkx.MultiDiGraph) – The graph

  • category (string) – category referring to nodes whose ‘id’ needs to be remapped

  • alternative_property (string) – property name from which the new value is pulled

  • prefix (string) – signifies that the value of alternative_property is a list; the prefix indicates which value to pick from the list

Returns

The modified graph

Return type

networkx.MultiDiGraph
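The role of prefix can be illustrated with a small sketch. The `pick_identifier` helper is hypothetical and is not part of KGX; how KGX itself matches the prefix against list values is an assumption here.

```python
from typing import List, Optional, Union


def pick_identifier(value: Union[str, List[str]],
                    prefix: Optional[str] = None) -> Optional[str]:
    """Choose the replacement identifier from an alternative property value.

    Assumption: when the value is a list, the first entry whose CURIE
    prefix matches `prefix` is chosen; with no prefix, the first entry wins.
    """
    if isinstance(value, str):
        return value
    for candidate in value:
        if prefix is None or candidate.startswith(f"{prefix}:"):
            return candidate
    return None  # no suitable value; the caller would keep the original id


chosen = pick_identifier(["EX:1", "ALT:7"], prefix="ALT")
```

In this sketch `chosen` is `ALT:7`, because it is the only list entry carrying the requested prefix.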

kgx.utils.graph_utils.remap_node_property(graph: networkx.classes.multidigraph.MultiDiGraph, category: str, old_property: str, new_property: str) → None

Remap the value of a node's old_property attribute with the value from its new_property attribute.

Parameters
  • graph (networkx.MultiDiGraph) – The graph

  • category (string) – Category referring to nodes whose property needs to be remapped

  • old_property (string) – old property name whose value needs to be replaced

  • new_property (string) – new property name from which the value is pulled
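As a rough illustration of property remapping, the sketch below works on a plain dictionary of node attributes rather than a networkx graph. The `remap_property` helper and the convention that `category` is stored as a list are assumptions, not KGX's implementation.

```python
from typing import Any, Dict


def remap_property(nodes: Dict[str, Dict[str, Any]], category: str,
                   old_property: str, new_property: str) -> None:
    """Overwrite old_property with the value of new_property, in place,
    for every node that belongs to the given category."""
    for attributes in nodes.values():
        if category not in attributes.get("category", []):
            continue  # node is outside the requested category
        if new_property in attributes:
            attributes[old_property] = attributes[new_property]


nodes = {
    "EX:1": {"category": ["biolink:Gene"], "name": "old", "symbol": "ABC"},
    "EX:2": {"category": ["biolink:Disease"], "name": "d", "symbol": "x"},
}
remap_property(nodes, "biolink:Gene", "name", "symbol")
```

Only `EX:1` is changed: its `name` now carries the value of `symbol`, while `EX:2` is left alone because its category does not match.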

kgx_utils

Utility methods that are reused across the codebase.

kgx.utils.kgx_utils.apply_edge_filters(graph: networkx.classes.multidigraph.MultiDiGraph, edge_filters: Dict[str, Union[str, Set]]) → None

Apply filters to graph and remove edges that do not pass given filters.

Parameters
  • graph (networkx.MultiDiGraph) – The graph

  • edge_filters (Dict[str, Union[str, Set]]) – Edge filters

kgx.utils.kgx_utils.apply_filters(graph: networkx.classes.multidigraph.MultiDiGraph, node_filters: Dict[str, Union[str, Set]], edge_filters: Dict[str, Union[str, Set]]) → None

Apply filters to graph and remove nodes and edges that do not pass given filters.

Parameters
  • graph (networkx.MultiDiGraph) – The graph

  • node_filters (Dict[str, Union[str, Set]]) – Node filters

  • edge_filters (Dict[str, Union[str, Set]]) – Edge filters

kgx.utils.kgx_utils.apply_node_filters(graph: networkx.classes.multidigraph.MultiDiGraph, node_filters: Dict[str, Union[str, Set]]) → None

Apply filters to graph and remove nodes that do not pass given filters.

Parameters
  • graph (networkx.MultiDiGraph) – The graph

  • node_filters (Dict[str, Union[str, Set]]) – Node filters
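The exact matching rules for filters are not spelled out here, so the following is only a sketch under stated assumptions: a node passes when, for every filter key, its value equals the filter string or shares at least one element with the filter set. The helpers `passes` and `filter_nodes` and the dictionary-based node store are hypothetical.

```python
from typing import Any, Dict, Set, Union

Filters = Dict[str, Union[str, Set[str]]]


def passes(attributes: Dict[str, Any], filters: Filters) -> bool:
    """Return True if the attributes satisfy every filter (assumed semantics)."""
    for key, allowed in filters.items():
        value = attributes.get(key)
        if value is None:
            return False
        allowed_set = {allowed} if isinstance(allowed, str) else set(allowed)
        values = {value} if isinstance(value, str) else set(value)
        if not values & allowed_set:
            return False
    return True


def filter_nodes(nodes: Dict[str, Dict[str, Any]], node_filters: Filters) -> None:
    """Remove, in place, every node that does not pass the filters."""
    rejected = [n for n, attrs in nodes.items() if not passes(attrs, node_filters)]
    for node_id in rejected:
        del nodes[node_id]


nodes = {
    "EX:1": {"category": ["biolink:Gene"]},
    "EX:2": {"category": ["biolink:Disease"]},
    "EX:3": {"name": "no category"},
}
filter_nodes(nodes, {"category": {"biolink:Gene"}})
```

After the call only `EX:1` remains. Like the documented functions, the sketch mutates its input and returns nothing.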

kgx.utils.kgx_utils.camelcase_to_sentencecase(s: str) → str

Convert CamelCase to sentence case.

Parameters

s (str) – Input string in CamelCase

Returns

string in sentence case form

Return type

str
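One way to perform this kind of conversion is with a regular expression that inserts a space before each interior capital letter. The `camel_to_sentence` helper below is a hypothetical stand-in, not KGX's implementation, and it assumes the result should be all lower case.

```python
import re


def camel_to_sentence(s: str) -> str:
    """Insert a space before each capital letter (except the first),
    then lower-case the result."""
    return re.sub(r"(?<!^)(?=[A-Z])", " ", s).lower()


converted = camel_to_sentence("PhenotypicFeature")
```

With this sketch, `PhenotypicFeature` becomes `phenotypic feature`.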

kgx.utils.kgx_utils.contract(uri: str, prefix_maps: Optional[List[Dict]] = None, fallback: bool = True) → str

Contract a given URI to a CURIE, based on mappings from prefix_maps. If no prefix map is provided, the defaults from prefixcommons-py are used.

This method will return the URI as the CURIE if there is no mapping found.

Parameters
  • uri (str) – A URI

  • prefix_maps (Optional[List[Dict]]) – A list of prefix maps to use for mapping

  • fallback (bool) – Determines whether to fall back to default prefix mappings, as determined by prefixcommons.curie_util, when the URI prefix is not found in prefix_maps.

Returns

A CURIE corresponding to the URI

Return type

str
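The contraction logic described above can be sketched without prefixcommons. In this illustration `contract_uri` and `DEFAULT_PREFIX_MAP` are hypothetical names, and the single default mapping stands in for the real default prefix maps.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for the default prefix mappings.
DEFAULT_PREFIX_MAP: Dict[str, str] = {"EX": "http://example.org/"}


def contract_uri(uri: str, prefix_maps: Optional[List[Dict[str, str]]] = None,
                 fallback: bool = True) -> str:
    """Contract a URI to a CURIE, or return the URI unchanged if no
    namespace matches."""
    maps = list(prefix_maps) if prefix_maps else []
    if not maps or fallback:
        maps.append(DEFAULT_PREFIX_MAP)  # defaults are consulted last
    for prefix_map in maps:
        for prefix, namespace in prefix_map.items():
            if uri.startswith(namespace):
                return f"{prefix}:{uri[len(namespace):]}"
    return uri


custom = [{"ONTO": "http://purl.example.com/onto_"}]
curie = contract_uri("http://purl.example.com/onto_7", custom)
```

Here `curie` is `ONTO:7`. With `fallback=False` and a custom map that lacks the namespace, the URI comes back unchanged, mirroring the documented behavior when no mapping is found.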

kgx.utils.kgx_utils.expand(curie: str, prefix_maps: Optional[List[dict]] = None, fallback: bool = True) → str

Expand a given CURIE to a URI, based on mappings from prefix_maps.

This method will return the CURIE as the IRI if there is no mapping found.

Parameters
  • curie (str) – A CURIE

  • prefix_maps (Optional[List[dict]]) – A list of prefix maps to use for mapping

  • fallback (bool) – Determines whether to fall back to default prefix mappings, as determined by prefixcommons.curie_util, when the CURIE prefix is not found in prefix_maps.

Returns

A URI corresponding to the CURIE

Return type

str
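Expansion is the inverse lookup: split the CURIE at the first colon and replace the prefix with its namespace. The sketch below assumes a hypothetical `expand_curie` helper and a one-entry `DEFAULT_PREFIX_MAP` in place of the real default mappings.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for the default prefix mappings.
DEFAULT_PREFIX_MAP: Dict[str, str] = {"EX": "http://example.org/"}


def expand_curie(curie: str, prefix_maps: Optional[List[Dict[str, str]]] = None,
                 fallback: bool = True) -> str:
    """Expand a CURIE to a URI, or return the CURIE unchanged if its
    prefix is unknown."""
    maps = list(prefix_maps) if prefix_maps else []
    if not maps or fallback:
        maps.append(DEFAULT_PREFIX_MAP)  # defaults are consulted last
    prefix, separator, reference = curie.partition(":")
    if not separator:
        return curie  # not a CURIE at all
    for prefix_map in maps:
        if prefix in prefix_map:
            return prefix_map[prefix] + reference
    return curie


uri = expand_curie("EX:123")
```

In this sketch `uri` is `http://example.org/123`, and an unknown prefix such as `UNKNOWN:1` is returned as is.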

Convert a sentence case Biolink category name to a proper Biolink CURIE with the category itself in CamelCase form.

Parameters

s (str) – Input string in sentence case

Returns

a proper Biolink CURIE

Return type

str

kgx.utils.kgx_utils.generate_edge_key(s: str, edge_label: str, o: str) → str

Generates an edge key based on a given subject, edge_label and object.

Parameters
  • s (str) – Subject

  • edge_label (str) – Edge label

  • o (str) – Object

Returns

Edge key as a string

Return type

str
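The format of the key is not specified here. A minimal sketch, assuming the three parts are simply joined with a hyphen, might look like this; the `edge_key` helper and the separator are assumptions.

```python
def edge_key(s: str, edge_label: str, o: str) -> str:
    """Build a deterministic key from subject, edge label and object.

    Assumption: parts are joined with '-'; KGX may use a different format.
    """
    return f"{s}-{edge_label}-{o}"


key = edge_key("EX:1", "biolink:related_to", "EX:2")
```

Because the key depends only on its three inputs, the same triple always yields the same key, which is what makes it usable for distinguishing parallel edges in a multigraph.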

Get a Biolink Model mapping for a given category.

Parameters

category (str) – A category for which there is a mapping in Biolink Model

Returns

A Biolink Model class corresponding to category

Return type

str

kgx.utils.kgx_utils.get_cache(maxsize=10000)

Get an instance of cachetools.cache

Parameters

maxsize (int) – The max size for the cache (10000, by default)

Returns

An instance of cachetools.cache

Return type

cachetools.cache

kgx.utils.kgx_utils.get_curie_lookup_service()

Get an instance of kgx.curie_lookup_service.CurieLookupService

Returns

An instance of CurieLookupService

Return type

kgx.curie_lookup_service.CurieLookupService

kgx.utils.kgx_utils.get_toolkit() → bmt.Toolkit

Get an instance of bmt.Toolkit. If no instance is defined, one is instantiated and returned.

Returns

an instance of bmt.Toolkit

Return type

bmt.Toolkit

kgx.utils.kgx_utils.get_type_for_property(p: str) → str

Get type for a property.

TODO: Move this to biolink-model-toolkit

Parameters

p (str) – A property name

Returns

The type for a given property

Return type

str

kgx.utils.kgx_utils.sentencecase_to_camelcase(s: str) → str

Convert sentence case to CamelCase.

Parameters

s (str) – Input string in sentence case

Returns

string in CamelCase form

Return type

str

kgx.utils.kgx_utils.sentencecase_to_snakecase(s: str) → str

Convert sentence case to snake_case.

Parameters

s (str) – Input string in sentence case

Returns

string in snake_case form

Return type

str

kgx.utils.kgx_utils.snakecase_to_sentencecase(s: str) → str

Convert snake_case to sentence case.

Parameters

s (str) – Input string in snake_case

Returns

string in sentence case form

Return type

str
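The sentence case, CamelCase and snake_case conversions are simple string transformations. The helpers below are hypothetical stand-ins written with plain string methods; they are not KGX's implementations and assume sentence case means lower-case words separated by spaces.

```python
def sentence_to_camel(s: str) -> str:
    """'phenotypic feature' -> 'PhenotypicFeature'."""
    return "".join(word[:1].upper() + word[1:] for word in s.split())


def sentence_to_snake(s: str) -> str:
    """'phenotypic feature' -> 'phenotypic_feature'."""
    return "_".join(s.lower().split())


def snake_to_sentence(s: str) -> str:
    """'phenotypic_feature' -> 'phenotypic feature'."""
    return s.replace("_", " ").lower()


camel = sentence_to_camel("phenotypic feature")
snake = sentence_to_snake("phenotypic feature")
sentence = snake_to_sentence(snake)
```

Converting to snake_case and back returns the original sentence-case string in this sketch, which is a handy property to check when mapping between Biolink names and property keys.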

rdf_utils

Utility methods that are used for handling RDF.

kgx.utils.rdf_utils.generate_uuid()

Generates a UUID.

Returns

A UUID

Return type

str
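A string UUID can be produced with Python's standard uuid module. The sketch below assumes a random (version 4) UUID in its canonical form; whether KGX adds a prefix or uses a different UUID version is not stated here, and `new_uuid` is a hypothetical name.

```python
import uuid


def new_uuid() -> str:
    """Return a random version 4 UUID as a string."""
    return str(uuid.uuid4())


first = new_uuid()
second = new_uuid()
```

Each call yields a fresh 36-character identifier, which makes such values suitable for naming nodes or edges that have no identifier of their own.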

kgx.utils.rdf_utils.infer_category(iri: rdflib.term.URIRef, rdfgraph: rdflib.graph.Graph) → Optional[List]

Infer category for a given iri by traversing rdfgraph.

Parameters
  • iri (rdflib.term.URIRef) – IRI

  • rdfgraph (rdflib.Graph) – A graph to traverse

Returns

A list of categories corresponding to the given IRI

Return type

Optional[List]

cli_utils

Utility methods that are used by the KGX command line interface.

kgx.cli.cli_utils.apply_filters(target: dict, transformer: kgx.transformers.transformer.Transformer) → kgx.transformers.transformer.Transformer

Apply filters as defined in the YAML.

Parameters
  • target (dict) – The target from the YAML

  • transformer (kgx.Transformer) – The transformer corresponding to the target

Returns

transformer – The transformer corresponding to the target

Return type

kgx.Transformer

kgx.cli.cli_utils.apply_operations(target: dict, graph: networkx.classes.multidigraph.MultiDiGraph) → networkx.classes.multidigraph.MultiDiGraph

Apply operations as defined in the YAML.

Parameters
  • target (dict) – The target from the YAML

  • graph (networkx.MultiDiGraph) – The graph corresponding to the target

Returns

The graph corresponding to the target

Return type

networkx.MultiDiGraph

kgx.cli.cli_utils.get_file_types() → Tuple

Get all file formats supported by KGX.

Returns

A tuple of supported file formats

Return type

Tuple

kgx.cli.cli_utils.get_transformer(file_format: str) → Any

Get a Transformer corresponding to a given file format.

Note

This method returns a reference to the kgx.Transformer class, not an instance of it. You will have to instantiate the class by calling its constructor.

Parameters

file_format (str) – File format

Returns

Reference to kgx.Transformer class corresponding to file_format

Return type

Any
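The distinction between a class reference and an instance can be shown with a small registry sketch. All names below (`Transformer`, `TsvTransformer`, `JsonTransformer`, `REGISTRY`, `lookup_transformer`) are hypothetical placeholders and do not reflect the actual KGX classes or the formats it supports.

```python
from typing import Dict, Type


class Transformer:
    """Placeholder base class for the sketch."""

    def __init__(self) -> None:
        self.records: list = []


class TsvTransformer(Transformer):
    pass


class JsonTransformer(Transformer):
    pass


# Maps a file format to a class, not to an instance.
REGISTRY: Dict[str, Type[Transformer]] = {
    "tsv": TsvTransformer,
    "json": JsonTransformer,
}


def lookup_transformer(file_format: str) -> Type[Transformer]:
    """Return the class registered for the given file format."""
    try:
        return REGISTRY[file_format]
    except KeyError:
        raise ValueError(f"Unsupported file format: {file_format}") from None


transformer_class = lookup_transformer("tsv")  # a class reference
transformer = transformer_class()              # instantiate it explicitly
```

Returning the class leaves the caller free to pass whatever constructor arguments are appropriate, at the cost of one extra call before the object can be used.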

kgx.cli.cli_utils.parse_target(key: str, target: dict, output_directory: str, curie_map: Optional[Dict[str, str]] = None, node_properties: Optional[Set[str]] = None, predicate_mappings: Optional[Dict[str, str]] = None)

Parse a target (source) from a merge config YAML.

Parameters
  • key (str) – Target key

  • target (Dict) – Target configuration

  • output_directory (str) – Location to write output to

  • curie_map (Dict[str, str]) – Non-canonical CURIE mappings

  • node_properties (Set[str]) – A set of predicates that ought to be treated as node properties (This is applicable for RDF)

  • predicate_mappings (Dict[str, str]) – A mapping of predicate IRIs to property names (This is applicable for RDF)