Utilities¶
The utilities module includes all the utility methods used throughout KGX.
graph_utils¶
Utility methods for working with graphs.
-
kgx.utils.graph_utils.
curie_lookup
(curie: str) → Optional[str][source]¶ Given a CURIE, find its label.
This method first looks up the CURIE in predefined maps. If no label is found, it uses CurieLookupService to look for the CURIE in a set of preloaded ontologies.
- Parameters
curie (str) – A CURIE
- Returns
The label corresponding to the given CURIE
- Return type
Optional[str]
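The two-stage lookup described above can be sketched as follows. This is a minimal illustration under stated assumptions, not KGX's implementation: `PREDEFINED_LABELS`, `lookup_label`, and the `fallback` callable standing in for CurieLookupService are all hypothetical names.

```python
from typing import Callable, Dict, Optional

# Hypothetical predefined map; the maps KGX actually consults are not shown here.
PREDEFINED_LABELS: Dict[str, str] = {
    "rdfs:subClassOf": "subclass_of",
}


def lookup_label(
    curie: str,
    fallback: Optional[Callable[[str], Optional[str]]] = None,
) -> Optional[str]:
    """Return a label for ``curie``: predefined maps first, then the fallback."""
    label = PREDEFINED_LABELS.get(curie)
    if label is None and fallback is not None:
        # Stand-in for a lookup against a set of preloaded ontologies.
        label = fallback(curie)
    return label


# A plain dict plays the role of the ontology-backed service in this sketch.
ontology_labels = {"GO:0008150": "biological_process"}

print(lookup_label("rdfs:subClassOf"))
print(lookup_label("GO:0008150", ontology_labels.get))
print(lookup_label("EX:unknown", ontology_labels.get))
```

The `Optional[str]` return type matters to callers: a CURIE that is found nowhere yields `None` rather than raising.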
-
kgx.utils.graph_utils.
get_ancestors
(graph: networkx.classes.multidigraph.MultiDiGraph, node: str, relations: Optional[List[str]] = None) → List[str][source]¶ Return all ancestors of the specified node, filtered by relations.
- Parameters
graph (networkx.MultiDiGraph) – Graph to traverse
node (str) – node identifier
relations (List[str]) – list of relations
- Returns
A list of ancestor nodes
- Return type
List[str]
-
kgx.utils.graph_utils.
get_category_via_superclass
(graph: networkx.classes.multidigraph.MultiDiGraph, curie: str, load_ontology: bool = True) → Set[str][source]¶ Get category for a given CURIE by tracing its superclass, via the subclass_of hierarchy, and getting the most appropriate category based on the superclass.
- Parameters
graph (networkx.MultiDiGraph) – Graph to traverse
curie (str) – Input CURIE
load_ontology (bool) – Determines whether to load the ontology, based on the CURIE prefix, or to simply rely on the subclass_of hierarchy from the graph
- Returns
A set containing one or more categories for the given CURIE
- Return type
Set[str]
-
kgx.utils.graph_utils.
get_parents
(graph: networkx.classes.multidigraph.MultiDiGraph, node: str, relations: Optional[List[str]] = None) → List[str][source]¶ Return all direct parents of a specified node, filtered by relations.
- Parameters
graph (networkx.MultiDiGraph) – Graph to traverse
node (str) – node identifier
relations (List[str]) – list of relations
- Returns
A list of parent node(s)
- Return type
List[str]
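The difference between direct parents and all ancestors, and the effect of the relations filter, can be illustrated without networkx. The sketch below is not KGX's code: it assumes edges point from child to parent, that passing no relations means no filtering, and it represents the graph as a plain list of triples. The names `parents` and `ancestors` are hypothetical.

```python
from typing import List, Optional, Tuple

# Each edge is a (child, relation, parent) triple; a stand-in for a MultiDiGraph.
Edge = Tuple[str, str, str]


def parents(edges: List[Edge], node: str,
            relations: Optional[List[str]] = None) -> List[str]:
    """Direct parents of ``node``, keeping only edges with an allowed relation."""
    result: List[str] = []
    for child, relation, parent in edges:
        if child == node and (relations is None or relation in relations):
            if parent not in result:
                result.append(parent)
    return result


def ancestors(edges: List[Edge], node: str,
              relations: Optional[List[str]] = None) -> List[str]:
    """Every node reachable from ``node`` by repeatedly following parent edges."""
    seen: List[str] = []
    stack = parents(edges, node, relations)
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited; also guards against cycles
        seen.append(current)
        stack.extend(parents(edges, current, relations))
    return seen


edges = [
    ("A", "subclass_of", "B"),
    ("B", "subclass_of", "C"),
    ("A", "part_of", "D"),
]

print(parents(edges, "A"))                       # direct parents, any relation
print(ancestors(edges, "A", ["subclass_of"]))    # transitive, subclass_of only
```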
-
kgx.utils.graph_utils.
remap_edge_property
(graph: networkx.classes.multidigraph.MultiDiGraph, edge_label: str, old_property: str, new_property: str) → None[source]¶ Remap the value in an edge's old_property attribute with the value from the edge's new_property attribute.
- Parameters
graph (networkx.MultiDiGraph) – The graph
edge_label (string) – edge_label referring to edges whose property needs to be remapped
old_property (string) – Old property name whose value needs to be replaced
new_property (string) – New property name from which the value is pulled
-
kgx.utils.graph_utils.
remap_node_identifier
(graph: networkx.classes.multidigraph.MultiDiGraph, category: str, alternative_property: str, prefix=None) → networkx.classes.multidigraph.MultiDiGraph[source]¶ Remap a node’s ‘id’ attribute with the value from the node’s alternative_property attribute.
- Parameters
graph (networkx.MultiDiGraph) – The graph
category (string) – category referring to nodes whose ‘id’ needs to be remapped
alternative_property (string) – property name from which the new value is pulled
prefix (string) – signifies that the value for alternative_property is a list and the prefix indicates which value to pick from the list
- Returns
The modified graph
- Return type
networkx.MultiDiGraph
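The role of prefix is easiest to see on a single node. The sketch below is an assumption-laden illustration, not KGX's implementation: it assumes list values hold CURIEs, that the first value whose CURIE prefix matches is chosen, and that the node keeps its id when nothing suitable is found. The function name `pick_new_id` and the sample identifiers are hypothetical.

```python
from typing import Any, Dict, Optional


def pick_new_id(node: Dict[str, Any], alternative_property: str,
                prefix: Optional[str] = None) -> str:
    """Choose a replacement id for ``node`` from ``alternative_property``."""
    value = node.get(alternative_property)
    if value is None:
        return node["id"]  # nothing to remap to; keep the current id
    if isinstance(value, list):
        if prefix is None:
            return node["id"]  # a list is ambiguous without a prefix
        for candidate in value:
            # Assumed rule: pick the first CURIE with the requested prefix.
            if candidate.startswith(prefix + ":"):
                return candidate
        return node["id"]
    return value


node = {"id": "EX:1", "xrefs": ["ALT:9", "OTHER:7"]}
print(pick_new_id(node, "xrefs", prefix="OTHER"))
```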
-
kgx.utils.graph_utils.
remap_node_property
(graph: networkx.classes.multidigraph.MultiDiGraph, category: str, old_property: str, new_property: str) → None[source]¶ Remap the value in a node's old_property attribute with the value from the node's new_property attribute.
- Parameters
graph (networkx.MultiDiGraph) – The graph
category (string) – Category referring to nodes whose property needs to be remapped
old_property (string) – old property name whose value needs to be replaced
new_property (string) – new property name from which the value is pulled
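A property remap of this kind modifies node data in place and returns nothing. The following sketch shows the idea on a plain dictionary of node attributes. It assumes a node's category is stored as a list and that nodes lacking new_property are left untouched; `remap_property` is a hypothetical name, not the KGX function.

```python
from typing import Any, Dict


def remap_property(nodes: Dict[str, Dict[str, Any]], category: str,
                   old_property: str, new_property: str) -> None:
    """Overwrite ``old_property`` with ``new_property``'s value on matching nodes."""
    for data in nodes.values():
        if category not in data.get("category", []):
            continue  # only nodes of the requested category are remapped
        if new_property in data:
            data[old_property] = data[new_property]


nodes = {
    "EX:1": {"category": ["biolink:Gene"], "name": "old name", "symbol": "SYM1"},
    "EX:2": {"category": ["biolink:Disease"], "name": "a disease", "symbol": "D1"},
}
remap_property(nodes, "biolink:Gene", "name", "symbol")
print(nodes["EX:1"]["name"])
```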
kgx_utils¶
Utility methods that are reused across the codebase.
-
kgx.utils.kgx_utils.
apply_edge_filters
(graph: networkx.classes.multidigraph.MultiDiGraph, edge_filters: Dict[str, Union[str, Set]]) → None[source]¶ Apply filters to graph and remove edges that do not pass given filters.
- Parameters
graph (networkx.MultiDiGraph) – The graph
edge_filters (Dict[str, Union[str, Set]]) – Edge filters
-
kgx.utils.kgx_utils.
apply_filters
(graph: networkx.classes.multidigraph.MultiDiGraph, node_filters: Dict[str, Union[str, Set]], edge_filters: Dict[str, Union[str, Set]]) → None[source]¶ Apply filters to graph and remove nodes and edges that do not pass given filters.
- Parameters
graph (networkx.MultiDiGraph) – The graph
node_filters (Dict[str, Union[str, Set]]) – Node filters
edge_filters (Dict[str, Union[str, Set]]) – Edge filters
-
kgx.utils.kgx_utils.
apply_node_filters
(graph: networkx.classes.multidigraph.MultiDiGraph, node_filters: Dict[str, Union[str, Set]]) → None[source]¶ Apply filters to graph and remove nodes that do not pass given filters.
- Parameters
graph (networkx.MultiDiGraph) – The graph
node_filters (Dict[str, Union[str, Set]]) – Node filters
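The filter dictionaries above map a property name to either a single string or a set of strings. The exact matching rules are not spelled out here, so the sketch below makes its own assumptions: a node passes when, for every filter key, at least one of its values is among the allowed values, and failing nodes are removed in place. `passes` and `filter_nodes` are hypothetical names.

```python
from typing import Any, Dict, Set, Union

Filters = Dict[str, Union[str, Set[str]]]


def passes(data: Dict[str, Any], filters: Filters) -> bool:
    """True if ``data`` satisfies every filter (assumed semantics)."""
    for key, allowed in filters.items():
        allowed_set = {allowed} if isinstance(allowed, str) else set(allowed)
        value = data.get(key)
        # Normalize scalar and multi-valued properties to a set.
        values = set(value) if isinstance(value, (list, set, tuple)) else {value}
        if not (values & allowed_set):
            return False
    return True


def filter_nodes(nodes: Dict[str, Dict[str, Any]], filters: Filters) -> None:
    """Remove, in place, every node that does not pass the filters."""
    for node_id in [n for n, d in nodes.items() if not passes(d, filters)]:
        del nodes[node_id]


nodes = {
    "EX:1": {"category": ["biolink:Gene"], "provided_by": "source_a"},
    "EX:2": {"category": ["biolink:Disease"], "provided_by": "source_a"},
    "EX:3": {"category": ["biolink:Gene"], "provided_by": "source_b"},
}
filter_nodes(nodes, {"category": {"biolink:Gene"}, "provided_by": "source_a"})
print(list(nodes))
```

Edge filtering would follow the same pattern over edge attributes. In a real graph, removing a node also removes its incident edges, which this dictionary-only sketch does not model.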
-
kgx.utils.kgx_utils.
camelcase_to_sentencecase
(s: str) → str[source]¶ Convert CamelCase to sentence case.
- Parameters
s (str) – Input string in CamelCase
- Returns
string in sentence case form
- Return type
str
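As a rough illustration of the conversion, the sketch below splits a CamelCase string at each lowercase-to-uppercase boundary and lowercases the result. It is a minimal regex-based stand-in named `camel_to_sentence` (hypothetical), and it assumes the output is fully lowercase; KGX may handle acronyms or other edge cases differently.

```python
import re


def camel_to_sentence(s: str) -> str:
    """Convert CamelCase to lowercase, space-separated words."""
    # Insert a space wherever a lowercase letter or digit is followed by a capital.
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", s)
    return spaced.lower()


print(camel_to_sentence("NamedThing"))
print(camel_to_sentence("GeneToDiseaseAssociation"))
```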
-
kgx.utils.kgx_utils.
contract
(uri: str, prefix_maps: Optional[List[Dict]] = None, fallback: bool = True) → str[source]¶ Contract a given URI to a CURIE, based on mappings from prefix_maps. If no prefix map is provided, the defaults from prefixcommons-py are used.
This method will return the URI as the CURIE if there is no mapping found.
- Parameters
uri (str) – A URI
prefix_maps (Optional[List[Dict]]) – A list of prefix maps to use for mapping
fallback (bool) – Determines whether to fall back to default prefix mappings, as determined by prefixcommons.curie_util, when the URI prefix is not found in prefix_maps.
- Returns
A CURIE corresponding to the URI
- Return type
str
-
kgx.utils.kgx_utils.
expand
(curie: str, prefix_maps: Optional[List[dict]] = None, fallback: bool = True) → str[source]¶ Expand a given CURIE to a URI, based on mappings from prefix_maps.
This method will return the CURIE as the URI if there is no mapping found.
- Parameters
curie (str) – A CURIE
prefix_maps (Optional[List[dict]]) – A list of prefix maps to use for mapping
fallback (bool) – Determines whether to fall back to default prefix mappings, as determined by prefixcommons.curie_util, when the CURIE prefix is not found in prefix_maps.
- Returns
A URI corresponding to the CURIE
- Return type
str
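The behavior shared by contract and expand (caller-supplied maps first, optional fallback to defaults, input returned unchanged when nothing matches) can be sketched with plain dictionaries. This is not the KGX or prefixcommons code: `DEFAULT_PREFIX_MAP` is a hypothetical one-entry stand-in for the prefixcommons-py defaults, and `expand_curie` and `contract_uri` are hypothetical names.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for the default mappings from prefixcommons-py.
DEFAULT_PREFIX_MAP: Dict[str, str] = {
    "GO": "http://purl.obolibrary.org/obo/GO_",
}


def _maps(prefix_maps: Optional[List[Dict[str, str]]],
          fallback: bool) -> List[Dict[str, str]]:
    """Caller-supplied maps first; defaults if requested or if none were given."""
    maps = list(prefix_maps or [])
    if fallback or not maps:
        maps.append(DEFAULT_PREFIX_MAP)
    return maps


def expand_curie(curie: str, prefix_maps: Optional[List[Dict[str, str]]] = None,
                 fallback: bool = True) -> str:
    prefix, sep, local_id = curie.partition(":")
    if sep:
        for prefix_map in _maps(prefix_maps, fallback):
            if prefix in prefix_map:
                return prefix_map[prefix] + local_id
    return curie  # no mapping found: return the input unchanged


def contract_uri(uri: str, prefix_maps: Optional[List[Dict[str, str]]] = None,
                 fallback: bool = True) -> str:
    for prefix_map in _maps(prefix_maps, fallback):
        for prefix, base in prefix_map.items():
            if uri.startswith(base):
                return f"{prefix}:{uri[len(base):]}"
    return uri  # no mapping found: return the input unchanged


custom = [{"EX": "http://example.org/"}]
print(expand_curie("EX:1", custom))
print(contract_uri("http://purl.obolibrary.org/obo/GO_0008150"))
print(expand_curie("GO:0008150", custom, fallback=False))
```

Because unmapped inputs come back unchanged, callers that need to know whether a mapping was applied have to compare the result with the input.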
-
kgx.utils.kgx_utils.
format_biolink_category
(s: str) → str[source]¶ Convert a sentence case Biolink category name to a proper Biolink CURIE with the category itself in CamelCase form.
- Parameters
s (str) – Input string in sentence case
- Returns
a proper Biolink CURIE
- Return type
str
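A conversion of this shape can be sketched in a few lines. The version below is illustrative only: `sentence_to_camel` and `to_biolink_curie` are hypothetical names, and returning an already-prefixed input unchanged is an assumption rather than documented behavior.

```python
def sentence_to_camel(s: str) -> str:
    """Capitalize the first letter of each word and join the words."""
    return "".join(word[:1].upper() + word[1:] for word in s.split())


def to_biolink_curie(s: str) -> str:
    """Turn a sentence case category name into a ``biolink:`` CURIE."""
    if s.startswith("biolink:"):
        return s  # assumed: already a Biolink CURIE, leave it alone
    return "biolink:" + sentence_to_camel(s)


print(to_biolink_curie("named thing"))
print(to_biolink_curie("gene"))
```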
-
kgx.utils.kgx_utils.
generate_edge_key
(s: str, edge_label: str, o: str) → str[source]¶ Generates an edge key based on a given subject, edge_label and object.
- Parameters
s (str) – Subject
edge_label (str) – Edge label
o (str) – Object
- Returns
Edge key as a string
- Return type
str
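In a multigraph, two nodes can be connected by several edges, so a key derived from subject, edge label, and object gives each distinct statement a stable identity. The key format is not stated here; the sketch below assumes a simple hyphen-joined string, and `make_edge_key` is a hypothetical name.

```python
def make_edge_key(s: str, edge_label: str, o: str) -> str:
    """Build a key from subject, edge label and object (assumed format)."""
    return f"{s}-{edge_label}-{o}"


# Two edges between the same pair of nodes receive different keys.
print(make_edge_key("EX:1", "related_to", "EX:2"))
print(make_edge_key("EX:1", "interacts_with", "EX:2"))
```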
-
kgx.utils.kgx_utils.
get_biolink_mapping
(category)[source]¶ Get a Biolink Model mapping for a given category.
- Parameters
category (str) – A category for which there is a mapping in Biolink Model
- Returns
A Biolink Model class corresponding to category
- Return type
str
-
kgx.utils.kgx_utils.
get_cache
(maxsize=10000)[source]¶ Get an instance of cachetools.cache
- Parameters
maxsize (int) – The max size for the cache (10000, by default)
- Returns
An instance of cachetools.cache
- Return type
cachetools.cache
-
kgx.utils.kgx_utils.
get_curie_lookup_service
()[source]¶ Get an instance of kgx.curie_lookup_service.CurieLookupService
- Returns
An instance of CurieLookupService
- Return type
kgx.curie_lookup_service.CurieLookupService
-
kgx.utils.kgx_utils.
get_toolkit
() → bmt.Toolkit[source]¶ Get an instance of bmt.Toolkit. If no instance is defined, one is instantiated and returned.
- Returns
an instance of bmt.Toolkit
- Return type
bmt.Toolkit
-
kgx.utils.kgx_utils.
get_type_for_property
(p: str) → str[source]¶ Get type for a property.
TODO: Move this to biolink-model-toolkit
- Parameters
p (str) – The property whose type is requested
- Returns
The type for a given property
- Return type
str
-
kgx.utils.kgx_utils.
sentencecase_to_camelcase
(s: str) → str[source]¶ Convert sentence case to CamelCase.
- Parameters
s (str) – Input string in sentence case
- Returns
string in CamelCase form
- Return type
str
rdf_utils¶
Utility methods that are used for handling RDF.
-
kgx.utils.rdf_utils.
infer_category
(iri: rdflib.term.URIRef, rdfgraph: rdflib.graph.Graph) → Optional[List][source]¶ Infer category for a given iri by traversing rdfgraph.
- Parameters
iri (rdflib.term.URIRef) – IRI
rdfgraph (rdflib.Graph) – A graph to traverse
- Returns
A list of categories corresponding to the given IRI
- Return type
Optional[List]
cli_utils¶
Utility methods that are used in the KGX command line interface.
-
kgx.cli.cli_utils.
apply_filters
(target: dict, transformer: kgx.transformers.transformer.Transformer) → kgx.transformers.transformer.Transformer[source]¶ Apply filters as defined in the YAML.
- Parameters
target (dict) – The target from the YAML
transformer (kgx.Transformer) – The transformer corresponding to the target
- Returns
transformer – The transformer corresponding to the target
- Return type
kgx.Transformer
-
kgx.cli.cli_utils.
apply_operations
(target: dict, graph: networkx.classes.multidigraph.MultiDiGraph) → networkx.classes.multidigraph.MultiDiGraph[source]¶ Apply operations as defined in the YAML.
- Parameters
target (dict) – The target from the YAML
graph (networkx.MultiDiGraph) – The graph corresponding to the target
- Returns
The graph corresponding to the target
- Return type
networkx.MultiDiGraph
-
kgx.cli.cli_utils.
get_file_types
() → Tuple[source]¶ Get all file formats supported by KGX.
- Returns
A tuple of supported file formats
- Return type
Tuple
-
kgx.cli.cli_utils.
get_transformer
(file_format: str) → Any[source]¶ Get a Transformer corresponding to a given file format.
Note
This method returns a reference to a kgx.Transformer class, not an instance of that class. You must instantiate the class yourself by calling its constructor.
- Parameters
file_format (str) – File format
- Returns
Reference to the kgx.Transformer class corresponding to file_format
- Return type
Any
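The note above, that a class reference is returned rather than an instance, describes a common registry pattern. The sketch below illustrates that pattern only. Every name in it (`BaseTransformer`, `TsvTransformer`, `JsonTransformer`, `REGISTRY`, `lookup_transformer`) is hypothetical, and the set of formats and the error raised for unknown formats are assumptions.

```python
from typing import Dict, Type


class BaseTransformer:
    """Hypothetical stand-in for a transformer base class."""

    def __init__(self, source: str = "") -> None:
        self.source = source


class TsvTransformer(BaseTransformer):
    pass


class JsonTransformer(BaseTransformer):
    pass


# Maps a file format name to a class, not to an instance.
REGISTRY: Dict[str, Type[BaseTransformer]] = {
    "tsv": TsvTransformer,
    "json": JsonTransformer,
}


def lookup_transformer(file_format: str) -> Type[BaseTransformer]:
    """Return the transformer class registered for ``file_format``."""
    try:
        return REGISTRY[file_format]
    except KeyError:
        raise ValueError(f"unsupported file format: {file_format}") from None


transformer_class = lookup_transformer("tsv")   # a class reference
transformer = transformer_class()               # the caller instantiates it
print(type(transformer).__name__)
```

Returning the class leaves the choice of constructor arguments to the caller, which is why the extra instantiation step is needed.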
-
kgx.cli.cli_utils.
parse_target
(key: str, target: dict, output_directory: str, curie_map: Optional[Dict[str, str]] = None, node_properties: Optional[Set[str]] = None, predicate_mappings: Optional[Dict[str, str]] = None)[source]¶ Parse a target (source) from a merge config YAML.
- Parameters
key (str) – Target key
target (Dict) – Target configuration
output_directory (str) – Location to write output to
curie_map (Dict[str, str]) – Non-canonical CURIE mappings
node_properties (Set[str]) – A set of predicates that ought to be treated as node properties (This is applicable for RDF)
predicate_mappings (Dict[str, str]) – A mapping of predicate IRIs to property names (This is applicable for RDF)